At work, our git process is built around pull requests. It is a good process: it enforces validation and spreads knowledge of recent modifications across the team.

Why use a code formatter?

As with any review, things are easier when you can focus on the differences alone. A pull request usually shows only differences, but in the real world the meaningful ones are drowned out by unwanted ones. These usually come from the IDE making changes nobody asked for, such as reformatting or re-encoding files.

[Figure: side by side]

We used to keep formatting guidelines somewhere in version control, and it was up to each developer to import them into their IDE. This works poorly: not everyone imports them, and most people forget to re-import them when upgrading their IDE.

This is why I've set up a .idea/codeStyleSettings.xml file, as well as a .editorconfig file for basic tab and encoding settings, and pushed both into version control.
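
As a rough sketch of what such a file can contain, the snippet below writes a minimal .editorconfig into the current directory. The property names are standard EditorConfig keys, but the values (UTF-8, four-space indentation, LF line endings) are assumptions chosen for illustration, not the settings used on this project.

```shell
#!/bin/bash
# Write a minimal .editorconfig at the repository root.
# All values below are illustrative assumptions; adapt them to your own guidelines.
cat > .editorconfig <<'EOF'
# Stop editors from looking for .editorconfig files in parent directories
root = true

# Rules applied to every file
[*]
charset = utf-8
end_of_line = lf
indent_style = space
indent_size = 4
insert_final_newline = true
trim_trailing_whitespace = true
EOF

echo "Wrote $(wc -l < .editorconfig) lines to .editorconfig"
```

Once the file looks right, it can be committed alongside the IDE settings with "git add .editorconfig .idea/codeStyleSettings.xml".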

Manage code style settings in version control

It is a best practice to keep code style settings in version control. The same goes for linting rules. It's also best to choose a code style that has been widely tested, such as Google's style guides, which exist for a wide variety of languages.

Formatting a code base

The big project I'm working on has three active release branches on top of the production branch, and about a hundred feature branches. These feature branches are never rebased; usually, they have their target branch merged into them a couple of times. It basically looks like this.

[Figure: branches]

I've managed to format all the release branches and keep them properly merged into one another. The downside is that every feature branch created before the formatting now shows all the newly formatted text in its PR diff. This is even worse than before, since nearly the whole code base appears in the PR.

Let's format all feature branches!

It would be impossible for me to reformat all of the feature branches myself and hope that developers have no conflicts with their unpushed code. Instead, I've made a script that lets them do the task on their own branches. Most developers are not familiar enough with git, so the script saves them time and reduces failures.

This script does the following:

  • Format the current branch
  • Find the common ancestor with the PR destination branch
  • Squash everything into a single commit on that common ancestor
  • Rebase onto the destination branch

This script is meant to be used with git-bash on Windows.

#!/bin/bash

# format current branch, squash in one commit on common ancestor with destination
# then rebase on the destination
#
# Before launching :
# clean generated directories
# git fetch (to update origin)

function usage {
  echo "$(basename "$0") [-i <idea_home>] -o <onto>"
  echo "format current branch, squash into one commit and rebase onto <onto>"
}

while getopts ":i:o:" opt; do
  case $opt in
    i)
      IDEA_HOME=$OPTARG
      ;;
    o)
      ONTO=$OPTARG
      ;;
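    :)
      echo "Option -$OPTARG requires an argument." >&2
      exit 1
      ;;
    \?)
      # Unknown option: report it instead of silently ignoring it
      echo "Invalid option: -$OPTARG" >&2
      usage
      exit 1
      ;;
  esac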
done

if [[ -z "$ONTO" ]]; then
  usage
  exit 1
fi
if [[ -n "$IDEA_HOME" ]]; then
  echo -e "\e[0;36mIDEA_HOME : $IDEA_HOME \e[0m"
fi
echo -e "\e[0;36mONTO : $ONTO \e[0m"

ORIGINAL_BRANCH=$(git rev-parse --abbrev-ref HEAD)
FORMAT_BRANCH="${ORIGINAL_BRANCH}-reformated"
# Drop any leftover branch from a previous run, then start fresh
git branch -qD "$FORMAT_BRANCH" 2>/dev/null
git checkout -b "$FORMAT_BRANCH" || exit 1
echo -e "\e[0;33mCreated branch $FORMAT_BRANCH\e[0m"

MASKS="*.htm,*.html,*.java,*.js,pom.xml,*.yml,*.zul"

if [[ -z "$IDEA_HOME" ]]; then
  echo -e "\e[0;33mReload project in IntelliJ to get the new formatter.\e[0m"
  echo -e "\e[0;33mSelect project in IntelliJ and format with ctrl-alt-l and the following options:\e[0m"
  echo -e "\e[0;33m- Optimize imports\e[0m"
  echo -e "\e[0;33m- Scope Project Files\e[0m"
  echo -e "\e[0;33m- Mask \"$MASKS\"\e[0m"
  echo -e "\e[0;33m\e[0m"
  # Keep asking until the developer confirms with "Y"
  isFormatted=""
  while [ "$isFormatted" != "Y" ]; do
    printf "\e[0;33mHave you formatted the project? [Y/N]\e[0m "
    read -r isFormatted
  done
else
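  # tasklist is the Windows process list; refuse to run while IntelliJ is open
  if tasklist | grep -q idea; then
    echo -e "\e[0;31mPlease quit IntelliJ.\e[0m"
    git checkout "$ORIGINAL_BRANCH"
    git branch -D "$FORMAT_BRANCH"
    exit 1
  fi
  echo -e "\e[0;33mFormatting files\e[0m"
  # Quote the masks so the shell never tries to expand the globs itself
  if ! "${IDEA_HOME}/bin/format.bat" -settings .idea/codeStyleSettings.xml -m "$MASKS" -R .; then
    echo -e "\e[0;31mError during formatting.\e[0m"
    # Stop here rather than committing a partially formatted tree
    exit 1
  fi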
fi

echo -e "\e[0;33mFormatting commit\e[0m"
git commit -am "formatting code base"

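# Common ancestor of the original branch and the destination
FROM=$(git merge-base "$ORIGINAL_BRANCH" "$ONTO")
echo -e "\e[0;33mSquash $FORMAT_BRANCH into a single commit on $FROM\e[0m"
# Reuse the subjects of the squashed commits as the new commit message
git log --pretty=format:"%s" "$FROM"..HEAD > /tmp/commit-message
# Move the branch pointer back to $FROM while keeping every change staged
git reset --soft "$FROM"
git commit -a --file /tmp/commit-message
rm /tmp/commit-message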

echo -e "\e[0;33mRebase onto $ONTO\e[0m"
# During a rebase, "ours" is the side being rebased onto, so -X ours favors $ONTO on conflicts
git rebase -s recursive -X ours --onto "$ONTO" "$FORMAT_BRANCH~"

if [ $? -eq 0 ]; then
  echo -e "\e[0;32mFormatting done.\e[0m"
  echo -e "\e[0;32m\e[0m"
  echo -e "\e[0;32mBranch $FORMAT_BRANCH was created from $ORIGINAL_BRANCH.\e[0m"
  echo -e "\e[0;32mFiles on $FORMAT_BRANCH have been formatted.\e[0m"
  echo -e "\e[0;32mBranch $FORMAT_BRANCH has been rebased as a single commit from $FROM onto $ONTO\e[0m"
  echo -e "\e[0;32mCheck the content of $FORMAT_BRANCH and if correct, execute \"git checkout $ORIGINAL_BRANCH && git reset --hard $FORMAT_BRANCH && git branch -d $FORMAT_BRANCH\"\e[0m"
else
  echo -e "\e[0;31mFormatting in progress.\e[0m"
  echo -e "\e[0;31m\e[0m"
  echo -e "\e[0;31mBranch $FORMAT_BRANCH was created from $ORIGINAL_BRANCH.\e[0m"
  echo -e "\e[0;31mFiles on $FORMAT_BRANCH have been formatted.\e[0m"
  echo -e "\e[0;31mBranch $FORMAT_BRANCH should be rebased as a single commit from $FROM onto $ONTO\e[0m"
  echo -e "\e[0;31mRebase hasn't finished as some conflicts need to be resolved between $ORIGINAL_BRANCH and $ONTO\e[0m"
  echo -e "\e[0;31mOnce all conflicts are resolved, check the content of $FORMAT_BRANCH. If all is fine, run \"git checkout $ORIGINAL_BRANCH && git reset --hard $FORMAT_BRANCH && git branch -d $FORMAT_BRANCH\"\e[0m"
fi
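
To see the squash and rebase steps in isolation, here is a minimal sketch that plays them out in a throwaway repository, without any formatting involved. The branch names (release, feature), file names, and commit messages are hypothetical, and the repository is created in a temporary directory so nothing real is touched.

```shell
#!/bin/bash
# Demo only: squash a feature branch into one commit on its merge base,
# then replay that commit onto the destination branch.
set -euo pipefail

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

# One commit on the destination branch (hypothetical name: release)
git checkout -q -b release
echo "base" > file.txt
git add file.txt
git commit -q -m "base"

# A feature branch with two commits
git checkout -q -b feature
echo "one" >> file.txt
git commit -q -am "feature: step one"
echo "two" >> file.txt
git commit -q -am "feature: step two"

# Meanwhile, the destination branch moves on
git checkout -q release
echo "other" > other.txt
git add other.txt
git commit -q -m "release: unrelated change"
git checkout -q feature

# Squash: keep the final tree, drop the intermediate commits
from=$(git merge-base feature release)
message=$(git log --pretty=format:"%s" "$from"..HEAD)
git reset -q --soft "$from"
git commit -q -m "$message"

# Replay the single squashed commit onto the destination branch
git rebase -q --onto release "$from"

commits_ahead=$(git rev-list --count release..HEAD)
echo "commits ahead of release: $commits_ahead"
```

After the run, feature sits exactly one commit ahead of release, and that commit carries the subjects of both original commits as its message.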

I'm aware that this could have been avoided by using a cleaner branching style, or even a leaner approach such as feature toggles, but doing it this way was easier and had less impact.