Better Gitflow

Published on Sat, Dec 17, 2022 by jons mostovojs, working from London, UK.

TL;DR

  • We as developers should write more detailed commit messages using a rigid format.
  • If we do, then our git log becomes a valuable knowledge base.
  • This can be transferred to the development process itself, not just to main branch management.
  • To get there, we have to improve Gitflow so that it does not delete important data about the process.
  • Experiment branches are a workflow to preserve development process artefacts.

It may be more beneficial to embrace transparency and learning from mistakes by keeping the git history intact.

Gitflow

We love our Git. Despite its somewhat rough UX, it delivers version control with ruthless efficiency. While we play around with darcs for our fork of the passveil distributed secret manager, we still use Git for everything else. Thanks to GitHub, most techies have seen the light and are using Git.

Git was originally designed to work with mailing lists, consolidating patches from multiple independent forks into a single repository. However, with the centralization of Git into GitHub, branches have become the primary unit of work organization. They allow developers to isolate new development work from the main codebase, which is centralised to one GitHub repository.

Sad or not, that's how humans are. We converge on convenient, clean, easy-to-use, centralised systems. If you would like to argue, ask yourself "am I currently living in a city?" (I applaud you if the answer to this question is "no"). There are services like sourcehut, which modernise the old-school forky/patchy way to use Git though. We like them and we were using them at first. But as a tiny company, we simply can't afford to sacrifice convenience and efficiency for a principled choice of tools. Needless to say, GitHub offers a decent amount of exposure, which is invaluable at the start.

A big side-effect of the prominence of branches was that people started longing for "clean" repositories. I guess running git branch --sort=committerdate | tac is too hard for some (the GitHub UI also has a way to filter out stale branches)?..
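If typing that one-liner every time is the obstacle, it can be wrapped in a Git alias. This is only a minimal sketch: the alias name recent is made up for illustration, and tac assumes GNU coreutils is installed.

```shell
# Hypothetical alias name "recent"; the command is the one-liner from the text.
# The leading "!" makes Git run the alias through the shell, which the pipe needs.
git config --global alias.recent '!git branch --sort=committerdate | tac'

# Roughly equivalent without tac: a leading "-" sorts in descending order.
# git config --global alias.recent 'branch --sort=-committerdate'
```

After that, git recent lists local branches with the most recently committed one first, so stale branches sink to the bottom instead of having to be deleted.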

This is why Gitflow rose to prominence. It's all about creating separate branches for different stages of development, like features, releases, and hotfixes.

So, here's how it works: when you're working on a new feature, you create a new branch for it. This helps to isolate your new code from the main branch while you're working on it.

Once you're finished with your feature, you can merge it back into the main branch. But before you do, you'll want to squash your commits into a single commit. This makes the Git history more readable and helps to avoid cluttering up the main branch with a bunch of unnecessary commits.

To get your code merged into the main branch, you'll create a pull request (PR). Your team will review the PR and, if everything looks good, they'll approve it and your feature branch will be merged into the main branch.

Once your feature branch has been merged, gitflow suggests you delete it.
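In plain Git commands, the squash-and-delete part of this flow might look like the sketch below. The function name squash_merge_feature and the default branch name main are assumptions made for illustration; on GitHub the same effect is usually achieved with the "Squash and merge" button on the PR.

```shell
# squash_merge_feature <feature-branch> <commit-message> [main-branch]
# Hypothetical helper: squashes a feature branch into one commit on main,
# then deletes the feature branch.
squash_merge_feature() {
  local feature="$1" message="$2" main="${3:-main}"
  git checkout -q "$main" || return 1
  # Stage the combined changes of the feature branch without recording its commits.
  git merge -q --squash "$feature" || return 1
  git commit -q -m "$message" || return 1
  # -D (force) is needed: after a squash, Git does not consider the branch merged.
  git branch -q -D "$feature"
}
```

Note the forced -D. Git itself refuses the gentler -d here, because the individual commits of the feature branch never became part of main.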

This sounds like a great workflow, except...

The most valuable data is lost

...which is such a bummer. What's worse is that this loss is well hidden and not immediately obvious. Otherwise so many companies, including ours, wouldn't have been all about Gitflow at some stage of their existence. To us, too, the idea of squashing seemed like a no-brainer. The git log of main looked so tidy and pretty. Delete feature branches and your git graph --decorate -n99 no longer looks like the map of the London Underground.

But then we realized the major drawback of this approach: we're basically tossing out valuable information about the development process. Why would we want to throw away valuable insights and lessons learned? It's just so frustrating to see all that knowledge go to waste.

Instead of constantly repeating mistakes and discoveries, we could be leveraging the git log to learn from our past. Squashing in Gitflow might help us keep the codebase clean, but at what cost? We're sacrificing valuable information about how the software was built and the challenges we faced along the way. It's just not worth it.

We should be embracing transparency and learning from our mistakes. Deleting feature branches is a shortcut that doesn't serve us in the long run. So how do we commit (pun intended) to both keeping our git history intact, and having a pleasant time while working?..

We'll talk about it in a little bit, but first we must address a possible point of confusion. Git history often gets a bad rap as just a list of greyed-out commit messages next to some file names, but let's be real - it's a goldmine of information about the evolution of a project. Don't sleep on the value of reviewing git history to understand the development process and learn about the project. It can provide crucial context and insights, especially when debugging or troubleshooting issues, assuming commit messages are good enough. Here's how to make your git log the most valuable development byproduct, second only to the code itself.

C4: Collective Code Construction Contract

There's a great project called ZeroMQ. Many products within this project are developed using the "Collective Code Construction Contract". This social protocol has an amazingly useful constraint, 2.3.7. Here it is verbatim:

A patch commit message MUST consist of a single short (less than 50 characters) line stating the problem ("Problem: …") being solved, followed by a blank line and then the proposed solution ("Solution: …").

Since we use a version of agile at Doma, we also allow for a well-written "Also included" section at the end of the commit message. This section allows for good pushing velocity while telegraphing to the reviewer or a "digital archeologist" which changes are auxiliary and which are primary for the solution.
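The shape of such a message is simple enough to check mechanically. The sketch below is one possible check, not something C4 or Git ships: the function name check_c4_message is hypothetical, and it assumes the 50-character limit is counted over the whole first line, "Problem: " prefix included. The optional second argument loosens that limit.

```shell
# check_c4_message <message-file> [max-subject-length]
# Hypothetical helper: returns 0 if the message has the Problem/Solution shape.
check_c4_message() {
  local file="$1" max="${2:-50}" subject second
  subject="$(sed -n '1p' "$file")"
  second="$(sed -n '2p' "$file")"

  # Line 1 states the problem.
  case "$subject" in
    "Problem: "*) ;;
    *) echo "line 1 must start with 'Problem: '" >&2; return 1 ;;
  esac
  # Assumption: the length limit applies to the whole line, prefix included.
  if [ "${#subject}" -ge "$max" ]; then
    echo "line 1 must be shorter than $max characters" >&2; return 1
  fi
  # Line 2 is the blank separator.
  if [ -n "$second" ]; then
    echo "line 2 must be blank" >&2; return 1
  fi
  # A "Solution:" line must appear somewhere after the blank line.
  if ! sed -n '3,$p' "$file" | grep -q '^Solution:'; then
    echo "a 'Solution:' line must follow the blank line" >&2; return 1
  fi
}
```

Git passes the path of the proposed message as the first argument to the commit-msg hook, so a .git/hooks/commit-msg script that calls check_c4_message "$1" would reject non-conforming commits. Sections after the solution, such as "Also included", are not inspected.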

Here's an example of a commit message written by a junior developer from our Unity3D test task product:

Problem: AI too easy; neutral cube too similar to players'

Solution:

  • Increased size of the neutral moving cube to improve usability by making it easier to distinguish from player cubes.
  • Made Attacker2 AI more difficult than Attacker1
  • Improved AI calculations and added more challenging behavior

Also included:

  • Added two toggles (checkboxes) to allow user control over AI and game restart behavior:
    • Training - if on, use Simple AI. If off, use new, more difficult AI.
    • AutoRestart - if off, player input is required to start new game.
  • These options can be accessed from the main menu or in-game pause menu.

It may be hard to write such commit messages at first, but it's unsustainable not to. I have looked up some hack in git log well over a dozen times this year. I hate to throw around claims like "type signatures are free documentation", but in this case it's rather close. At the very least, git log in our company forms a design document for every commit.

One point we feel is very important to get across is the following:

The code is written once, but read hundreds of times.

Thus, five minutes spent writing a commit message while you still remember what you wrote and why saves at least hundreds of minutes for the team and yourself in the future. Taking the time to craft descriptive and informative commit messages helps to accurately capture the changes made and enhances the usefulness of git history. Embrace the power of git history, my friends.

I hope that section was at least food for thought for y'all. Let's get back to the main topic of the post: how to adopt a Git workflow that doesn't sugar-coat development by forgetting failures and implementation details... Enter "experiment branches".

Experiment branches

Experiment branches are the ultimate way for your team to test out new ideas and approaches without messing with the main codebase. They give your programmers the freedom to try out different solutions and techniques without worrying about breaking anything, and provide a way to document and track the progress of these experiments. Plus, by using a specific naming convention like programmer/date/feature-name and not deleting them, you can keep the history of these experiments easily accessible and organized.

When the experimentation is done, you can copy the branch and squash the changes on the copy. Look familiar? You got feature branches from Gitflow for free! Now you can create a pull request to review and discuss the changes before they get added to the main codebase. There's really no need to delete feature branches either. Remember: GitHub shows recent branches and filters out stale branches, and in the CLI you can alias git branch --sort=committerdate | tac. It's a win-win situation - your team gets to be creative and experimental, while still keeping the codebase clean and organized.
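The copy-and-squash step can be sketched in a few Git commands. Everything named here is an assumption for illustration: the helper names experiment_branch_name and promote_experiment, the YYYY-MM-DD date format (the convention above only says "date"), and main as the base branch.

```shell
# experiment_branch_name <programmer> <feature-name> [date]
# Builds a programmer/date/feature-name branch name.
# Assumption: the date is written as YYYY-MM-DD and defaults to today.
experiment_branch_name() {
  local programmer="$1" feature="$2" day="${3:-$(date +%F)}"
  printf '%s/%s/%s\n' "$programmer" "$day" "$feature"
}

# promote_experiment <experiment-branch> <feature-branch> <message> [main-branch]
# Copies the experiment branch and squashes the copy into a single commit,
# leaving the experiment branch and its full history untouched.
promote_experiment() {
  local experiment="$1" feature="$2" message="$3" main="${4:-main}"
  local base
  base="$(git merge-base "$main" "$experiment")" || return 1
  # The "copy": a new branch pointing at the tip of the experiment.
  git checkout -q -b "$feature" "$experiment" || return 1
  # Move the copy back to the fork point while keeping all changes staged.
  git reset -q --soft "$base" || return 1
  git commit -q -m "$message"
}
```

For example, promote_experiment "$(experiment_branch_name alice float-free-parser 2022-12-17)" feature/float-free-parser "Problem: ..." leaves a one-commit feature branch ready for a PR, while alice/2022-12-17/float-free-parser still holds every intermediate attempt.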

Here are our guidelines on how to get the most out of experiment branches. First, follow the principle of "commit early, commit often" during the experimentation process. We consider "an attempt to validate a hypothesis" to be an atom of work, and each such attempt should be pushed to the branch. If something builds and passes tests, it should be allowed into an experiment branch, even if it includes temporary solutions such as stubbing. It's OK to write one-line commit messages here as long as the commit is just an attempt to experiment. However, it is also important to provide a more detailed and descriptive commit message when an experiment is complete.

If the experiment was successful in solving the original issue, your commit message should include "Problem // Solution" lines. Indeed, if you had a hypothesis that something can be implemented a certain way, and it turned out to be true, and you can present the code that does it, kudos! On the other hand, if the experiment was an intermediate step in an investigation of the solution space or resulted in a negative outcome, your commit message should include "Hypothesis // Experiment" lines. The semantics of hypothesis // experiment commits are analogous to those of problem // solution commits.

Example

To give you an example of one such commit message, here's an invaluable insight found in one of the archived experiment branches of Wasm.lean, a library I developed for a customer, which compiles WebAssembly code into a minimal runtime of Lurk, a zero-knowledge runtime that currently has no floating-point support.

Hypothesis: delete float code to parse wasm in Lurk

We think that if we remove floating point support from the code, we'll be able to compile our wasm parser itself to Lurk and run it in Lurk runtime.

Outcome: negative

Experiment:

  • Remove all the float data types and subtypes
  • Remove all the functions and cases that work over float data types and subtypes
  • Run a full featured test case "Yati32.lean", and see it fail with $
  • Run a minimal test case "Yati32Unlabeled.lean", and see it fail with Could not read string
  • See both fail with just -r vs. -rs (-r signifies that we are running the code in lean implementation of Lurk runtime).

The result of that call is:

     expected numeric values, got
       ("Megaparsec.ParserState.State"
           0
           "lcErasedType"
           "lcErasedType"
           "lcErasedType"
           "(module
               (func $main (export "main")
               * snip *

Follow-up hypothesis: patch out parsing machinery entirely will allow us to run the supported subset of Wasm on Lurk.

We know that this may seem like a lot of work, but trust us, it's worth it. By following these rigid guidelines, we can ensure that the experimentation process is clearly documented and can inform future work. Heck, we can even do statistical analyses on experiment branches, as well as on git repos as a whole. (And we already do so at Doma to plan better and to provide internal as well as external estimates, which is our dirty little secret.) So let's get to it, startup wizards! It's time to step up our commit game.
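As a taste of what a rigid format makes possible, here is a minimal sketch of the simplest statistic one could pull out of such a log: how many commits are problem // solution commits, how many are hypothesis // experiment commits, and how many are plain one-line attempts. It is not the analysis we run internally; the function name tally_commit_kinds is hypothetical and it only looks at the first line of each message.

```shell
# tally_commit_kinds: reads commit subjects on stdin, one per line,
# and prints how many start with "Problem: ", "Hypothesis: ", or neither.
tally_commit_kinds() {
  awk '
    /^Problem: /    { problem++; next }
    /^Hypothesis: / { hypothesis++; next }
                    { other++ }
    END {
      printf "problem %d\nhypothesis %d\nother %d\n", problem, hypothesis, other
    }
  '
}

# Usage: tally every commit reachable from any branch of the current repo.
# git log --all --format=%s | tally_commit_kinds
```

Because experiment branches are never deleted, git log --all still reaches the failed attempts, which is exactly the data a squash-and-delete workflow would have thrown away.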

Conclusion

Gitflow is a deservedly popular workflow for organizing development in Git, but it has its drawbacks. While it can help to keep the codebase clean, the process of squashing commits and deleting feature branches can result in the loss of valuable information about the development process. Instead, it may be more beneficial for startups and other organizations to embrace transparency and learning from mistakes by keeping the git history intact. By doing so, teams can leverage the git log to learn from their past and improve their development process over time. So let's ditch the shortcuts and commit to transparency and continuous learning! It's the best way to stay ahead of the game and build software that truly stands the test of time.

Credits

Thanks to Pieter Hintjens, the author of ZeroMQ and C4. A friend gone too soon. Big thanks to George Agapov for his burning hate towards C4 and stalwart defense of messy branches. This pushback allowed me to iterate a lot on the design until I found this sweet spot. And of course, huge thanks to Ilona Prikule, who came up with the squash-merge as a bridge between gitflow and archeology-flow.
