Empathetic Infrastructure Code Management: An Introduction

This is part 1 of many in my "Open-source as a project model for internal work" series.

In order for your project to thrive beyond its initial set of releases, it needs a flexible upgrade model. This benefits both the maintainer of the project and the consumer. What works fine with only one or two users will turn into trouble at 10+.

For open-source projects, this is a harsh reality. Every change they make seems to be a breaking one. How can they continue to evolve their product without causing endless headaches for those who helped them get to where they are?

Thankfully, there are some decent solutions from a code management perspective that help with this. They're not perfect, because code isn't magic. Work is required at some point and no best practice can avoid that. However, there are ways to ease the pain.

I label the following content "Empathetic Code Management", as it revolves around understanding the needs of your consumer. The suggestions I make require effort to implement and maintain. In my experience as an infrastructure maintainer, the return on investment for this effort is large. A little bit of upfront work pays off with quicker releases and fewer emergencies, all while making your users happier, as you aren't constantly breaking their app in production.

The Case for a Separate Infrastructure Codebase

Most projects, large-scale and small, have a set of features that are common throughout the app. Examples include the overall site template, a set of re-usable JavaScript components, or some common styles to help theme HTML elements. I'll use the term "infrastructure code" for these common bits, as they build out the structure of the site.

It's common for this infrastructure code to live inside of the overall application. It makes a lot of sense to bundle them together, since they're used together. In my experience though, this pattern of bundling can cause a considerable amount of trouble when it's time for improvements.

When the two are tacked together, every change to the infrastructure code requires a full regression test of your app. Again, this may work when only a few pages depend on that code, since you can quickly validate your changes across the codebase.

However, with a growing app comes a longer and longer regression test cycle. At a certain size, the app will be split up into separate teams with separate responsibilities. The responsibility to regression test will also be divvied up between them. Without planning ahead, the question of whose job it is to regression test infrastructure changes becomes a confusing mess.

If Team A updates the infrastructure with a fix for their part of the app, do they have to fix the part of Team B's code that broke because it relied on the bug that was squashed? Even if Team B agrees to fix the code, does Team A have to wait for Team B to reprioritize their work to get the release out?

This may be temporarily solved through a recurring meeting between the teams where they plan out work and communicate status. But as the project grows and more and more teams are added, those meetings become extremely inefficient and ineffective.

At this point, you really need dedicated developers who own the infrastructure code. And once you have a separate infrastructure team, that team deserves a separate codebase. Without it, the team will struggle with the same issues that plagued the original two teams. Every change they make will either require extensive regression testing or risk breaking production due to incomplete testing.

By separating out the infrastructure code, you can adopt some of the release management tenets used by open-source projects, which allow for a saner upgrade route for both contributors and consumers.
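
To make that a bit more concrete, here is a rough sketch of what versioned releases could look like from the consumer's side. It is an illustration under my own assumptions, not a prescribed setup: it assumes the infrastructure team numbers releases as major.minor.patch and only bumps the major number for breaking changes, and the team names, version numbers, and the upgrade_advice helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """A major.minor.patch release number (assumed numbering scheme)."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def upgrade_advice(pinned: Version, release: Version) -> str:
    """Describe what a new infrastructure release means for one team."""
    if release <= pinned:
        return "up to date"
    if release.major == pinned.major:
        # Same major version: assumed to contain no breaking changes.
        return "compatible upgrade"
    # Different major version: the team should plan migration work.
    return "breaking upgrade"


# Hypothetical pins: the infrastructure release each team has tested against.
team_pins = {
    "team-a": Version.parse("2.0.0"),
    "team-b": Version.parse("1.4.2"),
    "team-c": Version.parse("2.1.0"),
}

latest = Version.parse("2.1.0")
advice = {team: upgrade_advice(pin, latest) for team, pin in team_pins.items()}

for team, message in advice.items():
    print(f"{team}: {message}")
```

The point of the sketch is that each team stays on the release it has already tested until it chooses to move. Nobody's part of the app changes just because the infrastructure team shipped something new.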

The first major change to the model comes with the idea of "pulling" releases down, versus having them "pushed" to every team. I cover that in detail in my next post: Shipping Updates: Pushing Them Out vs. Pulling Them Down