CI: The CI Coordinate System

(Over the past few years, I have collected a series of descriptive aids, analogies and concepts for explaining continuous integration to others. My goal is to share them in simplified form, for discussion and expansion, in a series of blog entries in no particular order. Today's is the first of these: an explanation aid for navigating the options when choosing a CI implementation approach...)

Continuous Integration (CI), "the rebuild, retest and (optional) redeploy of a system with every change to that system", is practiced in many different ways, by many different types of teams, for many different types of systems. In light of these variations, it can be difficult for a team new to CI to choose an appropriate implementation. Indeed, even experienced teams may simply adopt the implementation of earlier projects without evaluation, resulting in a process mismatch that creates as many problems as it solves.

While there are many forces that affect the decision, I often find it useful to review the situation in light of two primary forces: the number of people working on the system, and a shorthand technical influence I call integration complexity. I choose the first because the number of people working on a system is a reasonable heuristic for gauging the human communication needs of the situation (and CI is fundamentally a communication practice). The second serves as a useful shorthand for the system's many compilation, build speed, testing speed, deployment cost, platform configuration, hosting and run-time requirements.

By reducing the discussion to two dimensions, I find it easier to compare implementation options to one another. In other words, this may be an oversimplification of the issues, but it's a useful one. Like so many two-factor assessments, we can represent it visually using a 2-dimensional graph:

The X-Axis represents the number of people on the project practicing CI together, while the Y-Axis represents the integration complexity (an imprecise representation of the build, deployment, installation and run-time requirements of the application). Projects located close to the origin (in the lower-left corner) have the fewest developers and simplest integration considerations, while those in the upper right-hand corner are those projects with the largest number of developers and most complex integration requirements.

Now let's graph a few example implementations.

The CI automation tools CruiseControl and CruiseControl.NET were developed at ThoughtWorks to address the communication challenges of practicing CI with a large number of developers (including projects with multiple locations and upwards of 100 people). These tools have also proven useful for smaller teams (e.g. in the 5 - 25 person range) with more complex integration needs, or other communication challenges (e.g. distributed teams). The design goal for these tools is to supplement the existing communication of moderate to very complex single projects (which may include several sub-teams). Thus, we could plot CruiseControl's sweet spot as something like this:

The C3 project (often referred to as the first XP project), and many other XP projects since, used a CI solution built around a single integration machine: a programmer had to physically occupy the integration workstation to introduce their changes into the system image. The approach requires colocation, of course (which brings more benefits than just a simpler CI solution), and also implies an upper limit to team size. It could conceivably handle arbitrarily complex integration needs, however, since it does not necessarily imply manual integration, simply manual -- and physical -- initiation of the process. Thus, the integration machine approach's sweet spot might look more like this:

Another common approach -- using a physical integration token -- has similar limitations but seems to scale a bit further (the ThoughtWorks team who originally wrote the code that later became CruiseControl did so when they reached the limits of this approach). This implies a useful range like this:

Finally (and to demonstrate that the graph is useful for plotting individual projects as well), I know of several projects where the team found an automated solution desirable, but CruiseControl was overkill or otherwise inappropriate for their needs. The QuickFIX project is a particularly interesting example. For much of its life it had but one active programmer, yet its multi-platform integration needs (it is a cross-platform C++ project) made a different kind of automation the best solution: one with simple reporting but the ability to run simultaneous identical-source builds on several build machines. QuickFIX's plot would look like this:

This doesn't create a cookbook solution; it's a mistake to assume that because your project sounds similar to one of these, you should automatically use that implementation (although it can serve as reinforcement that you are on the right track). But what it does do is help people see that one size doesn't fit all -- even for the same project over time.

Instead, to get the optimal CI implementation for your environment, I suggest that you first address your communication needs (including stakeholder, management and culture needs, in addition to those of other developers). In doing so, it's better to simplify your communication requirements before introducing a more complex implementation (e.g. work to move people together before installing a CI automation tool).

Try to achieve this communication goal for the smallest level of integration (e.g. perhaps building and testing in your immediate environment). Once that's working, incrementally push the integration point toward production and/or other groups. Repeat the last two steps (always getting to a stable state) until you reach the right balance of responsive feedback and upkeep costs.

Notice too that there is significant overlap of appropriate techniques...

...particularly for projects of 10 - 15 people with moderate integration complexity. This isn't particularly surprising since CI has been practiced the most on projects of this size (as it is the suggested sweet spot for Extreme Programming and CI is one of the core XP practices). If you feel your team falls into this area, my suggestion is to choose the simplest solution that fits your culture and personal preference, and revisit the decision every few iterations to see if one of the other approaches might be a better fit.
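
For readers who prefer code to pictures, the graph can be roughly sketched as a lookup: each approach's sweet spot becomes a rectangle, and a project becomes a point. Everything numeric here is an assumption made for illustration, including the 1 - 10 complexity scale and every boundary value; the names `SweetSpot` and `candidate_approaches` are hypothetical, and real sweet spots are fuzzy regions rather than crisp rectangles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SweetSpot:
    """A rectangular region of the graph where an approach tends to fit."""
    name: str
    people: tuple        # (min, max) team size, the X-axis
    complexity: tuple    # (min, max) integration complexity, the Y-axis

    def contains(self, people: int, complexity: int) -> bool:
        lo_p, hi_p = self.people
        lo_c, hi_c = self.complexity
        return lo_p <= people <= hi_p and lo_c <= complexity <= hi_c


# ASSUMPTION: every bound below is an illustrative guess, not a measured value.
# Complexity uses an arbitrary 1 (trivial) to 10 (very complex) scale.
SWEET_SPOTS = [
    SweetSpot("integration machine", people=(2, 12), complexity=(1, 10)),
    SweetSpot("integration token", people=(2, 20), complexity=(1, 10)),
    SweetSpot("CruiseControl", people=(5, 100), complexity=(4, 10)),
]


def candidate_approaches(people: int, complexity: int) -> list:
    """Return the names of all approaches whose sweet spot covers the project."""
    return [s.name for s in SWEET_SPOTS if s.contains(people, complexity)]


if __name__ == "__main__":
    print(candidate_approaches(12, 5))  # a mid-sized team, moderate complexity
    print(candidate_approaches(1, 8))   # a QuickFIX-like project
```

With these assumed bounds, a 12-person project of moderate complexity falls inside all three regions, which is the overlap described above, while a single-programmer project with high integration complexity falls inside none of them, much as QuickFIX needed a different kind of automation. The result is a list of candidates to discuss, not a recommendation.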

In short, one size of CI solution does not fit all. The key (as in many software development scenarios) is to make your CI solution as simple as possible, but no simpler -- or as I like to think of it: start with the simplest solution (you might get away with it), then add processes and tools until you achieve a continuous integration implementation that works for your environment.

And how do you know that it's working? That is a topic for another time.