Thoughts On... Automatic Discipline

Automatic Discipline

William E. Caputo, January 26, 2006

"Consciousness is afraid of its own spontaneity, because it feels itself to be beyond freedom." - Jean-Paul Sartre, The Transcendence of the Ego

UPDATE: Updated the quote to follow a different translation.

Summary: Given a choice between automation opt-out and automation opt-in, choose the former.

The spirit of build process automation is finding ways to remove reliance on individual discipline, memory and technique, replacing them with tools that help us remember to do the right thing -- usually by ensuring that when we forget a critical step in the process, the effect is immediate and local (rather than deferred and team-wide). That is: when I mess up, I know right away, and only I am affected (usually by my changes not being integrated).

A lot of people don't like this. Some believe in the better nature of the team, and feel that it's too restrictive to have certain process activities happen automatically -- particularly when those activities can be more detailed or sophisticated if done manually. Others fear that this will slow them down too much (my experience is the opposite). I have also encountered people who simply don't like that much discipline constraining their actions. I question the professionalism of this point of view: unless they make no mistakes, it means they simply defer or shift the impact of those mistakes -- usually onto their teammates or customers. But to each his own (as long as I don't have to work with them).

But mainly it's the first group that is the most challenging -- the better nature people. They argue that people need to be disciplined -- it's part of being a professional. The thing is: I'd rather the discipline come early (in setting up the process) than be hoped for in the heat of the critical 11th hour bug-fix. By shifting the notion of discipline from "make sure you do x, y and z on each check-in" to "make sure that y and z happen automatically, that way if you forget x, the build will break -- oh, and it's more likely to break if you put it off (so make sure you do it often)", we get discipline with teeth.

A recent example from my current situation illustrates the difference. Like many teams practicing continuous integration, our team rebuilds the database as part of the build. If a programmer makes changes to the database schema, those changes are not reflected in the code base unless they update and check in a binary "back-up" of the database. This file is then used (if there are changes) to rebuild the database on the build machine and on the other development machines. There is no way to overlook this step -- one can (with effort) circumvent it, but one can't simply forget, since one's schema changes won't make it into the build, and the build will probably break (having story tests helps here too, but that's a different entry).
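To make the mechanics concrete, here is a minimal sketch of that kind of build step in Python. It is not our actual build: it uses SQLite as a stand-in for the real database, and every name in it (`rebuild_database`, `missing_columns`, `build_step`, the `required` column map) is hypothetical, invented for illustration. The column check stands in for whatever code and tests run against the rebuilt database.

```python
import shutil
import sqlite3
from pathlib import Path
from typing import Dict, List, Set


def rebuild_database(backup: Path, target: Path) -> None:
    """Replace the build database with the checked-in backup, every time."""
    shutil.copyfile(backup, target)


def missing_columns(db_path: Path, required: Dict[str, Set[str]]) -> List[str]:
    """List the columns the code expects but the rebuilt database lacks."""
    problems: List[str] = []
    conn = sqlite3.connect(db_path)
    try:
        for table in sorted(required):
            # PRAGMA table_info yields one row per column; index 1 is the name.
            rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
            present = {row[1] for row in rows}
            for column in sorted(required[table] - present):
                problems.append(f"{table}.{column}")
    finally:
        conn.close()
    return problems


def build_step(backup: Path, target: Path, required: Dict[str, Set[str]]) -> int:
    """Return a process exit code: 0 if the build may proceed, 1 if it breaks."""
    rebuild_database(backup, target)
    problems = missing_columns(target, required)
    for problem in problems:
        print(f"BUILD BROKEN: {problem} is not in the checked-in backup")
    return 1 if problems else 0
```

The point of the sketch is the shape, not the details: the rebuild is unconditional, so a programmer who adds a column locally but forgets to check in the new backup gets a broken build rather than a silent mismatch.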

We used to use a text representation, but by switching to the binary format, we greatly reduced our build time -- at the expense of individual change tracking (before, the changes to each database object were versioned in the source code repository). Recently, we decided to add back the generation of the text representation (but not to use it for the rebuild). Unfortunately, our first pass requires opt-in on the part of the programmer: one needs to remember to run the old script and annotate the changes on check-in, as well as run the binary update. It's opt-in because, unlike before, those scripts aren't used for anything (other than documentation), so failure to update them will not fail the build.

I discussed this with the programmer who implemented the solution (his name is Brian), and we debated the merits of the alternatives. My preference is to make the script run as part of the official build (managed by CruiseControl), purely as a housekeeping step. His counter was that having it done by the programmer on check-in allows detailed descriptions of what changed. We haven't -- as of this writing -- changed it yet. However, Brian informed me earlier this week that he woke up in a cold sweat the other night thinking that he had in fact forgotten to run this script, and he is now working out how to include it in our build. (Thus, I have officially caused someone nightmares by talking about continuous integration!)

It's the right choice. The extra documentation potential is simply far outweighed by the risk of omission. By moving the execution into the build, we get an automated change-log of each database object (each one is stored separately, allowing a diff), and the guarantee that it will happen if anyone wants their changes actually submitted into the code base.
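For illustration, a housekeeping step along these lines might look like the following sketch. Again, this is an assumption-laden stand-in rather than our real script: it uses SQLite's `sqlite_master` catalog in place of our actual database, and the function name `dump_schema_objects` and the one-file-per-object naming scheme are hypothetical.

```python
import sqlite3
from pathlib import Path
from typing import List


def dump_schema_objects(conn: sqlite3.Connection, out_dir: Path) -> List[Path]:
    """Write each schema object's definition to its own text file.

    One file per table, index, view or trigger means the source repository
    can show a separate diff for each object that changed.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        "WHERE sql IS NOT NULL ORDER BY type, name"
    ).fetchall()
    written: List[Path] = []
    for obj_type, name, sql in rows:
        # Hypothetical naming scheme: <type>_<name>.sql
        path = out_dir / f"{obj_type}_{name}.sql"
        path.write_text(sql.strip() + "\n", encoding="utf-8")
        written.append(path)
    return written
```

Run by the build server after the database rebuild, with the output directory then committed, a step like this needs nobody to remember anything. Note that this sketch does not delete the file for an object that has been dropped; a real step would have to handle that too.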

Brian's intuition is understandable -- it's a bit simpler to implement, and it's tempting when one has a high regard for the quality of one's teammates (as we do) to rely on their professionalism. But experience shows that no one is perfect: someone will forget to run the script, and they'll forget when we need it the most, in the heat of a critical fix with time pressure.

The only answer that allows me to sleep at night: use the build process to keep ourselves honest. That's what it's there for.