Sunday, November 23, 2008

The Supply Chain of Continuous Integration

When I was first introduced to Continuous Integration, I viewed it as a black box with a well-defined interface. It was the same kind of throw-it-over-the-wall mentality that some people have with testing: when the code is "done", give it to the testing shop and mark the checkbox complete. It may or may not return with feedback attached.

I got a different perspective on how CI should work, however, while working on a CI team. More than a little of this new perspective is probably due to working in a company that believes in and practices agile development and scrum management. With agile, we are much better at involving testers in day-to-day activities, rather than relying on a cold baton hand-off as sprint review time rolls near.

What got me thinking about a CI supply chain concept was the story from The World Is Flat about how UPS has evolved from a package delivery company into an integral part of many companies' day-to-day business operations. They started with a core business of picking up and shipping packages. But it turns out there are inefficiencies in this bolted-on approach. In order for the shipping to be efficient for both UPS and the contracting company, the entire supply chain had to be prepped in advance. This includes both pre- and post-shipment. The visibility into business processes had to be bidirectional.

I think the same is true of continuous integration. It comes down to the old computer adage GIGO: Garbage In, Garbage Out. Consider a team that writes plenty of code but no tests. What is the value of CI for them? The feedback is minimal. To get the most value out of an automated build system, there must be some forethought into what kind of quality feedback you want to see, and then into what teams can do to define and integrate the right reporting tools into the build.

For example, even after automated tests are written and incorporated into the build, you may want feedback on other quality measures of your code. In our case, we wanted Checkstyle and PMD reporting on our Java code. We use Maven as our build tool, so adding the reporting to our builds was simple. But then the question becomes: what coding standards do we want to compare against? What PMD rulesets represent a sane minimum that teams can deal with and learn from at the same time?
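As a rough sketch of what "adding the reporting" can look like, the fragment below wires the Checkstyle and PMD report plugins into the reporting section of a pom.xml. It is not our actual configuration. The ruleset path config/pmd-ruleset.xml is a hypothetical project file standing in for whatever minimum the teams agree on, and plugin versions are left out here and would normally be pinned.

```xml
<!-- Fragment of pom.xml: belongs inside the top-level <project> element. -->
<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-checkstyle-plugin</artifactId>
      <configuration>
        <!-- sun_checks.xml is bundled with the plugin; replace it with your own standard. -->
        <configLocation>sun_checks.xml</configLocation>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <configuration>
        <rulesets>
          <!-- Hypothetical project-local ruleset file holding the agreed "sane minimum". -->
          <ruleset>${basedir}/config/pmd-ruleset.xml</ruleset>
        </rulesets>
      </configuration>
    </plugin>
  </plugins>
</reporting>
```

With something like this in place, running mvn site generates both reports alongside the rest of the project site.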

So now, a team dedicated to providing CI services is recommending quality reports to both developers and stakeholders AND helping to define the rules that code should be compared against. My first reaction to this approach, which deviates from a traditional CI definition, was a sense of invasiveness. Upon further reflection, I have embraced the blurred lines between teams for several reasons. Firstly, it breaks down artificial team boundaries and keeps communication lines open. CI is no longer a black box; it is a visible, value-contributing component of system development. Secondly, developers won't (at least not consistently across an organization) take the time to inject QA into their processes. A CI team, however, does have the time to take the first steps and get the ball rolling. The conversation is more constructive when you have something in place and actually working than when there is just a bunch of talk about what could be.

I think a simple diagram emphasizes the point. If "A" is a development team, "B" is a CI team, and "C" represents the stakeholders of the solution being developed, then the shaded areas pinpoint a missed opportunity unless you adopt the supply chain analogy. These are not hard control points, but juicy overlaps waiting to be optimized by A and B, and by B and C, working together.

So far, I've talked about the pre-shipment benefit leading into CI. I also now believe the supply chain post-CI is equally important. That is, where does the feedback go, and how do you make it so easy that there is no reason not to use it? In our case, this encompassed two audiences with two different needs.

1) Developers. They need the low-level test results, test coverage, and the Checkstyle and PMD reports.

2) Stakeholders (Project Managers, Solution Owners, Scrum Masters). They need to know that the developer teams are using the system. What is the source control commit frequency? How long do CI builds stay broken? How long are the automated builds taking? In short, their interest is in whether they are getting their money's worth on the CI investment and whether the teams "get it". It provides them with just enough information to start asking the right questions of the right people.
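To make one of those stakeholder questions concrete, here is a minimal sketch of how "how long do CI builds stay broken?" might be computed from a build history. The BuildResult type and the idea of a chronological list of results are assumptions for illustration; a real CI server would supply this data through its own API.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

final class BrokenBuildMetrics {

    /** Hypothetical build record: when the build finished and whether it passed. */
    static final class BuildResult {
        final Instant finishedAt;
        final boolean passed;

        BuildResult(Instant finishedAt, boolean passed) {
            this.finishedAt = finishedAt;
            this.passed = passed;
        }
    }

    /**
     * Sums the time from the first failing build of each broken streak
     * to the next passing build. Expects results in chronological order.
     * A streak that is still broken at the end of the history is not counted.
     */
    static Duration totalBrokenTime(List<BuildResult> chronological) {
        Duration total = Duration.ZERO;
        Instant brokenSince = null;
        for (BuildResult build : chronological) {
            if (!build.passed && brokenSince == null) {
                brokenSince = build.finishedAt;          // streak starts
            } else if (build.passed && brokenSince != null) {
                total = total.plus(Duration.between(brokenSince, build.finishedAt));
                brokenSince = null;                      // streak ends
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2008-11-17T09:00:00Z");
        List<BuildResult> history = List.of(
            new BuildResult(t0, true),
            new BuildResult(t0.plus(Duration.ofHours(1)), false),
            new BuildResult(t0.plus(Duration.ofHours(2)), false),
            new BuildResult(t0.plus(Duration.ofHours(3)), true),
            new BuildResult(t0.plus(Duration.ofHours(4)), false),
            new BuildResult(t0.plus(Duration.ofMinutes(270)), true));
        System.out.println("Total broken time: " + totalBrokenTime(history));
    }
}
```

Measuring from the first failure to the next success is a deliberate simplification: it treats the build as broken for the whole stretch, which is the view a stakeholder cares about, rather than counting individual failed runs.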

So the supply chain from CI into management visibility, in our case, ended up being an enterprise-level portal aggregating project CI metrics into an at-a-glance view of how well the agile teams are performing. This could be stoplight charts of current CI conditions, or simple graphs of build times or number of tests plotted over the last 30 days.
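A stoplight view boils down to mapping a few metrics onto three colors. The sketch below shows one possible mapping, not the rules our portal used. The 15-minute "slow build" threshold and the choice of inputs are hypothetical; each organization would pick its own.

```java
import java.time.Duration;

final class Stoplight {

    enum Light { GREEN, YELLOW, RED }

    // Hypothetical threshold: passing builds slower than this are flagged yellow.
    static final Duration SLOW_BUILD = Duration.ofMinutes(15);

    /** Red if the last build failed, yellow if it passed but was slow, green otherwise. */
    static Light classify(boolean lastBuildPassed, Duration lastBuildDuration) {
        if (!lastBuildPassed) {
            return Light.RED;
        }
        if (lastBuildDuration.compareTo(SLOW_BUILD) > 0) {
            return Light.YELLOW;
        }
        return Light.GREEN;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, Duration.ofMinutes(8)));
        System.out.println(classify(true, Duration.ofMinutes(20)));
        System.out.println(classify(false, Duration.ofMinutes(5)));
    }
}
```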

As always seems to be the case, the benefit of this proposed supply chain approach is increased communication.

Again, this initially seemed to stray from traditional core CI functionality, but in reality it is simply providing the visibility into the process that scrum promises. Warts and all.