Tuesday, December 4, 2007

Continuous Integration Strategies (Part III)

How do you get your code to talk to you? Continuous integration is all about automated feedback. Beyond test reports and self-describing code, there are many techniques and tools that you can use to find out if you are producing "quality" code.

When you pick the right reports and integrate them into the software build, the level of effort required to use these tools becomes very low (and we all know that developers are legendarily lazy).

Ideally, you should not have to remember to ask; the code should tell you.

It's like being the parent of a teenager. At the dinner table you ask,

"So, how was school today?"
"I dunno."

"What did you learn?"
"Nothin'"

"Did anything interesting happen?"
"I dunno."

"How did you do on your history test?"
(shrug)

You would rather he came home with the enthusiasm of a kindergartener eager to tell you about his day. No effort on your part to ask; he just gushes unprompted,

"Dad!! Guess what happened today at school?? It was so cool, I got an A+ on my spelling test!!"
That's what continuous integration can do for you.

So, recalling the computer adage Garbage In, Garbage Out, we have picked a select few reports that we feel give us a good measure of quality and integrated them into our build so that the feedback from CI is meaningful. Although several CI engines (we currently use Hudson) have plugins to generate quality reports such as unit test results and test coverage, we feel it is important that developers can run all the same reports that CI runs. Since we use Maven, it is easy to plug the reports of interest into the <reporting> section of the pom. That keeps the report configurations in source control along with the code.

So which reports do we run? Here's the rundown:

  • Checkstyle. Validates source code against coding standards and reports any violations. We customized the ruleset, packaged it in a versioned jar and deployed it to our Maven repository. Additionally, we forced it to run as part of every build in the validate phase, and configured it to fail the build if violations are found. It's possible for an individual project to override the custom ruleset, but we don't encourage that.
  • PMD. Performs static analysis of the source code against a ruleset. We customized the ruleset, packaged it in a versioned jar and deployed it to our Maven repository. Additionally, we forced it to run as part of every build in the validate phase, and configured it to fail the build if violations are found.
  • Test Results (surefire). This is a no-brainer. You want to see the results of your tests.
  • Code Coverage (cobertura). Shows branch and line coverage for your source files. I think the key to using this metric is not to set a percentage that the project team must meet, but to be smart about interpreting the trends. "Metrics are meant to help you think, not to do the thinking for you."
  • Javadoc. Creates the API documentation for your project.
  • Dashboard. Aggregates the reports of a Maven multiproject build into a single report page. This is critical so that developers and stakeholders don't have to hunt and drill down page after page to find the meaningful metrics you worked so hard to set up.
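As a rough sketch of how the rundown above might look in a pom, the fragment below binds Checkstyle to the validate phase and lists the report plugins under <reporting>. The ruleset coordinates (com.example.build:build-rules) and the configLocation path are hypothetical placeholders, not our actual values; plugin versions are left out for brevity and should be pinned in a real build. PMD can be enforced with the same pattern using maven-pmd-plugin and its check goal.

```xml
<!-- Fragment of a pom.xml; place inside <project>. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-checkstyle-plugin</artifactId>
      <executions>
        <execution>
          <id>enforce-coding-standards</id>
          <!-- Run early so violations stop the build before compilation. -->
          <phase>validate</phase>
          <goals>
            <goal>check</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <!-- Hypothetical path: resolved from the ruleset jar declared below. -->
        <configLocation>com/example/build/checkstyle.xml</configLocation>
        <failOnViolation>true</failOnViolation>
        <consoleOutput>true</consoleOutput>
      </configuration>
      <dependencies>
        <!-- Hypothetical coordinates for the versioned ruleset jar. -->
        <dependency>
          <groupId>com.example.build</groupId>
          <artifactId>build-rules</artifactId>
          <version>1.0</version>
        </dependency>
      </dependencies>
    </plugin>
  </plugins>
</build>

<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-checkstyle-plugin</artifactId>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-report-plugin</artifactId>
    </plugin>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>cobertura-maven-plugin</artifactId>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
    </plugin>
    <!-- Listed last so it can aggregate the reports above. -->
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>dashboard-maven-plugin</artifactId>
    </plugin>
  </plugins>
</reporting>
```

Because the enforcement lives in <build> and the reports in <reporting>, a developer gets the same failures from a local build that CI would report, and the same pages from a local site generation.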


In addition, make full use of Maven's pom to declare all the sections that feed the default generated site, including:

  • <scm> Source control management. New team members, for example, will need to know the Subversion URL for checkout.
  • <developers> Identifies subject matter experts and feeds developer activity reports.
  • <ciManagement> Identifies the CI engine being used and the URL to see the live status and force new builds.
  • <issueManagement> Identifies the issue tracking system. I think there are Maven plugins that will map issues to source control commits, providing bidirectional traceability between code and requirements.
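For illustration, the four sections above might be filled in roughly as follows. Every URL, the developer entry, and the choice of issue tracker are made-up placeholders; only Subversion and Hudson come from our actual setup.

```xml
<!-- Fragment of a pom.xml; place inside <project>. All values are placeholders. -->
<scm>
  <connection>scm:svn:http://svn.example.com/repos/my-project/trunk</connection>
  <developerConnection>scm:svn:https://svn.example.com/repos/my-project/trunk</developerConnection>
  <url>http://svn.example.com/viewvc/my-project/trunk</url>
</scm>

<developers>
  <developer>
    <id>jdoe</id>
    <name>Jane Doe</name>
    <email>jdoe@example.com</email>
    <roles>
      <role>Build engineer</role>
    </roles>
  </developer>
</developers>

<ciManagement>
  <system>Hudson</system>
  <url>http://ci.example.com/job/my-project/</url>
</ciManagement>

<issueManagement>
  <!-- The tracker name is an assumption for this example. -->
  <system>Bugzilla</system>
  <url>http://issues.example.com/my-project</url>
</issueManagement>
```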


Also, for Maven multiproject builds, don't forget to review the Dependency Convergence report. It shows all dependencies for all projects along with the versions of those dependencies. This will help you find dependencies that you are inadvertently using in multiple versions.
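The Dependency Convergence report is one of the standard project information reports. A minimal sketch of requesting it explicitly, assuming you want to limit the project information pages to the dependency-related ones (the plugin version is omitted here and should be pinned):

```xml
<!-- Fragment of a pom.xml; place inside <project>. -->
<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-project-info-reports-plugin</artifactId>
      <reportSets>
        <reportSet>
          <reports>
            <!-- Per-project dependency listing. -->
            <report>dependencies</report>
            <!-- Cross-module view of dependency versions. -->
            <report>dependency-convergence</report>
          </reports>
        </reportSet>
      </reportSets>
    </plugin>
  </plugins>
</reporting>
```

If you already have a <reporting> section, merge this plugin entry into it rather than declaring a second section.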

Once you have these reports baked in, make the results easy to find. Use your CI engine to generate the Maven site on a nightly basis and publish the results to a web server where developers and project stakeholders can find them.
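One way to wire up the publishing step is to declare the site location in the pom and have the nightly CI job run the site-deploy phase (mvn site-deploy). The id, host name and path below are hypothetical, and the scp transport is only an assumption; depending on your Maven version it may need an additional wagon extension, and the id must match a <server> entry in settings.xml if credentials are required.

```xml
<!-- Fragment of a pom.xml; place inside <project>. Values are placeholders. -->
<distributionManagement>
  <site>
    <!-- Matched against a <server> id in settings.xml for credentials. -->
    <id>project-site</id>
    <name>Project Site</name>
    <url>scp://build.example.com/var/www/sites/my-project</url>
  </site>
</distributionManagement>
```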

After you spend the time and effort to identify which reports you want and get them configured and working correctly, make it repeatable for new projects by creating an archetype template project. This sets up a model pom.xml and project directory structure right off the bat for new projects. When it's there from the start, with no effort on the team's part, good things happen.

With a little up-front effort, your code (with a little help from continuous integration) can talk to you. How do you use CI to reveal code quality? I would be interested in hearing your strategies.