Wednesday, October 31, 2007

Continuous Integration Strategies (Part II)

Your CI environment is reporting a broken build. Now what?

I would like to stress that the faster you jump on the problem, the easier it is to solve. The changeset will be smaller, and the person who most likely committed the offending code will still have those changes fresh in mind.
It is a good policy that your team does not commit any additional changes to source control until the build is fixed.
At times, it is too easy to ignore a broken build message, whether it is an email notification, a flashing red light, or a lava lamp. Sometimes the team will assume someone else is working the issue. I always recommend immediate and active communication that the problem is being worked, so that positive control is maintained.

There were some additional words of wisdom that recently circulated here that I would like to share with you. Props to Chad for nailing the importance of the entire team owning CI and having active communication about its status. The emphasis is mine.

Make sure you're at least running unit tests before you commit. You can also have a buddy immediately update and build if you want feedback before waiting for Cruise Control to build. Also, after you commit, watch your email for a notification that the build has broken. If you're not going to be around, use a check-in buddy as mentioned previously.

If you've just committed a change and receive a build failure notification email, look into what's causing the problem asap. If it's a quick fix, just make the change and re-commit. Optionally reply to the build failure notification so that the team is aware you're putting in the fix. If it's not a quick fix, reply to the build failure notification stating that you're working the issue; again so the team is aware.

...The build is everyone's responsibility.

If you see that the build is remaining broken for a period of time, take it upon yourself to investigate. Find out if anyone is working the issue. If not, try to identify the problem and notify the party responsible so that the build can be fixed quickly. Now if you can't figure out what's wrong and identify who is responsible, take it upon yourself to fix the issue. If you are too busy, find someone that can. If you can't figure it out, ask for help. Use it as an opportunity to learn something. When you do finally find the problem, let the responsible party know what happened. Then they can learn something as well.

To sum it up, there shouldn't be any duration of time where the build is broken but isn't being looked into. While everyone should be watching after they commit to see if they've broken the build, it won't always get caught. You shouldn't be saying to yourself "I didn't break it so it's not my job to fix it".
And that's the word.

Friday, October 19, 2007

Continuous Integration Strategies (Part I.I)

At the end of each successful continuous integration build and test suite, we label the workspace with a certified build tag within source control. This allows two-way traceability between the build sequence number and the tag name for QA purposes. Additionally, we can do a simple lookup on the build number in Hudson to get the Subversion revision number.

Below are two examples of how we have accomplished this.

Maven 1 and cruisecontrol and cvs:

We wrote a custom jelly goal that called ant's cvs task.

<goal name="nct:createcertifiedtag">
  <ant:cvs command="tag certified-build-${label}" />
</goal>

Notice the "label" property. Cruisecontrol provides maven that property to use at runtime with the value set to the build number. We use this custom goal at the end of the cruisecontrol project's maven goal element:

<maven projectfile="${PROJECT_ROOT}/project.xml" goal="clean install nct:createcertifiedtag" />

Maven 2 and hudson and subversion:

In the maven pom.xml (or the parent pom.xml), specify all the source control details so things become easier later. For example, our build includes:

<scm>
<connection>scm:svn:https://svnhost/svn/sto/trunk</connection>
<developerConnection>scm:svn:https://svnhost/svn/sto/trunk</developerConnection>
<url>https://svnhost/svn/sto/trunk</url>
</scm>

Hudson provides maven a "hudson.build.number" property to use at runtime, populated with the build number. We use it by referencing it on the Goals line in the Hudson job configuration. Additionally, we improved on using an external process call to 'svn' by using the maven 2 SCM plugin:

clean install scm:tag -Dtag=certified-build-${hudson.build.number}


[update: ${hudson.build.number} seems to be buggy. I have successfully used ${BUILD_NUMBER} in its place]
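
The mapping between build number and tag name used in both setups can be sketched in a few lines of shell. This is only an illustration, not part of our build: the "certified-build-" prefix comes from the examples above, while the function names are hypothetical and the build number is assumed to be a plain integer such as the BUILD_NUMBER value Hudson supplies.

```shell
#!/bin/sh
# Hypothetical helpers: map a CI build number to a certified tag name and back.
# Only the "certified-build-" prefix is taken from the post.

certified_tag() {
    # $1 = build number (assumed numeric, e.g. the value of BUILD_NUMBER)
    case "$1" in
        ''|*[!0-9]*)
            echo "build number must be numeric: '$1'" >&2
            return 1
            ;;
    esac
    printf 'certified-build-%s\n' "$1"
}

build_number_from_tag() {
    # $1 = tag name; prints the build number, fails for any other kind of tag
    case "$1" in
        certified-build-*) printf '%s\n' "${1#certified-build-}" ;;
        *) return 1 ;;
    esac
}
```

Going from a tag back to its build number is what makes the lookup in the CI server possible when QA only has the tag name in hand.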

Thursday, October 18, 2007

Continuous Integration Strategies (Part I)

Continuous integration is a powerful concept, usually associated only with compilation and unit testing. However, there is additional benefit to be had if you look beyond unit testing. I would like to present some strategies I have tried that allow full suites of tests to be run in orderly stages, from unit tests to integration tests to acceptance tests. For this series, unit tests are defined as single-class tests with no external dependencies on network or container resources. Integration tests are white-box tests of class interactions, and acceptance tests are black-box system tests.

This first post on the subject deals with strategies using maven and cruisecontrol. Later posts will move on to maven 2 and hudson.

First off, a lesson learned. When we first migrated from ant to maven, we were not sure how best to configure cruisecontrol to handle CI. Our code base is a large selection of components that comprise a toolset of capabilities. There are many small projects that build on each other, so there are many dependencies on our own artifacts. In fact, from a maven point of view, we could build our toolset with a single, rather large, multiproject build.

It seemed logical to map each component maven project (itself a multiproject consisting of api + implementations + tests) to a cruisecontrol project. That presented a nice one-to-one view of the system on the build status page. Each project was independently triggered via cvs commits. This seemed to work for a while, but it became clear that the setup was highly unstable: commits spanning multiple cruisecontrol projects would trigger the builds in an unpredictable order, causing the build to break or tests to fail.

The lesson learned and correction we took was to not fight maven, but let it determine the build order from start to finish. So we created a single cruisecontrol project, pointed it at the top-most maven project.xml with the goal multiproject:install and the property -Dmaven.test.failure.ignore=true. Then for each component project we wanted test status granularity on, we created a cruisecontrol project that ran a custom maven plugin that scanned the test-results directory and failed that project if test failures were found. Additionally, that cruisecontrol project also used <merge> to aggregate maven's test-results files so our developers could drill down and see which test failed and the details why.



As a quick aside, Hudson has very nice maven integration that mimics (and improves on) this kind of setup automatically.

Our next step was to enable a controlled progression of testing, where all unit tests would run first, followed by integration tests only if all unit tests passed. This was accomplished in three steps: 1) one maven multiproject build responsible for compiling and unit testing, 2) a custom test failure check plugin (basically find + grep) serving as the go/no-go gate, then 3) another maven multiproject build running only integration tests. This three-step orchestration was handled by a custom maven plugin running a mix of jelly and shell scripting.
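
To give a feel for the find + grep gate in step 2, here is a minimal shell sketch. It is not our actual plugin: it assumes Ant/JUnit-style report files named TEST-*.xml whose <testsuite> element carries failures="N" and errors="N" attributes on a single line, and the function names are made up for illustration.

```shell
#!/bin/sh
# Go/no-go gate sketch: report NO-GO if any JUnit-style XML report under a
# directory records a failure or an error.
# Assumptions: reports are named TEST-*.xml and the <testsuite ...> opening
# tag, with its failures/errors attributes, sits on one line.

has_test_failures() {
    # $1 = root directory to scan
    # grep -l lists report files whose failures or errors count starts with
    # a non-zero digit; the final grep -q . succeeds only if a file was listed.
    find "$1" -type f -name 'TEST-*.xml' \
        -exec grep -lE '<testsuite[^>]* (failures|errors)="[1-9]' {} + 2>/dev/null \
        | grep -q .
}

unit_test_gate() {
    # $1 = root directory to scan; returns 0 for GO, non-zero for NO-GO
    if [ ! -d "$1" ]; then
        echo "NO-GO: report directory not found: $1" >&2
        return 2
    fi
    if has_test_failures "$1"; then
        echo "NO-GO: test failures found under $1" >&2
        return 1
    fi
    echo "GO: no test failures under $1"
}
```

A wrapper script could then start the integration pass only on GO, for example by chaining the gate and the next build with &&.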

The different types of tests were in different directories, as maven subprojects, under the component, so we were able to use maven.multiproject.includes and maven.multiproject.excludes on the directory names to achieve steps 1) and 3) above.

To facilitate the including and excluding for the unit test pass, the ~/build.properties included these properties:

maven.multiproject.includes=**/project.xml
maven.multiproject.excludes=project.xml,*/project.xml,**/inttest/project.xml,**/tck/project.xml

For the integration test pass, the ~/build.properties included these properties:

gcp.integration.multiproject.includes=**/inttest/project.xml
gcp.integration.multiproject.excludes=**/tck/project.xml

and the plugin goal to actually run the integration tests looked like this:

<goal name="gcp:integration-tests">
  <j:set var="usethese" scope="parent" value="${gcp.integration.multiproject.includes}"/>
  <j:set var="notthese" scope="parent" value="${gcp.integration.multiproject.excludes}"/>
  <j:set var="thisgoal" scope="parent" value="test:test"/>
  ${systemScope.put('maven.multiproject.includes', usethese)}
  ${systemScope.put('maven.multiproject.excludes', notthese)}
  ${systemScope.put('goal', thisgoal)}
  <maven:maven descriptor="${CC_HOME}/checkout/gcp/project.xml" goals="multiproject:goal"/>
</goal>

The crazy jelly maneuvering to get the includes and excludes properties to stick after the call to maven:maven is a story for another day. (If you want a sneak peek, however, start here.) The concept of dynamically using maven properties to properly set up the integration test run should still be clear.
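
As a rough illustration of which project.xml files the include/exclude patterns above select, here is a shell approximation using find. Treat it as a sketch only: maven evaluates these properties as Ant-style globs, the function names are hypothetical, and -mindepth is a GNU/BSD find extension rather than POSIX.

```shell
#!/bin/sh
# Approximate the multiproject include/exclude patterns with find.
# Paths are printed relative to the checkout root, prefixed with "./".

unit_test_projects() {
    # $1 = checkout root
    # includes: **/project.xml
    # excludes: project.xml and */project.xml (the top two directory levels,
    #           hence -mindepth 3), plus any inttest/ or tck/ project.
    ( cd "$1" && find . -mindepth 3 -type f -name project.xml \
        ! -path '*/inttest/project.xml' \
        ! -path '*/tck/project.xml' | sort )
}

integration_test_projects() {
    # $1 = checkout root
    # includes: **/inttest/project.xml
    # A path ending in inttest/project.xml can never also end in
    # tck/project.xml, so the tck exclude needs no extra filter here.
    ( cd "$1" && find . -type f -path '*/inttest/project.xml' | sort )
}
```

Running both functions against a checkout is a quick way to confirm that every subproject lands in exactly the pass you intended before wiring the patterns into build.properties.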

Hopefully, this first post in a series will help you think about ways to get more out of your CI environment. I'm curious to know what you think about the strategy we took and I invite you to share how you accomplish CI for your projects.

Wednesday, October 10, 2007

Hudson, At Your Continuous Integration Service


I love Hudson. I have previously been a CruiseControl fan, but no longer. Hudson just works and has some excellent integration features with Maven2 and Subversion.

Generally, Hudson does the kind of stuff you would expect from a CI engine:

  1. Easy installation: Just java -jar hudson.war, or deploy it in a servlet container. No additional install, no database.
  2. Easy configuration: Hudson can be configured entirely from its friendly web GUI with extensive on-the-fly error checks and inline help. There's no need to tweak XML manually anymore, although if you'd like to do so, you can do that, too.
  3. Change set support: Hudson can generate a list of changes made into the build from CVS/Subversion. This is also done in a fairly efficient fashion, to reduce the load of the repository.
  4. RSS/E-mail Integration: Monitor build results by RSS or e-mail to get real-time notifications on failures.
  5. JUnit/TestNG test reporting: JUnit test reports can be tabulated, summarized, and displayed with history information, such as when it started breaking, etc. History trend is plotted into a graph.
  6. Distributed builds: Hudson can distribute build/test loads to multiple computers. This lets you get the most out of those idle workstations sitting beneath developers' desks.
  7. Plugin Support: Hudson can be extended via 3rd party plugins. You can write plugins to make Hudson support tools/processes that your team uses.
Then the cool stuff kicks in!

Since each build has a persistent workspace, we can go back in time to see that workspace to trace what happened. This means Hudson can do after-the-fact tagging and has permanent links to all builds, including "latest build"/"latest successful build", so that they can be easily linked from elsewhere.

I really like the matrix style jobs that you can create. For a matrix build, you can specify the JDK version as one axis and a slave Hudson (distributed builds) as another axis. Add to that a third axis of arbitrary property/value pairs that your build understands and can act on (e.g., think mvn -PmyProfile or -Dapp.runSystemTests=true or -Ddatabase.flavor=mysql). This is powerful stuff right out of the box to support multiple compilers, OSes, etc., without having to create a new job for each configuration.
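
To make the idea of axes concrete, here is a small shell sketch that enumerates the configurations a three-axis matrix implies. It only illustrates the combinatorics and is not how Hudson implements matrix jobs; the function name and the axis values in the usage example are made up.

```shell
#!/bin/sh
# Enumerate every combination of three matrix axes, one line per configuration.

matrix_configurations() {
    # $1 = space-separated JDK labels
    # $2 = space-separated build node names
    # $3 = space-separated extra build arguments
    # The unquoted expansions below are deliberate: each axis is split on spaces.
    for jdk in $1; do
        for node in $2; do
            for arg in $3; do
                printf 'jdk=%s node=%s args=%s\n' "$jdk" "$node" "$arg"
            done
        done
    done
}

# Example: 2 JDKs x 2 nodes x 2 property values gives 8 configurations.
matrix_configurations "jdk1.5 jdk1.6" "node-a node-b" \
    "-Ddatabase.flavor=mysql -Ddatabase.flavor=derby"
```

Without a matrix job, each of those lines would be a separately maintained job definition.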

For maven 2 projects, Hudson will autodiscover the <modules> in a multiproject build and list them as sub-jobs in Hudson, complete with their own status and viewable workspaces. Project artifacts are linked from the build status pages.

The plugins are also starting to become plentiful. There are plugins for JIRA and Trac integration, code violations charting, publishing builds to Google Calendar (!), and many more. The one with the most potential, I think, is the Jabber plugin. Not only can it send IM notifications to an individual or a group, but the newest (cvs head) version comes with a bot that you can interact with to schedule builds, get project statuses, and monitor the build queue.

As always, sometimes small things mean a lot. An example I appreciate in Hudson is seeing the build in progress scrolling by via the browser. For CruiseControl, you would have to log in to the build box and "tail -f" to get the same real-time information.

I'm not a Groovy fan yet, but I was blown away when I found a built-in Groovy console right there in the web UI! You can use it for troubleshooting and diagnostics of your builds or plugins.

All in all, I am really, really impressed with Hudson as a product, and with the support and development going on around it. There is a new version released literally every week. It makes me, for the first time, want to contribute to an OSS project. This is good stuff. Check it out.

Monday, October 8, 2007

"Continuous Partial Attention"


I have blogged about being a "knowledge worker" before, but just found another good slideshare resource about it thanks to a link by Jim S.

The funniest thing was on slide 25 where a knowledge worker is described as having "Continuous Partial Attention."

I love that description.
I work with several people who exhibit this trait, and it is one of the key characteristics, I think, that makes them successful in the workplace. To be sure, they are also criticized because their team-level focus seems to be lacking; however, this is made up for by the bigger benefit to the company. This is just another way to transcend team boundaries and have organization-wide impact.

And the reason it's funny is because it's true.