Continuous Integration: what should I automate?
- Run tests.
It shouldn’t need saying, but I’ve seen too many projects where “continuous integration” consists of nothing more than making sure the code compiles. A green light from your continuous integration process should mean that your software is working – and you can’t have any confidence in this without an automated test suite. A good initial test suite should cover as much of your codebase as possible, run as fast as possible (your main build should never take more than ten minutes), and not depend on infrastructure like databases or messaging busses. Use techniques such as stubbing and mocking to simulate these dependencies instead.
- Produce a deployable binary.
This might be a jar that you can deploy to your web application server, or an MSI file that installs your application. Whatever the method you use to install your software, your CI process should do the necessary packaging and make the end result available as an artefact. Quality analysts can then download the package from the CruiseControl dashboard and get to work testing, decreasing the time lag between developers checking in and getting feedback on whether their code is working. Making your packaging process part of your CI build also means it’s continually tested, saving you headaches further down the line.
- Deploy your application into a production-like environment.
Getting your application working in a non-development environment is often a major (and sometimes unforeseen) cause of pain on a project. This is especially true when your software has to work in many different environments (for example, desktop applications) or in complex production server environments with many integration points. Automating your deployment process and testing it as part of your automated build can substantially relieve these problems. Furthermore, automating deployment makes it easier for your testers to test your software “for real” – giving you even more confidence that your software is really ready to use.
- Tag your code base.
It’s vital to be able to correlate your binaries with the version of the source code that was used to build them. If you’re only using a single repository and have a source control system like Subversion that has atomic commits, it’s quite easy to make CruiseControl display the revision number. However, if you’re using a source control system like CVS or Visual SourceSafe that doesn’t have atomic commits, or your source code is spread across several repositories, you’ll want to tag your source code with the build number. That way, when a bug is discovered in a production or testing environment, you can easily work out the exact source and configuration that went into creating the binary and environment. Tagging will also help you to conform to auditing requirements like those in Sarbanes-Oxley.
- Run your regression tests.
If you have long-running acceptance or regression tests, you’ll want to have these automated and run regularly. You won’t want to put them in your main build – as I said above, this should give your developers feedback within ten minutes of checking in. However you should at least run them every night, and possibly even more regularly, as a separate build process. Not only will this give you extra confidence that your software is working, it will also make sure you keep your regression and acceptance tests up-to-date – when your codebase is changing rapidly, it’s easy for acceptance tests to fail because they no longer reflect requirements rather than because of any bugs in the code. Using CruiseControl to run your acceptance tests regularly, and taking broken builds seriously, will focus your attention on keeping your acceptance tests fresh and your software working.
- Generate metrics.
You have to be careful with metrics, since they will strongly affect the behaviour of your developers – and not necessarily in a way that improves your codebase. For example, measuring the percentage of your code covered by tests is probably the most common metric incorporated into the build process – but it is easy to write trivial, practically useless tests to increase code coverage. Nevertheless, if you have agreement from the whole team, adding metrics such as code coverage, style checking, dependency analysis and cyclomatic complexity to your build is a great way for developers and testers to get confirmation that their changes are improving the codebase. The latest version of CruiseControl (2.7 at the time of writing) includes Panopticode, which lets you visualise all these metrics and more out of the box.
Comments
-
Anton, June 27th, 2007 @ 02:02 PM
Hi! "A good initial test suite should cover as much of your codebase as possible, run as fast as possible (your main build should never take more than ten minutes), and not depend on infrastructure like databases or messaging busses."
How about "big" distributed applications? E.g., the build process for our system takes more than 3 hours. I don't believe that there can be any reasonable test for such a codebase that takes 10 minutes. In general, do you have experience or knowledge about CI and unit testing for distributed applications? Namely, CORBA-based ones.
-
Jez Humble, June 28th, 2007 @ 07:57 PM
Hi Anton. Thanks for your comments.
I am guessing your 3 hour test suite runs in a production-like environment and tests functionality end-to-end. This suite is probably a combination of acceptance tests and integration tests. I would run this as a second cruise project, as described in point five above.
The way you get a ten minute unit test suite is by testing the behaviour of your business logic in isolation from your infrastructure. So you might have a class that has a particular responsibility, and you test that, stubbing or mocking out its dependencies on infrastructure like messaging busses or CORBA infrastructure or databases or whatever. This means the tests run very fast, and test *only* the business logic. This kind of testing, known as unit testing, was the main driver behind dependency injection frameworks like Spring and PicoContainer. This kind of design also has the advantage that it isolates your business logic from your infrastructure.
This is the way we write large systems here at TW - and yes, we regularly write large, distributed applications in this way. Our unit test suites do sometimes take longer than ten minutes to build, especially with the largest systems we deal with (which sometimes have tens of thousands of unit tests) - but these can generally be broken down into subsystems in any case.
Thanks,
Jez Humble
Delivery Manager, CruiseControl Enterprise
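As a rough illustration of the stubbing and mocking approach described in this reply, here is a minimal Python sketch. The `OrderProcessor` class, its `place` method, and the `publish` call on the bus are hypothetical names invented for the example; they do not come from any real system mentioned in the thread. The point is only that the infrastructure dependency is passed in, so a test can replace it with a mock and exercise the business logic alone.

```python
from unittest.mock import Mock


class OrderProcessor:
    """Hypothetical business logic; the message bus is injected, never created here."""

    def __init__(self, bus):
        self.bus = bus

    def place(self, order_id, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        # In production, `bus` would wrap the real messaging or CORBA layer.
        self.bus.publish("orders", {"id": order_id, "quantity": quantity})
        return True


def test_place_publishes_order():
    bus = Mock()  # stands in for the real infrastructure
    processor = OrderProcessor(bus)
    assert processor.place(42, 3) is True
    bus.publish.assert_called_once_with("orders", {"id": 42, "quantity": 3})


def test_place_rejects_bad_quantity():
    bus = Mock()
    processor = OrderProcessor(bus)
    try:
        processor.place(42, 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    # Nothing should have been sent when validation fails.
    bus.publish.assert_not_called()


test_place_publishes_order()
test_place_rejects_bad_quantity()
```

Because no network, database, or ORB is touched, tests written this way typically run in milliseconds each, which is what makes a large suite feasible inside a short build.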
-
Paul Duvall, June 28th, 2007 @ 08:48 PM
"A green light from your continuous integration process should mean that your software is *working*" - Right on, Jez. Nice post. This is a key differentiator of CI - Paul
-
Bob Hanke, July 15th, 2007 @ 12:20 AM
I have been reading these articles about using CI. However, I am in the same boat as Anton. Our server/client product takes 3+ hours to build on a dedicated build machine with no tests being run. The process involves retrieving source code and building all libraries, servers, desktop applications, and install software. Our code is mainly C++, which does take longer to build than Java. A lot of our servers and desktop exes have many dependencies on our internal libraries. So if a major change were made to a low-level library that almost every other library, server, or desktop app uses, wouldn't that mean a rebuild of the entire system? In addition, I don't see how I could possibly reduce the entire build to 10 minutes. Most modules will build independently in 10 minutes or less.
Am I missing something here in understanding how CI is used? I can see it being used if the desktop apps are the ones being modified on a regular basis. Those are the end point of the dependency chain, are quick to build, and can kick off tests. But if we change something that is further up the chain, how would we handle that?
Also along those lines, how do you handle a change to the API of a library that other modules use? Your description of CI assumes that either what I check in will not break anybody else's module, or everyone checks in the updated code at the same time. Neither of these is true. Usually it takes time for the lib to get modified and rebuilt. Then each owner of a dependent module changes code, does a unit build, tests, and commits, working down the chain until all modules have had a chance to do this. Then we get a successful build.
Please give some guidance.
Thanks,
Bob
-
billie, July 17th, 2007 @ 11:57 PM
Great article! Maybe when you get a chance you could go into more depth on various database automation tools and practices that are in use, such as schema generation and database migrations.
-
Jez Humble, August 1st, 2007 @ 07:43 AM
Hi Bob. Thanks for your comment.
The 10 minute case I discussed in the article is not too difficult to do with Java / .NET, even on quite large systems. However, as you point out, things are harder when you're trying to do continuous integration on very large C++ projects.
So going back to first principles, the goal is that after a developer does a check-in, they should get feedback on whether they have broken anything important within 10 minutes of doing so. This feedback need not (and indeed it cannot) be definitive - but it should at least include the unit tests (http://en.wikipedia.org/wiki/Unit_testing) for the module concerned in isolation.
So what I would do in your project is have a single CruiseControl project for each module of your software. This would compile just the module, and run the relevant unit tests on it. Once you have this completed for each module, you can move on to the next stage, where you deal with changes that break the external interfaces of your frameworks / libraries.
This is where you need to look at pipelining your builds such that the results of successful builds of upstream (lower level) stages are preserved and available for use by downstream (higher level) stages. In this approach, a successful build of a given module then triggers builds of the modules that depend on it. There are many ways to do this - we'll be covering them in future articles here on the Studios blog - but one possibility is to have your build script publish binaries to (say) a git repository upon a successful build (more on how to do this in our next CCE blog entry).
The dependent stages in your pipeline (say your server code) could check this repository and trigger builds based on that as well as on check-ins to their source code. That way breakages in upstream modules don't affect downstream ones - since only successful builds trigger the next stage in the pipeline - but new versions of upstream modules/libraries automatically trigger builds of the dependent modules. Interface changes may then cause these downstream modules to break - but that is the signal that the upstream dependency built successfully and passed its unit tests, and that it is now time for the developers on the dependent modules to update them.
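The triggering rule described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the module names and the `DEPENDS_ON` graph are made up, `build` is a placeholder callable standing in for a real per-module build, and `graphlib` requires Python 3.9 or later. It is not how CruiseControl itself is configured.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each module maps to the upstream modules it depends on.
DEPENDS_ON = {
    "corelib": set(),
    "netlib": {"corelib"},
    "server": {"corelib", "netlib"},
    "desktop": {"netlib"},
}


def run_pipeline(changed, build, depends_on=DEPENDS_ON):
    """Build `changed`, then every module triggered by a successful upstream build.

    `build` takes a module name and returns True on success.
    Returns a dict mapping each module that was built to its result.
    """
    results = {}
    triggered = {changed}
    # static_order() yields upstream modules before their dependents.
    for module in TopologicalSorter(depends_on).static_order():
        if module not in triggered:
            continue
        results[module] = build(module)
        if results[module]:
            # Only a successful build triggers the modules that depend on it.
            triggered.update(
                name for name, deps in depends_on.items() if module in deps
            )
    return results


# A check-in to corelib where the netlib build then breaks:
outcome = run_pipeline("corelib", lambda module: module != "netlib")
```

In this run `server` is still built, because the successful `corelib` build triggered it, while `desktop` is not, because its only upstream module failed. A downstream module that is built after an upstream failure would pick up the last successfully published version of that upstream module.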
Clearly with this approach the whole pipeline will take a while to run. However, in this scenario you can start getting feedback without having to wait for the whole codebase to rebuild. You would probably need a final stage to assemble together and deploy the application using the last known good versions of all the modules (which shouldn't be too hard if you published everything into git), and run your full test suite. You could have this scheduled nightly.
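Picking the "last known good" version of each module for that final stage could look something like the following Python sketch. The build history format (module name, build number, success flag) is an assumption made for the example; in practice the information would come from wherever your builds publish their results.

```python
def last_known_good(history):
    """Return the highest successful build number for each module.

    `history` is an iterable of (module, build_number, succeeded) tuples
    in any order; modules with no successful build are left out.
    """
    good = {}
    for module, number, succeeded in history:
        if succeeded and number > good.get(module, -1):
            good[module] = number
    return good


def missing_modules(required, good):
    """List the required modules that have no successful build yet."""
    return sorted(set(required) - set(good))


# Hypothetical history: corelib build 18 failed, so build 17 is used.
history = [
    ("corelib", 17, True),
    ("corelib", 18, False),
    ("netlib", 9, True),
    ("server", 30, True),
    ("server", 31, True),
]
versions = last_known_good(history)
```

A nightly assembly job would refuse to run if `missing_modules` returns anything, and would otherwise fetch exactly the versions listed in `versions` before deploying and running the full test suite.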
Hopefully this addresses some of your concerns. Please get in touch with us for more information - you can email me at jez at thoughtworks dot com.
-
Eric, April 16th, 2008 @ 02:30 AM
Tagging really isn't required in CVS either. Just do your checkouts from a time stamp (roughly when the trigger fired). Bam. You have a branch and a date-time stamp. It's an absolute set of source code and harder for people to sneak around and modify than the contents at a tag. You add zero seconds to your build process, make it easier to repeat and have all the auditability you want.

