Addressing Performance Problems Of Continuous Integration Builds

Fast integration builds are one of the keys to low-risk software development. This article discusses simple and advanced approaches to improving the performance of integration builds.

Notion Of An Integration Build

A software build is the process of transforming a project code base into usable applications. An integration build is a build performed to ensure that new changes integrate well into the existing code base. Integration builds provide feedback on the quality of new changes. This feedback is used to deliver timely fixes if the changes fail to integrate and break the project code base. Integration builds are often run by a dedicated build and release management system and are triggered when new changes are detected in a version control system. Running integration builds continuously is known as continuous integration.

Performance Requirements For Integration Builds

The main value of an integration build is its feedback on the quality of new changes. With fast builds, build breakage can be identified and addressed in less time, which reduces the risk of delays in project delivery. That is why build speed is the most important characteristic of an integration build.

Approaches For Addressing Build Speed Concerns

Software build processes are highly I/O, memory and computationally intensive. It is possible to address build speed concerns by increasing the computational resources available to a build server. Build performance can be further improved by applying advanced approaches, such as partitioning build server load, parallelizing the build process and partitioning test load.

Adding Computational Resources

Build speed can be improved by adding computational resources to a build and release management system. Such resources include CPU speed, number of CPUs, RAM, and disk and network I/O. Yet vertical scalability is limited, because resources cannot be added to a single build machine indefinitely. More advanced options should be considered to scale build performance further.


Picture 1. Increasing build speed by adding computational resources

Parabuild 2.0 makes effective use of a build server's resources by running multiple builds in parallel.
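
The idea of running several independent builds at once on one machine can be sketched in a few lines of Python. This is only an illustration and not Parabuild's implementation: the function names run_build and run_builds_in_parallel are hypothetical, and the build commands are placeholders standing in for real build tools. The number of concurrent builds is capped at the CPU count by default, on the assumption that each build is mostly CPU bound.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_build(name, command):
    """Run one build command and return (name, exit code)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return name, completed.returncode


def run_builds_in_parallel(builds, max_workers=None):
    """Run independent builds concurrently, at most max_workers at a time.

    builds maps a build name to its command line (a list of strings).
    Returns a dict mapping each build name to its exit code.
    """
    max_workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_build, name, cmd) for name, cmd in builds.items()]
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    # Placeholder commands (assumption); a real setup would call the project's build tool.
    builds = {
        "project-a": [sys.executable, "-c", "print('building a')"],
        "project-b": [sys.executable, "-c", "print('building b')"],
    }
    print(run_builds_in_parallel(builds))
```

Threads are sufficient here because the heavy work happens in the child processes; the cap on workers is what keeps a single box from being oversubscribed.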

Partitioning Build Server Load

A single-box build machine soon becomes either I/O bound or CPU bound. Financial concerns may also limit growth, because SMP systems with a high number of processors can be expensive. Once the limit of vertical scalability is reached, the load on the build server can be distributed by moving some or all of the build processes to relatively inexpensive build machines that run in a remote builder mode and are controlled by a central build manager.


Picture 2. Moving load from a build and release management system to remote builders

Parabuild 2.0 allows scaling build infrastructure beyond a single box or a single operating system by providing remote multi-platform builds.
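
How a central build manager might spread builds across remote builders can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of how Parabuild schedules work: the RemoteBuilder and BuildManager classes are hypothetical, and "load" is approximated by the number of builds currently assigned to a builder. Each build goes to the least loaded builder that matches the required platform.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteBuilder:
    """A remote build machine known to the central build manager (hypothetical model)."""
    host: str
    platform: str
    active_builds: list = field(default_factory=list)


class BuildManager:
    """Central manager that assigns each build to the least loaded matching builder."""

    def __init__(self, builders):
        self.builders = list(builders)

    def dispatch(self, build_name, platform):
        """Assign a build to a builder for the given platform and return its host."""
        candidates = [b for b in self.builders if b.platform == platform]
        if not candidates:
            raise LookupError(f"no remote builder for platform {platform!r}")
        # min() returns the first builder among those with the fewest active builds.
        target = min(candidates, key=lambda b: len(b.active_builds))
        target.active_builds.append(build_name)
        return target.host

    def finished(self, build_name):
        """Release the builder slot once a build completes."""
        for builder in self.builders:
            if build_name in builder.active_builds:
                builder.active_builds.remove(build_name)
                return


if __name__ == "__main__":
    manager = BuildManager([
        RemoteBuilder("linux-1", "linux"),
        RemoteBuilder("linux-2", "linux"),
        RemoteBuilder("win-1", "windows"),
    ])
    for name, platform in [("core", "linux"), ("ui", "linux"), ("installer", "windows")]:
        print(name, "->", manager.dispatch(name, platform))
```

A production scheduler would likely weigh CPU and I/O utilization rather than a simple build count, but the count is enough to show how load moves off the central server.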

Parallelizing Build Process

Partitioning the load on the build and release management system keeps a single build box from being overloaded, but the time of an individual build remains limited by the performance of the machine that runs it. Build time can be decreased further by running parts of a single build process in parallel.

A typical build process is made of a sequence of steps that normally execute one after another. Depending on how the build steps depend on each other, some of them can be run in parallel by a set of remote machines. Certain individual steps, such as compilation, may be parallelized as well. The approach in which a set of remote machines executes parts of a particular build step in parallel is known as build clustering.


Picture 3. Parallelizing build execution using build clustering
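
Running build steps in parallel according to their dependencies can be sketched as below. The sketch assumes Python 3.9 or later for the standard graphlib module, uses a local thread pool where a real cluster would use remote machines, and the step names and the run_steps function are hypothetical. A step is started as soon as every step it depends on has finished.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_steps(dependencies, run_step, max_workers=4):
    """Run build steps, starting each one once its prerequisites have finished.

    dependencies maps a step name to the set of steps it depends on.
    run_step is called with a step name and performs that step.
    Returns the steps in the order in which they were observed to complete.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    running = {}  # future -> step name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every step whose prerequisites are all done.
            for step in sorter.get_ready():
                running[pool.submit(run_step, step)] = step
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise the error if the step failed
                step = running.pop(future)
                finished.append(step)
                sorter.done(step)  # may unlock dependent steps
    return finished


if __name__ == "__main__":
    # Hypothetical build: compile and docs can run in parallel after checkout.
    steps = {
        "checkout": set(),
        "compile": {"checkout"},
        "docs": {"checkout"},
        "test": {"compile"},
        "package": {"test", "docs"},
    }
    print(run_steps(steps, lambda step: print("running", step)))
```

In this example the total build time is bounded by the longest dependency chain (checkout, compile, test, package) rather than by the sum of all steps.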

Partitioning Test Load

Unit tests running in batch mode are an important part of a successful integration build because they validate that new changes do not introduce regressions into the code base. Over time the number of tests grows, and so does the time needed to run them. Running tests is subject to the same performance concerns as software builds. Test performance can be improved by breaking a test set into groups that are deployed to remote builders. Each remote builder runs its own test group in parallel with the other remote builders. When all test groups have completed, the test results are consolidated at the build and release management system and presented for failure analysis or archived.


Picture 4. Partitioning test execution
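
Splitting a test set into groups and consolidating the results can be sketched as follows. The sketch assumes that a recent run time is known for each test and balances the groups with a greedy longest-first heuristic, which is a common choice but not guaranteed to be optimal. The function names and the test names are hypothetical.

```python
import heapq


def partition_tests(durations, group_count):
    """Split tests into group_count groups with roughly equal total run time.

    durations maps a test name to its last known run time in seconds.
    The longest remaining test always goes to the currently lightest group.
    """
    if group_count < 1:
        raise ValueError("group_count must be at least 1")
    groups = [[] for _ in range(group_count)]
    heap = [(0.0, index) for index in range(group_count)]  # (total time, group index)
    longest_first = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for test, seconds in longest_first:
        total, index = heapq.heappop(heap)
        groups[index].append(test)
        heapq.heappush(heap, (total + seconds, index))
    return groups


def consolidate(group_results):
    """Merge per-builder results (test name -> passed) into a single report."""
    merged = {}
    for results in group_results:
        merged.update(results)
    failed = sorted(name for name, passed in merged.items() if not passed)
    return {"total": len(merged), "failed": failed}


if __name__ == "__main__":
    durations = {"test_db": 8, "test_ui": 7, "test_api": 6, "test_io": 5, "test_util": 4}
    print(partition_tests(durations, 2))
    print(consolidate([{"test_db": True, "test_io": False}, {"test_ui": True}]))
```

With balanced groups, the wall-clock time of the test phase approaches the run time of the slowest group instead of the sum of all tests.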

Conclusion

The performance of integration builds can be improved by dedicating adequate hardware resources to the software build and release management system, and by applying advanced techniques such as build load partitioning, parallel (clustered) builds and partitioning of test load.