Software build management has been around almost as long as software itself, tracing its roots to 1952, when Grace Murray Hopper devised the first compiler and the need to transform source code into executable binaries emerged. The process of transforming a project's code base into usable applications became known as a "build". Many argue that a build does not even require compiling source code (think Perl).
Yet, for reasons still to be discovered, many software organizations under-plan the build and release management systems that run Continuous Integration and nightly builds, which severely affects their ability to deliver working software. This article addresses gaps in understanding the driving forces behind capacity planning and provides best practices for selecting proper hardware for software build and release management systems.
A software build and release management system is an integral part of the Application Lifecycle Management (ALM) infrastructure. A typical ALM infrastructure consists of a version control system, a bug or an issue tracking system, and a build server.
Picture 1. Typical Application Lifecycle Management (ALM) infrastructure
Software build processes such as Continuous Integration and daily builds are highly I/O-, memory-, and CPU-intensive. Dedicating adequate resources to the build and release management system ensures efficient software development.
Build processes are usually CPU-bound. The CPU speed of the build server should be at least equal to that of the fastest development machine. A better approach is to obtain the fastest CPU your budget allows. Faster builds produce quicker feedback, saving both time and money.
The build server should be a dedicated multiprocessor system. Multiprocessor systems are better suited to running several computationally intensive tasks at once, which is what concurrent builds are. The number of CPUs depends on the number of build configurations and the percentage of time each build runs. It can be estimated as the sum of the build time percentages divided by one hundred, rounded up to the next of two, four, eight, 16, or 32. For example, if the build server runs eight builds and each build runs 20% of the time, the total is 160%, or 1.6 CPUs, which rounds up to two.
Sometimes it is not possible to follow the recommendations above for financial reasons: SMP systems with a high number of processors can be expensive. In this case the load on the build server can be distributed by moving some or all of the builds to relatively inexpensive build machines that run in remote builder mode under the control of a central build manager. This approach allows the build infrastructure to scale as the load grows while staying on budget.
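The CPU estimate can be sketched in a few lines of Python. This is only an illustration of the rule of thumb: the function name `estimate_cpu_count` and the list `CPU_STEPS` are hypothetical, and treating a load that lands exactly on a step as already satisfied by that step is an assumption.

```python
# Allowed CPU counts taken from the rule of thumb in the text.
CPU_STEPS = (2, 4, 8, 16, 32)

def estimate_cpu_count(build_time_percentages):
    """Estimate CPUs from the percentage of time each build runs.

    build_time_percentages: one value per build configuration, 0 to 100.
    """
    # Sum of the percentages divided by one hundred gives the average
    # number of builds running at the same time.
    load = sum(build_time_percentages) / 100.0
    # Round up to the next allowed step (assumption: an exact match counts).
    for step in CPU_STEPS:
        if load <= step:
            return step
    raise ValueError("load exceeds 32 CPUs; consider remote builders")

# Eight builds, each running 20% of the time, as in the example.
print(estimate_cpu_count([20] * 8))
```

Passing the percentages as a list lets builds with different duty cycles be mixed, for example `[50, 50, 50, 50, 50]`.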
Picture 2. Moving load from build manager to remote builders
Parabuild supports the remote builder mode.
Selecting an adequate amount of RAM for the build server is important. With too little physical memory, builds can be slow because of swapping. The minimum amount of RAM can be estimated as the sum of the RAM needed to run each build, plus the RAM needed by each version control client running a full checkout, plus the RAM needed by the operating system and system processes. We recommend multiplying the result by a coefficient of 1.2 to offset unaccounted factors. Example: let the server run eight builds, where the OS needs 120MB, each build run needs 100MB, and each version control client needs 1MB. The minimum size of RAM is going to be
(120MB + 8*100MB + 8*1MB) * 1.2 = 1,113.6MB
or roughly 1GB.
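The same RAM estimate, written as a small Python sketch. The function and parameter names are hypothetical, and the sketch assumes, as the example does, that every build and its version control client can be in memory at the same time.

```python
def estimate_min_ram_mb(os_ram_mb, build_ram_mb, vcs_client_ram_mb,
                        n_builds, margin=1.2):
    """Estimate the minimum RAM in megabytes for a build server.

    margin is the 1.2 coefficient recommended in the text to offset
    unaccounted factors.
    """
    # OS and system processes, plus every build, plus one version
    # control client per build.
    base = os_ram_mb + n_builds * build_ram_mb + n_builds * vcs_client_ram_mb
    return base * margin

# Values from the example: 120MB for the OS, eight builds at 100MB each,
# and eight version control clients at 1MB each.
ram_mb = estimate_min_ram_mb(120, 100, 1, n_builds=8)
print(round(ram_mb, 1))
```

Keeping `margin` as a parameter makes it easy to see how sensitive the result is to the safety coefficient.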
Free disk space needed on the build server depends on the number of build configurations, the number of build runs per day, and the size of the build artifacts placed in the build archive after each build run. The minimum free disk space in megabytes can be estimated with this formula
Sz = (Nbuilds * 2 * Bsize) + (Nbuilds * Nruns * Asize * 3 * 365)
where Sz is the minimum required disk space, Nbuilds is the number of build configurations, Bsize is the disk space occupied by the code base, Nruns is the number of times each build runs per day, and Asize is the disk space occupied by the results placed in the build archive after each run. Example: consider a build server running 4 build configurations, each code base taking 200MB when checked out; each build runs 10 times a day and stores 5MB of logs and build artifacts each time; archived items should be stored for a year. The estimated minimum disk space needed for this configuration would be
220,600MB = 4*2*200MB + 4*10*5MB*3*365
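The disk space formula translates directly into code. In this sketch the function and parameter names are hypothetical. The multipliers 2 and 3 are copied from the formula as given, and exposing 365 as `retention_days` is an assumption based on the one-year retention in the example.

```python
def estimate_disk_mb(n_builds, code_base_mb, runs_per_day,
                     archive_mb_per_run, retention_days=365):
    """Estimate the minimum free disk space in megabytes."""
    # Space for checked-out code bases: Nbuilds * 2 * Bsize.
    workspace_mb = n_builds * 2 * code_base_mb
    # Space for the build archive: Nbuilds * Nruns * Asize * 3 * days.
    archive_mb = (n_builds * runs_per_day * archive_mb_per_run
                  * 3 * retention_days)
    return workspace_mb + archive_mb

# Values from the example: 4 configurations, 200MB code bases,
# 10 runs a day, 5MB archived per run, kept for a year.
print(estimate_disk_mb(4, 200, 10, 5))  # prints 220600
```

The archive term dominates the result here (219,000MB of 220,600MB), so the retention period and the archived size per run are the values worth estimating most carefully.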
The speed of the disk subsystem is an important factor affecting the overall performance of a build server. Build processes are I/O intensive and more read-oriented than write-oriented: typical read/write ratios range from 2 to 5. We recommend using a high-speed 10,000 or 15,000 rpm SCSI RAID-1 array under the management of a quality RAID controller. RAID-1 provides high-speed concurrent reads and writes while maintaining reliability of the disk subsystem. We do not recommend using RAID-5 because of its significantly slower write speeds.
If high-speed SCSI RAID-1 is not an option because of budget concerns, a 7,200 rpm IDE RAID-1 array is the minimal configuration.
It is important that the build server is connected to the rest of the ALM infrastructure through a high-speed local area network (LAN). The build server accesses the version control system to create local copies of the projects' code bases to build and to query for the latest changes. It can also access the issue tracking system to obtain release notes. A slow or congested LAN may significantly increase build roundtrip time if the code base is large (hundreds of megabytes). Ideally, the build server should be connected to the same network switch as the version control server.
Selecting a poorly performing build box causes slow builds and delays the delivery of the information about the latest state of the code base. Such delays negatively impact productivity of a software team because broken builds are not fixed quickly. To avoid this, the build server should run on adequate hardware.
Builds run by the build server consume a considerable amount of CPU power and disk I/O. Memory consumption is also very high, especially when a build server is running many automatic builds. Collocating the build server with a version control or issue tracking system unavoidably degrades the response times of the collocated servers as they compete for limited hardware resources. While collocating servers may provide certain financial savings (one computer is needed instead of two), these savings cannot justify the losses caused by the slowness of the collocated servers. Do not mix a build server with other servers on the same hardware.
Selecting hardware that is not going to be used is a waste of money. There is no point in having a 12-CPU server with 36GB of RAM to run 8 builds that together need just 2GB of RAM. Don't overspend. The build server's hardware should be adequate but not oversized.
To support an efficient software development process, a software build and release management system requires well-performing hardware. Build server capacity planning makes it possible to select hardware adequate to the server's tasks while staying on budget.