With Buildkite you can scale your CI pipelines using the same techniques you use to scale production servers. For many teams that has meant creating an AWS-based Buildkite Agent cluster that scales capacity horizontally with the build queue. Shopify’s agent cluster, for example, runs 50,000 build jobs per day across a fleet of Buildkite agents, scaling up during the day for maximum performance and scaling down at night to save on server costs.
We wanted to make it easier for any team to set up their build stack on AWS, so working with some of our best customers (👋 99designs) we’ve created the Elastic CI Stack for AWS: a pre-built CI and CD stack that gives you an autoscaling build cluster in your own AWS VPC.
The stack can be used to parallelize large test suites across hundreds of build agents, run any of your team’s Linux-based projects, and run your AWS ops tasks. It comes pre-baked with the Buildkite Agent, Docker and Docker Compose, aws-cli, spot bid support, S3 integration, and CloudWatch metrics and logging. Check out the full list of supported features on GitHub.
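To give a feel for how a test suite gets split across agents, here is a minimal sketch of a sharding script that each parallel job could run. It relies on the `BUILDKITE_PARALLEL_JOB` and `BUILDKITE_PARALLEL_JOB_COUNT` environment variables that Buildkite sets for parallel steps. The `shard` helper and the test file names are hypothetical, and a real suite would more likely balance shards by test duration than by simple striding.

```python
import os


def shard(test_files, job_index, job_count):
    """Return the subset of test files that one parallel job should run."""
    # Sort first so every job sees the same ordering, then take every
    # job_count-th file starting at this job's index.
    return sorted(test_files)[job_index::job_count]


if __name__ == "__main__":
    # Set by Buildkite for parallel steps; default to a single job so the
    # script also works when run locally.
    index = int(os.environ.get("BUILDKITE_PARALLEL_JOB", "0"))
    count = int(os.environ.get("BUILDKITE_PARALLEL_JOB_COUNT", "1"))

    # Hypothetical test files, for illustration only.
    tests = [
        "test_api.py",
        "test_auth.py",
        "test_billing.py",
        "test_cart.py",
        "test_search.py",
    ]

    for path in shard(tests, index, count):
        print(path)
```

Each job prints only its own slice, so the output can be handed straight to whatever test runner your project uses.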
You’re not limited to a single stack or VPC either: you can create an instance of the stack for each set of build needs or security requirements, and target each set of agents in your Buildkite pipelines. Internally we use 3 stacks: a single “pipeline uploader” instance running 5 agents on a t2.nano, an always-on “builder” instance which has warm Git and Docker caches, and “runner” instances that autoscale based on demand.
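As a rough illustration of targeting different stacks from one pipeline, the sketch below generates a pipeline definition as JSON, with each step pinned to a queue through its `agents` setting. The queue names `builder` and `runner` mirror the stacks described above, but they are assumptions here, as are the script paths and the `build_pipeline` helper; use whatever queue names your own stacks are configured with.

```python
import json


def build_pipeline(parallelism):
    """Build a pipeline dict whose steps target different agent queues."""
    return {
        "steps": [
            {
                "label": "build",
                # Hypothetical script path.
                "command": "./scripts/build.sh",
                # Assumed queue name for the always-on stack with warm caches.
                "agents": {"queue": "builder"},
            },
            # Wait for the build step to finish before fanning out.
            "wait",
            {
                "label": "test",
                # Hypothetical script path.
                "command": "./scripts/test.sh",
                # Run this many copies of the step in parallel.
                "parallelism": parallelism,
                # Assumed queue name for the autoscaling stack.
                "agents": {"queue": "runner"},
            },
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_pipeline(parallelism=10), indent=2))
```

A script like this could run on the small “pipeline uploader” agents, with its output piped into `buildkite-agent pipeline upload`, so the heavier stacks only ever pick up build and test jobs.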
We’ve put a lot of effort into making it easy to build your own AMIs and run the build tools locally. The entire stack is open source, and the metrics collector (a Go binary) and metrics stack (a CloudFormation stack) can also be run independently to instrument your own custom build infrastructure.
We’ve published a step-by-step guide to getting started with the Elastic CI Stack, including running the first build on your agent cluster: Elastic CI with AWS | Buildkite Documentation
Configure the stack parameters, including instance type, spot bid price, and autoscaling rules.
New build jobs trigger the autoscaling rules, and new instances come online.
Voilà! Your agent stack is now online.
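The idea behind the steps above, capacity following the build queue, can be sketched in a few lines. This is not the stack’s actual scaling logic, which is driven by its CloudWatch metrics and the autoscaling rules you configure. It is only an illustrative calculation, and the function and all of its parameter names are hypothetical.

```python
import math


def desired_instances(scheduled_jobs, running_jobs, agents_per_instance,
                      min_size, max_size):
    """Estimate how many instances are needed to cover the current workload."""
    # Every scheduled or running job needs one agent; each instance
    # hosts a fixed number of agents.
    needed = math.ceil((scheduled_jobs + running_jobs) / agents_per_instance)
    # Clamp to the configured bounds of the autoscaling group.
    return max(min_size, min(max_size, needed))


if __name__ == "__main__":
    # Quiet night: nothing queued, so scale down to the minimum.
    print(desired_instances(0, 0, agents_per_instance=5, min_size=0, max_size=10))
    # Busy day: 12 waiting jobs and 3 running jobs at 5 agents per instance.
    print(desired_instances(12, 3, agents_per_instance=5, min_size=0, max_size=10))
```

Setting the minimum size to zero lets the cluster shut down entirely when idle, while the maximum size caps how far a burst of builds can scale the fleet.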
Running a parallel bash build on the stack