The Buildkite Agent
The Buildkite agent is a small, reliable and cross-platform build runner that makes it easy to run automated builds on your own infrastructure. Its main responsibilities are polling buildkite.com for work, running build jobs, reporting back the status code and output log of the job, and uploading the job's artifacts.
This page contains reference information for Buildkite organization administrators. It covers agent installation and configuration details and how agents communicate with Buildkite. If you're working with a team that already uses Buildkite and you want to write code that agents will run, read Pipelines. If you're setting up a Buildkite organization and you don't already have agents running, read Getting started.
You (or your organization) need one or more running agents to run builds, but once you've installed the agent and got it running on your own infrastructure, you don't need to interact with it directly. Whether you're starting builds automatically with every commit or running them manually by clicking a button, Buildkite handles everything: telling the agent which version control references to use, where to get the changes from, and what code to run, as well as collecting the outcome that the agent reports back to Buildkite.com.
The agent works by polling Buildkite's agent API over HTTPS. There is no need to forward ports or provide incoming firewall access, and the agents can be run across any number of machines and networks.
The agent starts by registering itself with Buildkite, and once registered it's placed into your organization's agents pool. The agent periodically polls Buildkite looking for new work, waiting to accept an available job.
After accepting a build job, the agent executes the command, streams back the build script's output, and then posts the final exit status.
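The run-and-report step described above can be sketched in a few lines. This is an illustration only, not the agent's implementation: `run_job` and `on_output` are hypothetical names, and the sketch assumes the job is a single command whose stdout and stderr are merged into one log.

```python
import subprocess
import sys


def run_job(command, on_output):
    """Run `command`, pass each output line to `on_output`, return the exit status."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, like a single build log
        text=True,
    )
    # Forward output line by line while the command is still running.
    for line in process.stdout:
        on_output(line.rstrip("\n"))
    process.stdout.close()
    # The final exit status is only known once the process has finished.
    return process.wait()


if __name__ == "__main__":
    # Hypothetical job: prints two lines, then exits with status 3.
    script = "print('compiling'); print('testing'); raise SystemExit(3)"
    status = run_job([sys.executable, "-c", script], on_output=print)
    print("exit status:", status)
```

Streaming the output as it arrives, rather than collecting it at the end, is what lets a build log update while the job is still running.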
While the job is running, you can use the buildkite-agent meta-data command to set and get build-wide meta-data, and buildkite-agent artifact to store and retrieve binary build-wide artifacts. These two commands allow you to have completely isolated build jobs (similar to a 12 factor web application) that still have access to shared state and data storage across any number of machines and networks.
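As a rough sketch of how two isolated jobs might share state through these commands, the helpers below build the command lines and run them with `subprocess`. The helper names, the `release-version` key, and the file paths are hypothetical; run `buildkite-agent meta-data --help` and `buildkite-agent artifact --help` to confirm the arguments for your agent version.

```python
import subprocess


def meta_data_set_cmd(key, value):
    """Command line that stores a build-wide meta-data value."""
    return ["buildkite-agent", "meta-data", "set", key, value]


def meta_data_get_cmd(key):
    """Command line that reads a build-wide meta-data value."""
    return ["buildkite-agent", "meta-data", "get", key]


def artifact_upload_cmd(pattern):
    """Command line that uploads files matching `pattern` as build artifacts."""
    return ["buildkite-agent", "artifact", "upload", pattern]


def artifact_download_cmd(pattern, destination):
    """Command line that downloads matching artifacts into `destination`."""
    return ["buildkite-agent", "artifact", "download", pattern, destination]


def run(cmd):
    """Run a command inside a job and return its stdout (raises if it fails)."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

A first job could call `run(meta_data_set_cmd("release-version", "1.4.2"))` and `run(artifact_upload_cmd("dist/*.tar.gz"))`. A later job, possibly on a different machine, could then call `run(meta_data_get_cmd("release-version"))` and `run(artifact_download_cmd("dist/*.tar.gz", "."))`.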
You can install the agent on a wide variety of platforms. See the installation instructions for a full list and for information on how to get started.
$ buildkite-agent --help
Usage:

  buildkite-agent [command] [arguments...]

Available commands are:

  start      Starts a Buildkite agent
  annotate   Annotate the build page within the Buildkite UI with text from within a Buildkite job
  artifact   Upload/download artifacts from Buildkite jobs
  env        Process environment subcommands
  lock       Process lock subcommands
  meta-data  Get/set data from Buildkite jobs
  pipeline   Make changes to the pipeline of the currently running build
  bootstrap  Run a Buildkite job locally
  step       Retrieve and update the attributes of steps
  help       Shows a list of commands or help for one command

Use "buildkite-agent [command] --help" for more information about a command.
To start an agent you'll need your organization's agent token from the Agents page of your Buildkite dashboard. You pass the token to the agent using an environment variable or command line flag, and it will register itself with Buildkite and wait to accept jobs.
The agent has a standard configuration file format on all systems to set meta-data, priority, etc. See the configuration documentation for more details.
We frequently introduce new experimental features to the agent. Use the --experiment flag to opt in to them and test them out:
buildkite-agent start --experiment experiment1 --experiment experiment2
Or you can set them with the experiment option in your agent configuration file.
If an experiment doesn't exist, no error will be raised.
Please note that there is every chance we will remove or change these experiments, so using them should be at your own risk and without the expectation that they will work in future!
Artifacts uploaded by
buildkite-agent artifact upload will be uploaded using URI/Unix-style paths, even on Windows. This makes sure that artifacts uploaded from Windows agents are stored in a URI-compatible URL.
To use it, set
experiment="normalised-upload-paths" in your agent configuration.
Artifact names displayed in Buildkite's web UI, as well as in the API, are changed by this.
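The effect of the normalisation on a path can be illustrated with Python's `pathlib`. This is a sketch of the idea, not the agent's implementation; `normalised_upload_path` is a hypothetical helper and the paths are made up.

```python
from pathlib import PureWindowsPath


def normalised_upload_path(windows_path):
    """Rewrite a Windows-style relative path using forward slashes."""
    # PureWindowsPath understands backslash separators on any operating system.
    return PureWindowsPath(windows_path).as_posix()


if __name__ == "__main__":
    print(normalised_upload_path("dist\\build\\app.zip"))
```

Paths that already use forward slashes pass through unchanged, so the same helper is safe to apply to uploads from any platform.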
For example, when using this experimental feature
buildkite-agent artifact upload coverage\report.xml uploads to
s3://example/coverage/report.xml instead of to
After repository checkout, resolve
BUILDKITE_COMMIT to a commit hash. This makes
BUILDKITE_COMMIT useful for builds triggered against non-commit-hash refs such as
To use it, set
experiment="resolve-commit-after-checkout" in your agent configuration.
bootstrap to execute in separate Kubernetes containers in the same pod.
To use Kubernetes exec, set
experiment="kubernetes-exec" in your agent configuration.
Exposes a local API to introspect and mutate the state of a running job through environment variables. This lets you write scripts, hooks, and plugins in languages other than Bash, using them to interact with the agent.
The API uses a Unix Domain Socket, whose path is exposed to running jobs with the
BUILDKITE_AGENT_JOB_API_SOCKET environment variable. Calls are authenticated using the Bearer HTTP Authorization scheme made available through a token in the
BUILDKITE_AGENT_JOB_API_TOKEN environment variable.
The API provides the following endpoints:
GET /api/current-job/v0/env - Returns a JSON object of all environment variables for the current job.
PATCH /api/current-job/v0/env - Accepts a JSON object of environment variables to set for the current job.
DELETE /api/current-job/v0/env - Accepts a JSON array of environment variable names to unset for the current job.
See the agent repo for the full API request and response definitions.
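A client for these endpoints needs only two things: an HTTP connection over the Unix Domain Socket and the Bearer token header. The sketch below uses the Python standard library for both. The names `UnixSocketHTTPConnection`, `auth_headers`, and `job_api_request` are hypothetical, and the exact request and response body shapes are not assumed here; take those from the agent repo.

```python
import http.client
import json
import os
import socket


class UnixSocketHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix Domain Socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        # The host name is a placeholder; the socket path decides where we connect.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def auth_headers(token):
    """Bearer authorization header for the job API."""
    return {"Authorization": f"Bearer {token}"}


def job_api_request(method, path, body=None, socket_path=None, token=None):
    """Send one request to the job API and return (status, response text)."""
    socket_path = socket_path or os.environ["BUILDKITE_AGENT_JOB_API_SOCKET"]
    token = token or os.environ["BUILDKITE_AGENT_JOB_API_TOKEN"]
    headers = auth_headers(token)
    payload = None
    if body is not None:
        payload = json.dumps(body)
        headers["Content-Type"] = "application/json"
    conn = UnixSocketHTTPConnection(socket_path)
    try:
        conn.request(method, path, body=payload, headers=headers)
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()
```

Inside a job with the experiment enabled, `job_api_request("GET", "/api/current-job/v0/env")` would return the status code and the raw JSON text listing the job's environment variables.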
The job API is unavailable on agents running versions of Windows before build 17063, as this was when Windows added Unix Domain Socket support. If you enable this experiment on an unsupported Windows agent, the agent outputs a warning and the API is unavailable.
To use the job API, set
experiment="job-api" in your agent configuration.
Like the Job API experiment, this exposes a (separate) local API for interacting with the agent process. The Agent API offers these endpoints:
GET /api/leader/v0/ping - Returns a JSON object with the current time (useful for testing the agent is alive).
GET /api/leader/v0/lock?key=<key> - Returns a JSON object containing the current state of a lock.
PATCH /api/leader/v0/lock?key=<key> - Accepts a JSON object with old and new states for a lock. The lock is then updated atomically, and a JSON object describing whether the operation proceeded is returned.
The API is exposed using a Unix Domain Socket. Unlike the job-api, the path to the socket is not available through an environment variable. Instead, there is a single (configurable) path on the system.
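The PATCH endpoint's "old and new states" behavior is a compare-and-swap: the update only happens if the lock is still in the state the caller expects. The in-memory model below illustrates that semantics. It is not the agent's code, `LockTable` is a hypothetical name, and treating an unset lock as an empty string is an assumption made for the example.

```python
import threading


class LockTable:
    """In-memory model of locks updated by compare-and-swap."""

    def __init__(self):
        self._states = {}
        self._mutex = threading.Lock()

    def get(self, key):
        """Return the current state of `key` ("" if it has never been set)."""
        with self._mutex:
            return self._states.get(key, "")

    def compare_and_swap(self, key, old, new):
        """Set `key` to `new` only if it currently equals `old`.

        Returns (swapped, state_after) so the caller can tell whether it won.
        """
        with self._mutex:
            current = self._states.get(key, "")
            if current != old:
                return False, current
            self._states[key] = new
            return True, new


if __name__ == "__main__":
    table = LockTable()
    print(table.compare_and_swap("deploy", "", "acquired"))  # first caller wins
    print(table.compare_and_swap("deploy", "", "acquired"))  # second caller loses
```

Because the check and the write happen under one mutex, two callers racing to acquire the same key cannot both succeed, which is the property that makes the endpoint usable as a lock.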
To use the agent API, set
experiment="agent-api" in your agent configuration.
The following features started as experiments before being promoted to fully supported features.
Changes the file lock implementation from
github.com/gofrs/flock to address an issue where file locks are never released by agents that don't shut down cleanly. The new file locks are implemented at the kernel level, and are aware of when their parent process dies.
Promoted in v3.48.0. It's the default behavior, so there's no configuration required to use it. Because the old and new lock systems do not interact, we strongly recommend not running different versions of the agent on the same host.
Outputs inline ANSI timestamps for each line of log output, enabling timestamps you can toggle in the Buildkite dashboard.
Promoted in v3.48.0. It's the default behavior, so there's no configuration required to use it. If you want to turn it off, pass the
Maintain a single bare git mirror for each repository on a host that is shared amongst multiple agents and pipelines. Checkouts reference the git mirror using
git clone --reference, as do submodules.
Promoted in v3.47.0. You can use it by setting the
See the following agent configuration options for more information:
The Buildkite agent can redact strings that match the value of environment variables whose names match common patterns for passwords and other secure information before the build log is uploaded to Buildkite.
Promoted in v3.31.0.
See redacted-vars for more information.
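The idea behind redaction can be sketched as follows: find environment variables whose names match a set of patterns, then mask their values wherever they appear in the log. This is an illustration, not the agent's implementation. The patterns in `ASSUMED_PATTERNS`, the `[REDACTED]` placeholder, and the function name are assumptions; see redacted-vars for the agent's actual defaults.

```python
import fnmatch

# Assumed name patterns, chosen for illustration; the agent's defaults may differ.
ASSUMED_PATTERNS = ["*_PASSWORD", "*_SECRET", "*_TOKEN"]


def redact(log_text, env, patterns=ASSUMED_PATTERNS, placeholder="[REDACTED]"):
    """Mask the values of sensitive-looking environment variables in `log_text`."""
    secrets = [
        value
        for name, value in env.items()
        if value and any(fnmatch.fnmatchcase(name, p) for p in patterns)
    ]
    # Replace longer values first so a secret that contains a shorter one is fully masked.
    for value in sorted(secrets, key=len, reverse=True):
        log_text = log_text.replace(value, placeholder)
    return log_text


if __name__ == "__main__":
    env = {"DB_PASSWORD": "hunter2", "HOME": "/root"}
    print(redact("connecting with hunter2 from /root", env))
```

Matching is done on variable names, but the masking is done on values, so a secret is hidden even when a script echoes it without mentioning the variable.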
The agent's behavior can be customized using hooks, which are shell scripts that exist on your build machines or in each pipeline's code repository. Hooks can be used to set up secrets as well as overriding default behavior. See the hooks documentation for full details.
When a build job is canceled the agent will send the build job process a
SIGTERM signal to allow it to gracefully exit.
If the process does not exit within the 10s grace period it will be forcefully terminated with a
SIGKILL signal. If you require a longer grace period, it can be customized using the cancel-grace-period agent configuration option.
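The cancel sequence (SIGTERM, wait for the grace period, then SIGKILL) can be sketched with `subprocess`. This is a minimal illustration on a POSIX system, not the agent's implementation; `cancel` and its `grace_period` parameter are hypothetical names that mirror the behavior described above.

```python
import signal
import subprocess
import sys


def cancel(process, grace_period=10.0):
    """Ask `process` to exit, then force it if the grace period runs out."""
    process.terminate()  # sends SIGTERM on POSIX
    try:
        return process.wait(timeout=grace_period)
    except subprocess.TimeoutExpired:
        process.kill()  # sends SIGKILL on POSIX, which cannot be caught or ignored
        return process.wait()


if __name__ == "__main__":
    # Hypothetical job: a child process that would otherwise sleep for 30 seconds.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    returncode = cancel(child, grace_period=1.0)
    # On POSIX, Popen reports death-by-signal as a negative signal number.
    print(returncode == -signal.SIGTERM)
```

A job that traps SIGTERM can use the grace period to clean up, such as removing temporary files or stopping containers, before it is forcefully terminated.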
The agent also accepts the following two signals directly:
SIGTERM - Instructs the agent to gracefully disconnect, after completing any job that it may be running.
SIGQUIT - Instructs the agent to forcefully disconnect, cancelling any job that it may be running.
The agent reports its activity to Buildkite using exit codes. The most common exit codes and their descriptions can be found in the table below.
|Exit code||Description|
|0||The job exited with a status of 0 (success)|
|1||The job exited with a status of 1 (most common error status)|
|128 + signal number||The job was terminated by a signal (see note below)|
|255||The agent was gracefully terminated|
|-1||Buildkite lost contact with the agent or it stopped reporting to us|
When a job is terminated by a signal, the exit code will be set to 128 + the signal number. For more information about how shells manage commands terminated by signals, see the Wiki page on Exit Signals.
Exit codes for common signals:
|Exit code||Signal number||Signal name||Description|
|130||2||SIGINT||Terminal interrupt signal|
|137||9||SIGKILL||Kill (cannot be caught or ignored)|
|139||11||SIGSEGV||Segmentation fault; Invalid memory reference|
|141||13||SIGPIPE||Write on a pipe with no one to read it|
|143||15||SIGTERM||Termination signal (graceful)|
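The 128 + signal number rule is easy to check with a few lines of code. The helpers below are hypothetical and only illustrate the arithmetic in the table above; they assume a POSIX system where these signal numbers apply.

```python
import signal


def exit_code_for_signal(signum):
    """Exit code reported for a job terminated by signal `signum`."""
    return 128 + int(signum)


def describe_exit_code(code):
    """Give a short human-readable description of a job exit code."""
    if code > 128:
        try:
            name = signal.Signals(code - 128).name
        except ValueError:
            # Not a known signal number on this platform, so treat it as a plain status.
            return f"exited with status {code}"
        return f"terminated by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(exit_code_for_signal(signal.SIGTERM))
    print(describe_exit_code(137))
```

Decoding the exit code this way is a quick first step when triaging a failed job: a SIGKILL or SIGTERM usually points to cancellation or resource pressure, not to a failing test.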
One issue you sometimes need to troubleshoot is when Buildkite loses contact with an agent, resulting in a
-1 exit code. After registering with the Buildkite API, an agent regularly sends heartbeat updates to indicate that it is operational. If the Buildkite API does not receive any heartbeat requests from an agent for 10 consecutive minutes, that agent is marked as lost and will not be assigned any further jobs.
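The lost-agent rule can be expressed as a simple comparison against the time of the last heartbeat. The sketch below is illustrative only: `is_lost` is a hypothetical helper, and treating exactly 10 minutes as lost is an assumption about the boundary.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the description above: 10 minutes without a heartbeat.
LOST_AFTER = timedelta(minutes=10)


def is_lost(last_heartbeat, now):
    """Return True if no heartbeat has been seen for at least LOST_AFTER."""
    return now - last_heartbeat >= LOST_AFTER


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(is_lost(now - timedelta(minutes=3), now))
    print(is_lost(now - timedelta(minutes=12), now))
```

A similar check in your own monitoring, with a shorter threshold, can warn you about an agent that is struggling before Buildkite marks it as lost.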
Various factors can cause an agent to fail to send heartbeat updates. Common reasons include networking issues and resource constraints, such as CPU, memory, or I/O limitations on the infrastructure hosting the agent.
In such cases, it's essential to check the agent logs and examine metrics related to networking, CPU, memory, and I/O to help identify the cause of the failed heartbeat updates.
If the agents run on the Elastic CI Stack for AWS with spot instances, the abrupt termination of spot instances can also result in agents being marked as lost. To investigate this issue, you can use the log collector script to gather all relevant logs and metrics from the Elastic CI Stack for AWS.