Command step

A command step runs one or more shell commands on one or more agents.

Each command step can run either a shell command like npm test, or an executable file or script like build.sh.

A command step can be defined in your pipeline settings, or in your pipeline.yml file.

pipeline.yml
steps:
  - command: "tests.sh"

When running multiple commands, either defined in a single line (npm install && tests.sh) or defined in a list, any failure will prevent subsequent commands from running, and will mark the command step as failed.

Command step attributes

Required attributes:

command The shell command/s to run during this step. This can be a single line of commands, or a list of commands that must all pass. Also available as the alias commands.
Example: "build.sh"
Example:
- "npm install"
- "tests.sh"
pipeline.yml
steps:
  - commands:
    - "npm install && npm test"
    - "moretests.sh"
    - "build.sh"

Pipelines without command steps

Although the command attribute is required for a command step, some plugins work without a command step, so it isn't strictly necessary for your pipeline to have an explicit command step.

Optional attributes:

agents A map of agent tag keys to values to target specific agents for this step.
Example: npm: "true"
allow_dependency_failure Whether to continue to run this step if any of the steps named in the depends_on attribute fail.
Default: false
artifact_paths The glob path or paths of artifacts to upload from this step. This can be a single line of paths separated by semicolons, or a list.
Example: "logs/**/*;coverage/**/*"
Example:
- "logs/**/*"
- "coverage/**/*"
branches The branch pattern defining which branches will include this step in their builds.
Example: "main stable/*"
cancel_on_build_failing Setting this attribute to true cancels the job as soon as the build is marked as failing.
Default: false
concurrency The maximum number of jobs created from this step that are allowed to run at the same time. If you use this attribute, you must also define a label for it with the concurrency_group attribute.
Example: 3
concurrency_group A unique name for the concurrency group that you are creating. If you use this attribute, you must also define the concurrency attribute.
Example: "my-app/deploy"
depends_on A list of step keys that this step depends on. This step will only run after the named steps have completed. See managing step dependencies for more information.
Example: "test-suite"
env A map of environment variables for this step.
Example: RAILS_ENV: "test"
if A boolean expression that omits the step when false. See Using conditionals for supported expressions.
Example: build.message != "skip me"
key A unique string to identify the step. The value is available in the BUILDKITE_STEP_KEY environment variable.
Keys cannot have the same pattern as a UUID (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx).
Example: "linter" Alias: identifier
label The label that will be displayed in the pipeline visualisation in Buildkite. Supports emoji.
Example: ":hammer: Tests" will be rendered as "🔨 Tests"

matrix Either an array of values to be used in the matrix expansion, or a single setup key, and an optional adjustments key.
steps:
- label: "{{matrix}} build"
  command: "echo '.buildkite/steps/build-binary.sh {{matrix}}'"
  matrix:
    - "macOS"
    - "Linux"
parallelism The number of parallel jobs that will be created based on this step.
Example: 3
plugins An array of plugins for this step.
Example:
- docker-compose#v1.0.0:
    run: app
priority Adjust the priority for a specific job, as a positive or negative integer.
Example:
- command: "will-run-first.sh"
  priority: 1
retry The conditions for retrying this step.
Available types: automatic, manual
skip Whether to skip this step or not. Passing a string (with a 70-character limit) provides a reason for skipping this command. Passing an empty string is equivalent to false. Note: Skipped steps will be hidden in the pipeline view by default, but can be made visible by toggling the 'Skipped jobs' icon.
Example: true
Example: false
Example: "My reason"
soft_fail Allow specified non-zero exit statuses not to fail the build. Can be either an array of allowed soft failure exit statuses or true to make all exit statuses soft-fail.
Example: true
Example:
- exit_status: 1
timeout_in_minutes The maximum number of minutes a job created from this step is allowed to run. If the job exceeds this time limit, or if it finishes with a non-zero exit status, the job is automatically canceled and the build fails. Jobs that time out with an exit status of 0 are marked as passed.
You can also set default and maximum timeouts in the Buildkite UI.
Example: 60
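
Several of the optional attributes above are easier to read in combination. The following is a minimal sketch rather than a recommended configuration: the script names (lint.sh, report.sh), the labels, and the chosen values are hypothetical and only illustrate where each attribute goes.

```yaml
steps:
  - label: ":mag: Lint"
    key: "linter"                   # referenced by depends_on below
    command: "lint.sh"              # hypothetical script
    timeout_in_minutes: 10          # illustrative limit
    cancel_on_build_failing: true   # cancel this job once the build is marked as failing

  - label: ":memo: Report"
    command: "report.sh"            # hypothetical script
    depends_on: "linter"            # runs only after the step keyed "linter" has completed
    allow_dependency_failure: true  # still runs if the "linter" step fails
    # The step is omitted from the build when this expression is false.
    if: build.message != "skip me"
    priority: 1                     # illustrative priority value
    env:
      RAILS_ENV: "test"
```

In this sketch the second step refers to the first by its key, not by its label, so the label can change without breaking the dependency.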

Retry attributes

At least one of the following attributes is required:

automatic Whether to allow a job to retry automatically. This field accepts a boolean value, individual retry conditions, or a list of multiple different retry conditions.
If set to true, the retry conditions are set to the default value.
Default value:
exit_status: "*"
signal: "*"
signal_reason: "*"
limit: 2
Example: true
manual Whether to allow a job to be retried manually. This field accepts a boolean value, or a single retry condition.
Default value: true
Example: false
pipeline.yml
steps:
  - label: "Tests"
    command: "tests.sh"
    retry:
      automatic: true

  - wait

  - label: "Deploy"
    command: "deploy.sh"
    retry:
      manual: false
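
Going by the default value listed for the automatic attribute, setting automatic: true should behave like spelling that condition out in full. The sketch below assumes exactly that equivalence; it is an illustration, not a statement of how Buildkite expands the value internally.

```yaml
steps:
  - label: "Tests"
    command: "tests.sh"
    retry:
      # Assumed to match `automatic: true`, based on the documented default value.
      automatic:
        - exit_status: "*"
          signal: "*"
          signal_reason: "*"
          limit: 2
```

Writing the condition out can be useful as a starting point when only one field, such as limit, needs to differ from the default.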

If you retry a job, the information about the failed job(s) remains, and a new job is created. The history of retried jobs is preserved and immutable. The number of possible retries is available as an environment variable limit on the job. When a limit is not specified on automatic retry, the default limit is two.

You can also see when a job has been retried and whether it was retried automatically or by a user. Such jobs are hidden by default, but you can expand the view to see all the hidden retried jobs.

In the Buildkite UI, there is a Job Retries Report section where you can view a graphic report on jobs retried manually or automatically within the last 30 days. This can help you understand flakiness and instability across all of your pipelines.

Conditions on retries can be specified. For example, it's possible to set steps to be retried automatically if they exit with particular exit codes, or prevent retries on important steps like deployments. The following example shows different retry configurations:

pipeline.yml
steps:
  - label: "Tests"
    command: "tests.sh"
    retry:
      automatic:
        - exit_status: 5
          limit: 2
        - exit_status: "*"
          limit: 4
  - wait
  - label: "Deploy"
    command: "deploy.sh"
    branches: "main"
    retry:
      manual:
        allowed: false
        reason: "Deploys shouldn't be retried"

Automatic retry attributes

Optional attributes:

exit_status The exit status value that causes this job to retry. Valid values are any exit status between 0 and 255, * for any value between 1 and 255 (that is, excluding 0), and -1 (the value returned when an agent is lost and Buildkite no longer receives contact from the agent).

Examples:

  • "*"
  • 2
  • 42
  • 143
  • -1
signal The signal that causes this job to retry. This signal only appears if the agent sends a signal to the job and an interior process does not handle the signal. SIGKILL propagates reliably because it cannot be handled, and is a useful way to differentiate graceful cancelation and timeouts.

Examples:

  • "*"
  • kill
  • SIGINT
signal_reason The reason a process was signaled.

Examples:

  • "*"
  • none
  • cancel
  • agent_stop
limit The number of times this job can be retried. The maximum value this can be set to is 10.
Example: 3
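
The signal and signal_reason attributes can be used in retry conditions in the same way as exit_status. The following is a minimal sketch that reuses values from the examples above; the limits are illustrative, and how conditions interact when more than one matches is not covered here.

```yaml
steps:
  - label: "Tests"
    command: "tests.sh"
    retry:
      automatic:
        - signal_reason: agent_stop  # signal reason taken from the examples above
          limit: 2
        - signal: kill               # signal taken from the examples above
          limit: 1
        - exit_status: 143
          limit: 1
```

Each list entry is a separate retry condition with its own limit, which must not exceed 10.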

-1 exit status

A job will fail with an exit status of -1 if communication with the agent has been lost (for example, the agent has been forcefully terminated, or the agent machine was shut down without allowing the agent to disconnect). See the section on Exit Codes for information on other exit codes.

pipeline.yml
steps:
  - label: "Tests"
    command: "tests.sh"
    retry:
      automatic:
        - exit_status: -1  # Agent was lost
          limit: 2
        - exit_status: 255 # Forced agent shutdown
          limit: 2

Manual retry attributes

Optional attributes:

allowed A boolean value that defines whether or not this job can be retried manually.
Default value: true
Example: false
permit_on_passed A boolean value that defines whether or not this job can be retried after it has passed.
Example: false
reason A string that will be displayed in a tooltip on the Retry button in Buildkite. This will only be displayed if the allowed attribute is set to false.
Example: "No retries allowed on deploy steps"
pipeline.yml
steps:
  - label: "Tests"
    command: "tests.sh"
    retry:
      manual:
        permit_on_passed: true

  - wait

  - label: "Deploy"
    command: "deploy.sh"
    retry:
      manual:
        allowed: false
        reason: "Sorry, you can't retry a deployment"

Soft fail attributes

Optional attributes:

exit_status Allow specified non-zero exit statuses not to fail the build.
Example: "*"
Example: 1
pipeline.yml
steps:
  - label: "Everyone struggles sometimes"
    command: "tests.sh"
    soft_fail:
      - exit_status: 1
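
As described for the soft_fail attribute, the value can be either true or an array of exit statuses. The sketch below shows both forms side by side; lint.sh and the second exit status (42) are hypothetical and chosen only for illustration.

```yaml
steps:
  - label: "Lint"
    command: "lint.sh"     # hypothetical script
    soft_fail: true        # every non-zero exit status soft-fails

  - label: "Smoke test"
    command: "smoke-test.sh"
    soft_fail:
      - exit_status: 1
      - exit_status: 42    # illustrative second allowed status
```

Listing explicit exit statuses keeps unexpected failures visible, whereas soft_fail: true lets any non-zero exit status pass without failing the build.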

Matrix attributes

setup A list of dimensions, each containing an array of elements. The job matrix is built by combining every element of each dimension with every element of the other dimensions.
adjustments An array of with keys, each mapping an element to each dimension listed in matrix.setup, as well as the attribute to modify for that combination.
Currently, only soft_fail and skip can be modified.
pipeline.yml
steps:
- label: "💥 Matrix build with adjustments"
  command: "echo {{matrix.os}} {{matrix.arch}} {{matrix.test}}"
  matrix:
    setup:
      arch:
        - "amd64"
        - "arm64"
      os:
        - "windows"
        - "linux"
      test:
        - "A"
        - "B"
    adjustments:
      - with:
          os: "windows"
          arch: "arm64"
          test: "B"
        soft_fail: true
      - with:
          os: "linux"
          arch: "arm64"
          test: "B"
        skip: true

Fail fast

To automatically cancel any remaining jobs as soon as the first job fails (except jobs that you've marked as soft_fail), add the cancel_on_build_failing: true attribute to your command steps.

Next time a job in your build fails, those jobs will be automatically canceled.
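
A minimal sketch of fail-fast behavior follows. The flaky.sh script and the step labels are hypothetical; the attributes themselves are the ones described above.

```yaml
steps:
  - label: "Unit tests"
    command: "tests.sh"
    cancel_on_build_failing: true

  - label: "Integration tests"
    command: "moretests.sh"
    cancel_on_build_failing: true

  - label: "Flaky checks"
    command: "flaky.sh"            # hypothetical script
    soft_fail: true                # a failure here is a soft failure and is excepted from fail fast
    cancel_on_build_failing: true
```

With this configuration, a failure in either test step marks the build as failing, and any of these jobs that have not yet finished are canceled.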

Example

pipeline.yml
steps:
  - label: ":hammer: Tests"
    commands:
      - "npm install"
      - "npm run tests"
    branches: "main"
    env:
      NODE_ENV: "test"
    agents:
      npm: "true"
      queue: "tests"
    artifact_paths:
      - "logs/**/*"
      - "coverage/**/*"
    parallelism: 5
    timeout_in_minutes: 3
    retry:
      automatic:
        - exit_status: -1
          limit: 2
        - exit_status: 143
          limit: 2
        - exit_status: 255
          limit: 2

  - label: "Visual diff"
    commands:
      - "npm install"
      - "npm run visual-diff"
    retry:
      automatic:
        limit: 3

  - label: "Skipped job"
    command: "broken.sh"
    skip: "Currently broken and needs to be fixed"

  - wait

  - label: ":shipit: Deploy"
    command: "deploy.sh"
    branches: "main"
    concurrency: 1
    concurrency_group: "my-app/deploy"
    retry:
      manual:
        allowed: false
        reason: "Sorry, you can't retry a deployment"

  - wait

  - label: "Smoke test"
    command: "smoke-test.sh"
    soft_fail:
      - exit_status: 1