Cache volumes

Cache volumes are external volumes attached to hosted agent instances. These volumes are attached on a best-effort basis, depending on their locality, expiration, and current usage, so they should not be relied upon as durable data storage.

By default, cache volumes:

  • Are disabled, although you can enable them by providing a list of paths to cache at the pipeline- or step-level.
  • Are scoped to a pipeline and are shared between all steps in the pipeline.

Cache volumes act as regular disks with the following properties:

  • The volumes use NVMe storage, delivering high performance.
  • The volumes are formatted as a regular Linux filesystem (for example, ext4), so they support any Linux use case.

Cache configuration

Cache paths can be defined in your pipeline.yml file. Defining cache paths for a step will implicitly create a cache volume for the pipeline.

When cache paths are defined, the cache volume is mounted under /cache in the agent instance. The agent links subdirectories of the cache volume into the paths specified in the configuration. For example, defining cache: "node_modules" in your pipeline.yml file will link ./node_modules to /cache/bkcache/node_modules in your agent instance.
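A minimal sketch of the mapping described above. The command is a hypothetical placeholder, and the comments only restate the mount point and link from the preceding paragraph:

```yaml
steps:
  - command: "yarn install"   # hypothetical command, for illustration only
    cache: "node_modules"
    # The cache volume is mounted under /cache in the agent instance.
    # ./node_modules is linked to /cache/bkcache/node_modules.
```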

Custom caches can be created by specifying a name for the cache, which allows you to use multiple cache volumes in a single pipeline.
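As a rough sketch of how named caches might be used, the fragment below gives two steps their own named cache. The commands and cache names are hypothetical; only the paths and name attributes come from this page:

```yaml
steps:
  - command: "yarn install"      # hypothetical command
    cache:
      paths:
        - "node_modules"
      name: "node-modules-cache"

  - command: "bundle install"    # hypothetical command
    cache:
      paths:
        - "vendor/bundle"
      name: "bundle-cache"       # a different name, so a different cache volume
```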

When requesting a cache volume, you can specify a size. The cache volume provided will have a minimum available storage equal to the specified size. On a cache hit, which is the common case, the actual volume size is the last used volume size plus the specified size.
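As a worked illustration of the sizing rule, the sketch below uses hypothetical numbers. The 40 GB figure for the last used volume size is an assumption chosen only to make the arithmetic concrete:

```yaml
steps:
  - command: "yarn run build"
    cache:
      paths:
        - "node_modules"
      size: "30g"
# First request (no earlier version): at least 30 GB of storage is available.
# Later request that hits the cache, assuming a last used volume size of 40 GB:
#   actual volume size = 40 GB + 30 GB = 70 GB
```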

Defining a top-level cache configuration (as opposed to one within a step) sets the default cache volume for all steps in the pipeline. Steps can override the top-level configuration by defining their own cache configuration.

pipeline.yml
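# Default cache for all steps in the pipeline
cache:
  paths:
    - "node_modules"
  size: "100g"

steps:
  # Overrides the top-level cache with a single path
  - command: "yarn run build"
    cache: ".build"

  # The same override, written as a list of paths
  - command: "yarn run test"
    cache:
      - ".build"

  # Named cache with its own size
  - command: "rspec"
    cache:
      paths:
        - "vendor/bundle"
      size: "20g"
      name: "bundle-cache"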

Required attributes

paths
A list of paths to cache. Relative paths are resolved from the working directory of the step. Absolute paths are resolved from the root of the instance.
Example:
- ".cache"
- "/tmp/cache"

Optional attributes

name
A name for the cache. This allows for multiple cache volumes to be used in a single pipeline.
Example: "node-modules-cache"

size
The size of the cache volume. The default size is 20 gigabytes, which is also the minimum cache size that can be requested.
Sizes are specified as Ng, where N is the number of gigabytes.
Example: "20g"

Lifecycle

At any point in time, multiple versions of a cache volume may be used by different jobs.

The first request creates the first version of the cache volume, which is used as the parent of subsequent forks until a new parent version is committed. In this context, a fork is a readable and writable snapshot of the cache volume at a moment in time.

When requesting a cache volume, a fork of the previous cache volume version is attached to the agent instance. The only exception is the first request, which starts empty because there is no earlier version to fork.

Each job gets its own private copy of the cache volume, as it existed at the time of the last cache commit.

Version commits follow a "last write" model: whenever a job terminates successfully (that is, exits with exit code 0), the final flushed volume of the exiting agent instance is committed as the new parent version of each cache volume attached to that job.

Whenever a job fails, the cache volume versions attached to the agent instance are abandoned.
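The lifecycle rules above can be sketched with two steps that share the same cache. This is an illustration under stated assumptions, not a description of scheduler internals: the labels and commands are hypothetical, and the comments only restate the fork, commit, and abandon rules from this section.

```yaml
steps:
  - label: "job A"              # hypothetical label
    command: "make deps"        # hypothetical command
    cache: ".deps"

  - label: "job B"              # hypothetical label
    command: "make lint"        # hypothetical command
    cache: ".deps"

# Each job receives its own fork of the last committed version,
# so changes made by job A are not visible to job B while both run.
# If both jobs exit with code 0, each commits its fork as the new parent,
# and the later commit is the version that subsequent jobs fork ("last write").
# If a job fails, its fork is abandoned and the parent is left unchanged.
```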

Billing model

Cache volumes are charged at an initial fixed cost per pipeline build when a cache path (for example, cache: "node_modules") is defined at least once in the pipeline's pipeline.yml file. This fixed cost is the same regardless of how many times a cache path is defined or used in the pipeline.yml file.

An additional (smaller) charge is made per gigabyte of active cache, where active cache is defined as any cache volume used in the last 24 hours.

Git mirror cache

The Git mirror cache is a specialized type of cache volume designed to accelerate Git operations by caching the Git repository between builds. This is useful for large repositories that are slow to clone.

Git mirror caching can be enabled on the cluster's cache volumes settings page. Once enabled, the Git mirror cache will be used for all hosted jobs in that cluster. A separate cache volume will be created for each repository.

Screenshot: Hosted agents Git mirror setting displayed in the Buildkite UI

Container cache

The container cache can be used to cache Docker images between builds.

This feature is only available to Linux hosted agents.

Container caching can be enabled on the cluster's cache volumes settings page. Once enabled, a container cache will be used for all hosted jobs in that cluster. A separate cache volume will be created for each pipeline.

Screenshot: Hosted agents container cache setting displayed in the Buildkite UI