Cache volumes
Cache volumes are external volumes attached to hosted agent instances. They are attached on a best-effort basis, depending on their locality, expiration, and current usage, and therefore should not be relied upon as durable data storage.
By default, cache volumes:
- Are disabled, although you can enable them by providing a list of paths to cache at the pipeline level or the step level.
- Are scoped to a pipeline and are shared between all steps in the pipeline.
Cache volumes act as regular disks with the following properties:
- The volumes use NVMe storage, delivering high performance.
- The volumes are formatted with a regular Linux filesystem (for example, ext4), so they support any Linux use case.
Cache configuration
Cache paths can be defined in your pipeline.yml file. Defining cache paths for a step will implicitly create a cache volume for the pipeline.
When cache paths are defined, the cache volume is mounted under /cache in the agent instance. The agent links subdirectories of the cache volume into the paths specified in the configuration. For example, defining cache: "node_modules" in your pipeline.yml file will link ./node_modules to /cache/bkcache/node_modules in your agent instance.
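As a minimal sketch of the node_modules example above, the fragment below defines a single cache path on one step. The step label and command are hypothetical and only there to make the fragment a complete pipeline.

```yaml
# pipeline.yml (illustrative; the label and command are hypothetical)
steps:
  - label: "Install dependencies"
    command: "yarn install"
    # Links ./node_modules to /cache/bkcache/node_modules on the agent instance.
    cache: "node_modules"
```

Because the cache path is defined on a step, a cache volume is implicitly created for the pipeline.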
Custom caches can be created by specifying a name for the cache, which allows you to use multiple cache volumes in a single pipeline.
When requesting a cache volume, you can specify a size. The cache volume provided will have a minimum available storage equal to the specified size. On a cache hit, which is the common case, the actual volume size is the last used volume size plus the specified size. For example, if the last used volume size was 30 gigabytes and you request 20g, the attached volume is 50 gigabytes.
Defining a top-level cache configuration (as opposed to one within a step) sets the default cache volume for all steps in the pipeline. Steps can override the top-level configuration by defining their own cache configuration.
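The relationship between a top-level default and a step-level override might look like the following sketch. The step labels, commands, and the .build/test path are hypothetical.

```yaml
# pipeline.yml (illustrative; labels, commands, and paths are hypothetical)

# Top-level cache: the default for every step that does not define its own.
cache: "node_modules"

steps:
  - label: "Build"
    command: "yarn run build"
    # No cache key here, so this step uses the top-level configuration.

  - label: "Test"
    command: "yarn run test"
    # A step-level cache configuration overrides the top-level default.
    cache: ".build/test"
```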
Required attributes
paths | A list of paths to cache. Relative paths are resolved from the working directory of the step. Absolute paths are also accepted and are resolved from the root of the instance. Example: - ".cache" - "/tmp/cache"
Optional attributes
name | A name for the cache. Naming caches allows multiple cache volumes to be used in a single pipeline. Example: "node-modules-cache"
size | The size of the cache volume, specified as Ng, where N is the size in gigabytes. The default size is 20 gigabytes, which is also the minimum cache size that can be requested. Example: "20g"
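Putting the attributes together, the sketch below gives two steps their own named cache volumes. It assumes the paths, name, and size attributes nest directly under the cache key. The step labels, commands, and the bundle-cache name are hypothetical.

```yaml
# pipeline.yml (illustrative; assumes attributes nest under the cache key)
steps:
  - label: "Ruby dependencies"
    command: "bundle install"
    cache:
      paths:
        - "vendor/bundle"   # relative to the step's working directory
        - "/tmp/cache"      # absolute, from the root of the instance
      name: "bundle-cache"  # hypothetical cache name
      size: "20g"           # the default and minimum size

  - label: "Node dependencies"
    command: "yarn install"
    cache:
      paths:
        - "node_modules"
      name: "node-modules-cache"
      size: "50g"           # requests at least 50 gigabytes of available storage
```

Giving each cache a distinct name is what lets the two steps use separate cache volumes within the same pipeline.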
Lifecycle
At any point in time, multiple versions of a cache volume may be used by different jobs.
The first request creates the first version of the cache volume, which is used as the parent of subsequent forks until a new parent version is committed. A fork, in this context, is a readable and writable snapshot of the cache volume at a point in time.
Each later request attaches a fork of the previous cache volume version to the agent instance. The only exception is the first request, which starts with an empty volume because no earlier version exists to fork.
Each job gets its own private copy of the cache volume, as it existed at the time of the last cache commit.
Version commits follow a "last write" model: whenever a job terminates successfully (that is, exits with exit code 0), the final flushed volume of the exiting agent instance is committed as the new parent version of each cache volume attached to that job.
Whenever a job fails, the cache volume versions attached to the agent instance are abandoned.
Billing model
Cache volumes are charged at an initial fixed cost per pipeline build when a cache path (for example, cache: "node_modules") is defined at least once in the pipeline's pipeline.yml file. This fixed cost is the same regardless of how many times a cache path is defined or used in the pipeline.yml file.
An additional (smaller) charge is made per gigabyte of active cache, where active cache is defined as any cache volume used in the last 24 hours.
Git mirror cache
The Git mirror cache is a specialized type of cache volume designed to accelerate Git operations by caching the Git repository between builds. This is useful for large repositories that are slow to clone.
Git mirror caching can be enabled on the cluster's cache volumes settings page. Once enabled, the Git mirror cache will be used for all hosted jobs in that cluster. A separate cache volume will be created for each repository.
Container cache
The container cache can be used to cache Docker images between builds.
This feature is only available on Linux hosted agents.
Container caching can be enabled on the cluster's cache volumes settings page. Once enabled, a container cache will be used for all hosted jobs in that cluster. A separate cache volume will be created for each pipeline.