Buildkite Self-Healing Pipeline Example

This example demonstrates a self-healing pipeline built with Buildkite dynamic pipelines and Claude Code. When a PR build fails, adding a label triggers an AI agent that automatically diagnoses the failure and submits a fix.

How it works

  1. You add a buildkite-fix label to a GitHub PR that has a failing build
  2. GitHub sends a webhook to Buildkite, which starts a build
  3. The first step evaluates the webhook payload — if it’s not a label event, the build exits early
  4. A TypeScript handler reads the payload, finds the failed build via the Buildkite REST API, and checks that the failure matches the PR’s head commit
  5. If there’s a matching failure, the handler uses the Buildkite SDK to dynamically generate a new pipeline step and uploads it with buildkite-agent pipeline upload
  6. That step launches Claude Code in a Docker container with access to the repo, the failed build logs (via the Buildkite MCP server), and GitHub (via gh CLI)
  7. Claude reads the logs, diagnoses the issue, creates a fix on a new branch, opens a PR, and verifies it passes CI

The handler pattern

The core of the handler is short — read the webhook payload from build metadata, evaluate whether to act, and generate a step with the Buildkite SDK:

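import { execSync } from "node:child_process";
import { Pipeline } from "@buildkite/buildkite-sdk";

// 1. Read the webhook payload that Buildkite stored as build metadata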
const payload = JSON.parse(
  execSync("buildkite-agent meta-data get buildkite:webhook").toString(),
);

// 2. Evaluate the condition — right event, right label?
if (payload.action !== "labeled" || payload.label.name !== process.env.TRIGGER_ON_LABEL) {
  process.exit(0);
}

// 3. Generate a step with the Buildkite SDK and pipe it into `pipeline upload`
const pipeline = new Pipeline();
pipeline.addStep({ label: ":robot_face: Fix the build", command: "scripts/claude.sh" });
execSync("buildkite-agent pipeline upload", { input: pipeline.toYAML() });

The real handler also calls the Buildkite API between steps 2 and 3 to confirm there’s an actual failing build on the PR’s head commit — see scripts/handler.ts.
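
As a rough illustration of that extra check, the sketch below builds a "list builds" request URL for the Buildkite REST API and picks a failed build whose commit matches the PR's head SHA. It is not the code in scripts/handler.ts: the function names failedBuildsUrl and findFailedBuild are hypothetical, and the Build interface lists only the fields this sketch reads.

```typescript
// Minimal shape of a build object from the Buildkite REST API
// (only the fields this sketch reads).
interface Build {
  number: number;
  state: string;
  commit: string;
  web_url: string;
}

// Hypothetical helper: URL that lists failed builds of one commit in one pipeline.
function failedBuildsUrl(org: string, pipeline: string, commit: string): string {
  return (
    `https://api.buildkite.com/v2/organizations/${encodeURIComponent(org)}` +
    `/pipelines/${encodeURIComponent(pipeline)}/builds` +
    `?commit=${encodeURIComponent(commit)}&state=failed`
  );
}

// Hypothetical helper: first failed build whose commit is the PR's head SHA, if any.
function findFailedBuild(builds: Build[], headSha: string): Build | undefined {
  return builds.filter((b) => b.state === "failed" && b.commit === headSha)[0];
}
```

In a handler, the URL would be fetched with an Authorization: Bearer header carrying BUILDKITE_API_TOKEN, and headSha would come from pull_request.head.sha in the webhook payload. If findFailedBuild returns undefined, there is nothing to fix and the handler can exit without uploading a step.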

The key Buildkite features at play:

  • buildkite-agent pipeline upload — adding steps to a running build based on runtime conditions
  • buildkite-agent meta-data — reading webhook payloads stored as build metadata
  • @buildkite/buildkite-sdk — programmatically generating pipeline YAML in TypeScript
  • Buildkite webhooks — triggering builds from external events
  • Buildkite Hosted Models — proxying LLM requests through Buildkite’s model provider endpoint

What’s interesting about this?

This pipeline doesn’t have a fixed set of steps. Whether anything happens at all depends on the webhook payload and the state of the builds at that moment. That’s the core idea behind dynamic pipelines — your pipeline logic runs at build time and decides what to do based on real conditions, not static YAML.
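
One way to picture "no fixed set of steps" is a pure function that returns the steps to upload, which may be none. The sketch below is an assumption-laden illustration, not the example's actual code: planSteps, LabelEvent, and Step are hypothetical names, and the step itself is copied from the handler shown earlier.

```typescript
// Only the parts of the GitHub pull_request webhook payload this sketch reads.
interface LabelEvent {
  action: string;
  label?: { name: string };
}

// A command step in the same shape the handler passes to the Buildkite SDK.
interface Step {
  label: string;
  command: string;
}

// Hypothetical helper: decide at build time which steps, if any, to upload.
function planSteps(event: LabelEvent, triggerLabel: string, hasFailedBuild: boolean): Step[] {
  const isTrigger =
    event.action === "labeled" &&
    event.label !== undefined &&
    event.label.name === triggerLabel;

  // Wrong event, wrong label, or no failing build: upload nothing.
  if (!isTrigger || !hasFailedBuild) {
    return [];
  }
  return [{ label: ":robot_face: Fix the build", command: "scripts/claude.sh" }];
}
```

Keeping the decision separate from the buildkite-agent calls makes it easy to unit test without a running agent.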

The self-healing use case takes this further: the pipeline not only decides whether to act, it decides what to do by handing the problem to an AI agent. This is one pattern for building agentic CI/CD workflows on Buildkite.

Setup

To run this yourself:

  1. Fork this repo
  2. Create a Buildkite pipeline pointing to your fork with webhook support enabled
  3. Configure a GitHub webhook to send pull_request events with the labeled action to Buildkite
  4. Set up the required secrets in your Buildkite pipeline: GITHUB_TOKEN and BUILDKITE_API_TOKEN
  5. Add the buildkite-fix label to a PR with a failing build and watch it go
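
A missing secret tends to show up late, as an authentication error in the middle of a build. A small preflight check can fail fast instead. The sketch below is an optional addition that is not described in the example itself, and missingSecrets is a hypothetical helper name. The two secret names are the ones listed in step 4.

```typescript
// Secrets named in the setup steps above.
const REQUIRED_SECRETS: string[] = ["GITHUB_TOKEN", "BUILDKITE_API_TOKEN"];

// Hypothetical helper: names of required variables that are unset or empty.
function missingSecrets(
  env: Record<string, string | undefined>,
  required: string[] = REQUIRED_SECRETS,
): string[] {
  return required.filter((name) => !env[name]);
}
```

A handler could call missingSecrets(process.env) at startup and exit with a non-zero status, printing the returned names, when the list is not empty.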

Known limitations

  • The handler assumes the Buildkite org slug and pipeline slug match the GitHub org and repo name. This won’t always be the case — you may need to configure these separately.
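
One possible workaround, sketched under assumptions: derive the slugs from the GitHub repository's full name (the repository.full_name field of the webhook payload, in "org/repo" form) and let explicit overrides win when they are set. The helper resolveSlugs and the override names are hypothetical and are not part of the example's handler.

```typescript
interface Slugs {
  org: string;
  pipeline: string;
}

// Hypothetical helper: prefer explicit overrides, else fall back to the
// GitHub org and repo name taken from "org/repo".
function resolveSlugs(
  repoFullName: string,
  overrides: { org?: string; pipeline?: string } = {},
): Slugs {
  const parts = repoFullName.split("/");
  const githubOrg = parts[0] || "";
  const githubRepo = parts[1] || "";
  return {
    org: overrides.org || githubOrg,
    pipeline: overrides.pipeline || githubRepo,
  };
}
```

The overrides could be read from environment variables of your choosing, for example hypothetical names such as FIX_ORG_SLUG and FIX_PIPELINE_SLUG, so that the default behavior stays unchanged when neither is set.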

Credits

Originally built by Grant Colegate and Christian Nunciato as a demo for AWS re:Invent.

License

See LICENSE (MIT)
