
Ephemeral macOS builds with Buildkite, Nix, and Tailscale


At Determinate Systems, our mission is to make Nix dramatically easier to use for everyone. Nix is an extremely powerful but often tricky language, build tool, and package manager (for more on Nix, check out Zero to Nix, an open source learning resource we recently released). Traditionally, Nix has been seen as more of a “Linux thing,” and part of our mission is to change that by making Nix just as easy to install and use on macOS as it currently is on Linux. This objective has implications across our org, but in this post I'd like to talk about what it means for our continuous integration (CI) needs, and how we were able to use Buildkite's unique feature set to great effect.

Our CI needs

Like many organizations, we use GitHub Actions when we have relatively straightforward CI logic, because it’s easy to set up and generally has a nice interface (although we use it in a non-standard, Nix-heavy way). There are times, however, when our needs go well beyond what Actions can offer. Whenever that happens, we turn to Buildkite, as we’ve done across several important projects.

One area where GitHub Actions does not shine is in support for macOS machines. More specifically, Actions doesn’t provide an Apple Silicon runner, which means that if we want to run jobs on Apple M1 and M2 machines, we’re simply out of luck with GitHub Actions. Another problem: we don’t use Rosetta, and we want to be strict in checking that our M1 builds don’t use it and that our Intel (x86) builds don’t use any M1 features.
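One part of that strictness can be automated by inspecting the binaries a build produces. The sketch below is an illustration only, not a description of the checks we actually run: it reads the Mach-O header of a file and reports which CPU architectures it contains, so a pipeline step could fail if an Apple Silicon build contains x86_64 code or vice versa. The function names `macho_architectures` and `assert_only_arch` are hypothetical; the magic numbers and CPU type values come from Apple's Mach-O headers.

```python
import struct

# Constants from Apple's <mach-o/loader.h>, <mach-o/fat.h>, and <mach/machine.h>.
MH_MAGIC_64 = 0xFEEDFACF  # thin 64-bit Mach-O file
FAT_MAGIC = 0xCAFEBABE    # universal ("fat") binary; its header is big-endian
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def macho_architectures(data: bytes) -> set[str]:
    """Return the CPU architectures named in a Mach-O header."""
    if len(data) < 8:
        raise ValueError("too short to be a Mach-O file")
    # Thin x86_64 and arm64 binaries store their header little-endian.
    magic_le, cputype_le = struct.unpack_from("<II", data, 0)
    if magic_le == MH_MAGIC_64:
        return {CPU_TYPES.get(cputype_le, hex(cputype_le))}
    magic_be, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic_be == FAT_MAGIC:
        archs = set()
        for i in range(nfat_arch):
            # Each fat_arch entry is five 32-bit fields; cputype comes first.
            (cputype,) = struct.unpack_from(">I", data, 8 + 20 * i)
            archs.add(CPU_TYPES.get(cputype, hex(cputype)))
        return archs
    raise ValueError("not a recognized Mach-O file")


def assert_only_arch(data: bytes, expected: str) -> None:
    """Fail unless the binary contains exactly the expected architecture."""
    found = macho_architectures(data)
    if found != {expected}:
        raise AssertionError(f"expected only {expected}, found {sorted(found)}")
```

In a CI step, this could be called as `assert_only_arch(open(path, "rb").read(), "arm64")` on each build artifact. Note that this checks only what a binary contains; whether a process is actually running under Rosetta is a separate, runtime question.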

Weak support for Apple machines on Actions is a problem for us because we have several use cases for them. Here are two public examples (others aren’t public yet):

  • We built a tool called Riff that uses Nix to provide dependencies in Rust projects. We need to build and run Riff on Apple Silicon so that we can distribute Riff to users on M1.
  • We’re currently working on a next-generation Determinate Nix Installer (next-generation because it provides features like seamlessly uninstalling Nix from your machine). As with Riff, we need to distribute Apple Silicon-compatible binaries to users, but we also need to test the Determinate Nix Installer on “fresh” Apple Silicon machines without Nix installed.

As you can see, we need a completely custom CI setup for macOS.

Our Buildkite setup

We use Buildkite in a way that is both fairly elegant and comically inelegant. The comically inelegant part: our CEO, Graham Christensen, has four physical Macs running in his basement. Why four? Because we wanted to provide a proper “matrix” of machines:

                With Nix installed    Without Nix installed
x86_64          1 Mac                 1 Mac
Apple Silicon   1 Mac                 1 Mac

The four physical machines in Graham's basement.

Here's what this majestic setup looks like:

Determinate Systems' macOS build farm in Graham Christensen's basement


And now for the more elegant part. Each of the Macs you see in that picture:

  • Runs Mosyle, a mobile device management (MDM) solution for macOS. This enables us to control the machines remotely rather than needing to interact with them directly.
  • Is declaratively configured using Nix and nix-darwin. Most of the configuration logic is in this Nix file. With Nix, we have an actual language we can use to describe the desired system setup. No scripts, no brittle procedural logic; just a tidy expression of what we want the machine to look like, and Nix handles the dirty work of setting the system up.
  • Is configured as a Buildkite agent with just a few lines of Nix. This enables the machines to do our Buildkite bidding in pipelines like this and this.

Making our macOS machines ephemeral

Whenever one of our Macs is finished running a Buildkite job, it erases itself and shuts down; when it’s asked to run a job, it boots up and configures itself from scratch using Nix and nix-darwin. This means that each Buildkite job runs on a “fresh” machine.

How do we achieve this? We’ve built an internal tool called bonk that erases the machines and shuts them down using the Mosyle API. We can remotely trigger bonk over our company Tailscale network at any time.

Here’s the causal sequence for a typical Buildkite job:

  • The Mac machine (acting as a Buildkite agent) awaits instructions from Buildkite.
  • The machine receives instructions from Buildkite and runs its assigned job.
  • The machine erases itself and shuts down.
  • Mosyle uses Nix and nix-darwin to generate a “fresh” machine configured to our specifications.
  • ♾️

We wanted a fully automated pool of totally ephemeral macOS machines. It took a bit of elbow grease (and Nix), but we achieved that with Buildkite.
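The job cycle listed above can be modeled as a small state machine. The following is a toy simulation rather than our actual tooling: the `Mac` class, the `State` names, and the `generation` counter are all hypothetical and exist only to show the property we care about, namely that no two jobs ever run on the same build of a machine.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    WAITING = auto()       # agent is up, awaiting instructions from Buildkite
    RUNNING = auto()       # agent is running its assigned job
    ERASED = auto()        # machine has erased itself and shut down
    PROVISIONING = auto()  # machine is being rebuilt with Nix and nix-darwin


# The only legal transition out of each state; the last one closes the loop.
NEXT_STATE = {
    State.WAITING: State.RUNNING,
    State.RUNNING: State.ERASED,
    State.ERASED: State.PROVISIONING,
    State.PROVISIONING: State.WAITING,
}


@dataclass
class Mac:
    state: State = State.WAITING
    generation: int = 0  # incremented on every rebuild
    history: list = field(default_factory=list)

    def step(self) -> None:
        """Advance to the next state in the cycle."""
        self.state = NEXT_STATE[self.state]
        if self.state is State.PROVISIONING:
            self.generation += 1

    def run_job(self, job: str) -> None:
        """Run one job, then wipe and rebuild so the next job starts fresh."""
        if self.state is not State.WAITING:
            raise RuntimeError("jobs only start on a freshly provisioned machine")
        self.step()  # WAITING -> RUNNING
        self.history.append((job, self.generation))
        self.step()  # RUNNING -> ERASED
        self.step()  # ERASED -> PROVISIONING
        self.step()  # PROVISIONING -> WAITING
```

Running two jobs through one simulated machine records each job against a different generation, which is the ephemerality guarantee in miniature.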

Making the machines accessible

While we mostly use our Mac build farm to run automated Buildkite jobs, we also occasionally use it for more ad-hoc experiments. Only one of us in the company uses a Mac as a daily work machine, and we wanted to avoid buying everyone in the company an admittedly pretty but also pricey new Apple laptop. We set up all the machines inside our company’s Tailscale virtual private network (VPN) so that any of us, even the most serious Linux diehard, can access and tinker with those machines at any time.

The benefits of our setup

Having a pool of ephemeral Buildkite-ready macOS machines has been hugely beneficial for us, and it’s hard to see us fulfilling our mission to make Nix easier for macOS users without it. The two most important benefits:

  • Cost. Using macOS instances on AWS, for example, requires a minimum of 24 hours of usage (due to Apple licensing restrictions). This made cloud-based provisioning extremely cost-prohibitive for us over the longer term. The up-front cost of buying physical machines was high, but the purchase quickly paid for itself in savings compared with cloud-based solutions.

    More specifically: running a Mac machine on AWS for just one month costs 75% of the purchase price of a similarly configured physical machine. Because of the 24-hour minimum, even a single CI job each day incurs a full day of charges, so if we run one job a day for two months we’ve already paid more than the price of the machine. Overall, the monetary benefits are a clear win even when accounting for energy and other costs associated with running physical machines.

  • Ephemerality and speed. One of the typical advantages of cloud provisioning is that those machines are ephemeral (read: easily disposable). Spin up the machine, use it, shut it down, done. With our macOS setup, we get to have our cake and eat it too: we get immense savings by avoiding high variable costs in the cloud while also having access to purely ephemeral machines.

    In terms of speed, the full finish/wipe/rebuild cycle for our Macs typically takes 5-7 minutes, which is dramatically faster than what we got on AWS, where the process typically took around 15 minutes.
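The break-even arithmetic behind the cost figures above can be sketched in a few lines. This is a back-of-the-envelope calculation under two stated assumptions: a month is 30 days, and a single daily job is billed as a full day because of the 24-hour minimum. The function names are made up for illustration, and energy and other running costs are ignored.

```python
import math
from fractions import Fraction

# From the post: one month on AWS costs 75% of the machine's purchase price.
MONTHLY_CLOUD_FRACTION = Fraction(3, 4)
DAYS_PER_MONTH = 30  # assumption made for this sketch


def cloud_cost_fraction(days: int) -> Fraction:
    """Cloud spend after `days` billed days, as a fraction of the purchase price."""
    return MONTHLY_CLOUD_FRACTION / DAYS_PER_MONTH * days


def breakeven_days() -> int:
    """First day on which cumulative cloud spend reaches the purchase price."""
    daily = MONTHLY_CLOUD_FRACTION / DAYS_PER_MONTH
    return math.ceil(1 / daily)


print(breakeven_days())         # 40
print(cloud_cost_fraction(60))  # 3/2
```

Under these assumptions, cloud spend matches the purchase price after 40 billed days, and two months of daily jobs cost one and a half times the price of the machine.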

The value of Buildkite

True to our company name, we specialize in determinate software systems. That is, systems that are configured to exact, declared specifications from the ground up. More generic CI runners (like GitHub Actions) don't meet our needs. What Buildkite enables us to do is delegate work to whatever gnarly constellation of machines we happen to dream up. Right now, that’s a macOS build farm but it could be something else in the future.

Buildkite really shines as a delegator. It passes the bare minimum information required to define a unit of work and we get to control the rest. Determinacy is essential to us, and other people’s CI runners often don’t provide that. Buildkite, by contrast, provides a neat and tidy separation of concerns that fits our macOS use cases perfectly, and may well cover other use cases in the future.