Build systems in the age of AI-assisted coding

AI-assisted coding tools have upended how software developers work. These tools can write code for you, answer questions, and get involved in essential workflows like code reviews. In short, they save typing and thinking time, making developers much more productive.

This conversation isn’t new, but what’s changing fast is the maturity of the tools. Compared to a year ago, advancements in AI mean that the code generated is often precise, it matches what the user intended, and it usually works. You cannot ignore AI-assisted coding as a tool to gain competitive advantage.

However, the increased usage of AI-assisted coding tools has ripple effects. The generated code may have poor style and bugs. More tests might be needed to compensate for this. And because more code is being produced, the number of pull requests and code reviews per day will increase.

These ripple effects put more strain on build systems, and for reasons we will explore in this article, not all build systems scale up well to handle the new load.

The AI-assisted coding revolution

The rate of uptake of AI-assisted coding is astonishing. GitHub Copilot first appeared only a few years ago (as a technical preview in June 2021), and according to a GitHub survey, 92% of U.S.-based developers already use AI coding tools both in and outside work.

Today there is an abundance of coding assistants in use, including:

GitHub Copilot: Copilot offers autocompletion and code generation from chat. Copilot is available in different editors, including VS Code, JetBrains IDEs, and Vim.
Codeium: Codeium also offers autocompletion and chat-based generation. It’s compatible with many editors and is currently free for individuals. In addition, it is context-aware, which means it considers the whole codebase when making suggestions.
Continue: What sets Continue apart is that it is open source under the Apache License, which opens up the possibility of extending it yourself. Its features are similar to those of the other tools here, plus the ability to add documentation (e.g., React docs) to the knowledge available to the agent.
Cody: Sourcegraph’s Cody assistant offers chat, code generation, and wide LLM integration. You can choose models from all the major players, including Microsoft, OpenAI, Hugging Face, and Amazon Bedrock.
GitLab Duo: Duo is bundled with GitLab’s paid tiers. It sports organizational controls, refactoring, and the typical chat and code generation features.
There are many more: A recent Reddit post lists 70 coding assistant tools.

Engineers have a lot of choice as to which coding assistant to use. Different coding assistants use different LLMs under the hood, with different system prompts and methods of augmenting the LLM with information about the codebase. This means that the tool that works best can depend a lot on the use case, the programming languages used, and the structure of the code.

But regardless of which assistant they choose, most engineers are using some form of AI-assisted development these days, and the applications are broad:

Autocompletion: Code can be repetitive, and the assistant can save typing time. An example is when you’ve made an API endpoint to create an object, and now you need to make one that updates the object. Having seen the previous endpoint and the function name, the assistant can write the entire function for you.
General questions: You might want to ask questions like “Can a Dockerfile produce two different images?”. In this capacity, the assistant acts like ChatGPT and can give you an answer as a chat response, except now you don’t need to leave the IDE.
Questions about your code base: If you’re new to a codebase, you might ask, “What are the main frameworks and patterns used in this codebase?” or “What is the Rust code used for in this repo?”.
Documentation: A coding assistant is well suited to writing and updating comments to match new code. You can also ask it to write markdown documentation pages, based on your code, that you can include in your repo.
Anything else: Coding assistants can figure out pretty much any new task you might come up with, provided that it involves looking at your code and generating code or text. Examples include web design, creating an SVG logo, or making an error message sound more friendly.

The hidden challenges of AI-assisted development

Of course, there are challenges when using code assistants. You’ve heard of hallucinations, where an AI chat agent gives a confident but wrong answer. This happens with code generation, causing code that is either wholly incorrect or has a subtle bug. It may also produce working code, but not for the problem you intended to solve.

The more pressing, less acknowledged issue is the effect that software engineers’ newfound productivity has on build systems.

AI assistance leads to more code and a higher commit velocity, which gives us new challenges.

Increased code reviews

Coding assistant usage causes code to be produced at a faster rate, creating more code review work. Code reviews are necessary to reduce the chance of bugs, and to make sure guidelines are being followed. Ironically, AI can actually help here because assistants can speed up these reviews in a few ways:

PR summaries: AI-generated summaries can tell you what the pull request is about and what the reviewer should focus on.
Security scanning: Various tools can detect vulnerabilities in code changes.
Review assistance: You can configure some tools to handle aspects of the code review directly, like commenting on code reviews and detecting potential bugs in the code.

Impact on testing and QA

In modern software development, most testing is done automatically. QA teams that once tested everything manually are now responsible for manual and automated testing. The increased commit rate from AI-assisted development will increase the amount of work for both types of testing.

Automated tests will be added faster, causing the test suite to take longer and be more expensive to run. Furthermore, tests that occasionally fail even when the code is unchanged (flaky tests) will occur more often and must be tamed to avoid wasted build and software engineer time.
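
To see why flaky tests hurt more as the suite grows, consider a rough model. It assumes every flaky test fails independently with the same small probability, which is a simplification, and the numbers are hypothetical rather than measured. Under those assumptions, the chance that a build hits at least one spurious failure is 1 - (1 - p)^n:

```python
def spurious_failure_probability(flaky_tests: int, failure_rate: float) -> float:
    """Chance that at least one of `flaky_tests` independent flaky tests fails in a run."""
    return 1.0 - (1.0 - failure_rate) ** flaky_tests


if __name__ == "__main__":
    # Hypothetical numbers: each flaky test fails 1% of the time.
    for count in (10, 50, 200):
        p = spurious_failure_probability(count, 0.01)
        print(f"{count:>3} flaky tests -> {p:.1%} of builds fail spuriously")
```

Even a low per-test failure rate compounds quickly, which is why a growing suite makes flaky-test ownership more urgent.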

Manual tests need to be done more selectively, and ideally automated if possible, to avoid swamping the team with work or delaying releases. As with code reviews, AI can help speed up testing—companies like Uber use AI to simulate human intent when testing mobile apps, for example.

Pressure on build systems and infrastructure

With AI helping generate and review code, productivity is up. However, more commits mean more build system load. In many build systems, this causes an increase in build times, the reasons for which we will explore in a moment.

Build time increases are horrible for software engineers. Waiting for builds to complete often means they have to switch to a new task while the build runs, and then later come back to “babysit” the old task—looking at test failures, conflicts, and logs and trying to recall the details to resolve issues. This context-switching is terrible for productivity.

In many ways, this isn’t a new problem—it also arises when companies hire new software engineers quickly, as the number of commits typically rises as a result. But AI-assisted development threatens to exacerbate the problem beyond the typical strain of a growing team. Before we look at the fix, let’s explore why extra load is so problematic for build systems in particular.

Why do build systems slow down with the extra load?

In the age of scalable cloud computing, where you can quickly scale a web server from 1 to 1000 nodes, why is it a hassle to scale up build systems? Some of the challenges include:

More tests: Almost without exception, each build runs a suite of automated tests. As commits add tests, the suite takes longer to run. The load equals the test time multiplied by the (now higher) commit rate, so when both grow, the load increases superlinearly.
Monorepos: In a monorepo, diverse changes, from two-factor authentication enhancements to CSS fixes, hit the same repo and must be merged one at a time, making merge queues critically important. A large monorepo receives many commits for all sorts of changes, and each commit could take a long time to build, especially if the build system can’t scale elegantly. Learn how Uber brought sanity to their monorepo builds using Buildkite.
Merge issues and incompatibilities: Two pull requests can work perfectly when applied separately but have bugs when combined. If this happens in a CI/CD system, the second pull request in the queue will fail to merge, fail to build, or fail its tests. It now requires a fix from the programmer and probably a Git rebase, or worse, some kung fu in the IDE. More commits per hour mean these incompatibilities become more frequent.
Compute requirements: Depending on the build system, increasing the number of runners may be easy or hard. On-prem systems need to be manually scaled up, and many SaaS build systems may not have the capacity at hand. While Ops or DevOps are sorting this out, the commits just hit a queue and stay idle while waiting for a runner, which causes delays.
Specialist environments: Some build processes might need a particular environment, like an Apple M2 Pro processor or a bare-metal machine with an A100 GPU. These are harder to acquire and scale up than regular VMs. A lack of these resources can create bottlenecks even when other build steps run quickly and are scaled up.
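
The load arithmetic in the “More tests” item can be sketched in a few lines. The figures are hypothetical (a 10-minute suite, 40 commits a day), and the model assumes that every commit runs the full suite and that suite time grows in proportion to the commit rate. Caching or selective testing would change the picture.

```python
def daily_build_minutes(suite_minutes: float, commits_per_day: float) -> float:
    """Total compute minutes per day if every commit runs the full suite."""
    return suite_minutes * commits_per_day


def scaled_load(suite_minutes: float, commits_per_day: float, growth: float) -> float:
    """Load after the commit rate grows by `growth` and the suite grows with it.

    Assumes, for illustration only, that suite time grows in proportion to
    the commit rate because new commits bring new tests.
    """
    return daily_build_minutes(suite_minutes * growth, commits_per_day * growth)


if __name__ == "__main__":
    before = daily_build_minutes(10, 40)   # hypothetical baseline
    after = scaled_load(10, 40, growth=2)  # twice the commits, twice the tests
    print(f"before: {before:.0f} min/day, after: {after:.0f} min/day "
          f"({after / before:.0f}x)")
```

Under these assumptions, doubling the commit rate quadruples the load, so adding runners in proportion to the commit rate alone would not keep up.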

These challenges can quickly turn yesterday’s humming build system into today’s ops nightmare. Your build system should provide a foundation that makes dealing with these challenges feel as effortless as possible. Next, we will look at what a build system can do to help.

Software delivery in the new world of AI-assisted development

Software development practices have evolved significantly over the past two decades. The introduction of continuous integration and deployment (CI/CD) systems was an enormous boon to software delivery speed and quality. CI/CD uses highly controlled ephemeral environments to build and test software consistently. This solves the “works on my machine”-type problems and mitigates the risks of shipping buggy software.

However, in the 2020s, more improvement is needed. The complexity of software, the expectations of fast delivery, the extensive software supply chains, and the increasing size of teams and projects add too much pressure to simple CI/CD systems. With AI-assisted development increasing the commit frequency, those systems have reached their limits. A fresh approach is needed to create build systems capable of meeting these modern demands.

The future: Scale-out delivery platforms

To meet the challenges created by the prolific use of AI assistance, we need build systems that are both scalable and flexible.

Scalability keeps the time engineers spend waiting for something (a test outcome, a commit to main, a deployment) as close to zero as possible, even as the workload on the build system increases. That increase may be a predictable trend as the company grows or an unexpected spike due to many projects coinciding.
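
A small simulation illustrates how fixed capacity turns a higher commit rate into waiting time. It is a simplified sketch, not a model of any particular build system: builds arrive at a constant interval, each takes the same duration, and the workload and runner counts are hypothetical.

```python
import heapq


def average_wait(builds: int, interarrival: float, duration: float, runners: int) -> float:
    """Average minutes a build waits for a free runner.

    Builds arrive every `interarrival` minutes and each occupies one runner
    for `duration` minutes. Runners are modelled as a min-heap of the times
    at which they next become free.
    """
    free_at = [0.0] * runners
    heapq.heapify(free_at)
    total_wait = 0.0
    for i in range(builds):
        arrival = i * interarrival
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait if no runner is free yet
        total_wait += start - arrival
        heapq.heappush(free_at, start + duration)
    return total_wait / builds


if __name__ == "__main__":
    # Hypothetical workload: 200 builds of 10 minutes each.
    for interarrival in (5.0, 2.5):  # the commit rate doubles
        for runners in (2, 4):
            wait = average_wait(200, interarrival, 10.0, runners)
            print(f"one build every {interarrival} min, {runners} runners: "
                  f"average wait {wait:.1f} min")
```

In this model, two runners keep up with one build every five minutes, but once the arrival rate doubles the queue grows without bound until the runner count doubles too.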

Flexibility allows the system to adapt as jobs become more complex. You don’t want your engineers to have to work around build system limitations using hacks or worse—deploying unsanctioned internal apps. In addition, as engineering teams innovate—training AI on custom hardware, for example—you need a build system that can adapt to the new challenges these innovations create.

These attributes prevent bottlenecks, reduce wasted time, and free platform engineering and DevOps teams from constant troubleshooting, ultimately enabling developers to focus more on product development.

Buildkite understands this and provides a scalable and adaptable delivery system that software and platform engineers love to use. Buildkite offers scale-out delivery, which makes the engineering function more efficient and gets more features to customers quickly. Scale-out delivery promises scale across four dimensions, each important for handling the demands and complexities of software delivery today.

Scale-out concurrency of build processes: Your build system needs to keep up with the rapidly growing volume of work from large engineering teams. It must be able to scale up without limits while avoiding bottlenecks. Eliminating bottlenecks such as long queues and dependency chains ensures that the scale-up actually results in reduced build times and more productivity.
Scale-out components: You must be able to make your builds and tests suit your unique requirements. You can’t afford to hit constraints that stop you in your tracks and cause you to waste time on workarounds. This requires an unopinionated platform with scalable components, tools, and services that you can use to create workflows to your specification.
Scale-out workloads: Your build system needs to be able to support all software delivery workloads and use cases within a single comprehensive platform. You don’t want the hassle (and risk) of maintaining multiple software delivery tools.
Scale-out compute: Whether you are building on EC2, testing on a bare-metal cluster, or deploying firmware to a self-driving car, the build system should support your scenarios. It should support self-hosted cloud, multi-cloud, and on-premises compute, as well as hosted solutions for specialized workloads such as macOS, iOS, and Android.

Here are some of the scale-out delivery capabilities that the Buildkite platform provides:

SaaS control plane: Buildkite runs a fully managed SaaS control plane that routinely orchestrates thousands of agents with ease.
Test Engine: The Buildkite Test Engine can intelligently split tests across many agents, using statistics to ensure the work is distributed evenly across them. Using our flaky test detection and assignment system, you’ll have visibility and ownership of flaky tests, empowering you to eliminate them.
Dynamic pipelines: Rather than a static YAML file, your pipelines can be defined in code. This allows you to adapt to more complex jobs, and use any programming language to define your pipeline using code at runtime.
Plugin support: Use Buildkite-authored plugins, plugins hosted on GitHub or locally, or write your own custom plugins in any programming language.
Self-hosted: By default, Buildkite agents are self-hosted. Open source, no limits. Buildkite doesn’t see your secrets or source code, nor should we!
No-nonsense UI: You can quickly see what’s happening across the whole organization, and in very few clicks, drill down to individual jobs or log outputs. Features like annotations can surface the most pertinent information. And there’s emoji support 😉
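
As a rough illustration of the test-splitting and dynamic-pipeline ideas above, the sketch below generates pipeline steps at runtime, one per test shard, and balances the shards by historical timings with a simple greedy heuristic. It does not show how Test Engine works internally. The test file names, the timings, and the `pytest` command are hypothetical. The output uses the `steps`, `label`, and `command` keys of a Buildkite pipeline definition, and could be piped to `buildkite-agent pipeline upload`, which accepts JSON as well as YAML.

```python
import heapq
import json


def split_by_duration(timings: dict[str, float], shards: int) -> list[list[str]]:
    """Greedy longest-first split: give each test to the least-loaded shard."""
    heap = [(0.0, index) for index in range(shards)]  # (total seconds, shard index)
    heapq.heapify(heap)
    groups: list[list[str]] = [[] for _ in range(shards)]
    for test, seconds in sorted(timings.items(), key=lambda item: -item[1]):
        load, index = heapq.heappop(heap)
        groups[index].append(test)
        heapq.heappush(heap, (load + seconds, index))
    return groups


def build_pipeline(timings: dict[str, float], shards: int) -> dict:
    """Return a pipeline definition with one command step per non-empty shard."""
    steps = []
    for number, tests in enumerate(split_by_duration(timings, shards), start=1):
        if tests:
            steps.append({
                "label": f"tests {number}/{shards}",
                "command": "pytest " + " ".join(tests),  # hypothetical test runner
            })
    return {"steps": steps}


if __name__ == "__main__":
    # Hypothetical historical timings, in seconds.
    timings = {"test_api.py": 120.0, "test_ui.py": 90.0,
               "test_db.py": 60.0, "test_utils.py": 30.0}
    print(json.dumps(build_pipeline(timings, shards=2), indent=2))
```

Because the steps are computed by a program rather than written into a static file, the shard count or the grouping can change from build to build as timings shift.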

Buildkite is used by leading engineering companies such as Uber and Slack as their standard build system, and it’s worth considering if you are hitting some of the challenges and walls mentioned earlier.

Conclusion

AI-assisted development is a boon to software engineers and tech companies. It lets us get more done and eliminate many mundane parts of software development.

However, this creates new challenges downstream because build systems are more strained. This can result in a reduced, or even terrible, developer experience. It impacts productivity and leads to losing the gains made by using AI tools in the first place.

We need to use flexible and scalable build systems to conquer this challenge. Buildkite provides build systems that can do this.

If you want to learn more and see how Buildkite might be helpful for your engineering teams, you can read more about Buildkite’s platform, or get started for free at buildkite.com/signup.
