Syntax linting in Buildkite's Docs with Vale and Alex

We're all familiar with publishing typos in production docs. But what if there was a better way? After merging Juanito's super helpful pull request to replace 'blacklist' and 'whitelist' with friendlier terms, I was finally motivated to add a few tools and checks to stop myself and others from making those types of mistakes.

These ideas aren't new; here are some lovely folks talking about improving the inclusivity of docs in 2017 and 2018.

In this post I'll show you how to configure a couple of different tools to run various checks on the documentation, and how to configure Buildkite to run them automatically.

Anyhow, to the tools.

Alex

First up, alex is a one-stop Node.js tool for catching insensitive or inconsiderate writing.

As long as you have npm (with Node.js) installed, you can run alex directly without installing it using npx. Running alex on our docs repo looks a bit like this:

$ npx alex@8 --diff pages/**/*.erb pages/**/*.txt

We're using npx because we're planning to run this in CI, pinning alex at version 8, and checking all docs in the pages directory: some are *.erb files and some are plain-text *.txt files. (*.erb files are Embedded Ruby template files, which are mostly Markdown with a sprinkling of Ruby code.)
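One thing to watch for: in shells without recursive globbing turned on (plain sh, or bash without its globstar option), `**` behaves like a single `*`, so those patterns only reach files exactly one directory below pages. That is enough for the paths shown in this post. If your docs nest deeper, a sketch like the following lists the files with find instead. The function name list_doc_files is made up for illustration, and the commented-out xargs line assumes file names without spaces.

```shell
#!/bin/sh
# Hypothetical helper: list every *.erb and *.txt file under a directory,
# at any depth, without relying on how the shell expands `**`.
list_doc_files() {
  find "$1" -type f \( -name '*.erb' -o -name '*.txt' \) | LC_ALL=C sort
}

# Possible usage (needs Node.js and network access, so it is left commented out):
# list_doc_files pages | xargs npx alex@8 --diff
```

Sorting in the C locale just keeps the list stable between runs, which makes CI logs easier to compare.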

The first time I ran it, alex returned a few hundred errors and warnings in this format:

pages/integrations/cc_menu.md.erb: no issues found
pages/integrations/github.md.erb: no issues found
pages/integrations/github_enterprise.md.erb: no issues found
pages/integrations/gitlab.md.erb: no issues found
pages/integrations/phabricator.md.erb: no issues found
pages/integrations/slack.md.erb: no issues found
pages/integrations/sso.md.erb: no issues found
pages/pipelines/artifacts.md.erb: no issues found
pages/pipelines/block_step.md.erb
        3:37-3:46  warning  Be careful with `execution`, it’s profane in some cases                                                                              execution        retext-profanities

pages/pipelines/branch_configuration.md.erb
      68:65-68:72  warning  Be careful with `periods`, it’s profane in some cases                                                                                periods          retext-profanities

I decided that neither periods nor execution were words that I need to avoid in a software context, so I added them to the list of exceptions in .alexrc:

allow:
  - execution
  - periods

A combination of adding all of the checks that I didn't find useful to .alexrc and making edits to the docs where necessary removed most of the warnings. Sometimes, though, as a real thinking human person you know better than the linting tool. In that situation you can explicitly tell alex to ignore something that it would otherwise warn you about, using an HTML-style comment (which is valid in both Markdown and HTML).

<!--alex ignore whitelist -->
This sentence will **not** trigger the check for whitelist. This can be useful when writing a style guide that includes examples of what *not* to write.

Great, our docs are already much more friendly and inclusive, but what else can I improve while I'm here?

Vale

Vale is a syntax-aware linter for prose built with speed and extensibility in mind, written in Go. Sure, but what does that mean? It means that we can configure it to do various different things, such as spellchecks, testing for unintentional repeated words, suggesting alternatives to incorrect capitalization, etc.

Exactly how you install vale depends on where and how you're using it, but we're doing something like this:

curl -sfL https://install.goreleaser.com/github.com/ValeLint/vale.sh | sh -s v2.2.2

Note that you should never download and pipe software to a shell interpreter like this from a source you do not trust.

Configuring vale is a little trickier, because it does more things. Let's take a look at .vale.ini:

# we'll keep our custom configuration in the vale/ directory
StylesPath = vale

# display warnings as well as errors
MinAlertLevel = warning

# Remember how we said we use both *.txt files and *.erb files?
# This tells vale to treat both as markdown
[formats]
erb = md
txt = md

# For all file formats [*], run the tests and checks in vale/Buildkite
# and run the built-in check for repeated words (Vale.Repetition)

[*]

BasedOnStyles = Buildkite, Vale.Repetition

In vale/Buildkite, which is where we configure our custom checks, there are currently two files, one for each check that we do.

existence.yml suggests alternatives to things that we know we don't want in the docs:

whitelist → allowlist
blacklist → blocklist
oAuth → OAuth

The configuration for it looks like this:

extends: substitution
message: Consider using '%s' instead of '%s'
level: error
# swap maps tokens in form of bad: good
swap:
  whitelist: allowlist
  blacklist: blocklist
  oAuth: OAuth

We're using a customization of the built-in substitution rule: when it detects one of the words we don't want, such as whitelist, it raises an error and suggests the alternative (allowlist):

Consider using 'allowlist' instead of 'whitelist'
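To make the idea concrete, here is a rough shell approximation of what that rule does with the swap list. It is only a sketch: the function name suggest_swaps is invented, it does plain substring matching, and it is not how Vale implements the check (Vale is syntax-aware and skips code and links).

```shell
#!/bin/sh
# Hypothetical stand-in for the substitution rule: for each line on stdin,
# print a suggestion for every unwanted term the line contains.
# Plain substring matching only; Vale itself is far more careful.
suggest_swaps() {
  while IFS= read -r line; do
    for pair in whitelist:allowlist blacklist:blocklist oAuth:OAuth; do
      bad=${pair%%:*}    # text before the colon
      good=${pair#*:}    # text after the colon
      case $line in
        *"$bad"*) printf "Consider using '%s' instead of '%s'\n" "$good" "$bad" ;;
      esac
    done
  done
}

printf 'Add the address to the whitelist.\n' | suggest_swaps
```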

spelling.yml uses the default dictionary (en-US, but you can use other languages of course) to highlight potential spelling mistakes. Due to the number of technical terms in software documentation, it might initially seem that you have far too many of these to want to fix, but we can work around that.

Vale already ignores anything marked as code or in links. We then used some regular expressions and shell scripts to get a list containing one (and only one) of each word that vale considers a spelling mistake. (I'll leave that regular expression work out of this article, but if you're super interested, drop me a line.)
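For a flavour of that deduplication step, here is a minimal sketch rather than the scripts we actually used. It assumes each alert arrives as one line of text containing the "Did you really mean '%s'?" message from our spelling rule. The sample lines and the function name unique_misspellings are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read alert lines on stdin and print each flagged
# word exactly once. Assumes every relevant line contains the message
# "Did you really mean 'WORD'?"; other lines are ignored.
unique_misspellings() {
  sed -n "s/.*Did you really mean '\([^']*\)'?.*/\1/p" | LC_ALL=C sort -u
}

# Made-up sample standing in for real linter output.
sample="pages/a.md.erb:3:10:Buildkite.spelling:Did you really mean 'autoscaling'?
pages/b.md.erb:7:2:Buildkite.spelling:Did you really mean 'Bazel'?
pages/c.md.erb:9:5:Buildkite.spelling:Did you really mean 'autoscaling'?"

printf '%s\n' "$sample" | unique_misspellings
```

The output of something like this is a starting point for the vocabulary file: one word per line, ready to be reviewed by hand.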

We quickly went through the list, deleting any mistakes, and leaving in technical terms that we do not consider errors:

Atlassian
autogenerated
Autoscaling
autoscaling
Basecamp
Basscss
Bazel
...

Then we configured the spelling rule like this, telling vale to ignore any words that we've put in vale/vocab.txt:

extends: spelling
message: "Did you really mean '%s'?"
level: error
ignore: vale/vocab.txt
filters:
  - ':[a-z\-]*:' # Ignore all custom emoji words

The last couple of lines of that configuration file show how you can ignore words using regular expressions. In this example we're ignoring any words in between colons, like :example:, a common shortcut for emoji.
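If you want to see what a pattern like that picks up before adding it as a filter, you can try it with grep. This is a sketch under a couple of assumptions: the helper name emoji_shortcodes is invented, and the bracket expression is written as [a-z-] because POSIX grep treats a backslash inside brackets as a literal character, unlike the regular expression flavour in the filter above.

```shell
#!/bin/sh
# Hypothetical helper: print every :shortcode: found on stdin, one per line.
# [a-z-] matches lowercase letters and hyphens, mirroring the filter's intent.
emoji_shortcodes() {
  LC_ALL=C grep -oE ':[a-z-]*:'
}

printf 'Ship it :rocket: once :lint-roller: passes\n' | emoji_shortcodes
```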

And similarly to Alex, if you need to tell vale to ignore some parts of your documents, you can mark them with comments:

This part will be checked

<!-- vale off -->

This part won't be checked

<!-- vale on -->

This part will also be checked

There are many other types of checks that you can use vale for, from making sure that your capitalization and punctuation are consistent, to making sure that you're writing to a popular style guide such as the Google developer documentation style guide.

Tying it together

You can see how this all ties together in the buildkite/docs repository.

We run both vale and Alex automatically on every pull request, using Buildkite, configured in .buildkite/pipeline.yml. We're running Vale directly on the Buildkite Agent, but using the Docker plugin to run Alex on a Docker image that already has Node.js installed.

steps:
  - label: ":lint-roller: Linting for insensitive words"
    command: npx alex@8 --diff pages/**/*.erb pages/**/*.txt
    plugins:
      - docker#v3.5.0:
          image: "node:alpine"
  - label: ":lint-roller: Linting"
    commands:
      - curl -sfL https://install.goreleaser.com/github.com/ValeLint/vale.sh | sh -s v2.2.2
      - ./bin/vale pages

We still have a lot of room for improvement, and will look at things like renaming the master branch of the Git repository, linting the app as well as the docs, and improving our contributing and style guides.
