We're all familiar with publishing typos in production docs. But what if there was a better way? After merging Juanito's super helpful pull request to replace 'blacklist' and 'whitelist' with friendlier terms, I was finally motivated to add a few tools and checks to stop myself and others from making those types of mistakes.
These ideas aren't new; here are some lovely folks talking about improving the inclusivity of docs in 2017 and 2018.
- Don't Say Simply - Write the Docs
- Harriet Lawrence: Sociolinguistics and the Javascript community: a love story | JSConf EU 2017
In this post I'll show you how to configure a couple of different tools to run various checks on the documentation, and how to configure Buildkite to run them automatically.
Anyhow, to the tools.
Alex
First up, alex is a one-stop Node.js tool for catching insensitive or inconsiderate writing.
As long as you have npm (with Node.js) installed, you can run alex directly without installing it, using npx. Running alex on our docs repo looks a bit like this:
$ npx alex@8 --diff pages/**/*.erb pages/**/*.txt
We're using npx because we're planning to run this in CI, pinning alex at version 8, and checking all docs in the pages directory. Some are *.erb files and some are plain text *.txt files. (*.erb files are Embedded Ruby template files, which are mostly Markdown with a sprinkling of Ruby code.)
The first time I ran it, alex returned a few hundred errors and warnings in this format:
pages/integrations/cc_menu.md.erb: no issues found
pages/integrations/github.md.erb: no issues found
pages/integrations/github_enterprise.md.erb: no issues found
pages/integrations/gitlab.md.erb: no issues found
pages/integrations/phabricator.md.erb: no issues found
pages/integrations/slack.md.erb: no issues found
pages/integrations/sso.md.erb: no issues found
pages/pipelines/artifacts.md.erb: no issues found
pages/pipelines/block_step.md.erb
  3:37-3:46  warning  Be careful with `execution`, it’s profane in some cases  execution  retext-profanities

pages/pipelines/branch_configuration.md.erb
  68:65-68:72  warning  Be careful with `periods`, it’s profane in some cases  periods  retext-profanities
I decided that neither periods nor execution were words that I need to avoid in a software context, so I added them to the list of exceptions in .alexrc:
allow:
  - execution
  - periods
A combination of adding all of the checks that I didn't find useful to .alexrc and making edits to the docs where necessary removed most of the warnings, but sometimes, as a real thinking human person, you know better than the linting tool. In that situation you can explicitly tell alex to ignore something that it would otherwise warn you about, using an HTML-style comment (which is valid in both Markdown and HTML).
<!--alex ignore whitelist -->
This sentence will **not** trigger the check for whitelist. This can be useful when writing a style guide that includes examples of what *not* to write.
Great, our docs are already much more friendly and inclusive, but what else can I improve while I'm here?
Vale
Vale is a syntax-aware linter for prose, written in Go and built with speed and extensibility in mind. Sure, but what does that mean? It means that we can configure it to do various things, such as spell checking, testing for unintentional repeated words, suggesting alternatives to incorrect capitalization, etc.
Exactly how you install vale depends on where and how you're using it, but we're doing something like this:
curl -sfL https://install.goreleaser.com/github.com/ValeLint/vale.sh | sh -s v2.2.2
Note that you should never download and pipe software to a shell interpreter like this from a source you do not trust.
Configuring vale is a little trickier, because it does more things. Let's take a look at .vale.ini:
# we'll keep our custom configuration in the vale/ directory
StylesPath = vale

# display warnings as well as errors
MinAlertLevel = warning

# Remember how we said we use both *.txt files and *.erb files?
# This tells vale to treat both as markdown
[formats]
erb = md
txt = md

# For all file formats [*], run the tests and checks in vale/Buildkite
# and run the built-in check for repeated words (Vale.Repetition)

[*]

BasedOnStyles = Buildkite, Vale.Repetition
In vale/Buildkite, which is where we configure our custom checks, there are currently two files, one for each check that we do.
existence.yml suggests alternatives to things that we know we don't want in the docs:
- blacklist → blocklist
- oAuth → OAuth
The configuration for it looks like this:
extends: substitution
message: Consider using '%s' instead of '%s'
level: error
# swap maps tokens in form of bad: good
swap:
  whitelist: allowlist
  blacklist: blocklist
  oAuth: OAuth
We're using a customization of the built-in substitution rule that, when it detects one of the words we don't want, such as whitelist, raises an error and suggests the alternative (allowlist):
Consider using 'allowlist' instead of 'whitelist'
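To make the bad-to-good mapping concrete, here is a minimal Python sketch of what a swap-style check does. It is an illustration only, not how Vale implements its substitution rule; the function name check_line is hypothetical, and the sketch assumes case-sensitive, whole-word matching.

```python
import re

# The same bad -> good pairs as the swap map in the rule above.
SWAP = {
    "whitelist": "allowlist",
    "blacklist": "blocklist",
    "oAuth": "OAuth",
}

# Same message template as the rule: first the suggestion, then the match.
MESSAGE = "Consider using '%s' instead of '%s'"


def check_line(line):
    """Return one message for each unwanted term found in a line (hypothetical helper)."""
    messages = []
    for bad, good in SWAP.items():
        # \b limits the match to whole words; matching is case-sensitive (an assumption).
        if re.search(r"\b" + re.escape(bad) + r"\b", line):
            messages.append(MESSAGE % (good, bad))
    return messages


for message in check_line("Add the IP address to the whitelist."):
    print(message)
```

Because the match is case-sensitive in this sketch, a line that already says OAuth produces no message, while oAuth is flagged.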
spelling.yml uses the default dictionary (en-US, but you can use other languages of course) to highlight potential spelling mistakes. Due to the number of technical terms in software documentation, it might initially seem that you have far too many of these to want to fix, but we can work around that.
Vale already ignores anything marked as code or in links. We then used some regular expressions and shell scripts to get a list containing one (and only one) of each word that vale considers a spelling mistake. (I'll leave that regular expression work out of this article, but if you're super interested, drop me a line.)
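For a rough idea of what such a de-duplication step can look like, here is a small Python sketch. It is not the set of scripts mentioned above. It assumes that each spelling alert in vale's output contains a message of the form "Did you really mean '<word>'?" (the template used in the spelling rule further down), and the sample output and the function name unique_flagged_words are made up for illustration.

```python
import re

# Assumption: every spelling alert carries the message
# "Did you really mean '<word>'?" somewhere on its line.
ALERT = re.compile(r"Did you really mean '([^']+)'\?")


def unique_flagged_words(vale_output):
    """Return each flagged word exactly once, sorted case-insensitively."""
    words = set(ALERT.findall(vale_output))
    return sorted(words, key=lambda word: (word.lower(), word))


# Hypothetical sample of vale output, for illustration only.
sample = """\
 3:12  error  Did you really mean 'Bazel'?        Buildkite.spelling
 9:40  error  Did you really mean 'autoscaling'?  Buildkite.spelling
 14:2  error  Did you really mean 'Bazel'?        Buildkite.spelling
"""

print("\n".join(unique_flagged_words(sample)))
```

Writing the result to a file gives a starting point that can be pruned by hand.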
We quickly went through the list, deleting any mistakes, and leaving in technical terms that we do not consider errors:
Atlassian
autogenerated
Autoscaling
autoscaling
Basecamp
Basscss
Bazel
...
Then we configured the spelling rule like this, telling vale to ignore any words that we've put in vale/vocab.txt.
extends: spelling
message: "Did you really mean '%s'?"
level: error
ignore: vale/vocab.txt
filters:
  - ':[a-z\-]*:' # Ignore all custom emoji words
The last couple of lines of that configuration file show how you can ignore words using regular expressions. In this example we're ignoring any words in between colons, like :example:, a common shortcut for emoji.
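To see what that filter pattern does and doesn't match, you can try it outside of vale. The sketch below uses Python's re module, on the assumption that this simple pattern behaves the same way there as it does in vale's own regular expression engine; the sample sentence is made up.

```python
import re

# The filter pattern from spelling.yml.
EMOJI = re.compile(r":[a-z\-]*:")

text = "Deploys are fast :rocket: and linted :lint-roller: too"

# Every shortcode the filter would ignore.
print(EMOJI.findall(text))  # [':rocket:', ':lint-roller:']

# Shortcodes containing digits, underscores or capitals are not covered.
print(EMOJI.findall(":+1: :thumbs_up:"))  # []
```

If your emoji names include digits or underscores, the character class would need widening to cover them.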
And similarly to Alex, if you need to tell vale to ignore some parts of your documents, you can mark them with comments:
This part will be checked

<!-- vale off -->

This part won't be checked

<!-- vale on -->

This part will also be checked
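As a mental model for those comment pairs, here is a toy Python sketch that keeps only the lines outside an off/on region. It is an illustration of the idea, not how vale processes documents, and the function name checked_lines is hypothetical.

```python
def checked_lines(text):
    """Return the non-empty lines that sit outside vale off/on comment pairs."""
    checking = True
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "<!-- vale off -->":
            checking = False  # stop collecting until the matching "on" comment
        elif stripped == "<!-- vale on -->":
            checking = True
        elif checking and stripped:
            kept.append(stripped)
    return kept


sample = """\
This part will be checked

<!-- vale off -->

This part won't be checked

<!-- vale on -->

This part will also be checked
"""

print(checked_lines(sample))
```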
There are many other types of checks that you can use vale for, from making sure that your capitalization and punctuation are consistent, to making sure that you're writing to a popular style guide such as the Google developer documentation style guide.
Tying it together
You can see how this all ties together in the buildkite/docs repository.
We run both vale and Alex automatically on every pull request, using Buildkite, configured in .buildkite/pipeline.yml. We're running Vale directly on the Buildkite Agent, but using the Docker plugin to run Alex on a Docker image that already has Node.js installed.
steps:
  - label: ":lint-roller: Linting for insensitive words"
    command: npx alex@8 --diff pages/**/*.erb pages/**/*.txt
    plugins:
      - docker#v3.5.0:
          image: "node:alpine"
  - label: ":lint-roller: Linting"
    commands:
      - curl -sfL https://install.goreleaser.com/github.com/ValeLint/vale.sh | sh -s v2.2.2
      - ./bin/vale pages
We still have a lot of room for improvement, and will look at things like renaming the master branch of the Git repository, linting the app as well as the docs, and improving our contributing and style guides.