“Always write documentation for everything” seems like advice that you can give anyone in any situation and it will make everything better. It’s not. There is a cost and risk associated with every line of documentation you write, and this is exactly what this blog post is about.
When I say “documentation”, I mean all forms of it, including README, Markdown files, docstrings, and code comments.
Why your documentation sucks
- It requires maintenance. Each time a function’s behavior changes or an argument is added or removed, the function’s documentation needs to be checked and updated. And not only the documentation of this function but potentially any other documentation that references it. Refactoring tools often won’t help you with that. Many IDEs for many languages will let you rename a function and all references to it in code but won’t change its name in documentation strings that mention it. And no IDE will help you if that reference is in a README file or another plain-text document.
- It gets outdated. Naturally, when you change a function’s behavior but forget to update the documentation, they start to contradict each other. And no documentation is always better than documentation that lies. If function docs don’t cover behavior in some corner cases, I can check it myself. If they cover such behavior but that’s not how it actually works, I will learn about it only when I hit a nasty bug in tests or even in production.
- It can get noisy. It takes time to read documentation, and if it doesn’t bring anything in addition to the code (i.e. says the same as the line of code it describes or the function name), that time is spent without purpose. For example, I often see a comment like `# create a user` followed by a line of code like `user.create()`. Or a function `create_user` with documentation saying `"create_user creates a new user"`. You shouldn’t make other people read the same thing twice.
- Nobody reads it anyway. Everyone is in a hurry. There are hundreds of libraries in the project, constant pressure from the management to deliver features, and just laziness. From this blog post, most people will read only the title, some will also look at section titles and the bold text in each bullet point, and only a few (you’re probably one of them) will actually read all the text.
What you should do
- Don’t require documentation. If you have a tool that enforces the presence of documentation but doesn’t check its quality, you will naturally get people writing documentation like “create_user creates a user”. However, it’s still a good idea to require documentation for public functions in an open-source library.
- Use type annotations. If you can describe something in a way that can be checked by a tool, you should do that. In particular, that means describing function argument and return types not in plain English but in a structured manner that can be checked by a type checker. Use mypy for Python, TypeScript for JS (with JSDoc or the custom TypeScript syntax), Sorbet for Ruby, etc.
- Provide examples and test them. “Show, don’t tell”. Instead of explaining how the function behaves in some specific situation, make an example. And the great thing about code examples is that they can be tested. Use doctest for Python, doc-tests for Rust, testable examples for Go, etc.
- Write contracts. Don’t assume that users will call your function only with specific arguments. Check at runtime that this is actually true. For example, if a function argument is expected to be only positive, add a check for it and return an error if this contract is violated. If your language has a good framework for design by contract, use it. For example, deal for Python or arguard for Go.
- Write safe code. “The best API is the one that cannot be misused”. Don’t tell users that `Connection.close` must be called only after `Connection.open`. Instead, make `Connection.open` return a new `OpenConnection` object and make the `close` method available only on `OpenConnection` but not on `Connection`. That way, it’s impossible to call `close` without calling `open` first.
- Avoid useless comments. Don’t write a comment if it says the same as the code. Don’t comment `# create project directory` if the code says `project_dir.create()`. If you see it in the code, remove it. Can you automate it? Perhaps. I made flake8-comments, an experimental linter for Python that detects some of the comments like these. The linter is very simple; consider making a similar one for the language of your choice.
- Prefer descriptive code. If a code block is so confusing that it needs an explanation, instead of simply adding that explanation, consider whether you can make the code itself more readable.
- Avoid duplication. Don’t say the same thing in many places. Instead, describe it once and reference it in all other places. That way, when you need to change it, you can do it once. For example, the first version of documentation for genesis/channels used to describe the behavior for buffered channels and cancellation of each function in the function’s documentation. While it may be more convenient for the users, it’s a lot of duplication because almost all of the functions behave in the same way in this regard. So, the updated documentation describes it once, in the package documentation.
- Generate documentation. For example, it’s a common practice to generate an OpenAPI specification from the code. The code already knows about available endpoints, supported HTTP methods, and expected types, so at least some parts of the public API documentation can be generated, saving lots of effort and avoiding situations when documentation gets out of sync.
- Write tests for documentation. Sometimes, you’ll have the same thing repeated twice, in the documentation and in the code. Maybe because you don’t want to bother with generating documentation or maybe because you want to provide the best documentation-first experience for users. Either way, it might be a good idea to write tests for your documentation. For example, if you have a number of CLI commands defined in a project and each one should be described in the docs, you can parse the code, extract the list of defined commands, and test that indeed each one is mentioned in the docs.
- Keep documentation close to what it describes. When modifying a piece of code, it’s easier to notice documentation that needs to be updated if it is placed right next to the code it describes. For example, if you write a linter with a number of rules it checks, it might be a good idea to describe each rule in the code next to where it is defined instead of a separate file or documentation.
- Take some time to improve documentation readability. Write a TL;DR, structure the docs so that they’re easy to navigate, make important things bold, add emojis and visual cues, and use lists and tables.
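The advice on type annotations, tested examples, and contracts can be combined in one small function. Here is a minimal sketch in Python, assuming a hypothetical `split_bill` function (the name and behavior are invented for illustration). The types can be checked by mypy, the example in the docstring can be run with doctest, and the contract is a plain runtime check rather than a design-by-contract framework.

```python
import doctest


def split_bill(total_cents: int, people: int) -> list[int]:
    """Split a bill so that the shares differ by at most one cent.

    >>> split_bill(1000, 3)
    [334, 333, 333]
    """
    # Contract: reject invalid input at runtime instead of only documenting it.
    if total_cents < 0:
        raise ValueError("total_cents must be non-negative")
    if people <= 0:
        raise ValueError("people must be positive")
    share, remainder = divmod(total_cents, people)
    # The first `remainder` people pay one extra cent.
    return [share + 1 if i < remainder else share for i in range(people)]


if __name__ == "__main__":
    # Run the docstring example as a test.
    doctest.testmod()
```

Note what the docstring leaves out: it doesn’t repeat the argument types or say that `people` must be positive, because the annotations and the checks already say that in a form that tools can verify.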
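The `Connection` and `OpenConnection` idea from the “Write safe code” point could look roughly like this in Python. It is a sketch under assumptions: the `host` field and the `closed` flag are hypothetical, and no real network connection is made.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """A connection that has not been opened yet. It has no `close` method."""

    host: str

    def open(self) -> OpenConnection:
        # A real implementation would open a socket here.
        return OpenConnection(self.host)


@dataclass
class OpenConnection:
    """An opened connection. Only this type can be closed."""

    host: str
    closed: bool = False

    def close(self) -> None:
        # A real implementation would release the socket here.
        self.closed = True


conn = Connection("example.com").open()
conn.close()
# Connection("example.com").close() is rejected by a type checker
# and raises AttributeError at runtime.
```

With this shape, the “call `open` before `close`” rule doesn’t need a sentence in the documentation, because the wrong order can’t be expressed.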
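The check described in “Write tests for documentation” can be quite small. In the sketch below, the `COMMANDS` registry, the `DOCS` text, and its “- name: description” layout are all hypothetical. In a real project, you would extract the commands from the code that defines them and read the docs from a file.

```python
# Hypothetical registry of CLI commands defined in the code.
COMMANDS = {
    "init": "create a new project",
    "build": "build the project",
    "publish": "upload the build",
}

# Hypothetical documentation text. Note that "publish" is missing.
DOCS = """
Commands
- init: create a new project in the current directory.
- build: compile the project into the dist directory.
"""


def undocumented_commands(commands: dict[str, str], docs: str) -> list[str]:
    """Return the names of commands that the docs never describe."""
    documented = {
        line.removeprefix("- ").split(":")[0]
        for line in docs.splitlines()
        if line.startswith("- ")
    }
    return sorted(set(commands) - documented)


print(undocumented_commands(COMMANDS, DOCS))  # prints ['publish']
```

In a test suite, you would assert that the returned list is empty, so that adding a command without describing it fails the build.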
Remember: readability matters. We spend much more time reading code than writing code. So, it should be your goal to provide the best experience for people reading and maintaining your code. Sometimes, that means writing less, not more.