Day 25: Learning to Document Properly
Documentation is a contentious topic in programming because there are so many ways of doing it. While everyone agrees at some level that software documentation is important, two programmers rarely agree on exactly what needs to be documented and how.
Despite the disagreements, there are still some common practices that can be incorporated into a daily development workflow. To simplify, let us start by discussing an area of a code base where almost everyone agrees there should be extensive documentation: the public API. We will see why and how such interfaces should be documented, and then move on to other cases.
If there is a part of a code base that really needs good documentation, it is the external API. This is especially important because, in many cases, the interface of the library is the only part of the code available to users. This is a common situation in languages such as C++, where header files can be distributed independently of the implementation sources. A similar case happens when only the Javadoc files are distributed for a Java library.
When the source code for a library is not distributed, the comments in the public API are the only way to determine how to properly use an interface — in the absence of a more extensive manual. Moreover, even if the complete source code of a library is accessible, API developers need to remember that it is sometimes a hassle to look up the implementation for an API — especially when a code browser is not available. Whatever the situation, looking at the implementation files should not be necessary to understand how to properly use an API.
When documenting libraries, it is very important to provide clear examples of how a method can be called, and if possible, the context in which such a call is correct. Sadly, many libraries come with documentation that provides detailed explanations for each method individually, but fails at explaining how that method can be used along with other classes to achieve a desired result. This is a frustrating experience for programmers, and can make your library harder to use, even if it is well designed and implemented.
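As a minimal sketch of this idea, the header below documents not only each method but also how the methods are meant to be called together. CsvBuilder and all of its members are hypothetical names invented for this illustration, and the `///` comments follow a common Doxygen-style convention rather than any rule stated in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

/// Accumulates rows of text fields and renders them as CSV.
/// (Hypothetical class, used only to illustrate API documentation.)
///
/// Typical use: call add_field() once per column, then end_row() to
/// close the row. Repeat for each row, then call str() for the result.
///
///     CsvBuilder csv;
///     csv.add_field("name").add_field("qty");
///     csv.end_row();
///     csv.add_field("bolt").add_field("12");
///     csv.end_row();
///     std::string text = csv.str();   // "name,qty\nbolt,12\n"
///
/// Fields are written verbatim: this sketch does not quote or escape commas.
class CsvBuilder {
public:
    /// Appends one field to the row in progress.
    /// Returns *this so that calls can be chained.
    CsvBuilder& add_field(const std::string& value) {
        current_.push_back(value);
        return *this;
    }

    /// Closes the row in progress and starts a new one.
    /// Calling end_row() with no pending fields emits an empty line.
    void end_row() {
        for (std::size_t i = 0; i < current_.size(); ++i) {
            if (i > 0) out_ += ',';
            out_ += current_[i];
        }
        out_ += '\n';
        current_.clear();
    }

    /// Returns the CSV text for all rows closed so far.
    /// Fields added after the last end_row() are not included.
    const std::string& str() const { return out_; }

private:
    std::vector<std::string> current_;  // fields of the row in progress
    std::string out_;                   // rendered text of closed rows
};

// The example from the class comment, written as code so it can be checked.
inline void csv_builder_example() {
    CsvBuilder csv;
    csv.add_field("name").add_field("qty");
    csv.end_row();
    csv.add_field("bolt").add_field("12");
    csv.end_row();
    assert(csv.str() == "name,qty\nbolt,12\n");
}
```

The class-level comment carries the calling sequence, so a user who only has the header can see that add_field() and end_row() belong together without reading the implementation.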
Documenting Internal Header Files
Header files that are internal to the project present many of the same challenges as external libraries. In a sense, each header file is a mini API that defines how other parts of the library or application will interact with that specific section of the code. For this reason, programmers need to be careful to present a complete picture of what that module or class is trying to accomplish.
On the other hand, many internal header files have low importance when viewed along with other parts of the code base. Because they are mostly used as an implementation detail, several of these internal header files and classes don’t really require further explanation. Depending on the particular phase of the project, these files may be frequently written and modified, and are subject to changes that may render any documentation useless within a short time span. Therefore, providing extensive documentation for some private classes and header files may be completely unnecessary.
A common guideline when working with internal header files is to treat them according to their relative importance to the project. There are some small header files, containing only POD classes, for example, that don’t require any time spent on documentation. These are essentially dependent files, which receive their meaning from other, more important definitions in the project.
More important header files should receive better documentation, however. For example, classes that encapsulate fundamental concepts in the application should be fully documented, sometimes with as much care as one would spend when writing external APIs.
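The contrast can be sketched as follows, assuming a hypothetical bookkeeping application. TradeRecord and Ledger are invented names: the first is a POD-style holder that needs almost no commentary, while the second stands in for a central concept that deserves a documented contract.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain data holder: the member names say everything there is to say.
struct TradeRecord {
    int    account_id;
    double amount;
};

/// Ledger is the central bookkeeping concept of this (hypothetical)
/// application, so its contract is spelled out in full.
///
/// Invariant: balance() always equals the sum of the amounts of all
/// trades passed to post() so far.
///
/// Ownership: the ledger stores copies of the records it is given.
/// Callers may discard or modify their own TradeRecord after post().
class Ledger {
public:
    /// Records a trade. Positive amounts are credits, negative are debits.
    void post(const TradeRecord& trade) {
        entries_.push_back(trade);
        balance_ += trade.amount;
    }

    /// Returns the running balance across all accounts.
    double balance() const { return balance_; }

    /// Returns the number of trades posted so far.
    std::size_t size() const { return entries_.size(); }

private:
    std::vector<TradeRecord> entries_;  // every posted trade, in order
    double balance_ = 0.0;              // cached sum of entries_ amounts
};

// Small check of the documented invariant.
inline void ledger_example() {
    Ledger ledger;
    ledger.post({1, 10.0});
    ledger.post({2, -4.0});
    assert(ledger.size() == 2);
    assert(ledger.balance() == 6.0);
}
```

The effort is spent where other code depends on stated guarantees, and the dependent struct is left to speak for itself.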
Documenting Internal Code
Internal code is the easiest to handle, because developers have complete control over what it will look like. At the same time, it is the most contentious area in code documentation because each software author has a different idea of what needs to be documented or not.
General guidelines are still possible here. It is now widely accepted that there is little value in trying to document “how” something is done. For example, the typical comment:
i++; // increment the counter “i”
should be avoided at all costs: it just adds more noise to the code, without contributing anything. Any C++ or Java programmer will be completely familiar with the meaning of the line of code above. It would be a different situation, however, if the comment tried to explain “why” the expression is necessary:
i++; // incrementing here because the next section assumes
     // all counters have been updated
The comment is now useful. Its purpose is clear, and it explains in a few words why the counter needs to be updated at this particular location, and not after the next instructions are executed.
It is easy to understand the difference between “how” and “why” comments once you start asking these questions yourself. Whenever you are about to create a new comment, think first: is this describing how something happens or why it happens? The first type of comment should be avoided. The second type of comment may be useful enough to warrant its inclusion.
Commenting code is a stylistic decision. Every programmer has a different way to handle comments. Some developers never write comments, while a few others like to create small essays for each section of code (consider, for example, literate programming).
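To see the same distinction in a complete function, here is a small sketch. The function number_lines and its behavior are assumptions made up for this illustration: the only comment on the increment explains why it sits at that point, not what the operator does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: labels the non-empty lines of a text, starting at 1.
std::vector<std::string> number_lines(const std::vector<std::string>& lines) {
    std::vector<std::string> out;
    int i = 0;
    for (const std::string& line : lines) {
        if (line.empty()) continue;  // blank lines get no label

        // Increment before formatting: labels are 1-based for readers,
        // so the counter must already include the current line.
        i++;
        out.push_back(std::to_string(i) + ": " + line);
    }
    return out;
}

// Small check that blank lines are skipped and labels start at 1.
inline void number_lines_example() {
    std::vector<std::string> result = number_lines({"a", "", "b"});
    assert(result.size() == 2);
    assert(result[0] == "1: a");
    assert(result[1] == "2: b");
}
```

Moving the increment below the push_back call would silently produce 0-based labels, which is exactly the kind of constraint a “why” comment is meant to record.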
Despite the differences, comments should always be handled with proper care. They are more important when code is made available to external users, who don’t have access to the implementation. They should also appear in sections of the code where an explanation is necessary for a complete understanding — clearly describing why, instead of just how, something is done.
About the Author
Carlos Oliveira holds a PhD in Systems Engineering and Optimization from the University of Florida. He works as a software engineer, with more than 10 years of experience in developing high-performance commercial and scientific applications in C++, Java, and Objective-C. His most recent book is Practical C++ Financial Programming.