Day 26: Write About Your Problem

Problem solving is a very individual process. Everyone has a preferred way of thinking about problems, and as a result they come up with their own methods and strategies to find solutions. Programming is similar in that there are several ways of solving any design or implementation issue. It is also a singular field in which developers spend most of their time solving problems.

This leads to the necessity of cultivating problem-solving strategies in the programming field. Most programmers have a crystallized process that they use frequently for solving similar classes of problems. Many of the differences between programmers have to do with how effective their solution processes are. That is why I think that discussing solution techniques is a very important part of becoming a better programmer.

Writing as an Exercise

One of my preferred methods for solving problems is to use the old pen and paper to write down possible solutions. Unlike most people, who like to draw diagrams, I think that writing words that convey meaning about the problem is the shortest way to understand it and finally figure out a solution.

Writing has several advantages, and they are compounded when it comes to software development. First, software creation is mainly a symbolic manipulation activity, in many ways similar to writing. The software creation process certainly has its peculiarities, but it involves describing the problem at a level that can be understood not only by computers, but also by human beings.

Another of the advantages of writing in the initial phase is the opportunity to build up a vocabulary. Playing with words to explain and try to find potential solutions for a problem can help in building this vocabulary in a quick and effective way. Later on, this might be helpful in identifying key elements such as classes, methods, use cases and other components of the system.

Writing During Initial Design

The design phase is the part of the development process where writing plays an obvious role. First, when designing you will want to convince not only yourself, but probably also your boss and other members of the group that your approach is viable. In this phase, you will be interested in creating a formal document such as a technical spec for the work that will come.

Even if a formal document is not necessary, it is a good idea to create something polished and presentable. The reason is that this will become necessary eventually. Even when working on your own projects, there will come a time when you need to explain the design to someone else. Having a design document can greatly simplify this communication effort between you and other developers.

Having a technical spec can also be of great help for yourself, because it may serve to clarify issues that would not appear otherwise. For example, if two or more systems need to cooperate for your design to work, a spec can help you clarify the roles for each component of the system. This way, you can refer to the resulting documentation whenever you’re ready to work on the implementation.

Writing as an Implementation Strategy

Another good use for writing is as part of the coding process itself. During this phase, one is interested not only in what will be done, but also in the exact details of how some programming task can be accomplished.

A few people have perfected this kind of coding strategy. The best known of them is Professor Donald Knuth of Stanford. He created the literate programming technique, which is based on a simple idea: first write one or more paragraphs describing what a section of code is supposed to do and how it will be accomplished, then present the source code that achieves the desired result.

The great advantage is that, by writing first, you are thinking about particular issues that the code will have to deal with. Once you start writing the code itself, the whole strategy will be well developed. While doing this takes some extra time, depending on the complexity of what you’re implementing it may be a net win. I wouldn’t spend this energy on software whose workings I already understand. On the other hand, writing complex functionality may benefit from the extra care, and literate programming can be a big win in this case.
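As a small illustration of the write-first idea, here is a sketch in Python. It is not Knuth's own tooling, only plain prose placed ahead of the code it describes; the function name `moving_average` and its parameters are made up for this example.

```python
"""Moving average, with the description written first.

We want to smooth a list of readings. For every position we average the
current value with up to `window - 1` values before it. Early positions
have fewer values available, so we average whatever exists instead of
padding with zeros, which would drag the first results down.
"""


def moving_average(values, window):
    # The description above only makes sense for a window of one or more.
    if window < 1:
        raise ValueError("window must be at least 1")

    averages = []
    for i in range(len(values)):
        # Take the current value plus up to window - 1 earlier ones.
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages
```

Writing the paragraph first is what surfaced the question of how to treat the first few positions, before any code existed.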

Writing for Debugging Purposes

Debugging can also benefit from good, organized writing. The typical problem during a debug session is to track down the reason why some unexpected event is happening. That may include things such as a crash, or a value that is printed incorrectly. Whatever the erroneous event you are trying to debug, it is easier to trace the causes if you write down the known facts.

It turns out that writing a few paragraphs about how a bug happens may be the key to finding its solution. The reason is that by writing about the problem you need to put the known facts in a logical sequence, and that sequence is frequently the key to understanding how a bug is triggered. Composing a good description of how a bug is happening may provide all the elements necessary to pinpoint where the faulty code is.

For more complicated cases, I like to add some description of the code involved and the methods in the stack trace, if one is available. Then, it is possible to write down the sequence of events happening in the application from the point of view of the code being executed. Of course, I don’t need to do this for every bug, since it is a time-consuming process. But for bugs that are difficult to solve, writing seems to be better than spending my time aimlessly staring at code.

Conclusion

Writing is a great way to make concepts clear both for you and for other people. As a development tool, writing has been explored by several programmers. For example, it is a major part of techniques such as literate programming. However, you don’t need to change your programming strategies to benefit from writing as a planning method. Just putting your thoughts on paper, giving proper names to concepts, will increase your capacity to analyze problems and uncover bugs even before they hit your code.


Day 25: Learning to Document Properly

Documentation is a contentious topic in programming because there are so many ways of doing it. While everyone agrees at some level that software documentation is important, there is rarely agreement between two programmers on exactly what needs to be documented and how.

Despite the disagreements, there are still some common practices that can be incorporated in a daily development workflow. To simplify, let us start by discussing one area of a code base where almost everyone agrees that there should be extensive documentation: the public API. We will see why and how such interfaces should be documented, and then move to other cases.

Documenting APIs

If there is a part of a code base that really needs good documentation, it is the external API. This is especially important because, in many cases, the interface of the library is the only part of the code available to users. This is a common situation in languages such as C++, where header files can be distributed independently from the implementation sources. A similar case happens when only the Javadoc files are distributed for a Java library.

When the source code for a library is not distributed, the comments in the public API are the only way to determine how to properly use an interface — in the absence of a more extensive manual. Moreover, even if the complete source code of a library is accessible, API developers need to remember that it is sometimes a hassle to look up the implementation for an API — especially when a code browser is not available. Whatever the situation, looking at the implementation files should not be necessary to understand how to properly use an API.

When documenting libraries, it is very important to provide clear examples of how a method can be called, and if possible, the context in which such a call is correct. Sadly, many libraries come with documentation that provides detailed explanations for each method individually, but fails at explaining how that method can be used along with other classes to achieve a desired result. This is a frustrating experience for programmers, and can make your library harder to use, even if it is well designed and implemented.
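To make this concrete, here is a minimal sketch of API documentation that shows the calls in context and not only one method at a time. The class `TemperatureLog` and its methods are hypothetical, invented for this example.

```python
class TemperatureLog:
    """Collects temperature readings and reports simple statistics.

    Typical use: create one log per sensor, call `record` for each
    reading, then ask for `average`. Calling `average` before any
    reading has been recorded raises ValueError.

    Example:
        >>> log = TemperatureLog("kitchen")
        >>> log.record(20.0)
        >>> log.record(22.0)
        >>> log.average()
        21.0
    """

    def __init__(self, sensor_name):
        self.sensor_name = sensor_name
        self._readings = []

    def record(self, celsius):
        """Store one reading, in degrees Celsius."""
        self._readings.append(float(celsius))

    def average(self):
        """Return the mean of all readings recorded so far."""
        if not self._readings:
            raise ValueError("no readings recorded yet")
        return sum(self._readings) / len(self._readings)
```

The class-level comment is where the order of calls and the error case are explained. In Python, an example written this way can also be checked automatically with the standard `doctest` module, which keeps the documentation from drifting away from the code.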

Documenting Internal Header Files

Header files that are internal to the project have many of the same challenges presented by external libraries. In a sense, each header file is a mini API that defines how other parts of the library or application will interact with that specific section of the code. For this reason, programmers need to be careful to present a complete picture of what that module or class is trying to accomplish.

On the other hand, many internal header files have low importance when viewed along with other parts of the code base. Because they are mostly used as an implementation detail, several of these internal header files and classes don’t really require further explanation. Depending on the particular phase of the project, these files may be frequently written and modified, and are subject to changes that may render any documentation useless within a short time span. Therefore, providing extensive documentation for some private classes and header files may be completely unnecessary.

A common guideline when working with internal header files is to treat them according to their relative importance to the project. There are some small header files, containing only plain old data (POD) classes, for example, that don’t require any time spent on documentation. These are essentially dependent files, which receive their meaning from other, more important definitions in the project.

More important header files should receive better documentation, however. For example, classes that encapsulate fundamental concepts in the application should be fully documented, sometimes with as much care as one would spend when writing external APIs.
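The same distinction can be sketched outside of C++ headers. In the Python example below, the names `Point` and `Route` are hypothetical: the first is a plain data holder that needs almost no documentation, while the second stands in for a core concept of the application and gets a fuller description.

```python
from dataclasses import dataclass


@dataclass
class Point:
    # Plain data holder: the field names say everything that is needed.
    x: float
    y: float


class Route:
    """An ordered path through points; a core concept of the application.

    A route always has at least a start point. Points are visited in
    the order they were added, and `length` is the sum of the
    straight-line distances between consecutive points.
    """

    def __init__(self, start):
        self._points = [start]

    def add(self, point):
        """Append `point` as the next stop on the route."""
        self._points.append(point)

    def length(self):
        """Return the total straight-line distance along the route."""
        total = 0.0
        for a, b in zip(self._points, self._points[1:]):
            total += ((b.x - a.x) ** 2 + (b.y - a.y) ** 2) ** 0.5
        return total
```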

Documenting Internal Code

Internal code is the easiest to handle, because developers have complete control over what it will look like. At the same time, it is the most contentious area in code documentation because each software author has a different idea of what needs to be documented or not.

General guidelines are still possible here. It is now understood that there is little value in trying to document “how” something is done. For example, the typical comment:

i++; // increment the counter “i”

should be avoided at all costs: it just adds more noise to the code, without contributing anything. Any C++ or Java programmer will be completely familiar with the meaning of the line of code above. It would be a different situation, however, if the comment tried to explain “why” the expression is necessary:

i++; // incrementing here because the next section assumes
     // all counters have been updated

The comment is now useful. Its purpose is clear, and it explains in a few words why the counter needs to be updated at this particular location, and not after the next instructions are executed.

It is easy to understand the difference between “how” and “why” comments once you start asking these questions yourself. Whenever you are about to create a new comment, think first: is this describing what happens or why it happens? The first type of comment should be avoided. The second type of comment may be useful enough to warrant its inclusion.
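The two kinds of comment can be seen side by side in a slightly larger sketch. The function `total_with_shipping` and the business rule it mentions are assumptions made up for this example.

```python
def total_with_shipping(prices, shipping_fee):
    total = 0
    for price in prices:
        total += price  # add price to total (a "how" comment: pure noise)

    # A "why" comment: the fee is added once, after the loop, because
    # shipping is charged per order and not per item.
    total += shipping_fee
    return total
```

The first comment can be deleted without losing anything. The second one records a decision that a reader could not recover from the code alone.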

Conclusion

Commenting code is a stylistic decision. Every programmer has a different way to handle comments. Some developers never write comments, while a few others like to create small essays for each section of code (consider, for example, literate programming).

Despite the differences, comments should always be handled with proper care. They are more important when code is made available to external users, who don’t have access to the implementation. They should also appear in sections of the code where an explanation is necessary for a complete understanding — clearly describing why, instead of just how, something is done.


Day 24: Refactoring Code Frequently

Software is not a static entity that exists in a single, defined way. The great advantage of software over traditional engineering building materials is that it can be reshaped for different uses as necessary. This advantage, however, incurs a penalty that is easily overlooked: software must be maintained during its lifetime so that it keeps its original qualities, even when facing changes in the way it is used.

This flexibility along with its associated maintenance problems can be easily seen in any non-trivial piece of software, be it commercial or open source. An application or library usually starts as a way of solving a particular need. As development progresses, however, the same software is frequently adapted to solve more and more complex use cases. As a result, code needs to adapt itself to these changing requirements, changes that sometimes modify important assumptions upon which the original code was written.

The main challenge faced by developers is to allow a piece of software to evolve in an organic way, so that it can respond to these sometimes conflicting requirements. As an example, the same application that was once able to process a single file with a particular purpose may, a few years later, have to handle multiple files, each one with a different type of contents.

An Answer to Changing Requirements

Because of these difficult requirements, a lot of software artifacts just decay due to a lack of proper maintenance. As changes happen to the environment where it exists, a program will start to provide incorrect answers some of the time, or even most of the time. These are bugs that have been introduced by the very same evolution that is reshaping the program to be able to cope with different requirements.

Despite this, it is possible to maintain software properly even as big changes happen to its processing capabilities. Refactoring is the central concept enabling this adaptation on a daily basis. With the help of refactoring, programmers are able to slightly modify the way a piece of software works, so that it can satisfy specific characteristics needed by new requirements. At the same time, refactoring can be used along with unit testing to improve confidence in the quality of the existing code base.
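A minimal sketch of how unit tests support this, using Python's standard `unittest` module. The function `word_count` is a hypothetical piece of existing code; the tests pin down its current behavior so that its body can be reworked later with some confidence.

```python
import unittest


def word_count(text):
    # Behavior we want to keep stable while the body is refactored.
    return len(text.split())


class WordCountTest(unittest.TestCase):
    def test_counts_words_separated_by_spaces(self):
        self.assertEqual(word_count("refactor with confidence"), 3)

    def test_empty_text_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_extra_whitespace_is_ignored(self):
        self.assertEqual(word_count("  two   words "), 2)
```

Saved in a file, these tests can be run with `python -m unittest` before and after each small change. If they still pass, the refactoring has at least preserved the cases they cover.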

Refactoring to the Rescue

Refactoring surfaced as an automated way of modifying programs, especially code written in an object-oriented fashion. Smalltalk was among the first languages to provide a refactoring facility, so that programmers could make small changes with confidence. It is important to understand that the changes allowed by refactoring are usually minimal, so that they can be easily automated. Examples include: renaming methods, changing the order of arguments in a method declaration and all method calls, or moving a method from a subclass to its parent class.
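The last of these examples, moving a method from a subclass to its parent, can be sketched as a before-and-after pair. The class names are hypothetical, and the "Before" suffix exists only so that both versions fit in one listing; a refactoring tool would edit the classes in place.

```python
# Before: describe() is defined in the subclass only.
class ShapeBefore:
    def __init__(self, name):
        self.name = name


class CircleBefore(ShapeBefore):
    def describe(self):
        return "shape: " + self.name


# After moving the method up: describe() lives in the parent class,
# so every subclass inherits it and callers see the same result.
class Shape:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "shape: " + self.name


class Circle(Shape):
    pass
```

The change is small and mechanical, and the observable behavior of `Circle` is unchanged, which is what makes this kind of edit a good candidate for automation.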

The main advantage of refactoring compared to other approaches for code modification is that refactoring is (or can be) completely automated. With the help of a refactoring tool, it is very simple to make changes to a code base that will make the program cleaner, while at the same time maintaining its correctness.

When to Use Refactoring

Another advantage of refactoring is that it doesn’t need to be considered as a task separated from coding. Good software engineers make liberal use of refactoring in their workflows. In fact, some people use refactoring very frequently, in order to provide one or more of the following:

Improving readability: a simple reason to use refactoring is to improve the readability of existing code. Sometimes it is possible to combine two or more statements and make a method much easier to read. Conversely, it is possible to split a statement into two or more, while at the same time improving the readability of the whole section of code. These are powerful uses of the capabilities of a refactoring tool.

Adapting to changes in the underlying code: sometimes the implementation of a new feature will require just a slight change to existing code. For example, that change may require a new parameter to be added to a method; or maybe an existing method will need to be moved to another class. Instead of doing this manually, one can use a refactoring tool to achieve the same results with less work.

Naming methods and variables properly: another good use of refactoring is making sure that methods and variables reflect their true meaning in the program. Sometimes, as a piece of software evolves, the names used in the initial implementation may lose their meaning, either because of business reasons or because other parts of the implementation have evolved too. Refactoring is a simple way to solve this issue, since it guarantees that all other parts depending on these methods or variables (including automatic tests) will be properly updated.
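Two of the uses listed above, splitting a statement and renaming, can be sketched together. Both functions are hypothetical and compute the same value; the second is what the first might look like after the two refactorings.

```python
# Before: one dense statement and names that no longer say much.
def calc(d, r):
    return d["qty"] * d["unit"] - d["qty"] * d["unit"] * r


# After: the statement is split in two and the names are updated.
def discounted_total(order, discount_rate):
    subtotal = order["qty"] * order["unit"]
    return subtotal - subtotal * discount_rate
```

In a real code base the rename would also have to reach every caller of `calc`, which is the part a refactoring tool takes care of.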

Conclusion

Refactoring is a powerful tool, and it should be part of any programmer’s tool chest. However, like any tool, it takes some training to get used to. Some of this ties in with the general idea of properly learning your software environment. Most modern environments (command line or IDE-based) have provisions for running refactoring tools.

Also, refactoring tools are available for many modern languages, the most common ones including Java, Smalltalk, Python, and C++. Therefore, the use of refactoring has just become a matter of mastering the underlying programming system. The frequent utilization of such tools can make our lives easier, and our code even better.


The Beauty of Short Methods

Writing short methods is one of the pieces of conventional wisdom that have been passed around as a form of agile technique. Many programmers adhere to the use of short methods without thinking too much about it, while others disregard the idea altogether, more concerned with the practical activity of creating usable code.

There are, however, a number of surprising reasons why writing short methods can improve the quality of software. Some of these reasons have nothing to do with just writing self-documenting code, as some people describe it. One of the main issues is the distinction between the static description of code provided by a set of classes, as compared with the dynamic and complex view provided by algorithms.

Dynamic Properties of Algorithms

When we start studying Computer Science at college, we discover a new world of scientific pursuit in the area of computational systems. Students are then introduced to algorithms, and taught that they are the foundation of programming. We are led to believe that using correct algorithms is necessary and sufficient to write good software.

There is no question that algorithms are a foundational concept, but they give programmers the false sense that explicitly writing an algorithm is the best way of solving complex problems in programming. Throughout my experience as a software engineer, I have found that combining small methods from several classes and libraries is a far more common way of solving problems – a technique that covers the huge majority of programming tasks available nowadays.

Despite all the ideas about the simplicity and elegance of small methods, I believe the main reason to use them is to transform the complex dynamic nature of algorithms into a static description of a solution in terms of classes and methods.

Complex Algorithmic Methods

The main advantage and liability of thinking in algorithms is that you learn to reason in terms of complex logic. It is true that, after you really understand an algorithm, its logic becomes clear enough that you may use it in several other contexts, as with any other mathematical concept.

There is a problem with this, however: whole algorithms are not so easy to recognize at first sight, especially by someone who hasn’t been educated in computer science. Even if you have good training, it is not always clear that an algorithm can be recognized when it is used in the context of a larger method. If that is the case, a programmer needs to document that a particular algorithm is being used, and the reader needs to believe that the implementation is correct.

Also, it is not easy to verify that an algorithm implemented as a large method is correct. This is a big problem encountered when moving from algorithms in a classroom setting to their concrete implementation. While in a classroom, an algorithm can be proved as a mathematical entity. It can be shown as correct by simply proving its mathematical properties.

However, when an algorithm is converted into software, all kinds of mundane issues arise that directly affect its result. For example, Knuth once mentioned that only a very small percentage of the implementations of a particular quick sorting algorithm appearing in the literature are actually correct.

Small Functions as a Vehicle to Algorithms

Small methods are a much more fruitful way to express complex algorithms, because they have an important property: they are able to convert the dynamic world of algorithms into the static world of classes and their associated structures. This is not a small feature, and it is the root of the power of the Smalltalk approach.

In a Smalltalk program, every method is just a few lines long. Incidentally, that is why Smalltalk environments have such a simple editor, which displays only a small number of lines at a time. When one sees this kind of program, there is the feeling that every method does very little. However, this allows a number of important features that are core to the philosophy promoted by the language.

First, it is very easy to determine how a single method works. The two-to-five-lines rule used when creating new methods makes it simple enough to understand their contents just by inspection – unlike anything you could do with long methods or functions. As an example, a number of static analyzers for languages such as Java and Objective-C fail when a method is too long. If a computer has problems when considering the branching possibilities in large methods, imagine how difficult it is for humans!
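A small sketch of the difference, written in Python to keep it short. All the function names are hypothetical. The first version interleaves filtering, parsing and summing in one body; the second does the same work with functions that can each be understood by inspection.

```python
# One longer function: filtering, parsing and summing are interleaved.
def report_total_long(lines):
    total = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        _name, amount = line.split(",")
        total += int(amount)
    return total


# The same work as small functions, each readable on its own.
def is_data_line(line):
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith("#")


def parse_amount(line):
    _name, amount = line.strip().split(",")
    return int(amount)


def report_total(lines):
    return sum(parse_amount(line) for line in lines if is_data_line(line))
```

Even at this size, the last function reads almost like a sentence, and each helper has a single branch or none at all.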

As another advantage, all the complexity of an implementation can be moved from the leaves of the system (the methods’ contents) to the internal nodes composed of classes and their methods. As a result, there are a large number of self-documenting methods that can be analyzed by just looking at a class diagram or class browser. This is exactly the reason why the class browser is the standard tool in Smalltalk systems.

Finally, all the dynamic aspects of a program are stored in the code either using polymorphism (which further reduces the complexity of each individual method) or as part of the methods themselves, using selection and repetition structures. However, since these control structures are used in very small pieces of code, the resulting interactions are also very easy to understand.
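The role of polymorphism here can be shown with a minimal sketch. The classes `Square` and `Rectangle` are hypothetical; the point is that the choice of which formula to run is made by the objects themselves, so no method needs a branch on the shape type.

```python
class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area(shapes):
    # No if/elif on the shape type: each object picks its own area().
    return sum(shape.area() for shape in shapes)
```

Adding a new shape means adding a new class with its own short `area` method, while `total_area` stays as it is.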

Small Methods in Traditional Languages

Small methods can be a powerful weapon when trying to reduce the complexity inherent in software programming. Seen this way, it is hard to understand why this kind of methodology is not more widespread in the programming field.

The first reason is the belief that programming with short functions/methods is bad from a performance standpoint. This kind of view is less relevant each year due to the increasing performance of computer systems, but is still around. The best answer is that doing otherwise is a premature optimization, and code quality should be the number one priority for a software engineer.

Another valid reason is the increase in class complexity when a large number of small functions or methods are used. This can justify the idea of using longer methods and a smaller number of methods per class. The solution for this problem, however, is already in place with modern IDEs. Each one of them allows programmers to traverse the hierarchy of classes in a system, most frequently using graphical tools to simplify the task.

When using this kind of system, it is important to emphasize the complexity shift from single methods to the whole class, while moving from analyzing the contents of a method to looking at class names and connections between classes. Modern IDEs also provide tools for moving methods around, using refactoring techniques that have been significantly improved during the last decade.

The good news is that, although Smalltalk is built around the idea of very short methods, it is not necessary to use Smalltalk to benefit from that concept. It is nowadays possible to use the same techniques in any object-oriented language. In fact, it is possible to do this even in traditional languages such as C or Pascal, although this will result in losing the benefits of polymorphism and encapsulation, which can even be a reasonable tradeoff depending on the type of application being written.


Five Common Mistakes in the Early Design Stage

Creating great software requires attention to several details that can make the difference between a well-designed system on the one hand, and a failure in terms of features and maintainability on the other. Every software engineer has a big list of elements that are necessary for a successful project and that should be part of the work cycle no matter what happens in the team.

In this post, however, I would like to emphasize a few mistakes that are very common in this field and that keep repeating themselves due to bad management practices. It would be great to keep this list in mind, since this kind of information can help us make better decisions whenever we are faced with similar situations.

Solving too much of a problem

A lot of failures in software design have to do with a lack of clear focus on a particular area. For example, inexperienced programmers and managers try to work on more than they can easily accomplish. The main cause for the resulting failure is trying to solve a big problem all at once.

The reason why it is so much better to focus on a particular goal has to do with the nature of problem solving: it is easier to solve several smaller problems than to find a single solution for a big one. When human beings try to handle too many complex details at once they get overwhelmed, and this happens even with smart people. In fact, a great part of being smart is to learn how to break problems into smaller pieces that can be easily solved.

Similarly, when developers try to create a program with too many responsibilities, the result is usually less satisfying. To avoid this pitfall, try to break problems into parts that can be easily solved independently. Then, come up with a way to combine the solutions so that the split becomes transparent for users. As with most problems in software design, there are several ways to achieve the same result. Experience can teach you the solutions that work better over time. Just become aware of the complexity of each subproblem you’re trying to solve, and try to reduce the complexity by breaking up the problem if necessary.
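At a very small scale, the idea looks like the sketch below. The task (finding the most frequent word in a text) and all the function names are assumptions chosen for this example. Each part solves one subproblem, and `most_common_word` combines them so that callers never see the pieces.

```python
from collections import Counter


def normalize(text):
    # Subproblem 1: make the comparison case-insensitive.
    return text.lower()


def tokenize(text):
    # Subproblem 2: split the text into words on whitespace.
    return text.split()


def count_words(words):
    # Subproblem 3: count how often each word appears.
    return Counter(words)


def most_common_word(text):
    # The combined solution: callers see one function, not the parts.
    counts = count_words(tokenize(normalize(text)))
    word, _count = counts.most_common(1)[0]
    return word
```

This sketch assumes the text contains at least one word; an empty text would need its own decision, which is easier to make once the parts are this small.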

Solving too little of the problem

While this is far less common than the previous problem, it also happens that systems may be underpowered for the problem they aim to solve. For example, they may miss critical steps that are necessary for the execution of a process, requiring manual intervention during a workflow. When this happens, users frequently need to take additional steps to fix issues that should be covered by the application in the first place.

The main cause for this problem is a failure in the design and requirement gathering phase. It is possible that software developers didn’t have enough experience to determine the main requirements of a full solution for the problem they were trying to solve. This is a cause of much frustration for users, and it usually leads people to disregard software as always incomplete and unreliable.

The best way to fix this category of problem is to have a better understanding of the user’s needs. Many times this issue can be solved not only by adding more features to an application, but by making it more flexible, so that different users can exercise the application in the way that best matches their needs. Most successful programs make a conscious effort to create flexible solutions that can be quickly adapted by users as necessary.

Designing around the user interface

While the user interface is the most visible element of a software application, it may not be the best way of analyzing and organizing the functionality of a piece of software. Unless you are working on an application that absolutely needs to have a particular interface, working first with the functionality in mind is better than blindly applying user interface elements from the beginning.

In many cases, software engineers use the interface as a focal point of feature design, this being one of the core ideas proposed by agile methodologies. They frequently forget, however, that the GUI is merely a presentation method for something of more fundamental importance for the user. The logic and data model components contain most of the necessary parts of a program; moreover, these parts are mostly independent of any graphical interface.

Such a policy of functionality-based design may be hard to enforce if you start the project working directly on the user interface. As a result, a number of bad decisions may follow from an unnecessary focus on the UI to the detriment of the real functionality of the application. Just as an example, in the past it was common to have graphical applications with menus and main windows, even when that was not the best way to present functionality to users.
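A minimal sketch of functionality-first design follows. The class `TodoList` and the function `render_text` are hypothetical. The model holds the logic and data and knows nothing about presentation, while the rendering function is just one of several possible views on top of it.

```python
class TodoList:
    """Logic and data model: knows nothing about how it is displayed."""

    def __init__(self):
        self._items = []

    def add(self, title):
        self._items.append({"title": title, "done": False})

    def complete(self, index):
        self._items[index]["done"] = True

    def pending(self):
        return [item["title"] for item in self._items if not item["done"]]


def render_text(todo_list):
    """One possible presentation; a GUI could call the same model."""
    return "\n".join("[ ] " + title for title in todo_list.pending())
```

Because the model can be exercised without any interface at all, it can be designed and tested first, and the choice between menus, windows or plain text can be made later.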

Not integrating properly with other systems

In most cases a system cannot be used in isolation. Even the simplest ones have to be aware of the hardware, operating system, and programming environment where they live. Similarly, software should be designed to take into consideration the degree of integration required from other systems that have already been implemented.

In a company, for example, we need to integrate with database systems, as well as other applications that provide business functionality. Web sites need to integrate with other technologies provided by the environment used, such as Java, Ruby, or PHP. A smart software engineer should be able to discern the best way to integrate into the environment where the program will live.

Failure in this area usually leads to systems that are underpowered because they don’t use and integrate the functionality available in the environment. Therefore this type of issue can also be viewed as a cause for the second item mentioned above. Try to make the best of existing technology in order to avoid this problem.

Not Using Common Solutions

Many engineers have trouble using established solutions for common software problems. This may come from a lack of understanding of a particular platform or programming environment. On other occasions it is common to observe the familiar “not invented here” feeling, which leads many developers to recreate existing technology.

The truth is that current programming technology is too complex to avoid using libraries and other applications developed by third parties. In fact, I would characterize as a liability the urge to implement everything in a software project, for the simple reason that a single person (or team) cannot have the competency to implement all parts needed by a modern application.

The best practice is to evaluate and use libraries and frameworks that have been proven to solve the main technical problems found during development. This has been made much easier by the emergence of open source software. A major advantage of open source is that solutions can be created for common problems and consequently shared by a whole community. A strong open source scene is a must-have for most modern programming environments.

Whatever the reason for this problem, it is essential to leverage the knowledge and work available through existing solutions. Trying to reinvent the wheel is in most cases just an example of bad practices in software engineering.
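A small sketch of what reinventing the wheel can cost, using CSV parsing as the example. The two function names are hypothetical; `csv.reader` and `io.StringIO` are part of Python's standard library.

```python
import csv
import io


def parse_row_by_hand(line):
    # Home-grown parsing: breaks on commas inside quoted fields.
    return line.split(",")


def parse_row_with_csv(line):
    # The standard library already handles the quoting rules.
    return next(csv.reader(io.StringIO(line)))
```

For a line such as `Smith,"Portland, OR",42`, the hand-written version splits the quoted city into two pieces and returns four fields, while the library version returns the three fields a user would expect.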

Conclusion

Developing software requires attention to several details. I have mentioned only five of the factors that can make or break a project, depending on how well they are managed. A lot of other issues remain, but being careful about the items mentioned here can definitely help in the successful completion of complex software projects.
