How much code should you write?

Software is not like books: you don’t sell it based on the number of words. A book requires a few thousand lines of text to be valuable; after all, the buyer will check the number of pages and treat the size as an indicator of how much content is inside.

This is not true for software. Users don’t care how many thousands of lines you had to write and test to build the product. They just care about their needs, and whether the product solves the problem they have.

This is not to say that you should avoid writing code when it is necessary for the functionality you want to provide. But putting the emphasis on writing software is frequently the wrong thing. It is much more important to understand whether the software you’re creating will solve the issues faced by your customers.

Spending Time on Wrong Priorities

A basic mistake of technology startups, for example, is thinking that writing the software is the most important part of the process.

It is clearly important to create a software product in the first place, but the time you take for this to happen doesn’t correlate linearly with the value of the resulting system.

The best approach is to understand clearly what your users need, and to supply an initial solution to that problem as quickly as you can.

Sometimes this will involve less time writing software and more time communicating with customers and listening to their feedback. Sometimes it will even involve removing existing features from the software you already have.

Buying or Building?

Another issue with code writing is that it is frequently better to look for existing solutions than to write these solutions yourself.

The world has a large amount of software waiting to be put to good use. Open source projects, for example, offer a huge amount of code that can be employed on behalf of your clients. If you spent less time writing that software yourself, you could probably get more out of existing libraries, and deliver even better products as a result.

On the other hand, if a new company fails to use existing libraries, especially open source ones, competitors will certainly use this opportunity. Nowadays, the correct use of open source is a strategic choice that can impact the quality of your products. Even large companies such as Apple, which have the budget to create as much software as they need, use open source libraries whenever it makes sense.

In other words, current developers have a lot more to do than just write more code. For example, a correct understanding of user needs is one of the main facets of the work, and one that is frequently overlooked.

Summary

The main question in deciding how much code to write, as opposed to reusing existing libraries, is whether you are making proper use of effective solutions instead of reinventing the wheel. Many developers think they have the responsibility of writing all the major pieces of a software solution themselves. This is a disservice to customers, who want a new solution, not a rewrite of what already exists.

These ideas can make your software writing take a fundamental leap. The alternative is to spend your whole time on activities that, in the end, may not be in the best interest of your clients.

Day 16: Measure Performance Before Optimizing

Writing programs takes time, but writing correct and fast programs is even more difficult. That is why there is so much badly written software on the market. Bugs are the result of sloppy and rushed code, and programs are frequently slower than they need to be as a result.

In this situation, it sometimes seems smart to design systems that are efficient. But for many programmers, efficiency becomes an end in itself, and they end up trying to optimize code before understanding the real implications of these optimizations.

One of the most common problems is not understanding the real bottlenecks in the system. For example, when writing software for the web using standard architectures, accessing the database is frequently the bottleneck of the system. When faced with this situation, it doesn’t matter whether the code that handles string comparison is written in assembly or in Ruby: the overall result will be approximately the same.
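To make this concrete, here is a minimal sketch in Python that times the two kinds of work side by side; the numbers and names are purely illustrative, and a sleep stands in for the database round trip:

    # Illustrative sketch: compare the cost of many string comparisons with a
    # single simulated database round trip (a sleep stands in for query latency).
    import time

    def compare_strings(n=100_000):
        a, b = "customer@example.com", "customer@example.org"
        return sum(1 for _ in range(n) if a == b)

    start = time.perf_counter()
    compare_strings()
    string_time = time.perf_counter() - start

    start = time.perf_counter()
    time.sleep(0.05)  # assumed 50 ms round trip to the database
    db_time = time.perf_counter() - start

    print(f"100,000 string comparisons: {string_time * 1000:.1f} ms")
    print(f"one simulated DB query:     {db_time * 1000:.1f} ms")

On typical hardware the single simulated query takes longer than the entire batch of comparisons, which is the point: the language used for the string handling barely matters here.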

Measuring Speedup

There is an interesting result studied in computer architecture called Amdahl’s Law. This principle says that the overall gain from any optimization (say, doubling the performance of one part) is limited by the fraction of the execution time that part is responsible for: optimizing a part that accounts for only a small percentage of the execution time results in a very small overall improvement.

For example, suppose that you improve the string handling code to be 10 times more efficient. This might give you hopes that the whole system will be a few times more efficient than it is. However, after measuring the system, you verify that the time spent on string comparison is only 10% of the total time.

The result is that an operation that used to take 10% of the total time now takes only 1%. The total running time therefore drops from 100% to 91% of what it was, which means the performance improvement for the whole system is just 9%.
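The same arithmetic can be written down in a few lines of Python; this is just the law from the paragraph above, with the example numbers plugged in:

    # Amdahl's Law: overall speedup when a fraction p of the running time
    # is made s times faster.
    def overall_speedup(p, s):
        return 1.0 / ((1.0 - p) + p / s)

    # The example above: string handling takes 10% of the time and is made
    # 10 times faster. The whole system becomes only about 10% faster,
    # saving roughly 9% of the original running time.
    print(overall_speedup(0.10, 10))  # ~1.0989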

If 9% still seems good to you, notice that I was very generous here. It is really hard to make something 10 times faster unless you come up with a different algorithm. Also, the split is not commonly 10% versus 90%, but more frequently 1% versus 99%. That is, most areas of your code contribute 1% or less to the running time, while bottlenecks such as network and database delays are responsible for 99% of it.

Even for processor-bound tasks this situation remains. There are probably a few small parts of your code that are responsible for 99% of the time spent in the program.

That is why measuring a system is so important. If you don’t know what is responsible for that 99%, your efforts to improve the performance of the system are basically wasted. You should aim first at understanding what contributes to the bottlenecks in the system. Then, you can start planning how to improve that area.
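As an illustration of what measuring first can look like, here is a small sketch using Python’s built-in profiler; the request handler and its fake database call are invented for the example:

    # Profile before optimizing: find out where the time actually goes.
    import cProfile
    import pstats
    import time

    def fetch_from_database():
        time.sleep(0.05)  # stand-in for a slow query
        return ["alice", "bob"]

    def format_names(names):
        return ", ".join(sorted(names))

    def handle_request():
        return format_names(fetch_from_database())

    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(10):
        handle_request()
    profiler.disable()

    # Show the functions that account for most of the cumulative time;
    # the fake database call will dominate, not the string formatting.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)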

Further Reading

Read Computer Architecture: A Quantitative Approach to find everything you may want to know about Amdahl’s Law and how it applies to computer systems.

Day 15: Use Meta Programming When Possible

Meta programming is a style of software development in which programs are generated or modified automatically, based on rules that have been previously defined in the system.

For example, suppose that one wants to create a graphical interface based on simple specifications. The UI code may be written as a complicated C program, which makes it difficult to adapt the system every time there are changes.

A better way to solve the problem is to have a separate program that reads a simpler specification for the graphical interface and produces the C code that implements it. It may be a little complicated to write the program that reads the specification and emits the C code, but once this is done, handling changes becomes a simple matter of adapting the specification file instead of rewriting the GUI code.
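As a toy illustration of this idea, the sketch below (in Python, with a made-up one-line-per-widget specification format and placeholder C names such as gui.h and add_button) reads a tiny interface description and emits the corresponding C stubs:

    # Toy code generator: read a made-up widget specification and emit C code.
    # Format (hypothetical): one "button <name> <label>" entry per line.
    SPEC = "button ok OK\nbutton cancel Cancel\n"

    def generate_c(spec):
        lines = ['#include "gui.h"', "", "void build_interface(void) {"]
        for entry in spec.splitlines():
            if not entry.strip():
                continue
            kind, name, label = entry.split()
            if kind == "button":
                # add_button() is a placeholder for whatever the real GUI
                # library provides.
                lines.append(f'    add_button("{name}", "{label}");')
        lines.append("}")
        return "\n".join(lines)

    print(generate_c(SPEC))

Changing the interface now means editing the specification and re-running the generator, not touching the C code by hand.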

In meta programming, we are thinking about ways of avoiding writing software by converting simple specifications into real code. By doing this, a programmer can speed up development, because changes can now be made by the computer rather than by hand.

The number of bugs can also be drastically reduced, because you only need to check that the code generation is correct. Once this is verified, every time there are changes you can rely on the code generator to do the right thing.

Language Support

A few languages make the concept of meta programming easier to use. For example, Lisp provides macros as a standard facility to write code that will be directly read and compiled by the system. Prolog provides simple clause expansions that may be used to generate code as needed.

For more traditional languages, meta programming is not part of the normal development process. This doesn’t mean, however, that it cannot be used. In fact, more verbose languages such as C/C++ and Java are the best targets for meta programming, because such a facility lets us avoid writing a huge amount of code.

A well-known example of code generation with C is the lex/yacc system for generating lexical scanners and parsers. Instead of doing the laborious work of writing a parser by hand in C or C++, one can use yacc to generate the repetitive parts of the program. Then, one adds only the code necessary to operate on the parsed entities.

Similar programs exist for Java and C#. The general idea, however, is that such an approach can be used anywhere we have a formal specification for which we know how to generate code.

Nowadays it is easier than ever to use meta programming. For example, one can simplify the task of parsing the formal specification by using XML as the specification language. We have standard tools and libraries that can parse XML and generate code as necessary.
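For instance, with Python’s standard xml.etree.ElementTree module the parsing part becomes almost trivial; the specification format and the emitted add_button call below are invented for illustration:

    # Sketch: parse a small, invented XML interface specification with the
    # standard library and emit one line of target code per widget.
    import xml.etree.ElementTree as ET

    SPEC = """
    <interface>
        <button name="ok" label="OK"/>
        <button name="cancel" label="Cancel"/>
    </interface>
    """

    root = ET.fromstring(SPEC)
    for button in root.findall("button"):
        # add_button() is again a placeholder for the real GUI library call.
        print(f'add_button("{button.get("name")}", "{button.get("label")}");')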

The main step in leveraging meta programming is thinking about the needs of the system and taking advantage of simpler specifications whenever possible. Using meta programming when possible can easily reduce the cost and time of development for a system by an order of magnitude, while improving the overall quality of the resulting code.

Further Reading

Flex & Bison (Bison is a version of yacc) provide a simple way of doing meta programming, where code is generated based on a grammar for a particular language.

Generative Programming is a book that describes methods to create systems through code generation. It has many novel ideas that are of interest for practical applications.