Day 11: Understand the Lower Layers of the System

Hi, this is the 11th part of a series of posts on 30 tips for becoming a better developer. If you would like to keep up to date with the topics that I am covering, just check the main post.


As software developers, we enjoy thinking of a system at as high a level as possible. This has been an organizing principle in software development: people tend to create higher-level abstractions that make it easier to reason about a more complex system.

However, working at higher levels of abstraction has its cost. Inefficiencies become harder to see, and at some point performance becomes a problem that needs to be handled.

A layered approach to software

Computers are organized in layers. The lowest level is hardware. At a higher level is application code. It is nice to work at the higher levels, but if you don’t understand the most basic parts of the system it becomes difficult to make competent decisions. This is even truer as code gets more abstract.

This is not to say that one shouldn’t work at a higher level. In fact, even though we should understand concepts such as machine language, it would be counterproductive to work with machine-dependent code.

The idea is that you should know (or at least have a good idea of) what is happening at the lower levels of your code. This way, you can understand what trade-offs are being made in your code.

Another reason why it is important to understand the lower levels is that you can “break free” of the higher-level language if necessary. People working on time-sensitive code (such as game programmers) already know when to use C or assembly for critical sections of their programs. A smart software architect knows when to reach for help from the lower levels of the software architecture.

This is especially true when higher-level abstractions don’t work. For example, suppose there is a bug in a compiler or interpreter. How can you diagnose it? The only way is to understand what it does and how that should be done correctly (in this particular case, that involves checking the machine code generated by the compiler). Otherwise, you are at the mercy of whoever supplies the programming tools you use.

Conclusion

Thinking for yourself is very important for a software developer. Understanding the lower layers of the system you are working with will do wonders for your ability to think for yourself.

In this respect, one of the great advantages of open source software is that it allows people to poke at the innards of applications and even the operating system. Whatever you think of software freedom, working with open source software may improve your awareness of the lower levels of the system.

Another thing that can help is reading books and articles that shed some light on the lower layers of your system. Here are a few suggestions:

Why C++ will not die

If you work in programming, from time to time you will see articles talking about the inevitable downfall of C++ as a mainstream programming language. Some people go as far as to say that C++ is even harmful to your career and that you should avoid using it whenever possible.

Whatever the reasons some people have for disliking C/C++, I believe that the backlash against C and C++ is rooted more in a lack of understanding than in fact.

The Case for C

C was created to be a language for professional programmers, not for people that are still learning the trade. The designers of C created it as a means to efficiently develop operating systems. One of the consequences of this design is that C is as close to the hardware as one can get without using assembly itself.

This may be bad if you are just starting, but it is liberating if you are a seasoned programmer that really wants to get the most out of the machine.

If you look at the criticisms against C, the most important argument is that it makes it easy for programmers to shoot themselves in the foot. No wonder: C was created exactly to make such things possible.

Writing Simple Programs Quickly

If your goal is to write small programs easily, you should stop your criticism of C. The fact is that for writing small, simple programs C is not even in the running. Such programs should be written in Perl, Python, or whatever flavor of scripting language is the most popular at the time. This is what UNIX programmers have done forever.

For small programs that impose no bottleneck on the system, any language is OK, as long as it makes it easier to solve the problem at hand. Lisp and Prolog are excellent for solving symbolic problems, for example. Quick file processing can be done with Perl.

Large Scale Programs

Now, let us talk about the programs that really matter on each platform, i.e., large scale software that pushes the machine to its limits.

In higher-level languages, such as Pascal, Java, or Lisp, one is working with protective gear that allows one to be sloppy without sacrificing safety. This is not to say that one cannot write efficient programs in Lisp, say, but doing so is harder than usual.

In a language like Lisp, if you want to write efficient code you need to understand all sources of bottlenecks. This usually means you have to understand how your particular system is implemented and how the underlying architecture works. Then, you need to find ways to avoid such bottlenecks, and doing this usually means going to a lower level of programming, such as (guess what) rewriting your program in C or assembly (or maybe using typed expressions in Lisp, which is just another way of thinking in assembly).

In other words, you have to be a real expert in the language, in the implementation of the language, and in the underlying machine in order to write something out of the ordinary. And if you have the knowledge to do this, it might be easier just to write everything in C. On the other hand, while you may need to do this in just a few places in the whole program, the bigger your program is, the higher the chances that you will need to go lower level more frequently. This is why writing a web browser or an operating system in anything other than C/C++ is such a difficult proposition.

Now compare the situation with C. Sure, you cannot expect that a novice programmer will know how to handle pointers or null-terminated strings. That is fine, though, because C was not made for novices. The C language was created as a vehicle for writing efficient software, for people who care enough to use such a tool.

Notice, however, that once you master a few concepts you can treat C as just another high-level language, but one that maps closely to hardware concepts like real memory addresses. Maybe you will lose the illusion of objects, for example, but you will gain a real grounding in what happens in the computer.

What about C++?

I talked a lot about C, but what about C++?

C++ is the child of C that was adopted by the industry. C++ has everything that the industry likes about a technology: it has buzzwords, it is constantly changing and requiring new tools and compilers, it has strong support from big companies like Microsoft, and it has a lot of legacy code.

The legacy issue with C++ is so important that C++ has essentially killed C as a commercial product. Nowadays you don’t buy a C compiler; you buy a C++ compiler that can also compile C code. In this sense, C++ is really important, because it is the only way C programmers can continue using C.

Also, despite the many problems with the extensions created by C++, many of them are really useful, such as namespaces, a standard library of containers, and basic support for objects. So, basically all commercial users of C++ can write C and pick and choose the features of C++ that they want (see for example Google’s style guide for C++ [2]).

In conclusion, although a lot of people don’t love C/C++, it has an important role in the industry that has not been filled by any other language. This probably means that we will continue to see good C++ programmers making money and working on interesting projects in several areas.

Further Reading

Many of the advantages and disadvantages of C++ for large scale programs are listed in Large-Scale C++ Software Design, by Lakos.

An even deeper description of the C++ object model is given in Inside the C++ Object Model, by Lippman.

[1] http://www.ittybittycomputers.com/IttyBitty/CppHarm.htm

[2] http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml

Using Firebug to Edit HTML

Some time ago I started playing with Firebug, a Firefox extension that provides tools for editing and debugging HTML and JavaScript. It is a tool that I recommend to anyone interested in creating web content.

The core of the extension is a window that allows one to inspect the HTML, CSS, and JavaScript in a web page. With this tool you can highlight elements, make small changes, or even rewrite parts of a web page until it is as good as you want it to be.

Firebug is the closest thing I have seen to a true HTML editor, that is, something that allows you to play with HTML code in real time. While a lot of HTML editors provide the necessary ways to enter HTML code, it is not the same as having the browser update with each keystroke.

The only inconvenience of Firebug as an HTML editor is that it will not allow you to automatically save the edited content to a file. This makes the process less smooth, because you need to save the document yourself.

Despite this small problem, it is actually easy to save a page that has been edited with Firebug. Just select with the mouse the elements of the page that have been edited. Then, select copy from the context menu. You can now paste that text into a suitable editor, or use it to modify the previous HTML/CSS file.

Firebug provides a lot of flexibility for website writers. It is a great tool that can be invaluable for anyone who needs to create web pages.