Why Are GUI Systems Object Oriented?

In the previous post I talked about areas of software development where data is at least as important as behavior in the system design. Object oriented practitioners will contend that behavior is the most natural way of designing a system, and will give a number of examples where it works and the data-centric approach doesn’t.

In almost every case, they will cite GUI libraries as an example where data-oriented or functional design doesn’t work. And you know what? They are right.

The reason is that OO was created to model system behavior. And, at the same time, we don’t really care about what data is stored in GUI code (we can always store relevant data somewhere else). For the most part, we just want to operate on the UI in terms of its behavior: window open events, button clicks, menu selections.
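This event-driven view can be sketched with a toy widget; the names below are hypothetical and not from any real toolkit. The object exposes behavior (registering and firing click handlers), while the data it stores is incidental.

```python
# Sketch of behavior-centric GUI design: a toy Button that exposes
# behavior (event handlers), not data. Hypothetical names, not a real toolkit.

class Button:
    def __init__(self, label):
        self.label = label      # incidental data; callers care about behavior
        self._handlers = []

    def on_click(self, handler):
        """Register behavior to run when the button is clicked."""
        self._handlers.append(handler)

    def click(self):
        """Simulate a click event by invoking every registered handler."""
        for handler in self._handlers:
            handler(self)

clicked = []
ok = Button("OK")
ok.on_click(lambda btn: clicked.append(btn.label))
ok.click()
print(clicked)  # ['OK']
```

Notice that the caller never reads the button's internal state; the whole interaction is expressed through behavior, which is exactly where OO is comfortable.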

As we have seen before, OO is an excellent programming methodology when data is not important. One can design the interaction between objects in a very simple and direct way. This is clear from the heritage of languages such as Smalltalk and Simula. Smalltalk was designed for direct UI interaction. Simula is a simulation language above all. In these languages, operating with data using a functional approach is not the main goal.

Lack of Behavior

The advantages of OO systems start to fade when they are faced with tasks that don’t require a rich behavioral component. For example, a compiler is just a system that inputs source code and outputs binary code. There is not much behavior to be explored in such a system, unless you want to stretch the definition of events.
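As a sketch of that view, here is a toy "compiler" that is nothing but a pure function from an expression tree to stack-machine instructions. The instruction names are made up for illustration; there is input, output, and no behavior in between.

```python
# A compiler viewed as a pure function from source to output.
# This toy compiles nested tuples like ('+', 1, 2) into instructions
# for a hypothetical stack machine; it is a sketch, not a real compiler.

def compile_expr(expr):
    """Recursively emit PUSH instructions for leaves, operators afterwards."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    return compile_expr(left) + compile_expr(right) + [(op, None)]

code = compile_expr(("+", 1, ("*", 2, 3)))
print(code)
# [('PUSH', 1), ('PUSH', 2), ('PUSH', 3), ('*', None), ('+', None)]
```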

It is no surprise, then, that it is easier to write compilers in functional languages (although compilers are typically written in languages such as C/C++ due to the low level nature of the task). Similarly for numerical code, a lot of which is still written in Fortran and similar languages.

Data Manipulation

Another way to view the shortcomings of OO languages when faced with data-oriented tasks is the strange way they interoperate with data. In some languages, like C#, data elements assume the form of properties. In general, the idea is to provide getters and setters to access data components.

Now, if you think about it for a second, getters and setters don’t make any sense. Objects were created to encapsulate behavior, not to provide access to information. If that is all they do, why have an object in the first place? Objects have no business storing information that is meant to be accessed by outside code. Any object that provides such access is not doing enough with its data to justify its existence, unless the data is computed in some complicated way.

The real answer is that objects should not provide access to data. They should provide behavior, not function as storage areas. C++ and Java programmers recognize this when they talk about POD types and POJOs, which have no associated behavior (only getters and setters…).

What happens if you have a programming project that manipulates lots of data? In OO programming, you have to fake plain data using POD objects. It would be much more natural, however, to use a functional approach to develop the system.
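The contrast can be made concrete with a small sketch (illustrative names only): a getter/setter class that adds nothing over its data, next to a plain record transformed by a free function.

```python
from dataclasses import dataclass

# The getter/setter style: the object "encapsulates" data behind
# accessors that add no behavior at all.
class PointObject:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    def get_x(self):
        return self._x

    def set_x(self, x):
        self._x = x

# The plain-data alternative: an immutable record plus a free function
# that transforms it (a dataclass stands in for a POD/record type).
@dataclass(frozen=True)
class Point:
    x: float
    y: float

def translate(p, dx, dy):
    """Pure transformation: new data in, new data out."""
    return Point(p.x + dx, p.y + dy)

p = translate(Point(1.0, 2.0), dx=3.0, dy=0.0)
print(p)  # Point(x=4.0, y=2.0)
```

The second version says the same thing with less ceremony, and the transformation is a reusable function rather than a method tied to mutable state.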

Conclusion

Object oriented design is not a panacea. It provides little support for data manipulation. Instead of trying to fix its shortcomings through things like POD objects, maybe we should allow richer programming models that are not dependent solely on the use of objects.

Further Reading

Pearls of Functional Algorithm Design is a very solid book explaining how to design algorithms in a functional style.

The Functional Approach to Programming is another book that explains the nuts and bolts of writing functional code.

Data Driven Software Design

Due to the importance of object oriented design in software engineering, most developers believe that objects are the holy grail of software systems. However, despite the usefulness of interfaces provided by objects, I believe that there is something even more basic that we should look for when designing a system.

To see this, we may first take a look at how object oriented languages make us think about design.

The grand idea of OO is that behavior of individual objects and their interactions with other objects is the most important aspect of the system. Therefore, a lot of time is spent in finding the right interfaces and the best class hierarchy to represent these behaviors and relationships.

While this is a sensible way to start investigating a system, OO may lead you to wrong decisions down the line. This usually happens in the implementation phase, because it causes developers to stop thinking of software in terms of data and start thinking only about behavior and object interaction.

Deficiencies of Code Behavior

The problem with using behavior instead of data during the implementation of a system is that (outside the realm of high-level interfaces) behavior is transitory, while data usually is not.

Consider the following example: a system determines how to go from point A to point B. The necessary steps for this transition can change a lot. But the information I have at the beginning of the process will not change. Similarly, the end point is defined by the data, not by the path the system uses to reach that point.

So, looking exclusively at behavior in a system is the same as prioritizing the way things currently work, instead of the desired results. The moment you create an object that provides operations x, y, and z, you specify behavior. If you create a class hierarchy it gets even worse, since it has the combined behavior of a set of classes. Changing the parent class now requires similar changes to dozens of files.

However, unless this is a high level interface to a subsystem, I shouldn’t care what the behavior of the object is. I should only care about the input and output data that is defined as part of my programming contract.
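A minimal sketch of this idea, with a hypothetical toy graph: two route planners with completely different internal behavior satisfy the same data contract, so the caller depends only on the input and output data.

```python
from collections import deque

# Two route planners over a toy graph (hypothetical data). They have
# different internal behavior but the same contract: given (start, goal),
# return a path that begins at start and ends at goal.

GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

def route_bfs(start, goal):
    """Breadth-first search: explores the graph level by level."""
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in GRAPH[path[-1]]:
            queue.append(path + [nxt])

def route_dfs(start, goal):
    """Depth-first search: different traversal order, same contract."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == goal:
            return path
        for nxt in GRAPH[path[-1]]:
            stack.append(path + [nxt])

# The internal paths may differ, but the endpoints are fixed by the data.
for plan in (route_bfs, route_dfs):
    path = plan("A", "D")
    assert path[0] == "A" and path[-1] == "D"
```

Swapping one implementation for the other changes the behavior but breaks nothing, because the contract was stated in terms of data.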

Looking at this from the point of view of functional programming, one sees why it is so appealing compared to object oriented programming: what I described above is exactly the functional view, where the only thing that matters is the transformation of input into output.

By the way, this also explains why UNIX systems are so pervasive, even though their implementation is simple imperative code. High-level interfaces are defined by top-level commands: the inputs (parameters) and the outputs (what they’re supposed to return). This is the interface. The internals of a UNIX command, though, are just procedural C code operating on the data provided. It is quick (and sometimes dirty) but always does the job.

This approach to software development is not very difficult to adopt, even with object oriented languages, once a good design is in place.

This could be done in C++, for example, by writing OO interfaces only for the major subsystems. This would be the equivalent of creating a set of UNIX commands that cooperate to perform well-defined tasks. The internals of such classes would be defined by the available data, without the encapsulation provided by objects, unless it is necessary for clear reuse cases (for example, logging subsystems, UI subsystems, etc.).

Summary

Object oriented programming makes us think in terms of the responsibility and state of objects. However, when we want to implement concrete functionality, data is more important than behavior, because data is a better representation of what needs to be done. While behavior can change based on how the system interacts with the environment, the properties of the supplied data are immutable.

Clearly, there are situations when behavior is more important than data, but object oriented systems try to see this as the only possibility. I think that programmers should start to use the right abstraction when necessary, instead of holding only one tool for all jobs. Therefore, we should start to think more about when it makes sense to view data as the main aspect of a system. We will end up seeing a lot of places where this makes systems more robust and easier to implement.


What is Second System Syndrome?

Second system syndrome is a term, coined by Fred Brooks in The Mythical Man-Month, that characterizes software development groups that enter into the dangerous sport of rewriting an existing piece of software.

It usually happens when someone, feeling that the current system is not good enough, believes that it would be easier to abandon the current code base and give it a fresh start.

Second system syndrome has happened to a number of high profile companies, but probably the best known case is the rewrite of the Netscape web browser. By the end of the 90s, Netscape was losing its dominance of the browser business to Microsoft. The main problem Netscape faced, however, was that it was too hard for them to enhance the aging code base of Netscape Navigator.

The solution Netscape found was to rewrite the whole browser from scratch, in order to create a new code base that would be easier to evolve. The result of this pursuit was that it took years before Netscape had a new browser with the expected functionality. By that point, Microsoft was already the dominant browser vendor, and from there it was just a downward spiral for Netscape.

Another thing inherent to second system syndrome is the desire to create a new version of a system that is an order of magnitude better than the previous one. This creates expectations that are hard to meet.

As a common example, Microsoft has suffered from this syndrome several times during the evolution of Windows. They have frequently promised a new version of the OS that would fix all perceived flaws and introduce revolutionary technology. This is usually followed by long delays and, finally, by a new version that is more similar to the previous one than to the promised one.

The Lessons

While rewriting may be a good thing to do for research and prototype systems, it is rarely the right thing to do for production systems. In particular, if there is fierce competition in your segment of the market, there may be no time to recreate a product. In the case of Netscape versus Microsoft, Netscape gave up all the development advantage they had.

Clearly, there are some perceived advantages to rewriting a piece of software:

  • Do it right from the beginning: this is the main allure for software engineers (at least the good ones). We really like to do the right thing. However, starting from scratch is not the only way of improving software.
  • Fix a broken architecture: similar to the above, the problem may be that a completely wrong architecture is in use. For example, we may be using a client-server architecture where one is not appropriate.
  • Addressing performance problems: This is probably the weakest reason to create a new system. Performance is usually a result of the algorithms and architecture used. It is better to deal with these issues than to start everything from scratch.

Disadvantages

  • The new architecture is not a proven one: although we may know that the previous architecture is bad, there is no way to know that the new one doesn’t suck until we have tested it. A lot of resources may go into fixing something that isn’t broken.
  • Lessons of current system may not apply: your team spent years developing a product. With a rewritten version, a lot of that experience is lost simply because the code is different. Improving a system may be easier than recreating it.
  • No guarantee that the second system will work: another issue is that, if your current code base is not working, the attitudes towards software creation that produced it will probably be carried over to the new code base. It is possible that in a few months the new code will look very similar to the old one.
  • Time to market is lost: most importantly for companies, the lead in the marketplace may simply disappear. In a competitive market, this may be a terrible move. It is better to spend time improving what you already have than trying to create something new and unproven.

Refactoring instead of Rewriting

As a response to the problems created by second system syndrome, the industry has moved towards refactoring as a standard tool for improving systems. Refactoring is the art of changing a code base incrementally in the direction we want it to evolve. For example, it is possible to refactor a code base to make it more object oriented, or more functional, if that is desired.
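As a minimal sketch of one such refactoring step (illustrative names only): extracting a pure function from a stateful method improves the code incrementally, in a more functional direction, while keeping the external interface intact.

```python
# A small refactoring in place of a rewrite: extract a pure function
# from a stateful method. Names are illustrative.

# Before: the computation is buried in a class and mutates state.
class InvoiceBefore:
    def __init__(self, items):
        self.items = items
        self.total = 0

    def compute(self):
        for price, qty in self.items:
            self.total += price * qty
        return self.total

# After: the same logic as a pure, reusable, testable function; the
# class keeps its interface and simply delegates.
def invoice_total(items):
    return sum(price * qty for price, qty in items)

class InvoiceAfter:
    def __init__(self, items):
        self.items = items

    def compute(self):
        return invoice_total(self.items)

items = [(10, 2), (5, 1)]
assert InvoiceBefore(items).compute() == InvoiceAfter(items).compute() == 25
```

Each step like this is small enough to verify against the existing behavior, which is exactly what a from-scratch rewrite gives up.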

There are a number of good books in this area, but the best known is

Refactoring: Improving the Design of Existing Code, by Fowler, Beck, Brant, Opdyke, and Roberts.

Another remarkable book on this topic is Clean Code: A Handbook of Agile Software Craftsmanship, by Robert Martin.