The Finest Object System You’ve Never Heard Of
The Common Lisp Object System is the finest object system in existence, and I bet you’ve never even heard of it.
It’s the mid-1980s, and Object-Oriented Programming (OOP) has crawled out of its sleepy academic existence and is trying to make it into the real world. Bjarne Stroustrup has just published his treatise on C++, and the world is generally waking up to GUIs as the Apple Macintosh has just been released.
A group of people comes together under a name better suited to a spaceship, X3J13, to standardize the various implementations of Common Lisp, and is immediately faced with merging the different flavors of object-oriented extensions found in the implementations of the time. They took their time, but produced one of the finest object systems in existence. It beats every other object system hands down, but sadly, ideas from it did not make it out into the non-Lisp world, and the world has been worse off for it as we now deal with OO catastrophes like Java, JavaScript and Python.
Finest is a superlative that must live up to its claim, so let me tell you about it. My goal here is not to convince you to drop everything and start using CLOS (pronounced “C-LOS”, as the Common Lisp Object System is known), but to spread the unique and elegant ideas behind it so that the next ill-informed software engineer who tries to design a new language will perhaps be better informed.
For this article, I will assume that you have a basic understanding of Java-style object-orientation, which I will refer to as “traditional”. I will then tell you about CLOS, and why it is a far better design choice. Like all designs, it has some downsides too, so we’ll talk about them as well.
The Structure/Behavior Dichotomy
A common design/architecture/analysis principle in software is the structure/behavior dichotomy. This is where, as designers of a system, we identify the structural pieces and then identify how each of those structural pieces behaves within the system.
In object-oriented design, this is typically what helps us identify classes and how they relate to one another (structure) and methods (behavior). Traditional OO then takes what at first glance seems like a natural step: It groups related structure and behavior into an object. This is how we teach object-oriented programming to our students.
The traditional motivation for combining structure and behavior is twofold. Firstly, objects in the real world behave like this. Physical objects have structure that defines their behavior. A rubber ball bounces; rubber has a coefficient of elasticity; the physics of it can be defined neatly as methods on the rubber ball object. This is convenient when you’re thinking of simulations. (Simula, the original object-oriented language, was the first to introduce this.)
The second is that a single mechanism, inheritance, allows the reuse of both structure and behavior, which is definitely convenient if you’re dealing with objects like the above. This reuse, then, becomes the primary driver for the adoption of OO in software design. You can take pre-written pieces of software and incrementally derive new functionality based on your needs. It’s a wonderful promise, and it led to very rapid adoption of OO as the complexity of software continued to grow exponentially.
Joined at the hip
In this view of the world, the separation between structure and behavior is somewhat lost. The two become joined at the hip. Methods now have a “primary” object to which they are subservient. In traditional OO this is usually taken to its logical conclusion in almost every implementation. Objects carry around a method table that is physically linked to the object (usually through a pointer of some sort) so that compilers can implement rapid dispatches (more on this later).
The problem with this design choice of merging structure and behavior is that the world of abstract information system design is very different from the physical world. Our objects have behaviors defined depending upon the other objects they interact with. A button must “look” and possibly behave differently when it appears on a voice interface vs. on a screen. It’s like the rubber ball must now stop bouncing because you brought it to work.
Traditional OO solves this problem by other mechanisms, usually with some artifice. For example, a simple resolution would be to create a Button class which contains an abstract render method and then two subclasses, ScreenButton and VoiceButton. Typically, the render method needs a different rendering context depending on the target: a ScreenRenderingContext or a VoiceRenderingContext. So now you need an abstract RenderingContext so that the parameter type of the abstract render method can be a super-type of the contexts used by the individual rendering methods in ScreenButton and VoiceButton.
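The design just described can be sketched in a few lines. This is a minimal Python rendition, not code from any real toolkit: the class names follow the paragraph above, while the `draw` and `speak` methods and the returned strings are made up for illustration.

```python
from abc import ABC, abstractmethod


class RenderingContext(ABC):
    """Abstract super-type that exists only so Button.render can be typed."""


class ScreenRenderingContext(RenderingContext):
    def draw(self, text: str) -> str:  # hypothetical helper
        return f"[draw] {text}"


class VoiceRenderingContext(RenderingContext):
    def speak(self, text: str) -> str:  # hypothetical helper
        return f"[say] {text}"


class Button(ABC):
    def __init__(self, label: str) -> None:
        self.label = label

    @abstractmethod
    def render(self, context: RenderingContext) -> str:
        """Dispatch happens on the receiver (self) only."""


class ScreenButton(Button):
    def render(self, context: RenderingContext) -> str:
        # The declared parameter type is the abstract one, so the method
        # has to check by hand that it got a context it can actually use.
        if not isinstance(context, ScreenRenderingContext):
            raise TypeError("ScreenButton needs a ScreenRenderingContext")
        return context.draw(self.label)


class VoiceButton(Button):
    def render(self, context: RenderingContext) -> str:
        if not isinstance(context, VoiceRenderingContext):
            raise TypeError("VoiceButton needs a VoiceRenderingContext")
        return context.speak(self.label)


print(ScreenButton("OK").render(ScreenRenderingContext()))
print(VoiceButton("OK").render(VoiceRenderingContext()))
```

Note the `isinstance` checks: because only the receiver takes part in dispatch, the pairing of button kind and context kind has to be enforced manually.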
So what’s the problem, you’re thinking. This is perfectly natural OO design. But I just went from one structure (button) and two behaviors to six structures + behaviors. This is a 3x rise in complexity in a small example. And in this simple version, we’re not even thinking about different screen sizes and responsiveness and what have you. This brings us to the next piece of OO programming. Almost every modern OO system is a gigantic collection of classes designed like this, probably containing a very large number of unnecessary objects and methods that have to be written to accommodate this limitation.
Dispatch
It helps here to think in terms of “dispatch”. All traditional OO programming boils down to this one fundamental operation: match the name of a method to a specific piece of code to run. Programming Language wonks call this “dispatch” after a fairly common programming pattern called a “dispatch table”.
Traditional OO programming is limited by single dispatch: every object has exactly one dispatch table, which maps names to behavior. As we saw, this is a direct consequence of merging structure and behavior.
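To make the idea concrete, here is a hand-rolled dispatch table in Python. It is a toy model of what a compiler or runtime does for you, not how any particular language implements it; `make_object`, `send` and the table names are hypothetical.

```python
# Hypothetical per-class dispatch tables: method name -> function.
def screen_button_render(self, context):
    return f"draw {self['label']}"


def voice_button_render(self, context):
    return f"say {self['label']}"


SCREEN_BUTTON_TABLE = {"render": screen_button_render}
VOICE_BUTTON_TABLE = {"render": voice_button_render}


def make_object(table, **fields):
    # Every object carries a reference to exactly one table.
    return {"table": table, **fields}


def send(obj, method_name, *args):
    # Single dispatch: only the receiver's table is consulted.
    # The types of the remaining arguments play no part in the choice.
    method = obj["table"][method_name]
    return method(obj, *args)


ok = make_object(SCREEN_BUTTON_TABLE, label="OK")
print(send(ok, "render", None))
```

Whatever is passed as `context`, the same function runs, because the lookup looks at the receiver alone.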
So, when complex behavior is encountered, the only way to design around it in traditional OO is to break up that complex behavior into a series of “single dispatches”. The additional constraints of inheritance, matching the types, etc. then force the introduction of “intermediate” single dispatches which are usually just artifices of this design choice. That’s why we ended up with 6 single dispatch tables from one structure and two behaviors.
When this conflation of structure and behavior gets too complicated, traditional OO pretty much gives up and defines a new beast: the Interface, which completely abandons structure to specify the requirements for a subset of behavior. This is because it becomes impossible to really design all these single dispatches without compromising code complexity goals, so you must weaken the type constraints somewhat.
One curious implication of single dispatch is that in most implementations of traditional OO some classes are “final”, i.e., you cannot inherit from them to tweak the behavior. Primitive types fall into this category: Ints, Chars, Floats, etc., because they really don’t have structure, so there’s no way to carry around a method dispatch table.
Practically every single feature of traditional OO can be traced back to this design choice of conflating structure and behavior, culminating in single dispatch. What makes it worse in single-paradigm languages like Java is that you are forced to think in this one peculiar “single dispatch” way all the time. Among Java’s other limitations, I find this one the worst. It limits the solution design space tremendously and introduces more complexity than might be necessary. It’s also one of the main reasons why ORMs are so terrible. It is impossible to map single dispatch to multi-object behavior in a way that will work for everyone.
Okay, so enough talking about what’s wrong with traditional OO. What is the solution?
Separation of structure and behavior
The biggest contribution of CLOS to the world of object orientation is that it roundly rejects conflating structure and behavior. I’m sure this last statement has got you completely confused/shocked/dazed. Your whole world has always been objects carrying their own behavior around. How is separation even a possibility?
First let me restore some order to your brain. CLOS is similar to traditional OO in the sense that it allows you to define classes with member variables (called slots). Subclasses inherit their slots (i.e., structural properties) from their parent classes (yes, you can have more than one parent), and there are disambiguation rules, as you would expect, for when there are conflicts. These slots are accessed using getters and setters, as you would expect of any self-respecting object system. Subclassing properly entails a subtyping relationship, as you would expect as well.
CLOS instances, however, do not carry a method dispatch table. They define no behavior intrinsic to the object. Behaviors are not subservient to any primary object, and have an independent existence, in the form of generic functions.
Multiple dispatch
A generic function is the name given to a collection of behaviors that are predicated on the types of their arguments. So, for example, I can define methods on a render generic function that work with (Button, ScreenRenderingContext) and with (Button, VoiceRenderingContext). When invoked with arguments, the generic function dispatches on both arguments: it finds the method that is most specific to the types of the arguments and invokes that method.
Notice here that I no longer need ScreenButton and VoiceButton, nor do I require Button to be an abstract class with an undefined render method. In fact, CLOS does not have a concept of abstract classes. Also, because of this structural separation, you can easily extend your generic functions to dispatch on predefined classes/built-in types without any artifices.
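CLOS provides all of this natively. The following Python sketch only emulates the idea so that it can be tried without a Lisp at hand. `GenericFunction` is a hypothetical helper written for this article, and its specificity rule (position in each argument's method resolution order, compared left to right) is a simplified stand-in for the real CLOS rules.

```python
class GenericFunction:
    """Toy generic function: methods are keyed by a tuple of classes."""

    def __init__(self, name):
        self.name = name
        self.methods = {}  # (class, class, ...) -> implementation

    def method(self, *types):
        def register(fn):
            self.methods[types] = fn
            return fn
        return register

    def applicable(self, args):
        ranked = []
        for types, fn in self.methods.items():
            if len(types) != len(args):
                continue
            if all(t in type(a).__mro__ for a, t in zip(args, types)):
                # Distance of each parameter class from the argument's own
                # class; a smaller tuple means a more specific method.
                rank = tuple(type(a).__mro__.index(t)
                             for a, t in zip(args, types))
                ranked.append((rank, fn))
        ranked.sort(key=lambda pair: pair[0])
        return [fn for _, fn in ranked]

    def __call__(self, *args):
        methods = self.applicable(args)
        if not methods:
            raise TypeError(f"no applicable method for {self.name}")
        return methods[0](*args)


class Button:
    def __init__(self, label):
        self.label = label


class ScreenRenderingContext:
    pass


class VoiceRenderingContext:
    pass


render = GenericFunction("render")


@render.method(Button, ScreenRenderingContext)
def render_on_screen(button, context):
    return f"draw a box labelled {button.label}"


@render.method(Button, VoiceRenderingContext)
def render_as_voice(button, context):
    return f"say 'press {button.label}'"


# A built-in type takes part in dispatch without being subclassed.
@render.method(str, ScreenRenderingContext)
def render_text_on_screen(text, context):
    return f"draw the text {text}"


print(render(Button("OK"), ScreenRenderingContext()))
print(render(Button("OK"), VoiceRenderingContext()))
print(render("hello", ScreenRenderingContext()))
```

There is one Button class, no abstract base, and no shared super-type for the two contexts; the pairing lives entirely in the methods registered on `render`.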
In the dispatch, the method that is finally picked is itself made aware of the other applicable methods that are “more general” than the one picked. These more general methods are the equivalents of “super” methods from traditional OO. The most specific method can then invoke them when required (in CLOS, by calling call-next-method), and there is a protocol governing how they are ordered and navigated.
This process is known as method combination and it has some other subtleties (such as before, after and around methods) that allow you to be really flexible about how behaviors can be applied to objects. These are cool, and sometimes thought to be too complex on their own, but they are not really necessary for our discussion here.
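The chaining from a specific method to a more general one can be emulated as follows. This is again a Python approximation under stated assumptions: the `GenericFunction` class is hypothetical, and passing a `call_next` callable as the first parameter of every method is a device of this sketch, standing in for what CLOS offers as call-next-method.

```python
class GenericFunction:
    def __init__(self):
        self.methods = []  # list of (types, implementation)

    def method(self, *types):
        def register(fn):
            self.methods.append((types, fn))
            return fn
        return register

    def __call__(self, *args):
        def rank(types):
            return tuple(type(a).__mro__.index(t)
                         for a, t in zip(args, types))

        applicable = [
            (types, fn) for types, fn in self.methods
            if len(types) == len(args)
            and all(t in type(a).__mro__ for a, t in zip(args, types))
        ]
        if not applicable:
            raise TypeError("no applicable method")
        # Most specific first, most general last.
        applicable.sort(key=lambda entry: rank(entry[0]))
        chain = [fn for _, fn in applicable]

        def invoke(position):
            if position >= len(chain):
                raise TypeError("no next method")
            # Each method receives a callable that runs the next,
            # more general, method with the same arguments.
            return chain[position](lambda: invoke(position + 1), *args)

        return invoke(0)


class Widget:
    pass


class Button(Widget):
    def __init__(self, label):
        self.label = label


class Screen:
    pass


describe = GenericFunction()


@describe.method(Widget, Screen)
def describe_widget(call_next, widget, screen):
    return "a widget on a screen"


@describe.method(Button, Screen)
def describe_button(call_next, button, screen):
    return f"button {button.label!r}, which is " + call_next()


print(describe(Button("OK"), Screen()))
```

The Button method reuses the Widget method without naming it, which is the inheritance-like reuse the text describes.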
The point of all this machinery of method combination, however, is that not only have you achieved the separation of structure and behavior, but you have also retained the benefits of inheritance! You can invoke previously defined methods as and when necessary without having to rewrite them.
Amazing Flexibility
The idea of separating structure from behavior now allows all kinds of cool things.
First, you can add new methods to basic types (which would otherwise be “Final”) in the language and dispatch freely on them. Almost every limitation of traditional OO can be overcome by separating structure and behavior.
Secondly, a generic function is a function! That is, it exists without having to be part of an object. It can be passed as an argument to any higher-order function that knows nothing about objects or what to do with them. This is impossible to do in Java/C++ without passing the primary object associated with the method.
Because of this separation of structure and behavior, CLOS can even support things like updating the class of an existing instance! Before you wonder why one would need such a thing: Lisp has always supported “hot swapping” of modules, and introducing new versions of modules into an already running program might require the instances of old classes to be updated to the new class.
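Python's standard library has a small cousin of this idea, `functools.singledispatch`. It dispatches on the first argument only, so it is not multiple dispatch, but it is enough to illustrate the two points above: behavior can be attached to built-in types from the outside, and the result is a plain function that can be handed to `map`. The `describe` function and its messages are made up for this sketch.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    # Fallback for any type without a more specific method.
    return f"something else: {value!r}"


@describe.register(int)
def _(value):
    return f"the integer {value}"


@describe.register(str)
def _(value):
    return f"the string {value!r}"


# `describe` is an ordinary function object, so a higher-order function
# such as map() can use it without knowing anything about classes.
print(list(map(describe, [42, "hi", 2.5])))
```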
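CLOS has a dedicated operator, change-class, and a protocol around it for this. As a rough analogy only, Python lets you reassign an instance's `__class__` when the two classes have compatible layouts. The `PointV1` and `PointV2` classes and the `upgrade` function below are hypothetical and merely mimic an old and a new version of a class in a running program.

```python
class PointV1:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def describe(self):
        return f"({self.x}, {self.y})"


class PointV2:
    """Hypothetical newer version of the class with an extra field."""

    def __init__(self, x, y, z=0):
        self.x = x
        self.y = y
        self.z = z

    def describe(self):
        return f"({self.x}, {self.y}, {self.z})"


def upgrade(point):
    # Re-point the existing instance at the new class, then fill in
    # the field that the old class did not have.
    point.__class__ = PointV2
    point.z = 0
    return point


p = PointV1(1, 2)
alias = p            # another reference held elsewhere in the program
upgrade(p)
print(alias.describe())
print(isinstance(alias, PointV2))
```

The object identity is preserved, so every existing reference sees the upgraded instance.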
And here’s even more mind boggling coolness …
Metaobject Protocols
In CLOS, all the pieces of the object system are defined in terms of the object system itself. So, for example, a class is an instance of a metaclass, which is an instance of itself! The behavior of a class is defined with generic functions defined upon the metaclass. So, for example, make-instance (the equivalent of Java’s “new”) is a generic function that you can specialize. For example, if you really and truly wanted an abstract class (with no instances ever), you could define a method on make-instance which signals an error if it encounters an abstract class.
This extends to all parts of the object system (including method combination) so that you can incorporate object system semantics into your solution design space. So, for example, generic functions are instances of the generic function class, which has methods defined on it that define how the semantics of method combination work. You are free to create subclasses of generic functions that behave differently from standard generic functions. Arguably this is an advanced maneuver that you are unlikely to invoke willy-nilly, but having it is useful in taming complexity. There is, after all, a reason that Java eventually introduced the Reflection API and even C++ included RTTI (Run-Time Type Information).
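Python's metaclasses are far less ambitious than the CLOS Metaobject Protocol, but they allow a similar trick, so the abstract-class example can be sketched there. In this sketch the metaclass's `__call__` plays the role that make-instance plays in CLOS; the `AbstractAware` metaclass and the `abstract` marker attribute are inventions of this example.

```python
class AbstractAware(type):
    """Hypothetical metaclass that intercepts instance creation."""

    def __call__(cls, *args, **kwargs):
        # Only a class that sets `abstract = True` in its own body is
        # refused; subclasses do not inherit the restriction.
        if cls.__dict__.get("abstract", False):
            raise TypeError(f"{cls.__name__} is abstract")
        return super().__call__(*args, **kwargs)


class Shape(metaclass=AbstractAware):
    abstract = True


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius


print(Circle(2).radius)               # concrete subclass works as usual
print(type(Shape) is AbstractAware)   # a class is an instance of its metaclass
print(type(type) is type)             # the root metaclass is an instance of itself

try:
    Shape()
except TypeError as error:
    print(error)
```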
But is it slow?
I’m sure you’ve been thinking this through all my description of CLOS. This feels like a lot of overhead. As it turns out, compiler writers are a smart bunch. Over the years (decades?) they have developed ways of compiling CLOS so that generic function dispatch reduces to merely a few processor instructions, in most cases. It needs that qualifier because these techniques are based on extensive compile-time analysis and runtime data structures that help optimize the most frequently used cases.
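One of the simplest runtime data structures of this kind is a cache from argument types to the chosen method. The Python sketch below shows only that memoization idea; it makes no claim about what any particular Common Lisp compiler does, and `CachedGeneric` and its `slow_lookups` counter are hypothetical.

```python
class CachedGeneric:
    def __init__(self):
        self.methods = {}      # declared (class, ...) -> implementation
        self.cache = {}        # actual argument classes -> implementation
        self.slow_lookups = 0  # how often the full search had to run

    def method(self, *types):
        def register(fn):
            self.methods[types] = fn
            self.cache.clear()  # a new method may change earlier answers
            return fn
        return register

    def _lookup(self, arg_types):
        self.slow_lookups += 1
        best, best_rank = None, None
        for types, fn in self.methods.items():
            if len(types) != len(arg_types):
                continue
            if all(t in a.__mro__ for a, t in zip(arg_types, types)):
                rank = tuple(a.__mro__.index(t)
                             for a, t in zip(arg_types, types))
                if best_rank is None or rank < best_rank:
                    best, best_rank = fn, rank
        if best is None:
            raise TypeError("no applicable method")
        return best

    def __call__(self, *args):
        key = tuple(type(a) for a in args)
        fn = self.cache.get(key)
        if fn is None:
            # Slow path: search all methods once, then remember the result.
            fn = self._lookup(key)
            self.cache[key] = fn
        return fn(*args)


add = CachedGeneric()


@add.method(int, int)
def add_ints(a, b):
    return a + b


@add.method(str, str)
def add_strs(a, b):
    return a + " " + b


for _ in range(1000):
    add(1, 2)
print(add.slow_lookups)  # 1: only the first call searched the methods
```

After the first call for a given combination of argument types, dispatch costs one dictionary lookup, which hints at why the common case can be made cheap.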
In practice, I used to find Common Lisp performance in most mature implementations to be shockingly fast, sometimes in the same class as C++, and usually better than Java (although Java has improved a lot lately).
Downsides
Everything has downsides, and CLOS is no exception. While I do fundamentally believe multiple dispatch is a much better design tool than single-dispatch, the decades of single-dispatch training that traditional OO developers have gone through leaves them uncomfortable with the “looseness” in the coherence of objects. They love their behaviors to be associated primarily with one object and are willing to pay the price of complexity.
Method combination of CLOS in its full glory is actually more complex than I have made it out to be. While it primarily exists to restore the inheritance properties found in single-dispatch, its complexity stems from design-by-committee. There exists a simpler version of it that is not as complex to understand. On the other hand, however, even with single dispatch, allowing methods to interact with each other requires a “super” mechanism which can be fairly complex especially when multiple inheritance is involved.
And finally, the Metaobject Protocol may be too much power. As cool as it is for Programming Language nerds like myself to see properly circular reflection in a real language, it’s a tool that needs harnessing. Most people don’t even begin to understand it, let alone use it effectively. Work on Metaobject Protocols eventually evolved into Aspect-Oriented Programming (work that I was involved in myself) that has found its spot in the Java world in the form of various application frameworks and AspectJ.
Conclusion
So there you have it. The finest object system in the world, and the only one of its kind. Moreover, it has been around since the 1980s in one form or another, although it only came together in its final form circa 1990. Thankfully, recent languages like Julia have adopted multiple dispatch, which makes an amazing difference in their expressiveness, and hopefully they will be able to carry some of these elegant ideas forward.
If you should be so inclined and decide to build a new programming language, please look at CLOS’s separation of structure and behavior for inspiration. I bet you’ll come up with a language that will be far easier to use, much more flexible, and cooler than anything else out there.