Premature cliches are the root of all evil (and considered harmful)

Did you know that you can use wildcards in Google searches? Searching for ”premature * is the root of all evil” yields a lot of people quoting Knuth in an almost fanboy-like fashion (though it seems that Tony Hoare is the actual source of the quote). I’d like to nominate ”Premature optimization is the root of all evil”, together with ”Goto considered harmful”, as the most tired and overrated quotes in all of computer science.

Don’t get me wrong, these quotes made a lot of sense thirty, or even ten, years ago, but now that the ideas they illustrated have been absorbed into the programming community, they do more harm than good. For example, Dijkstra’s paper titled ”Goto considered harmful” argued against the practices found in ALGOL and similar environments 36 years ago. Most common languages at the time had weaker support for structured programming, and the general awareness about structuring code in the programming community was low, leading to ”spaghetti code” with gotos all over the place, making it hard to figure out how a program actually worked. At that time, the condemnation of goto made sense.

Well, a lot has happened since then, both language-wise and in programmers’ attitudes towards the systems they’re building. ”Software ICs”, object-orientation, components, SOA… all leading to environments and programs that can grow quite large without collapsing under the weight of their own complexity (some say that we’ve merely exchanged the problems of ”spaghetti code” for those of ”ravioli code”, but that’s another discussion). When a programmer nowadays considers using ”goto”, it’s usually in a situation where it actually makes sense (breaking out of nested loops, for example). Keeping the mantra ”Goto considered harmful” alive in the collective programming consciousness actually makes things worse, as it discourages people from using the construct where it would make the program easier to read.
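
To make the nested-loop case concrete, here is a minimal C++ sketch (the grid-search scenario is hypothetical, not from the original post) where a single goto replaces the flag variables or helper functions that a strict no-goto rule would force on you:

```cpp
#include <cstdio>

int main() {
    // Hypothetical 2-D search: find the first cell holding the value 42.
    int grid[4][4] = {{0}};
    grid[2][3] = 42;

    int found_row = -1, found_col = -1;
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            if (grid[row][col] == 42) {
                found_row = row;
                found_col = col;
                goto done;  // break out of both loops at once
            }
        }
    }
done:
    if (found_row >= 0)
        std::printf("found at (%d, %d)\n", found_row, found_col);
    return 0;
}
```

Languages designed after this debate, such as Java and Go, cover the same need with labeled break statements instead.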

The quote ”Premature optimization is the root of all evil” is more of a ”timeless” truth, but of course it’s basically a truism: premature anything is, by definition, never a good thing. The quote is rooted in a time when optimizing code would make it harder to read and maintain, and it is really aimed at those micro-optimizations that make code harder to read (explicit loop unrolling, contrived program flow, unnecessary use of bit-shifting and bitfields) without providing a huge performance benefit. However, this quote has been repeated so often, and often in an authoritative voice, that new programmers think there’s something wrong with thinking about performance as you design and write the initial version of your program, fearing that it will lead to a worse design and a program that’s harder to maintain.
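
As a contrived illustration (my own C++ sketch, not code from the post), here is the kind of micro-optimization the quote warns against, next to the straightforward version that a modern compiler will typically handle just as well:

```cpp
#include <cstddef>

// "Clever" version: shifts and manual unrolling obscure a simple intent.
long sum_doubled_unrolled(const int* a, std::size_t n) {
    long sum = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {                  // hand-unrolled by four
        sum += (long)a[i]     << 1;
        sum += (long)a[i + 1] << 1;
        sum += (long)a[i + 2] << 1;
        sum += (long)a[i + 3] << 1;
    }
    for (; i < n; ++i) sum += (long)a[i] << 1;    // leftover elements
    return sum;
}

// Straightforward version: the compiler is free to unroll and
// strength-reduce this on its own.
long sum_doubled(const int* a, std::size_t n) {
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += 2L * a[i];
    return sum;
}
```

Both functions compute the same sum, but only one of them tells the reader what it is doing at a glance.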

These fears do not match my experience. I think it makes sense to think about performance throughout the design phase, particularly to consider which pieces of code will be run often (I seem to recall statistics saying that 98% of a program’s running time is spent in 2% of its code) and what is needed to get those parts to run really fast. Of course, I don’t consider these optimizations to be premature.

I’ve seen cases where architectural decisions were made early on, without measuring performance, leading to a system that was not only slow, but basically impossible to optimize without rearchitecting. Because these systems were built without much thought about performance, by the time the performance problems surfaced it was basically too late. In my experience, such scenarios are far more common than systems where optimization has made the code hard to read and maintain.

However, there are other things that are often done prematurely and that, in my experience, are a much larger source of evil: abstraction and/or generalization, things that are generally considered good. Ambitious new (or even experienced) programmers will often design overly generalized systems without being familiar enough with the problem domain to know which generalizations make sense. Similarly, they will create abstractions through elaborate class hierarchies and interfaces with the intention of making the individual pieces of the system easier to understand, without realising that the abstractions get in the way of understanding the system as a whole.

Code that has been obfuscated by premature optimization attempts can usually be made more readable piece by piece, but an overengineered system with sophisticated class hierarchies that turned out not to match the problem domain is much harder to refactor into something usable. I wholeheartedly agree with the page on the original wiki that states:

It’s probably important not to create an abstract base class or an interface until you can think of at least two descendants or implementations — that you are actually going to use right now
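
To illustrate that rule with a hypothetical example of my own (a ReportFormatter in C++, not code from the wiki), compare a speculative interface-plus-factory design with the concrete function the rule would have you write first:

```cpp
#include <iostream>
#include <memory>
#include <string>

// Speculative version: an interface, a factory, and a single concrete
// class, all for one report format nobody has asked to vary yet.
class ReportFormatter {
public:
    virtual ~ReportFormatter() = default;
    virtual std::string format(const std::string& body) const = 0;
};

class PlainTextFormatter : public ReportFormatter {
public:
    std::string format(const std::string& body) const override {
        return "REPORT\n------\n" + body + "\n";
    }
};

std::unique_ptr<ReportFormatter> makeFormatter() {
    return std::make_unique<PlainTextFormatter>();  // only one choice exists
}

// What the rule suggests instead: stay concrete until a second real
// implementation forces the abstraction into existence.
std::string formatReport(const std::string& body) {
    return "REPORT\n------\n" + body + "\n";
}

int main() {
    std::cout << makeFormatter()->format("quarterly numbers");
    std::cout << formatReport("quarterly numbers");
}
```

The abstract version isn’t wrong as such, it’s just paid for before anyone knows whether a second formatter will ever exist.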

When reading and trying to understand other people’s code, I find that it is much more common for it to be over-abstracted than under-abstracted. Maybe we need new quotes/cliches/memes to warn us about where the real risks actually are now, as opposed to where they were thirty years ago.

Next time, I’ll take apart the Fred Brooks quote ”Adding manpower to a late software project makes it later” 🙂