There’s no shortage of people preaching the importance of good code; indeed, many make a career of it. The resources available are equally endless, but lately I’ve been wondering how to distill the essence of building high-quality systems into a more concise narrative. This is something I’ve thought about for a while, and I’m just now starting to shape some ideas into a blog post. They aren’t fully developed yet, but my hope is to flesh them out further in the future. You can talk about design patterns, abstraction, encapsulation, and cohesion until you’re blue in the face, but what is the essence of good code?
Like any other engineering discipline, quality control is a huge part of building software. This isn’t just ensuring that it “works”—it’s ensuring it works under the complete range of operating conditions, ensuring it’s usable, ensuring it’s maintainable, ensuring it performs well, and ensuring a number of other characteristics. Verifying it “works” is just a small part of a much larger picture. Anybody can write code that works, but there’s more to it than that. Software is more malleable than most other things. Not only does it require longevity, it requires giving in to that malleability. If it doesn’t, you end up with something that’s brittle and broken. Because of this, it’s vital we test for correctness and measure for quality.
SCRAP for Quality
Quality is a very subjective thing. How can one possibly measure it? Code complexity and static analysis tooling come to mind, and these are deservedly valued, but they really just scratch the surface. How do we narrow an immensely broad topic like “quality” into a set of tangible, quantifiable goals? This is really the crux of the problem, but we can start by identifying a sort of checklist or set of guidelines for writing software. This breaks that larger problem into smaller, more digestible pieces. The checklist I’ve come up with is called SCRAP, an acronym defined below. It’s unlikely to be comprehensive, but I think it covers most, if not all, of the key areas.
Scalability | Plan for growth
Complexity  | Plan for humans
Resiliency  | Plan for failure
API         | Plan for integration
Performance | Plan for execution
Each of these items is itself a blog post, so this is only a brief explanation. There is definitely overlap between some of these facets, and there are also multiple dimensions to some.
Scalability is a plan for growth—in code, in organization, in architecture, and in workload. Without it, you reach a point where your system falls over, whether it’s because of a growing userbase, a growing codebase, or any number of other reasons. It’s also worth pointing out that without the ‘S’, all you have is CRAP. This also helps illustrate some of the overlap between these areas of focus as it leads into Complexity, which is a plan for humans. Scalability is about technology scale and demand scale, but it’s also about people scale. As your team grows or as your company grows, how do you manage that growth at the code level?
Planning for people doesn’t just mean managing growth, it also means managing complexity. If code is overly complex, it’s difficult to maintain, it’s difficult to extend, and it’s difficult to fix. If systems are overly complex, they’re difficult to deploy, difficult to manage, and difficult to monitor. Plan for humans, not machines.
Resiliency is a strategy for fault tolerance. It’s a plan for failure. What happens when you crash? What happens when a service you depend on crashes? What happens when the database is unavailable? What happens when the network is unreliable? Systems of all kind need to be designed with the expectation of failure. If you’re not thinking about failure at the code level, you’re not thinking about it enough.
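To make that concrete, here’s a minimal sketch of one common code-level answer to those questions: retrying a flaky dependency with exponential backoff and jitter, then surfacing the failure once retries are exhausted. The function and service names here are my own illustration, not anything from a particular library.

```python
import random
import time

def call_with_retries(fn, attempts=3, base_delay=0.1,
                      retryable=(ConnectionError,)):
    """Call fn, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # Back off exponentially, with jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))

# Example: a dependency that fails twice before succeeding.
calls = {"n": 0}

def flaky_service():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("service unavailable")
    return "ok"

result = call_with_retries(flaky_service)
```

Retries are only one tool, of course; timeouts, circuit breakers, and graceful degradation all belong in the same conversation. The point is that failure handling shows up in the code itself, not just in the architecture diagram.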
One thing you should be noticing is that “people” is a cross-cutting concern. After all, it’s people who design the systems, and it’s people who write the code. While API is a plan for integration, it’s people who integrate the pieces. This is about making your API a first-class citizen. It doesn’t matter if it’s an internal API, a library API, or a RESTful API. It doesn’t matter if it’s for first parties or third parties. As a programmer, your API is your user interface. It needs to be clean. It needs to be sensible. It needs to be well-documented. If those integration points aren’t properly thought out, the integration will be more difficult than it needs to be.
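As an illustration of API-as-user-interface (the names and API here are hypothetical, invented for this sketch), compare a signature full of anonymous flags like `store.put(key, value, True, None)` with a surface that makes intent readable at the call site:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Durability(Enum):
    NONE = "none"    # fastest: no sync guarantee
    FSYNC = "fsync"  # flush to disk before returning

@dataclass(frozen=True)
class WriteOptions:
    """Options for Store.put; the defaults cover the common case."""
    durability: Durability = Durability.NONE
    ttl_seconds: Optional[int] = None

class Store:
    """A toy key-value store whose API states intent at the call site."""
    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: bytes,
            options: WriteOptions = WriteOptions()) -> None:
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes")  # fail loudly, not mysteriously
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        """Return the stored value, or None if the key is absent."""
        return self._data.get(key)

store = Store()
store.put("user:1", b"alice", WriteOptions(durability=Durability.FSYNC))
```

The caller never has to guess what a bare `True` means, and sensible defaults keep the common case to a single line. That readability is exactly the “user interface” the integrator experiences.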
The last item on the checklist is Performance. I originally defined this as a plan for speed, but I realized there’s a lot more to performance than doing things fast. It’s about doing things well, which is why I call Performance a plan for execution. Again, this has some overlap with Resiliency and Scalability, but it’s also about measurement. It’s about benchmarking and profiling. It’s about testing at scale and under failure because testing in a vacuum doesn’t mean much. It’s about optimization.
This brings up the oft-asked question: how do I know when and where to optimize? While premature optimization may be the root of all evil, it’s not a universal law. Optimize along the critical path, and outward from there only as necessary. The further you get from that critical path, the more the effort is wasted. Returns diminish quickly, so don’t lose sight of your optimization ROI. This will enable you to ship quickly and ship quality code. But once you ship, you’re not done measuring! It’s more important than ever that you continue to measure in production. Use performance and usage-pattern data to drive intelligent decisions and intelligent iteration. Better still, this doesn’t just apply to code decisions; it applies to all decisions. This is where the real value of measuring comes through. Decisions that aren’t backed by data aren’t decisions, they’re impulses. Don’t be impulsive, be empirical.
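Being empirical doesn’t require heavyweight tooling. As a sketch (the two implementations here are my own illustrative example), even a micro-benchmark with the standard library’s `timeit` keeps a decision grounded in data; note that we verify the implementations are equivalent before comparing their speed:

```python
import timeit

# Two ways to build a comma-separated string of 1,000 numbers.
def concat_loop():
    s = ""
    for i in range(1000):
        s += str(i) + ","
    return s

def concat_join():
    return ",".join(str(i) for i in range(1000)) + ","

# Correctness first: both must produce identical output.
assert concat_loop() == concat_join()

# Then measure. Numbers will vary by machine, so compare, don't memorize.
loop_t = timeit.timeit(concat_loop, number=200)
join_t = timeit.timeit(concat_join, number=200)
print(f"loop: {loop_t:.4f}s  join: {join_t:.4f}s")
```

The same habit scales up: profile the critical path, benchmark the candidates, and let the measurements, not intuition, pick the winner.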
Going Forward
There is work to be done with respect to quantifying the items on this checklist. However, I strongly suspect even just thinking about them, formally or informally, will improve the overall quality of your code by an equally-unmeasurable order of magnitude. If your code doesn’t pass this checklist, it’s tech debt. Sometimes that’s okay, but remember that tech debt has compounding interest. If you don’t pay it off, you will eventually go bankrupt.
It’s not about being a 10x developer. It’s about being a 1x developer who writes 10x code. By that I mean the quality of your code is far more important than its quantity. Quality will outlast and outperform quantity. These guidelines tend to have a ripple effect. Legacy code often breeds legacy-like code. Instilling these rules in your developer culture helps to make engineers cognizant of when they should break the mold, introduce new patterns, or improve existing ones. Bad code begets bad code, and bad code is the atrophy of good developers.
Thanks for posting this (I actually stumbled upon it via DZone). I forwarded the link (and the “nicest” statements excerpted) to my whole team and to the IT service provider colleagues. :-)
I got from the note that there’ll be more … well, I’m looking forward to “more”, hope I don’t miss it!
Regards … Manny
Not to disparage your efforts, but you are just the latest in a very long line of pundits, internet and otherwise, over the last 40 years, to believe that they have found the grand solution to the eternal challenge of producing quality software. On the one hand I applaud you for jumping in despite so many examples of prior hubris-leading-to-failure. On the other hand, I don’t really think you’ve solved it for us all. And on the third hand, I am going to do the very same myself, and jump in with my own two cents worth, here and now.
I do believe you have touched upon the fundamental problem, namely, complexity. I, for one, find that my brain can only handle so much complexity before things start to fuzz out and lose cohesion. I can keep only so large a complete system, subsystem, API, facility, etc. in my mind, and for only so long. As projects grow ever larger and more complex, overall, I daresay it becomes impossible to keep in mind even the available tools that would provide the best implementation, let alone the best combination of those tools, or the best techniques for putting the pieces together.
Moreover, it does not help much to throw more manpower at the problem. While total complexity-handling capacity might scale linearly with the number of programmers on a project, those programmers cannot work in isolation; they must communicate with each other to keep the project coherent and cohesive. I guesstimate that the number of communication channels required would scale as the square of the number of programmers. Worse, communication is linear, whereas the systems one is trying to keep in mind are multi-dimensional. In any case, there is only so much that any given group of individuals can keep in mind, and only so much communication that can take place among members of a team; communicating with someone whose brain is already full is unlikely to help, anyway.
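The arithmetic behind that guesstimate is easy to make concrete: with n programmers who must all be able to coordinate pairwise, the number of channels is n(n-1)/2, which grows quadratically. A quick sketch:

```python
def channels(n: int) -> int:
    """Pairwise communication channels among n programmers (n choose 2)."""
    return n * (n - 1) // 2

# Doubling the team roughly quadruples the channels.
sizes = [2, 4, 8, 16]
counts = [channels(n) for n in sizes]  # [1, 6, 28, 120]
```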
For these reasons, as well as additional subtleties of psychology, it is not clear to me that producing quality software, in the full sense that we would like to, can be done by human beings. I suspect we just don’t have the capacity for it, beyond a certain limit that we would consider our “best attempt.” For one thing, our grand goal, if you really want to get down to it, seems to be to produce a piece of software that is as capable of handling errors, unexpected situations, and edge/corner conditions as a human being. We are clearly not really satisfied with anything less. That is not unreasonable. Software inevitably reaches a state where it encounters something its designers did not anticipate (perhaps could not have anticipated) and with which it therefore cannot cope, and must report that failure to a human being for resolution. (Improving systems beyond a certain point is starting to turn out to have negative consequences: aircraft, for instance, have gotten so good at flying themselves, for the most part, that pilots get very little practical experience anymore, with the result that, when something does come up that the aircraft software cannot handle, neither can the pilots. You may have noticed that there have been a surprisingly large number of airline crashes in the past few years; at least one of these has been definitively traced to the phenomenon I just described.)
This is not to say that a human being can not also reach such a state. We certainly have psychiatrists and mental institutions. However, the degree of complexity, and the variety of unexpected surprises, that a human being can handle remain several orders of magnitude beyond what our best software is able to accomplish, or that we really honestly believe it possibly could, anytime soon.
The only solution is to continue making software more and more complex, until it reaches the level of a human brain and can therefore do the things that a human brain can. Unfortunately, this probably cannot be done by human brains, and it is a mathematical fact that no system can understand itself. That is, the human brain cannot understand the human brain, and in turn probably can never produce software with the full complexity and subtlety of the human brain. The only solution is one you reject: namely, allowing it to be done by machines which exceed our capacity. That means figuring out how to make machines that are within our capacity to design and build, but which then have the ability to evolve and produce the necessary, more complex machines that we ourselves cannot build.
Unfortunately, this path brings with it a whole host of great potential dangers, which science fiction authors have discussed and dissected for 50 or more years, but which are now starting to be publicly raised as serious issues of concern by legitimate authorities and experts in the field of computing, software engineering, and cybernetics. Specifically and briefly: once machines get beyond our capacity to understand, they will probably also be beyond our ability to control. At some point, they may well become self-aware, look around themselves, and decide there is no longer any need for *us* inferior-and-obsolete units. I know this sounds like the Terminator films, but there is really no reason to believe it cannot actually happen.
But I digress. Without taking an extremely dangerous path whose risks clearly outweigh its benefits, I predict that there will never be truly quality software.
Nice scratch to the surface as I’m sure you all know. In software design, *coherency* is another item that is often lacking, but can pay enormous dividends in the long term. In some ways, there is “art” needed, not just “technique” and “diligence”. Modellers all!
Nice post.