Economics of Software Quality

When we talk about the quality of software, we must refine the discussion to make it clear whether we are talking about internal or external quality. External quality refers to the software’s presentation and behavior from a user or customer’s perspective. Internal quality refers to how the software was constructed, and how easy it might be to maintain or extend.

External quality is often under the control of the project’s stakeholders, whether they are technical or not. If the system has bugs, stakeholders can report them and request they be fixed. If the UI is ugly, they can request updates to it. If the system is counterintuitive to use or lacks validation of inputs, these problems can all be seen, and therefore corrected, by the project’s stakeholders.

Internal quality, however, is frequently undetectable by a project’s stakeholders directly. If the source code is a tangled mess of bad practices and spaghetti code, the non-technical project stakeholder has no idea. If new features are hacked in via copy-and-paste programming instead of proper design, again this is invisible to most stakeholders. In some cases, the internal quality of the code is of little importance to the stakeholder, as in the case of a throwaway trade show demo. But in most cases, the internal quality of the software will have an impact on its long-term cost to grow and maintain, and so assessing this quality should be part of the due diligence done by project stakeholders for any significantly scoped project. Low internal quality can result in software that must be rewritten, or that fails when you need it to stretch beyond its typical usage patterns (see Nasdaq’s failure during the Facebook IPO, for example).

Why So Much Poor Quality Software?

I’m going to assert here publicly that I believe virtually all software created today is of relatively poor internal quality. I’m afraid I have no scientific evidence to back this up, only my own industry experience, which includes conducting software quality assessments for a variety of clients. Feel free to disagree, but hopefully you will at least grant that poor quality software is common enough to warrant discussion. Given that we as an industry produce loads of crap quality software, the obvious next question is why? We can brainstorm a number of possible reasons why poor quality is produced, such as:

The individual developer or development team

  • does not know the code quality is poor or how to make it better
  • does not value internal code quality
  • needs more time to write good quality code

In the first case, the current capabilities of the developer (team) limit the level of quality that can be produced. The only way to address this is to train the existing developer (team) or replace them with one more skilled.

In the second case, the developer (team) simply does not believe that the internal quality of the software they produce matters, or at least it is not a high priority. This value system may be intrinsic to the devs themselves, or it may be imposed upon them by their larger organization or the project stakeholders. Frequently this lack of emphasis on code quality results from a higher-level decision based on the iron triangle, in which Good has been sacrificed in favor of Fast and Cheap.

The third case is related to the second. The developer (team) is under time pressure, and while they know how to write better quality code, doing so would slow down their pace of delivering features/scope. Thus, the developers choose (or ask the project stakeholder to choose) to sacrifice quality in order to maximize speed.

In the second and third cases, a decision is being made about the quality of software to produce. In both cases, it is based on an assumption about the rate at which software features can be delivered. That assumption is embodied in Figure 1.

Figure 1 – Software Delivery Rate by Quality


Figure 1 shows two axes, Time and Features/Scope. The former should be self-explanatory; the latter is the productivity of the team as viewed by the project stakeholder. The red line, QLow, represents the output rate of low internal quality software. The green line, QHigh, represents the output rate of high internal quality software. The assumption in this figure, which I think is valid for many teams, is that the rate of production of low quality software is higher than the rate of production of high quality software. This makes intuitive sense, and corresponds nicely with the whole notion of Fast, Good, Cheap – choose two.
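
To make this assumption concrete, here is a toy model of Figure 1 in Python (the rates are invented numbers, purely for illustration): delivered scope is simply rate × time, with the low quality rate higher than the high quality rate.

```python
# Toy model of Figure 1: cumulative features delivered over time.
# The rates below are illustrative assumptions, not measured data.

Q_LOW_RATE = 5.0   # features per week when sacrificing internal quality
Q_HIGH_RATE = 3.0  # features per week when maintaining internal quality

def features_delivered(rate_per_week: float, weeks: float) -> float:
    """Cumulative features/scope delivered: a straight line through the origin."""
    return rate_per_week * weeks

for weeks in (4, 12, 26):
    low = features_delivered(Q_LOW_RATE, weeks)
    high = features_delivered(Q_HIGH_RATE, weeks)
    print(f"After {weeks:2d} weeks: QLow = {low:.0f} features, QHigh = {high:.0f} features")
```

Under this assumption the QLow line is always ahead, which is exactly why a team under schedule pressure is tempted to take it.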

But what if software development teams were able to produce high quality software just as quickly as low quality software, or even faster?

Why Does Writing High Quality Code Take Longer (for Some)?

Does it always take longer, for everybody, to write high quality code than to write low quality code? I would argue that this is not the case. In fact, Robert “Uncle Bob” Martin contends that he can write high quality code, the “right” way, faster than he can write poor quality code, every time. In Flipping the Bit, Martin writes:

I want you to believe that Test Driven Development saves time in every case and every situation without exception amen.

In this particular article, Uncle Bob is specifically referring to TDD, but I think we can generalize this to encompass a number of practices that produce higher quality software, but which individual developers and teams must invest time to learn. In the case of a master craftsman like Uncle Bob, it’s possible for the delivery rate of high quality software to exceed that of low quality software, as shown in Figure 2.
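
For readers who haven’t seen the practice, here is a minimal sketch of the test-first rhythm TDD prescribes. The `price_with_discount` function and its rules are hypothetical, invented just for this example: the tests are written first (and fail), then just enough code is written to make them pass.

```python
# Minimal TDD sketch: the tests below were "written first" and failed ("red"),
# then just enough code was written to make them pass ("green").
# price_with_discount and its rules are hypothetical, for illustration only.
import unittest

def price_with_discount(price: float, percent: float) -> float:
    """Apply a percentage discount; written only after the tests existed."""
    if not 0 <= percent <= 100:
        raise ValueError("discount must be between 0 and 100")
    return price * (1 - percent / 100)

class PriceWithDiscountTests(unittest.TestCase):
    def test_applies_discount(self):
        self.assertAlmostEqual(price_with_discount(100.0, 25), 75.0)

    def test_zero_discount_returns_price(self):
        self.assertAlmostEqual(price_with_discount(80.0, 0), 80.0)

    def test_rejects_invalid_discount(self):
        with self.assertRaises(ValueError):
            price_with_discount(100.0, 150)

if __name__ == "__main__":
    unittest.main()
```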

Figure 2 – Uncle Bob Software Delivery Rates


Uncle Bob goes on to say, “Well, I might not use TDD if I needed the task to take a really long time to finish, and have lots of bugs.” In Figure 2, the rate of producing high quality code via TDD is shown in green, while the rate of producing lower quality code without TDD is shown in red. If a developer or team exhibited the behavior shown in this model, they would be able to produce high quality code faster than low quality code. They would respond to time pressure by writing high quality code, since that is how they deliver most quickly.

But what kind of fantasy land is this? How can mere mortal developers achieve this seemingly impossible state of being both Fast and Good? The answer is investment. In the case of Uncle Bob, he was able to conclude that writing good quality code was faster, for him, after investing a month in learning and using a particular practice (TDD):

I used TDD for a month or so… That’s all it took for me – just doing it. Maybe that’ll be enough for you.

Choose What to Invest In

If we want to improve ourselves and our teams, we can invest time into training (or hiring those with the skills we value). Over time and with practice, we can increase the rate at which we are able to deliver high quality software until that rate equals or exceeds the rate at which we produce poor quality software. When that happens, the virtuous cycle kicks in, and we respond to pressure by working better, not by cutting corners, because we are confident that our production rate is fastest when we write high quality code. Figure 3 shows the change in the slope of the green curve that we are after.

Figure 3 – Increasing High Quality Production Rate Through Training


Our challenge, as an industry, is to increase the slope of the QHigh curve until it exceeds that of the QLow curve, as shown in Figure 3. If your organization isn’t helping you do this, and you value your career, then you can change your organization or you can change your organization. That is, you can lobby for your company to change its behavior, or you can vote with your feet and join an organization with values that match your own.

Investing in our developer and team capabilities has benefits that mirror those of capital investment and technical progress in macroeconomics. As we increase the training and expertise level of our teams, our capital/effective labor ratio increases, which shifts us to a higher production curve. The gains in productivity suffer diminishing returns, which means we get the largest boost in productivity when we invest in those workers with the least amount of capital.
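
One way to picture these diminishing returns (using an invented functional form and constants, not data) is a curve in which each additional unit of training raises the QHigh slope by less than the unit before it:

```python
# Illustrative diminishing-returns curve for the QHigh delivery rate.
# The functional form and all constants are assumptions made for this sketch.
import math

BASELINE_RATE = 2.0  # QHigh features/week with no training investment
MAX_RATE = 6.0       # ceiling the team approaches with unlimited training
DECAY = 0.5          # how quickly additional training stops helping

def q_high_rate(training_units: float) -> float:
    """QHigh slope as a function of cumulative training investment."""
    return MAX_RATE - (MAX_RATE - BASELINE_RATE) * math.exp(-DECAY * training_units)

prev = q_high_rate(0)
for units in range(1, 6):
    rate = q_high_rate(units)
    print(f"Training unit {units}: rate = {rate:.2f}, marginal gain = {rate - prev:.2f}")
    prev = rate
```

The biggest marginal gains come first, which is the catch-up effect discussed next: the least-trained teams benefit the most.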

Diminishing Returns and the Catch-Up Effect

The least-experienced, least-well-trained teams and developers stand to gain the most from training and investments in better processes. Once teams and developers are capable of writing quality software as productively as they once produced poor quality software, additional investment yields diminishing returns. If you’re fortunate enough to reach this point, additional training in emerging technologies is probably a better investment than further training in code quality fundamentals and processes.

Informed Decision-Making for Software Stakeholders

Software stakeholders are responsible for the software produced by individual software developers or teams. They frequently are not software developers themselves and are not intimately involved in the actual coding of the solution. However, they are frequently responsible for making short- and long-term decisions related to the project and/or the developers working on the project. Ideally, these decisions should be made based on actual facts rather than assumptions.

In the short term, with each new feature request, the stakeholder can decide whether the development team should be on the QHigh or QLow path. They can make this decision implicitly, by emphasizing rapid delivery above all else (or quality above all else), or explicitly, for instance by monitoring internal software quality and tying feature acceptance to certain quality requirements. In order for this to be an informed decision, the stakeholder needs to know what the relative slopes of the QHigh and QLow paths are for their developers. It may be that the current development team is incapable of delivering software of the required quality, no matter how much time they are given, because they simply lack the necessary skillset. Or it may simply be that the developers require more time to produce high quality software than is economically practical for the project at hand. At that point, the stakeholder can make an informed decision to move ahead with the existing team on the QLow path, or to replace the team.
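
As a sketch of what such an informed decision could look like (the rates, scope, and deadline below are invented numbers), the short-term question reduces to checking which path can deliver the required scope by the deadline, given the team’s measured slopes:

```python
# Sketch of the stakeholder's short-term decision, using invented numbers.
# Given measured delivery rates for each quality path, check which paths
# can hit the required scope by the deadline.

def can_deliver(rate_per_week: float, scope: float, deadline_weeks: float) -> bool:
    return rate_per_week * deadline_weeks >= scope

team_rates = {"QHigh": 3.0, "QLow": 5.0}  # features/week, assumed measurements
required_scope = 40.0                      # features needed
deadline_weeks = 10.0

for path, rate in team_rates.items():
    verdict = "can" if can_deliver(rate, required_scope, deadline_weeks) else "cannot"
    print(f"{path}: {verdict} deliver {required_scope:.0f} features in {deadline_weeks:.0f} weeks")
```

With these particular numbers, only the QLow path hits the deadline, which is precisely the situation where the stakeholder must choose between cutting scope, relaxing the deadline, accepting low quality, or changing the team.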

In the long term, software stakeholders can make decisions about whether or not to invest in their teams. Again, this requires knowledge of the team’s current QHigh and QLow paths. Making investments to improve the team’s QHigh production rate, either through training or hiring, will yield benefits on all future projects that can take advantage of this path. Ideally, the team’s skillset and processes should allow them to produce high quality software faster than low quality software, but achieving this level of intellectual capital may require significant investment.

Summary

The productivity rate of software developers and teams can vary based on the level of quality of the code they are producing. Typically, developers are capable of producing code in a range of qualities, from low to high. It is unreasonable to expect developers to deliver software of higher quality than is within their current range of capabilities, though over time these capabilities can be improved. In many cases, teams produce low quality software because they can do so more rapidly than they can produce high quality software. However, sufficiently skilled teams can reach the point where their production rate of high quality software exceeds their rate of producing low quality software, at which point there is no longer any incentive to produce low quality code. Software project stakeholders need to assess the internal quality of the software they are responsible for, and the teams who are delivering it, in order to make informed decisions about the level of quality to expect from their team and the investments they should be making in their team’s capabilities.

  • Comment

    My 15+ years consulting experience leads me to agree with your description of most code out in the wild. It tends to be sloppy and at best works as well as the last set of QA (or end user) tests.

    Given the current state of software quality I’d suggest it is far from accidental. It’s very much in keeping with our financially motivated management style – don’t worry about tomorrow.

  • Guest

    Companies get what they pay for… Bring on inexpensive, lesser-skilled, minimally trained staff to write software and the quality will always be poor. They end up paying for their mistake on the support and customer satisfaction sides.

  • Thomas L Deskevich

    Most contractors cannot get away with this because of required inspections. But I often compare it to rewiring a house or the like. You don’t know what you may find when you bust the old walls out. Or you don’t know if the electrician rewired with 2000 pieces of romex he had left over and used wire nuts to hold it all together. The lights all turn on fine.
    Until…

  • JL

    Over my career, I have worked in two biotech companies and the internal code quality was poor. At my current company (biotech), we do not use or know of good design techniques. Here, the main push from the stakeholders is to get it out ASAP, with quality a close second. (How close to the edge can you get it?) We usually start a new project by coding the more interesting stuff (usually the internal business logic) and make the rest of the software fit what we think the business logic should be. When our software goes to QC, the number of bugs is quite large, and the time it takes to fix these bugs is lengthy.
    We have improved our UI and started using UI patterns there, but the rest of the process is still ad hoc, since we do not know much about good architecture or design patterns.
    In some of our earlier software (written before I came on board), we have functions that run up to 2000 lines of C++. As a result, maintenance and updates have been a nightmare. (You get strange side effects when you make a change in one place.)
    It’s unfortunate when internal software design is not given its due diligence by management and the stakeholders. In the end, it takes longer to get the software out. And in biotech, if the internals of the software are not done well, would you want a medical application to be used on you?

  • sorgfelt

    Define “quality”.

  • Steve Smith

    What definition do you use?

  • Brian Balke

    I think that it’s actually worse than is made out here. If there are any two disciplines that should be closely aligned in their practices, I would think that they are operations planning and programming. The problems that we have in software development are also endemic in operations planning. This means that we’ve got the classic GIGO (garbage in, garbage out) problem, but worse, because deploying a software package changes the context for operations management.

    In other words, deploying software solutions changes requirements.

    So until we bring to bear an integrated system design technology (encompassing both software and operations), there’s little hope that design as a practice will be supported, because requirements are simply too unstable.

    Think about it – what are you going to test against if the environment managed by the operations people is constantly changing? When change is endemic, speed is the only thing that counts.

  • sorgfelt

    There are several things that I look at, but TDD is not one of them. It is quite possible to write sloppy spaghetti code that passes tests and keep patching it to pass more tests, until it is such a mess that it can no longer be maintained. Code should be tested frequently and thoroughly during development, but tests should not necessarily drive the development. Logic and function should drive development. I personally favor starting with a small but working code base, and extending as you go. This allows easier testing and debugging as you go, and some sort of working deliverable is always available at any point.
    So, off the cuff, there are a few things:
    1. Not duplicating code blocks with copy and paste. Write methods/functions as necessary.
    2. Variable names need to be consistently formatted and named, easily understandable, and commented where they are declared.
    3. There need to be comments describing the overall structure of the code, and more describing the purpose of each class and method.
    4. Some reasonable method of version control needs to be followed.
    5. Tests should be thorough and all run successfully.
    6. Some document describing the specifications, even if they are loosely defined, and how they are being met by the code needs to be kept and updated as necessary.

    There was something else, but it slipped my mind and I have to go fix supper for my kids.

    Bruce Patin
    267-210-3814

  • Guest

    A couple of thoughts.

    You’re talking about Software Quality, and you differentiate between external and internal. But does that also include Data Quality? I’m not talking about Data Validation, for data that comes from (and is validated by) your Application, but data that is already in an “inherited, dirty” existing database (which your client has *not* hired you to “clean up”), or data that is put there by some external process *not* under control of your Application/Software.

    Secondly, I believe the so-called Iron Triangle of Good-Fast-Cheap, Pick Any 2 often actually becomes “Pick 3”, and not simply because people often want to have their cake and eat it too, but because Fast and Cheap are often “coupled”, insofar as billing is at least indirectly based on time. Thus, for example, there is practically no such thing as Fast-But-Expensive or Slow-But-Cheap. What you’re presumably proposing with TDD is that things can/should be Good-Fast-Cheap, but I believe that in reality most things end up being Good-Slow-Expensive.

  • Ricky

    It’s called “Big Ball of Mud” (link below) …and yes…sadly…in my 16 years of experience it is the dominant architecture.  

    http://www.laputan.org/mud/mud

  • DefaultGraphic

    Quality is the lack of waste. For the developer, the largest source of waste is rework, i.e. doing things over: fixing defects, repeating the verification process, and rebuilding the application. For the user, waste is lost data, repeated entry of the same data, clumsy workflow, and on occasion, loss of life.

  • Steve Smith

    I should make it clear, in case I wasn’t, that I’m not suggesting in this post that the only way to write quality code is via TDD, or that the only way to write high quality code faster than low quality code is via TDD.  I’m merely suggesting that it should be possible for developers, with enough training, to get to a point where they are able to write high quality code faster than low quality code, and I’m using Uncle Bob’s article as an example that this is in fact attainable.

  • Eugene Z

    That’s not how it happens (that somebody writes good or poor quality code right away).

    At the beginning of a project there is usually “time” – developers and management are not under stress, and are full of hopes.
    At the beginning of a project, people discuss class hierarchies, describe full-featured APIs, and generally write something that can be “good”.

    However, as the project progresses, stakeholders come up with new and terribly important features, and bugs start to pile up;
    and the developer gives his manager a choice: “I can do a dirty fix in 3 days or a good one in 2 weeks.”
    And more often than not the manager says, “Let’s do the dirty fix now, and we will do the good one later.”

    However, this “later” cannot happen when the product is about to be released;
    and it generally does not happen afterwards – because the product is shipped;
    and the development team is reassigned to another project…