Today we are going to talk about quality assurance. I’m going to use the typical processes of a software house as an example, but the principle should be applicable in a wider context.
People make mistakes. Sometimes it is due to the way they work (the light switch and the drain-nuclear-reactor-coolant-system switch are identical and placed an inch apart) and sometimes it is because they are just not well enough trained for the job. Sometimes they make mistakes simply because they are human. In any case, if the average quality of processes or people goes down, the number of defects in the product goes up. When the defects go up, the market is not going to like it and it will end up, in some way, driving down the revenue (Note: it might also be that customer support costs or returns or whatever drive up the cost, but the result is the same). This in turn drives down profit. When profit goes down, managers will be eager to find the culprit and might be inclined towards increasing investment in QA. Effort spent on finding the defects before they reach the market goes up and the number of defects goes down again. This is what we’d call a balancing loop. Because it, well, balances itself.
What also happens as QA expense goes up is that the average cost base of the company goes up, which drives down profit, which might lead to the same sort of managerial anger that caused QA expenditure to go up in the first place. Also, the more you spend on a unit, the more powerful it gets in the organization. The person commanding 200 people has more say in budget decisions than the person commanding 2. And of course they are going to ask for more money. The entire picture looks like so:
There might be some process improvement going on but I think most QA folks agree that their primary job is catching bugs and finding better ways to do that. There is also a direct link between revenues and quality of people/processes as pointed out earlier but let’s ignore that added complexity for now.
I have two questions for you. With the forces at play, where do you think the quality level of the product stabilizes (if it indeed does stabilize)? And what do you think it takes in terms of money and effort to actually raise it to the next level?
You don’t know? Me neither. That’s the bloody thing: it’s a convoluted, complex system where cause and effect go and dance the tango, leaving you to scratch your head in puzzlement. Oh, and did you notice? The thing that started it all (the quality of people and processes) does not feature in our primary feedback system at all. Nobody does anything to it and thus, whatever caused it to go down in the first place (loss of training budget due to missed revenue targets?) is going to happen again and the entire system will tango into the sunset in search of a new equilibrium.
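To make the head-scratching a bit more concrete, here is a toy simulation of the balancing loop described so far. Every number and formula in it is invented by me for illustration (the defect counts, the cost figures, the diminishing returns on QA effort, the way managers react to a profit gap), so treat it as a sketch under made-up assumptions, not as a model of any real company. The one thing worth noticing is that process_quality is a constant: nothing in the loop ever touches it.

```python
# Toy balancing loop. All constants below are invented for illustration.
TARGET_PROFIT = 200.0   # assumed profit management expects each period
BASE_COSTS = 400.0      # assumed cost base excluding QA
QA_UNIT_COST = 20.0     # assumed cost of one unit of QA effort
REACTION = 0.01         # assumed strength of the managerial reaction


def simulate_balancing_loop(steps=50, process_quality=0.5, qa_effort=1.0):
    """Return one record per period. process_quality is never changed."""
    history = []
    for _ in range(steps):
        # Worse people/processes create more defects.
        created = 100.0 * (1.0 - process_quality)
        # QA catches a share of them, with diminishing returns on effort.
        caught_fraction = qa_effort / (qa_effort + 5.0)
        escaped = created * (1.0 - caught_fraction)
        # Escaped defects hurt revenue; QA effort adds to the cost base.
        revenue = 1000.0 - 10.0 * escaped
        profit = revenue - BASE_COSTS - QA_UNIT_COST * qa_effort
        history.append(
            {"qa_effort": qa_effort, "escaped": escaped, "profit": profit}
        )
        # Managerial anger: a profit shortfall pushes the QA budget up.
        gap = TARGET_PROFIT - profit
        qa_effort = max(0.0, qa_effort + REACTION * gap)
    return history


if __name__ == "__main__":
    for period, row in enumerate(simulate_balancing_loop()):
        if period % 10 == 0:
            print(period, round(row["qa_effort"], 3), round(row["escaped"], 2))
```

With these particular made-up numbers the QA effort settles at roughly 1.9 units and the loop goes quiet. Change the constants and it settles somewhere else, or not at all: ask for a profit the system cannot deliver and the QA budget just keeps growing, because every extra unit of QA also eats into the profit.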
Let’s now say that, instead of just kicking the arse of the head of QA (or increasing his budget), management asks “Why?”. Why do our products have bugs? Why do we have more bugs today than we had yesterday? And then, having asked, it increases the effort spent on process quality and people.
This picture is slightly better. The dance is still happening but at least the root causes are addressed and the entire system is likely to behave in a stable fashion after a while (even if this means oscillations).
Finally, let us take a long leap of faith and assume there is no managerial anger. That the leadership of the company has gone “Why the bloody hell are we constantly talking about quality? Why don’t we make it so that we don’t have to step in and manually govern the process? Let’s just make it simple and assign a fixed percentage of the revenue to process improvement”.
In this new reality, when the number of defects is pushed down (via a conscious push from QA, for example), the revenues go up and the effort spent on process quality and on training/hiring top people goes up, which will, surprise, reduce the defects. What will also happen is that the costs actually go down. Smarter people working more smartly is a surefire way of reducing your development time and thus cost.
Whoa, hang on a minute. This thing does not stabilize! This thing is going to drive the defect rate down to something that exponentially approaches zero!
Through a simple act of not caring, the management has turned a balancing loop into a reinforcing loop. A loop that, once started, will drive the defect rate down to as close to zero as practically possible.
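Here is the same kind of toy simulation for the reinforcing loop. Again, all of it is my own assumption for the sake of illustration: the revenue formula, the 5% share of revenue that goes to process improvement, and the gain constant that turns investment into fewer defects. It is a sketch of the shape of the behaviour, not a forecast.

```python
def simulate_reinforcing_loop(steps=60, defects=50.0,
                              improvement_share=0.05, gain=0.002):
    """Toy reinforcing loop; every constant here is an invented assumption."""
    history = []
    for _ in range(steps):
        # Fewer defects reaching the market means more revenue.
        revenue = 1000.0 - 10.0 * defects
        # A fixed share of revenue goes to process improvement, no anger needed.
        investment = improvement_share * revenue
        history.append(
            {"defects": defects, "revenue": revenue, "investment": investment}
        )
        # Better people and processes remove a share of the remaining defects.
        defects = defects * (1.0 - gain * investment)
    return history


if __name__ == "__main__":
    for period, row in enumerate(simulate_reinforcing_loop()):
        if period % 10 == 0:
            print(period, round(row["defects"], 3), round(row["revenue"], 1))
```

In this sketch the defects shrink by 5% per period at the start, and the shrink rate creeps towards 10% as revenue (and with it the improvement budget) grows. Nothing in the loop pushes back, so the defect count just keeps heading for zero.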
And this, ladies and gentlemen, is why Toyota has become the world’s largest automaker. This is why, in just 40 years, they have surpassed the Big Three in both quality and volume, having started from the position of a clear underdog in both. Such learning-based feedback loops are a routine part of their production processes.
If this sounds familiar then rejoice: agile software methods are to a large extent (to my surprise) rooted in the Toyota Production System. They preach the same concepts of fast reflection, constant improvement and built-in tests that Toyota does.
With this unexpected foray into the car industry, it’s time to end. Thank you for reading, have a good weekend and enjoy System Dynamics in action!