
On changes

H’llo, here we go again!

Last time I promised I’d have some neat numbers for you but first, let’s talk about changes. Not like changing your hair color or favorite brand of root beer but changes in projects. Everybody knows they can be dangerous, and implementing a proper change management procedure is one of the first things project managers are taught. And yet, change management can be the downfall of even the most well-managed projects. Take, for instance, the Ingalls shipbuilding case I have referred to earlier.

Footnote: My favorite case ever on project management is also about changes. The Vasa was to be the pride of the Swedish navy in 1628 but sank on its maiden voyage in front of thousands of spectators. The reason? The king demanded the addition of another gun deck, dangerously altering the ship’s center of gravity. The jest of the story? The ship contained wooden statues of the project managers and those are now on display in the Vasa Museum in Stockholm. Get change management wrong and chances are that 400 years later people will laugh and point fingers at your face.

Why is this? Mainly because of the difficulty of assessing impact. While the direct costs of ripping out work already done and adding more work can be estimated with relative ease, the secondary effects are hard to estimate. Are we sure ripping out stuff won’t disturb anything else? How many mistakes are we going to make while doing the additional work, how many mistakes will slip through tests and how many mistakes will the fixes contain? As the case referred to earlier illustrates, these are non-trivial questions.

“Yes”, you say, “this is why good project managers add a wee bit of buffer and it is going to be fine in the end”. Really? How much buffer should you add, pray? Simulation to the rescue!

What I did was to add 10 tasks worth of work to a project of 100 tasks. 10% growth. I did this in a couple of ways. Firstly, I made the new stuff appear over 10 days early in the project, then the same late in the project and then added 10 tasks as a constant trickle spread over the entire project. Here are the results:

What’s that butt-ugly red thing, you ask? Oh, that’s something special we already had a brush with in an earlier post. You see, sometimes projects are set up so that the customer does not accept or test anything before there is something to really show off, and that is late in the project. Of course, this means that the customer cannot come up with any changes before that delivery happens and, of course, no mistakes are discovered either. The thing I like most about the red bar is how the amount of work to be done doubles after the testing starts. For the project manager this means that there is no way to even assess the quality of their work and thus no way to tell whether you are meeting the schedule and budget. The actual project duration is FOUR TIMES longer than projected based on initial progress…

I realize the graph is a bit of a mess so here’s a helpful table:

Scenario	Tasks done	Percentage added to base	Multiplier to work added
Base case	336.42	0.00%	0.00
New work added early	357.89	6.38%	2.15
New work added late	369.82	9.93%	3.34
Late acceptance	429.077	27.54%	9.27
Trickle	361.369	7.42%	2.49

I chose not to review the deadlines as we are trying to assess the cost and not the deadline impact of a change. The amount of work actually done is much more telling.

The first column shows the number of tasks actually done at the end of the project. For the base case (the productivity and failure parameters are similar to the ones used in the previous post), this is 336.42. This should not come as a surprise to you, dear reader, but stop for a moment to digest this. In an almost ideal case the project takes 3.36 times the effort you would expect from its 100 tasks.

The second column shows what percentage the scenario adds to the tasks done in the base case, and the third one shows by how much the ten additional tasks got multiplied in the end.
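If you want to check the arithmetic, here is a minimal sketch that recomputes the two derived columns from the "Tasks done" figures in the table. It only reflects my reading of the column definitions; the names `BASE_TASKS`, `ADDED_TASKS` and `change_impact` are made up for this illustration and are not part of the simulation model.

```python
# Task counts copied from the table; everything else is derived from them.
BASE_TASKS = 336.42   # tasks actually done in the base case
ADDED_TASKS = 10      # size of the change: 10 tasks on top of the original 100

SCENARIOS = {
    "New work added early": 357.89,
    "New work added late": 369.82,
    "Late acceptance": 429.077,
    "Trickle": 361.369,
}


def change_impact(tasks_done, base=BASE_TASKS, added=ADDED_TASKS):
    """Return (percentage added to base, multiplier to work added)."""
    extra = tasks_done - base          # work caused by the change
    percent_added = extra / base * 100.0
    multiplier = extra / added         # how much each added task "grew"
    return percent_added, multiplier


for name, tasks_done in SCENARIOS.items():
    percent_added, multiplier = change_impact(tasks_done)
    print(f"{name}: {percent_added:.2f}% added, multiplier {multiplier:.2f}")
```

The difference between two rows of the table works the same way: subtracting the early figure from the late one gives the extra work caused purely by the timing of the change.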

Not very surprisingly, the best case scenario is to get the changes done with early on in the project. This is often not feasible as the customer simply will not know what the hell they want and so, realistically, trickle is the practical choice. By the way, this is where agile projects save tons of effort. Adding new work late is much worse: 10 new tasks become 33.4.

Now, close your eyes and imagine explaining to your customer that a change that adds $1000 worth of effort to the project should be billed as $3340.

Done? At what price did you settle? Well, every dollar lost represents a direct loss to your company, as the costs will be incurred whether or not the customer believes in them. To put this into perspective, 11.93 tasks worth of work can be saved if the customer comes up with a change earlier. Esteemed customer, this is the cost of you not telling the contractor about changing your mind early enough.

By far the worst case is the late testing. The effort goes up by almost an order of magnitude! That’s really not cool. Who does that sort of thing, anyway? Come to think of it, anybody who does classical one-stage waterfall, which covers an alarming percentage of large government contracts and a lot of EU-funded stuff. Scary. Nobody wins, you see. Even if the contractor, through some miracle of salesmanship combined with accounting magic, manages to hide the huge additional cost somewhere in the budget, they are unlikely to be able to hide both the cost and the margin, so the contractor’s overall margin on the project goes way down while the costs for the customer go up. They could choose to change the process instead and split 2/3 of the savings between themselves… Wouldn’t it be lovely?

Let’s hang on to that thought until next week. Meanwhile, do observe System Dynamics in Action!

Why are projects late?

It’s this time of the week again, time for another episode of (drumroll) SD Action!

Last time I introduced a basic project management model, this time let’s look at what this baby can do.

Let our base project be a project with 100 tasks. The team size is 200 people, each of whom can accomplish 0.005 tasks per week, this leads to… Oh, I don’t know. Here’s a graph:

Yup, the amount of work to be done (see the previous post for the model framework) goes down at a steady rate and the project is done by the one hundredth week. Nice. I can hear the more experienced project managers go “yeah, right!” Nothing ever goes that smoothly, people make mistakes! You’re supposed to add buffers and such, 30% is the standard practice.

Hm, let’s see what happens if we allow people to make mistakes. In the model, this amounts to a 20% chance that a task needs to be redone, with the rework generation and discovery flows kicking in. Given the one fifth chance of a mistake, how much should we add to the project duration? 20%, right? Not exactly. You see, you might make mistakes on the bug fixes as well… You guessed it, here’s a graph:

What kind of sorcery is this? The project duration did not grow by 20% and not even 30%. It grew by 110%! Blimey, we just missed our deadline.
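A back-of-the-envelope check shows that the 20% guess is too low even before any model dynamics come into play. The sketch below is not the simulation model; it assumes, for illustration only, that every attempt at a task (fixes included) independently needs redoing with the same probability, which gives a simple geometric series. The function name `expected_attempts` is made up here.

```python
def expected_attempts(p_rework):
    """Expected number of times a task gets worked on when each attempt,
    including each fix, must be redone with probability p_rework.

    This is the geometric series 1 + p + p^2 + ... = 1 / (1 - p).
    Assumption for illustration: attempts fail independently.
    """
    if not 0.0 <= p_rework < 1.0:
        raise ValueError("p_rework must be in [0, 1)")
    return 1.0 / (1.0 - p_rework)


# Base project from the post: 100 tasks, 200 people, 0.005 tasks per person-week.
tasks = 100
team_rate = 200 * 0.005                 # about 1 task per week
planned_weeks = tasks / team_rate       # about 100 weeks with no mistakes

# With a one-in-five chance of rework, the naive estimate is 25% extra, not 20%.
weeks_with_rework = planned_weeks * expected_attempts(0.20)
print(f"planned: {planned_weeks:.0f} weeks, naive rework estimate: {weeks_with_rework:.0f} weeks")
```

Even this simplistic view adds a quarter to the schedule. The simulated project, with its rework generation and discovery flows, lands far above that.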

Oh well. Sure. Mistakes happen. But what if the mistakes are costly, generating more work to remove the previously done stuff? Remember the example of having to chip out old concrete before pouring new. Here we go:

Yes, this added another 55 weeks to the project. That is one year, added just by allowing mistakes to cause additional work. Of course, the relationships are more subtle but they are way too geeky to explain here. The deconstruction rate depends on how much of the project is done: it is 0 for about the first 50% and then grows to 1 (in the later phase, as much effort goes into deconstruction as into rework) as the project progresses. These assumptions are probably different in your field but in my world, one year got added to the project by making fairly reasonable assumptions about mistakes costing effort.

As said earlier, the team size is 200 people. Given that at this point we are looking at a five-year project, it would be reasonable to assume that there is employee churn. Of course, the newcomers must learn the ropes before they can be productive and, in fact, the entire team starts out this way having about half the productivity. Let’s assume there is 10% employee churn annually, hiring is started immediately to replace the leavers (6 weeks to fill a position on average) and that it takes four weeks to get acquainted with the project.

This is actually not half bad, we lose only 5 weeks or so. It turns out that 10% churn in a 200-person team is not a big deal. What is curious, though, is that most of the lag is caused by the fact that the team size actually goes down. How come? You see, given the parameters, the churn turns out to be faster than hiring. People leave until annual churn drops to the same level as hiring, and there the model stabilizes. In our case, this means there are 195 productive people, 3 people are constantly in incubation and 2 are just lost. This is where system dynamics modeling excels: solving this symbolically would have involved constructing and solving a system of differential equations but I just drew a couple of boxes and pressed a button.
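For those without a modeling tool at hand, the "boxes and a button" idea can be sketched in a few lines of code. This is a minimal stock-and-flow sketch, not the actual model: it assumes three stocks (productive people, trainees, unfilled positions), that only productive people leave, and that hiring and training behave as simple first-order delays. All names and these flow formulations are my assumptions, so the steady state comes out in the same ballpark as the figures above rather than matching them exactly.

```python
def simulate_staffing(team=200.0, annual_churn=0.10, weeks_to_hire=6.0,
                      weeks_to_train=4.0, weeks=520, dt=0.25):
    """Euler-integrate a three-stock staffing model and return the final
    (productive, trainees, vacancies) levels.

    Assumptions (illustrative only): the team starts fully productive,
    only productive people leave, and both delays are first-order.
    """
    productive, trainees, vacancies = team, 0.0, 0.0
    for _ in range(int(weeks / dt)):
        leaving = productive * annual_churn / 52.0   # people per week
        hiring = vacancies / weeks_to_hire           # positions filled per week
        graduating = trainees / weeks_to_train       # trainees becoming productive
        productive += (graduating - leaving) * dt
        trainees += (hiring - graduating) * dt
        vacancies += (leaving - hiring) * dt
    return productive, trainees, vacancies


productive, trainees, vacancies = simulate_staffing()
print(f"productive: {productive:.1f}, in training: {trainees:.1f}, "
      f"unfilled: {vacancies:.1f}")
```

Under these assumptions the team settles at roughly 196 productive people, with the remaining few split between training and unfilled positions. The point is the same as in the post: the stocks level off where the weekly outflow of leavers equals the weekly inflow of hires.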

How many of you have spotted a fatal flaw in the model? You did? Right! Give the gal a cookie!

Let’s give others a moment, shall we…

Yes, right. The thing is that the current model assumes testing starts immediately. The moment anybody writes a line of code or draws a line, it gets tested and, after a while, possible mistakes end up back at the work queue. Unfortunately, this is not how stuff happens in many cases.

Let’s take construction. Firstly, the architect dreams up a house. Then a bunch of engineers figure out the structure of the thing. Then people come and work on pipes, ventilation and drains. And finally somebody devises a loom of electric wires. And then people go and start building it only to discover that a ventilation duct must pass directly through a structural beam. And a cable ladder crosses a flight of stairs. At about chest height. Bummer. With the way construction is done in this country, I’m assured, there are very few means to discover such mistakes before construction actually begins. In our model, I’ve made it so that there is no rework discovery until about a third of the way through the project, then everything proceeds normally. This is how it goes:

Sweet mother of baby Jesus! 80 weeks! Of course I’m overdoing things a bit. Some testing does happen earlier. True. But the current model does not account for any customer spec changes or for any risk materializing so, broadly speaking, the order of magnitude – about 30% – should be in the ballpark. What is worse, though, is this:

The graph shows the ratio of the percentage of work actually done to the percentage of work believed to be done. For all other cases, it peaks pretty early on and starts declining nicely but for late testing, it remains very high until very late. For a project manager this means that they have no idea whatsoever how the project is progressing. Which is a Bad Thing ™.

Let’s recap. By adding only four simple aspects of project behavior, our project has grown to 350% of the plan in the worst case and to about 250% for sensible testing behavior. And we still have not talked about risks or awkward acceptance tests or multiple contractors or, or… Oh God.

See, this is why projects are late. Project managers are faced with dynamically complex systems that can go off on wild tangents for any reason and usually only have their gut to rely on. Of course, being under deadline pressure and lacking concrete evidence they give in and promise these 100 weeks or possibly 150. Well, they should go and simulate their project model and see what comes out the other end. In short, they should observe System Dynamics in Action!
