Tagged with project management

On changes

H’llo, here we go again!

Last time I promised I’d have some neat numbers for you but first, let’s talk about changes. Not like changing your hair color or favorite brand of root beer but changes in projects. Everybody knows they can be dangerous and implementing a proper change management procedure is one of the first things project managers are taught. And yet, change management can be the downfall of even the most well-managed projects. Take, for instance, the Ingalls shipbuilding case I have referred to earlier.

Footnote: My favorite case ever on project management is also about changes. The Vasa was to be the pride of the Swedish navy in 1628 but sank on its maiden voyage in front of thousands of spectators. The reason? The king demanded the addition of another gun deck, dangerously altering the center of gravity of the ship. The jest of the story? The ship contained wooden statues of the project managers and those are now on display in the Vasa Museum in Stockholm. Get change management wrong and chances are 400 years later people will laugh and point fingers at your face.

Why is this? Mainly because of the difficulties of assessing impact. While the direct costs involved in ripping out work already done and adding more work can be estimated with relative ease, the secondary effects are hard to estimate. Are we sure ripping out stuff won’t disturb anything else? How many mistakes are we going to make while doing the additional work, how many mistakes will slip through tests and how many mistakes will the fixes contain? As the case referred to earlier illustrates, this is a non-trivial question.

“Yes,” you say, “this is why good project managers add a wee bit of buffer and it is going to be fine in the end.” Really? How much buffer should you add, pray? Simulation to the rescue!

What I did was to add 10 tasks worth of work to a project of 100 tasks. 10% growth. I did this in a couple of ways. Firstly, I made the new stuff appear over 10 days early in the project, then the same late in the project and then added 10 tasks as a constant trickle spread over the entire project. Here are the results:

What’s that butt-ugly red thing, you ask? Oh, that’s something special we already had a brush with in an earlier post. You see, sometimes projects are set up so that the customer does not accept or test anything before there is something to really show off, and that is late in the project. Of course, this means that the customer cannot come up with any changes before that delivery happens and of course no mistakes are discovered either. The thing I like most about the red bar is how the amount of work to be done doubles after the testing starts. For the project manager this means that there is no way to even assess the quality of their work and thus there is no way to tell if you are meeting the schedule and budget or not, and the actual project duration is FOUR TIMES longer than projected based on initial progress…

I realize the graph is a bit of a mess so here’s a helpful table:

Scenario               Tasks done   Percentage added to base   Multiplier to work added
Base case              336.42       0.00%                      0.00
New work added early   357.89       6.38%                      2.15
New work added late    369.82       9.93%                      3.34
Late acceptance        429.077      27.54%                     9.27
Trickle                361.369      7.42%                      2.49

I chose not to review the deadlines as we are trying to assess the cost and not the deadline impact of a change. The amount of work actually done is much more telling.

The first column shows the number of tasks actually done at the end of the project. For the base case (the productivity and failure parameters are similar to the ones used in the previous post), this is 336.42. This should not come as a surprise to you, dear reader, but stop for a moment to digest this. In an almost ideal case the project takes 3.36 times the effort that would be expected from its 100 tasks.

The second column shows what percentage the scenario adds to the tasks done in the base case, and the third one shows by how much these ten additional tasks got multiplied in the end.
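For the skeptics, the two derived columns can be recomputed from the first one. Here is a minimal Python sketch that assumes nothing beyond the task counts in the table; the names (`tasks_done`, `ADDED_TASKS`, `impact`) are made up for illustration and are not part of the simulation model.

```python
# Tasks actually done per scenario, copied from the table above.
tasks_done = {
    "Base case": 336.42,
    "New work added early": 357.89,
    "New work added late": 369.82,
    "Late acceptance": 429.077,
    "Trickle": 361.369,
}
ADDED_TASKS = 10  # size of the change in every non-base scenario

base = tasks_done["Base case"]


def impact(done: float) -> tuple[float, float]:
    """Return (percentage added to base, multiplier to work added)."""
    extra = done - base
    return 100 * extra / base, extra / ADDED_TASKS


for name, done in tasks_done.items():
    pct, mult = impact(done)
    print(f"{name:22s} {pct:6.2f}% {mult:5.2f}")

# Effort saved when the same change arrives early instead of late.
saved = tasks_done["New work added late"] - tasks_done["New work added early"]
print(f"Saved by asking early: {saved:.2f} tasks")
```

The multiplier is simply the extra work divided by the ten tasks that were added, which is why it reads directly as "what one task of change really costs".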

Not very surprisingly, the best case scenario is to get the changes done early on in the project. This is often not feasible as the customer simply will not know what the hell they want and so, realistically, trickle is the practical choice. By the way, this is where agile projects save tons of effort. Adding new work late is much worse: 10 new tasks become 33.4.

Now, close your eyes and imagine explaining to your customer that a change that adds $1000 worth of effort to the project should be billed as $3340.

Done? At what price did you settle? Well, every dollar lost represents a direct loss to your company as the costs will be incurred whether the customer believes in this or not. To put this into perspective, 11.93 tasks worth of work can be saved if the customer comes up with a change earlier. Esteemed customer, this is the cost of you not telling the contractor about changing your mind early enough.

By far the worst case is the late testing. The effort goes up by almost an order of magnitude! That’s really not cool. Who does that sort of thing, anyway? Come to think of it, anybody who does classical one-stage waterfall, which is an alarming percentage of large government contracts and a lot of EU-funded stuff. Scary. Nobody wins, you see. Even if the contractor, through some miracle of salesmanship combined with accounting magic, manages to hide the huge additional cost somewhere in the budget, they are unlikely to be able to hide the cost and the margin so the contractor’s overall margin on the project goes way down while the costs for the customer go up. They could choose to change the process instead and split the 2/3 of the savings between themselves… Wouldn’t it be lovely?

Let’s hang on to that thought until next week. Meanwhile, do observe System Dynamics in Action!


Brooks’ law: revisited

Ha. I went and read The Mythical Man-Month again. Found the spot where he speaks about the effect of adding more people. Then I went and read comments from you guys and thought.

The result is that, if I dial the effects really high, the project indeed moves slower than it did before. For a while. The critical effects I found were:

  • Productivity drop. The rest of the team effectively needs to focus on teaching the newcomers
  • Hiring speed. The speed at which the new people come on board has an interestingly strong effect. When the new people get added gradually, the resulting impact is way smaller
  • Error rate. The rookies need to make a ton of mistakes for the result to show

Actually, this is pretty much exactly what Brooks says. And actually my model behaves exactly like Brooks says, too. You see, he talks about a late project. A project that is either close to a deadline or has already passed it. And if you add people so late in the project that it ends before they learn the ropes and the productivity gains show, he is right. In the long run, though, adding more people will work out.

What a relief! Both me and Mr. Brooks were right! Not that the latter would come as a surprise, though.

Oh, and I already have some tasty numbers for this week’s episode, so stay tuned and observe SD in action!


Brooks’ law

There really is no other way of saying this but I was wrong. I went into this being fully confident that it’d be simple to show how adding more people to a project can make it take longer (which is what Brooks stated in his legendary The Mythical Man-Month). Well, in my model, it doesn’t. Whatever I do, I end up with a work-to-do graph that shows a minuscule blip above the normal behavior when people are added and then a rapid decline to a much faster project end. Mind you, doubling the team size still does not halve the project duration but still. What I get is something like this:

People, obviously, are added at week 105. By varying the number of people added, the effect they have on productivity and the way they change the error rate, I can change the shape of the curve but it inevitably crosses the blue line (the scenario without people being added) at about the same point.

Well, whaddayaknow. I will be travelling this weekend and settling in next week so am not sure if I’ll get to that but I’d really like to find out what the hell happened. I’ll read The Mythical Man-Month again. I’ll look at the model and play around with it. It might be that I neglected some important point Brooks is making like add-more-people-productivity-drops-add-even-more-people feedback loops (although based on current results there is too little of an effect to trigger that). It might be that I’m just interpreting the output incorrectly or that there’s a bug in the model. In any case, I’m baffled. Which means I’m learning. Which hopefully means you are learning as well.

Talk to you next week! Take care and enjoy System Dynamics in Action!


Why are projects late?

It’s this time of the week again, time for another episode of (drumroll) SD Action!

Last time I introduced a basic project management model, this time let’s look at what this baby can do.

Let our base project be a project with 100 tasks. The team size is 200 people, each of whom can accomplish 0.005 tasks per week, this leads to… Oh, I don’t know. Here’s a graph:

Yup, the amount of work to be done (see the previous post for the model framework) goes down at a steady rate and the project is done by the one hundredth week. Nice. I can hear the more experienced project managers go “Yeah, right! Nothing ever goes that smoothly, people make mistakes! You’re supposed to add buffers and such, 30% is the standard practice.”

Hm, let’s see what happens if we allow people to make mistakes. In the model, this amounts to a 20% chance that a task needs to be redone, with the rework generation and discovery flows kicking in. Given the one fifth chance of a mistake, how much should we add to the project duration? 20%, right? Not exactly. You see, you might make mistakes on the bug fixes as well… You guessed it, here’s a graph:

What kind of sorcery is this? The project duration did not grow by 20% and not even 30%. It grew by 110%! Blimey, we just missed our deadline.
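Part of the sorcery is plain compounding: 20% of the tasks come back, 20% of those fixes come back again, and so on. Here is a minimal sketch of that series, assuming every round of rework has the same 20% chance of failure (the function name is made up for illustration). It only counts effort, so it shows why a flat 20% buffer is already too small; the 110% above comes out of the full model, which this sketch does not try to reproduce.

```python
def effort_with_rework(tasks: float, error_fraction: float, rounds: int = 100) -> float:
    """Sum the work done when each round of work sends a fraction back as rework."""
    total = 0.0
    batch = tasks
    for _ in range(rounds):
        total += batch            # do the current batch
        batch *= error_fraction   # the flawed part returns as the next batch
    return total


tasks, p = 100, 0.2
series = effort_with_rework(tasks, p)
closed_form = tasks / (1 - p)  # limit of the geometric series
print(f"{series:.2f} tasks by summing, {closed_form:.2f} by closed form")
```

Both numbers land at about 125 tasks, so even this bare-bones view asks for 25% extra effort rather than 20%, before any delay in finding the mistakes is taken into account.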

Oh well. Sure. Mistakes happen. But what if the mistakes are costly, generating more work to remove the previously done stuff? Remember the example of having to chip out old concrete before pouring new. Here we go:

Yes, this added another 55 weeks to the project. That is about a year, just from allowing mistakes to cause additional work. Of course, the relationships are more subtle but they are way too geeky to explain here. The deconstruction rate depends on how much of the project is done: it is 0 for about the first 50% and grows to 1 (in the later phase, as much effort goes into deconstruction as into rework) as the project progresses. These assumptions are probably different in your field but in my world, one year got added to the project by making a fairly reasonable assumption that mistakes cost effort.

As said earlier, the team size is 200 people. Given that at this point we are looking at a five-year project, it would be reasonable to assume that there is employee churn. Of course, the newcomers must learn the ropes before they can be productive and, in fact, the entire team starts out this way having about half the productivity. Let’s assume there is 10% employee churn annually, hiring is started immediately to replace the leavers (6 weeks to fill a position on average) and that it takes four weeks to get acquainted with the project.

This is actually not half bad, we lose only 5 weeks or so. It turns out that 10% churn in a 200-person team is not a big deal. What is curious, though, is that most of the lag is caused by the fact that the team size actually goes down. How come? You see, given the parameters, the churn turns out to be faster than hiring. People leave until annual churn drops to the same level as hiring and the model stabilizes there. In our case, this means there are 195 productive people, 3 people are constantly in incubation and 2 are just lost. This is where system dynamics modeling excels: solving this symbolically would have involved constructing and solving a system of differential equations but I just drew a couple of boxes and pressed a button.
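For the curious, the resting point can also be approximated by hand, because once things settle every flow (leaving, hiring, finishing incubation) runs at the same rate. Below is a rough sketch under simplifying assumptions that are not taken from the simulation model: 200 funded positions, only productive people leave, and the 6-week and 4-week figures act as average delays. Since this is not the model itself, the split between incubation and empty seats need not match the 3 and 2 quoted above; the point is only that a few people are permanently missing from the productive pool.

```python
WEEKS_PER_YEAR = 52

positions = 200          # funded seats (assumed fixed)
annual_churn = 0.10      # fraction of productive people leaving per year
weeks_to_hire = 6        # average time a seat stays empty
weeks_to_learn = 4       # average time a newcomer spends in incubation

# At equilibrium one common flow rate r (people/week) passes through all stocks:
#   productive = r * WEEKS_PER_YEAR / annual_churn
#   incubating = r * weeks_to_learn
#   vacant     = r * weeks_to_hire
# and the three stocks add up to the number of positions.
weeks_productive = WEEKS_PER_YEAR / annual_churn
r = positions / (weeks_productive + weeks_to_learn + weeks_to_hire)

productive = r * weeks_productive
incubating = r * weeks_to_learn
vacant = r * weeks_to_hire
print(f"productive {productive:.1f}, incubating {incubating:.1f}, vacant {vacant:.1f}")
```

Under these assumptions roughly 196 people end up productive, which is in the same ballpark as the simulated 195.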

How many of you have spotted a fatal flaw in the model? You did? Right! Give the gal a cookie!

Let’s give others a moment, shall we…

Yes, right. The thing is that the current model assumes testing starts immediately. The moment anybody writes a line of code or draws a line, it gets tested and, after a while, possible mistakes end up back at the work queue. Unfortunately, this is not how stuff happens in many cases.

Let’s take construction. Firstly, the architect dreams up a house. Then a bunch of engineers figure out the structure of the thing. Then people come and work on pipes, ventilation and drains. And finally somebody devises a loom of electric wires. And then people go and start building it only to discover that a ventilation duct must pass directly through a structural beam. And a cable ladder crosses a flight of stairs. At about chest height. Bummer. With the way construction is done in this country, I’m assured, there are very few means to discover such mistakes before construction actually begins. In our model, I’ve made it so that there is no rework discovery until about a third through the project, then everything proceeds normally. This is how it goes:

Sweet mother of baby Jesus! 80 weeks! Of course I’m overdoing things a bit. Some testing does happen earlier. True. But the current model does not account for any customer spec changes or for any risk materializing so, broadly speaking, the order of magnitude – about 30% – should be in the ballpark. What is worse, though, is this:

The graph shows the ratio of percentage of work actually done and the percentage of work believed to be done. For all other cases, it peaks pretty early on and starts declining nicely but for late testing, it remains very high until very late. For a project manager this means that they have no idea whatsoever how the project is progressing. Which is a Bad Thing ™.

Let’s recap. By adding only four simple aspects of project behavior, our project has grown 350% in the worst case and about 250% for sensible testing behavior. And we still have not talked about risks or awkward acceptance tests or multiple contractors or, or… Oh God.

See, this is why projects are late. Project managers are faced with dynamically complex systems that can go off on wild tangents for any reason and usually only have their gut to rely on. Of course, being under deadline pressure and lacking concrete evidence they give in and promise these 100 weeks or possibly 150. Well, they should go and simulate their project model and see what comes out the other end. In short, they should observe System Dynamics in Action!


On managing projects

Projects go wrong. They often do. They tend to go wrong inexplicably, when everything was just about done. They go wrong by orders of magnitude, we hear about a massive project costing billions being closed every other week. What the hell? How come? I mean, these people get paid and get paid well and they still can’t manage a project to be on time, on budget and bang on functionally?

Well, I guess they just can’t help it. The reason, as pointed out previously, is that humans are notoriously bad at predicting the behavior of even simple dynamic systems, let alone a billion-dollar 3-year project involving thousands of people in tens of companies.

And you all know what’s coming now. SD to the rescue! Simulate!

This and I suspect a couple of following posts will be on project management and thus it would make sense to establish the basics before plunging into modeling details. This is the basic model structure we’ll be using:

The model assumes that there is a set of work to do and that the work is divided into tasks. The tasks might be writing code, digging holes, it doesn’t matter. The main thing is that work flows out of the “Work to do” box towards two others: “Work done” and “Undiscovered rework”. You see, when you do something it might be OK or it might need changing later. Because you messed up, because somebody else messed up, it does not matter. The main thing is that you don’t know in advance if your work is indeed correctly done or needs to be re-done. That undiscovered rework flows back into work to do via the process of rework discovery. Which for us, software folks, is simply called testing. We go “oh, dang” and more work appears on the todo list. Finally, there is a stream flowing into undiscovered rework called “Deconstruction work”. This one accounts for the need to demolish the incorrectly done work. When you pour 200 square feet of concrete incorrectly, you need to bang it to tiny pieces with hammers before it can be poured again. That sort of thing.
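To make the boxes and arrows a bit more tangible, here is a toy version of the loop in Python. It is a sketch, not the model behind these posts: it leaves out deconstruction work, treats rework discovery as a simple first-order delay, and every name and default value (`simulate`, `discovery_delay`, the 4-week delay, the 20% error fraction) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    weeks: float   # simulated project duration
    effort: float  # total tasks worked on, including rework


def simulate(tasks: float = 100.0, capacity: float = 1.0,
             error_fraction: float = 0.2, discovery_delay: float = 4.0,
             dt: float = 0.25, tolerance: float = 0.01,
             max_weeks: float = 10_000.0) -> Result:
    """Step the three stocks forward until almost no work is left anywhere."""
    work_to_do, work_done, undiscovered = tasks, 0.0, 0.0
    weeks = effort = 0.0
    while work_to_do + undiscovered > tolerance and weeks < max_weeks:
        # The team works at capacity unless less work than that is queued.
        rate = min(capacity, max(work_to_do, 0.0) / dt)
        # Testing moves flawed work back to the queue after an average delay.
        discovery = undiscovered / discovery_delay
        work_to_do += (discovery - rate) * dt
        work_done += rate * (1.0 - error_fraction) * dt
        undiscovered += (rate * error_fraction - discovery) * dt
        effort += rate * dt
        weeks += dt
    return Result(weeks, effort)


perfect = simulate(error_fraction=0.0)
flawed = simulate(error_fraction=0.2)
print(f"no mistakes: {perfect.weeks:.0f} weeks, {perfect.effort:.0f} tasks of effort")
print(f"20% rework:  {flawed.weeks:.0f} weeks, {flawed.effort:.0f} tasks of effort")
```

With no mistakes the toy project finishes in 100 weeks after 100 tasks of effort. With a 20% error fraction the effort climbs to roughly 125 tasks, and the end date slips further because the last mistakes take a while to surface.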

Of course, the model as depicted is just scaffolding. The whole model (based on schoolwork in certain MIT courses but heavily modified) is too complex to go into in detail here but the surrounding details can be roughly divided into the following parts:

  • Scope changes like scope creep, customer changing their mind etc. These things mainly influence the “Work to do” box
  • Personnel issues like employee turnover, staffing decisions and such. This is going to have an impact on work flowing out of the “Work to do” box. In trade magazines, this is called “productivity”
  • Rework discovery and impact. When and how testing happens and what is the nature of the bugs discovered including the amount, extent and dynamics of deconstruction work

At this point you should be going “dang, this is complex”. You are? Good. Because projects can be incredibly complex. For one, it is not trivial to estimate what the actual amount of time spent on the project would be. Even if you know what the estimates for the factors are, the math is non-trivial and you’d be unlikely to be right on the money based on a gut feeling.

We’re going to go into some more details in the following posts but here are some things this sort of modeling can do for you:

  • Deadline and resource estimates for large projects. Given the process model of your project, given the conditions, project size etc., what are the work estimate and load dynamics going to be?
  • Process optimization. What happens if we change our development process? What happens if we start testing earlier? What happens if we start doing regular instead of continuous deliveries? Changes in staffing policies?
  • What-if analysis. Given our current project management framework, what happens if half of the team leaves? Customer adds a ton of new requests?
  • Root cause analysis. Our project went like so. OH GOD! WHAT HAPPENED? Model your process, make the result match your project and see if the results improve if you change the policies

Project management is one of the areas where system dynamics has the most immediate and tangible practical application so the next couple of weeks are going to be interesting!

All right, that’s all for now. Take care and observe System Dynamics in Action!