For over 38 years, Donald Reinertsen has been working with organizations of all shapes and sizes to create fundamental changes to the way they develop products.
Don has authored three best-selling books, and his research unlocks the secrets to successful Agile transformations, including the underlying principles, mathematics, and practical methods of Agile and Lean Product Development. Don is the “Maths behind Agile.”
We were fortunate enough to be joined by Don last year when he shared his exceptional, in-depth and well-received “Lean Product Development Flow” course and meetup here in London. Ahead of his return trip to London & AWA this June, we interviewed Don about queues, measuring agility, dealing with variability and much more.
What are queues in software development and why is it important to make them visible?
When a work item sits idle waiting for someone to work on it, we say it is in queue. The total time it takes to finish work is the sum of this idle time and the additional time that the item is actively being worked on. When we shorten this total time, we deliver valuable capability to our customers earlier, and we will normally make more money. Thus, we are interested in reducing queue time because this improves profits. Reducing queues improves profits by shortening cycle time and by accelerating feedback which improves both quality and efficiency.
Our problem in product development is that its queues are normally invisible. When we walk into a crowded coffee shop, we can see the difference between having 100 people in line and having five people in line. When we walk into a busy software development shop it will look the same whether there are 100 active stories or five. The queues in software are information, and this information is intrinsically invisible. We usually call work that has been started, but not yet completed, “Work-in-Process inventory (WIP)” and, as you may suspect, we are unlikely to manage it well when it is both invisible and unquantified.
The single simplest and most important action we can take to make work queues visible is to use a visual control board. Such boards do two important things. First, they associate a physical (or virtual) artifact with each invisible work item. This enables us to see the total amount of work in the system. Second, when properly designed, they show the specific state work is in. This allows us to clearly distinguish between items that are actively being worked on and items that are in queue.
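As a rough illustration of what such a board makes countable, here is a minimal sketch. The column names, and the choice of which columns count as queues, are assumptions made for this example and do not come from the interview.

```python
from collections import Counter

# Hypothetical board columns; which ones count as "queue" is an assumption.
QUEUE_STATES = {"ready for dev", "ready for test"}
ACTIVE_STATES = {"in dev", "in test"}


def summarize_board(items):
    """Count items per state and split started work into queued vs. active."""
    by_state = Counter(state for _, state in items)
    queued = sum(n for s, n in by_state.items() if s in QUEUE_STATES)
    active = sum(n for s, n in by_state.items() if s in ACTIVE_STATES)
    # Items in states outside the two sets above are counted in by_state only.
    return {
        "wip": queued + active,
        "queued": queued,
        "active": active,
        "by_state": dict(by_state),
    }


# Made-up board contents for illustration.
board = [
    ("story-1", "in dev"),
    ("story-2", "ready for test"),
    ("story-3", "ready for test"),
    ("story-4", "in test"),
    ("story-5", "ready for dev"),
]
summary = summarize_board(board)
print(summary["wip"], summary["queued"], summary["active"])  # 5 3 2
```

Here three of the five started items are waiting rather than being worked on, which is exactly the distinction the board is meant to expose.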
Many organizations focus on cycle time as a primary measure of agility. Do you consider this to be a good measure of agility?
When people refer to cycle time they usually mean the time between when they start and finish an activity. This is economically important, but it is actually a measure of speed, not a measure of agility. Let me illustrate the difference. A bullet travelling towards a target has a high speed and therefore a short cycle time. Unfortunately, we can still miss the target with a high-speed bullet if the target is moving. A bullet is fast but not agile. In contrast, a radar-guided missile must be both fast and agile. It must fly faster than the plane it is trying to shoot down, and it must maneuver to constantly head for the target. Speed alone is not a measure of agility.
Personally I like the way the term agility is used in sports – you are agile if, while moving at a high speed, you can change direction quickly without slowing down. To me this ability to smoothly redirect momentum is the essence of agility.
What gives us the ability to quickly redirect momentum? In physics we define momentum as mass times velocity. We redirect momentum by pointing the velocity vector in a different direction. This normally requires what is called an impulse, which is defined as a force applied for a period of time. If we apply higher forces we can redirect objects in a shorter period of time. Thus, in a certain sense, cycle time could become a measure of agility, but it is not the cycle time we customarily refer to but rather the time it takes to redirect our effort. The momentum analogue of physics is helpful because it suggests that redirecting a large program proceeding at a high speed is more challenging than redirecting a small program that is progressing slowly. It doesn’t challenge Superman much to save Planet Earth from a ping pong ball hurtling towards it – a large asteroid is a different matter. You can see why TBTF (Too Big to Fail) projects present a problem for agilists.
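The physics behind the analogy can be written out in a few lines. This is a sketch that assumes a constant force of magnitude F. Reading the resulting time as a “redirection time” for a program is an interpretation of the analogy, not a formula from the interview.

```latex
\begin{align}
p &= m\,v, \\
J &= F\,\Delta t = \Delta p, \\
\Delta t &= \frac{\Delta p}{F}
          = \frac{m\,\lVert v_{\text{new}} - v_{\text{old}} \rVert}{F}.
\end{align}
```

For a fixed available force, the time needed to change direction grows in proportion to mass and to the size of the velocity change, which is why the asteroid is harder than the ping pong ball.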
If cycle time is not a good measure of agility, does that mean it is not important to measure it? What would you measure instead of cycle time?
Cycle time is useful to measure even though it is not a great measure of agility. Cycle time can be directly related to economic results, and this makes people who are responsible for delivering such results interested in it. The key concept that helps you assess the value of measuring and controlling cycle time is Cost-of-Delay (CoD). This is usually measured as the change in life cycle profit that will occur if a project is delayed.
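A minimal sketch of that definition follows, with made-up profit figures. Treating the loss as spread evenly over the delay is a simplifying assumption, and the function name is hypothetical.

```python
def cost_of_delay_per_week(profit_on_time, profit_if_delayed, delay_weeks):
    """Average life-cycle profit lost per week of delay.

    Assumes the loss accrues evenly over the delay, which is a simplification.
    """
    if delay_weeks <= 0:
        raise ValueError("delay_weeks must be positive")
    return (profit_on_time - profit_if_delayed) / delay_weeks


# Made-up figures for illustration only.
cod = cost_of_delay_per_week(
    profit_on_time=2_000_000,
    profit_if_delayed=1_700_000,
    delay_weeks=6,
)
print(cod)  # 50000.0
```

With a number like this in hand, a week of queue time has a price that can be compared against the cost of the capacity needed to remove it.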
The problem is what we do with the metric, and this gets back to the fundamental question of why we are measuring. If our intent is to influence profits, cycle time is not terribly useful because it is a lagging indicator. If you want to control cycle time you do not measure cycle time, you measure queue size. Why? If you measure both queue size and flow rate (normally called velocity in a software process) you can forecast cycle time. When your boss asks the embarrassing question, “When will you be done?” you actually have some chance of giving an intelligent answer. Queue size is a leading indicator of future cycle time, which makes it rather useful to people who want to influence the future.
Imagine you have just walked into Starbucks and you want to know how long you will have to wait to get your coffee. You could ask somebody who just received their coffee how long they had to wait. This is measuring cycle time. It is not terribly reliable because they might report a longer or shorter wait time depending on how long the line was when they arrived. There can be a big difference in the cycle time for the first person to arrive in line and the last one. The most accurate forecast of your individual wait time depends on the number of people ahead of you and the rate at which they will be served. With this information you can predict when you will reach the head of the line.
The same is true in product development. In practice, queue size tends to vary more than average velocity, so queue size is a great leading indicator of cycle time. Cycle time, which is only known at the end of the project, is very accurate, but it arrives too late to be useful. It is like discovering, when your car sputters to a stop, that a full tank of gas has enabled you to travel precisely 245.4 miles. In contrast, knowing how much gas currently remains in your tank, and how many miles you can travel on a liter of gas, enables you to make a reasonable forecast of how far you can go before you run out of gas. The product development equivalent of this is knowing the amount of unfinished work in queue and the average rate at which you can accomplish this work – this allows you to predict how long it will be until you run out of work, i.e. finish the project.
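The forecast described here is a single division. The sketch below uses made-up numbers and assumes the average flow rate stays roughly stable over the forecast period. The function name is hypothetical.

```python
def forecast_weeks_to_finish(queue_size, flow_rate_per_week):
    """Forecast time to clear the queue: items waiting / items finished per week."""
    if flow_rate_per_week <= 0:
        raise ValueError("flow_rate_per_week must be positive")
    return queue_size / flow_rate_per_week


# Made-up numbers: 60 unfinished stories, 8 stories completed per week.
print(forecast_weeks_to_finish(queue_size=60, flow_rate_per_week=8))   # 7.5

# If the queue doubles while velocity stays the same, so does the forecast.
print(forecast_weeks_to_finish(queue_size=120, flow_rate_per_week=8))  # 15.0
```

Because the queue can be counted today, the forecast is available today, rather than only after the work has finished.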
Most organisations that I work with try to achieve 90+% utilization of staff on feature work. Is this a good thing? What effect does it have on flow and wait time?
Excessively high utilization is a chronic problem for developers and it arises because they are blind to the size and cost of their queues. In manufacturing we think of the investment in a process as being composed of the cost of capacity plus the cost of inventory. Imagine an airport that wanted to have nobody waiting in security queues. They would have many screening checkpoints open, but these people and their equipment would be underutilized most of the time. In this case, they are minimizing the cost of the queue by increasing the investment in capacity. If they decided to minimize their investment in capacity, then the queues would become very large at certain times during the day.
We have a similar situation in product development. We are trying to balance two costs, the cost of work sitting idle in our queues and the cost of workers becoming idle because there is nothing for them to work on. As I said earlier, the queues in product development are invisible, so we constantly underestimate the cost of these queues. In contrast, the workers are very visible, and when they are idle it appears to be a terribly expensive form of waste. We gravitate towards maximizing busyness. Our inability to judge the cost of queues drives us into overestimating the importance of achieving high utilization rates. This is a local optimization that only appears rational because we are unaware of the cost of queues. Reveal and quantify the invisible and it transforms your behaviour.
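To get a feel for how quickly waiting grows with utilization, here is a sketch based on the textbook M/M/1 queue (one server, random arrivals, exponentially distributed service times). The interview does not name a model, so this choice is an assumption. Real development processes will differ in detail, though the steep rise near full utilization is typical of queueing systems.

```python
def queue_wait_multiplier(utilization):
    """M/M/1 average time spent waiting in queue, in multiples of the
    average service time: rho / (1 - rho)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization)


for rho in (0.5, 0.7, 0.8, 0.9, 0.95):
    wait = queue_wait_multiplier(rho)
    print(f"{rho:.0%} utilization -> wait is about {wait:.1f}x the service time")
```

Under this model, an item waits about as long as it takes to work on at 50% utilization, and about nine times as long at 90%.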
It is important to realize that our goal is not to minimize the queue. We are trying to operate at a point where total cost, the combined cost of the queue and the cost of capacity, has been minimized. Because the system is stochastic, this means an optimized system will experience both periods of idleness and periods of overload. We are trying to optimize on a flat-bottomed U-curve, so heading for the extremes of high or low utilization hurts our economics.
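The U-curve can be sketched numerically. The model below is an illustration under stated assumptions, not a calibrated one: demand is fixed, capacity is bought at a constant price per unit, delay is priced per item per week, and the number of items in the system follows the M/M/1 formula. All parameter values are made up.

```python
def total_cost(utilization, demand=10.0, capacity_cost=1000.0, delay_cost=2500.0):
    """Weekly cost of capacity plus weekly cost of delayed work.

    demand:        items arriving per week (assumed fixed)
    capacity_cost: cost per week for each item/week of capacity (made up)
    delay_cost:    cost of delay per item per week (made up)
    """
    capacity = demand / utilization                     # capacity needed for this utilization
    items_in_system = utilization / (1 - utilization)   # M/M/1 average, an assumption
    return capacity_cost * capacity + delay_cost * items_in_system


# Search utilizations from 1% to 99% for the lowest total cost.
grid = [u / 100 for u in range(1, 100)]
best = min(grid, key=total_cost)
print(f"lowest-cost utilization on this grid: {best:.2f}")

for u in (0.50, best, 0.80, 0.90, 0.95):
    print(f"utilization {u:.2f} -> total cost per week {total_cost(u):,.0f}")
```

With these particular numbers the minimum sits near two-thirds utilization. Costs change little within several points either side of it and climb steeply above 90%, which is the flat-bottomed shape described above.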
Sadly, such enlightened management of queues is rather rare. Most companies are unequipped to make rational tradeoffs between the cost of the queue and the benefit of high utilization. Why? They know neither the size of the queue nor the cost of delay. It is as if someone had a credit card and asserted they are managing their credit wisely. You might ask them, “What is the size of your credit balance and what is the monthly cost of maintaining this balance?” What if they answered, “I don’t know how big the balance is and I don’t know what it costs me.” You’d probably tell them to find out how much they owe and how much they pay in interest. That’s what most product developers need to know about their queues.
Most unpredictability comes from high variation. Why can’t we eliminate variation in business?
We must deal with many forms of variability in product development. Our requirements change during a project. We use technologies that are not fully understood. We can’t accurately estimate the duration of work we’ve never done before. Even when we know the exact work content, the productivity of individual workers will vary. Some people argue that we should make this variability go away. Unfortunately, much of this variability is simply a side effect of doing something new – a by-product of innovation. Variability is the perennial travelling companion of innovation. We can make it go away by never trying anything new, but this has an enormous economic cost.
To address this problem I think it is crucial to ask what benefits we expect to achieve from creating lower variability and what the costs are. Very frequently it is better to design a process to function in the presence of variability than it is to eliminate the variability. For example, look at the way venture capitalists approach their investments. If they only invested in companies with certain outcomes they would miss the most interesting investments. Since they want reasonably predictable returns despite investing in risky opportunities, they have designed a system that reduces portfolio variability – they make multiple investments in uncorrelated areas.
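The effect of spreading investments can be sketched with one standard statistical result: for n equally weighted, uncorrelated investments with the same standard deviation of return, the standard deviation of the portfolio's average return shrinks by a factor of the square root of n. Equal weights, equal risk and zero correlation are simplifying assumptions, and the 60% figure is made up.

```python
import math


def portfolio_std(single_std, n_investments):
    """Std dev of the average return of n equally weighted, uncorrelated
    investments that each have standard deviation single_std."""
    if n_investments < 1:
        raise ValueError("n_investments must be at least 1")
    return single_std / math.sqrt(n_investments)


# Made-up figure: each investment's return has a standard deviation of 60%.
for n in (1, 4, 16, 25):
    print(f"{n:>2} investments -> portfolio std dev {portfolio_std(0.60, n):.2f}")
```

Each individual bet stays just as risky. It is the system around the bets that becomes more predictable.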
In product development we also have opportunities to create processes that perform well in the presence of variability. We see this in the Lean Startup movement where we don’t try to do perfect forecasts, we create experiments that test hypotheses and then we pivot or persevere. Our experiments can have both positive and negative outcomes but, when we pivot, our system stops investment in the negative outcomes. In legacy development systems all of us have seen massive waste as we try to keep a dead opportunity alive.
What reading or pre-requisites would you recommend to product managers / owners who are coming on your course?
It depends how courageous they are. In my experience people who have already read my latest book, The Principles of Product Development Flow, ask better questions and grasp the more advanced ideas in my workshop. Nevertheless, it is not an accident that people say the book is “not for the faint of heart”. If you are used to reading breezy business novels you may be better off reading the book after the course instead of before it.
There are plenty of less challenging books that are loaded with broadly useful information. For example, I love to recommend Daniel Kahneman’s book, Thinking, Fast and Slow, because it is useful to almost everyone.
What books are you reading now? What top three books could you recommend for students of product development improvement?
It’s an interesting question. I have quite a large pile of partially read books on my nightstand, most of which I would not prescribe as reading for other people. If I am incorrectly confident in the adequacy of my knowledge of a subject, then I tend to underinvest in learning more about that subject. Sometimes this judgment of adequacy can be based on an underestimate of the importance of the domain, and sometimes it is based on an overestimate of the amount of my knowledge – I fall into both traps. Consequently, these days I mostly try to correct some of the gaping holes in my education, and I have plenty of reading to do.
As a wiser man than I, Charlie Munger, the business partner of Warren Buffett, said, “In my whole life, I have known no wise people (over a broad subject matter area) who didn’t read all the time – none. Zero.” It is good advice.
If you want to learn about the impact of flow, batch size, prioritisation, capacity planning, waste, and queues with Don Reinertsen then sign up for his course “Second Generation: Lean Product Development Flow”.