Estimating Effort – Adaptation

I’ve been running the informed intuition (or if you prefer, “disciplined intuition”)  approach to estimating effort for close to nine months now. For the most part, it has gone very well. The primary objective – inspire and support a conversation around the effort needed to complete a story – has most definitely been realized. Along the way the process has shifted to better support both the conversation and the team’s ability to internalize the process.

Originally, it was proposed that teams rate each of the effort characteristics on a sliding scale – 1 to 10 or 1 to 15, or whatever the team decided was most useful. Feedback from the teams lead to the discovery that it is easier to evaluate each effort characteristic using the modified Fibonacci scale rather than a sliding scale. This provides continuity across the method in that everything about a story’s effort value is considered using the same scale. It also reinforces the rationale behind the use of the Fibonacci scale and seems to facilitating the team’s ability to internalize the method. They are moving more quickly when deriving effort values.

A second adaptation is the use of several sets of characteristics, depending on the type of story, the predominant functional area represented by the team, and the nature of the work. For example, a story that involves the development of a computer board has a different set of criteria from stories that involve the creation of firmware for the board or the UI/UX features of the hardware product. The sets usually contain 3 or 4 common characteristics, such as “complexity” or “dependencies.” However, the hardware board may include something like “part sourcing” or “compliance testing.” This illustrates the importance of having the team deconstruct what “effort” means in the context of their world. When they determine the characteristics, the follow-on conversations about the effort are much more robust and meaningful.

In essence, this method is a reflection of the product owner’s responsibility for the “what” of the story and the team’s responsibility for figuring out the “how” of the story. “What I want,” says the product owner, “is an estimate of the effort involved to complete this story.” The teams effort criteria demonstrate to the product owner how they arrive at any particular value.

Intuition and Effort Estimates

In his book, “Blink,” Malcom Gladwell describes an interview between Gary Klein and a fire department commander. A lieutenant at the time, the firemen were attempting to put out a kitchen fire that didn’t “behave” like a kitchen fire should. The lieutenant ordered his men out of the house moments before the floor collapsed due to the fire being in the basement, not the kitchen. Klein later deconstructed the event with the commander and revealed a surprisingly rich set of experienced-based characteristics about that event the commander used to quickly evaluate the situation and respond. The lieutenant’s quick and well-calibrated-to-the-situation intuition undoubtedly saved them from serious injury or worse.

Intuition, however, is domain-specific. This same experienced-based intuition most probably wouldn’t have served the commander well if he suddenly found himself in a different situation – at the helm of a sailboat in rough water, for example, assuming the commander had never been on a sailboat before.

In the context of a software development environment, a highly experienced individual may have very good intuition on the amount of work needed to complete a specific piece of work assigned to them. But that intuition breaks down when the work effort necessarily includes several people or an entire team. So while intuition can serve a useful role in estimating work effort, that value is generally over-estimated, particularly when it needs to be a team estimate.

Consider work effort estimates when framed by Danial Kahneman’s work with System One and System Two thinking. System One is fast, based on experiences, and automatic. However, it isn’t very flexible and it’s difficult to train. This is the source of intuition. System Two, however, is analytical, methodical, intentional, deliberate, and slower. Also, it’s more trainable. It’s when the things that are trained in System Two sink into System One that new behaviors become automatic. With work effort estimates, we must first deliberately train our System Two using a method that is more deliberate about estimating before we can comfortably rely on our System One abilities.

Once calibrated, any number of changes could signal the need to re-calibrate by employing the deliberate process. Change the team composition and the team will need some measure of re-training of System One via System Two. Change a team’s project and the same re-training will need to occur.

The trained intuition approach to estimating effort develops what Kahneman called “disciplined intuition.” Begin with a deliberate, statistical approach to thinking about work effort. Establish a base rate using the value ranges for the effort characteristics. With experience, the team can begin to integrate their intuition later in the project process. If teams lead with their intuition (as is the case with planning poker and t-shirt sizes), they will filter for things that confirm their System One evaluation. With experience and a track record of success from training their intuition, teams can eventually lead with an intuitive approach. But it isn’t a very effective way to begin.

This method also leverages the work of Anders Ericsson and deliberate practice. The key here is the notion of increasing feedback into the process of estimating work effort. The deliberate action of working through a conversation that evaluates each of the work effort characteristics introduces more and better feedback loops that help the team evaluate the quality of their decision. Over time, they get better and better at correcting course and internalizing the lessons.

It’s like learning to drive a car. A new driver will leverage System Two heavily before they can comfortably rely on System One while driving. This is good enough for most driving situations. However, it wouldn’t be good enough if that same driver who is competent at driving in city traffic was suddenly placed on a NASCAR track in a powerful machine going 200 miles per hour.

A NASCAR track might be where we would go look for expert drivers but not where we would look for competent delivery truck drivers. For work estimates on software projects, we’re looking for a level of good enough that’s a reasonable match for the project work at hand. And we’re looking for better than untrained intuitive guesses.