Team Composition

When a potter begins to throw a pot, she picks up a lump of clay, shapes it into a rough sphere, and throws it onto the spinning potter’s wheel. It may land off-center, and she must carefully begin to shape it until, it is a smooth cylinder. Then she works the clay, stretching and compressing it as it turns. First it is a tower, then it is like a squat mushroom. Only after bringing it up and down several times does she slowly squeeze the revolving clay until its walls rise from the wheel. She cannot go on too long, for the clay will begin to “tire” and then sag. She gives it the form she imagines, then sets it aside. The next day, the clay will be leather hard, and she can turn it over to shape the foot. Some decoration may be scratched into the surface. Eventually, the bowl will be fired, and then the only options are the colors applied to it; its shape cannot be changed.

This is how we shape all the situations in our lives. We must give them rough shape and then throw them down into the center of our lives. We must stretch and compress, testing the nature of things. As we shape the situation, we must be aware of what form we want things to take. The closer something comes to completion, the harder and more definite it becomes. Our options become fewer, until the full impact of our creation is all that there is. Beauty or ugliness, utility or failure, comes from the process of shaping.Deng Ming-Dao, '365 Tao - Daily Meditations'

Building a high-performance team from scratch is just as difficult as turning a low-performing team into a high-performing team. However, there are very different reasons why each of these scenarios are difficult.

Like the potter beginning with a lump of clay, when forming a new team we must understand what we have to work with and have a clear idea of the outcomes we want. As we shape the team, we have to be mindful of how the individuals on the team are changing – or not – and whether those changes are moving toward the outcome. If not, we either need to change the desired outcome or alter the material we have to work with, that is, change out one or more people so that the shape of the team is better suited to reaching the desired outcome. It is also important to monitor the speed at which the team is formed or shaped. Too fast, and the team may not coalesce in a way that is healthy or productive. Too slow and they may not coalesce at all, they may “tire” of the slow pace and disengage.

With existing teams, we may have a limited range of options to change the roster. This is more like the an existing piece of pottery that has been fully set.

Estimating Effort – Adaptation

I’ve been running the informed intuition (or if you prefer, “disciplined intuition”)  approach to estimating effort for close to nine months now. For the most part, it has gone very well. The primary objective – inspire and support a conversation around the effort needed to complete a story – has most definitely been realized. Along the way the process has shifted to better support both the conversation and the team’s ability to internalize the process.

Originally, it was proposed that teams rate each of the effort characteristics on a sliding scale – 1 to 10 or 1 to 15, or whatever the team decided was most useful. Feedback from the teams lead to the discovery that it is easier to evaluate each effort characteristic using the modified Fibonacci scale rather than a sliding scale. This provides continuity across the method in that everything about a story’s effort value is considered using the same scale. It also reinforces the rationale behind the use of the Fibonacci scale and seems to facilitating the team’s ability to internalize the method. They are moving more quickly when deriving effort values.

A second adaptation is the use of several sets of characteristics, depending on the type of story, the predominant functional area represented by the team, and the nature of the work. For example, a story that involves the development of a computer board has a different set of criteria from stories that involve the creation of firmware for the board or the UI/UX features of the hardware product. The sets usually contain 3 or 4 common characteristics, such as “complexity” or “dependencies.” However, the hardware board may include something like “part sourcing” or “compliance testing.” This illustrates the importance of having the team deconstruct what “effort” means in the context of their world. When they determine the characteristics, the follow-on conversations about the effort are much more robust and meaningful.

In essence, this method is a reflection of the product owner’s responsibility for the “what” of the story and the team’s responsibility for figuring out the “how” of the story. “What I want,” says the product owner, “is an estimate of the effort involved to complete this story.” The teams effort criteria demonstrate to the product owner how they arrive at any particular value.

Intuition and Effort Estimates

In his book, “Blink,” Malcom Gladwell describes an interview between Gary Klein and a fire department commander. A lieutenant at the time, the firemen were attempting to put out a kitchen fire that didn’t “behave” like a kitchen fire should. The lieutenant ordered his men out of the house moments before the floor collapsed due to the fire being in the basement, not the kitchen. Klein later deconstructed the event with the commander and revealed a surprisingly rich set of experienced-based characteristics about that event the commander used to quickly evaluate the situation and respond. The lieutenant’s quick and well-calibrated-to-the-situation intuition undoubtedly saved them from serious injury or worse.

Intuition, however, is domain-specific. This same experienced-based intuition most probably wouldn’t have served the commander well if he suddenly found himself in a different situation – at the helm of a sailboat in rough water, for example, assuming the commander had never been on a sailboat before.

In the context of a software development environment, a highly experienced individual may have very good intuition on the amount of work needed to complete a specific piece of work assigned to them. But that intuition breaks down when the work effort necessarily includes several people or an entire team. So while intuition can serve a useful role in estimating work effort, that value is generally over-estimated, particularly when it needs to be a team estimate.

Consider work effort estimates when framed by Danial Kahneman’s work with System One and System Two thinking. System One is fast, based on experiences, and automatic. However, it isn’t very flexible and it’s difficult to train. This is the source of intuition. System Two, however, is analytical, methodical, intentional, deliberate, and slower. Also, it’s more trainable. It’s when the things that are trained in System Two sink into System One that new behaviors become automatic. With work effort estimates, we must first deliberately train our System Two using a method that is more deliberate about estimating before we can comfortably rely on our System One abilities.

Once calibrated, any number of changes could signal the need to re-calibrate by employing the deliberate process. Change the team composition and the team will need some measure of re-training of System One via System Two. Change a team’s project and the same re-training will need to occur.

The trained intuition approach to estimating effort develops what Kahneman called “disciplined intuition.” Begin with a deliberate, statistical approach to thinking about work effort. Establish a base rate using the value ranges for the effort characteristics. With experience, the team can begin to integrate their intuition later in the project process. If teams lead with their intuition (as is the case with planning poker and t-shirt sizes), they will filter for things that confirm their System One evaluation. With experience and a track record of success from training their intuition, teams can eventually lead with an intuitive approach. But it isn’t a very effective way to begin.

This method also leverages the work of Anders Ericsson and deliberate practice. The key here is the notion of increasing feedback into the process of estimating work effort. The deliberate action of working through a conversation that evaluates each of the work effort characteristics introduces more and better feedback loops that help the team evaluate the quality of their decision. Over time, they get better and better at correcting course and internalizing the lessons.

It’s like learning to drive a car. A new driver will leverage System Two heavily before they can comfortably rely on System One while driving. This is good enough for most driving situations. However, it wouldn’t be good enough if that same driver who is competent at driving in city traffic was suddenly placed on a NASCAR track in a powerful machine going 200 miles per hour.

A NASCAR track might be where we would go look for expert drivers but not where we would look for competent delivery truck drivers. For work estimates on software projects, we’re looking for a level of good enough that’s a reasonable match for the project work at hand. And we’re looking for better than untrained intuitive guesses.

Haiku #20

A simple framework.
Infinite possible ways.
One step at a time.

Strategies for Remote Interviews with Team Candidates

In a recent New York Times column, Adam Grant wrote:

Credentials are overrated, and motivation is underrated. It doesn’t matter how much experience people have if they lack the drive to think creatively, work collaboratively and keep on learning. We’re not just hiring people to do a job today — we’re hiring them to make their team and their organization better tomorrow.

Once upon a time – last century, actually – employers could rely on the conferring of a college degree as evidence of a certain level of competence in the degree subject. In some areas, this is probably still true. Generally speaking, this would apply to the scientific areas of study: chemistry, physics, mathematics, etc. Unfortunately, even these area are becoming suspect as academic rigor is eroded in the interests of removing perceived barriers to this or that special interest group. To be very clear, I’m referring to the importance of thorough and complete understanding of the subject. The mine field that academia has become is indeed rife with self-inflicted and often insurmountable barriers to learning. The egregious rise in the cost of tuition, grade inflation, and credential dilution are but a few examples.

There are other factors in play. The speed at which society moves in the 21st Century is simply too fast for the four-year degree to to have any hope of staying relevant, let alone keeping up. Almost every major university offers free courses in a wide variety of subjects so it is possible for a high school graduate to craft the equivalent of a Bachelors or Masters and complete it for a fraction of the cost and in half the time. Ah, but without having completed the paper chase, how can such an industrious individual establish for a potential employer that they have the requisite competence?

Adam Grant has it right. Credentials are overrated. So how can we assess the quality and potential of team candidates? Grant identifies three key mistakes interviewers make in the interview process.

  1. They ask they wrong kinds of questions.
  2. They focus on the wrong criteria.
  3. They’re overly influenced by the best talkers.

If, as Grant suggests, job interviews are broken than conducting remote job interviews in the midst of a pandemic are significantly more challenging. In this post, I wish to speak to the second mistake identified by Grant and write about what we can do to identify our criteria, what we can do during an interview to elicit information about the candidate’s qualifications, and a strategy for improving the efficacy of remote job interviews.

Identify Important Criteria

For the sake of example, we’ll engage in a little time travel into the future and imagine having hired the perfect product owner candidate. What tasks encountered in your work day are no longer an issue with the new candidate on board? Is the product backlog now well-maintained and in a healthy state? Does the sprint runway extend out 4-5 (or more) sprints? Has a stable sprint velocity emerged (suggesting that the user stories are of higher quality and understood better by the team)? Do conflicts between areas of the business occur less frequently than in the past? Are stakeholders pleased with the results they see at sprint and increment reviews?

If our example were for a scrum master candidate, we would ask ourselves different questions for eliciting important criteria for the position. Is there less conflict among team members? Does the team understand the purpose and value for determining the effort involved to complete a user story? And again, has a stable sprint velocity emerged?

In addition to considering what hasn’t been working well (and therefore illuminating what skills you want a candidate to bring to the table) it is also important to include what has been working. It will not serve the organization if one set of problems are swapped for another. Perhaps, for example, the previous product owner was well liked by the team and helped the team maintain a positive morale, but had a poorly maintained product backlog that prevented a good approximation for a release date. It wouldn’t be much of an improvement if the new product owner kept a healthy product backlog but did so by driving the team as a tyrant might.

Test for Matching Skills

With a good feel for the criteria needed to hire the best candidate you can then craft a strategy for determining how well the candidate’s abilities satisfy your criteria. Prepare tasks for the candidate that will verify congruity between what a candidate says they can do and what they can actually do. One approach, which I use frequently, is to present the candidate with a series of scenarios, each designed to build on how the candidate responded to the previous scenario. While I may only present a candidate 3-4 scenarios, I have several dozen in the queue and present the sequence based on how well the candidates responses to the challenge.

For example, for a scrum master role – a high-touch role that requires consummate communication skills, flexibility, and the ability to solve people problems – I may present an initial scenario as follows:

“I’m going to give you several scenarios. You are free to ask any questions you wish about the scenario and state any assumptions you are making in your responses.

You are being considered for a position as scrum master for a team that is developing a healthcare related web application for use in hospitals. This team is responsible for developing the UI/UX components and works closely with another team responsible for much of the database and middle tier components. As a new scrum master, what questions would you ask of anyone in the organization to help you quickly understand what you need to do to become effective as a scrum master for your team?”

There are many things I would hope to hear in the candidate’s answer. To mention a few, I’d like to hear that they want to speak to the product owner, the stakeholders, and, of course, each of the team members. I’d like to hear that they plan to spend time in information gathering mode rather than work immediately to shape the team into some version of teams they’ve worked with at other jobs. I’d like to hear questions from them about what kinds of metrics does the team use and what have they shown.

There are no right and wrong answers to a scenario like this. Just answers that are better than others. And I don’t expect the candidate to deliver an exhaustively thorough response.

From their responses, I might learn that they are a recipe follower or that they are flexible in adapting to the needs of the business while working to establish good scrum practices. I might learn that they really don’t know scrum at all and are only good at parroting text book examples and jargon. I might hear how they would attempt to leverage several things from previous experience while acknowledging those attempts would be experiments and subject to adaptation based on feedback.

Assuming the candidate responded to the first scenario in a way that scores high marks for satisfying my criteria, I might offer the next scenario as follows:

“Assume you have been serving successfully as scrum master for this team for six months now. The product owner calls the team together and says ‘I need to swap out some of the stories in the sprint for work that marketing wants done before the end of the week.’ As scrum master, how would you respond to this development?”

As with the previous scenario, the candidate’s response would be measured against the criteria I have established for the position. Depending on what I’ve heard, I may continue to offer additional scenarios that build on the candidates developing experience with the scenario scrum team.

This strategy is pursued until I’m satisfied the candidate knows what they claim to know or not. A short interview does not bode well for the candidate. A long interview does.

(I would be interested in hearing about any questions, comments, or creative ways you’ve applied this strategy.)

Determining Effort Value – Tactics

While the concept and practice is straightforward, shifting a team from intuitive guesses about story points to a more deliberate approach for determining effort value (a.k.a. story points) can be a challenge at first. The following approach may help start the process.

  1. Begin by focusing on product backlog items (PBIs) that the team has estimated using their previous approach that are at a 5 or greater. There isn’t much to be gained by applying this approach to PBIs estimated at 1 or 2. PBIs that the team knows are a bigger effort but may not be able to articulate why that is the case are good candidates for learning how to apply this technique.
  2. Ask the team how much time it may take to complete a PBI. While I have written before about the importance of excluding time criteria when determining effort values, this can be a good place to start. It is what teams are most familiar with – for better or worse. Teams usually have not problem throwing out a time: 8 hours, 16 hours, etc.
  3. With the time estimate in hand ask the team:

“If you sit in front of your computer and start the clock, will the PBI be done if you do nothing and the estimated time elapses?”

I would hope the team would answer “No.”

  1. With the answer to the first question in hand, ask the following question:

If the passage of time alone won’t get the PBI work completed, what will you be doing (actions and behaviors) to complete the work?

The conversation that follows from this questions is the basis for determining the effort criteria the team needs to better describe what they will be doing on their way to completing the PBI. The techniques around establishing effort criteria are described in an earlier post.

What to do while on “vacation” during a pandemic.

Build a Torii Gate, of course.

A vacation originally planed for this week in Utah was scrubbed. So a significant pivot was in order. Priorities, plans, and schedules shifted and forward motion was begun once again.

I wanted to build a Torii Gate on the East side of my property for several years. The gate and fence that was there worked well enough so it never made it very far up on the backlog. That changed last fall when the Chinook winds – which are frequent, sudden, and fierce in this part of the country – snapped  the two supporting gate posts. (The same storm also blew off the gate on the North side of the property, but that’s another project.) The gate and fence have been braced up by 2×4’s all winter. Not a good look.

Worked on the hashira (posts) over the winter. They needed to withstand the Chinooks. So, 6′ steel post – 3′ bolted within 3 2x6x10s and 3′ sunk into a concrete base – ought to hold for a while.

Time to begin the outside work.

First post had to be set perfectly. This is after it had set for a few days and most of the supports had been pulled away.

Next, the nuki (lower beam) and the shimagi and kasagi (two upper beams.)

Add a little extra flair trim to the kasagi, stain, and seal.

All that was need to complete the Torii gate part of the gate was the gakuzuka – a small brace in the center between the shimagi and kasagi – with an inscription. The weather intervened and brought us about 9″ of fresh snow.

Weather cleared, snow melted, still self-isolating – back to work to build and install the new swinging gate.

Next, dress up the top of the swinging gate with a pattern to match the fence on the north side of the property.

Finally, add the gakuzuka. The Japanese kanji on the way into the gate is “Love.” Find love here, all ye who enter.

The kanji on the way out through the gate is “Peace.” Take peace with you into the world.

Add an exterior handle crafted from ceder and the gate is done. The street view is quite nice, even before the summer vines and surrounding flowers wake up.

Time now to clean up the work site and do a little path repair.

Update – 2020.07.25

Just following a rain storm and the summer foliage starting to grow back.

Transparency, Source Code Quality, and Metrics

I’ve been reading “Hello World: Being Human in the Age of Algorithms” by Hannah Fry. She relates this story:

In 2012, a number of disabled people in Idaho were informed that their Medicaid assistance was being cut. Although they all qualified for benefits, the state was slashing their financial support – without warning – by as much as 30 per cent, leaving them struggling to pay for their care. This wasn’t a political decision; it was the result of a new ‘budget tool’ that had been adopted by the Idaho Department of Health and Welfare – a piece of software that automatically calculated the level of support that each person should receive.

Unable to understand why their benefits had been reduced, or to effectively challenge the reduction, the residents turned to the American Civil Liberties Union (ACLU) for help.

[The ACLU] began by asking for details on how the algorithm worked, but the Medicaid team refused to explain their calculations. They argued that the software that assessed the cases was a ‘trade secret’ and couldn’t be shared. Fortunately, the judge presiding over the case disagreed. The budget tool that wielded so much power over the residents was then handed over, and revealed to be – not some sophisticated AI, not some beautifully crafted mathematical model, but an Excel spreadsheet.

Within the spreadsheet, the calculations were supposedly based on historical cases, but the data was so badly riddled with bugs and errors that it was, for the most part, entirely useless. Worse, once the ACLU team managed to unpick the equations, they discovered ‘fundamental statistical flaws in the way that the formula itself was structured’. The budget tool had effectively been producing random results for a huge number of people. The algorithm – if you can call it that – was of such poor quality that the court would eventually rule it unconstitutional.

My first thoughts were, “How bad a spreadsheet hack do you gotta be to have your work be declared unconstitutional? And just how many hacks does it take to build an unconstitutional spreadsheet?”

To be fair, math is hard. Government is complex. And I’m comfortable with the assumption that everyone who had a hand in building this spreadsheet had good intentions. Venturing a guess, the breakdown happened at the manager/politician/lawyer level.

It is probable that the complexity of the task quickly overtook the abilities of the spreadsheet author(s) and the capabilities of the tool. Eventually, no single person understood how the whole thing worked. Consequently, making a change in one place affected how the spreadsheet worked in n other places and no one was capable of regression testing the beast. But the manager/politician/lawyer types knew what to do: Hide behind the “trade secret” smoke.

There are many lessons from this story. Plenty of points of failure. What I’m interested in writing about is the importance of transparency and how a good set of performance metrics can help in maintaining transparency.

The externally facing opacity in this story is readily apparent. What we don’t (and probably never will) see is the lack of transparency prevalent internally to the Idaho Department of Health and Welfare and whomever designed and built the spreadsheet tool. I’d bet a round of drinks that neither has heard of Agile much less employed its principles and practices. These by themselves – when actually practiced long term – go a long way toward establishing a culture of transparency. This is the key. Long term practice. A period of time is needed to change behaviors, mindsets, attitudes, beliefs, and when necessary, personnel. Even over the long term, implementing an Agile methodology isn’t improvisational theater. A strategy and a way to measure progress is needed.

Which gets me to metrics.

Selecting metrics and tuning them over time is critical to measuring team performance and developing improvement plans. Metrics that inform meaningful actions are the goal. Leave the vanity metrics that verify what managers want to hear or already “know” to the competition.

I’ve encountered my share of overly complex ways to measure the performance of individuals and teams. Often the metrics taken from machine-like task work (for example, assembly line work) are applied to creative or intellectual/knowledge tasks. This type of re-purposing results in, for example, counting lines of code or the number of source code check-ins as an indicator of software developer productivity. It never ends well.

When working to define a set of metrics to track an individual or team’s performance it is more effective to begin by asking several questions.

  • What problems are you trying to solve?
  • What questions will your chosen metrics answer?
  • What questions will your chosen metrics not answer?
  • How, specifically, will you know you can trust you metrics? How will you know when they are right and how will you know when they are wrong?
  • How well do your metrics compliment each other? That is, by combining them do you end up with a much better picture of individual or team performance the you do by considering individual metrics?
  • Do your metrics support any planned actions for improvement? Are you collecting actionable metrics or vanity metrics?

Finally, it is important to understand the limits of performance metrics. Displaying velocity charts that have fractions of story points implies an accuracy that simply isn’t there. Significantly adjusting project timelines based on the first three sprints worth of velocity data can have adverse secondary effects on the project.

There is no perfect set of metrics, no divine set of measures that match an impossible standard of perfect objectivity and fairness. The best possible set of metrics is one that supports useful decisions rather than simply instructs managers where to apply the stick. They should help show the way to performance improvement rather than simply report results.

I work to have 3-5 metrics, depending on the individual, the team, and the project. Less than 3 and the picture starts to look rather flat. More then 5 and the task of performance monitoring can become overly complicated and cumbersome. Keep it lean and manageable. That way, it’s easier to tell when things aren’t working and your metrics are much less likely to violate your team’s constitutional rights.