How to Play with LEGO® When You Should Be Testing

There is a gap between automating your testing and implementing automated testing. The first takes what you already do and automates it; the second writes your tests in an executable form. The distinction may seem subtle, and indeed some testers view the two as parts of a single process.

However, for someone who reads test cases, for instance your customer support, the difference is significant; what is a transparent process to one party is often an opaque process to another. In fact, if you are a nontechnical person reading this explanation, you may very well be scratching your head trying to see the difference and understand why it is crucial.

Writing Tests versus Reading Tests

Automating involves looking at your interactions and your processes, both upstream and downstream, and then automating the part in between while maintaining those interactions. This automation process should mean that nontechnical people remain able to understand what your tests do, while also providing your developers the necessary detail of how to reproduce bugs.

The second piece, communicating detailed information to your developers, is relatively easy, as you can log any operation with the service or browser; explaining these results to nontechnical people, however, is often significantly more difficult.

When it comes to communicating the information, you have a few options; the first is to write detailed descriptions for all tests. Writing descriptions works, at least initially.

Regrettably, what tends to happen is that these descriptions can end up inaccurate without any complaints from the people who read your tests. If testing goes smoothly, then nothing will happen, and no one will notice the discrepancy. The problems only arise when something goes wrong.

Meaningful Tests Deliver Confidence

The worst problem an automated test implementation can have is one that erodes confidence in the results. When a test fails and the description does not match the implementation, you have suddenly undermined confidence in the entire black box that certifies how the software works.

And now, the business analyst cannot sleep at night after release testing, because there is a nagging suspicion something might not be right.

You might respond that keeping the descriptions updated accurately should be easy, and technically that should be true. However, the reality is that description writing (and updating) is often only one aspect of someone’s job.

Breaking Down the Problem

As a project progresses and accumulates tasks, particularly as it falls behind schedule or over budget, the individuals writing the descriptions are often too worried about other details to give their descriptions proper attention.

And whether we like it or not, when we move to automated testing we become prone to hacks and shortcuts just like everyone else. The simple reality is that testing a new feature appears to be a more pressing priority than updating the comments on some old test cases.

Moreover, above and beyond your ability to remember or accurately update the comments, there is another point to consider. There is an underlying problem with these detailed descriptions as they violate one of the more useful rules of programming: DRY (Don’t Repeat Yourself).

One of the most important reasons to practice DRY is that duplicates all too easily get out of sync and cause systems to behave inconsistently. In other words, two bits of code that should do the same thing now do slightly different things. Oops. Or, in the case of documentation and automated tests, two bits that should mean the same thing are now out of sync. Double oops.

How do we avoid duplication and implement DRY?

We can use technologies that address this problem, such as a natural-language API, so that your tests read like English.

For Example:
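Below is a Python sketch of such a natural-language API; the Order class and its method names are hypothetical helpers, not a real library:

```python
class Order:
    """A tiny wrapper that lets a test read like English."""

    def __init__(self):
        self.customer = None
        self.items = []
        self.submitted = False

    def placed_by(self, name):
        self.customer = name
        return self  # returning self lets the calls chain

    def containing(self, item):
        self.items.append(item)
        return self

    def submit(self):
        self.submitted = True
        return self

    def should_be_submitted(self):
        assert self.submitted, "order was never submitted"
        return self

# The test itself reads almost like a sentence:
Order().placed_by("Jane").containing("one widget").submit().should_be_submitted()
```

Every wrapper method exists only so that the final line reads naturally; that is exactly the helper-writing overhead discussed below.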

This example is readable and executable in just about every language, but regrettably, to get to this form you will have to write a lot of wrappers and helpers so that the syntax is easy to follow.

This means you will likely need to rely on authors with significant experience in your specific programming language, and you may not have someone with the right expertise on hand.

An alternative is to create a domain specific language in which you write your tests. A domain specific language means that you create tests in something like Gherkin/Cucumber or you write a proprietary parser and lexer. Of course, this path again relies on someone who has a lot of experience with API / Language design.

The Simplest Solution

To me the preferred method is to use Gherkin, mostly because it is easier to maintain after your architect wins the lottery and moves to Bermuda. With Gherkin, when you run into a problem you can Google the answer or hire a specialist. There is a sense of mystery and adventure about working on issues where you can’t Google the answer; at the same time, it’s not a reliable business practice.

The most significant benefit that I’ve discovered is that for many people, this method no longer feels like programming. This statement undoubtedly seems odd coming from a programmer, but hear me out as there is a method to my madness.

Solutions for the People You Have

To begin, let’s acknowledge that there is a shortage of programmers, especially programmers that are experts in your particular business. Imagine if you had a tool that you could hand to anyone who knows your field and that would allow this individual to understand and define tests? Wouldn’t that be grand?

How would this tool look? What would it require? To accomplish readability (understanding) and functionality (definition) you’d need to be able to hand off something that is concrete (versus abstract) and applicable to the business that you are in, but most importantly it needs to have a physical appearance.


Imagine if you could design tests with LEGO®? There is nothing you can build out of Lego bricks that you couldn’t create in a wood shop. Unfortunately, most people are too intimidated to make anything in a woodshop. Woodworking is the domain of woodworkers. On the flipside, give anyone a box of Lego bricks, and they will get to work building out your request.

Software development runs into the woodworker conundrum: programming is the domain of developers. Give a layperson C#, Java or JavaScript and assign them to a project to build and they’ll get so flustered they won’t even try. Give them Lego bricks, and they will at the least try to build out their assignment.

Reducing the barrier to accomplishing something new is extremely important for adoption; we know this is a barrier to getting customers to try our software, but we often forget the same rule applies to adopting new things for our teams.

This desire for something concrete to visualize is why we come across people who would “like to learn to program” while building complicated Excel sheets filled with basic visual formulas. Because these folks can see Excel, they use formulas naturally and don’t even realize that they are programming. My method is similar in concept.

To successfully automate our testing we need to reduce the barriers to trying the technology we plan to introduce tomorrow. As an organization adopts change, we need to find ways to make changes transparent and doable; we need to convince our people that they will succeed, or they might not even try.

Gherkin Lego bricks

As I wrote in my post Variables in Gherkin:

“The purpose of Cucumber is to provide a clear and readable language for people (the humans) who need to understand a test’s function. Cucumber is designed to simplify and clarify testing.

For me, Cucumber is an efficient and pragmatic language to use in testing, because my entire development team, including managers and business analysts, can read and understand the tests.

Gherkin is the domain-specific language of Cucumber. Gherkin is significant because as a language, it is comprehensible to the nontechnical humans that need to make meaning out of tests.

In simpler terms: Gherkin is business readable.”

This explanation shows why Gherkin is the perfect language for our Lego bricks, the testing “building blocks.” To create an infrastructure that is readable by, and functional for, the people you have on hand, I like to develop components that I then provide to the testers.

Instead of needing specific technical knowledge, such as a particular programming language, the testers now just need to know which processes they need to test.

For example, I would like to:

  • Open an order = a blue 2×4 brick
  • Open a loan = a blue 2×6 brick
  • Close a loan = a red 2×4 brick
  • Assign a ticket = a green 2×4 brick

This method addresses the issue of DRY because each “brick” has its process, so if you need to change the process, you change out the brick. If a process is broken, you pass it back to the development team, but it is very precise. It makes it concrete and removes a lot of the abstract parts inherent in software development.

⇨ This method addresses the issue of readability because each brick is a concrete process. Your testers can be less technical while producing meaningful automated tests.

⇨ This method solves the problem of confidence because problems are isolated to bricks. If one brick is broken, it doesn’t hint at the possibility that all bricks are broken.

⇨ This method also solves the problem of people, because it’s much easier to find testers who understand your business process and goals, such as selling and closing mortgages, without also having to understand the abstract nature of the underlying software that makes it all work.

The reality is that in this age every company has software while few are software companies.  Companies rely on software to deliver something concrete to their customers. My job as an automation and process improvement specialist is to make the testing of the software piece as transparent and as painless as possible so that your people can focus on your overarching mission.

“LEGO® is a trademark of the LEGO Group of companies which does not sponsor, authorize or endorse this site.”

The Cost of Software Bugs: 5 Powerful Reasons to Get Upset

If you read the PossumLabs blog regularly, you know already that I am focused on software quality assurance measures and why we should care about implementing better and consistent standards. I look at how the software quality assurance process affects outcomes and where negligence or the effects of big data might come into play from a liability standpoint. I also consider how software testing methodologies may or may not work for different companies and situations.

If you are new here, I invite you to join us on my quest to improve software quality assurance standards.

External Costs of Software Bugs

As an automation and process improvement specialist, I am somewhat rare in my infatuation with software defects, but I shouldn’t be. The potential repercussions of said bugs are enormous.

And yet you ask, why should YOU care?

Traditional testing focuses on where in the development lifecycle a bug is found and how to reduce costs. This is the debate of Correction vs. Prevention and experience demonstrates that prevention tends to be significantly more budget-friendly than correction.

Most development teams and their management have a singular focus when it comes to testing: they want to deliver a product that pleases their customer as efficiently as possible. This self-interest, of course, focuses on internal costs. In the private sector profit is king, so this is not surprising.

A few people, but not many, think about the external costs of software defects. Most of the studies and interested parties tend to be government entities or academic researchers.

In this article, I discuss five reasons that you, as a consumer, a software developer, or whoever you might be, should be concerned with the costs of software bugs to society.

#1 No Upper Limit to Financial Cost

The number one reason that we should all be concerned is that, in reality, the costs of software defects, misuse or crime likely have no upper limit.

In 2002 NIST compiled a detailed study looking at the costs of software bugs and what we could do to both prevent and reduce costs, not only within our own companies but also external societal costs. The authors attempted to estimate how much software defects cost different industries. Based on these estimates they then proposed some general guidelines.

Although it is an interesting and useful paper, the most notable black swan events of the last 15 years demonstrate that these estimates provide a false sense of security.

For example, when a bug caused $500 million US in damage in the Ariane 5 rocket launch failure, observers treated it like a freak incident. Little did we know that the financial cost of a “freak incident” would grow by a few orders of magnitude just a few years later.

This behavior goes by many names, Black Swans, long tails, etc. What it means is that there will be extreme outliers. These outliers will defy any bell curve models, they will be rare, they will be unpredictable, and they will happen.


A Black Swan is an unpredictable event, so named by Nassim Nicholas Taleb in his book The Black Swan: The Impact of the Highly Improbable. It has been predicted that the next Black Swan will come from cyberspace.

Long tail refers to a statistical distribution in which most events fall within a narrow range, while a few rare events occur far out at the end of the tail.

Of course, it is human nature always to try and assemble the clues that might lead to predicting a rare event.

Let’s Discuss Some Examples:

4 June 1996

A 64-bit floating-point value is written to a 16-bit integer, and 500 million dollars go up in flames. As you see in Table 6-14 (page 135) of the previously mentioned NIST study, the estimated annual cost of software defects for an aerospace company of this size was only $1,289,167. A loss of 500 million blows that estimate right out of the water.

This single bug cost several hundred times the expected annual cost of defects for a company.
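The failure mode can be sketched in a few lines of Python; the velocity value here is illustrative, not the actual Ariane telemetry:

```python
import struct

# A 64-bit float holds the horizontal-velocity value comfortably...
velocity = 40000.0

# ...but the conversion target was a signed 16-bit integer,
# whose maximum value is 32767.
as_int = int(velocity)

# Emulate what 40000 looks like once forced into 16 bits:
wrapped = struct.unpack("<h", struct.pack("<H", as_int & 0xFFFF))[0]

print(wrapped)  # -25536: garbage where 40000 was expected
```

In the real flight software the conversion raised an unhandled operand-error exception that shut down the inertial reference system; the wraparound above simply illustrates why 40000 cannot survive the trip into 16 bits.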

May 2007

A startup routine for engine warm-up is released with some new conditions. The NIST study’s Table 6-14 (page 135), “Company Costs Associated with Software Errors and Bugs,” estimated the 2002 cost of software bugs for an automotive company with more than 10,000 employees at only $2,777,868 per company, per year. That is not even a dent in the real number: this code cost Volkswagen 22 billion dollars.

That equates to about 10,000 times the expected costs of defects per company per year.

It is human nature always to try and assemble the clues that might lead to predicting a rare event. Unfortunately, when it comes to liability, it seems only academics are interested in this type of prediction, but given the possibility of exponential costs to a single company, shouldn’t we all be concerned?

#2 Data Leaks: Individual Costs of Data Loss?

Data leaks of 10-100 million customers are becoming routine. These leaks are limited by the size of the datasets and thus unlikely to grow much more, in large part because few companies hold enough data to push a breach into the billions of records.

Facebook has roughly 2 billion users, so the theoretical limit of a data breach today is a dataset the size of Facebook’s, or of a Chinese or Indian government system. We only have 7.5 billion people on earth, so to have a breach of 10 billion users we first need more people.

Security Breaches are limited by the Human Population Factor

That is what makes security breaches different: the only thing population size tells us is that we will approach the theoretical limit of how bad a breach can be. The Equifax breach affected 143 million users.

When it comes to monetary damages for the cost of the data breach, there is not a limiting factor, such as population size.

As we saw with Yahoo and more recently Equifax, cyber security software incidents show a similar pattern of exponential growth when it comes to costs. Direct financial costs are trackable, but the potential for external costs and risks should concern everyone.

#3 Bankrupt Companies and External Social Costs

At its inception, no one would have predicted that the simple code pasted below might cost VW $22 billion US:

    if (-20 /* deg */ < steeringWheelAngle && steeringWheelAngle < 20 /* deg */) {
        lastCheckTime = 0;
        cancelCondition = false;

        if (lastCheckTime < 1000000 /* microsec */) {
            lastCheckTime = lastCheckTime + dT;
            cancelCondition = false;
        }
        else cancelCondition = true;
    }
    else cancelCondition = true;


Even if you argue that this is not an example of a software defect but rather deliberate fraud, it’s unlikely you’d have predicted the real cost. Certainly, this one was different, unexpected, not conforming to our expectations of a software defect. But that is the definition of a Black Swan: it does not conform to expectations, and here the software did not act according to expectations. The result cost billions.

How many companies can survive a $22 billion hit? Not many. What happens when a company we rely on heavily suddenly folds? Say, the company that manages medical records in 1/5th of US states? Or a web-based company that provides accounting systems to clients in 120 countries just turns off?

#4 Our National Defense is at Risk

It doesn’t take much to understand the significance of this one, and it is an issue currently in the limelight. Software defects, faults and errors have the potential to produce extreme costs, despite their infrequent occurrence. Furthermore, the origins of the costs of long tail events may not always be predictable.

After all what possible liability would Facebook have for real-world damages regarding international tampering in an election? It is all virtual, just information; until that information channel is misused.

There is very little chance that when actuaries for Facebook thought about risk, they looked for election interference. Sure, they considered liability, such as people live-broadcasting horrible and inhumane things, but did they contemplate foreign election interference? And even if they did consider the possibility, how would they have been able to predict or monitor the entry point?

And that is the long tail effect; it is not what we know, or can imagine, it is the unexpected. It is the bug that can’t be patched, as the rocket exploded, it is the criminal misuse of engine optimization routines or the idea that an election could be swayed due to misinformation. These events are so costly that we can’t assume that we know how bad it could be because the nature of software means that things will be as bad as they possibly can get.

#5 Your Death or Mine

Think of the movie IT, based on Stephen King’s book of the same name: a clown that deceives children and leads them to death and destruction. What happens when a piece of equipment runs haywire, masquerading as one thing while doing another? Software touches enough of our lives, from the hospital setting to self-driving cars, that a software defect could undoubtedly lead to death.

We’ve already had a case, presumably settled out of court, where a Therac-25 radiation therapy machine irradiated people to death. What happens when a cloud update to a control system removes fail-safes on hundreds or thousands of devices in hospitals or nursing homes? Who will be held liable for those deaths?

Mitigation is often an attempt at Prediction

A large part of software quality assurance is risk mitigation as an overlapping safety net to look for unexpected behaviors. Mitigation is an attempt to make it less likely that your company unintentionally finds the next “unexpected event.”

A lot has been written about the optimal way to get test coverage on your application. Most of it comes down to testing the system at the lowest level (unit test) that is feasible, which has given us the testing pyramid. This is mathematically sound. Unfortunately, the pyramid assumes that there are no gaps in coverage; less overlap means that a gap in coverage at a lower level is less likely to be caught at a higher level.

The decision about test coverage and overlapping coverage can be approximated using Bernoulli trials, each of which delivers one of two results: success or failure.
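That approximation can be sketched in a few lines of Python; the 90% catch rates below are illustrative assumptions, not measured values:

```python
def escape_probability(catch_rates):
    """Chance that a defect slips past every layer of testing,
    treating each layer as an independent Bernoulli trial."""
    p = 1.0
    for rate in catch_rates:
        p *= 1.0 - rate  # the defect must escape this layer too
    return p

# One layer (unit tests alone) catching 90% of defects:
print(escape_probability([0.90]))        # about 0.10 -> 10% escape

# Two overlapping layers (unit plus end-to-end), each catching 90%:
print(escape_probability([0.90, 0.90]))  # about 0.01 -> 1% escape
```

The multiplication only holds where both layers actually cover the code path; a gap at the lower level contributes a catch rate of zero there, which is why overlap matters.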

Prioritizing the Magnitude Of Errors and their Effects

When we look at the expected chance of a defect and multiply that with the cost of a defect, we can compare that to the chance of a defect with overlapping coverage, multiplied by the cost.

We are usually looking at the cost of reducing the chance of a defect slipping through and comparing that to our estimated cost of a defect.

Unfortunately, the likelihood that we underestimate the cost of a defect due to long tail effects is very high. Yes, it is improbable that your industry will see a billion-dollar defect discovered this year; but how about in the next 10 years? Now the answer becomes a maybe; let us call it a 10% chance, and let us say that there are 100 companies in your industry. What is the cost of one of those outlier defects per year?

$1,000,000,000 × 0.01 (1% chance per year) × 0.01 (1% chance of it hitting your company) = $100,000 expected cost of outlier defects per year.

The problem with outlier events is that despite their rare nature, even with a significantly small probability that your company might be the victim, the real outliers have the potential to be so big and expensive that it may, in fact, be worth your time investing in considering the possibility.
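Written out as code, with all inputs being the article's illustrative assumptions:

```python
# Back-of-envelope expected cost of an outlier (black swan) defect.
outlier_cost = 1_000_000_000  # a billion-dollar defect
p_per_year = 0.01             # "10% chance in the next 10 years" -> ~1% per year
p_your_company = 0.01         # 1 company out of 100 in the industry

expected_cost = outlier_cost * p_per_year * p_your_company
print(expected_cost)  # roughly 100,000 per year
```

Small probabilities multiplied by an enormous cost still leave a number worth budgeting for.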

Enduring the Effect of a Black Swan

In reality, companies might use bankruptcy law to shield themselves from the full cost of one of these defects. VW’s financial burden for their expensive defect stems from the fact that they could afford it without going bankrupt. The reality is that most companies couldn’t afford to pay the costs of this type of event and would ultimately be forced to dissolve.

We cannot continue to ignore that software defects, faults, errors, etc. have the potential to produce extreme costs, despite infrequent occurrences. Furthermore, the origins of the costs of long tail events may not always be predictable.

The problem with the “rarity of an event” as an insurance policy is that the risk of significant black swan bug events goes beyond the simple financial costs borne by individual companies. The weight of these long tail events is borne by society.

And so the question is, for how long and to what extent will society continue to naively or begrudgingly bear the cost of software defects? Sooner or later the law will catch up with software development. And software development will need to respond with improved quality assurance standards and improved software testing methodologies.

What do you think about these risks? How do you think we should address the potential costs?


Tassey, G., Ph.D. (2002, May). Report 02-3: The Economic Impacts of Inadequate Infrastructure for Software Testing [PDF]. Gaithersburg, MD: RTI for the National Institute of Standards and Technology.

Unrealistic Expectations: The Missing History of Agile

Unrealistic expectations, or “why we can’t replicate the success of others…”

Let’s start with a brain teaser to set the stage for questioning our assumptions.

One day a man visits a church and asks to speak with the priest. He asks the priest for proof that God exists. The priest takes him to a painting depicting a group of sailors, safely washed up on the shore following a shipwreck.

The priest tells the story of the sailors’ harrowing adventure. He explains that the sailors prayed faithfully to God and that God heard their prayers and delivered them safely to the shore.

Therefore God exists.

This is well and good as a story of faith. But what about all the other sailors who have prayed to God, and yet still died? Who painted them?

Are there other factors that might be at play?

When we look for answers, it’s natural to automatically consider only the evidence that is easily available. In this case, we know that the sailors prayed to God. God listened. The sailors survived.

What we fail to do, is look for less obvious factors.

Does God only rescue sailors that pray faithfully? Surely other sailors that have died, also prayed to God? If their prayers didn’t work, perhaps this means that something other than faith is also at play?

If our goal is to replicate success, we also need to look at what sets the success stories apart from the failures. We want to know what the survivors did differently from those that did not. We want to know what not to do, what mistakes to avoid.

In my experience, this is a key problem in the application of agile. Agile is often presented as the correct path; after all lots of successful projects use it. But what about the projects that failed, did they use Agile, or did they not implement Agile correctly? Or maybe Agile is not actually that big a factor in the success of the project?

Welcome to the history of what is wrong with Agile.

Consider this: a select group of Fortune 500 companies, including several technology leaders, decides to conduct an experiment. They hand-pick people from across their organizations to complete a very ambitious task, one an order of magnitude beyond anything they’d previously attempted, with an aggressive deadline.

Question 1: How many do you think succeeded?

Answer 1: Most of them.

Question 2: If your team followed the same practices and processes that worked for these teams do you think your team would succeed?

Answer 2: Probably not.

The Original Data

In 1986, Hirotaka Takeuchi and Ikujiro Nonaka published a paper in the Harvard Business Review titled “The New New Product Development Game.” In this paper, Takeuchi and Nonaka tell the story of businesses that conducted experiments with their personnel and processes to find new ways of doing product development. The paper introduced several revolutionary ideas and terms, most notably the practices that we now know as agile (and scrum).

The experiments, run by large companies and designed for product development (not explicitly intended for software development), addressed common challenges of the time regarding delays and waste in traditional methods of production. At the root of the problem, the companies saw the need for product development teams to deliver more efficiently.

The experiment and accompanying analysis focused on a cross-section of American and Japanese companies, including Honda, Epson, and Hewlett-Packard. To maintain their competitive edge each of these companies wished to rapidly and efficiently develop new products. The paper looks at commonalities in the production and management processes that arose across each company’s experiment.

These commonalities coalesced into a style of product development and management that Takeuchi and Nonaka compared to the rugby scrum. They characterized this “scrum” process with a set of six holistic activities. Taken individually, these activities may appear insignificant and may even be ineffective. However, when they occur together in cross-functional teams, they result in a highly effective product development process.

The 6 Characteristics (as published):

  1. Built-in instability;
  2. Self-organizing project teams;
  3. Overlapping development phases;
  4. Multilearning;
  5. Subtle control;
  6. And, organizational transfer of learning.

What is worth noting, is what is NOT pointed out in great detail.

For instance, the companies hand-picked these teams from a large pool of, most likely, above-average talent. These were not random samples, and these were not even companies converting their processes; they were experiments with teams inside companies. Nor did the companies bet the farm on these projects: they were large, but if they failed the company would likely not go under.

If we implement agile, will we be guaranteed success?

First, it is important to note that all the teams discussed in the paper delivered positive results. This means that Takeuchi and Nonaka did not have the opportunity to learn from failed projects. As there were no failures in the data set, they did not have the opportunity to compare failures with successes, to see what might have separated the successes from failures.

Accordingly, it is important to consider that the results of the study, while highly influential and primarily positive, can easily deceive you into believing that if your company implements the agile process, you are guaranteed to be blessed with success.

After years in the field, I think it is vitally important to point out that success with an agile implementation is not guaranteed. I’ve seen too many project managers, team leads, and entire teams banging their heads against brick walls, trying to figure out why agile just does not work for their people or their company. You, unlike the experiments, start with a random set of people, and agile might not be suited for them.

To simplify this logical question; if all marbles are round, are all round things marbles? The study shows that these successful projects implemented these practices, it did not claim these practices brought success.

What is better: selecting the right people or the right processes for the people you have?

Consider that your company may not have access to the same resources available to the companies in this original experiment. These experiments took place in large companies with significant resources to invest. Resources to invest in their people. Resources to invest in training. Resources to invest in processes. Resources to cover any losses.

At the outset, it looks like the companies profiled by Takeuchi and Nonaka took big gambles that paid off as a result of the processes they implemented. However, it is very important to realize that they, in fact, took very strategic and minimal risk, because they made sure to select the best people, and did not risk any of their existing units. They spun up an isolated experiment at an arm’s length.

If you look at it this way, consider that most large multinational companies already have above average people, and then they cherry pick the best suited for the job. This is not your local pick-up rugby team, but rather a professional league. As large companies with broad resources, the strategic risks they took may not be realistic for your average small or medium-sized organization.

The companies profiled selected teams that they could confidently send to the Olympics or World Cup. How many of us have Olympians and all-star players on our teams? And even if we have one or two, do we have enough to complete a team? Generally, no.

The Jigsaw Puzzle: If one piece is missing, it will never feel complete.

Takeuchi and Nonaka further compare the characteristics of their scrum method to a jigsaw puzzle. They acknowledge that a single piece of the puzzle on its own, or a missing piece, means that your project will likely fail. You need all the pieces for the process to work. They neglect to emphasize that this also means you need the right people to assemble the puzzle correctly.

The only mention they make regarding the people you have is the following:

“The approach also has a set of ‘soft’ merits relating to human resource management. The overlap approach enhances shared responsibility and cooperation, stimulates involvement and commitment, sharpens a problem-solving focus, encourages initiative taking, develops diversified skills, and heightens sensitivity toward market conditions.”

In other words, the solution to the puzzle is not only the six jigsaw puzzle pieces, but it is also your people. These “soft merits” mean that if your people are not able to share responsibility and cooperate, focus, take the initiative, develop diverse skills and so on, they aren’t the right people for an agile implementation.

If you don’t have all the pieces, you can’t complete the puzzle. And if you don’t have the right people, you can’t put the pieces together in the right order. Again, you might be round, but you might not be a marble.

Human-Centered Development for the People You HAVE

As with any custom software development project, the people who implement it are key to your project's success. Implementing agile changes the dynamics of how teams communicate and work. It changes the roles and expectations of all aspects of your project, from executive management to human resources and budgeting.

Agile may work wonders for one company or team, but that success doesn’t mean that it will work wonders for YOUR team. Especially if all stakeholders do not understand the implications and needs of the process or they lack the appropriate aptitudes and skills.

In other words, if these methods don't work for your people, don't beat yourself or everyone else up. Instead, focus on finding a method that works for you and for your people.

Agile is not the only solution …

Why do people select agile? People implement agile because they have a problem to solve. However, with the agile approach managers need to step back and let people figure things out themselves. And that is not easy. Especially when managers are actively vested in the outcome. Most people are not prepared to step back and let their teams just “go.”

Maybe you have done the training, received the certifications, and theoretically "everyone" is on board. And yet, your company has yet to see all-star success. Are you the problem? Is it executive management? Is it your team? What is wrong?

I cannot overemphasize that the answer may be as simple as the people you have. Consider that the problem is unrealistic expectations. The assumption when using agile and scrum is that it is the best way to do development, but what if it is not the best way for you?

If you don’t have the right people or the right resources to implement agile development correctly, then you should probably do something else. At the same time, don’t hesitate to take the parts of agile that work for you. 


Takeuchi, H., & Nonaka, I. (2014, August 01). The New New Product Development Game. Harvard Business Review. Retrieved July 19, 2017.

Process Design for the Team You Have: Surgical Team

Maximizing Productivity and Creating Value Series

Human-centered Development Strategy: Article I

The Brooks Surgical Team: Archaic or Cutting Edge?

The surgical team as described by Frederick Brooks in The Mythical Man-Month admittedly feels a little archaic to the modern development team. To be fair, Brooks's audience probably saw medical science as a bit sexier back then.

Sexy or not, the surgical team concept remains an effective and pragmatic tactic when implemented as part of a human-centered development strategy. Effective implementation of the "surgical team" increases productivity and creates value. The concept is based on the observation that one particular developer may be dramatically, even 10x, more efficient than your other developers. Rather than a team of equals working on a project, your team will instead support this most efficient individual, with the result being an all-around increase in efficiency.

Whether the 10x number is precisely accurate, the concept remains a viable way to take advantage of the people you have on hand. In my experience, and likely in yours as well, it is rare to find a team in which all developers demonstrate equivalent abilities and output. There is always an outlier or two. Some people communicate better, some see the big picture, some are generalists and so on.

As I discussed in “You don’t work at Google and neither do I,” it is important to note that the surgical team is a non-egalitarian system designed so that your average contributors support and augment your best contributor(s). It may not work for every team or every individual. And, even if it is good for your team overall, some people may choose to leave rather than work under this arrangement. This may actually be a gift in disguise, given that any such individual is likely not an ideal team player under any condition.

What exactly is a “Surgical team” in custom software development?

And how can we develop an effective surgical team to maximize productivity for your custom software development process?

Let’s talk about what a surgical team might look like in a contemporary workplace, starting with a hypothetical team: a bunch of people trying to make a deadline work without any particular or official hierarchical structure.

The Organic Surgical Team

Many activities will take place concurrently and spontaneously.

A common event is for one individual to start making tools. These tools will then help others to get their work done faster, test existing code, set up some code generation, rig up frameworks that deploy code and so on.

The tools may take on many different shapes, but ultimately the end result is that they change how work gets done while improving the possibility that the project is successful.

This is an organic example of the surgical team. In all probability, the organic surgical team is already a familiar pattern of work for your people.

The Problem with Organic Surgical Teams

Embracing this system creates an unofficial surgical team, where one individual takes care of a large percentage of the original work, and the others follow along to work in a system that is of the “lead” individual’s design.

Ultimately, this is likely to happen in any project subject to adequate chaos and enough people. It happens because it works. Unfortunately, it is also a pattern that can give rise to significant problems, depending on a variety of variables and on the interplay between management and team members.

The most common problem that I have experienced is the creation of a gap between the formal and informal organizational structure. If managers address these problems early on, they generally resolve with positive or minimally negative effects.

If the existing structures do not address the problems, friction within the team will develop and potentially cascade out of control prior to resolving. This friction is often rooted in the perception of unequal expectations and unfair privileges across the team.

Human-centered Development Strategy

Working with the people you have, how can you intentionally structure an effective surgical team and what will this process look like?

To begin with, you need to know your team. Your team is made up of people. And everyone is different. Unique. You need to know individuals’ strengths and weaknesses.

Questions to ask yourself or your project management team:

  • Who is the person who always takes ownership? Is this person a tool maker because s/he sees the big picture? Or because s/he simply sees that the team needs tools to proceed?
  • Who is best at communication?
  • Who is a natural manager?
  • Who understands the whole organization and the need for a system?
  • Who do you have that is a generalist?

Once you identify everyone's skills and strengths, you need to look at how they can be combined and maximized. Can the tool maker fulfill the communication and team management roles? Or should there be a second team member who delegates and supports the tool maker? Or perhaps this is the role of an outside project manager?

Working with the team you have…

Whatever you do and whatever the makeup of your unique development team, keep in mind the surgical team only works if you honestly acknowledge, recognize and accept the team you have on hand. This is why I call this human-centered development.

Many factors from geography to budget to the type of company will have an effect on the types of people at your disposal. You very well may not have your ideal team on hand.

Furthermore, in today's world, not every project will require the same skill sets, and yet most of us will not have the liberty to hire for those skills; we must make do with the people we already have on hand.

The US Bureau of Labor Statistics predicts that the software development field will experience a 17% increase in jobs from 2014 to 2024. With an existing workforce of 1,114,000 software developers, that means an increase of almost 190,000 new jobs, a number that new graduates can likely just barely cover.

Add to the mix the number of people continuing to exit the workforce in the next decade, and there is a high likelihood that there will be a surplus of software development jobs. Finding good people is already difficult, with entire recruiting and outsourcing industries dedicated to (and capitalizing on) alleviating the problem. And it looks like the future will only be more of a challenge.

In other words, learning how to best work with the people you have is a much more likely path to success than hoping to hire the perfect solution.

Cutting Edge: to Maximize Productivity, Maximize your People

In answer to the original question, the idea of the surgical team is, in fact, cutting edge as an intentional and pragmatic development tactic. Human-centered development strategy means implementing the appropriate tactics for the people you have.

To maximize productivity and create value with the team you have, you need to be honest. If you try and force people into roles you will generate stress and friction, and eventually, something somewhere will break.

In situations that naturally promote the organic development of surgical teams, intentionally creating a surgical team will, by all means, improve productivity. Effective use of the surgical team as a tactic means that you can engineer productivity, maximizing your people and your resources for the best possible outcome.

Keep in mind that the idea of the surgical team is not to put one team member on a pedestal while relegating other team members into menial jobs. Instead, the goal is to maximize the contribution of individual skill sets and personalities, as needed for each project. It may not work for every team or every situation. But when applied intentionally and thoughtfully the surgical team can be a highly effective solution for your people.

Variables in Gherkin: Readable by Humans

Clear and Readable Language

The purpose of Cucumber is to provide a clear and readable language for people (the humans) who need to understand a test’s function. Cucumber is designed to simplify and clarify testing.

For me, Cucumber is an efficient and pragmatic language to use in testing, because my entire development team, including managers and business analysts, can read and understand the tests.

Gherkin is the domain-specific language of Cucumber. Gherkin is significant because as a language, it is comprehensible to the nontechnical humans that need to make meaning out of tests.

In simpler terms: Gherkin is business readable.

Why add variables to Gherkin & Cucumber?

An unfortunate side effect of Cucumber is that, in order to keep things readable, especially when dealing in multiples, it is all too easy to explode the number of behind-the-scenes steps. To avoid the confusion caused by an exorbitant number of steps, the simplest fix is to create user variables in the Gherkin. Variables are not natively supported by Gherkin, but that does not mean they cannot be added. Adding variables allows you to reduce steps and maintain readability.

In the years since the development of Cucumber, many tips and tricks have proven useful. The use of variables is by far the most valuable of these tricks. Variables are a way to communicate data from one step to the next, to pass a reference that other steps can act upon. This is most useful when dealing with data that has hierarchical aspects: machines that have parts, customers that place orders, blogs that have posts. The idea is that sooner or later you have multiples of something, let's call them widgets, in a single test, and you need a way to communicate which widget a step refers to. A simple way to solve this is to give them names, and hence variables were born.

Efficient use of variables in Gherkin keeps your Cucumber in its intended state: clear and readable.

Let us consider a human resources application for this example. Say we want to create different data setups to simulate the retirement of a person in the hierarchy. When a person retires, we want to make sure that any of that person's reports are moved to report to that person's manager. First, we need to define the organization:

Given a tiny organization

Or, we could be more explicit:

Given a Team Lead with 3 employees

Or, we could use some history:

Given a Team Lead
And add 3 employees

All three of the above options have their perks and can handle the scenario simply enough. However, if we want to add a second level, an additional employee, the scenario quickly becomes more complicated. Let's look at the three options with another layer of complexity.

Given a small organization

Chances are that people have to dig into the code to figure out the precise meaning. And once you have to look at the code behind the Gherkin, the efficiency value of using Cucumber is lost, as the goal is to communicate clearly and concisely.

Given a Director with a Team Lead with 3 Employees

This would require a new step, which is not ideal; let's look at option 3:

Given a director
And add a Team Lead
And add 3 employees

This becomes unclear, as we no longer know where the employees are being added; they could be reporting to the Team Lead or to the director.

Now let’s look at what we could do with variables:

Given the employees
| Var |
| director |
| team lead |
| report1 |
| report2 |
| report3 |
And ‘director’ has reports ‘team lead’
And ‘team lead’ has reports ‘report1, report2, report3’

This is a lot more verbose: nine lines versus the previous three. But there is a big difference, as these three steps can model any hierarchy, no matter how deep. And if we wanted to, we could make the first step implicit, creating employees as soon as another step refers to them.

Putting it all together, for instance, we could do the following:

Given the employees
| Var |
| director |
| team lead |
| report1 |
| report2 |
| report3 |
And ‘director’ has reports ‘team lead’
And ‘team lead’ has reports ‘report1, report2, report3’
When ‘team lead’ retires
Then ‘director’ manages ‘report1, report2, report3’

As you can see, the "Then" verification step is very easy to reuse. We can refer to specific employees by name, which allows us to do detailed verifications of specific points in the hierarchy. Not only that, but now that the language is clear, there is no reason to open up the steps to see what exactly is happening behind the scenes.
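To make the mechanics concrete, here is a minimal, runner-agnostic Python sketch of the model that steps like these could drive. The names (Employee, registry, and the helper functions) are illustrative assumptions, not part of Cucumber or Gherkin itself.

```python
# Illustrative sketch only: a tiny in-memory model behind steps such as
# "Given the employees", "'director' has reports 'team lead'" and
# "'team lead' retires". All names are assumptions for this example.

class Employee:
    def __init__(self, name):
        self.name = name
        self.manager = None
        self.reports = []

    @property
    def direct_reports_count(self):
        return len(self.reports)

    @property
    def total_reports_count(self):
        # direct reports plus everyone below them
        return len(self.reports) + sum(r.total_reports_count for r in self.reports)

registry = {}  # variable name -> Employee, shared across steps

def given_employees(names):
    # backs "Given the employees" plus its | Var | table
    for name in names:
        registry[name] = Employee(name)

def has_reports(manager_name, report_names):
    # backs "And '<manager>' has reports '<a, b, c>'"
    manager = registry[manager_name]
    for name in report_names.split(", "):
        report = registry[name]
        report.manager = manager
        manager.reports.append(report)

def retires(name):
    # backs "When '<name>' retires": reports move to the retiree's manager
    retiree = registry.pop(name)
    new_manager = retiree.manager
    if new_manager is not None:
        new_manager.reports.remove(retiree)
    for report in retiree.reports:
        report.manager = new_manager
        if new_manager is not None:
            new_manager.reports.append(report)
```

Wiring these helpers into actual step definitions is then just a matter of parsing the quoted names out of each Gherkin line in whichever Cucumber runner you use.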

As we refer to data entities by name, we can also treat them as variables; for instance, if we want to check the "totalReportsCount" and "directReportsCount" properties, we could say:

Given the employees
| Var |
| director |
| team lead |
| report1 |
And ‘director’ has reports ‘team lead’
And ‘team lead’ has reports ‘report1’
When ‘team lead’ retires
Then ‘director.totalReportsCount’ equals ‘1’
And ‘director.directReportsCount’ equals ‘1’

To implement this we would need to build a resolver: a class that knows about all the different objects. When the resolver is called, it checks all the different repositories (in this case, employees) and asks for the variable by name (in this case, 'director'). When the resolver finds the variable, it uses reflection to look up the property 'totalReportsCount' and evaluates the expression. We now have a generic capability to validate variables and their properties.
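As a rough sketch, such a resolver could look like the following in Python, where the reflection step is simply getattr; the repository layout and the stand-in Employee class are assumptions for illustration, not a fixed Cucumber API.

```python
# Illustrative resolver sketch: look a variable up by name across the
# known repositories, then use reflection (getattr) to evaluate an
# optional property such as 'totalReportsCount'.

class Resolver:
    def __init__(self, *repositories):
        # each repository maps variable names to objects, e.g. employees
        self.repositories = repositories

    def resolve(self, expression):
        name, _, prop = expression.partition(".")
        for repo in self.repositories:
            if name in repo:
                obj = repo[name]
                # no property part: the variable itself was requested
                return getattr(obj, prop) if prop else obj
        raise KeyError("unknown variable: " + name)

# tiny stand-in object for the demo (not a real domain class)
class Employee:
    def __init__(self, totalReportsCount, directReportsCount):
        self.totalReportsCount = totalReportsCount
        self.directReportsCount = directReportsCount

employees = {"director": Employee(totalReportsCount=1, directReportsCount=1)}
resolver = Resolver(employees)
```

A step like Then 'director.totalReportsCount' equals '1' then reduces to comparing resolver.resolve(...) on the left-hand expression with the right-hand value.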

Adding variables in the Gherkin allows your testers to create reusable steps with a minor increase to the infrastructure of the test framework. By using the resolver before evaluating a table you can even refer to variable properties inside of tables. And, naming the objects you deal with allows you to refer to them later on, making steps more generic and keeping the language clear and complete.

Meaningful Tests and Confidence

With this usage of variables in Gherkin, Cucumber remains an efficient and pragmatic language to use for your tests. Your development team, from the managers to the business analysts will be able to understand and gather value from the tests. And, as we all know, meaningful tests create confidence and prove the value in your quality assurance efforts.

If you enjoyed this article, please share!

Why Minibars and Reactive Decision Making are a Waste of Money

Playing with Time to Save Money: Reactive vs Predictive Decision Making

For only $5,000 buy our “Make Life Easy Tool Kit” and we will give you a bonus Software Developer!*

*Mean annual salary for Software Developer valued at $99,453

The following is an attempt to understand why it can be so difficult, despite the prospect of great financial savings, to get a tool on a project in a reasonable amount of time. I would argue that it comes down to the tendency to be overly reliant on predictive processes when we could sometimes benefit from reactive decision making.


Every established company has a set of generally well-developed rules and regulations on spending money. These policies may be pragmatic, such as service contracts, spending limits, employee policies, contractor policies, double signatures and whatever else the top brass or the attorneys deem necessary.

These policies are, of course, good predictive management practices, created with the logical intention of saving money. But do they always achieve this goal? Certainly negotiating a service contract with Xerox or predicting common employee pitfalls makes sense, but how do we measure other, seemingly intangible metrics, such as spending policies on third-party tools and associated project budgets?

The Mini Bar

Let’s start with a little anecdote about minibars: those sometimes enticing, but more often than not, annoying little not-a-real-fridge boxes found in business traveler hotel rooms. Some hotels have been kind enough to phase them out, but apparently, they still create expensive headaches in the corporate world.

I was one of two technical representatives sitting in on contract negotiations for a new product whose total cost over the first two years was expected to be less than $5,000 USD. Negotiations quickly digressed into the creation of a clause related to the minibar for any possible future consultancy services. On this conference call were ten staff from two separate billion-dollar companies, including two technical resources (yours truly plus one), four attorneys, and several management and sales staff.

The topic ate up a good hour of our call. If we estimate a generous minimum of $100/hr per person in billable time, plus the time required for preparation and follow-up, this little clause most likely never paid for itself, while costing at least one-fifth the price of the actual product. Like me, you are probably left wondering why one of the trusted sales managers or one of the contract attorneys on the call didn't simply pop in the phrase "no minibar charges" and leave it at that. Boom. We're done.

Why did this happen the way it did? Because, for whatever reason, the policies in place at one or both companies were not created with actual processes in mind. And/or the individuals involved were not given the authority to make situation-specific, time- and money-saving decisions. Inefficiencies like this exist because companies don't have a metric to measure the cost of implementing their policies. They don't trust their employees to make good decisions. Nor do they have a way to measure whether these policies really save money in the long term.

This may seem like an idiosyncratic example, but if you pause to think about it, you will be surprised by how many similar experiences you start to recall. Why is accepting or declining an expense so complicated? Perhaps you are in the middle of something similar at the moment.

Unfortunately for project managers, when it comes to decision making and buying third-party software, software created specifically to save time, money and labor, the utility of the software is too often overshadowed by the hurdles management must jump simply to get an expense approved. Predictive management too often assumes that employees will make bad financial decisions. How many of us have been subject to (or responsible for!) long, meandering meetings and delayed projects, derailed by the decision to recreate an inexpensive third-party product, simply because the effort and cost of buying the product is a known metric, whereas the time, money and labor spent recreating it is unknown?

Where is the logic?

The reason that managers and their teams fall into these time sucks is indeed the lack of access to reasonable metrics. Rarely have we defined ways to measure non-monetary resources, which means that we can't place a precise value on the time and resources spent. We don't have a metric to show that paying a developer $99,453 a year to recreate a $5,000 product is just plain foolish. We do have rules that make it difficult for managers to approve purchasing third-party products. The consequence is that project managers take the semi-rational approach of not buying off-the-shelf products and instead spend more time and money reinventing the wheel.

Trying to add measurements to this process so that costs are tied to time is likely to be extremely laborious and would require relatively large changes to standard business models; given that time is money, this approach will not functionally solve the problem. Unfortunately, despite the desire to increase profit margins, we still consistently lack the processes and the authority (ability?) to designate metrics or assign a value to "negotiating contracts," beyond the knowledge that skipping out on contracts can be very costly.

So how can we minimize the costs of decision making? Although there are many processes to speed up decisions, there is one that is not often mentioned or even considered plausible: making decisions in hindsight. Hindsight, you say? That seems impossible. Or at the least, unwise.

Hindsight is 20/20

Let me explain. Most purchasing decisions follow a predictive process. Predictions are often inaccurate and increasing accuracy requires a lot of analysis. The hindsight scenario is a reactive process, where information is cheap and easy to acquire, but unfortunately, the data arrives too late to prevent any negative outcomes.

For some decisions, the predictive process is worth the expense and the time spent on data analysis. Big capital expenses, such as data centers, or other high-risk expenses, such as people, generally require predictive decision-making processes. However, less expensive and less risky purchases, such as three licenses for a $100 piece of software, merit a reactive approach. Clearly, institutionalizing a reactive approach requires set limits, but these can be set and managed fairly easily.

Empowered and Efficient

A simplistic model of a reactive approach might be to offer project managers or development leads a set budget per developer, per year, to improve efficiency. Teams or individuals who can show their expenses to be positive investments then earn augmentations to their originally allotted budget. In effect, a developer or a group of developers can use their experience and expertise to decide to buy a license in an afternoon, without jumping through three weeks of approval hoops. If, after implementing the licenses, they can show the results were worth the investment, they can request that their allotment be reimbursed.

Let’s say we have a development team with a budget of $1000 per year to invest in project efficiencies. After purchasing a piece of third party software to solve a development inefficiency they report back:

“On January 5th we spent $580 to buy Product ABC. This product has been implemented to solve task Y and thus reduced developer workload by 40 hours per week. We estimate this has saved us $12,000 in staff hours. Please augment/reimburse our initial budget of $1000 by $580 to account for this worthwhile investment in efficiency.”

Or perhaps the software turned out to take too much time to implement, negating any benefit, so the developers are not eligible to ask for reimbursement. There is little risk involved, and getting an accurate decision is still relatively cheap. People who make good decisions are rewarded with augmented budgets; people who make bad decisions run out of money.
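The mechanics of this reactive budget are simple enough to sketch. The class below is an illustrative toy model, not a prescription, using the $1,000 allotment and $580 purchase from the example above.

```python
# Toy model of the reactive efficiency budget described above.
# All names and numbers are illustrative assumptions.

class EfficiencyBudget:
    def __init__(self, allotment):
        self.balance = allotment

    def spend(self, cost):
        # spending is immediate: no approval process up front
        if cost > self.balance:
            raise ValueError("over budget; await a reimbursement first")
        self.balance -= cost

    def report(self, cost, estimated_savings):
        # decisions are judged in hindsight: good ones are reimbursed,
        # bad ones simply drain the allotment
        if estimated_savings > cost:
            self.balance += cost

# the $1000 team budget and $580 purchase from the example above
budget = EfficiencyBudget(1000)
budget.spend(580)                                 # balance drops to 420
budget.report(cost=580, estimated_savings=12000)  # reimbursed back to 1000
```

The design choice is the point: the cost of a bad call is capped at the allotment, so the expensive predictive analysis can be skipped for small purchases.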

In the long run, streamlining the decision process for acquiring third-party tools will make effective tools more accessible and attractive. And I predict that overall efficiency will increase as people spend less time reinventing the wheel.

How have you solved the problem of budgeting, expenses and third-party product acquisition? Do you use reactive decision making? What other successful ways do you manage your budget and achieve increases in management efficiencies?