Making Sense of Test Results

The goal of Quality Assurance is to ensure that your product meets requirements, is debugged, and delivers happy customers. To obtain satisfied customers your testing and QA processes must provide meaningful information during the development lifecycle and after that to efficiently assist engineers and decision-makers in informing choices.

Dashboards as a Tool

The challenge for many is making sense of the data. As a tool, dashboards serve as a functional way to deliver pretty information with a simple overview. Unfortunately, they can also provide too much data, hide essential details, or create easily ignorable aggregates.

Let’s look at a TFS dashboard from Visual Studio Reporting Tools:

This particular dashboard looks great and provides meaningful readings: things are good or bad (green or red), and it is easy to see that things are ok. In reality, dashboards rarely look this nice. Many teams see prominent displays of yellow and orange with little substantial evidence of green and red.

Where Dashboards Go Wrong

Flakey tests: When there are a lot of tests, with concurrency issues for instance, that fail sometimes, there will inevitably be the occasional failing test. This situation results in teams whose dashboard delivers too much unhelpful orange.

False Sense of Security: Dashboards can also provide a false sense of security. If you set a goal to keep errors under 1%, but then you only look at the percentage number of errors, and you fail to check out what is failing; you will likely miss that the same key defects may have been failing for quite some time.

A Call to Action

In the end, the goal of a dashboard is not to convey information, but rather to get people to act (assuming the data reported indicate something bad). The dashboard when thoughtfully constructed will thus be a warning flag and a call to action, but only if it says something that causes people to take action.

If the dashboard is not practical, people will ignore what it displays.

It is like the threat level announcements at the airport or the nightly news, you know all is not well, but you either learn to tune it out or find another way to numb the information. Developers that get told every day that there are some failures will do the same, sure they spend some time chasing unreproducible failures, but after awhile they will stop paying attention.

A Good Dashboard Generates an Emotional Response

The genius of a thoughtfully constructed dashboard is that it generates an emotional response.

Yay! Good!


Oh. No, not again!

So how do we get people back to the emotional reaction of a simple read / green dashboard in a world where things are noisy?

This question leads to the creation of a new dashboard that focuses on the history of defects and failures. In this case, we categorize failures that are new as orange, consecutive failures over multiple days as red, with red failures that exist for several days then going black.

Createing Useful Dashboards

Screenshot © Possum Labs 2018

It is easy to see from the chart that even though there are some flay tests, there are also a few consistent failures. And it is easy to see that we have a few regular failures over the last few weeks.

This occurrence creates an opportunity to start drawing lines in the sand, for example, setting the goal to eliminate any black on the charts. Even when getting to a total pass rate seems infeasible for a project, it should be more manageable to at least fix the tests that fail consistently.

This type of goal is useful because it is something that can get an emotional response, people can filter the signal from the noise visually, so they get to the heart of the information, which is that there are indeed failures that matter. When we make it evident how things change over time, if things get worse, people will notice.

Dashboards are a Tool — not a Solution

Each organization is different, and each organization has its challenges, getting to a 100% pass rate is much easier when it is an expectation from the beginning of the project, but often systems were designed years before the first tests crept in. In those scenarios, the best plan is to create a simple chart and listen to people when they tell you it is meaningless. These are the people that give you your actual requirements; these are the people who will spell out where and what to highlight to get an emotional response to the dashboard data.  

Dashboards do not always convey meaningful data about your tests. If they lack thoughtful construction, it is likely they may fail at getting people to act when the information is bad. Bad results on your panel should be a call to action, a warning flag, but the results need to mean something or your people simply ignore and move on.

At Possum Labs, meaningful dashboards are not a solution, but instead just one of many tools available to assure that your QA delivers effective and actionable data to your teams.

Effective QA is Not an Option it’s a Necessity: Here’s How to do it Right

As has long been the case in the software and technology industry, immigration is an important source for accessing Quality Assurance people.

Part of the reason is that, as per usual, QA isn’t always a job that attracts US-based developers. The US job market is already competitive for developers and moving developers on your team to quality assurance is often undesirable for a number of reasons. The result is that relocating QA developers to the USA is in many cases the most satisfactory solution for the industry.

Unfortunately, barriers to immigration continue to drive-up demand for qualified QA, which only further exacerbates the shortage of people with an appropriate development background already in the US.


Quality Assurance is not Optional

Compounding the problems experienced by a physical shortage of qualified developers for QA is the fact that some decision makers continue to consider QA an “optional” line item in their budget. Unfortunately for them, QA and the surrounding processes are anything, but “optional” and a good quality assurance engineer is a key player on any team.

In companies where QA is an accepted need, it is still often considered more of a necessary evil than a definite benefit. QA is too easily blamed when a project that’s been on schedule suddenly gets hung up in testing. Another common problem appears when development teams pick up speed after implementing workflow changes, only to discover that QA is still a bottleneck keeping them from delivering a product to their customer.


A Look at the Current Situation

As we’ve seen, numerous factors contribute to the myths surrounding Quality Assurance that contribute to this questionable climate that makes some engineers shy away from the moniker or even the field.  Moreover, we are experiencing an actual shortage of qualified engineers, which means that QA in many instances ends up being not an afterthought, but rather a luxury.

Immigration and work status has been hot-topics for the last few years. And regardless of where you fall on the political spectrum, if you work in software, you’ve likely experienced first hand the effect on the jobs market.

For those of you that have been lucky enough to hire engineers, many of you may have also been unlucky enough to discover that said engineer has to go back to India (or wherever his or her country of origin might be). And what started out as a short trip, quickly devolves into a lengthy, even a 6-month long, process to renew or regularize an H-1B visa, following changes made to the requirements in 2017.

Whether your experience with the dwindling QA applicant pool is first hand or anecdotal,  here are some statistics for you to munch on:

The US Bureau of Labor and Statistics expects the market for software developers to blow-up by 24% from 2016 to 2026. That means a need for approximately 294,000 additional software developers in an already tight market. If you think it’s hard to convince an engineer to join your QA team now, just wait and see what it will be like in 2026.

We can’t know for sure the number of H-1B visas currently held up due to changes in requirements, but this article does a decent job of discussing the demand and actual need for H-1B Visa’s in for the USA with a focus on the State of Massachusetts. If you’d like to know more, I’d suggest taking a look; however, for our purposes, I don’t think importing QA staff is necessarily the answer.

So, you need to have QA, but you can’t hire qualified staff to take care of the work. What can you do?

The Foundations of Quality Assurance

Before I answer this question, let’s take a look at why Quality Assurance exists in the first place. From a business perspective, there are a few things that pretty much every customer expects from a software product. The following three expectations are the crucial reasons to the story behind why quality assurance is not optional::

  1. Customers expect that the programs will work as requested and designed, and within the specified environments;
  2. They hope that software will be user-friendly;
  3. And, they assume that software will have been successfully debugged: meaning that QA must deliver a product that is at the least free of the bugs that would result in numbers 1 or 2 becoming false.

Historically, software teams were small, but since the early 80s, due to the need to scale quickly and keep up with changing requirements and other advances in technology and globalization, we’ve experienced rapid growth in the size of development teams and companies. This growth has led to implementing a wide variety of tactics from workflow solutions, think Waterfall or Agile, to different methods of increasing productivity and efficiencies, such as offshore teams and microservices.

Take a look at this simple chart approximating the increase in development team size from the original release of Super Mario Brothers in 1985 to Super Mario World in 1990 and Super Mario 64 in 1996. (Noting that in the credits, by 1996 the occasionally thank entire teams, not just individuals, so actual number is likely even higher).

Super Mario Release Team Size

© Possum Labs 2018

Where we haven’t (at least in the USA) been able to keep up is in the training of new software engineers. QA departments, regardless of size, are challenged to carry-out all the different required processes to follow software through the development lifecycle to delivery and maintenance (updates), while also keeping abreast of changes to technology and integrations.

A misfortunate result of this shortage of QA engineers is that the point in the development cycle where most companies fall short is in testing. And, yet, the ability to provide useful and meaningful testing is crucial to the successful delivery of quality assurance to one’s client, whether building an in-house product, such as for a financial institution, or a commercial product for the public market.

While offshore teams may be a solution for some companies, many companies are too small to make offshore building teams practical or cost-effective.

What’s more is that many engineers tend to be good at one thing — development — they may not have a good sense of your organization’s business goals or even an understanding of what makes a good customer experience. And while your high paid development staff might excel at building clever solutions, it doesn’t necessarily mean that they also excel at testing their own goods. And do you really want to pay them to do your testing, when their time could be better invested in innovations and features? At Possum Labs we’ve determined that it is often most efficient to design workflows and teams to adjust to the people you have.

This disconnect between development requirements and a full understanding of business goals is in fact often the culprit in a pervasive disconnect between testing and business outcomes. What do I mean by disconnect? Let’s consider the four following statements and then talk about some real-life examples:

  • Users prefer seamless interfaces, intuitive commands, and technology that makes them feel smart, not dumb.
  • Businesses prefer software that assures their business goals are met and that technology allows their employees to work smarter with greater efficiency thus promoting growth and profit.
  • Today the average person is adequately adept and familiar with technology to know when your software exhibits even moderately lousy UX. And companies can also experience public shaming via social media when they make a particularly dumb or inopportune mistake.
  • And then there are the security risks.

In 2017 we saw several major security snafus experienced by large corporations that from the few details publicized were the direct result of inaction on the part of the decision-makers, despite being notified by engineering.

One might think that the decision makers, while moderately acknowledging the risk, may have simply gambled that nothing would come of the risks while taking steps to protect themselves.

I would like to go a step further. I’d wager that everyone involved fell victim to a set of common testing pitfalls.

Indeed, one of the most challenging aspects of testing is figuring out not only how to effectively and efficiently create and run tests, but most importantly to figure out how to confidently deliver meaningful results to the decision makers. Whether you are a software business or a business that uses software, successful quality assurance is crucial to the long-term health and success of your business.


Let’s do a quick recap of what we’ve covered so far:

  1. There is a shortage of qualified test engineers.
  2. Users want products they can rely on and that are friendly to use.
  3. Companies want products they can trust that improve efficiencies, their bottom line and that, of course, make their clients happy.
  4. It is difficult to create tests that deliver meaningful results when the testing is done by engineers that don’t necessarily understand the businesses end goals.
  5. Decision makers don’t want to understand the tests; they want to have meaningful results so that they can make effective decisions.


So what if we could solve all of these problems at once?

This type of solution is what Possum Labs achieves through the clever use of solutions and tools that integrate with your existing systems, processes, and people. We build out quality assurance so that anyone who understands the business goals can successfully carry-out testing and efficiently uses the results.

Not only does this solve problems 1 to 5 above, but it is also, in fact, a super solution in that it prevents most companies from having to hire or train new developers. Instead, you need to hire for people who with a keen understanding of your business and that can be trained to work with your tools. Possum Labs methods allow you to implement and upgrade your quality assurance — sometimes even reducing your staffing load — while delivering better and more meaningful results, so that your end of service recipients get better services or better products than before.


How does Possum Labs do this?

Each of our solutions varies a bit from company to company, but in general, we use several tools, including proxies and modules (think Lego) to make it so existing tests can be modified and new tests written with simply reorganizations of the “bricks.” This focus on custom solutions allows a non-technical individual with a solid understanding of business goals to generate tests and result that deliver meaningful results with confidence for him or her to share with decision makers.

The result is that testing bottlenecks open up allowing for a more efficient flow of information and better feedback through all channels. Products are delivered faster. Information flows smoothly. Better decisions are made, and efficiencies are gained. Developers can focus on development and decision makers in achieving their strategic goals. Meanwhile, you’ve got happy customers, and everyone can get a good night’s rest.

3 Risks to Every Team’s Progress and How to Mitigate

When looking at improving performance the first thought is often to increase the size of our development team; however, a larger group is not necessarily the only or the best solution. In this piece, I suggest several reasons to keep teams small and why to stop them from getting too tiny. I also look at several types of risk to consider when looking at team size: how team size effects communication, and the possibility of individual risk and systematic risk.

Optimal Team Size for Performance

The question of optimal team size is a perpetual debate in software organizations. To adjust, grow and develop different products we must rely on various sizes and makeups of teams.

We often assume that fewer people get less done, which results in the decision of adding people to our teams so that we can get more done. Unfortunately, this solution often has unintended consequences and unforeseen risks.

When deciding how big of a team to use, we must take into consideration several different aspects and challenges of team size. The most obvious and yet most often overlooked is communication.

Risk #1: Communication Costs Follow Geometric Growth

The main reason against big teams is communication. Adding team members results in a geometric growth of communication patterns and problems. This increase in communication pathways is easiest illustrated by a visual representation of team members and communication paths. 

Geometric Growth of Communication Paths

Bigger teams increase the likelihood that we will have a communication breakdown.

From the standpoint of improving communication, one solution that we commonly see is the creation of microservices to reduce complexity and decrease the need for constant communication between teams. Unfortunately, the use of microservices and distributed teams is not a “one size fits all” solution, as I discuss in my blog post on Navigating Babylon.

Ultimately, when it comes to improving performance, keep in mind that bigger is not necessarily better. 

Risk #2: Individual Risk & Fragility

Now a larger team seems like it would be less fragile because after all, a bigger team should be able to handle one member winning the lottery and walking out the door pretty well. This assumption is partially correct, but lottery tickets are usually individual risks (unless people pool tickets, something I have seen in a few companies).

When deciding how small to keep your team, make sure that you build in consideration for individual risk and be prepared to handle the loss of a team member.

Ideally, we want to have the smallest team as is possible while limiting our exposure to any risk tied to an individual. Unfortunately, fewer people tend to be able to get less work done than more people (leaving skill out of it for now).

Risk #3: Systematic Risk & Fragility

Systematic risk relates to events that will affect multiple people in the team. Fragility is the concept of how well structure or system can handle hardship (or changes in general). Systemic risks are aspects shared across the organization, this can be leadership, shared space, or shared resources.

Let’s look at some examples:

  • Someone brings the flu to a team meeting.
  • A manager/project manager/architect has surprise medical leave.
  • An affair between two coworkers turns sour.

All of these events can grind progress to a halt for a week (or much more). Events that impact morale can be incredibly damaging as lousy morale can be quite infectious.

In the Netherlands, we have the concept of a Baaldag (roughly translated as an irritable day) where team members limit their exposure to others when they know they won’t interact well. In the US with the stringent sick/holiday limits, this is rare.

Solutions to Mitigate Risk 

Now there are productive ways to minimize risk and improve communication. One way to do this is by carefully looking at your structure and goals and building an appropriate team size while taking additional actions to mitigate risk. Another effective technique for risk mitigation is through training. You shouldn’t be surprised, however, that my preferred method to minimize risk is by developing frameworks and using tests that are readable by anyone on your team.

The case for continuing education

Do you have job security? The surprising value in continuing education.

If the last 2 decades have taught us anything about change, they’ve shown that while software development may be one of the most rapidly growing and well-paid industries, it can also be highly unstable.

You may already invest in professional development in your free time. In this piece, I’ll show you how to convince your employer to invest in professional development as part of your job.

My Personal Story

I started my first software job at the height of the dot-com boom. I’d yet to finish my degree, but this didn’t matter because the demand for developers meant that just about anyone who merely knew what HTML stood for could get hired. Good developers could renew contract terms 2 or 3 times per year. Insanity reigned and some developers financially made out like bandits.

Of course, then came the crash came. The first crash happened just about the time I finished my degrees. By the time I graduated, I’d gone through three rounds of layoffs during my internships. By the time I actually started full-time work things had stabilized a bit, with layoffs settling down to once a year events in most companies. In 2007 we saw an uptick in a twice a year layoff habit for some companies, but then it quieted down again.

Of late, in most companies and industries software developer layoffs are less frequent. The more significant problem is, in fact, finding competent brains and bodies to fill open positions adequately.

My initial move into consultancy stemmed from a desire to take success into my own hands. Contracting and fulfilling specific project needs leaves me nimble and in control of my own destiny. My success is the happiness of my customer, and that is within my power.  Indeed, I am not immune to unexpected misfortune, but I rarely risk a sense of false security.  And I particularly enjoy the mentoring aspect of working as a consultant.

Despite the growth, I’d say software is still a boom and bust cycle.

Despite the relative calm (for the developers, not the companies), I think that as a software developer it is wise to accept that our work can vanish overnight or our salaries cut in half next month. Some people even leave the industry in hopes of better job security, while others deny the possibility that misfortune will ever knock on their door.

Not everyone has the desire, personality or the aptitude to be a consultant. However, everyone does have the ability to plan for and expect change. I wager that in any field it is wise to always have your next move in the back of one’s mind. This need to be prepared is particularly true in the area of software development. And while some people keep their resume fresh and they may even make a habit of annual practice interviews. Others have no idea which steps they’ll need to take to land their next job.

Landing that next job has some steps, and while the most straightforward step may be to make sure your resume and your LinkedIn profile are fresh with the right keywords (and associated skills) sprinkled throughout, it is even more important to stay on top of your game professionally.

Position yourself correctly, and you will fly through the recruiters’ hands into your next company’s lap. For many companies, keywords are not enough — they also need to know that you have experience with the most current versions and recent releases. Recruiters may not be able to tell you the difference between .Net 3.5 vs. 4.0; but if their client asks for only 4.0, they will filter out the 3.5 candidates. Versions are tricky, Angular 1 to 2 is a pretty big change, Angular 2 to 4 is tiny (and no, there is no Angular 3), it is not reasonable to expect recruiters to make heads or tails off of these versions.    

Constant Change Means Constant Learning

So how do you position yourself to leap if and when you need to? In the field of software development, new tools, methods, and practices are continually appearing. Software developers frequently work to improve and refine the trade and their products.

The result of this constant change is that for software engineers who maintain legacy products; you are at risk of losing your competitive edge. Staying at one job often results in developers becoming experts in software that will eventually be phased out.

Not surprisingly, the companies that rely on software to get their work done, but that are not actually software companies by trade tend to overlook professional development for their employees. The decision makers at these companies concern themselves with their costs more than the competitiveness of their employees and so they often remain entirely ignorant of the realities for their software engineers.

In some companies, from the decision makers’ point of view, they don’t see any logic in investing in training their employees or upgrading their software, when what they have works just fine. It’s easy to make a budget for a software upgrade, what is less evident is the cost of reduced marketplace competitiveness of their employees. Even worse, in some companies, there is an expectation that instead of investing in training, they’ll simply hire new people with the skills they need when their existing staff gets dated.

I once met a brilliant mathematician in Indianapolis that had worked on a legacy piece of software. One day after 40 years of loyal employment he found himself without a job due to a simple technology upgrade. With a skill set frozen circa 1980, he ended up working the remainder of his career in his neighborhood church doing administrative tasks and errands. Most people do not want to find themselves in that position, and they want to keep their economic prospects safe.

Maintain Your Own Competitive Edge

Another reason that many software engineers (and developers) move jobs every few years is to maintain their competitive edge and increase their pay. Indeed, earlier last year Forbes published a study showing that employees who stay longer than two years in a position tend to make 50% less than their peers who hop jobs.

“There is often a limit to how high your manager can bump you up since it’s based on a percentage of your current salary. However, if you move to another company, you start fresh and can usually command a higher base salary to hire you. Companies competing for talent are often not afraid to pay more when hiring if it means they can hire the best talent.”

More Important than Pay is the Software Engineer’s Fear of Irrelevance

As a software engineer working for a company that uses software (finance, energy, entertainment, you name it) there is nothing worse than seeing version 15 arrive on the scene when your firm remains committed to version 12.

Your fear is not that version 12 technology will phase out tech support, as these support windows are often a good decade in length. You fear that this release means that your expertise continues to become outdated and the longer you stay put, the harder it will be to get an interview, let alone snag a job. You feel a sinking dread that your primary skill-set is suddenly becoming irrelevant.  

Your dated skill-set has real financial implications and will eventually negatively impact your employability.

A Balancing Act

For companies, the incentive is to develop software cheaply, and cheap means that it is easy to use, quick to develop and let’s be realistic here, that you can Google the error message and copy your code from stack exchange.

A problem in software can often gobble up a few days when you are on the bleeding edge. All too often I stumble upon posts on a stack exchange where people answer their own question, often days later; or even worse I see questions responded to months after having asked for help. It makes sense that companies want to avoid the costs of implementing new releases.

Why would companies jump on the latest and greatest when the risk of these research problems is amplified in the latest version?

Companies are Motivated to Maintain Old Software, while employees are motivated to remain competitive.

This balancing act is a cost transfer problem; the latest framework is a cost to companies due to the research aspect, whereas an older framework is a cost to developers by reducing their marketability. At the moment where it is hard to hire good people, it will be hard to convince developers to bear the costs of letting their skills fall out of date.

New language and framework features can add value, but they are often minor, and there are often just ways to do something people can already do better and faster (but this is only true after the learning curve, and even then the benefits rarely live up to expectations (see No Silver Bullet). Chances are that the benefit of a new version of a framework will often outweigh the costs of learning the new framework, especially for existing code bases.

It seems like there should be some room for the middle ground; in the past, there was a middle ground. This was called the training budget.

Corporate Costs

With software developers jumping ship every few years to maintain their competitive edge, it is understandable that some management might find it difficult to justify or even expect a return on investment on training staff. In many cases, you’d need to break even on your training investment in less than a year.

At the same time, the need for developers to keep learning will never go away. Developers are acutely aware that having out of date skills is a direct threat to their economic viability.

For the near future developers will remain in high demand and the effects of refusing to provide on the job continuing education will only backfire. Developers are in demand, and they want to learn on the job. Today we do our learning on the production code, and companies pay the price (quite likely with interest). Whereas before developers were shipped off to conferences once a year, now they Google and read through the source code of the new framework on stack overflow for months as they try to solve a performance issue.

In Conclusion: Investing in Continuing Education Pays Off

The industry has gone through a lot of changes, in the dot-com boom developers were hopping jobs at an incredible speed, and companies reacted by changing how they treated developers and cut back on training as they saw tenures drop. This all makes perfect sense. Unfortunately, this has led to developers promoting aggressive and early adoption of frameworks so that developers keep their skills up-to-date with the market. And as more and more companies adapt to frequent updates, the pressure to do so will only increase.

Training provides a way to break the cycle and establish an unspoken agreement that companies will leave developers as competitive as they were when they were hired by regular maintenance through training. So how to support continuing education and maintain a stable and loyal development pool? Send your developers to conferences, host in-house training, lunch and learns, and so on to ensure that they feel both technically competitive and financially secure.  

Despite their reluctance, in the end, there is a real opportunity and a financial incentive for companies to go back to the training budget approach. Companies want to have efficient development, developers want to feel economically secure. If developers are learning then they feel like they are improving their economic prospects and remaining competitive. Certainly, some will still jump ship when it suits their professional goals, but many will chose to stay put if they feel they remain competitive.

“Advancement occurs through the education of practitioners at least as much as it does by the advancement of new technologies.” Improving Software Practice Through Education



How to play with LEGO when you should be testing

How To Play with LEGO® When You Should be Testing

There is a gap between automating your testing and implementing automated testing. The first is taking what you do and automating it; the second is writing your tests in an executable manner. You might wonder about the distinction that makes these activities different. And indeed, some testers will view these two events as parts of a single process.

However, for someone who reads test cases, for instance, your customer support, it is a big difference; what is a transparent process to one party, is often an opaque process to another. In fact, if you are a nontechnical person reading this explanation, you may very well be scratching your head trying to see the difference and understand why it is crucial.

Writing Tests versus Reading Tests

Automating involves looking at your interactions and your processes, both upstream and downstream, and then automating the part in between while maintaining those interactions. This automation process should mean that nontechnical people remain able to understand what your tests do, while also providing your developers the necessary detail of how to reproduce bugs.

The second piece, communicating the detailed information to your developers is relatively easy as you can log any operation with the service or browser; however, explaining these results to nontechnical people is often significantly more difficult.

When it comes to communicating the information, you have a few options; the first is to write detailed descriptions for all tests. Writing descriptions works, at least initially.

Regrettably, what tends to happen is that these descriptions can end up inaccurate without any complaints from the people who read your tests. If testing goes smoothly, then nothing will happen, and no one will notice the discrepancy. The problems only arise when something goes wrong.

Meaningful Tests Deliver Confidence

The worst problem an automated test implementation can have is one that erodes confidence in the results. And when a test fails, and the description does not match the implementation then you have suddenly undermined confidence in the entire black box that certifies how the software works.

And now, the business analyst cannot sleep at night after release testing, because there is a nagging suspicion something might not be right.

You might respond that keeping the descriptions updated accurately should be easy, and technically that should be true. However, the reality is that description writing (and updating) is often only one aspect of someone’s job.

Breaking Down the Problem

As a project progresses and accumulates tasks, particularly as a project falls behind schedule and or over budget, the individuals writing the descriptions are often too worried about many other details to focus proper attention on their descriptions.

And whether we like it or not when we move to automated testing we become prone to hacks and shortcuts just like everyone else. There is the simple reality that testing a new feature appears to be a more pressing priority than updating the comments on some old test cases.

Moreover, above and beyond your ability to remember or accurately update the comments, there is another point to consider. There is an underlying problem with these detailed descriptions as they violate one of the more useful rules of programming: DRY (Don’t Repeat Yourself).

One of the most important reasons to practice DRY is that duplicates all too easily get out of sync and cause systems to behave inconsistently. In other words, two bits that should do the same bit, now do slightly different things. Oops. Or in the case of documentation and automated tests; two bits that should mean the same bit are now out of sync. Double oops.

How do we avoid duplication and implement DRY?

We can use technologies that solve this process, such as using a natural language API, so that your tests read like English.

For Example:

This example is readable and executable in just about every language, but regrettably, to get to this form you will have to write a lot of wrappers and helpers so that the syntax is easy to follow.

And this means that you will likely then need to rely on writers with significant experience in your specific software language and we may not have someone with the right expertise on hand.

An alternative is to create a domain specific language in which you write your tests. A domain specific language means that you create tests in something like Gherkin/Cucumber or you write a proprietary parser and lexer. Of course, this path again relies on someone who has a lot of experience with API / Language design.

The Simplest Solution

To me the preferred method is to use Gherkin; mostly because it is easier to maintain after your architect wins the lottery and moves to Bermuda. With Gherkin when you run into a problem you can Google the answer, or hire specialist. There is a sense of mystery and adventure about working on issues where you can’t Google the answer, at the same time, it’s not necessarily a reliable business practice.

The most significant benefit that I’ve discovered is that for many people, this method no longer feels like programming. This statement undoubtedly seems odd coming from a programmer, but hear me out as there is a method to my madness.

Solutions for the People You Have

To begin, let’s acknowledge that there is a shortage of programmers, especially programmers that are experts in your particular business. Imagine if you had a tool that you could hand to anyone who knows your field and that would allow this individual to understand and define tests? Wouldn’t that be grand?

How would this tool look? What would it require? To accomplish readability (understanding) and functionality (definition) you’d need to be able to hand off something that is concrete (versus abstract) and applicable to the business that you are in, but most importantly it needs to have a physical appearance.


Imagine if you could design tests with LEGO®? There is nothing you can build out of Lego bricks that you couldn’t create in a wood shop. Unfortunately, most people are too intimidated to make anything in a woodshop. Woodworking is the domain of woodworkers. On the flipside, give anyone a box of Lego bricks, and they will get to work building out your request.

Software development runs into the woodworker conundrum: programming is the domain of developers. Give a layperson C#, Java or JavaScript and assign them to a project to build and they’ll get so flustered they won’t even try. Give them Lego bricks, and they will at the least try to build out their assignment.

Reducing the barrier to accomplishing something new is extremely important for adoption; we know this is a barrier to getting customers to try our software, but we often forget the same rule applies to adopting new things for our teams.

This desire for something concrete to visualize is why we come across people who would “like to learn to program” while building complicated Excel sheets filled with basic visual formulas. These folks can see Excel so they naturally can use this formula thing, and don’t even realize that they are programming. My method is similar in concept.

To successfully automate our testing we need to reduce the barriers to trying to use the technology we plan to produce tomorrow.  As an organization adopts change, we need to find ways to make changes transparent and doable; we need to convince our people that they will succeed or they might not even try.

Gherkin Lego bricks

As I wrote in my post Variables in Gherkin:

“The purpose of Cucumber is to provide a clear and readable language for people (the humans) who need to understand a test’s function. Cucumber is designed to simplify and clarify testing.

For me, Cucumber is an efficient and pragmatic language to use in testing, because my entire development team, including managers and business analysts, can read and understand the tests.

Gherkin is the domain-specific language of Cucumber. Gherkin is significant because as a language, it is comprehensible to the nontechnical humans that need to make meaning out of tests.

In simpler terms: Gherkin is business readable.”

This explanation shows how Gherkin is the perfect language for our Lego bricks, to build testing “building blocks.” To create an infrastructure that is readable by and function for the people you have on hand, I like to develop components that I then provide to the testers.

Instead of needing to have a specific technical understanding, such as a particular programming language the testers now just need to know which processes they need to test.

For example, I would like to:

  • Open an order = a blue 2×4 brick
  • Open a loan = a blue 2×6 brick
  • Close a loan = a red 2×4 brick
  • Assign a ticket = a green 2×4 brick

This method addresses the issue of DRY because each “brick” has its process, so if you need to change the process, you change out the brick. If a process is broken, you pass it back to the development team, but it is very precise. It makes it concrete and removes a lot of the abstract parts inherent in software development.

⇨ This method addresses the issue of readability because each brick is a concrete process. Your testers can be less technical while producing meaningful automated tests.

⇨ This method solves the problem of confidence because problems are isolated to bricks. If one brick is broken, it doesn’t hint at the possibility that all bricks are broken.

⇨ This method also solves the problem of people, because it’s much easier to find testers who understand your business process and goals, such as selling and closing mortgages, without also having to understand the abstract nature of the underlying software that makes it all work.

The reality is that in this age every company has software while few are software companies.  Companies rely on software to deliver something concrete to their customers. My job as an automation and process improvement specialist is to make the testing of the software piece as transparent and as painless as possible so that your people can focus on your overarching mission.

“LEGO®is a trademark of the LEGO Group of companies which does not sponsor, authorize or endorse this site”.

The Cost of Software Bugs: 5 Powerful Reasons to Get Upset

If you read the PossumLabs blog regularly, you know already that I am focused on software quality assurance measures and why we should care about implementing better and consistent standards. I look at how the software quality assurance process affects outcomes and where negligence or the effects of big data might come into play from a liability standpoint. I also consider how software testing methodologies may or may not work for different companies and situations.

If you are new here, I invite you to join us on my quest to improve software quality assurance standards.

External Costs of Software Bugs

As an automation and process improvement specialist, I am somewhat rare in my infatuation with software defects, but I shouldn’t be. The potential repercussion of said bugs is enormous.

And yet you ask, why should YOU care?

Traditional testing focuses on where in the development lifecycle a bug is found and how to reduce costs. This is the debate of Correction vs. Prevention and experience demonstrates that prevention tends to be significantly more budget-friendly than correction.

Most development teams and their management have a singular focus when it comes to testing: they want to deliver a product that pleases their customer as efficiently as possible. This self-interest, of course, focuses on internal costs. In the private sector profit is king, so this is not surprising.

A few people, but not many, think about the external costs of software defects. Most of these studies and the interested parties tend to be government entities or academic researchers. In this

In this article, I discuss five different reasons that you as a consumer, a software developer or whomever you might be, should be concerned with the costs of software bugs to society.

#1 No Upper Limit to Financial Cost

The number one reason that we should all be concerned is that in reality software costs for defects, misuse or crime likely have no upper limit on their expense.

In 2002 NIST compiled a detailed study looking at the costs of software bugs and what we could do to both prevent and reduce costs, not only within our own companies but also external societal costs. The authors attempted to estimate how much software defects cost different industries. Based on these estimates they then proposed some general guidelines.

Although an interesting and useful paper, the most notable black swan events over the last 15 years demonstrate that these estimates provide a false sense of security.

For example, when a bug caused $500 million US in damage with the Ariana 5 rocket launch failure, observers treated it like a freak incident. At the time, little did we know that the financial cost of freak incident definition would continue to grow a few orders of magnitude just a few years later.

This behavior goes by many names, Black Swans, long tails, etc. What it means is that there will be extreme outliers. These outliers will defy any bell curve models, they will be rare, they will be unpredictable, and they will happen.


Black Swan is an unpredictable event as so named by Nassim Nicholas Taleb in his book The Black Swan: The Impact of the Highly Improbable. It is predicted that the next Black Swan will come from Cyberspace.

Long tail refers to a statistical event in which most events will happen in a specific range whereas a few rare events will occur at the end of the tail. https://en.wikipedia.org/wiki/Long_tail

Of course, it is human nature always to try and assemble the clues that might lead to predicting a rare event.

Let’s Discuss Some Examples:

4 June 1996

A 64-bit integer is written to a 16-bit value, and 500 million dollars went up in flames. As you see in Table 6-14 (page 135), as published in the previously mentioned NIST study, the estimated cost for software defects for the aerospace industry for a company this size was only $1,289,167. And so, 500 million blows that estimate right out of the water.

This single bug cost 200 times the expected annual cost of defects for a company.

May 2007

A startup routine for engine warm-up is released with some new conditions. The estimate for the automotive industry’s cost of software bugs in 2002; per company, per year as seen in Table 6-14 (page 135). Company Costs Associated with Software Errors and Bugs Automotive for a company bigger than 10,000 was only $2,777,868. That is not even a dent in the cost to Volkswagen — this code cost Volkswagen 22 Billion dollars.

That equates to about 10,000 times the expected costs of defects per company per year.

This behavior goes by many names, Black Swans, long tails, etc. What it means is that there will be extreme outliers. These outliers will defy any bell curve models, they will be rare, they will be unpredictable, and they will happen.

It is human nature always to try and assemble the clues that might lead to predicting a rare event. Unfortunately, when it comes to liability, it seems only academics are interested in this type of prediction, but given the possibility of exponential costs to a single company, shouldn’t we all be concerned?

#2 Data Leaks: Individual Costs of Data Loss?

Data leaks of 10-100 million customers are becoming routine. These leaks are limited by the size of the datasets and thus unlikely to grow much more. In large part that is because not many companies have enough data to move into the billions of records data breaches.

Facebook has ~2 billion users, the theoretical limit of a data breach is therefore limited to Facebook, or a Chinese or Indian government system. We only have 7.5 billion people on earth so to have a breach of 10 billion users we first need more people.

Security Breaches are limited by the Human Population Factor

That is what makes security breaches different, the only thing that it tells us is that we will approach the theoretical limit of how bad it could be. The Equifax breach affected 143 million users.

When it comes to monetary damages for the cost of the data breach, there is not a limiting factor, such as population size.

As we saw with Yahoo and more recently Equifax, cyber security software incidents show a similar pattern of exponential growth when it comes to costs. Direct financial costs are trackable, but the potential for external costs and risks should concern everyone.

#3 Bankrupt Companies and External Social Costs

From its inception no one would have predicted that this simple code pasted below might cost VW $22 billion US:

if (-20 /* deg */ < steeringWheelAngle && steeringWheelAngle < 20 /* deg */)


lastCheckTime = 0;

cancelCondition = false;




if (lastCheckTime < 1000000 /* microsec */)


lastCheckTime = lastCheckTime + dT;

cancelCondition = false;


else cancelCondition = true;


else cancelCondition = true;


Even if you argue that this is not an example of a software defect, but rather deliberate fraud, it’s unlikely you’d predict the real cost. Certainly, one was different, unexpected, not conforming to our expectations of a software defect. But that is the definition of a Black Swans. They do not conform to expectations, and as happened here the software did not act according to expectations. The result is that it cost billions.

How many companies can survive a 22 million dollar hit? Not many. What happens when a company we rely heavily on suddenly folds? Say the company that manages medical records in 1/5th of US states? Or a web-based company that provides accounting systems to clients in 120 countries just turns off?

#4 Our National Defense is at Risk

This one doesn’t take a lot to understand the significance, and yet it is one issue currently in the limelight. Software defects, faults, errors, etc. have the potential to produce extreme costs, despite infrequent occurrences. Furthermore, the origins of the costs of long tail events may not always be predictable.

After all what possible liability would Facebook have for real-world damages regarding international tampering in an election? It is all virtual, just information; until that information channel is misused.

There is very little chance that when actuaries for Facebook thought about election interference that they looked for such an area of risk. Sure they considered liability, people live broadcasting horrible and inhumane things, but did they contemplate foreign election interference? And even if they did consider the possibility, how would they have been able to predict or monitor the entry point?

And that is the long tail effect; it is not what we know, or can imagine, it is the unexpected. It is the bug that can’t be patched, as the rocket exploded, it is the criminal misuse of engine optimization routines or the idea that an election could be swayed due to misinformation. These events are so costly that we can’t assume that we know how bad it could be because the nature of software means that things will be as bad as they possibly can get.

#5 Your Death or Mine

Think of the movie IT, based off of Stephen King’s book by the same name. A clown that deceives children and leads them to death and destruction. What happens when a piece equipment runs haywire, masquerading as one thing and doing yet another? Software touches a great enough aspect of our lives that from the hospital setting to self-driving cars, a software defect could undoubtedly lead to death.

We’ve already had a case, presumably settled out of court, where a Therac-25 radiation therapy machine irradiated people to death. What happens when a cloud update to a control system removes fail-safes on hundreds or thousands of devices in hospitals or nursing homes? Who will be held liable for those deaths?

Mitigation is often an attempt at Prediction

A large part of software quality assurance is risk mitigation as an overlapping safety net to look for unexpected behaviors. Mitigation is an attempt to make it less likely that your company unintentionally finds the next “unexpected event.”

There has been a lot written about how there is an optimal way to get test coverage on your application. Most of this comes down to testing the system at the lowest level (unit test) that is feasible and has resulted in the testing pyramid. This is mathematically true. Unfortunately, the pyramid assumes that there are no gaps in coverage. Less overlap means that a gap in coverage at a lower level is less likely to be caught at a higher level.

The decision of test coverage and overlapping coverage can be approximated using Bernoulli trial, which delivers one of two results: success or failure.

Prioritizing the Magnitude Of Errors and their Effects

When we look at the expected chance of a defect and multiply that with the cost of a defect, we can compare that to the chance of a defect with overlapping coverage, multiplied by the cost.

We are usually looking at the cost of reducing the chance of a defect slipping through and comparing that to our estimated cost of a defect.

Unfortunately, the likelihood that we underestimate the cost of a defect due to long tail effects is very high. Yes, it is improbable that your industry will have a billion dollar defect discovered this year; but how about in the next 10 years? Now the answer becomes a maybe, let us call it a 10% chance and let us say that there are 100 companies in your industry. What is the cost of one of those outlier defects per year?

1,000,000,000 * .01 (1% chance per year) * .01 (1% chance of it hitting your company) = 100,000 per year as an expected cost for outlier defects per year.

The problem with outlier events is that despite their rare nature, even with a significantly small probability that your company might be the victim, the real outliers have the potential to be so big and expensive that it may, in fact, be worth your time investing in considering the possibility.

Enduring the Effect of a Black Swan

In reality, companies might use bankruptcy law to shield themselves from the full cost of one of these defects. VW’s financial burden for their expensive defect stems from the fact that they could afford it without going bankrupt. The reality is that most companies couldn’t afford to pay the costs of this type of event and would ultimately be forced to dissolve.

We cannot continue to ignore that software defects, faults, errors, etc. have the potential to produce extreme costs, despite infrequent occurrences. Furthermore, the origins of the costs of long tail events may not always be predictable.

The problem with “rarity of an event” as an insurance policy is that the costs of significant black swan bug events are that their risk goes beyond simple financial costs borne by individual companies. The weight of these long tail events is borne by society.

And so the question is, for how long and to what extent will society continue to naively or begrudgingly bear the cost of software defects? Sooner or later the law will catch up with software development. And software development will need to respond with improved quality assurance standards and improved software testing methodologies.

What do you think about these risks? How do you think we should address the potential costs?


Tassey, G., Ph.D. (2002, May). Report02-3: The Economic Impacts of Inadequate Infrastructure for Software Testing [PDF]. Gaithersburg: RTI for National Institute of Standards and Technology.