There is a significant focus on efficient testing through the use of the testing pyramid. The testing pyramid focuses on finding a defect in as small a segment of code as possible. This focus, in turn, creates efficient coverage of the functionality.

This is logically correct.

This is also emotionally flawed.


The Challenge: relating failures

The challenge with low-level tests is that it is hard to connect failures to their end-user effects. Looking hard at a failed unit test seldom provides a straightforward answer. We rarely ask, “So what does this result mean for our users?” and as a result, it is all too easy to comment out or ignore specific failures. This is in part because determining user effects requires us to stop and work out what commenting a test out (or marking it as ignored) would actually mean to the end user.

Most software development teams have either some failing tests or all passing tests (with some tests ignored along the way). Ideally, nothing would be commented out and nothing would fail. I’m sure that there are teams who can make this claim. However, the reality is that testing exists because it provides value. And despite our best intentions and our best developers, we are all only human, and sooner or later we find ourselves in a situation where we have failing tests or the need to comment them out.

The testing pyramid is a logical solution that optimizes test coverage. It would be rational to respect all tests under these conditions because you are trying to minimize the duplication of coverage. It is correct. There is just one relatively significant problem: we have humans, living, breathing, imperfect people, implementing the testing pyramid.


Humans love to assign meaning.

Astrology exists because of this perpetual search for meaning. We want to be connected. Associated. The more abstract something is, the more distant it becomes. Testing pyramids create detached results in many different jargons. People don’t do well with anonymity; it doesn’t bring out the best in anyone. By making tests anonymous, we make it easy to comment them out or ignore their failures. We, in fact, promote the avoidance of finding meaning in our results. This should not be surprising, given that we are asking humans to assess the results.

Engineering Quality and Positive Results

On the other hand, successful practices are those that deliver desired results (this is in contrast to those that should get desired results, but don’t necessarily). Positive (desired) results come from a prescription and adherence to that prescription. If there is a problem with adherence to the prescription, it might be that you need to adjust the prescription (simplify, modify, require less, etc.). In turn, you may have a less efficient prescription, but you will compensate with better adherence.

Logic vs. Emotion: Remember that you are working with humans.

There has been a lot of thought put into adherence, and there are plenty of books talking about how to create change in organizations. One of the common elements is that people need to have a visceral reaction when making the decision either to follow or to deviate from the prescription. A solution to reducing anonymity and improving the meaning in our testing is thus to make test failures visceral.

How can we do this? We must find a way to recreate a real-life scenario that demonstrates what happens when a test fails. As a practice, this proposal comes into direct conflict with the testing pyramid. The pyramid focuses on individual components, component interactions, APIs, UI, and finally users in the case of integration tests. The tests that are easiest to relate to are those that we have the fewest of, while unfortunately the tests that are the least relatable are the most common. Furthermore, the chances are that the business analyst does not know the names of the classes and interfaces referenced in unit tests. All of this makes it difficult for decision-makers to attach functional meaning to the results of a testing pyramid.
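One small way to make a low-level failure more relatable is to attach a plain-language statement of the user consequence to each test, so that the failure message leads with what the user would experience rather than with class names. The following is only a minimal sketch in Python's standard unittest module; the `user_impact` decorator, the `apply_discount` function, and the wording of the impact statement are all hypothetical names invented for illustration, not part of any library.

```python
import functools
import unittest


def user_impact(description):
    """Hypothetical decorator: record what a failure of this test means for users."""
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            try:
                return test_func(*args, **kwargs)
            except AssertionError as err:
                # Put the user-facing consequence first, technical detail second.
                raise AssertionError(
                    f"USER IMPACT: {description}\nDetail: {err}"
                ) from err
        # Keep the statement on the test so reports or reviews can list it.
        wrapper.user_impact = description
        return wrapper
    return decorator


def apply_discount(total, percent):
    """Hypothetical function under test: apply a percentage discount to a total."""
    return round(total * (1 - percent / 100), 2)


class DiscountTests(unittest.TestCase):
    @user_impact("Customers with a coupon are charged the wrong amount at checkout.")
    def test_ten_percent_coupon(self):
        self.assertEqual(apply_discount(200.00, 10), 180.00)
```

Saved to a file, this can be run with `python -m unittest`. Anyone tempted to comment out or ignore the test then has to read, and implicitly accept, the sentence describing what users would experience.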

What about Best Practices?

When it comes to best practices, it bears reminding that they are intended for a generic company with generic developers. In contrast, your goal is to improve your unique organization with the individuals inside. Best practices are certainly worthwhile recommendations that can help you; however, they will not always accommodate the specifics of your particular scenario. If, for example, unit tests are not resilient (they get deleted, commented out, or left broken), then unit tests might not be the right solution for you, and integration tests would be more valuable. You can choose to change the organization, but this is not always practical. The pragmatic choice is to optimize your solution to meet your organization where it is. Meaningful tests are ideally suited for scenarios where there is not enough time to attain the aspired end-quality. Often we must deliver imperfect solutions, but we need a solid understanding of acceptable versus unacceptable imperfections.

None of this diminishes the testing pyramid as the optimal coverage from a logical perspective; it just means that if you work with humans, your organization may find better long-term results from a different breakdown of test types.

