Effective QA Is Not an Option, It’s a Necessity: Here’s How to Do It Right

As has long been the case in the software and technology industry, immigration is an important source of Quality Assurance talent.

Part of the reason is that QA isn’t a job that tends to attract US-based developers. The US job market is already competitive for developers, and moving developers on your team over to quality assurance is often undesirable for a number of reasons. The result is that bringing QA developers to the USA has, in many cases, been the industry’s most practical solution.

Unfortunately, barriers to immigration continue to drive up demand for qualified QA engineers, which only exacerbates the shortage of people with an appropriate development background already in the US.

 

Quality Assurance is not Optional

Compounding the shortage of qualified developers for QA is the fact that some decision makers continue to consider QA an “optional” line item in their budget. Unfortunately for them, QA and its surrounding processes are anything but “optional,” and a good quality assurance engineer is a key player on any team.

In companies where QA is an accepted need, it is still often considered more of a necessary evil than a definite benefit. QA is too easily blamed when a project that’s been on schedule suddenly gets hung up in testing. Another common problem appears when development teams pick up speed after implementing workflow changes, only to discover that QA is still a bottleneck keeping them from delivering a product to their customer.

 

A Look at the Current Situation

As we’ve seen, numerous factors feed the myths surrounding Quality Assurance and create a climate that makes some engineers shy away from the title, or even the field. Moreover, we are experiencing an actual shortage of qualified engineers, which means that QA in many instances ends up being not just an afterthought, but a luxury.

Immigration and work status have been hot topics for the last few years. Regardless of where you fall on the political spectrum, if you work in software, you’ve likely experienced firsthand the effect on the job market.

Those of you who have been lucky enough to hire engineers may have also been unlucky enough to discover that said engineer has to go back to India (or wherever his or her country of origin might be). What started out as a short trip quickly devolves into a lengthy process, sometimes six months or longer, to renew or regularize an H-1B visa following the changes made to the requirements in 2017.

Whether your experience with the dwindling QA applicant pool is firsthand or anecdotal, here are some statistics to chew on:

The US Bureau of Labor Statistics expects the market for software developers to grow by 24% from 2016 to 2026. That means a need for approximately 294,000 additional software developers in an already tight market. If you think it’s hard to convince an engineer to join your QA team now, just wait and see what it will be like in 2026.

We can’t know for sure how many H-1B visas are currently held up due to the changes in requirements, but this article does a decent job of discussing the demand and actual need for H-1B visas in the USA, with a focus on the State of Massachusetts. If you’d like to know more, I’d suggest taking a look; however, for our purposes, I don’t think importing QA staff is necessarily the answer.

So, you need to have QA, but you can’t hire qualified staff to take care of the work. What can you do?

The Foundations of Quality Assurance

Before I answer this question, let’s take a look at why Quality Assurance exists in the first place. From a business perspective, there are a few things that pretty much every customer expects from a software product. The following three expectations are at the heart of why quality assurance is not optional:

  1. Customers expect that the programs will work as requested and designed, and within the specified environments;
  2. They hope that software will be user-friendly;
  3. And, they assume that software will have been successfully debugged: meaning that QA must deliver a product that is at least free of the bugs that would result in numbers 1 or 2 becoming false.

Historically, software teams were small, but since the early 80s, driven by the need to scale quickly and keep up with changing requirements and other advances in technology and globalization, we’ve experienced rapid growth in the size of development teams and companies. This growth has led to a wide variety of tactics, from workflow solutions (think Waterfall or Agile) to different methods of increasing productivity and efficiency, such as offshore teams and microservices.

Take a look at this simple chart approximating the increase in development team size from the original release of Super Mario Brothers in 1985 to Super Mario World in 1990 and Super Mario 64 in 1996. (Note that by 1996 the credits occasionally thank entire teams, not just individuals, so the actual number is likely even higher.)

Super Mario Release Team Size

© Possum Labs 2018

Where we haven’t (at least in the USA) been able to keep up is in the training of new software engineers. QA departments, regardless of size, are challenged to carry out all the required processes to follow software through the development lifecycle to delivery and maintenance (updates), while also keeping abreast of changes to technology and integrations.

An unfortunate result of this shortage of QA engineers is that testing is the point in the development cycle where most companies fall short. And yet the ability to provide useful and meaningful testing is crucial to delivering quality assurance to one’s client, whether you are building an in-house product, such as for a financial institution, or a commercial product for the public market.

While offshore teams may be a solution for some companies, many companies are too small to make building offshore teams practical or cost-effective.

What’s more, many engineers tend to be good at one thing: development. They may not have a good sense of your organization’s business goals or even an understanding of what makes a good customer experience. And while your highly paid development staff might excel at building clever solutions, that doesn’t necessarily mean they also excel at testing their own work. Do you really want to pay them to do your testing, when their time could be better invested in innovation and features? At Possum Labs we’ve found that it is often most efficient to design workflows and teams around the people you have.

This gap between development requirements and a full understanding of business goals is in fact often the culprit behind a pervasive disconnect between testing and business outcomes. What do I mean by disconnect? Let’s consider the following four statements and then talk about some real-life examples:

  • Users prefer seamless interfaces, intuitive commands, and technology that makes them feel smart, not dumb.
  • Businesses prefer software that ensures their business goals are met and that allows their employees to work smarter and more efficiently, thus promoting growth and profit.
  • Today the average person is adept and familiar enough with technology to know when your software exhibits even moderately lousy UX. Companies can also experience public shaming via social media when they make a particularly dumb or inopportune mistake.
  • And then there are the security risks.

In 2017 we saw several major security snafus at large corporations that, from the few details publicized, were the direct result of inaction on the part of decision makers, despite their having been notified by engineering.

One might think that the decision makers, while acknowledging the risk to some degree, simply gambled that nothing would come of it while taking steps to protect themselves.

I would like to go a step further. I’d wager that everyone involved fell victim to a set of common testing pitfalls.

Indeed, one of the most challenging aspects of testing is figuring out not only how to create and run tests effectively and efficiently, but, most importantly, how to confidently deliver meaningful results to the decision makers. Whether you are a software business or a business that uses software, successful quality assurance is crucial to the long-term health and success of your business.

 

Let’s do a quick recap of what we’ve covered so far:

  1. There is a shortage of qualified test engineers.
  2. Users want products they can rely on and that are friendly to use.
  3. Companies want products they can trust, that improve efficiency and their bottom line, and that, of course, make their clients happy.
  4. It is difficult to create tests that deliver meaningful results when the testing is done by engineers who don’t necessarily understand the business’s end goals.
  5. Decision makers don’t want to understand the tests; they want to have meaningful results so that they can make effective decisions.

 

So what if we could solve all of these problems at once?

This is what Possum Labs achieves through the clever use of tools and solutions that integrate with your existing systems, processes, and people. We build out quality assurance so that anyone who understands the business goals can successfully carry out testing and efficiently use the results.

Not only does this solve problems 1 through 5 above, it also spares most companies from having to hire or train new developers. Instead, you hire people with a keen understanding of your business who can be trained to work with your tools. Possum Labs methods allow you to implement and upgrade your quality assurance, sometimes even reducing your staffing load, while delivering better and more meaningful results, so that the people you ultimately serve get better services or products than before.

 

How does Possum Labs do this?

Each of our solutions varies a bit from company to company, but in general we use several tools, including proxies and modules (think Lego), so that existing tests can be modified and new tests written simply by reorganizing the “bricks.” This focus on custom solutions allows a non-technical individual with a solid understanding of business goals to generate tests that deliver meaningful results he or she can confidently share with decision makers.
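To make the “brick” idea concrete, here is a minimal sketch of what composable test modules can look like. The classes and names (ITestBrick, CreateCustomerBrick, and so on) are hypothetical illustrations, not Possum Labs’ actual tooling:

using System;
using System.Collections.Generic;

// Hypothetical illustration: composable test "bricks" that someone who knows the
// business, but not the codebase, can rearrange into new scenarios.
public class TestContext
{
    public Dictionary<string, object> Data { get; } = new Dictionary<string, object>();
}

public interface ITestBrick
{
    string Describe();                  // plain-language description for business readers
    void Execute(TestContext context);  // the technical work hidden inside the brick
}

public class CreateCustomerBrick : ITestBrick
{
    public string Describe() => "Given a new customer";
    public void Execute(TestContext context) => context.Data["customer"] = "customer-001";
}

public class PlaceOrderBrick : ITestBrick
{
    public string Describe() => "When the customer places an order";
    public void Execute(TestContext context) => context.Data["order"] = "order-001";
}

public class Scenario
{
    private readonly List<ITestBrick> bricks = new List<ITestBrick>();

    public Scenario Add(ITestBrick brick) { bricks.Add(brick); return this; }

    public void Run()
    {
        var context = new TestContext();
        foreach (var brick in bricks)
        {
            Console.WriteLine(brick.Describe());  // readable log of the scenario
            brick.Execute(context);
        }
    }
}

A new scenario is then just a new ordering of existing bricks, for example new Scenario().Add(new CreateCustomerBrick()).Add(new PlaceOrderBrick()).Run(), which is exactly the kind of recombination a non-developer can manage.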

The result is that testing bottlenecks open up, allowing for a more efficient flow of information and better feedback through all channels. Products are delivered faster. Information flows smoothly. Better decisions are made, and efficiencies are gained. Developers can focus on development, and decision makers on achieving their strategic goals. Meanwhile, you’ve got happy customers, and everyone can get a good night’s rest.

3 Risks to Every Team’s Progress and How to Mitigate Them

When looking to improve performance, the first thought is often to increase the size of the development team; however, a larger group is not necessarily the only or the best solution. In this piece, I suggest several reasons to keep teams small, as well as reasons to stop them from getting too tiny. I also look at several types of risk to consider when choosing team size: how team size affects communication, individual risk, and systematic risk.

Optimal Team Size for Performance

The question of optimal team size is a perpetual debate in software organizations. To adjust, grow and develop different products we must rely on various sizes and makeups of teams.

We often assume that fewer people get less done, which leads us to add people to our teams so that we can get more done. Unfortunately, this solution often has unintended consequences and unforeseen risks.

When deciding how big of a team to use, we must take into consideration several different aspects and challenges of team size. The most obvious and yet most often overlooked is communication.

Risk #1: Communication Costs Follow Geometric Growth

The main argument against big teams is communication. Adding team members results in geometric growth of communication paths and, with them, potential problems. This increase in communication pathways is easiest to illustrate with a visual representation of team members and communication paths.

Geometric Growth of Communication Paths

Bigger teams increase the likelihood that we will have a communication breakdown.
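The arithmetic behind the chart: a team of n people has n(n-1)/2 one-to-one communication paths, so the count grows roughly with the square of the team size. A quick sketch of the numbers:

using System;

class CommunicationPaths
{
    static void Main()
    {
        // One-to-one communication paths for a team of n people: n * (n - 1) / 2
        foreach (int teamSize in new[] { 3, 5, 8, 12, 20 })
        {
            int paths = teamSize * (teamSize - 1) / 2;
            Console.WriteLine($"{teamSize} people -> {paths} communication paths");
        }
        // Prints: 3 -> 3, 5 -> 10, 8 -> 28, 12 -> 66, 20 -> 190
    }
}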

From the standpoint of improving communication, one solution that we commonly see is the creation of microservices to reduce complexity and decrease the need for constant communication between teams. Unfortunately, the use of microservices and distributed teams is not a “one size fits all” solution, as I discuss in my blog post on Navigating Babylon.

Ultimately, when it comes to improving performance, keep in mind that bigger is not necessarily better. 

Risk #2: Individual Risk & Fragility

Now, a larger team seems like it would be less fragile; after all, a bigger team should be able to handle one member winning the lottery and walking out the door. This assumption is partially correct, but lottery tickets are usually individual risks (unless people pool tickets, something I have seen at a few companies).

When deciding how small to keep your team, make sure that you build in consideration for individual risk and be prepared to handle the loss of a team member.

Ideally, we want the smallest team possible while limiting our exposure to any risk tied to an individual. Unfortunately, fewer people tend to get less work done than more people (leaving skill out of it for now).

Risk #3: Systematic Risk & Fragility

Systematic risk relates to events that affect multiple people on the team. Fragility describes how well a structure or system can handle hardship (or change in general). Systematic risks stem from things shared across the organization: leadership, shared space, or shared resources.

Let’s look at some examples:

  • Someone brings the flu to a team meeting.
  • A manager/project manager/architect has surprise medical leave.
  • An affair between two coworkers turns sour.

All of these events can grind progress to a halt for a week (or much more). Events that impact morale can be incredibly damaging as lousy morale can be quite infectious.

In the Netherlands, we have the concept of a Baaldag (roughly translated as an irritable day), where team members limit their exposure to others when they know they won’t interact well. In the US, with its stringent sick-day and holiday limits, this is rare.

Solutions to Mitigate Risk 

Now, there are productive ways to minimize risk and improve communication. One is to look carefully at your structure and goals and build an appropriate team size while taking additional actions to mitigate risk. Another effective risk-mitigation technique is training. You shouldn’t be surprised, however, that my preferred method to minimize risk is developing frameworks and using tests that are readable by anyone on your team.

Unrealistic Expectations: The Missing History of Agile

Unrealistic expectations, or “why we can’t replicate the success of others…”

Let’s start with a brain teaser to set the stage for questioning our assumptions.

One day a man visits a church and asks to speak with the priest. He asks the priest for proof that God exists. The priest takes him to a painting depicting a group of sailors, safely washed up on the shore following a shipwreck.

The priest tells the story of the sailors’ harrowing adventure. He explains that the sailors prayed faithfully to God and that God heard their prayers and delivered them safely to the shore.

Therefore God exists.

This is well and good as a story of faith. But what about all the other sailors who have prayed to God, and yet still died? Who painted them?

Are there other factors that might be at play?

When we look for answers, it’s natural to automatically consider only the evidence that is easily available. In this case, we know that the sailors prayed to God. God listened. The sailors survived.

What we fail to do, is look for less obvious factors.

Does God only rescue sailors that pray faithfully? Surely other sailors that have died, also prayed to God? If their prayers didn’t work, perhaps this means that something other than faith is also at play?

If our goal is to replicate success, we also need to look at what sets the success stories apart from the failures. We want to know what the survivors did differently from those that did not. We want to know what not to do, what mistakes to avoid.

In my experience, this is a key problem in the application of agile. Agile is often presented as the correct path; after all, lots of successful projects use it. But what about the projects that failed? Did they use Agile, or did they not implement Agile correctly? Or maybe Agile is not actually that big a factor in a project’s success?

Welcome to the history of what is wrong with Agile.

Consider this: a select group of Fortune 500 companies, including several technology leaders, decides to conduct an experiment. They hand-pick people from across their organizations to complete a very ambitious task, a task an order of magnitude different from anything they’d previously attempted, with an aggressive deadline.

Question 1: How many do you think succeeded?

Answer 1: Most of them.

Question 2: If your team followed the same practices and processes that worked for these teams do you think your team would succeed?

Answer 2: Probably not.

The Original Data

In 1986, Hirotaka Takeuchi and Ikujiro Nonaka published a paper in the Harvard Business Review titled “The New New Product Development Game.” In this paper, Takeuchi and Nonaka tell the story of businesses that conducted experiments with their personnel and processes to innovate new ways of doing product development. The paper introduced several revolutionary ideas and terms, most notably the practices that we now know as agile (and scrum).

The experiments, run by large companies and designed for product development (not explicitly intended for software development), addressed common challenges of the time regarding delays and waste in traditional methods of production. At the root of the problem, the companies saw the need for product development teams to deliver more efficiently.

The experiment and accompanying analysis focused on a cross-section of American and Japanese companies, including Honda, Epson, and Hewlett-Packard. To maintain their competitive edge each of these companies wished to rapidly and efficiently develop new products. The paper looks at commonalities in the production and management processes that arose across each company’s experiment.

These commonalities coalesced into a style of product development and management that Takeuchi and Nonaka compared to the rugby scrum. They characterized this “scrum” process with a set of 6 holistic activities. Taken individually, these activities may appear insignificant and may even be ineffective. However, when they occur together as part of cross-functional teams, they result in a highly effective product development process.

The 6 Characteristics (as published):

  1. Built-in instability;
  2. Self-organizing project teams;
  3. Overlapping development phases;
  4. Multilearning;
  5. Subtle control;
  6. And, organizational transfer of learning.

What is worth noting is what is NOT pointed out in great detail.

For instance, the companies hand-picked these teams out of a large pool of, most likely, above-average talent. These were not random samples; these were not even companies converting their existing processes; these were experiments with teams inside of companies. The companies also never bet the farm on these projects: they were large, but if they failed, the company would likely not go under.

If we implement agile, will we be guaranteed success?

First, it is important to note that all the teams discussed in the paper delivered positive results. This means that Takeuchi and Nonaka did not have the opportunity to learn from failed projects. With no failures in the data set, they could not compare failures with successes to see what might have separated the two.

Accordingly, it is important to consider that the results of the study, while highly influential and primarily positive, can easily deceive you into believing that if your company implements the agile process, you are guaranteed to be blessed with success.

After years in the field, I think it is vitally important to point out that success with an agile implementation is not guaranteed. I’ve seen too many project managers, team leads, and entire teams banging their heads against brick walls, trying to figure out why agile just does not work for their people or their company. You, unlike the experiments, start with a random set of people, and agile might not be suited for them.

To simplify the logical question: if all marbles are round, are all round things marbles? The study shows that these successful projects implemented these practices; it did not claim that these practices brought success.

What is better: selecting the right people or the right processes for the people you have?

Consider that your company may not have access to the same resources available to the companies in this original experiment. These experiments took place in large companies with significant resources to invest. Resources to invest in their people. Resources to invest in training. Resources to invest in processes. Resources to cover any losses.

At the outset, it looks like the companies profiled by Takeuchi and Nonaka took big gambles that paid off as a result of the processes they implemented. However, it is very important to realize that they, in fact, took strategic and minimal risk: they made sure to select the best people and did not risk any of their existing units. They spun up an isolated experiment at arm’s length.

Looked at this way, most large multinational companies already have above-average people, and then they cherry-pick those best suited for the job. This is not your local pick-up rugby team, but rather a professional league. And because these were large companies with broad resources, the strategic risks they took may not be realistic for your average small or medium-sized organization.

The companies profiled selected teams that they could confidently send to the Olympics or World Cup. How many of us have Olympians and all-star players on our teams? And even if we have one or two, do we have enough to complete a team? Generally, no.

The Jigsaw Puzzle: If one piece is missing, it will never feel complete.

Takeuchi and Nonaka further compare the characteristics of their scrum method to a jigsaw puzzle. They acknowledge that having only a single piece of the puzzle, or missing a piece, means that your project will likely fail. You need all the pieces for the process to work. They neglect to emphasize that this also means you need the right people to correctly assemble the puzzle.

The only mention they make regarding the people you have is the following:

“The approach also has a set of ‘soft’ merits relating to human resource management. The overlap approach enhances shared responsibility and cooperation, stimulates involvement and commitment, sharpens a problem-solving focus, encourages initiative taking, develops diversified skills, and heightens sensitivity toward market conditions.”

In other words, the solution to the puzzle is not only the six jigsaw puzzle pieces, but it is also your people. These “soft merits” mean that if your people are not able to share responsibility and cooperate, focus, take the initiative, develop diverse skills and so on, they aren’t the right people for an agile implementation.

If you don’t have all the pieces, you can’t complete the puzzle. And if you don’t have the right people, you can’t put the pieces together in the right order. Again, you might be round, but you might not be a marble.

Human-Centered Development for the People You HAVE

As with any custom software development project, the people who implement it are key to your project’s success. Implementing agile changes the dynamics of how teams communicate and work. It changes the roles and expectations of all aspects of your project, from executive management to human resources and budgeting.

Agile may work wonders for one company or team, but that success doesn’t mean that it will work wonders for YOUR team. Especially if all stakeholders do not understand the implications and needs of the process or they lack the appropriate aptitudes and skills.

In other words, if these methods don’t work for your people, don’t beat up yourself or everyone else. Instead, focus on finding a method that works for you and for your people.

Agile is not the only solution …

Why do people select agile? People implement agile because they have a problem to solve. However, with the agile approach managers need to step back and let people figure things out themselves. And that is not easy. Especially when managers are actively vested in the outcome. Most people are not prepared to step back and let their teams just “go.”

Maybe you have done the training, received the certifications, and theoretically “everyone” is on board. And yet, your company has yet to see all-star success. Are you the problem? Is it executive management? Is it your team? What is wrong?

I cannot overemphasize that the answer is as simple as the people you have. Consider that the problem is unrealistic expectations. The assumption when using agile and scrum is that it is the best way to do development, but what if it is not the best way for you?

If you don’t have the right people or the right resources to implement agile development correctly, then you should probably do something else. At the same time, don’t hesitate to take the parts of agile that work for you. 

Citations:

Takeuchi, H., & Nonaka, I. (1986, January). The New New Product Development Game. Harvard Business Review. Retrieved July 19, 2017, from https://hbr.org/1986/01/the-new-new-product-development-game

A Call for an “FAA” of Software Development

Mayday! Bad UX in Airplanes

Software development teams constantly learn lessons. Some of these lessons are more significant than others.
Because there is no universal method or process for sharing lessons learned, many lessons are learned in parallel within different companies. Often, different teams within our very own companies make the same mistakes over and over again, simply because there is no shared repository of knowledge and experience.

Even when a developer or quality assurance professional attempts to research when, where, and why things have gone wrong, it is very difficult to find documented, pertinent information.

These unnecessary mistakes represent avoidable expenses for both consumers and companies, and at a certain price point, especially a public price point, it becomes very useful to have a public method for accessing “lessons learned.”

Not just a report of the problematic lines of code, but an analysis of the effects of that code (who, what, when, where, why, and how).

What’s more, in addition to the time and financial costs of problematic code, there is also risk and liability to consider. From privacy to financial health to business wealth, the risk is great enough that I propose the creation of an organization, similar to the FAA, for documenting, reporting and making software “travel” safer.

There are many examples of bad UX to be found. Just for fun, let’s look at some real-life examples of lessons learned in software, in regard to shareholder value and liability in airline travel.

Mayday: Bad UX in Airplanes

Not so long ago, I decided I needed a change of pace in my evening routine, and one way or another I stumbled upon Mayday, a show about air crash investigations. My natural human curiosity about bad things that happen to other people hooked me at first, but after a few episodes, the in-depth analysis of all the various factors that cause airplane crashes really caught my attention. As a testing and QA expert, I found it disheartening to see how frequently bad UX is a contributing factor in airplane crashes. In most cases, the bad UX is not the instigating cause (phew!), but the stories in the show make it evident that bad UX can easily make a bad situation worse.

Scenario #1: Meaningful Warnings

For instance, on a ground control display, there is a 7-character field next to each aircraft indicating its expected altitude alongside its broadcast altitude (theoretically its actual altitude). In one episode, an airplane expected to be at an altitude of 360 reported flying at 370, as indicated by the display, which read “360-370.”

If the broadcast stopped, the display would read “360Z370,” indicating a last broadcast of 370 versus an expected 360. When the broadcast stopped with this discrepancy showing, there was no alarm or even a color change, just the character “Z” in the middle of a 7-character string, implying that half of the remaining numbers were garbage.

This piece of information on its own is not terribly exciting, nor is it something that could on its own cause an aircraft to go down. Furthermore, there is not much reason for a system tracking altitude to simply be switched off.

A bad UX design process uncovered by “hidden” warnings

That is, of course, unless the button to activate or deactivate the system is placed behind the pilot’s footrest, and the display for the error message is placed next to the button, presumably down below the foot.

No audible alarm, no flashing lights, nothing else of note to catch a pilot’s attention. In this scenario (based on a true story), the system can easily be turned off accidentally and without warning. Add to the mix another plane flying in the same vicinity, and the unfortunate result is many dead people spread out over the jungle. The resulting system update was the addition of an audible alarm if and when the system is switched off.

Scenario #2: How to Handle Bugs

Another episode profiles an airplane crash precipitated by an airspeed sensor that bugged out, as in, a bug appears to have built a nest in the sensor. In this situation, the system created contradictory warnings while also leaving out expected ones.

For instance, the plane went from warning of excess speed to stall warnings. The inconsistencies sadly managed to confuse the pilots into their deaths.

Now standard flight training is required to cover how to respond when the cockpit says: “Hey! We are receiving conflicting airspeeds!”

Scenario #3: Half-Full vs. Empty

Another episode profiles a UX design process failure on the maintenance side. Somehow, two similar-looking modules for gauging fuel reserves came into use for similar, but different, models of an airplane.

Initially, it appeared that the gauges could be installed and would work interchangeably, even going so far as to update their readings. The problem is that the readings would be a bit off; well, let’s call it like it is: completely off.

If you put the wrong module in the wrong model of plane, the gauge will read half full when the tank is actually empty. An easy fix turned out to be putting a key into the socket so that the two gauges are no longer interchangeable: proof that good UX is not just about design, but also about tracking lessons learned.

Unfortunately, the fix did not get implemented until a plane crashed into the sea before reaching land (and an airport).

Documenting Problems and Solutions

When planes go down, there is a loss of human life and great financial expense. This means that all of these issues have been addressed and fixed, and they likely won’t happen again. Documentation and prevention are among the many reasons that airplanes really don’t go down very often these days, or at least not in the markets where shareholder interest and liability make downed aircraft unacceptable, and where significant investment is made in the UX design process.

From my perspective, the most interesting aspect of the show Mayday is that it highlights many small UX problems discovered only because people died. The death of a person is something that has a near infinite cost associated with it in Western civilization and therefore causes a very large and detailed process to get down to root causes and implement changes. Especially when it is the death of 100 people at once. Mayday is a great show for learning how to analyze problems, while also giving viewers a deep appreciation for good UX design.

Looking at the small issues that are the root causes of these airline crashes, and the unremarkable UX changes made to prevent them, really drives home the number of small, non-life-threatening losses that take place every day due to lousy UX. Adding up all of these small losses might reveal a quite significant financial cost for the various businesses involved.

And although the majority of them don’t directly cause the loss of life, life might be better (safer, less stressful, more financially secure, etc.) if these small UX design flaws could be reliably flagged and a system put in place to prevent them from recurring.

Standard tracking and reporting of software failures

This brings us back to my statement at the beginning of this piece regarding the need for a body or a system to track and report lessons learned. In airline travel in the USA, we have the Federal Aviation Administration (FAA) to make sure that airline travel is safe.

The purpose of the FAA is the following: The Federal Aviation Administration (FAA) is the agency of the United States Department of Transportation responsible for the regulation and oversight of civil aviation within the U.S., as well as operation and development of the National Airspace System. Its primary mission is to ensure the safety of civil aviation.

Now imagine we had a Federal Software Administration, whose primary mission was to ensure the safety and reliability of software. What if we held ourselves accountable for reporting and documenting not only when bad UX precipitated an airplane crash, but all kinds of software defects that cause significant loss?

Software Asset Management (SAM) already exists as a business practice within some organizations, but not in enough of them. And there is still no central organization to pull together the information documented by businesses with successful SAM practices.

In 2016, the Software Fail Watch identified over a billion dollars in losses just from software failures mentioned in English-language news sources, and they estimate that this is only a “scratch on the surface” of actual software failures worldwide. There is much to be debated here, but if a US-based or even an international agency simply started to record failures and their causes, without getting into the official business of issuing guidelines, the simple acts of investigation and reporting could create an opportunity for significant and widespread improvements in design.

I think we can all agree that our industry could greatly benefit from an efficient and productive repository for sharing lessons learned, lessons that are too often relegated to oral histories passed between developers.

Companies may not initially be motivated to report the details of their problems; however, the long-term benefits would surely outweigh the perceived costs. As an industry, software development can only benefit from sharing knowledge and lessons learned.

Group intelligence is known to exceed that of individual members, and in a world of increasingly unanticipated scenarios and risks, we need to be able to effectively anticipate and solve the security and uptime challenges faced by many companies.

As an industry, perhaps instead of fearing judgment, we can focus on the benefits of embracing our imperfect nature while facilitating and co-creating a more efficient and productive future.

If you enjoyed this piece, please share it. 

If you have something to say, please join the discussion!

Navigating Babylon Part II

How to Introduce DomainSpeak in Testing

First, a quick overview of the problem I discussed in Navigating Babylon Part I. Microservices create development efficiencies in a world dependent on remote work environments and teams. Unfortunately, the separation of workers and teams means microservices tend to encourage the development of multiple languages or dialects that obfuscate communication and further complicate testing. We have our anti-corruption layer, and we don’t want to pollute our code by letting sub-system language spill in.

A Domain-specific Vocabulary for Testing: DomainSpeak

There is, however, a pragmatic solution: we can build on the anti-corruption layer by creating tests in a specific language designed to clearly describe business concepts. We can and should create DomainSpeak, a domain-specific vocabulary or language, to be used for testing. Once we publish information in this language, it can be shared across microservices and thus influence the workflow. Periodically, as happens in English, we may need to improve the definitions of certain vocabulary by redefining usage and disseminating it widely, thus influencing its meaning.

How will this DomainSpeak improve testing?

The different dialects should not be allowed to permeate your integration tests. You need to be very clear that a word can have only a single meaning. This requires a two-part process (a minimal sketch follows the list):

  1. You need to verify that you are not inconsistently naming anything inside an actual test; and,
  2. You need to do translations in an anti-corruption layer so everything inside is consistent.
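As a sketch of what that translation step can look like, assume a billing microservice that speaks its own dialect (“Client”, “Uid”) while the test domain speaks DomainSpeak (“Customer”, “Id”); the class and property names here are hypothetical:

// All renaming and interpretation happens in this one translator, so the
// test cases themselves only ever see DomainSpeak terms.
public class BillingClientDto              // shape owned by the (hypothetical) billing service
{
    public string Uid { get; set; }
    public bool HasActiveSubscription { get; set; }
}

public class Customer                      // shape owned by the test domain
{
    public string Id { get; set; }
    public bool IsSubscribed { get; set; }
}

public static class BillingTranslator
{
    public static Customer ToCustomer(BillingClientDto dto) =>
        new Customer
        {
            Id = dto.Uid,
            IsSubscribed = dto.HasActiveSubscription
        };
}

Tests then assert against Customer only; if the billing service renames a field, the only place that changes is BillingTranslator.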

What does DomainSpeak look like in a practical sense?

When you consider how brands influence pop culture, it is through language.

In the business world, marketing professionals use domain-specific languages to create a brand vocabulary, or BrandSpeak. All major brands, and even smaller yet influential ones, have a specific vocabulary, with specific definitions and meanings, to communicate their brand to the public. All communications materials are integrated into this system.

Brand-specific, intentional vocabulary has the ability to invade and permeate. Many people are completely unaware that it was a De Beers advertising campaign in the 1940s that created the cultural tradition “a diamond is forever.” To give other examples, “Don’t mess with Texas” came from an anti-litter campaign, and although we know it’s a marketing ploy, just about everyone is on board with the idea that “What happens in Vegas, stays in Vegas.” On an international level, if you order a “coke” you will most likely get a carbonated beverage, but you won’t necessarily get a Coca-Cola.

As I referenced in my first discussion of Navigating Babylon, I recommend implementing a mapping layer between the microservices and the test cases. When addressing the language used, we take it a step further: we focus on the language, the DomainSpeak, and how this domain-specific vocabulary improves the output of the test cases. This means, for example, that a Customer, a User, and a Client all have specific meanings and cannot be interchanged.

What is the process to create this language?

The initial process is an exploratory step. To create your own DomainSpeak, your testing department will need to communicate regularly with the business owners and developers. Their goal will not be to dictate what words the business owners and developers use, but to learn what words already have meanings and to document these usages. The more you communicate, recognize, and document adopted meanings, the more you will discover how, where, and why meanings diverge.

For instance, the business may see a Customer as a User with an active subscription, whereas a microservice might use the words interchangeably because it has no concept of a subscription. You will also notice that some situations give rise to conflicting meanings. A developer may have picked up the word “Client” for “User” from a third-party API he integrated with, whereas the business may use “Client” for a specific construct in a billing submodule for “customers of their customer.” In such situations, to avoid confusion and broken functionality, you will need to specify which definition is to be used and possibly introduce a new concept or word to account for the narrowing of the definition. Perhaps the “customers of their customer” will now be a “vendee” instead of a “client.” Don’t be dismayed if there is no existing word that accurately matches your concept; you can always create a new word or make a composite word to meet your needs.

Indeed, by being consistent and by distributing your use of language to a wide audience, you can introduce new words and shape the meaning of existing words. This means that your tests have to be very close to a formal and structured form of English. This can be accomplished by using Aspect-oriented testing or by creating fluid API wrappers on top of the microservices. Aspect-oriented testing would look like this (cucumber syntax):

Given a User
When the user adds a Subscription
Then the User is a Customer

Whereas a fluid API would look something like this (C# syntax):

User user1 = UserManager.CreateNewUser();
SubscriptionManager.AddNewSubscriptionFor(user1);
Assert.True(UserManager.Get(user1).IsCustomer());
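In a .NET shop, one way to wire the cucumber syntax to those same helpers is a SpecFlow-style step binding. This is only a sketch; it assumes the hypothetical UserManager, SubscriptionManager, and User types from the snippet above and uses xUnit for the assertion:

using TechTalk.SpecFlow;   // provides the [Binding], [Given], [When], [Then] attributes
using Xunit;               // assertion library (assumed)

[Binding]
public class CustomerSteps
{
    private User _user;    // carries state between the Given/When/Then steps

    [Given(@"a User")]
    public void GivenAUser() => _user = UserManager.CreateNewUser();

    [When(@"the user adds a Subscription")]
    public void WhenTheUserAddsASubscription() => SubscriptionManager.AddNewSubscriptionFor(_user);

    [Then(@"the User is a Customer")]
    public void ThenTheUserIsACustomer() => Assert.True(UserManager.Get(_user).IsCustomer());
}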

This creates a lot of focus on syntactic sugar* and on writing code behind the scenes to ensure that your code incorporates your business logic (your test) and looks the way you want it to. Every language has its own way to solve this challenge. Even in C, you could use macros to take pseudo-code and turn it into something that would compile, and chances are your results would be far superior to your current language usage.

For my uses, the cucumber syntax, with a side of extra syntactic sugar that allows me to define variables, is very effective. (I will get into this in more detail another day.) Whichever language you use, keep in mind that the goal of creating a DomainSpeak vocabulary is not to make your code look pretty, but rather to ensure that your code communicates clearly to the business and developers and that meanings are defined with precise and consistent language.

The End Goal is Efficient Quality Assurance

The goal, after all, is to improve productivity and deliver a quality product. Clear communication will not only benefit your team internally, it will also influence other teams. By communicating your results in consistently clear and concise language to a wide audience, you will influence their behavior. You will be able to efficiently respond to questions along the lines of “we use ‘customer’ for all our ‘users.’” You will also be able to easily define and answer where the rule may not hold and why you use the word you use. Again, the goal is not to dictate to folks what words to use, but to explain the meanings of words and to encourage consistent usage. Adoption will follow slowly, and over time usage will become a matter of habit. Over time services will be rewritten and you should be able to delete a few lines from your anti-corruption layer every time one gets rewritten.

*Syntactic sugar allows you to express something more aesthetically, but not in a fundamentally new way: different ways of saying the same thing that simply look different. It matters not because it is new, but because it makes things more readable and understandable; cleaner language and clearer code make it easier to find a bug.

If you enjoyed this piece please share it with your colleagues! If you have something to add, please join the discussion!

Navigating Babylon: Part I

Navigating Babylon

“What do you mean?” is a common phrase; it is the communication equivalent of a checksum, making sure that words are interpreted correctly. This is what testing does: it ensures that business concepts are interpreted correctly. There are bugs where the code does not do what the developer intended; these have been addressed by tools and methodologies and are becoming rare. The bugs we want to talk about here are the ones where the code does what the developer intended, but not what the business desired.

What better tooling has not solved is the misinterpretation of requirements.

Again we are back to the problem of working with humans. Errors that result from misinterpretations are becoming the dominant target in testing, but to find them, it is imperative that the language that describes the tests is unambiguous. This is becoming increasingly challenging as distributed teams create isolated dialects.

Let’s look at two different ways to overcome communication problems. One option is to isolate ourselves from dialects using the anti-corruption layer pattern. A second option is to make language across teams cohesive by sharing information with clear and specific language. Effective, clear communication flows naturally and increases efficiency, which is our ultimate goal.

Microservices are adapted to today’s remote workforce, and they are increasingly used because of their efficiency. On the development end, microservices reduce complexity and decrease the need for constant communication between teams. An unfortunate side effect is that, over time, individual microservices naturally go their own way and, as a byproduct, each service’s code develops its own language, dialect, or slang. Microservices’ tendency to promote distinct vocabularies and meanings increases the likelihood of mistakes, bugs, and broken tests.

Improving on the Monoliths of the Past

One of the benefits of monolithic architecture was that everyone spoke the same language. Yes, the user object was enormous and far more complex than necessary, but everyone spoke the same language because everyone used the same object. Everyone agreed that User.Id was the identifier.

The intention behind microservices is to make life simpler and reduce complexity, but at the same time they have increased complexity by creating many tiny, distinct monoliths that naturally encourage speaking slightly different languages or dialects. We have gone from the Pyramids to the Microservices Towers of Babel. Another analogy might be to say that we have created geographically isolated human populations that encourage the development of distinct dialects and regionalisms. Where we once had User.Id, we may now find one microservice has chosen User.Identifier, while the next has settled on Customer.Id, and in yet another we find Agent.Uid.
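Concretely, the drift might look something like this (hypothetical message classes, one per service):

using System;

// The same business concept as three different microservices see it (illustrative only).
public class UserDto     { public string Identifier { get; set; } }   // accounts service
public class CustomerDto { public Guid Id { get; set; } }             // billing service
public class AgentDto    { public string Uid { get; set; } }          // support service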

Despite this drawback, microservices still offer a natural efficiency.

The solution is not to eliminate microservices, as they fit the increasingly popular corporate structure of remote work. Conway’s Law states that companies design software that mirrors their internal structure, and as individual developers become more isolated by remote work, it makes sense to use an architecture that supports building out small, isolated components. Microservices were born as a way to accommodate the distributed nature of teams. However, as each service is built, we start to see slightly different terminology, slightly different assumptions, different methods for error handling, different exception behaviors, and so on.

The Dangers of Misinterpretation

When writing the code for integration tests that span microservices, we often end up with something like this:

Assert.Equal (a.SerialNumber, b.AssetId);

One service calls it a serial number, the other an AssetId. Technically, it is just a small issue that is easily understood in a conversation, but it is potentially grounds for a larger problem. Problems like this are only amplified when the respective developers work in different parts of the world and across several time zones. Sure, the company can clock development time 24 hours per day, but developer A has to wait 12 hours for developer B to get in, and by that time developer A is already back in bed. And so developers put in their “one line fix,” but the test cases that integrate these services tend to repeat this incongruent language.

There are patterns that solve this: the Adapter design pattern and the Anti-Corruption Layer concept from domain-driven design both address these issues. The core idea is that you take the service that speaks a different dialect and wrap it in code that takes care of all of your mapping for you, effectively creating a layer where you put all your “one line fixes.”

If one service thinks a field is a GUID and the other a String, you convert it there to a common format across all tests. The same goes for nullable fields, or fields named slightly differently (Id, id, Uid, UserId, Identifier, UserIdentifier, uIdentifier, etc.). Create a layer for each service, even if there is no special mapping, to isolate your test cases from all the different dialects. Then create your own version of the message objects with consistent naming, and write code that maps your test objects to the microservice objects. If you use reflection to map the fields that match, you should be able to achieve this with relatively little code.
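Here is a minimal sketch of that idea, assuming you roll your own reflection helper rather than pull in a mapping library; the DTO and adapter names are hypothetical:

using System.Linq;

public static class FieldMapper
{
    // Copies every property whose name and type match onto a fresh TTarget;
    // anything that doesn't match is left for the per-service adapter to handle.
    public static TTarget MapMatchingProperties<TTarget>(object source) where TTarget : new()
    {
        var target = new TTarget();
        foreach (var targetProp in typeof(TTarget).GetProperties().Where(p => p.CanWrite))
        {
            var sourceProp = source.GetType().GetProperty(targetProp.Name);
            if (sourceProp != null && sourceProp.CanRead &&
                targetProp.PropertyType.IsAssignableFrom(sourceProp.PropertyType))
            {
                targetProp.SetValue(target, sourceProp.GetValue(source));
            }
        }
        return target;
    }
}

public class InventoryAssetDto { public string AssetId { get; set; } public string Location { get; set; } }
public class TestAsset { public string SerialNumber { get; set; } public string Location { get; set; } }

// One thin adapter per service holds the "one line fixes."
public static class InventoryServiceAdapter
{
    public static TestAsset ToTestAsset(InventoryAssetDto dto)
    {
        var asset = FieldMapper.MapMatchingProperties<TestAsset>(dto);  // Location maps automatically
        asset.SerialNumber = dto.AssetId;                               // the dialect difference lives here
        return asset;
    }
}

Test cases then work only with TestAsset and its consistent names, and each “one line fix” lives in exactly one adapter.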

Now you might look at this process and think: “Wouldn’t it be easier if they just all named it the same thing?” Unfortunately, the answer is actually “no.” This problem is the direct result of the communication structure of an organization. To change the structure of communication you will have to change the organization, which is rarely a pragmatic solution.

Harnessing Babylon

Let’s look at two plausible solutions that are significantly more pragmatic than changing your overall organizational structure:

  1. Create a Department of Naming Stuff (Dons): Need a new field? Ask the Dons for the name to use. They will check whether it is already in the big book of names for stuff and, if not, they will add it and name it.
  2. Direct and consistent communication: Have developers communicate directly. If people talk to each other, they start to adopt the same words for the same things; the closer the interaction, the more closely their use of words correlates.

Option 1

Languages (the human kind, not the computer kind) have this same problem. The Dons is comparable to the official Oxford Dictionary. Of course, we may have the problem that the US and UK don’t see eye to eye on which dictionary to use, especially for common informal words. The result is that in most scenarios, the Dons is likely to feel heavy-handed. And the day the Dons makes a poor naming decision, the whole concept will be ridiculed.

By the time the problem has advanced to the stage where this looks like a good idea, you will find that retrofitting to a standard language is cost-prohibitive, especially as the problems it causes mean the project is already buggy, late, and over budget.

Option 2

And given our preference for remote work, the second option is inherently impractical, as developers work in different locations and across time zones. Option two might also continue to promote the regionalisms, slang, and dialects that develop when people in close proximity start to adapt words to have specific meanings, meanings that are often not shared across the wider organization.

Breaking out of Babylon

Fortunately, we are no longer limited to two solutions. There is a third solution that first entered the scene with radio and television, and that has become nearly universal thanks to the digital age. This third solution is a unifying language. In daily life, we know it as pop culture.

A Unifying Language

Pop culture produces media for consumption by large segments of the population. By favoring words with specific meanings, pop culture ensures that widely distributed words and meanings become adopted not only into American English but around the world. Netflix has members in 190 countries, and Facebook has nearly 2 billion members worldwide. We can google truthiness, take a selfie, and Facebook the results. Pop culture introduces new words, redefines words, and narrows the meanings of words through repeated exposure to specific audiences.

We can do the same thing with microservices: we can create our own culture. We can create and distribute test results with consistent specific meanings. By creating our own uses and definitions, we can appropriate the language that we need and define it for our specific purposes. QA then becomes the company’s “pop” culture influencer and over time effectively influences the meanings that people associate with specific words. This is not a quick process, and measuring change will be difficult.

To be continued next week: How to Create a domain-specific vocabulary or DomainSpeak.

Please share this article and join the discussion in the comments!

A Look at Why I Can’t Follow Advice and Neither Can You…

Anyone who has tried to lose weight knows there is a significant gap between “knowing” and “doing.”

We can read the books. Follow the science. Listen to the advice: the professional advice and, of course, the “helpful” information from well-meaning friends, coworkers, and even the cashier at the grocery store. Weight loss is so ubiquitous in our society that everyone feels confident sharing their expert advice and experience. And yet we are a good decade or more into an obesity epidemic…

Lousy Advice or Lack of Questioning?

Software testing, ironically, has many of the same issues. Testing is part of every organization, almost any developer can tell you how to get things done (usually without having done it themselves), and there is plenty of advice, including numerous books on best practices that tell you what you should be doing. And yet testing doesn’t always go the way it should, nor does it consistently deliver the results we expect. This is not due to a lack of standard rules or even expert advice; it reflects the fact that each situation has unique conditions. Success is thus much more complicated than simply following advice. For example, telling a vegan that the perfect way to lose weight is to cut out carbs demands a more complicated and significant sacrifice than giving the same advice to someone working (and eating!) at a BBQ shack.

In many situations, testing does not work for reasons similar to diet advice: it is not appropriate for the specific organization and the individuals implementing it. You need a solution that works for you, for your group, for your particular advantages, and for your limitations. Drugs (pharmaceuticals) are another parallel. Not every drug is right for every person. Some people experience side effects or allergies where others don’t. Some drugs are only effective for some people. Some drugs make grand claims, but it is unclear whether they are actually any more effective than a placebo.

Let’s consider the disclaimer on Forbes for Chantix, a drug that is supposed to help folks quit smoking.

Purpose: A prescription medicine that contains no nicotine, Chantix can help adults stop smoking. Studies have shown that after 12 weeks of use, 44% of users were able to quit smoking.

Side Effect: Some of the most common side effects of using Chantix include trouble sleeping, as well as vivid, unusual or increased dreaming. In February, the FDA also released a public health advisory stating that severe changes in mood and behavior may be related to use of the drug.

Only 44% effective. Trouble sleeping. Vivid dreams. Possible unknown changes to behavior. Wow. Are we sure this drug is worth the investment and risk? Is it the optimal method to quit smoking?

We should be asking similar questions when we select testing methods.

Time and money can be spent quickly on ineffective methods. Ineffective testing creates frustration, scheduling problems, and budget problems, and it often saps morale among the various stakeholders. So, the next time you consider implementing the latest advice on engineering productivity or a different idea for reducing testing cycles, think carefully about what its side effects might be for your specific situation. Is the testing method truly optimal for your organization? Similarly, if you just completed a testing cycle and your result was moody engineers rather than meaningful data, consider writing up a disclaimer for the next group or project that might fall into a similar trap.

One-size-fits-all testing disclosure:

Problems may occur if testing is poorly integrated or a one-size-fits-all approach is taken. Some of the most common side effects of a “one-size-fits-all approach” include unrealistic expectations, inadequate identification of defects and overextended budgets. A NIST study indicated that 25% to 90% of software development budgets might be spent on testing.

For a field where we have a digital track record for everything from requirements to code to bugs to releases, we don’t have much information about what works and what does not. Everyone tries to share the advice that he or she believes is right; however, before implementing the newest tip, make sure to ask questions. Ask the person sharing the advice whether she has any real-world experience with the method. Then ask yourself or your team whether the advice is truly appropriate for, and optimal for, your testing situation.

Remember: your organization is unique, and one size does not fit all.