The QA Team – A Startup Story

A real startup story about how I fell in love with lean processes and single-threaded ownership.

Mar 17, 2024

Somewhere in the North of Finland, November 2023. Photo Credit: Me

In this blog post, I will talk about seemingly subtle differences in the QA process of a software product and how these impact your ability to deliver high quality on time. I will tell a real story from the time when I was working for a small startup to make common challenges tangible.

I was part of the software engineering team building the product. After three weeks, I finished implementing a feature: a heartbeat mechanism that detects the failure of long-running, distributed background jobs. In case of failure, another background worker machine picks up the job for re-processing. The change increased the reliability of the entire background job framework and was promised to a specific customer using background jobs for large, daily data extractions. I was proud to not have taken any shortcuts and even managed to improve code quality on the fly. I moved my task on the Kanban board from “In Progress” to the “QA” column. At this point, when coding is done, a feeling of reward and accomplishment sets in. It still does every time…even today. My job was done, so I thought:

Just in time — made it before the agreed due date for delivery!

The next stage after implementation was testing by QA. It is not like the developer working on the product does not test. However, during the implementation phase testing was mostly limited to writing low-level unit tests. And you know — we still have the QA team. Isn’t it kind of their job to ensure quality?

The QA Handover

The QA engineer, sometimes also called Software Engineer in Test, focused on automating more complex integration and end-to-end tests. Additionally, the QA engineer was responsible for manual steps like exploratory and user acceptance testing, as well as checking compliance and other non-functional requirements.

I approached Ben from the QA team:

Hi Ben, I finished this feature that increases the reliability of our background task framework for long-running jobs. The customer is following up by asking if the agreed delivery date in 3 days still holds. I would love to confirm the date since they urgently require this improvement. Can someone from the QA team pick this up for testing right away?

Ben the QA Engineer:

Hi Juri! During a recent meeting with management, it was highlighted that the QA team’s top priority is to increase the reliability and quality of existing services. So first we have to finish improvements to our existing regression test suite. We need a couple of days to finally reach the desired code coverage. Afterward, I will personally test your feature with priority.

Well… that was disappointing. My job was done and the task was ready for QA, but now there was a pause in delivery. The customer just had to wait. However, for me, there was no point in waiting around. On the next day, I picked the next feature from the product backlog.

The Kanban board visualizes the steps required to build and ship a feature to the customer, also referred to as the value chain of a product. Inherently, the value chain is split into phases like “Todo”, “In Progress”, “QA” etc. On the one hand, feature implementation was assigned to the engineering team, while on the other, testing was the responsibility of the QA team. The first challenges appeared during the handover to QA:

Different Team Priorities: The engineering team focuses on delivering new features, while the QA team is busy improving test coverage of the existing regression test suite. Each team weighs and interprets company priorities differently. The closer teams are in the organizational chart, the more aligned their priorities will be. With increasing distance in the organizational chart, priorities start to diverge. Different or competing priorities cause increased waiting times for handovers.
Manual Testing: Some QA tasks are hard to automate. In such cases, it can be considered cheaper to perform tests manually. Doing repetitive, manual work can not only be bland but is also prone to human error. With growing product size and release frequency, the manual effort grows rapidly and causes delays in delivery. Everyone is super busy and stressed about how many manual things they still have to do, but the outcome remains limited.
Duplicate Test Suites and Slow Pipelines: Since QA engineers focus on automating end-to-end test suites, we ended up with an inverted test pyramid: lots of end-to-end tests but not enough fast, low-level unit tests. The fragmented approach to testing caused test scenarios to be duplicated in unit and end-to-end tests. In this organizational setup, it’s not a matter of choosing the right type of test for a given user scenario. Instead, each team blindly creates the types of tests under their ownership.

Handover back to Engineering

A week later, the delivery date agreed with the customer had passed. Finally, Ben reached out to me:

Juri, I have to pass this task back to you. I found this edge case where a specific corrupted background task can cause worker machines to endlessly retry without succeeding. We need to fix this before we continue with the rollout.

In disbelief and somewhat shocked, I accepted the feedback. It turned out I had not tested enough with corrupted or invalid background jobs.
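To make the feature and Ben's edge case more tangible, here is a minimal Python sketch of heartbeat-based failure detection with a cap on retries. It is not the code we shipped. The names `Job`, `claim`, `heartbeat`, `reap`, `HEARTBEAT_TIMEOUT`, and `MAX_ATTEMPTS`, as well as the concrete values, are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Both values are assumptions chosen for the example.
HEARTBEAT_TIMEOUT = 30.0  # seconds without a heartbeat before a job counts as failed
MAX_ATTEMPTS = 3          # cap so a corrupted job cannot be retried forever


@dataclass
class Job:
    job_id: str
    worker: Optional[str] = None
    last_heartbeat: float = 0.0
    attempts: int = 0
    state: str = "pending"  # one of: pending, running, dead


def claim(job: Job, worker: str, now: float) -> None:
    """A worker picks up a pending job; this counts as one attempt."""
    job.worker = worker
    job.last_heartbeat = now
    job.attempts += 1
    job.state = "running"


def heartbeat(job: Job, now: float) -> None:
    """Called periodically by the worker while it is still processing the job."""
    job.last_heartbeat = now


def reap(job: Job, now: float) -> None:
    """Detect a missed heartbeat and either requeue the job or give up on it."""
    if job.state != "running":
        return
    if now - job.last_heartbeat <= HEARTBEAT_TIMEOUT:
        return  # the worker is still alive
    job.worker = None
    # Without this cap, a job that crashes every worker would be requeued endlessly.
    job.state = "dead" if job.attempts >= MAX_ATTEMPTS else "pending"
```

In this sketch, a job whose worker keeps going silent is requeued twice and marked `dead` after the third attempt, instead of bouncing between worker machines forever. A test with a deliberately corrupted job would exercise exactly this path.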

Let’s take a closer look at what happened here!

Late feedback: QA considerations were not involved early enough in the design and requirements analysis phase. Quality becomes a bit of an afterthought, as it will be covered by the QA team later. The engineer might not be aware of compliance, security, or other non-functional requirements at the start of the implementation. Since end-to-end testing or compliance checks are done late in the value chain, related issues are revealed late as well. The later bugs pop up in the value chain, the more expensive it becomes to fix them. The worst case happens if a bug is revealed in production by a customer. This will typically require a rollback or an entirely new release cycle for a forward fix.
No holistic view of the value chain: No single-threaded ownership to deliver a feature. Who is responsible for orchestrating and aligning competing priorities from siloed teams and making sure tasks in the value stream can meet agreed due dates?

It is surprising how such a simple process comes with so much wasted time and energy, and so much potential for conflict between teams that share the same goal. Agile and lean literature has a lot to say about how to increase efficiency and remove “muda” (Japanese for waste). Unfortunately, we are not done yet with the muda. Here are some further friction points related to handovers I have observed:

Incomplete Information: Often it’s not clear what exact information the other team needs to do their job. For the QA team to start working, they have to deeply understand what the customer requirements for the change are and how it was implemented. A common question I got from QA was: “How can I test this?”. A common question I asked QA was: “How can I reproduce this bug you found?”. Incomplete information during a handover causes communication overhead and further delays.
Task Switching: Once feedback from QA is received, the product engineer has to stop working on the current task in progress and switch back to the original feature to address any findings. Task switching causes cognitive load and can take considerable time, especially for complex implementations. The same applies when QA is asked to retest a feature once the bug is fixed. The rate of required task switching increases with the number of features processed in parallel.

To deliver a complex feature, it’s common to have multiple iterations of these handovers between QA and engineering teams. Each of these iterations will come with the full set of challenges discussed. In siloed companies, teams that are intended to work together can start to play politics, blaming each other for not being able to deliver quality on time. It can feel a bit like playing ping pong with another team, instead of delivering something of value. Now some software engineers reading this might think: “Why should I care? At my workplace, there is no separate QA team responsible for testing.”

Fair point! However, take a second and ask yourself these questions:

Do you have any sort of dependency on another team required to deliver a feature to a customer?
Do you have a separate release team rolling out application changes?
Do you require help from another team for infrastructure changes?
Do you require any sort of approval from compliance, security, or management before releasing a change?
Do you have access rights required to ship a feature end-to-end to the customer?
Are there any manual steps in the value chain (besides implementing the product itself)?

Chances are that you do have a handover with very similar friction points. The theory of constraints states that the throughput of the value chain is limited by its bottleneck. For example, you can increase the input capacity by adding engineers to the team, but the output can stay the same if the bottleneck is not addressed. In conclusion, any process improvement that does not address the actual bottleneck will not increase the throughput of the value chain. Many years back, learning about this theory of constraints was such an insightful moment for me.
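The bottleneck argument can be illustrated with a toy calculation. The sketch below assumes a deliberately simplified steady-state model in which each stage of the value chain can process a fixed number of features per week. The stage names and numbers are made up, and `throughput` and `bottleneck` are hypothetical helper names.

```python
from typing import Mapping


def throughput(capacities: Mapping[str, float]) -> float:
    """Features per week leaving the value chain: capped by the slowest stage."""
    return min(capacities.values())


def bottleneck(capacities: Mapping[str, float]) -> str:
    """Name of the stage with the lowest capacity."""
    return min(capacities, key=lambda stage: capacities[stage])


# Made-up weekly capacities for each stage of the value chain.
stages = {"implementation": 5, "qa": 2, "release": 4}

# Hiring more engineers doubles the implementation capacity only.
more_engineers = {**stages, "implementation": 10}

# Addressing the bottleneck instead raises the QA capacity.
faster_qa = {**stages, "qa": 4}
```

In this model, `throughput(stages)` and `throughput(more_engineers)` are both 2, because the QA stage caps the output in either case. Only `faster_qa` raises the throughput, to 4.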

And guess what?

I have repeatedly observed the bottleneck of the value chain to be a handover between teams.

To remove the handover bottleneck, many software companies have fully integrated testing responsibilities into their engineering teams. No more separate QA team testing before every release. For me as an engineer, this means I’m now fully responsible for quality and testing. There is no one else to blame. More and more teams are enabled to release application changes and own related infrastructure as well. You read and hear about DevOps or the “you build it, you run it, you own it” principle. Software architectures such as microservices or serverless are chosen because they enable single-threaded ownership for teams and reduce cross-team dependencies.

Conclusion

Modern software companies using continuous deployment can release multiple times per day and will hardly have any cross-team handovers. A lot of praise has been written about these fast-moving, autonomous teams. It might be counterintuitive at first, but there is supporting science indicating that the quality of software products highly correlates with increasing release frequency (Accelerate: The Science of Lean Software and DevOps). I firmly believe that both quality and release frequency are enabled by teams with minimal dependencies. I never really liked that phrase and I feel a bit old now saying: “That’s nothing new!”. Consider extreme programming: Kent Beck and Co have been showing for a long time what a small team with single-threaded ownership can achieve.

So am I saying we simply add more responsibilities like QA to software engineers? Quality addressed? Agility achieved?

Think about the cognitive load this puts on software engineers. Also, consider the huge range of skills required from a single person to successfully execute all phases of the value chain. Is that even realistic?

In the next blog post, I will discuss strategies for how to keep the cognitive load for engineers manageable while still supporting the idea of single-threaded ownership. You will read about platform teams, self-service tools, and automated quality gates. We will also take a closer look at another manual quality gate — the code review — and how it is different compared to my QA story.

Now over to you: What are your experiences with handovers? Comment and share!

Engineering Decompiled
