
Observability Mythbusters: Observability is NOT Only for SREs

When we think about Observability, we usually think about it in terms of SREs or developers. After all, it’s a mighty powerful practice for helping troubleshoot applications in production. But what if I told you that Observability can fit into the world of QA? What would you say to that?

If you’re scratching your head, don’t worry. I had never even considered Observability as a practice that could be leveraged by QA analysts. Until today, that is!

Today I met with Parveen Khan, a Quality Advocate who reached out to me on LinkedIn after reading my Unpacking Observability series on Medium. She took a keen interest in Observability after noticing a few inefficiencies in her day-to-day work, and wanted to understand how it could make her work life better.

Observability for Exploratory Testing

The more Parveen learned about Observability, the more she wondered about how it could fit into her world as a tester. Parveen’s focus is on exploratory testing. It’s a mostly manual process, and it can be very time-consuming, especially when you encounter a bug. I can totally relate.

My first role out of university was as a QA tester. As a tester, I had to manually execute test scenarios written by the more senior QA analysts. One of the things that I remember about my short stint as a QA tester was the amount of waiting I did. After logging a bug with the development team, I would look for other test scenarios to run through. But sometimes I’d encounter a showstopper bug–the kind whereby I couldn’t do any more testing until it got fixed. And so, I found myself waiting and waiting and waiting until the bug got fixed. In my boredom, I taught myself SQL so that I could query the database to better understand why the code was barfing out. It also proved helpful in allowing me to relay more detailed information to the development team when I filed my bugs. Any developer will tell you that detailed bug descriptions are way more productive (and less anger-inducing) than “It doesn’t work.” or “It’s broken.”

As a QA tester, Parveen wondered, “If we had Observability baked into our app, then I could actually try to understand what’s going on in the app that I’m testing!” 💡 Just as my SQL queries helped the developers in my QA days, Parveen saw the benefits of having better Observability into the systems she was testing: it empowers her (and her fellow testers) to dig into what’s going on in the app, and enables them to provide more context when filing a bug for the development team.

And she’s not alone in her thinking!

My lovely Twitter peeps directed me to the works of Abby Bangser, who shares similar views in her O11ycast episode. She states that testers can use Observability to their advantage by having access to relevant and explorable data. As Parveen realized in her own explorations, this gives testers the ability to dig into the root cause of a bug, which is helpful for filing detailed bug reports to developers. But wait…there’s more! If a test engineer identifies a problem and isn’t able to track it down in the Observability back-end, it means that the telemetry data emitted by the application is lacking. That is, it shows that the code hasn’t been instrumented well enough, because it’s not exposing data that’s useful to us for troubleshooting!

But we can take this one step further.

TDD, TBT and ODD...oh my!

Ensuring that your system is Observable means that you have to instrument your code. Awesome, right? Now, what if we turned code instrumentation into a quality gate? This means that one of the criteria for passing the tests in your CI/CD pipeline is ensuring that the code is instrumented. How can this be accomplished?? The answer is: with Trace-Driven Development (TDD - different from the other TDD), also known as Trace-Based Testing (TBT)! It was introduced by Ted Young in his 2018 KubeCon North America talk. In it, he shares the idea of leveraging distributed traces to write application tests. If you’re already instrumenting your code and are therefore sending traces out to an Observability back-end, then why not take advantage of these traces and use them to write your tests?

You might be thinking, “Sure, that’s nice, but HOW do you do that?” Fortunately, there are a couple of new tools which do just that. Both Malabi and Tracetest leverage OpenTelemetry (OTel) traces to define tests. Malabi is an open-source TBT JavaScript framework developed in 2021. Tracetest, which is also open-source and runs on Kubernetes, is a newer entrant into the game, launched in April 2022 (super fresh!), and they were inspired by Ted’s KubeCon talk! How cool is that??

Observability Mythbusters: Observability is only for SREs - Discord Tracetest chat screen capture
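
Setting the tools aside for a moment, the core idea of a trace-based test can be sketched in a few lines of Python. This is only an illustration, not how Malabi or Tracetest work under the hood: `Span`, `SpanRecorder`, `place_order`, and the span names are all made up for this example, and the recorder is a tiny in-memory stand-in for a real tracing SDK and Observability back-end.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    name: str
    parent: Optional[str] = None
    attributes: dict = field(default_factory=dict)


class SpanRecorder:
    """Hypothetical stand-in for a tracing SDK: keeps finished spans in memory."""

    def __init__(self):
        self.spans = []
        self._stack = []

    @contextmanager
    def start_span(self, name, **attributes):
        parent = self._stack[-1].name if self._stack else None
        span = Span(name, parent, dict(attributes))
        self._stack.append(span)
        try:
            yield span
        finally:
            self._stack.pop()
            self.spans.append(span)


def place_order(tracer, item, quantity):
    """Toy application code, instrumented with two nested spans."""
    with tracer.start_span("place_order", item=item):
        with tracer.start_span("reserve_stock", quantity=quantity):
            pass  # real work would happen here
    return "ok"


def test_place_order_emits_expected_trace():
    tracer = SpanRecorder()
    assert place_order(tracer, "widget", 2) == "ok"

    # The test asserts on the trace, not only on the return value.
    by_name = {span.name: span for span in tracer.spans}
    assert set(by_name) == {"place_order", "reserve_stock"}
    assert by_name["reserve_stock"].parent == "place_order"
    assert by_name["reserve_stock"].attributes["quantity"] == 2


test_place_order_emits_expected_trace()
```

If someone removes the `reserve_stock` span from `place_order`, the test fails even though the function still returns "ok".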

Aside: At the time of Ted’s talk, Traces weren’t really standardized. OpenTelemetry, which was created in 2019, wasn’t even a thing at the time, so he was waaay ahead of his time!

Yeah, yeah…that’s all well and good, but how does TBT help with quality gates? Well, if your QA test engineer writes their test automation by leveraging trace-based tests, it means that the traces must be present in the application code in order for them to be leveraged. The mere fact that the automated QA tests (which are part of the CI/CD pipeline) are written using TBT automagically makes it a quality gate. Ta-da! 🎉
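
One way to picture the gate is a pipeline step that fails the build when the spans a test relies on never show up. Here is a rough sketch, assuming the spans captured during a test run are available as a list of dictionaries with a `name` key. The `missing_spans` and `gate` helpers and the span names are hypothetical and are not part of Malabi or Tracetest.

```python
def missing_spans(required, recorded):
    """Return the required span names that are absent from the recorded spans."""
    seen = {span["name"] for span in recorded}
    return sorted(set(required) - seen)


def gate(required, recorded):
    """Return a process exit code: 0 lets the pipeline continue, 1 fails it."""
    missing = missing_spans(required, recorded)
    for name in missing:
        print(f"missing span: {name}")
    return 1 if missing else 0


# Hypothetical spans captured while the automated tests ran.
recorded = [{"name": "place_order"}, {"name": "reserve_stock"}]
exit_code = gate(["place_order", "reserve_stock", "charge_card"], recorded)
```

In a real pipeline step, the script would finish by passing `exit_code` to `sys.exit`, so that a missing span stops the build.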

Okay! We’re on a roll here! So we’re instrumenting code now. Yay! Well, if you’re going to instrument your code, then you might as well instrument as you’re writing your code, because that’s just waaaay easier than trying to add instrumentation after the fact. Why? Because as you’re writing the code, it’s still fresh in your mind. It’s like writing comments as you code, compared to trying to understand someone else’s code and commenting after the fact.

Aside: You may find yourself in situations where you may need to instrument after the fact, like at one of my previous jobs, where some code hadn’t been instrumented at all. It’s not an ideal situation, but it’s better than nothing.

The act of instrumenting-as-you-code is known as Observability-Driven Development, or ODD. ODD is an extension of Behaviour-Driven-Development (BDD), which is about writing test cases around how a system behaves. Think of ODD as keeping Observability in mind as you code. If you needed to poke into your Observability back-end to troubleshoot an issue, what information would you need to include in your traces?
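
As a small illustration of instrumenting as you code, here is a sketch in which the recorded details are chosen while the function is being written, by asking what you would want to see when troubleshooting. The `traced` decorator, the `SPANS` list, and `apply_discount` are invented for this example; a real project would use its tracing library instead.

```python
import functools

SPANS = []  # hypothetical in-memory sink; a real app would export spans to a back-end


def traced(func):
    """Record one span per call, including arguments and any error raised."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        span = {"name": func.__name__, "attributes": {"args": args, "kwargs": kwargs}}
        try:
            result = func(*args, **kwargs)
            span["attributes"]["status"] = "ok"
            return result
        except Exception as exc:
            # Capture what a troubleshooter would want to know later.
            span["attributes"]["status"] = "error"
            span["attributes"]["error"] = repr(exc)
            raise
        finally:
            SPANS.append(span)

    return wrapper


@traced
def apply_discount(price, percent):
    if not 0 <= percent <= 100:
        raise ValueError(f"percent out of range: {percent}")
    return price * (100 - percent) / 100
```

Calling `apply_discount(200, 150)` still raises, but the span left behind says which function failed, with which inputs, and why.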

Now, don’t expect to get it right off the bat. That’s okay though. Testers can help tease out what’s missing in your instrumentation to make your system more observable, as we saw in the first section.

The point is that you’re instrumenting your code, and it can only get better from here!


Chatting with Parveen about her QA perspective on Observability reminded me that Observability doesn’t reside only in the domain of developers and SREs. It applies quite well to QAs, too!

By bringing the Observability conversation to QAs, we see the following benefits:

  • QA testers are empowered to troubleshoot when they encounter a bug, and can file more detailed bug reports. ✅

  • Having a “tracing must be present” quality gate ensures that developers instrument their code with a trace-first approach. This is enabled through TBT, with modern tools like Malabi and Tracetest. ✅

  • Since traces must be present for QA test engineers to be able to write their automated tests, it “forces” developers to instrument-as-they-code, thereby adopting Observability-driven development practices. ✅

One final thought. Parveen said something that I thought was very powerful: instrumenting code wasn’t just for her. By making Observability a QA concern, she benefited those whose job it is to troubleshoot issues in production. That's some powerful stuff!

Now, please enjoy this picture of my friend Lisa’s guinea pig, Taffy.

Observability Mythbusters: Observability is only for SREs - Taffy the Guinea Pig

Peace, love, and code. 🦄 🌈 💫

If you’d like to share stories of your Observability journey, or just ask Observability-related questions, hit me up on the Lightstep Community Discord. Hope to hear from y’all!

May 12, 2022

About the author

Adriana Villela
