





Introducing Snapshot Analyzer: Interactive Investigations for Deep Systems

Distributed tracing generates a stream of rich, contextual data. But as systems grow in scale and complexity, it can be challenging to navigate through thousands of traces to quickly find answers to performance questions.

When things are on fire, how do you know if your hypothesis is a good one?

To help answer that question, we built Snapshot Analyzer. It’s a simple way to investigate cross-service performance at scale.

With Snapshot Analyzer, you can filter comprehensive views of complete system behavior (Snapshots) across any dimension in your system, and group cross-service traces by any attribute.

Think of it as being able to perform SQL-like operations on large amounts of trace data.

Narrow the Scope of Your Investigation to What Really Matters

When you’re investigating an issue in a complex or deep system, it can be difficult to narrow the scope to the most likely culprit.

Tracing provides context for end-to-end requests, which often include multiple services. With Snapshot Analyzer, you can filter these traces by one or more services, operations, or tags and focus your analysis on only the traces relevant to your investigation.


In the above gif, we currently have a Snapshot of the most recent traces from our system. From here, we can filter by the tags error:true and canary:true to find only the traces that are returning errors after a canary release.

Digging into a single trace, we can quickly glance at the right logs and find the issue! Additionally, the filter suggestions themselves are scoped to only the ones matching the provided filter. This allows you to perform flexible, exploratory analysis on complex tracing data.
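To make the filtering step concrete, here is a minimal Python sketch of narrowing a set of traces to those matching every given tag. It is only an illustration of the idea, not Lightstep's API: the trace dictionaries, the trace IDs, and the filter_traces helper are all hypothetical.

```python
from typing import Dict, List

# Hypothetical in-memory traces; a real Snapshot holds far richer span data.
traces: List[Dict] = [
    {"trace_id": "t1", "tags": {"error": "true", "canary": "true"}},
    {"trace_id": "t2", "tags": {"error": "false", "canary": "true"}},
    {"trace_id": "t3", "tags": {"error": "true", "canary": "false"}},
]


def filter_traces(traces: List[Dict], **required_tags: str) -> List[Dict]:
    """Keep only traces whose tags match every required key/value pair."""
    return [
        t for t in traces
        if all(t["tags"].get(key) == value for key, value in required_tags.items())
    ]


# Equivalent of filtering by error:true and canary:true.
matching = filter_traces(traces, error="true", canary="true")
print([t["trace_id"] for t in matching])
```

Each additional tag narrows the result further, which mirrors how stacking filters shrinks the set of traces left to inspect.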

What If You Have No Idea Where to Start Your Investigation?

In certain (scary) situations, you may be notified of an issue by your customer, and you don’t know where to begin investigating.


Fear not! Backed by Correlations, Snapshot Analyzer can help you reduce distractions from red herrings and speed up root cause identification.

Correlations automatically surfaces attributes that are associated with latency. With Snapshot Analyzer, you can dive deeper into the insights provided by Correlations, add additional filters, and focus your investigation on rapid hypothesis generation and validation.
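As a rough intuition for what "attributes associated with latency" can mean, the sketch below ranks each tag value by how much slower the traces carrying it are than the rest. This is a deliberately naive stand-in and not how Correlations is implemented; the data, the rank_attributes helper, and the scoring rule are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical traces with invented latencies and tags.
traces = [
    {"latency_ms": 100, "tags": {"region": "us-east-1", "canary": "false"}},
    {"latency_ms": 120, "tags": {"region": "us-east-1", "canary": "true"}},
    {"latency_ms": 900, "tags": {"region": "us-west-1", "canary": "false"}},
    {"latency_ms": 880, "tags": {"region": "us-west-1", "canary": "true"}},
]


def rank_attributes(traces):
    """Score each (tag, value) by mean latency with it minus mean latency without it."""
    by_attr = defaultdict(list)
    for t in traces:
        for key, value in t["tags"].items():
            by_attr[(key, value)].append(t["latency_ms"])

    total = sum(t["latency_ms"] for t in traces)
    scores = {}
    for attr, latencies in by_attr.items():
        rest_count = len(traces) - len(latencies)
        if rest_count == 0:
            continue  # attribute is on every trace, so there is nothing to compare against
        rest_mean = (total - sum(latencies)) / rest_count
        scores[attr] = mean(latencies) - rest_mean
    # Largest positive score first: the attribute most associated with slow traces.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_attributes(traces)
print(ranking[0])
```

In this invented data the region tag separates slow traces from fast ones while the canary tag does not, so the region value rises to the top and the canary values score zero.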


Group Traces By Any Attribute

Snapshot Analyzer also allows you to group traces that share a certain attribute and compare performance characteristics across groups.

So, how does this work?

In the example below, we’re investigating what we initially guess is a database issue. We start by filtering the set of traces down to only those that have a db.type=cassandra tag. We then group these traces by region to see aggregate statistics across both us-east-1 and us-west-1. The differences in error percentage and average latency tell us that the issue is actually region-specific. We can dig into a trace in the affected region to get the context we need to mitigate the issue. This ability to group traces by a tag of any cardinality is invaluable for quickly corroborating or eliminating hypotheses.
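The filter-then-group workflow described here resembles a SQL WHERE followed by a GROUP BY. Here is a minimal Python sketch of that idea, assuming a hypothetical in-memory list of traces. The numbers are invented (including which region looks unhealthy), and group_stats is not a Lightstep function.

```python
from collections import defaultdict

# Hypothetical traces; latencies and error flags are made up for illustration.
traces = [
    {"tags": {"db.type": "cassandra", "region": "us-east-1"}, "latency_ms": 120, "error": False},
    {"tags": {"db.type": "cassandra", "region": "us-east-1"}, "latency_ms": 80, "error": False},
    {"tags": {"db.type": "cassandra", "region": "us-west-1"}, "latency_ms": 900, "error": True},
    {"tags": {"db.type": "cassandra", "region": "us-west-1"}, "latency_ms": 700, "error": False},
    {"tags": {"db.type": "mysql", "region": "us-west-1"}, "latency_ms": 50, "error": False},
]


def group_stats(traces, filter_tag, filter_value, group_by):
    """Filter traces on one tag, then aggregate per value of another tag."""
    groups = defaultdict(list)
    for t in traces:
        if t["tags"].get(filter_tag) == filter_value:
            groups[t["tags"].get(group_by)].append(t)

    return {
        key: {
            "count": len(members),
            "error_pct": 100.0 * sum(m["error"] for m in members) / len(members),
            "avg_latency_ms": sum(m["latency_ms"] for m in members) / len(members),
        }
        for key, members in groups.items()
    }


stats = group_stats(traces, "db.type", "cassandra", "region")
for region, summary in stats.items():
    print(region, summary)
```

Comparing the per-group summaries side by side is what makes a region-specific problem stand out from a database-wide one.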


Select Attributes Across Traces

Snapshot Analyzer also lets you view additional contextual information across traces. The Add Column feature shows the value of any tag in the Trace Analysis table. Having this information next to the individual spans helps you identify patterns and anomalies in your system. It can expedite hypothesis generation during root cause analysis by helping you narrow down the trace search space.
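Conceptually, adding a column is a projection: each row in the table gains one more field holding that span's value for the chosen tag. The Python sketch below illustrates the idea with hypothetical span rows; the field names and the add_column helper are assumptions and do not reflect the product's data model.

```python
# Hypothetical span rows as they might appear in a trace table.
rows = [
    {"service": "api", "operation": "GET /cart", "duration_ms": 340,
     "tags": {"region": "us-west-1", "customer": "acme"}},
    {"service": "api", "operation": "GET /cart", "duration_ms": 45,
     "tags": {"region": "us-east-1"}},
]


def add_column(rows, tag):
    """Return new rows with the tag's value as an extra column ("" when the span lacks it)."""
    return [{**row, tag: row["tags"].get(tag, "")} for row in rows]


table = add_column(rows, "region")
for row in table:
    print(row["operation"], row["duration_ms"], row["region"])
```

With the tag value sitting next to the duration, it becomes easier to notice that, for example, the slow rows share a value the fast rows lack.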


Want to Give Snapshot Analyzer a Try? Check Out Our Free Sandbox

Go to our sandbox and use Snapshot Analyzer to resolve a performance regression in under 10 minutes.

Interested in joining our team? See our open positions here.

November 4, 2019

About the author

Karthik Kumar

