Root cause analysis in three clicks: Announcing major updates to Lightstep’s distributed tracing
Since the early days of distributed tracing at Google, we’ve been working to make complex systems easier to understand. Today, we’re excited to announce updates — including the ability to search logs on traces — that enable developers to identify the root cause of virtually any regression in three simple clicks.
Root Cause Analysis in Three Clicks
Click 1: See what’s changed
Lightstep now highlights where latency, error rates, or other service level indicators (SLIs) have experienced the greatest change. This enables anyone on the team to understand system-wide service health in seconds.
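To make the idea of "greatest change" concrete, here is a minimal sketch of ranking SLIs by how far they moved between a baseline window and the current window. It is an illustration only, not Lightstep's actual algorithm: the function name `rank_sli_changes`, the sample data, and the choice of mean-based relative change are all assumptions made for this example.

```python
from statistics import mean


def rank_sli_changes(baseline, current):
    """Return (sli_name, relative_change) pairs, largest absolute change first.

    Hypothetical helper: compares the mean of each SLI in two time windows.
    """
    changes = []
    for name, base_samples in baseline.items():
        base = mean(base_samples)
        now = mean(current[name])
        if base == 0:
            # Guard against dividing by a zero baseline.
            rel = 0.0 if now == 0 else float("inf")
        else:
            rel = (now - base) / base
        changes.append((name, rel))
    return sorted(changes, key=lambda pair: abs(pair[1]), reverse=True)


# Made-up sample data for two SLIs.
baseline = {"latency_ms": [100, 110, 90], "error_rate": [0.01, 0.01, 0.01]}
current = {"latency_ms": [150, 160, 140], "error_rate": [0.01, 0.012, 0.011]}

ranked = rank_sli_changes(baseline, current)
for name, rel in ranked:
    print(f"{name}: {rel:+.0%}")
```

With this sample data, latency ranks first because it moved the most relative to its baseline.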
Click 2: Compare before and after
Once you’ve identified a regression, you can quickly compare it to baseline performance with a simple before-and-after view.
Added bonus: this comparison is also available between deployment versions (both real-time and historical). This means you can quickly see which version of your canaries performs best over any period of time.
Click 3: Pinpoint the exact logs and traces needed to resolve an issue
The before-and-after view takes you straight to our RCA (root cause analysis) page, where we automatically surface the traces, metrics, logs, tags, and operations associated with increased latency or errors.
By adding logs to tracing, you can now view and search log messages in full system context, identify where a regression occurred down to a single line of code, and understand its impact, all in one tab.
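As a rough picture of what searching logs on traces can look like, the sketch below attaches log messages to the spans of a single trace and searches them in that context. The `Span` class, the `search_trace_logs` function, and the sample trace are hypothetical and exist only for this example; they are not Lightstep's data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical span: one operation in a trace, with its log messages."""
    operation: str
    duration_ms: float
    logs: list = field(default_factory=list)


def search_trace_logs(spans, query):
    """Yield (operation, message) for each log message containing the query."""
    needle = query.lower()
    for span in spans:
        for message in span.logs:
            if needle in message.lower():
                yield span.operation, message


# Made-up trace with two spans.
trace = [
    Span("api/checkout", 420.0, ["request received"]),
    Span("db/query", 390.0, ["slow query: timeout waiting for lock"]),
]

hits = list(search_trace_logs(trace, "timeout"))
for operation, message in hits:
    print(f"{operation}: {message}")
```

Because each hit carries the operation it came from, a matching message points directly at the part of the request where the problem occurred.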
Click 4: There is no click 4! That’s it.
Rather than switching back and forth between traces and logs (and spending hours grepping through log files), you can now view log messages in the context of the problem you're trying to solve, even if you have little to no knowledge about the application or system you're investigating.
To recap, as part of this update, you can now:
Automatically see which service level indicators (SLIs) have meaningfully changed
Instantly identify which logs you need to resolve an incident or investigate a regression
View side-by-side version comparisons for canary deployments and immediately see which version performs best
How can I try out these new features?
Sign up for a product demo with one of our developers.
Try Lightstep for free for 14 days.
Interested in joining our team? See our open positions here.
