Managing SLOs and SLIs in Lightstep

Why are service levels important?

Because ultimately, service level objectives, indicators, and agreements (SLOs, SLIs, and SLAs) reflect customer expectations. They help developers manage the kinds of failures that stand in the way of your customers’ success.

Lightstep can help you monitor and meet your Service-Level Objectives (SLOs) and resolve incidents quickly. It’s easy to set custom alerts and notify your team as soon as SLI behavior trends towards a regression. Here’s how it works.

How to track key SLOs with custom alerting 

Let’s start with a visual representation of the performance history of a specific service, operation, or query. In Lightstep, we refer to this as a stream.

Here we have a stream for the Krakend API Gateway service in this system. I’ll click the “Create Condition” button in the upper right-hand corner to open the dialog seen below. I can choose exactly which signal I’d like to monitor (latency, error percentage, or operation rate) along with the threshold and evaluation window. In this instance, I’ve indicated that I’d like to be alerted if the error percentage for this stream surpasses 10% in a ten-minute period.

PagerDuty Lightstep gif
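
To make the condition concrete, here is a minimal Python sketch of what "error percentage above 10% over a ten-minute window" means as a check. It is an illustration only, not Lightstep's implementation: the `Sample` and `ErrorRateCondition` names are hypothetical, and it assumes samples arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # seconds since some fixed epoch
    requests: int     # operations observed in this sample
    errors: int       # how many of those operations failed


class ErrorRateCondition:
    """Hypothetical sliding-window check; not Lightstep's actual code."""

    def __init__(self, threshold_pct: float = 10.0, window_s: float = 600.0):
        self.threshold_pct = threshold_pct
        self.window_s = window_s
        self._samples = deque()

    def observe(self, sample: Sample) -> bool:
        """Record a sample and return True if the condition is breached."""
        self._samples.append(sample)
        # Drop samples that have aged out of the evaluation window.
        cutoff = sample.timestamp - self.window_s
        while self._samples and self._samples[0].timestamp <= cutoff:
            self._samples.popleft()
        total = sum(s.requests for s in self._samples)
        errors = sum(s.errors for s in self._samples)
        if total == 0:
            return False
        # "Surpasses" is read as strictly greater than the threshold.
        return 100.0 * errors / total > self.threshold_pct
```

With the defaults, feeding in `Sample(0, 100, 5)` followed by `Sample(60, 100, 20)` breaches the condition, because 25 errors out of 200 requests is 12.5%.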

Now that I’ve set my conditions, I have to add an alerting rule. From here, I’ll select the PagerDuty integration from the list of available options, along with the destination and update interval.

Lightstep view of PagerDuty integration

Lightstep will automatically review the percentage of errors affecting the service’s performance health over the last ten minutes, and report any findings in the sidebar. It is clear in the image below that the condition I set was breached sometime during the last ten minutes.

PagerDuty Alert in Lightstep

As a result, PagerDuty triggered a page as soon as this breach was detected.

Lightstep showing PagerDuty Alert
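
Lightstep's integration sends the page for you, and the exact payload it sends is not shown here. For readers curious about what a trigger event looks like on the PagerDuty side, the following Python sketch builds one in the shape of PagerDuty's Events API v2. The function name, the stream name, and the routing key are placeholders I have assumed for illustration.

```python
import json

# Public endpoint for PagerDuty's Events API v2.
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_trigger_event(routing_key, stream, error_pct, threshold_pct, window_min):
    """Build a trigger event dict (hypothetical helper, nothing is sent)."""
    summary = (
        f"Error percentage for {stream} is {error_pct:.1f}%, "
        f"above {threshold_pct:.1f}% over the last {window_min} minutes"
    )
    return {
        "routing_key": routing_key,          # integration key of the destination
        "event_action": "trigger",           # opens (or updates) an incident
        "dedup_key": f"error-pct:{stream}",  # repeat alerts fold into one incident
        "payload": {
            "summary": summary,
            "source": stream,
            "severity": "error",
        },
    }


if __name__ == "__main__":
    # Placeholder values; a real routing key comes from your PagerDuty service.
    event = build_trigger_event("YOUR_ROUTING_KEY", "krakend-api-gateway", 12.5, 10.0, 10)
    print(json.dumps(event, indent=2))
```

Sending the event is a matter of POSTing this JSON to `PAGERDUTY_EVENTS_URL`. The `dedup_key` is a design choice in this sketch: reusing one key per stream means that repeated notifications at each update interval attach to the same incident instead of opening new ones.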

Now, I can immediately investigate the error using Lightstep, and hopefully identify and implement a solution more quickly.

What’s different about investigations with Lightstep

When trying to rapidly restore service, it can be difficult to separate good hypotheses from bad ones. But Lightstep can help you avoid the guesswork entirely: with unlimited cardinality and a high-fidelity dataset uncompromised by sampling, Lightstep reveals issues that conventional monitoring solutions cannot surface. It instantly analyzes thousands of traces from your system to produce root-cause insights for performance regressions, so your team can resolve issues quickly and meet SLOs.

Want to see it for yourself? Check out our free interactive sandbox, where you can debug an iOS error or resolve a performance regression using our suite of observability tools.

Interested in joining our team? See our open positions here.

March 18, 2020

About the author

Ashley Rahimi Syed
