Why are service levels important?
Because ultimately, service-level objectives, indicators, and agreements (SLOs, SLIs, and SLAs) reflect customer expectations. They help developers manage the kinds of failures that would otherwise stand in the way of your customers’ success.
Lightstep can help you monitor and meet your Service-Level Objectives (SLOs) and resolve incidents quickly. It’s easy to set custom alerts and notify your team as soon as SLI behavior trends towards a regression. Here’s how it works.
How to track key SLOs with custom alerting
Let’s start with a visual representation of the performance history of a specific service, operation, or query. In Lightstep, we refer to this as a stream.
Here we have a stream for the Krakend API Gateway service in this system. I’ll click the “Create Condition” button in the upper right-hand corner to open the dialog seen below. I can choose exactly which signal I’d like to monitor – latency, error percentage, or operation rate – along with the threshold and evaluation window. In this instance, I’ve indicated that I’d like to be alerted if the error percentage for this stream surpasses 10% in a ten-minute period.
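To make the condition concrete, here is a minimal Python sketch of what "error percentage above 10% over a ten-minute window" means. It is an illustration only, not how Lightstep evaluates conditions internally; the names Condition and ErrorRateWindow are hypothetical and exist just for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Condition:
    """Hypothetical alert condition, mirroring the values chosen in the dialog."""
    threshold_pct: float = 10.0  # alert when the error percentage exceeds this
    window_seconds: int = 600    # ten-minute evaluation window


class ErrorRateWindow:
    """Tracks requests inside a sliding time window and checks the condition."""

    def __init__(self, condition):
        self.condition = condition
        self.events = deque()  # (timestamp_seconds, is_error), oldest first

    def record(self, timestamp, is_error):
        """Add one request outcome, then drop anything older than the window."""
        self.events.append((timestamp, is_error))
        cutoff = timestamp - self.condition.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()

    def error_pct(self):
        """Percentage of requests in the current window that were errors."""
        if not self.events:
            return 0.0
        errors = sum(1 for _, is_error in self.events if is_error)
        return 100.0 * errors / len(self.events)

    def breached(self):
        """True when the error percentage is strictly above the threshold."""
        return self.error_pct() > self.condition.threshold_pct
```

Feeding this window one request per second, with every fifth request failing, gives an error percentage of 20%, so breached() returns True for the 10% threshold used above.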
Now that I’ve set my condition, I have to add an alerting rule. From here, I’ll select the PagerDuty integration from the list of available options, along with the destination and update interval.
Lightstep will automatically review the percentage of errors affecting the service’s performance health over the last ten minutes, and report any findings in the sidebar. It is clear in the image below that the condition I set was breached sometime during the last ten minutes.
As a result, a page was triggered by PagerDuty as soon as this breach was detected.
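For readers curious what a page of this kind can look like on the wire, the sketch below builds a trigger event in the shape used by PagerDuty's public Events API v2. This is an assumption for illustration: Lightstep's integration sends the notification for you, and the article does not describe its payload. The function names, the dedup key format, and the "error" severity are hypothetical choices, and the routing key is a placeholder.

```python
import json
import urllib.request

# Public endpoint of the PagerDuty Events API v2.
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_trigger_event(routing_key, stream_name, error_pct, threshold_pct):
    """Build an Events API v2 'trigger' event for a breached condition."""
    return {
        "routing_key": routing_key,      # integration key of the PagerDuty service
        "event_action": "trigger",
        # Hypothetical dedup key so repeated breaches update one incident.
        "dedup_key": f"{stream_name}-error-pct",
        "payload": {
            "summary": (
                f"{stream_name}: error percentage {error_pct:.1f}% "
                f"exceeds {threshold_pct:.1f}%"
            ),
            "source": stream_name,
            "severity": "error",         # assumed severity for this example
            "custom_details": {
                "error_pct": error_pct,
                "threshold_pct": threshold_pct,
            },
        },
    }


def send_event(event):
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling build_trigger_event with a real routing key and passing the result to send_event would open an incident on the matching PagerDuty service; nothing is sent until send_event is called.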
Now, I can immediately investigate the error using Lightstep, and hopefully identify and implement a solution more quickly.
What’s different about investigations with Lightstep
When trying to rapidly restore service, it can be difficult to separate good hypotheses from bad ones. But Lightstep can help you avoid the guesswork entirely: with unlimited cardinality and a high-fidelity dataset uncompromised by sampling, Lightstep reveals issues that conventional monitoring solutions cannot see. It instantly analyzes thousands of traces from your system to produce root-cause insights for performance regressions, so your team can resolve issues quickly and meet SLOs.
Want to see it for yourself? Check out our free interactive sandbox, where you can debug an iOS error or resolve a performance regression using our suite of observability tools.
Interested in joining our team? See our open positions here.