Observability: 1+1 > 2
by Sachi Shah
Are you tired of using dashboards for everything? Of hunting for the right chart in a sea of tens of thousands during a late-night incident? Mid-investigation, do you find yourself stitching pieces of a puzzle together across multiple tabs, sometimes across multiple products? Do you manually connect the dots between your service’s application metrics and the health of its dependencies? Do you have little to no guidance on the root cause of a spike in your metric? You’ve come to the right place.
Imagine a world in which dashboards are not only intuitive and powerful for visualizing metrics, but also actionable. One in which you can map a change in any metric (application, infrastructure, or cloud) to changes across service boundaries. All without being an expert or having in-depth knowledge of your service’s ever-increasing dependencies.
See an increase in utilization of a particular k8s pod? Map it to an increase in throughput of an operation running on that pod. Curious how a large customer is affected by a recent deployment? Map changes to key SLIs across your most important upstream and downstream services and operations. Alerted on a metric you’re not familiar with? Rely on Lightstep to surface possible root causes.
Metrics are helpful for understanding the health of your infrastructure and for getting a high-level view of the state of your service. They are also inexpensive to store, enabling historical analysis over several months (or years). But they fall short in distributed systems, because they live in a silo. They can give you a good picture of one service and its underlying infrastructure, but if the problem stems from a dependency, metrics alone cannot reliably connect the two. That’s where the power of tracing comes in.
By analyzing traces in aggregate, Lightstep can correlate a change in a metric with changes in the latency, throughput, or error rate of the service that generated the metric, and with changes in that service’s upstream and downstream dependencies. All without generating noise, which is likely the last thing you want during a late-night incident. You can create powerful dashboards and alerts from your metrics within Lightstep and, with a single click on a chart, view the most relevant correlated changes across your system.
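To make the idea of correlating a metric with aggregated trace data concrete, here is a minimal Python sketch. It is not Lightstep’s actual algorithm, which this post does not describe. It simply assumes that spans can be grouped into the same time buckets as the metric, and it swaps in a plain Pearson correlation as the similarity measure. The names `pearson`, `latency_by_service`, and `rank_correlated_services`, and the sample data, are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def latency_by_service(spans, num_buckets):
    """Aggregate spans into a mean-latency series per service.

    Each span is a hypothetical (service, bucket_index, latency_ms) tuple.
    Buckets with no spans are reported as 0.0.
    """
    sums = defaultdict(lambda: [0.0] * num_buckets)
    counts = defaultdict(lambda: [0] * num_buckets)
    for service, bucket, latency_ms in spans:
        sums[service][bucket] += latency_ms
        counts[service][bucket] += 1
    return {
        service: [
            sums[service][i] / counts[service][i] if counts[service][i] else 0.0
            for i in range(num_buckets)
        ]
        for service in sums
    }


def rank_correlated_services(metric, spans):
    """Rank services by how strongly their latency tracks the metric."""
    series = latency_by_service(spans, len(metric))
    scored = [(service, pearson(metric, values)) for service, values in series.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Illustrative data: pod CPU utilization (%) over five time buckets.
cpu_utilization = [40, 42, 41, 80, 85]

# Illustrative spans: checkout latency jumps when CPU jumps; auth stays flat.
spans = [
    ("checkout", 0, 100), ("checkout", 1, 110), ("checkout", 2, 105),
    ("checkout", 3, 400), ("checkout", 4, 420),
    ("auth", 0, 20), ("auth", 1, 20), ("auth", 2, 20),
    ("auth", 3, 20), ("auth", 4, 20),
]

ranked = rank_correlated_services(cpu_utilization, spans)
for service, score in ranked:
    print(f"{service}: {score:.2f}")
```

In this toy data set, the service whose latency moves with the metric ranks first, while the flat one scores zero. A real system would also need to look at throughput and error rate, follow upstream and downstream dependencies, and filter out coincidental matches to keep noise down.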
Observability with Lightstep is really that powerful and simple. Because metrics and tracing telemetry are completely integrated, you can seamlessly debug issues and keep your services healthy and stable. Try out the product here!
Interested in joining our team? See our open positions here.