Lightstep Resources

Find datasheets, videos, white papers, and more.

The complete guide to distributed tracing

This guide will help you understand how distributed tracing provides visibility and analysis in systems that can have dozens, hundreds, or thousands of services working together. You'll also learn:
- How to establish true observability with distributed tracing
- How your service affects, and is affected by, its dependencies
- How to decide when to take action by focusing on symptoms that directly impact users

Explore Guide resources

OpenTelemetry and distributed tracing for mobile applications

This guide discusses the benefits of distributed tracing, and of OpenTelemetry in particular, for mobile applications, along with some considerations to keep in mind when implementing this type of solution. You'll also learn:
- The current state of mobile monitoring
- Ideas on gaining full observability from your mobile stack on down
- Common ways to transmit mobile observability data

Incident management handbook: How to deal with being on-call

Responding to an alert can be daunting – for managers, developers, and anyone who is paged in the middle of the night. This step-by-step guide shows how to ask the right questions and stay organized along the entire journey from getting paged to resolving an incident. You'll also learn:
- Where to start when your pager goes off
- How to work through the On-Call Stages of Grief
- How to use the process of elimination as a debugging strategy
- How to mitigate the impact on users as quickly as possible

OpenTelemetry 101

This technical guide provides an overview of key concepts of observability and distributed tracing, and details the underlying relationship between core concepts in the OpenTelemetry project, including spans, metrics, and exporters. OpenTelemetry provides a single set of APIs, libraries, agents, and collector services to capture distributed traces and metrics from your application. OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project. This is a great reference guide for those who are just getting started with distributed tracing, as well as for veterans who are using (or plan to use) open source instrumentation, including OpenTelemetry or its predecessors, OpenTracing and OpenCensus.

The power of platform teams

Platform teams are force multipliers: they reduce cognitive load for developers and make it easier to ship new features. Every investment in platform is an investment in making the rest of the development organization more efficient and effective. This ebook will help you understand how to build successful platform teams, and how to set them up for future growth. You'll also learn:
- How to build a modern platform team
- How to release more features with less risk
- How to connect your platform team with key stakeholders in development, incident response, management, and product

Ensuring success with Kubernetes

Container orchestration platforms like Kubernetes allow teams to build and release software faster, but at the cost of increased complexity. This guide will help you navigate that complexity, avoid common pitfalls, and take advantage of the highly active user community. You'll also learn how to:
- Roll out Kubernetes at scale
- Build an effective observability strategy
- Ensure a better customer experience

Putting services first: A new scorecard for observability

Microservices allow us to achieve higher velocity and team independence, but at the cost of limited visibility into service health and dependencies. When things break (and they will), we can't expect to find a solution by guessing and checking across the entire system, nor can we rely on raw telemetry data (metrics, logs, and traces) to guide our hypotheses and investigations. Enter: A New Scorecard for Observability. In this guide, you'll learn:
- Why the “Three Pillars of Observability” fail to address the observability needs of modern systems
- How APM broke microservices
- What you can do to better develop, understand, and operate your distributed system

Distributed tracing at scale: Analyzing Google’s June 2nd outage

On Sunday, June 2nd, Google Cloud Platform had an extended networking-based outage. There was significant disruption of commonly used services like YouTube and Gmail, as well as Google-hosted applications like Snapchat. LightStep Research’s ongoing synthetic testing shows that the impact lasted longer than the published incident report stated, and provides an example of the type of evidence you can share with a cloud provider when discussing an outage. Read this brief to understand:
- How to quickly understand the scope and impact of an outage
- How to measure the performance of cloud service APIs
- How to use distributed tracing at scale
- How to fact-check an incident report or status page

Best practices for root cause analysis

Root cause analysis is about understanding not just what happened but why it happened. It’s about how our assumptions about a system or its services differ from reality, so that fixes address the underlying cause instead of simply rolling back the latest deployment. This guide provides best practices for using root cause analysis to understand why an outage happened in the first place, so that teams can prevent it from recurring. You'll also learn how to:
- Understand the context of a problem and learn from outages
- Effectively and efficiently find the root cause of problems
- Start root cause analysis with the right mindset

How to manage the hidden costs of Kubernetes

Kubernetes has enabled teams to realize the benefits of microservices, and deploying, scaling, and running distributed software has never been easier. But these benefits come with significant costs. In this guide, we’ll discuss how to avoid major potential costs of Kubernetes:
- Thousands of hours spent fixing bugs
- Spiraling monitoring costs
- A lack of understanding of service dependencies
