Why choose OpenTelemetry?


I’ve written quite a bit about OpenTelemetry at this point, but almost all of it has been focused on explaining what OpenTelemetry does and how to use it. But why does OpenTelemetry even exist in the first place? What problem does it solve, and why does it scratch an itch for a lot of developers?

Personally, it was frustration that drove me to work on observability full time, and to focus on telemetry in particular. Telemetry – traces, logs, metrics, etc – is the language our systems use to describe what they are doing. And it felt like the traditional “three pillars” approach to generating telemetry was designed to make my life a nightmare.

So here, in a nutshell, are the top four reasons that motivated me to help create the OpenTelemetry project.

One: Write once, send everywhere

I always want to try new tools before I buy them. But having to rip and replace my entire telemetry pipeline in order to do that creates a serious headache.


With the OpenTelemetry Collector, you can add and remove providers with a simple configuration change.

Here’s why. When you exchange one set of instrumentation for another, you’re not just switching out code, you’re also changing what data is emitted. Even something as effortless as swapping out one Java Agent for another will have this effect. The new data won’t work with the old system. Not only will the new data be in an incompatible format, the content of the data will be completely different – different metrics, different labels, different logs, etc. So, even if you translated the new data into the old format, your current dashboards and alerts would still be broken.

But with OpenTelemetry, you can now send the same telemetry to almost every observability provider. And you can tee the data off to multiple providers at the same time. This makes trying out new services easy.
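To make that concrete, here is a minimal sketch of the "write once" half using the OpenTelemetry Python SDK. The service name and Collector endpoint below are illustrative assumptions; the point is that the application only ever speaks OTLP to a local Collector, and which providers ultimately receive the data is decided by the Collector's exporter configuration rather than by this code.

```python
# A minimal sketch using the OpenTelemetry Python SDK (opentelemetry-sdk plus
# opentelemetry-exporter-otlp). The endpoint is illustrative: a Collector
# running as a local agent or sidecar.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Describe the service emitting this telemetry.
resource = Resource.create({"service.name": "checkout-service"})

# Export spans over OTLP to the local Collector. Adding or removing
# observability providers later is a Collector config change, not a code change.
provider = TracerProvider(resource=resource)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("charge-card"):
    pass  # business logic goes here
```

From there, teeing the same data off to a second provider is a matter of listing another exporter in the Collector's pipeline; the instrumentation above never changes.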

Two: Let operators manage telemetry

There’s a real observer’s paradox with telemetry. Managing a high-volume telemetry pipeline can be a real beast, and operators often need to make changes quickly, in a coordinated fashion across the entire deployment. If making those changes involves reconfiguring and restarting applications, operators risk impacting the system every time they do.

In some cases, they may require an application developer to make the changes for them. This can remove quite a bit of agency from the operator. It can be especially painful when applications go through a complex release pipeline, where they end up running in many different environments (integration testing, staging, load testing, etc) all of which have different telemetry setups.

When running OpenTelemetry, applications can stick to the default OTLP settings. Instead of making configuration changes in the application, telemetry routing and processing can be managed using pools of Collectors. These Collector deployments can be fully controlled by the operator, making telemetry management a separate concern from application management. Operators can make updates whenever they want, without accidentally affecting production.
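As a sketch of what "stick to the default OTLP settings" looks like on the application side (the service and Collector names below are illustrative assumptions), note that nothing backend-specific is hardcoded; the standard OpenTelemetry environment variables let an operator point each environment at its own Collector pool without touching the code.

```python
# Application side: rely on OTLP defaults and hardcode nothing about the backend.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# With no arguments, the gRPC OTLP exporter defaults to localhost:4317, and the
# SDK honors the standard OTel environment variables, so an operator can point
# each environment at its own Collector pool, for example (values illustrative):
#   OTEL_SERVICE_NAME=checkout-service
#   OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector.staging:4317
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)
```

Routing, filtering, and sampling decisions then live in the Collector pools the operator owns, and can change without redeploying the application.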

Three: Create a revolution in automated analysis

Large-scale production systems need to handle huge numbers of concurrent requests, all of which are attempting to utilize the same resources at the same time. These complex, emergent interactions end up generating all kinds of unexpected and unfortunate behavior. Because these issues are ephemeral and only emerge under certain conditions, they can be difficult to diagnose.

An exciting recent development in observability is the use of machine learning and other statistical tools to identify emergent patterns of bad behavior. But there’s a telemetry problem: logs, metrics, traces, RUM, and other data types are traditionally kept in completely separate systems. It should go without saying, but automated analysis can’t find correlations between two data points when they aren’t stored in the same place, or otherwise connected in any way.


OpenTelemetry integrates logging, metrics, tracing, and resources into a single data structure that is ideal for finding correlations and other forms of statistical analysis.

OpenTelemetry solves this by providing a unified data structure, OTLP. This is more than just putting traces, logs, and metrics next to each other in the same pipe. This is highly integrated data, which can only be generated by context-aware instrumentation.

For example, OpenTelemetry has trace exemplars. Whenever metrics are emitted, OpenTelemetry will correlate those metrics with a sampling of traces. So, when counting status codes, the counts are linked to the traces of requests which created those status codes. When measuring RAM or CPU, the measurements are linked to traces of requests which were active on that machine at that time. And when I look at any of these traces, I want to see the associated logs as well.
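As a rough illustration of what context-aware instrumentation looks like (exemplar behavior varies by SDK and version, and the handler below is hypothetical), consider a counter that is incremented while a span is active: because the SDK sees the current span context at the moment of measurement, an exemplar-capable SDK can attach that trace to the recorded data point.

```python
# Sketch of a context-aware measurement: the counter increment happens inside
# an active span, so an exemplar-capable SDK can link the metric data point to
# the trace that produced it. (Exemplar support varies by SDK and version.)
from opentelemetry import metrics, trace

tracer = trace.get_tracer(__name__)
meter = metrics.get_meter(__name__)

status_counter = meter.create_counter(
    "http.server.responses", unit="1", description="Count of HTTP responses"
)

def handle_request():
    with tracer.start_as_current_span("GET /checkout"):
        status_code = 500  # placeholder for the real handler's result
        # Recorded while the span is active: the count and the trace are linked.
        status_counter.add(1, {"http.status_code": status_code})
```

The same principle applies to logs: a log record emitted inside an active span can be stamped with the trace and span IDs, which is what makes the "show me the logs for this trace" workflow possible.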

This kind of integrated telemetry is designed to power modern observability systems, which use machine analysis to surface correlations across all of this data.

Four: Shared standards matter

While we don’t need standards for everything, data protocols are one place where they can be extremely useful. OpenTelemetry isn’t just for application developers: having a standard also enables OSS libraries, databases, and managed services to participate in observability.

OSS code is run by many different organizations, all of which have made different choices about what observability system they want to use. When the only instrumentation options available are proprietary, or open but tied to a specific observability platform, it’s hard to emit telemetry from these shared libraries and services. Making telemetry work for OSS is an important goal for the OpenTelemetry project. That’s why we work so hard to ensure that OpenTelemetry is stable, and works with every observability system.
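One pattern that makes this practical: a shared library can instrument itself against the OpenTelemetry API alone, without depending on any SDK, exporter, or vendor. The library and span names below are illustrative; the key is that if the host application installs an SDK, the library's telemetry flows to whatever backend that application chose, and if not, the calls are cheap no-ops.

```python
# Inside a hypothetical OSS library: depend only on the opentelemetry-api
# package. No SDK, no exporter, no vendor -- those choices belong to the
# application that embeds this library.
from opentelemetry import trace

tracer = trace.get_tracer("acme.dbclient", "1.4.0")  # illustrative names

def run_query(sql: str) -> None:
    with tracer.start_as_current_span("acme.dbclient.query") as span:
        span.set_attribute("db.statement", sql)
        # ... execute the query against the database here ...
        # If the application configured no SDK, this span is a no-op.
```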

Dive deeper

These four reasons are why I work on OpenTelemetry. If some of those reasons resonate, let me know! Learn more about OpenTelemetry.

Ted Young
March 1, 2022