Welcome back to The Big Pieces, a weekly series focused on the high-level design of OpenTelemetry. In this installment, we’re covering the way OpenTelemetry handles the transmission part of that “telemetry” term.
OpenTelemetry pipeline architecture
OpenTelemetry Client
We covered client architecture in detail last week. Code is instrumented using a clean, low-dependency API. The API is implemented by the SDK, a data-processing framework. The SDK includes exporters: framework plugins for sending data in various formats. This allows clients to send data directly to your storage system of choice, without running any collectors. A standard configuration is to run the SDK with the OTLP/gRPC exporter (OpenTelemetry’s native format) pointed at the default address of localhost:4317, where it will expect a Collector to be listening.
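As a concrete illustration, here is a minimal sketch of that standard configuration using the Python SDK. The package layout and import paths are assumptions about the opentelemetry-python distribution and may differ by SDK version; the point to note is that the OTLP/gRPC exporter targets localhost:4317 by default.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# The OTLP/gRPC exporter defaults to localhost:4317,
# where a local Collector is expected to be listening.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("example.app")
with tracer.start_as_current_span("do-work"):
    pass  # application code goes here
```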
OpenTelemetry Collector
The collector is a standalone service for transmitting observations. The collector follows the pipeline pattern: receivers, processors, and exporters can be chained together to form pipelines. Data can be received, processed, and exported in a variety of formats, with the collector handling tasks such as buffering data, managing configuration, and converting from one format to another.
Collectors are configured via YAML files. In-depth documentation and details on the format can be found here. Let’s cover the basics; a minimal example configuration follows the list below.
Receivers: Receivers ingest data from a variety of popular sources and formats, such as Zipkin and Prometheus. Running multiple types of receivers can help with mixed deployments, allowing for a seamless transition to OpenTelemetry from older systems.
Processors: Processors allow traces, metrics, and resources to be manipulated in a variety of ways.
Exporters: Collector exporters play the same role as client exporters. They take completed data from OpenTelemetry’s in-memory buffer and flush it to various endpoints in a variety of formats. Exporters can be run in parallel, and data can be efficiently fanned out to multiple endpoints at once – for example, sending your trace data to both Lightstep and Jaeger while sending your metrics data to Prometheus.
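To make the pipeline pattern concrete, here is a minimal sketch of a Collector configuration wired up along those lines. The endpoint addresses are hypothetical placeholders, and the exact set of available components depends on your Collector build; the structure to note is how receivers, processors, and exporters are composed into pipelines under the service section.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  zipkin:                       # accept spans from existing Zipkin-instrumented services

processors:
  batch:                        # buffer and batch telemetry before export

exporters:
  otlp/lightstep:
    endpoint: ingest.lightstep.com:443        # hypothetical; a real setup adds an access token header
  otlp/jaeger:
    endpoint: jaeger-collector.internal:4317  # hypothetical in-cluster Jaeger endpoint
  prometheus:
    endpoint: 0.0.0.0:8889                    # expose metrics for Prometheus to scrape

service:
  pipelines:
    traces:
      receivers: [otlp, zipkin]
      processors: [batch]
      exporters: [otlp/lightstep, otlp/jaeger]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
```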
Does OpenTelemetry provide storage?
Once the collectors have finished their transformations, OpenTelemetry data ends its journey being handed to some form of stable storage where analysis can be performed. By design, OpenTelemetry does not provide any analysis tools or long term storage system – we’re focused on standardizing how systems describe themselves. What you do with that data is another matter, and we hope to see many great analysis tools designed to leverage OpenTelemetry’s data model.
OpenTelemetry deployment advice
Minimizing impact on the underlying system is a primary goal. The first tenet of OpenTelemetry design is “do no harm.”
What types of impact would we like to minimize, in this case?
Rebooting application processes just to manage configuration changes
Stealing system resources from the application process
Slowing application shutdown while waiting for data to flush
There are a couple of deployment choices which can help alleviate these issues. The first addresses the rebooting issue: run the OpenTelemetry clients in as close to default mode as possible, pointed at a local collector. This allows configuration changes to be made by restarting the collector rather than the application process. The local collector can also measure system metrics, such as CPU and memory usage, for your application.
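A sketch of what that local “agent” collector might look like, assuming a Collector build that includes the hostmetrics receiver; the gateway address is a hypothetical placeholder:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317      # where the SDK exports by default
  hostmetrics:                        # CPU and memory usage for the local machine
    collection_interval: 30s
    scrapers:
      cpu:
      memory:

processors:
  batch:

exporters:
  otlp:
    endpoint: collector-gateway.internal:4317   # hypothetical data-processing pool

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp, hostmetrics]
      processors: [batch]
      exporters: [otlp]
```

Because the application only ever talks to localhost:4317, retargeting or reconfiguring the pipeline means restarting this collector, not the application.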
While a local collector is good, it can’t solve the overhead issue on its own. To manage telemetry at scale, a pool of data-processing collectors can be run on separate machines in the same private network. This keeps the local collector from spending application resources, and moves the heavy data processing to machines where the collector is free to utilize the whole machine.
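Those pooled “gateway” collectors can then take on the heavier lifting – batching, protection against overload, and export to the backend. A hedged sketch, with hypothetical endpoints and limits:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317        # receive from the local agent collectors

processors:
  memory_limiter:                     # guard against overload on the gateway
    check_interval: 1s
    limit_mib: 4000
  batch:
    send_batch_size: 8192
    timeout: 5s

exporters:
  otlp:
    endpoint: ingest.backend.example.com:443   # hypothetical storage/analysis backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```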
And that’s the basic design of the telemetry pipeline. Hopefully this high-level advice helps you understand the project components and decide how to best set up your own OpenTelemetry deployment. Interested in more? Check out our latest video on how to instrument using our OpenTelemetry Launchers.
Interested in joining our team? See our open positions here.