OpenTelemetry automatic instrumentation: a deep dive
by Austin Parker
OpenTelemetry can be an overwhelming project to take in - understanding the interactions between the API, SDK, and various tools and protocols is a lot. One particularly interesting and important part of the project, though, is the "automatic instrumentation" components. What is automatic instrumentation, and why is it important to OpenTelemetry? How can you use it? How does it help? We’ll answer these questions, and more, so read on!
You might be more familiar with the term "agent" to refer to automatic instrumentation. A quick detour is in order - why did agents exist in the first place? Early application monitoring tools tended to rely on what’s known as a “black box” approach to monitoring. Often, the people responsible for monitoring the health of services weren’t the people developing those services, so they were forced to rely on external applications and processes that could perform introspection of running services and generate performance data about them. These processes, called agents, were installed on servers and hosts, given a list of process names to monitor, and would then use various methods to gather metrics and request traces from them.
As we’ve “shifted left”, and developers have taken more ownership of the care and feeding of their services, the tools we need to understand service performance have changed as well. Developers need a choice of tools that suit their service’s needs, rather than one-size-fits-all monitoring agents. Additionally, developers don’t want to be locked into a proprietary solution that they can’t extend or modify to suit their needs. OpenTelemetry’s automatic instrumentation fulfills both of these requirements, allowing for a bridge to a "batteries-included" future of observability.
The biggest benefit of OpenTelemetry automatic instrumentation is that it provides a single, shared set of semantics across languages for common operations. This means that in a polyglot system, traces and metrics will have the same attributes for the same type of operation, regardless of the framework, library, or language that emitted the telemetry. Let’s be more specific - if you have web servers running on Java Spring Boot, Node.js Express, and Go net/http, and make a GET request to a route on each of them, the trace for that request will have the same attributes regardless of which server handled it - just different values for the things that differ. This makes it easy to ask questions about your system, like “is the new version of this library out-performing the old version?” or “which region has the slowest response time to my API right now?”
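To make that concrete, here is a minimal sketch of why shared attribute names matter. The spans are made-up data standing in for what three differently-implemented servers might report, and the attribute keys (`http.method`, `http.route`, `http.status_code`, `cloud.region`) are assumptions for illustration; the exact names depend on the version of the semantic conventions your SDK emits. Because every span uses the same keys, one small function can answer the "slowest region" question without caring which language produced the data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical spans from three servers written in different languages.
# Service names, durations, and attribute keys are illustrative assumptions.
spans = [
    {"service": "orders-java", "duration_ms": 120.0,
     "attributes": {"http.method": "GET", "http.route": "/orders/{id}",
                    "http.status_code": 200, "cloud.region": "us-east-1"}},
    {"service": "orders-node", "duration_ms": 310.0,
     "attributes": {"http.method": "GET", "http.route": "/orders/{id}",
                    "http.status_code": 200, "cloud.region": "eu-west-1"}},
    {"service": "orders-go", "duration_ms": 290.0,
     "attributes": {"http.method": "GET", "http.route": "/orders/{id}",
                    "http.status_code": 200, "cloud.region": "eu-west-1"}},
    {"service": "orders-go", "duration_ms": 80.0,
     "attributes": {"http.method": "GET", "http.route": "/orders/{id}",
                    "http.status_code": 200, "cloud.region": "us-east-1"}},
]


def mean_duration_by(spans, key):
    """Group spans by one attribute value and average their durations."""
    grouped = defaultdict(list)
    for span in spans:
        grouped[span["attributes"][key]].append(span["duration_ms"])
    return {value: mean(durations) for value, durations in grouped.items()}


# The same query works across all three services because the keys match.
by_region = mean_duration_by(spans, "cloud.region")
slowest_region = max(by_region, key=by_region.get)
print(slowest_region, by_region[slowest_region])
```

If each framework used its own attribute names, the grouping step would need a per-language translation table before any comparison was possible.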
In addition, OpenTelemetry is standardizing the configuration of automatic instrumentation, irrespective of language -- need to configure the log level, or sampling? It doesn’t matter what language you’re in, OTEL_TRACES_SAMPLER will work everywhere. This simplifies documentation and comprehension quite a bit!
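As a rough illustration of what "works everywhere" means, the sketch below shows how an SDK in any language might interpret the sampler settings from the environment. It is a simplified stand-in, not the code of any real SDK: the function name `read_sampler_config` is hypothetical, and the sampler list is only a subset. The variable names `OTEL_TRACES_SAMPLER` and `OTEL_TRACES_SAMPLER_ARG`, the listed values, and the `parentbased_always_on` default follow the current OpenTelemetry environment variable specification as I understand it, so check the spec for your SDK version.

```python
import os

# A subset of the sampler names defined by the OpenTelemetry specification.
KNOWN_SAMPLERS = {
    "always_on",
    "always_off",
    "traceidratio",
    "parentbased_always_on",
    "parentbased_always_off",
    "parentbased_traceidratio",
}

DEFAULT_SAMPLER = "parentbased_always_on"


def read_sampler_config(environ=os.environ):
    """Return (sampler_name, ratio) from environment-style settings.

    Hypothetical helper: unknown sampler names fall back to the default,
    and the ratio argument is only read for ratio-based samplers.
    """
    name = environ.get("OTEL_TRACES_SAMPLER", DEFAULT_SAMPLER)
    if name not in KNOWN_SAMPLERS:
        name = DEFAULT_SAMPLER
    ratio = None
    if name.endswith("traceidratio"):
        ratio = float(environ.get("OTEL_TRACES_SAMPLER_ARG", "1.0"))
    return name, ratio


# The same two settings could be handed to a Java, Node.js, or Go service.
print(read_sampler_config({
    "OTEL_TRACES_SAMPLER": "traceidratio",
    "OTEL_TRACES_SAMPLER_ARG": "0.25",
}))
```

The point is that the operator sets two variables once, and every service in the fleet samples the same way, whatever it is written in.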
When you’re embarking on observability, one of the most important tasks is reducing the "time to value" -- how long does it take before you can start to answer questions? OpenTelemetry automatic instrumentation tackles this by focusing on instrumenting the most vital parts of your system’s "critical path" -- HTTP and RPC servers, database clients, caches, and so forth. Ideally, you can drop in automatic instrumentation and within minutes, begin diagnosing common latency and availability issues with your system. Tracing is most useful when all of your libraries are instrumented, and auto-instrumentation simplifies this dramatically.
Automatic instrumentation also provides a building block for future instrumentation efforts. Once you know the basic outline of things, you can start to go deeper, adding in custom instrumentation and attributes around your business logic.
OpenTelemetry is, and always will be, a permissively-licensed open source project. This means that you’re never going to be boxed in by proprietary extensions, unavailable source code, or SaaS/Cloud-specific addons. Both today and into the future, you’ll be able to rely on OpenTelemetry to grow with you as your observability practice expands and matures. We also play nice with existing tools, so you can make OpenTelemetry the centerpiece of your observability strategy and integrate existing instrumentation from legacy OpenTracing or OpenCensus projects with new OpenTelemetry instrumentation.
You can find a full list of supported libraries and frameworks in the OpenTelemetry project readmes, but I’d like to highlight the breadth of the support here.
- Akka HTTP, Play Framework
- Apache HttpClient, AsyncHttpClient
- AWS Lambda, AWS SDK
- Google HTTP Client, gRPC
- Kubernetes Client
- Redisson, Rediscala
- Servlet Framework
- Spring Batch, Spring Data, Spring Scheduling, Spring Web MVC, Spring Webflux
- Glassfish, JBoss, Jetty, Tomcat, Weblogic, WildFly … and more!
- ASP.NET and ASP.NET Core
- Redis, SQL
- Entity Framework
- Azure SDK
- MassTransit … and many more coming soon!
- Ruby Kafka
- net_http … and more!
- http/https … and more!
- AIOPG, AIOHTTP
- SQLAlchemy … and more!
- Redis … and more!
This is only a selection of the available packages and instrumentation libraries. Be sure to check your language’s contrib repository for more!
The goal of OpenTelemetry is to make observability a built-in component of your software, which means that we’d love to see these integrations actually move upstream and become something that you get simply by using these packages! This is a long-term goal -- I wouldn’t expect it soon -- but in the meantime, using automatic instrumentation is a great way to get closer to that goal today. If you’ve got any questions about how to get started, check out our community discord at https://ltstp.run/discord. I’d love to hear from you!