OpenTelemetry: Emerging standard for all DevOps solutions - feature flags
by Clay Smith
Lightstep wants to help customers and our DevOps partners adopt OpenTelemetry. As a founding member and core contributor of the standard, we have the expertise, tools, and templates to help vendors easily adopt it in their solutions.
We are launching a series of OpenTelemetry-based tutorials and example instrumentation for different DevOps solutions. We’ll show how to connect these solutions to observability data within Lightstep, and how that connection improves the experience of running experiments, operating cloud services, or investigating errors. In an earlier post, we showed how you can extend instrumentation to AWS cloud services.
In this post, we’ll discuss the value of adding instrumentation to feature flags as part of your progressive delivery workflow.
Feature flags are enabling software teams to move faster. At its simplest, a feature flag is an if/else statement in code that allows for more control over delivering new products and features without a full software release, which is a key part of the emerging practice called progressive delivery. Feature flags also increase safety: if something goes wrong, like orders on an eCommerce site dropping when a new checkout experience is released, a feature can be immediately deactivated.
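To make the if/else idea concrete, here is a minimal sketch in Python. The in-memory `FLAGS` dictionary, the `is_enabled` helper, and the `new-checkout` flag name are all hypothetical and stand in for whatever feature flag provider you use.

```python
# Hypothetical in-memory flag store; a real service would ask a flag provider.
FLAGS = {"new-checkout": True}


def is_enabled(flag_name: str) -> bool:
    """Return the flag's current state; unknown flags default to off."""
    return FLAGS.get(flag_name, False)


def checkout(cart_total: float) -> str:
    # The feature flag itself: a plain if/else around the new code path.
    if is_enabled("new-checkout"):
        return f"new checkout flow: {cart_total:.2f}"
    return f"legacy checkout flow: {cart_total:.2f}"


print(checkout(42.0))          # served by the new flow while the flag is on
FLAGS["new-checkout"] = False  # deactivate the feature without a release
print(checkout(42.0))          # back on the legacy flow
```

Flipping the flag changes which branch runs on the next request, which is what makes it possible to turn off a misbehaving feature without shipping new code.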
As feature flag adoption increases, multiple teams need to work together to understand the impact of feature flags. Sometimes impact is easy to determine, like toggling a button’s color in an app and measuring whether more people click on it. For features that involve multiple development and operations teams, like a new product or major feature, it becomes very challenging to understand performance because a single feature flag can impact dozens of services and their supporting infrastructure.
Here are some types of difficult questions we’ve heard from customers:
- Did the new mobile app signup flow impact database performance? Is it going to increase the cloud bill?
- Which new features were enabled on the services that had downtime during the last incident?
- Is the cause of a new error on a backend service the result of a new feature in the frontend, or something else?
Historically, answering those types of questions has required superpower SRE skills: the data, if it existed, was siloed across multiple platforms, solutions, and teams. OpenTelemetry changes this by allowing development teams to add feature flags to their telemetry. A vendor-neutral plugin can do this automatically with a single line of code, adding the currently enabled feature flags to distributed traces.
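As a rough illustration of what "adding feature flags to a trace" means, the sketch below copies flag state onto a span as attributes. It is not the plugin's actual code: the `Span` class is a stand-in written for this example rather than the real OpenTelemetry SDK class, and the `record_feature_flags` helper, the flag names, and the `feature_flag.` attribute prefix are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Stand-in for a tracing span; NOT the real OpenTelemetry SDK class."""
    name: str
    attributes: dict = field(default_factory=dict)

    def set_attribute(self, key: str, value: str) -> None:
        self.attributes[key] = value


# Hypothetical flag state, as a flag provider might report it for one request.
ENABLED_FLAGS = {"new-checkout": "on", "dark-mode": "off"}


def record_feature_flags(span: Span, flags: dict) -> Span:
    """Copy each flag's current treatment onto the span as an attribute."""
    for flag_name, treatment in sorted(flags.items()):
        # The "feature_flag." prefix is an assumed naming choice for this sketch.
        span.set_attribute(f"feature_flag.{flag_name}", treatment)
    return span


span = record_feature_flags(Span("POST /checkout"), ENABLED_FLAGS)
print(span.attributes)
```

Spans in the OpenTelemetry Python API also expose a `set_attribute` method, so the same loop could run against a real span. Once flag state travels with each trace, a question like "which flags were on when this backend error occurred?" becomes a filter on span attributes instead of a cross-team investigation.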
Here’s what a trace with feature flag metadata looks like in the Lightstep Explorer with Split.io feature flags:
We believe this new kind of instrumentation will be very helpful whenever teams want to understand how feature flags impact the performance of their overall system, and we have made the code available on GitHub for you to try.
Check out our Lightstep Developer Toolkit:
- Our demo app demonstrates OpenTelemetry-aware feature flags with Lightstep using an experimental feature flag plugin available on GitHub.
- Our Lightstep learning path shows how to enable this in your services and points to a demo app on GitHub.
- Our OpenTelemetry Docs for software teams considering adopting OpenTelemetry.
Contact us if you’d like to learn more about OpenTelemetry or find out what’s planned for future integrations.