OpenTelemetry for Python: The Hard Way

In my last blog post, I showed y’all how to instrument Python code with OpenTelemetry (OTel), à la auto-instrumentation. You may also recall from that post that I recommended using the Python auto-instrumentation binary even for non-auto-instrumented libraries, because it abstracts all that pesky OTel config stuff so nicely. When you use it, along with any applicable Python auto-instrumentation libraries (installed courtesy of opentelemetry-bootstrap), it takes care of context propagation across related services for you.

All in all, it makes life nice ‘n easy for us!

Well, today, my friends, we’re going to torture ourselves a weeeee bit, because we’re going to put that auto-instrumentation binary aside, and will instead dig into super-duper manual OpenTelemetry instrumentation for Python. Since we don’t have auto-instrumentation as our security blanket, we will have to learn how to do the following:

  • Configure OpenTelemetry for Python to send instrumentation data to an Observability back-end that supports OTLP. Spoiler alert: we’ll be using Lightstep as our Observability back-end. ✅

  • Propagate context across related services so that they show up as part of the same trace ✅

Note: I won’t go into how to create Spans with OTel for Python, since the official OTel docs do a mighty fine job of it.

Are you scared? Well don’t be, because I’ve figured it all out so that you don’t have to!

Are you readyyyyy? Let’s do this!!

Pre-Requisites

Before we start our tutorial, here are some things that you’ll need:

If you’d like to run the full code examples in Part 2, you’ll also need:

Part 1: What’s Happening?

We’ll be illustrating Python manual instrumentation with OpenTelemetry with a client and server app. The client will call a /ping endpoint hosted by the server.

The example in this tutorial can be found in the lightstep/opentelemetry-examples repo. We will be working with three main files: common.py, client.py, and server.py.

Before we run the example code, we must first understand what it’s doing.

1- OTel Libraries

In order to send OpenTelemetry data to an Observability back-end (e.g. Lightstep), you need to install the following OpenTelemetry packages, which are included in requirements.txt:

opentelemetry-api
opentelemetry-sdk
opentelemetry-exporter-otlp-proto-grpc

As you can see, we’re installing the OpenTelemetry API and SDK packages, along with opentelemetry-exporter-otlp-proto-grpc, which is used to send OTel data to your Observability back-end (e.g. Lightstep) via gRPC.

2- OTel Setup and Configuration (common.py)

In our example, OTel setup and configuration is done in common.py. We split things out into this separate file so that we don’t have to duplicate this code in client.py and server.py.

First, we must import the required packages (note that common.py also needs os, since we read environment variables below):

import os

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

Next, we must configure the Exporter. An Exporter is how OpenTelemetry sends data to your Observability back-end. As I mentioned earlier, Lightstep accepts data in the OTLP format, so we need to define an OTLP Exporter.

Note: Some vendors don’t accept data in OTLP format, which means that you will need to use a vendor-specific exporter to send data to them.

We configure our Exporter in Python like this:

def get_otlp_exporter():
    ls_access_token = os.environ.get("LS_ACCESS_TOKEN")
    return OTLPSpanExporter(
        endpoint="ingest.lightstep.com:443",
        headers=(("lightstep-access-token", ls_access_token),),
    )

Some noteworthy items:

  • We read our Lightstep Access Token from the LS_ACCESS_TOKEN environment variable, and pass it to Lightstep via the lightstep-access-token header.

  • The endpoint parameter points the Exporter at Lightstep’s public ingest endpoint, ingest.lightstep.com:443.

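Note: If you just want to eyeball your telemetry locally without any Observability back-end at all, a minimal sketch (not part of the example repo) is to swap in the SDK’s ConsoleSpanExporter, which prints finished Spans to stdout. You would then hand it to the same BatchSpanProcessor that we configure below:

# A minimal sketch (not from the example repo): print Spans to stdout
# instead of exporting them over the network. Handy for local debugging.
from opentelemetry.sdk.trace.export import ConsoleSpanExporter

def get_console_exporter():
    return ConsoleSpanExporter()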
Finally, we configure the Tracer Provider. A TracerProvider serves as the entry point of the OpenTelemetry API. It provides access to Tracers. A Tracer is responsible for creating a Span to trace the given operation.

We configure our Tracer Provider in Python like this:

def get_tracer():
    span_exporter = get_otlp_exporter()

    provider = TracerProvider()
    if not os.environ.get("OTEL_RESOURCE_ATTRIBUTES"):        
        # Service name is required for most backends
        resource = Resource(attributes={
            SERVICE_NAME: "test-py-manual-otlp"
        })
        provider = TracerProvider(resource=resource)
        print("Using default service name")

    processor = BatchSpanProcessor(span_exporter)
    provider.add_span_processor(processor)
    trace.set_tracer_provider(provider)    

    return trace.get_tracer(__name__)

A few noteworthy items:

  • We define a Resource to provide OpenTelemetry with a bunch of information that identifies our service, including service name and service version. (You can see a full list of Resource attributes that you can set here.) As the name implies, service name is the name of the microservice that you are instrumenting, and service version is the version of the service that you are instrumenting. In this example, the service name and service version are passed in as key/value pairs in the environment variable OTEL_RESOURCE_ATTRIBUTES (we’ll see some example values in Part 2). If that environment variable is not present, we set a default service name, "test-py-manual-otlp".

  • We are using the BatchSpanProcessor, which means that we are telling OTel to export the data in batches. For the purposes of this example, we’re not doing anything beyond a basic configuration (see the sketch below for the knobs you could tune).
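For reference, here’s a sketch of what a more tuned BatchSpanProcessor could look like. The parameter values below are made up purely for illustration; the SDK defaults are usually fine:

# Illustrative only: tuning knobs on the BatchSpanProcessor.
processor = BatchSpanProcessor(
    span_exporter,
    max_queue_size=2048,          # how many Spans to buffer before dropping
    schedule_delay_millis=5000,   # how often (in ms) to flush a batch
    max_export_batch_size=512,    # maximum Spans per export call
)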

3- Initialization (client.py and server.py)

We’re finally ready to send data to Lightstep! All we need to do is call common.py’s get_tracer function from client.py (Lines 17-20) and server.py (Lines 17 and 29), like this:

from common import get_tracer

...

tracer = get_tracer()

...

4- Instrumentation (client.py and server.py)

With initialization done, we need to instrument our code, which means that we’ll need to create Spans. I won’t go into the specifics of Span creation here, since the OTel docs do a pretty good job of it, and as I mentioned in the intro, it’s outside of the scope of this post.

I will, however, briefly mention that there are a couple of ways to instrument our code in Python, and you’ll see both ways of Span creation in the example code: using the with statement, and using function decorators.

You can see an example of creating a Span using the with statement in client.py, Lines 23-32. Below is the full function listing:

def send_requests(url):
    with tracer.start_as_current_span("client operation"):
        try:
            carrier = {}
            TraceContextTextMapPropagator().inject(carrier)
            header = {"traceparent": carrier["traceparent"]}
            res = requests.get(url, headers=header)
            print(f"Request to {url}, got {len(res.content)} bytes")
        except Exception as e:
            print(f"Request to {url} failed {e}")
            pass

The Span is initialized with the line, with tracer.start_as_current_span("client operation"):, and everything below that line is within the scope of that Span.
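As a quick aside (this is a sketch, not code from the example repo, and the app.target_url attribute name is made up), anything that runs inside that with block can grab the active Span and enrich it:

from opentelemetry import trace

def send_requests(url):
    with tracer.start_as_current_span("client operation"):
        # Inside the with block, "client operation" is the current Span,
        # so we can hang extra attributes and events off of it.
        span = trace.get_current_span()
        span.set_attribute("app.target_url", url)
        span.add_event("about to call the server")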

You can see an example of creating a Span using a function decorator in server.py, Line 78. Below is the full function listing:

@tracer.start_as_current_span("pymongo_integration")
@app.route("/pymongo/<length>")
def pymongo_integration(length):
    with tracer.start_as_current_span("server pymongo operation"):
        client = MongoClient("mongo", 27017, serverSelectionTimeoutMS=2000)
        db = client["opentelemetry-tests"]
        collection = db["tests"]
        collection.find_one()
        return _random_string(length)

A few noteworthy items:

  • The line @tracer.start_as_current_span("pymongo_integration") starts the Span for the pymongo_integration function. Everything in that function is within the scope of that Span.

  • You may have also noticed that we initialize another span in there, with the line, with tracer.start_as_current_span("server pymongo operation"): (server.py, Line 89). This means that we end up with nested Spans (a Span within a Span).

5- Context Propagation

As I mentioned in the intro, one of the advantages of using Python auto-instrumentation is that it takes care of context propagation across services for you. If you don’t use auto-instrumentation, however, you have to take care of context propagation yourself. Great. Just great.

But before we dig into how to do that, we need to first understand context propagation.

Definition time!

Context represents the information that correlates Spans across process boundaries.

Propagation is the means by which context is bundled and transferred in and across services, often via HTTP headers.

This means that when one service calls another, they will be linked together as part of the same Trace. If you go the pure manual instrumentation route (like we’re doing today), however, you have to make sure that your context is propagated across services that call each other, otherwise you’ll end up with separate, unrelated (even though they should be related) Traces.

I have to admit that I was wracking my brains trying to figure out this context propagation stuff. After much time spent Googling and asking folks around here for clarification, I finally got it, so I’m going to share this piece with you here to hopefully spare you some stress.

Note: Although the OpenTelemetry documentation does provide some insight into how to do manual context propagation in Python, the documentation needs a little work. I’m actually part of the OpenTelemetry Comms SIG, so I am using this as motivation to improve the docs around this topic…stay tuned for updates to the OTel docs too! 😎

Okay, so how do we do this manual context propagation? First, let’s remind ourselves of what’s happening in our example app. We have a client service and a server service. The client service calls the /ping endpoint on the server service, which means that we expect them to be part of the same Trace. This in turn means that we have to ensure that they both have the same Trace ID in order to be seen by Lightstep (and other Observability back-ends) as being related.

At a high level, we accomplish this by:

  • Getting the Trace ID of the client

  • Injecting the Trace ID into the HTTP header before the client calls the server

  • Extracting the client’s Trace ID from the HTTP header on the server side

Easy peasy! Now let’s look at the code that makes this happen.

First, we need to start with something called a carrier. A carrier is just a key-value pair carrying your trace context (including the Trace ID), and it looks something like this:

{'traceparent': '00-a9c3b99a95cc045e573e163c3ac80a77-d99d251a8caecd06-01'}

Where traceparent is the key, and the value carries your Trace ID (along with a version, a parent Span ID, and trace flags, per the W3C Trace Context format). Note that the above is just an example of what a traceparent value might look like. Obviously, your own Trace ID will be different (and will be different each time you run the code).
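If you’re curious, here’s a quick (purely illustrative) way to pick apart those four dash-separated fields:

# Purely illustrative: split a W3C traceparent value into its four fields.
traceparent = "00-a9c3b99a95cc045e573e163c3ac80a77-d99d251a8caecd06-01"
version, trace_id, parent_span_id, trace_flags = traceparent.split("-")

print(trace_id)        # a9c3b99a95cc045e573e163c3ac80a77 (the Trace ID)
print(parent_span_id)  # d99d251a8caecd06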

Okay, great. Now how do we obtain said carrier?

First, we need to import a TraceContextTextMapPropagator in client.py:

from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator

Next, we must populate the carrier:

carrier = {}
TraceContextTextMapPropagator().inject(carrier)

If you were to inspect the value of carrier after the inject call, you would see something like this:

{'traceparent': '00-a9c3b99a95cc045e573e163c3ac80a77-d99d251a8caecd06-01'}

Look familiar? 🤯

Now that we have the carrier, we need to put it into our HTTP header before we make a call to the server.

header = {"traceparent": carrier["traceparent"]}
res = requests.get(url, headers=header)

And voilà! Your carrier is in the HTTP request!

Now that we know what all of these snippets do, let’s put it all together. Here’s what our client code looks like:

def send_requests(url):
    with tracer.start_as_current_span("client operation"):
        try:
            carrier = {}
            TraceContextTextMapPropagator().inject(carrier)
            header = {"traceparent": carrier["traceparent"]}
            res = requests.get(url, headers=header)
            print(f"Request to {url}, got {len(res.content)} bytes")
        except Exception as e:
            print(f"Request to {url} failed {e}")
            pass

For the full code listing, check out client.py.

Okay…we’ve got things sorted out on the client side. Yay! Now let’s go to the server side and pluck our carrier from the HTTP request.

In server.py, we pull the value of traceparent from our header like this:

traceparent = get_header_from_flask_request(request, "traceparent")

Where we define get_header_from_flask_request as:

def get_header_from_flask_request(request, key):
    # get_all returns a list of values for that header, hence the [0] below
    return request.headers.get_all(key)

Now we can build our carrier from this information:

carrier = {"traceparent": traceparent[0]}   

We then use that carrier to extract the context:

ctx = TraceContextTextMapPropagator().extract(carrier)

Now we can create our Span with the context, ctx:

with tracer.start_as_current_span("/ping", context=ctx):

Here, we are passing ctx to a named parameter called context. This ensures that our "/ping" Span knows that it’s part of an existing Trace (the one originating from our client call).
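By the way, you don’t strictly have to rebuild the carrier by hand. As a hypothetical alternative (not what the example repo does), you could hand extract() all of the incoming headers, lowercasing the keys so the propagator’s default dict-based getter can find traceparent:

# Hypothetical alternative to building the carrier manually: pass all of the
# incoming request headers to extract(), with the keys lowercased.
incoming = {key.lower(): value for key, value in request.headers.items()}
ctx = TraceContextTextMapPropagator().extract(incoming)

with tracer.start_as_current_span("/ping", context=ctx):
    ...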

It is worth noting that any child Spans of the "/ping" Span do not require us to pass in a context, since that’s passed in implicitly (see server.py, Line 81, for example).

Now that we know what all of these snippets do, let’s put it all together. Here’s what our server code looks like:

...

from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator

...

def get_header_from_flask_request(request, key):
    return request.headers.get_all(key)

...

@app.route("/ping")
def ping():

    traceparent = get_header_from_flask_request(request, "traceparent")
    carrier = {"traceparent": traceparent[0]}
    ctx = TraceContextTextMapPropagator().extract(carrier)

    with tracer.start_as_current_span("/ping", context=ctx):

        length = random.randint(1, 1024)
        redis_integration(length)
        pymongo_integration(length)
        sqlalchemy_integration(length)
        return _random_string(length)

...

For the full code listing, check out server.py.

Part 2: Try it!

Now that we know the theory behind all of this, let’s run our example!

1- Clone the repo

git clone https://github.com/lightstep/opentelemetry-examples.git

2- Setup

Let’s first start by setting up our Python virtual environment:

cd python/opentelemetry/manual_instrumentation

python3 -m venv .
source ./bin/activate

# Install requirements.txt
pip install -r requirements.txt

3- Run the Server app

We’re ready to run the server. Be sure to replace <LS_ACCESS_TOKEN> with your own Lightstep Access Token.

export LS_ACCESS_TOKEN="<LS_ACCESS_TOKEN>"
export OTEL_RESOURCE_ATTRIBUTES=service.name=py-opentelemetry-manual-otlp-server,service.version=10.10.9

python server.py

Remember how I told you that we’d see an example of values passed into OTEL_RESOURCE_ATTRIBUTES? Well, here it is! Here, we’re passing in the service name py-opentelemetry-manual-otlp-server, and service version 10.10.9. The service name will show up in the Lightstep explorer.
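Under the hood (and this is just a sketch to illustrate the mechanism, not code from the repo), the SDK builds its default Resource with Resource.create(), which reads OTEL_RESOURCE_ATTRIBUTES for you. That’s why common.py’s get_tracer only builds an explicit Resource when the variable is absent:

import os
from opentelemetry.sdk.resources import Resource

# Simulate the shell export above (sketch only).
os.environ["OTEL_RESOURCE_ATTRIBUTES"] = (
    "service.name=py-opentelemetry-manual-otlp-server,service.version=10.10.9"
)

# Resource.create() folds OTEL_RESOURCE_ATTRIBUTES into the Resource, and
# TracerProvider() uses that Resource by default.
resource = Resource.create()
print(resource.attributes["service.name"])     # py-opentelemetry-manual-otlp-server
print(resource.attributes["service.version"])  # 10.10.9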

Your output will look something like this:

[Screenshot: Python server.py startup sequence output]

4- Run the Client app

Open up a new terminal window, and run the client app. Be sure to replace <LS_ACCESS_TOKEN> with your own Lightstep Access Token.

PS: Make sure you’re in python/opentelemetry/manual_instrumentation, relative to the opentelemetry-examples repo root.

export LS_ACCESS_TOKEN="<LS_ACCESS_TOKEN>"
export OTEL_RESOURCE_ATTRIBUTES=service.name=py-opentelemetry-manual-otlp-client,service.version=10.10.10

python client.py test

Note how we’re passing in the service name py-opentelemetry-manual-otlp-client, and service version 10.10.10. The service name will show up in the Lightstep explorer.

When you run the client app, it will continuously call the /ping endpoint. Let it run a few times (maybe 5-6 times-ish?), and kill it (à la ctrl+c). Sample output:

[Screenshot: Sample client.py output]

If you peek over at the terminal running server.py, you will likely notice a super-ugly stack trace. DON’T PANIC! The /ping service makes calls to Redis and MongoDB, and since neither of these services is running, you end up getting some nasty error messages like this:

[Screenshot: Sample server.py program run output with error]

5- See it in Lightstep

If you go to your trace view in Lightstep by selecting the py-opentelemetry-manual-otlp-client service from the explorer (you could also see the same thing by going to the py-opentelemetry-manual-otlp-server service), you’ll see the end-to-end trace showing the client calling the server, and the other functions called within the server.

And remember that stack trace in Step 4? Well, it shows up as an error in your Trace. Which is cool, because it tells you that you have a problem, and pinpoints where it’s happening! How cool is that??

[Screenshot: End-to-end trace sample of server.py and client.py in Lightstep]

And remember how we never passed our context to the redis_integration and server redis operation Spans? You can see that server redis operation rolls up to redis_integration, which rolls up to /ping, just like I said it would. Magic! 🪄

Final Thoughts

Today we learned how to manually configure OpenTelemetry for Python to connect to Lightstep (this also works for any Observability back-end that ingests the OTLP format). We also learned how to link related services together through manual context propagation.

Now, if you ever find yourself in a situation where you need to connect to your Observability back-end without the Python auto-instrumentation binary, or need to manually propagate context across services, you’ll know how to do it!

Now, please enjoy this cuddly little pile of rats. From front to back: Phoebe, Bunny, and Mookie. They were nice enough to sit still for the camera while my husband held them.

[Photo: Pile 'o rats! Featuring Phoebe, Bunny, and Mookie]

Peace, love, and code. 🌈 🦄 💫


Got questions about OTel instrumentation with Python? Talk to me! Feel free to connect through e-mail, or hit me up on Twitter or LinkedIn. Hope to hear from y’all!

September 20, 2022
12 min read
OpenTelemetry

About the author

Adriana Villela
