Bring observability to your GitHub workflows with Lightstep

In modern software development, where a single team might be responsible for dozens of services, it’s increasingly hard to connect an intentional change in code to a performance change in a production or pre-production environment. The Lightstep Services Change Report GitHub Action helps address that with a combination of telemetry, the new Lightstep Snapshots API, and flexible integration into GitHub workflows.

Connecting GitHub telemetry to code with OpenTelemetry

Chances are you look at a number of dashboards for a service after a major code change is deployed. If something unexpected happens, one chart might look something like this:

[Chart: a service’s error percentage spiking after a code change]

Correlating a troubling error spike like the one above to a specific code change requires the traces, metrics, or logs you collect to have an attribute that points to the service’s version, which could be a reference to a specific commit or a release. Many continuous integration (CI) systems, including GitHub Actions, automatically expose the version (or git commit) as an environment variable.
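In GitHub Actions, that variable is GITHUB_SHA. One subtlety is that the commit SHA needs to reach the service’s runtime environment, not just the Actions runner. As a minimal sketch (the deploy step and script name here are placeholders for whatever your pipeline actually does), a workflow can forward it like this:

steps:
  - uses: actions/checkout@v2

  - name: Build and deploy
    # deploy.sh is a stand-in for your own tooling; it should pass
    # SERVICE_VERSION through to the service's runtime environment.
    run: ./deploy.sh
    env:
      SERVICE_VERSION: ${{ github.sha }}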

The recent release of Lightstep’s OpenTelemetry Launchers makes annotating telemetry with version information straightforward: when setting up tracing for a service, it’s just one line of code to add a version attribute. Here’s an example of a launcher that uses a SHA, provided as an environment variable in GitHub Actions, as the service version:

const { lightstep } = require('lightstep-opentelemetry-launcher-node');

const sdk = lightstep.configureOpenTelemetry({
  accessToken: 'YOUR ACCESS TOKEN',
  serviceName: 'my-awesome-service',
  // GITHUB_SHA points to the exact commit this build came from
  serviceVersion: process.env.GITHUB_SHA,
});

sdk.start();

We see this type of instrumentation not only as a best practice, but also as key to enabling workflows that bridge the gap between code reviews and CI checks on one side and what that code is doing in production on the other. We think the best way to do this in a robust, vendor-neutral way is with OpenTelemetry.

As founders and core contributors to the OpenTelemetry project, we’ve recently donated code that can automatically annotate projects with GitHub metadata provided in an Actions runner environment, like repository name and version number. This code will be available for anyone to use in the Node.js OpenTelemetry library.
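Until that detector is available, you can approximate it by hand. Here’s a rough sketch (not the donated detector itself) that reads a few of the environment variables GitHub Actions sets on every runner and turns them into OpenTelemetry resource attributes; the attribute keys are illustrative, and the resulting Resource would be merged into whatever tracer provider or SDK you’re configuring:

const { Resource } = require('@opentelemetry/resources');

// GITHUB_* variables are set automatically on GitHub Actions runners.
// The attribute names below are illustrative, not a fixed convention.
const gitHubResource = new Resource({
  'github.repository': process.env.GITHUB_REPOSITORY,
  'github.sha': process.env.GITHUB_SHA,
  'github.run_id': process.env.GITHUB_RUN_ID,
});

// Merge gitHubResource into the resource of your tracer provider or SDK.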

Taking Snapshots for telemetry insights

Think of Lightstep Snapshots as a “big bag of useful telemetry” captured at a specific point in time: a collection of all the traces and their attributes, with detailed performance and error information about code running in an environment. Snapshots are created automatically when you use the Lightstep UI, but they are also available through the API. Once a snapshot is created, it’s saved so it can be shared or analyzed later. Here’s the Explorer interface in Lightstep, which has tools for analyzing a snapshot:

[Screenshot: the Lightstep Explorer view for analyzing a snapshot]

The Lightstep Services Change Report Action takes snapshots and analyzes them to understand what changed over time.

Here’s the start of a workflow file that takes a snapshot of code that was just deployed from GitHub:

name: Lightstep Post-Deploy Check
on:
  deployment_status:

jobs:    
  postdeploy_check_job:
    runs-on: ubuntu-latest
    name: Compare Snapshots
    if: github.event.deployment_status.state == 'success'
    steps:  
      - name: Checkout
        uses: actions/checkout@v2

      - name: 📸 Take Lightstep Snapshot
        id: take-snap
        uses: lightstep/lightstep-action-snapshot@v2
        with:
          lightstep_api_key: ${{ secrets.LIGHTSTEP_API_TOKEN }}
          # points to the code that's being deployed in this workflow
          lightstep_snapshot_query: '"service.version" IN ("${{ github.sha }}")'
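Right after taking the snapshot, it helps to pause for a few minutes so the snapshot has time to collect telemetry before anything is compared. A simple way to do that is a plain run step; the three-minute sleep here is an arbitrary choice to tune for your traffic volume:

      # Give the snapshot a few minutes to collect telemetry before comparing.
      - name: Wait for telemetry
        run: sleep 180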

Later on in the same workflow file, after that pause to let the snapshot collect telemetry, we can compare the new snapshot with the most recent one available for that repository:

      - name: Compare Snapshots
        id: lightstep-snapshot
        uses: lightstep/lightstep-action-snapshot@v2
        with:
          lightstep_api_key: ${{ secrets.LIGHTSTEP_API_TOKEN }}
          lightstep_snapshot_compare_id: '*'
          lightstep_snapshot_id: ${{ steps.take-snap.outputs.lightstep_snapshot_id }}

The Action automatically adds a summary of how the service changed between snapshots to a pull request associated with the deploy. From there, it’s one click to view errors from Rollbar, see on-call information from PagerDuty, or dive into root-cause analysis inside Lightstep.

[Animation: the Lightstep summary on a GitHub pull request]

Services Change Report: Catching problems using traces

The Action also lets you configure violations, which appear when traces collected in the snapshot meet conditions specified in a .lightstep.yml configuration file committed to the project’s GitHub repository.

Here’s an example .lightstep.yml that flags traces from the web service in the demo project that are collected in a specific AWS region, contain 500 errors, or make requests to a service that isn’t allowed:

# configuration file for Lightstep GitHub Action

organization: LightStep
project: demo

services:
  web:
    violations:
      - name: No requests to currency service
        type: connection
        value: krackend-api-gateway
      - name: No 500s Allowed
        type: span.attributes
        key: http.status_code
        op: equals
        value: 500
      - name: No us-east-2
        type: span.attributes
        key: cloud.region
        op: equals
        value: us-east-2

If any violations are detected, they appear in the summary of the service:

[Screenshot: violations listed in the Services Change Report summary]

Violations based on traces give teams flexibility to catch code that’s violating best practices or internal policies, like making a call to an outside service or being deployed to the wrong datacenter.

GitHub workflows meet observability

Part of GitHub Actions’ popularity is its flexibility: as a “cause and effect” API, it lets you define a workflow tailored to how you or your team develops software, and then make that workflow more efficient.

Lightstep’s Pre-Deploy Check and Services Change Report can be used in any workflow where observability is useful. Our initial examples involve taking snapshots before and after deploys to different environments, during code reviews, or when attaching performance data to related GitHub issues:

[Diagram: example GitHub Action workflows using snapshots]
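As a sketch of one of those other workflows, here’s what a code-review check might look like: it takes a snapshot of the version currently running on the pull request’s base branch whenever a pull request is opened. The trigger and inputs mirror the post-deploy example above, and the query assumes that base-branch version is deployed somewhere and reporting telemetry:

name: Lightstep Code Review Snapshot
on:
  pull_request:

jobs:
  code_review_snapshot:
    runs-on: ubuntu-latest
    steps:
      - name: 📸 Snapshot the base branch version
        uses: lightstep/lightstep-action-snapshot@v2
        with:
          lightstep_api_key: ${{ secrets.LIGHTSTEP_API_TOKEN }}
          # Snapshot the commit the pull request is based on
          lightstep_snapshot_query: '"service.version" IN ("${{ github.event.pull_request.base.sha }}")'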

The more snapshots you take during software development, the more opportunities you have to understand how a code change is impacting your service, without leaving GitHub. If you’re changing tabs from GitHub to a dashboard, that’s potentially a place to try out the Action.

We’re excited to see the workflows you build with our new actions, and how you use observability and Lightstep to build better services in GitHub.


Interested in joining our team? See our open positions here.

December 8, 2020 · 5 min read · Observability

About the author: Clay Smith
