In this post I want to walk you through a recent project I worked on at LightStep: building a Slack integration to make it easier for developers to share insights about their distributed systems.
Why am I working on this?
One of my team’s goals is to increase LightStep’s reach within our existing customer base. We brainstormed ways to achieve this goal and one area we landed on was improving the sharing experience. But to do so, we needed to understand how our customers communicate within their organization. Luckily, the answer was pretty consistent: Slack. With the vast majority of our customers using the same communication platform, we determined that it was worth building a prototype integration.
What is the actual deliverable?
LightStep can do many things, but one of the most highly trafficked pages is our Trace view. The Trace view is where a user is able to view the spans that make up a single logical request. It is rich in data and is the foundational unit making up our product. As a team, we made a bet that the Trace view was the first part of our app we should make more shareable.
Slack has a cool feature where it shows previews of links that you post. They call this unfurling. Many public websites include <meta> tags that give Slack the information it needs to build the preview automatically. However, an application that requires user login needs a custom integration. This motivates the project deliverable:
When a user shares a trace view URL in Slack, a preview message will be shown providing details about the active span and summary information about the trace as a whole.
One of the nice things about creating an integration with a big (now public) company is that they have the resources to create helpful documentation. We did our best to follow their documentation point by point and, for the most part, it was complete!
The fun starts when a user posts a link to a LightStep trace in their Slack account (with the LightStep App installed). Slack will identify that the URL posted matches a prefix string supplied by our application (app.lightstep.com) and then send a link_shared event request to api.lightstep.com/integration/slack. Our Kubernetes Ingress Controller is configured with a path routing rule that directs all /integration/slack/* paths to a service that exposes an HTTP server capable of servicing the request.
Let’s take a look inside the HTTP server. The Ingress Controller monitors the health of the backends it routes to, so we set up a liveness handler that responds with a 200 OK message as long as the server is healthy. All of the interesting stuff begins with the BaseEventHandler. BaseEventHandler is responsible for doing high-level validation of the request and then delegating the request to the specific HTTPHandler capable of responding to it. The validation done at this level is twofold (and based on Slack’s recommendations):
- Ensure the request timestamp in the request header X-Slack-Request-Timestamp is a near match to the current server time
- Ensure that the HMAC provided in the request header X-Slack-Signature matches the HMAC computed on the server
If either of these validations fails, an error HTTP code is returned to Slack. Once validation passes, BaseEventHandler inspects the type field in the JSON request body. If type == url_verification then the request is delegated to the URLVerificationHandler. URLVerificationHandler is only invoked when setting up a new Slack App or when changing the URL that Slack will send events to. Its sole job is to echo out a challenge parameter, a mechanism to prove that we own the domain we are registering. Once this happens for the first time, it won’t need to be invoked again for the lifetime of the application.
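To make the two validation checks concrete, here is a minimal Python sketch. It is not our production code. It assumes Slack's documented signing scheme as I understand it (an HMAC-SHA256 over the string v0:<timestamp>:<body>, keyed with the app's signing secret and prefixed with v0=) and a five-minute tolerance for the timestamp. The function names and the MAX_SKEW_SECONDS constant are made up for illustration.

```python
import hashlib
import hmac
import time

# Assumed tolerance for "a near match to the current server time".
MAX_SKEW_SECONDS = 60 * 5


def compute_signature(signing_secret: str, timestamp: str, body: bytes) -> str:
    """Compute the value we expect to see in X-Slack-Signature."""
    base = b"v0:" + timestamp.encode() + b":" + body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    return "v0=" + digest


def is_valid_request(signing_secret, timestamp, signature, body, now=None):
    """Return True only if both the timestamp and the HMAC checks pass."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except (TypeError, ValueError):
        return False
    # Check 1: reject requests whose timestamp is too far from server time.
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False
    # Check 2: compare HMACs in constant time to avoid leaking timing information.
    expected = compute_signature(signing_secret, timestamp, body)
    return hmac.compare_digest(expected.encode(), str(signature).encode())
```

The timestamp check is what stops a captured request from being replayed later, since the timestamp is also part of the signed string.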
The more common case is that type == event_callback. In this case, HTTPHandler delegation is determined by inspecting the type of a nested JSON object called event. If event.type == link_shared then the request is delegated to the LinkSharedHandler. We added this extra layer of delegation to make it easy for us to support other event types in the future. All we will need to do is add the new event type to what is essentially a switch statement.
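The two layers of delegation can be sketched roughly like this. It is an illustration in Python rather than our actual handler code: the function names and return values are hypothetical, and it assumes the event type lives in the type field of the nested event object, as in Slack's Events API payloads.

```python
def handle_url_verification(payload: dict) -> dict:
    # Echo the challenge back to prove we control the endpoint.
    return {"challenge": payload["challenge"]}


def handle_link_shared(event: dict) -> dict:
    # Stand-in for LinkSharedHandler: just collect the URLs that were posted.
    return {"links": [link["url"] for link in event.get("links", [])]}


# Supporting a new event type means adding one entry here.
EVENT_HANDLERS = {
    "link_shared": handle_link_shared,
}


def dispatch(payload: dict) -> dict:
    """First layer: outer request type. Second layer: inner event type."""
    kind = payload.get("type")
    if kind == "url_verification":
        return handle_url_verification(payload)
    if kind == "event_callback":
        event = payload.get("event", {})
        handler = EVENT_HANDLERS.get(event.get("type"))
        if handler is not None:
            return handler(event)
        return {"ignored": event.get("type")}
    return {"ignored": kind}
```

A lookup table plays the role of the switch statement here, so unknown event types fall through to a harmless "ignored" result instead of raising.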
LinkSharedHandler is responsible for determining the type of resource represented by the link, fetching information about that resource, and then packaging that data into a format suitable for a Slack preview. For this project we only built support for Trace links, but as we get customer feedback we might extend this to support links to other resources within LightStep. The first thing LinkSharedHandler does is return a 200 OK response to Slack. This is necessary to avoid Slack timing out the request, which can even lead to your app being disabled! After this, the LinkSharedHandler queues up the event payload for additional processing.
A background process pulls from the queue and begins processing. First, the URL in the event is parsed to determine the resource it represents. Right now we only support Trace URLs.
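The acknowledge-first, process-later pattern looks roughly like the sketch below. It uses an in-process queue and a worker thread purely for illustration; the names are hypothetical, and the real queue could just as well be an external system. The processed list stands in for the parsing, authorization, and unfurl work described in the rest of this post.

```python
import queue
import threading

work_queue = queue.Queue()
processed = []  # stand-in for the real processing results


def link_shared_handler(event: dict) -> int:
    """Enqueue the event and acknowledge right away so Slack does not time out."""
    work_queue.put(event)
    return 200


def worker() -> None:
    """Background loop: pull events off the queue until a None sentinel arrives."""
    while True:
        event = work_queue.get()
        try:
            if event is None:
                return
            for link in event.get("links", []):
                processed.append(link["url"])
        finally:
            work_queue.task_done()


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The handler never waits on the slow work, so the time it takes to answer Slack does not depend on how long it takes to fetch a trace.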
An example trace URL is below, with the fields relevant to this project called out.
| Field | Purpose | Example value |
| --- | --- | --- |
| Project | Used to scope the resource | lightstep-demo |
| Resource Type | Determines what resource is being accessed | trace |
| Seed Span | Span ID that indexes the trace | 0f5653fd9bbf6dc9 |
| Selected Span | Span ID selected in the UI | 250e3883c1f9e4d7 |
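Here is a rough sketch of pulling those four fields out of a link. Please note that the URL layout in this example is invented for illustration: I am assuming a path of /<project>/<resource type> and query parameters named seed_span and selected_span, which is not necessarily how real LightStep trace URLs are laid out. The TraceLink type and parse_link function are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlparse

SUPPORTED_RESOURCE_TYPES = {"trace"}


@dataclass(frozen=True)
class TraceLink:
    project: str
    resource_type: str
    seed_span: str
    selected_span: Optional[str]


def parse_link(url: str) -> Optional[TraceLink]:
    """Return the parsed link, or None if it is not a resource we support."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed layout: https://app.lightstep.com/<project>/<resource type>?...
    if parsed.netloc != "app.lightstep.com" or len(parts) != 2:
        return None
    project, resource_type = parts
    if resource_type not in SUPPORTED_RESOURCE_TYPES:
        return None
    query = parse_qs(parsed.query)
    seed = query.get("seed_span", [None])[0]  # assumed parameter name
    if seed is None:
        return None
    selected = query.get("selected_span", [None])[0]  # assumed parameter name
    return TraceLink(project, resource_type, seed, selected)
```

Returning None for anything unrecognized keeps the worker from attempting a preview for links it does not understand.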
Once we have parsed the URL and confirmed that it is for a resource type that we support, we verify that the Slack user is authorized to view the link they posted. This ensures that someone at imaginary customer BEEMO can’t construct a valid link for customer ACME and view the data. As part of this authorization, we require users to go through an OAuth flow (described below) in which the LightStep client requests permissions to read/write links data in Slack and maps the LightStep user to the Slack user. Since each event request contains the Slack user who initiated the call, we are able to map that user to the corresponding LightStep user and then make sure that they are authorized to view the Project in the URL. If the user is authorized, the trace is fetched from storage, its features are analyzed, and then a POST request is sent to Slack’s chat.unfurl endpoint with the link preview information.
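A simplified sketch of the authorization check and the chat.unfurl request body is below. The dictionaries standing in for our user mapping and project membership are hypothetical, as are the attachment contents. The channel, ts, and unfurls arguments are the ones chat.unfurl expects as far as I know, with unfurls being a JSON-encoded map from each URL to its preview attachment. The sketch only builds the request body and does not send it.

```python
import json


def is_authorized(slack_user_id, project, slack_to_lightstep, project_members) -> bool:
    """Map the Slack user to a LightStep user and check access to the project."""
    lightstep_user = slack_to_lightstep.get(slack_user_id)
    if lightstep_user is None:
        return False  # the user has not gone through the OAuth flow yet
    return lightstep_user in project_members.get(project, set())


def build_unfurl_request(channel, message_ts, url, active_span, trace_summary) -> dict:
    """Build the body of the POST to chat.unfurl for one link."""
    attachment = {
        "title": active_span,    # details about the active span
        "title_link": url,
        "text": trace_summary,   # summary of the trace as a whole
    }
    return {
        "channel": channel,
        "ts": message_ts,
        "unfurls": json.dumps({url: attachment}),
    }
```

Keeping the authorization check separate from the payload builder makes it easy to test that an unauthorized user never reaches the point where trace data is fetched.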
In order to ensure that Slack users can only see data for links that they have access to in LightStep, we needed to implement an OAuth 2.0 flow that users would be sent through before unfurling any trace links they post. Here is a nice visual overview of the OAuth flow. I also recommend at least skimming through RFC 6749 where it is defined.
Image Source: Slack
Step 0 - Set Up
To ensure users are authenticated, when we receive a link_shared event from a user who has not gone through the OAuth flow, we will call the chat.unfurl endpoint with the following two parameters set: user_auth_required = true and user_auth_url = https://app.lightstep.com/integration/auth/slack. That URL prompts the user to log into LightStep and then takes them to a page in the webapp that contains an “Add to Slack” button. When the user clicks this button it triggers the OAuth process by calling https://app.lightstep.com/auth/slack/initiate.
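For illustration, the request body for this "please authenticate first" case might be built like the sketch below. It assumes the second parameter is Slack's user_auth_url argument to chat.unfurl and that channel and ts come from the link_shared event. The function name is hypothetical and nothing is sent over the network.

```python
AUTH_URL = "https://app.lightstep.com/integration/auth/slack"


def build_auth_required_request(channel: str, message_ts: str, auth_url: str = AUTH_URL) -> dict:
    """Ask Slack to prompt the user to authenticate instead of showing a preview."""
    return {
        "channel": channel,
        "ts": message_ts,
        "user_auth_required": True,
        "user_auth_url": auth_url,
    }
```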
Step 1 - Authorization Request
https://app.lightstep.com/auth/slack/initiate redirects the user to https://slack.com/oauth/authorize with the following parameters:
| Parameter | Purpose | Value |
| --- | --- | --- |
| State | Unique string used for preventing CSRF attacks | Randomly generated string that is mapped to the logged-in LightStep user |
| Scope | Permissions that our LightStep App is requesting from Slack | |
| Redirect_uri | URL for Slack to redirect back to after user approves/denies permission request | |
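Building that redirect can be sketched as follows. This is an illustration rather than our actual code: the in-memory pending_states dictionary stands in for wherever the state-to-user mapping is really stored, the function name is hypothetical, and the client_id parameter is included because an OAuth 2.0 authorization request needs to identify the client.

```python
import secrets
from urllib.parse import urlencode

AUTHORIZE_URL = "https://slack.com/oauth/authorize"

# Hypothetical in-memory store mapping each state value to a LightStep user.
pending_states = {}


def begin_authorization(lightstep_user_id: str, client_id: str, redirect_uri: str, scope: str) -> str:
    """Generate a state value, remember who it belongs to, and build the redirect URL."""
    state = secrets.token_urlsafe(32)  # unguessable, so it can defend against CSRF
    pending_states[state] = lightstep_user_id
    query = urlencode({
        "client_id": client_id,
        "scope": scope,
        "redirect_uri": redirect_uri,
        "state": state,
    })
    return f"{AUTHORIZE_URL}?{query}"
```

The important property is that the state value is random and tied to the logged-in user, so a callback carrying someone else's state can be rejected later.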
Step 2 - Authorization Grant Received
After the user confirms the requested scopes, Slack will redirect the user to the redirect_uri specified, with a code parameter and the state parameter from Step 1 attached.
Step 3 - Authorization Grant Exchanged for Access Token
First, we need to verify that the state parameter matches the one sent in Step 1. To do this, we perform a lookup using state and the currently logged-in user’s user_id and verify that there is a match. Assuming a legitimate match, a request to https://slack.com/api/oauth.access is sent along with the code from the previous step and some information about the LightStep App: client_id and client_secret. These two values are provided to us by Slack upon creating a Slack App.
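Here is a small sketch of this step. The pending_states argument is a hypothetical dictionary mapping each state value issued in Step 1 to a LightStep user, and both function names are made up. The second function only assembles the form fields for the oauth.access call and does not perform the HTTP request.

```python
import hmac

TOKEN_URL = "https://slack.com/api/oauth.access"


def verify_state(pending_states: dict, state: str, logged_in_user_id: str) -> bool:
    """Check that the state was issued to the user who is currently logged in."""
    # pop() makes each state single-use, so a replayed callback fails.
    expected_user = pending_states.pop(state, None)
    if expected_user is None:
        return False
    return hmac.compare_digest(str(expected_user).encode(), str(logged_in_user_id).encode())


def build_token_request(code: str, client_id: str, client_secret: str):
    """Return the URL and form fields used to exchange the code for a token."""
    form = {
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,
    }
    return TOKEN_URL, form
```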
Step 4 - Access Token Received
After verifying that the incoming request looks good, Slack will respond to the previous request with an access_token enabling LightStep access to the scopes requested earlier in the flow. Specifically, the response contains the following fields:
| Field | Description |
| --- | --- |
| access_token | The OAuth token granting LightStep the requested permissions |
| user_id | The user’s user id in Slack |
| team_id | The user’s team (workspace) id in Slack |
| team_name | The user’s team (workspace) name in Slack |
| scope | The scopes granted for this OAuth token |
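Once the response arrives, those fields need to be stored alongside the LightStep user. The sketch below shows one way that could look. The SlackGrant type, the grants dictionary, and the function names are hypothetical, and the check of the ok field assumes Slack's usual convention of reporting success or failure in the response body.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlackGrant:
    access_token: str
    user_id: str
    team_id: str
    team_name: str
    scope: str


def parse_token_response(body: dict) -> SlackGrant:
    """Turn the decoded JSON response into a typed record, or raise on failure."""
    if not body.get("ok"):
        raise ValueError(body.get("error", "unknown_error"))
    return SlackGrant(
        access_token=body["access_token"],
        user_id=body["user_id"],
        team_id=body["team_id"],
        team_name=body["team_name"],
        scope=body["scope"],
    )


# Hypothetical store: (Slack team id, Slack user id) -> (LightStep user, grant).
grants = {}


def remember_grant(lightstep_user_id: str, grant: SlackGrant) -> None:
    grants[(grant.team_id, grant.user_id)] = (lightstep_user_id, grant)
```

Keying the store on the Slack team and user ids is what later lets an incoming event be matched back to a LightStep user.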
Step 5 - Use the Token
Now that the user has authenticated with LightStep and Slack, when they post a Trace link in Slack, we will be able to match their Slack User ID to their LightStep user, verify that they have permission to view the trace they linked, and then use the access token we were granted to send the unfurl preview!
We did it! We built v0 of our Slack integration.
We are excited to hear feedback from our customers about what they do and don’t like about this trace preview. We hope to iterate on this integration by enabling previews for other LightStep features such as Streams and Explorer.
If you’d like to test out our Slack integration or any other LightStep features, you can get started in less than 10 minutes with a free Developer Account.