TracedPromise: Visualizing Async JS Code

July 20, 2016

·

Ben Cronin

Writing code in Node.js strongly encourages use of event-driven callbacks and asynchronous functions. There are plenty of advantages to writing event-driven code, but it can often obscure the control flow of operations.

We’ll look at how adding OpenTracing instrumentation to JavaScript ES6 Promise objects can allow you to visualize how asynchronous code behaves. We’ll then dig into the nuts and bolts of a TracedPromise class to demonstrate how you can see useful tracing information with only a small code change.

As the example throughout, we’ll use an instrumented Promise.all() call. The example contains fives sub-operations, including a single rejected Promise. The trace for the example looks like this:

TracedPromise

A bit about Promises

If you’re already a Promises expert, feel free to skip right along to the next section.

There are many ways to make coordinating asynchronous JavaScript code more sane. For example, Caolan McMahon’s async library is one immensely popular option that is not promises-based. There are also plenty of libraries that have a more promise-like API, though they may use terminology like futures, deferred, subscriptions, or callback objects. Libraries like the comprehensive Bluebird package and the concisely named ‘Q’ package are two of the more popular ones.

Promises have recently become a more interesting choice as the Promise class is a standard ES6 class* with pretty good support across platforms. With Promises as a standard API—and a rather effective, trim API at that—the process of writing asynchronous code is lot more natural and the code itself becomes less mentally taxing to read.

*ES6 being shorthand for ECMAScript 6th Edition. JavaScript is an implementation of ECMAScript.

Example code

Let’s look at the Promises “fail-fast” example, copied from the MDN documentation:

var p1 = new Promise((resolve, reject) => {
  setTimeout(resolve, 1000, "one");
});
var p2 = new Promise((resolve, reject) => {
  setTimeout(resolve, 2000, "two");
});
var p3 = new Promise((resolve, reject) => {
  setTimeout(resolve, 3000, "three");
});
var p4 = new Promise((resolve, reject) => {
  setTimeout(resolve, 4000, "four");
});
var p5 = new Promise((resolve, reject) => {
  reject("reject");
});

Promise.all([p1, p2, p3, p4, p5]).then(value => {
  console.log(value);
}, function(reason) {
  console.log(reason)
});

While this isn’t a full tutorial on using Promises, the general idea is that Promise objects allow for asynchronous operations to be written without resorting to deep nesting of callbacks. Functions that finish with a single resolve or reject callback are encapsulated into Promise objects, which in turn can be chained together or coordinated using a few standard Promise methods. These Promise method calls replace the deep nesting.

If you’re interested in more about why ES6 Promises are a good fit for JavaScript, check out runnable’s recent post on reasons to use Promises.

Visualizing the control flow with a trace

While the code is easier to write (and more importantly to read) with Promises, there’s still the challenge of knowing what actually happened at runtime. That’s precisely what tracing is for.

Let’s show the Promise.all() trace again. It’s based on the example code from the prior section (we’ll get to how this was instrumented in a moment):

TracedPromise

Very briefly, a trace is a collection of related spans. A span can be thought of as a single operation with well-defined start and finish times. A trace is a graph of the relationships between those spans. In the common case where a trace represents a single request, the trace is just a tree of spans with parent-child relationships. The Google research paper, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure, is a good place to head if you’d like to get more detail.

In the specific trace above, a five operations are started concurrently (p1 through p5). The Promise.all() is used to coordinate on the success or failure of that group of operations. The example code itself is contrived (each operation is simply a timeout), but it’s easy to imagine the operations as network, file, database, or other asynchronous requests. The five inner requests are issued and the outer operation cannot proceed until the child operations have either succeeded or failed.

Fail-fast behavior visualized

One quick insight from visualizing this trace: the Promise.all() function is, in fact, a “fail-fast” operation. The timing of the Promises.all call shows that control returns to its handler as soon as p5 fails. It does not wait for p4 to resolve to proceed.

For a standard function like all(), the insight is limited (it’s documented behavior after all). However, what’s happening is made clear from the trace rather what’s been manually documented. Visualizing the runtime behavior is especially helpful when applied to a production system where complexity and lack of documentation might make such insight otherwise unavailable.

Taking it for granted that tracing visualizations would be useful in a complex system: what code changes are needed to make an ES6 Promise into a building block for collecting tracing data about a system?

Instrumenting Promises

To keep the code simple and clean, we’ll create a new class called TracedPromise that closely mirrors a standard Promise. Mirroring the existing API allows for existing code to be adapted rapidly. This technique should apply equally well to other promise-like libraries and methods, making adapting something like Q’s spread method relatively easy.

A standard ES6 Promise with a little more data

The TracedPromise API should be as simple and familiar as possible. The intent of tracing is to make code easier to understand, not more complex, after all. There are only two additional pieces of information we need to make standard promises more effective for tracing:

  • Add a meaningful name for each promise: good names will make examining the data offline more effective.
  • Add a reference to the parent: Promise objects can be chained, but don’t have the notion of parent-chain relationships. In practice it is a not uncommon in systems of even moderate complexity to have deeply nested Promises. Exposing those relationships in the tracing data can be very valuable.

Example code—this time with tracing!

The original example needs a few modifications to use TracedPromise:

  • Create an outer span to capture the total work being done (i.e. to correspond to what Promise.all() is tracking)
  • Use a TracedPromise object in place of a standard Promise
  • Add arguments to give the TracedPromise objects names and parent-child relationships
// NEW: Set up an initial span to track all the subsequent work
let parent = opentracing.startSpan('Promises.all');

// NEW: Assign a parent and name to each child Promise
let p1 = new TracedPromise(parent, 'p1', (resolve, reject) => {
    setTimeout(resolve, 100, 'one');
});
let p2 = new TracedPromise(parent, 'p2', (resolve, reject) => {
    setTimeout(resolve, 200, 'two');
});
let p3 = new TracedPromise(parent, 'p3', (resolve, reject) => {
    setTimeout(resolve, 300, 'three');
});
let p4 = new TracedPromise(parent, 'p4', (resolve, reject) => {
    setTimeout(resolve, 400, 'four');
});
let p5 = new TracedPromise(parent, 'p5', (resolve, reject) => {
    setTimeout(reject, 250, 'failure!');
});

// NEW: Use TracedPromise.all() to finish the parent span
TracedPromise.all(parent, [p1, p2, p3, p4, p5]).then(value => {
    console.log(`Resolved: ${value}`);
}, reason => {
    console.log(`Rejected: ${reason}`);
});

The code has the same fundamental form and should be easily understood and recognizable to anyone familiar with an untraced Promise.

Implementation with OpenTracing

If you’re using a standard ES6 Promise, the TracedPromise class may be already directly usable in your code. In the case where you’re using another approach to dealing with asynchronous code, the details of the handful of changes needed to create TracedPromise should be generalizable to many similar libraries and primitives.

OpenTracing Spans

Tracing and OpenTracing are topics of their own. For the purposes of the code snippets below, the key idea is this: an OpenTracing Span tracks a meaningful operation from start to finish. To give a very quick example, here’s how a span could be used to track an asynchronous file read:

var span = Tracer.startSpan('fs.readFile');
fs.readFile('my_file.txt', 'utf8', function(err, text) { // ignoring errors!
	span.finish();
  	doStuffWithTheText(text);
});

In the case of a promise, the promise itself is the operation. A Span object should be used to track logically when the promise’s work first starts and when finishes.

First a few helpers

Promises work on the basis of resolve and reject callback functions which finish the promise (moving them out of the pending state to either a fulfilled or rejected state). Two simple helpers will be used to create wrapped versions of the resolve and reject functions. These will behave the same as the unwrapped versions but also finish the span associated with the Promise.

function wrapResolve(span, f) {
    return function (...args) {
        span.finish();
        return f(...args);
    };
}

function wrapReject(span, f) {
    return function (...args) {
        span.setTag('error', 'true');
        span.finish();
        return f(...args);
    };
}

(Quick note on a span that has an error: there’s still some discussion around “standard tags” in the OpenTracing community. The tracer used to generate the images, LightStep, treats spans with the tag 'error' as containing errors.)

The TracedPromise constructor()

The TracedPromise class encapsulates a paired Promise and OpenTracing Span.

The first two arguments to the constructor are used to set up the TracedPromise’s own span. The third argument mirrors the standard Promise constructor’s argument: a single callback that receives a resolve, reject pair. The helpers above are used to wrap the resolve and reject function objects in new function objects that will finish the span in concert with the promise.

constructor(parent, name, callback) {
    let span = opentracing.startSpan(name, { childOf : parent });
    let wrappedCallback = (resolve, reject) => callback(
        wrapResolve(span, resolve),
        wrapReject(span, reject)
    );
    this._promise = new Promise(wrappedCallback);
    this._span = span;
}

(An aside: TracedPromise intentionally does not inherit from Promise. The native promise implementation—at least as of Node v6.3.0—will create new Promise objects using the invoking object’s constructor, meaning it will internally create TracedPromise objects, not Promise objects. Using inheritance with TracedPromise would require code specifically be aware of this implementation detail, which is not desirable.)

The then() and catch() methods

The then method behaves very similarly to the constructor: it simply wraps the two function objects and then forwards them along to the underlying promise.

then(onFulfilled, onRejected) {
    return this._promise.then(
        wrapResolve(this._span, onFulfilled),
        wrapReject(this._span, onRejected)
    );
}

…and catch() is even simpler:

catch(onRejected) {
    return this._promise.catch(wrapReject(this._span, onRejected));
}

all() & race() methods

In the case of the static all method, the span should resolve as soon as the promise created by all resolves. Standard then and catch handles can do just this. The race() method has the same requirement. We’ll add one more helper and then implement all and race:

function chainFinishSpan(promise, span) {
    return promise.then((value) => {
        span.finish();
        return value;
    }, (reason) => {
        span.setTag('error', 'true');
        span.finish();
        return Promise.reject(reason);
    });
}

Back in the TracedPromise implementation:

static all(span, arr) {
    return chainFinishSpan(Promise.all(arr), span);
}

static race(span, arr) {
    return chainFinishSpan(Promise.race(arr), span);
}

reject() & resolve() methods

The final two static methods of Promise are reject and resolve. These are trivial to handle. Both immediately return a completed Promise. There’s no need to add tracing instrumentation as no on-going operation exists.

static reject(...args) {
    return Promise.reject(...args);
}

static resolve(...args) {
    return Promise.resolved(...args);
}

That’s it! The ES6 Promise is a slim API composed of only a handful of methods.

Adding tracing to real world systems

The exercise of creating a TracedPromise is intended to be a simple, but practical example of how tracing can make asynchronous control flow far easier to understand. The implementation details also show that adding tracing instrumentation is fairly straightforward.

Production systems in the real world aren’t all exclusively based on ES6 Promises, but most architectures of a certain scale are built around a handful of basic primitives and patterns. Whether they are Promise objects, Promise-like objects, control flow helper libraries, or custom in-house code, the general principles behind the instrumentation stay the same. A few simple additions to the core building blocks of an asynchronous system can a long way towards better understanding of the system itself.

OpenTracing is an open source project on Github. Contributions for tracing-enabled versions of common Node.js libraries and utilities would undoubtedly be welcome! LightStep is a distributed tracing system that relies on the OpenTracing standard.

The full code for this post is available at: github.com/opentracing-contrib/javascript-tracedpromise.

Ben Cronin
Written by Ben Cronin
on July 20, 2016