I had a lovely time at QCon London earlier this month. I had the opportunity to present on a few of my favorite topics (hint: they all involve microservices) and also got to chat with devs/devops building many different flavors of powerful software for companies of all shapes and sizes. (As a side note, the vendor areas at most tech conferences seem to be fluorescent-lit, windowless rooms – not so at QCon London! We all had a beautiful floor-to-ceiling view of Westminster Abbey. Not bad!)
I’ve been to a number of tech conferences in Europe over the years. Things felt qualitatively different this time around. In the past, it seemed like enterprise software developers in the E.U. were curious about microservices and other distributed architectures, but they were still stuck with their monoliths for various practical reasons. “Tracing” and “serverless” were similarly foreign, at least in production.
Fast forward to 2019: Microservices have gone mainstream. It was remarkable how far microservices – as well as the problems they introduce – have proliferated, especially at older, traditionally more risk-averse companies. This is no doubt due to the strength of the evidence in favor of a transition to microservices; for instance, Sarah Wells gave a wonderful keynote presentation where she documented, with evidence, how the Financial Times increased their release velocity more than 100x by switching to microservices. It’s all very compelling and hard to ignore.
Granted, from a certain perspective, nothing has changed. Teams still need to provide an excellent (and speedy) product experience for their end users, they need to ship code faster, and they need to resolve incidents more quickly. How can we make all of this possible? What can we do to help organizations develop with confidence despite the growing complexity of their modern, distributed systems?
Perhaps we can come to agreement on a few guiding principles:
1. Observability must be service-centric
We can do a much better job transforming signals (spans, traces, etc.) into insights when we have clear objective functions. For example, once a service team declares their SLIs – and clearly states which metrics serve as indicators of the health of their service – our tools have an objective function to work with: p99 latencies, error rates, throughput, etc. This clarity lends itself to meaningful automation. Everything from automatic rollbacks (based on SLI latency thresholds) to dynamic, contextual analysis of spans and traces is suddenly possible.
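To make the idea concrete, here is a minimal sketch of an SLI used as an objective function for an automated rollback decision. All names here (`ServiceSLI`, `should_roll_back`, the thresholds) are illustrative assumptions, not a real deploy-pipeline API:

```python
from dataclasses import dataclass

@dataclass
class ServiceSLI:
    """A service team's declared health indicators (hypothetical shape)."""
    p99_latency_ms: float   # threshold for 99th-percentile latency
    max_error_rate: float   # e.g. 0.01 means 1% of requests may fail

def should_roll_back(sli: ServiceSLI, observed_p99_ms: float,
                     observed_error_rate: float) -> bool:
    """A deploy pipeline can gate a canary on the declared SLIs:
    if either indicator is violated, the rollout is reverted automatically."""
    return (observed_p99_ms > sli.p99_latency_ms
            or observed_error_rate > sli.max_error_rate)

checkout_sli = ServiceSLI(p99_latency_ms=250.0, max_error_rate=0.01)
# Latency SLI violated, so the canary should be rolled back:
print(should_roll_back(checkout_sli, observed_p99_ms=310.0,
                       observed_error_rate=0.002))  # True
```

The point is not the ten lines of logic but the contract: once the team has written the SLI down, tooling has something unambiguous to optimize against.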
2. Tracing isn’t just for microservices
Traces should absolutely achieve coverage of the modern, progressive services in a production deployment – but they should also account for overall time spent in mobile apps, web clients, and monoliths. In fact, it’s the best way to understand their interdependence. An Android dev may think about latency only in terms of the literal end user, whereas a backend engineer’s mind is likely focused on their particular service, but in distributed systems both developers are working on components that depend on each other. Mapping the journey of a transaction – from swipe to servers – is needed if we expect to form a nuanced understanding of systemic issues afflicting modern applications.
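As a sketch of what “swipe to servers” means in practice, the toy trace below stitches spans from a mobile client, a monolith, and a backend service into one view and summarizes where the end-to-end time went. The `Span` shape and component names are invented for illustration and are not a real tracing API:

```python
from dataclasses import dataclass

@dataclass
class Span:
    component: str    # e.g. "mobile", "monolith", "checkout-service"
    start_ms: float
    end_ms: float

def time_per_component(trace: list[Span]) -> dict[str, float]:
    """Total duration recorded by each component across one trace."""
    totals: dict[str, float] = {}
    for s in trace:
        totals[s.component] = totals.get(s.component, 0.0) + (s.end_ms - s.start_ms)
    return totals

trace = [
    Span("mobile", 0.0, 900.0),             # swipe-to-render, the user's view
    Span("monolith", 100.0, 700.0),         # legacy API layer in the middle
    Span("checkout-service", 150.0, 650.0), # the backend engineer's view
]
print(time_per_component(trace))
# {'mobile': 900.0, 'monolith': 600.0, 'checkout-service': 500.0}
```

The Android developer’s 900 ms and the backend engineer’s 500 ms are views of the same transaction; only a trace that includes both components makes that relationship visible.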
3. There’s simply “too much signal”
Some say that observability has a "signal-to-noise" problem. I’d say it's deeper than that: there's simply too much signal. None of it should be discarded – it is signal, after all! – but we need tools to detect the actionable patterns and surface them for us. Simply discarding outliers because they are infrequent runs contrary to the very purpose of observability: to understand the inner workings of a system by its outputs. Does this mean we need to manually analyze every span? No – we don’t have the time or the brainpower to do so without assistance. But by using tools that ingest the firehose in its entirety, we can begin to understand and build a strong, evidence-based case about the root cause of complex, multifactorial problems in production.
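One way to “surface rather than discard,” sketched here under simplifying assumptions: ingest every latency measurement and flag the anomalous ones with a robust z-score (median absolute deviation), so a handful of extreme points can’t mask themselves. This is an illustrative stand-in for the contextual analysis real tools perform, not a description of any particular product:

```python
import statistics

def surface_outliers(latencies_ms: list[float], threshold: float = 3.5) -> list[float]:
    """Return measurements whose robust z-score exceeds the threshold.
    Nothing is dropped from the ingested data; outliers are flagged, not deleted."""
    med = statistics.median(latencies_ms)
    mad = statistics.median(abs(x - med) for x in latencies_ms)
    if mad == 0:
        return []  # no spread at all, nothing to flag
    # 0.6745 rescales MAD so the score is comparable to a standard z-score
    return [x for x in latencies_ms if abs(0.6745 * (x - med) / mad) > threshold]

samples = [12.0, 11.5, 13.1, 12.4, 250.0, 12.8, 11.9]
print(surface_outliers(samples))  # [250.0] – the slow request is surfaced
```

The 250 ms request is exactly the kind of infrequent-but-real event that averaging or sampling would bury; keeping the whole firehose and flagging it is what makes the evidence-based case possible.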
4. Serverless means too many things
It’s problematic that “Serverless” has come to mean everything from FaaS in general, to “nanoservices,” to edge compute functions. It’s high time that we choose more self-descriptive terms, or we will inevitably end up talking past each other. ETL processes ported to Lambda and S3 are completely different from latency-sensitive consumer-facing products, even if they’re all “serverless.” As a trend, “serverless” is worth understanding, but it’s so broad that it’s difficult to have a coherent discussion about problems and solutions.
For those who were at QCon and those who weren’t, I’d love to get your feedback on these ideas. You can find me on twitter @el_bhs or drop me a line over old-fashioned email if you’d like to use more than 280 characters.
Interested in joining our team? See our open positions here.