Hippies, Ants, and Healthy Microservices
by Ben Sigelman
This article originally appeared on Medium.
For any organization that expects its developers to produce powerful software, the decision to adopt microservices should be an easy one – developer velocity is king, and that's hard to come by in the byzantine build-test-release lifecycles of monolithic software architectures. But the initial commitment to adopt microservices is much simpler than the decisions that follow about how to structure that adoption: there are uncountably many blog posts addressing various and sundry technical details, and dozens of partially-overlapping solutions for every problem, even (especially?) for the ones you haven't encountered yet.
And yet, despite the mountains of content about the technical details, it surprises me how little has been written about the biggest failure mode I've seen out there in the wild: a fundamental misunderstanding of the goals surrounding a microservices migration, and how those goals best translate into engineering management practices. In particular, the conventional wisdom makes a microservices-oriented engineering organization sound like a hippie commune. But it should probably feel more like an ant colony.
Before I proceed, let it be known that I have a soft spot for the hippies of yore. I love idealists as long as they're peaceful, and you can't get much more peaceful or idealistic than a good, old-fashioned hippie. If the hippie ethos could be distilled into a single value, it would be the freedom to make independent decisions and act on them.
There are other posts offering greater detail (I'm especially fond of this article about the intersection of management and microservices from Vijay Gill, SVP Eng at Databricks), but to summarize: the only good reason to adopt microservices is to accelerate development through reduced human communication overhead. The idea is that each microservice gets its own development team, and these teams stay out of each other's way – i.e., they make completely independent decisions that further their own goals, and they try to allow others to do the same. It's like "the Me generation" for software.
But I don't think hippies have the right instincts for engineering management. What happens if we try to truly maximize the independence of distinct microservice teams? Every dev team chooses the language, frameworks, message queue, CI/CD strategy, and naming conventions (etc.) that make the most sense for their service and their expertise as a group. Since every service and situation is different, this appears to be a rational strategy: after all, aren't microservices about increasing parallelism in decision-making (if not the software itself)?
And yet there are many flavors of independence. Ants are certainly enterprising little creatures: they readily explore every nook and cranny, they can famously carry up to 50x their own body weight, and some species build architecturally marvelous structures for themselves. But they use social and utterly standardized behaviors (and some pheromones) to facilitate their own versions of load-balancing, discovery, security, and replication.
Service discovery for ants
While one can observe an individual ant and reason about its actions in the context of its environment, ants' most adaptive behaviors rely on "biological standardization." For instance, if an individual wanders its way to a plentiful food source, that ant will emit a "trail pheromone" and head straight back to the colony; its fellow ants pick up the scent and follow it back to the food source.
Similarly, when ants are in an alarmed or panicked state, they emit chemicals that alert their peers to the threat and protect the group. And so on and so forth: for every macrobehavior that benefits the colony, there is a standard chemical mechanism that all individuals understand and obey that facilitates that macrobehavior.
These collective adaptations have made ants one of the most "horizontally scalable" animals on Earth: the largest known ant colony stretches roughly 3,700 miles and is home to billions of individual organisms. They are remarkable animals!
There's no question that hippies are more independent than ants. And I suppose I should acknowledge that ants would make lousy engineering managers (they can't even drink coffee). But when we're spinning up microservices, we have a lot to learn from ants and other hive-minded animals: their reliance on the rigid standardization of certain functions facilitates optimal outcomes for the group as a whole.
There's always a temptation to allow each service team to decide on a language, a stack, and a set of primitives that feel familiar or appropriate to them. This is well-intentioned, as it seems to maximize the autonomy of the distinct service teams. But in a microservices deployment – especially at scale – we must also facilitate cross-cutting concerns like deployment, load-balancing, service discovery, security, and observability. If we encourage our two-pizza teams to make entirely independent decisions about each of these critical aspects, we are left with a monstrous challenge when operating our distributed application, especially as teams disband and services go into maintenance mode.
When transitioning towards a microservices architecture, it's best to create a limited number of choices – ideally only one – for each cross-cutting aspect of the larger system. For example:
- Programming language(s)
- Service (and infrastructure) naming conventions
- Orchestration and auto-scaling
- Web/RPC framework
- Service-to-service authentication
- Instrumentation for logging
- Instrumentation for tracing (I am obligated as a co-creator to plug OpenTracing for this)
- Instrumentation for metrics
- Service discovery
- Load balancing
- (and so on…)
By standardizing in these areas, a central team can manage these well-factored facets of the larger system, and the developers working on the microservices themselves can focus on what's most important: building something valuable.
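To make the idea of a central team managing these facets a little more concrete, here is a minimal sketch of how such a team might check a service against a short list of approved choices. Everything in it is an assumption for illustration: the `PLATFORM_STANDARDS` table, the `ServiceManifest` type, the `violations` function, and the specific tools named are hypothetical placeholders, not recommendations or anything prescribed above.

```python
from dataclasses import dataclass, field

# Hypothetical standards table: one approved choice (or a very short list)
# per cross-cutting concern. The values are arbitrary placeholders.
PLATFORM_STANDARDS = {
    "language": {"go", "java"},
    "rpc_framework": {"grpc"},
    "tracing": {"opentracing"},
    "metrics": {"prometheus"},
    "service_discovery": {"consul"},
}


@dataclass
class ServiceManifest:
    """What a service team declares about its own stack (illustrative)."""
    name: str
    choices: dict = field(default_factory=dict)


def violations(manifest):
    """Return human-readable problems; an empty list means compliant."""
    problems = []
    for concern, allowed in sorted(PLATFORM_STANDARDS.items()):
        chosen = manifest.choices.get(concern)
        if chosen is None:
            problems.append(f"{manifest.name}: no choice declared for {concern}")
        elif chosen not in allowed:
            options = ", ".join(sorted(allowed))
            problems.append(
                f"{manifest.name}: {concern}={chosen!r} is not approved "
                f"(allowed: {options})"
            )
    return problems


if __name__ == "__main__":
    search = ServiceManifest(
        name="search",
        choices={"language": "haskell", "rpc_framework": "grpc"},
    )
    for problem in violations(search):
        print(problem)
```

A check like this could plausibly run in CI when a new service is registered. The design choice worth noting is that service teams only declare their choices, while the approved list lives in one place owned by the central team.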