Lightstep Enables Lyft’s Move to Microservices
by Kristin Brennan
To scale its system rapidly and support growth, Lyft started to explore moving from a monolithic architecture to microservices. Today, Lyft deploys more than 200 microservices in its distributed architecture, and the number is growing. These services work together to perform fundamental functions of the Lyft app, including matching riders with drivers, optimizing routes for the most efficient ride, and processing riders’ payment information. Monitoring Lyft’s system quickly and accurately becomes harder as the number of microservices grows, because a distributed architecture generates far more data than its monolithic predecessor. To gain detailed insight into performance as efficiently as possible, Lyft chose to implement Lightstep.
Lightstep is the only solution that monitors 100% of transaction data, unsampled, and is always on in production environments with negligible overhead. With its unique architecture, Lightstep can capture a near-limitless amount of data and weave distributed trace data together into meaningful point-in-time stories about the application – even if the data was produced asynchronously or across distinct service boundaries. Lightstep considers every operation and automatically assembles traces for interesting events such as errors or latency spikes, as well as traces representative of normal operating behavior. Once assembled, these traces are stored indefinitely and can be reviewed at any time. By considering all of an application’s transactional data, Lightstep reliably detects one-in-a-million anomalies, unlike any other technology, and shows everything that happens both upstream and downstream from the event. Lyft’s systems generate more than 100 billion microservice calls per day. As Morelli stated, “With Lightstep, there is no risk of overlooking any problems at the edges where the biggest problems are found.”
View detailed end-to-end traces for complex, distributed transactions and make better critical-path optimizations
Monitoring and application performance insights from Lightstep also empower engineers to make critical-path optimizations that improve ride request times, increase dispatch efficiency, and ensure effective incident postmortems – all of which translate into increased revenue and developer efficiency. According to engineers Roy Williams and Danial Afzal, one of the first projects on which they used Lightstep was a spring cleaning of the entire system, focused on identifying and optimizing critical paths for the dispatch services that connect riders with drivers. Lyft improved the efficiency of customer ride routes and accelerated response times by 60% (250 milliseconds). Saving time is a key goal, explained Williams: “The more time we get, the more efficient we can be. If we can use those extra milliseconds to find a more efficient match, that’s a win for us, that’s a win for our customers.”
Read the case study, Lightstep Enables Lyft’s Move to Microservices, Helping Drive Significant Revenue and Improving Product Efficiency, and get all of the details about Lyft’s success.