How Simplebet uses Lightstep’s observability platform to handle millions of bets
Headquarters: San Francisco, CA
Industry Segment: Computer Software
Experiencing latency with unknown causes
Incomplete system visibility
Mixed naming conventions leading to team confusion
Clear context around what went wrong and why during an issue
Enhanced DevOps culture with OpenTelemetry and Observability
Launched a successful and profitable product with the help of Lightstep
Simplebet is a sports technology and betting company focused on unlocking a new type of sports betting called Micro-Markets. Rather than betting on the outcome of a game (win or loss), Micro-Markets turn every moment of every sporting event into a betting opportunity. You’re able to bet on every plate appearance in a baseball game, every drive in a football game, every shot in a basketball game.
Simplebet has a B2B data product that allows others, whether they are sports betting companies or media companies, to connect to a live stream of odds and surface Micro-Markets themselves.
In order to have complete system visibility and reduce Mean Time to Resolution (MTTR) for all aspects of their system, a wide range of Simplebet teams from machine learning to pricing to front-end use Lightstep on a daily basis. Each team proactively monitors their system with real-time system diagrams to see precisely where an error is, along with using Lightstep’s Change Intelligence, which automatically surfaces metrics, traces, and logs correlated with latency and errors.
Simplebet recently launched its first product — Micro-Market sports betting — in September 2020. This created immediate and exponential growth in traffic. “We went from having no public product to building a product that was capable of scaling to handle millions of bets,” said Dave Lucia, VP of Engineering. “Even before we launched the products, we put a lot of time into load testing the system and using Lightstep to find out where those bottlenecks were and tackle them one by one.”
Although the growth in traffic was expected, Simplebet’s traffic patterns are highly variable due to the nature of sports betting. “When the markets open, we receive a major spike in bets that totals hundreds, if not thousands, of bets per second,” said Bryan Naegele, Head of Games.
“The way that people interact with the system can be different day-to-day, hour-by-hour. If there are 10 games going, that produces a known quantity of traffic; whereas, the consumer side can take on highly fluctuating traffic,” said Bryan.
Lightstep’s Service Health view highlights the greatest changes occurring in a system and compares operation performance. Lightstep automatically gathers latency, throughput, and error rate for each application, as well as important infrastructure metrics. From there, Lightstep correlates changes in applications with changes in infrastructure to create a full picture of system health. With this, Simplebet is able to handle the unpredictable traffic on a daily basis.
“That traffic pattern is what we have to be able to handle. If you can't measure it and understand what's happening within the system, it's very difficult. You have to take the guesswork out as much as possible. Knowing whether or not, ‘Oh, we have a load balancer issue, or it's a database bottleneck, or there's someplace within the app code that can use some tuning.’ allows us to prioritize where to focus effort,” said Bryan.
Simplebet’s new product created an opportunity to invest in an observability platform. Observability helps developers understand what causes changes in their application and infrastructure, so that even developers who are new to the job can understand and revert regressions.
“We noticed that we could really improve our observability in the whole system. We didn't have a way to end-to-end track what was going on between teams and between services,” said Joshua Massover, Data Engineer.
Simplebet chose Lightstep not only for the full-context observability it provides but also because Lightstep is one of the major OpenTelemetry contributors. OpenTelemetry is typically the first step in any observability strategy and provides a standardized, vendor-agnostic data format that makes instrumentation a breeze. “I'm part of the observability working group for Erlang and Elixir and in the SIG for OpenTelemetry,” said Bryan. “When it came time to make a decision as to how the company should approach observability, I obviously had a little bit of a bias as to what was going to positively impact the company.”
“We very quickly got all of our critical services integrated with OpenTelemetry and Lightstep,” said Dave. “OpenTelemetry is becoming more and more aligned with our culture.”
“If you have observability, then you have the key part of the scientific method, which is the ability to measure. If you can't measure, then no matter what you do testing wise, it doesn't matter. You’re still just guessing,” said Bryan. “If you can't observe and know what's going on in the system, it's very easy to waste a lot of engineering resources, making educated guesses, or relying on people's assumptions as to like, how to make something faster, or how to make something more reliable.”
“Having Bryan on the team helped us get the ball rolling and build up our observability story,” said Dave. “We've been working upstream with the Elixir open source libraries and frameworks that we use. And, we've been pushing commits to those repos, getting tighter integration, and giving back to the community that benefits us in the long run. We're pretty committed to open source at Simplebet, and we're not afraid to jump in and contribute.”
When Lightstep surfaced a performance issue, Simplebet was quickly able to see which team was affected. Lightstep streamlines incident resolution by automatically highlighting the critical path. This allows teams to focus on the current problem rather than searching multiple dashboards to locate the issue.
“This specific bottleneck began as something that was hard to track down in Kubernetes infrastructure,” said Josh. “Dave [Lucia] was able to set up an alert that tracked our long-running requests. We discovered excess time spent from when the request was sent over the wire to when another service was picking it up. We knew that there was an infrastructure layer problem or something to be improved.”
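The kind of check Josh describes can be sketched in a few lines. This is only an illustration, not Simplebet's or Lightstep's actual implementation: the `Hop` record, its field names, and the 50 ms threshold are all assumptions made for the example. For each service-to-service call, it measures the gap between when the caller sent the request and when the receiving service picked it up, and flags calls where that gap is large.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One service-to-service call (hypothetical record, not a Lightstep type)."""
    trace_id: str
    sent_at_ms: float      # caller put the request on the wire
    received_at_ms: float  # receiving service started handling it
    finished_at_ms: float  # receiving service finished handling it

    @property
    def transit_ms(self) -> float:
        # Time spent between services: network, load balancer, queueing.
        return self.received_at_ms - self.sent_at_ms

    @property
    def handling_ms(self) -> float:
        # Time spent inside the receiving service's own code.
        return self.finished_at_ms - self.received_at_ms


def slow_transit_hops(hops, threshold_ms=50.0):
    """Return hops whose send-to-pickup gap exceeds the (assumed) threshold."""
    return [h for h in hops if h.transit_ms > threshold_ms]


hops = [
    Hop("a1", sent_at_ms=0.0, received_at_ms=4.0, finished_at_ms=30.0),
    Hop("b2", sent_at_ms=10.0, received_at_ms=190.0, finished_at_ms=215.0),
]
flagged = slow_transit_hops(hops)
for hop in flagged:
    print(f"{hop.trace_id}: {hop.transit_ms:.0f} ms in transit, "
          f"{hop.handling_ms:.0f} ms handling")
# prints "b2: 180 ms in transit, 25 ms handling"
```

A large transit time next to a small handling time points away from application code and toward the layer between services, which matches the conclusion the team reached. In practice, timestamps taken on different hosts are subject to clock skew, so a real comparison would need to account for that.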
From launch day in September through today, Lightstep has been able to quickly and accurately detect performance issues for the team to resolve. Since the partnership began, Simplebet has used Lightstep daily as its go-to solution and expanded its use across the SRE, Data Engineering, mobile and backend Games, Machine Learning Engineering, and Pricing teams.
“Lightstep helped us identify the type of strategy we're using for load balancing and switch it and improve the communication between those two services,” said Cory Smith, Site Reliability Engineer. “I don't think we would've been able to identify the problem to begin with.”
Even with these varying traffic patterns, Simplebet is able to use Lightstep to monitor and optimize its system.
“Lightstep helps us to identify how the system will perform under various conditions,” said Bryan.
“I've been obsessed with Lightstep. I love to share traces and talk about it and get into it. It's been a fun product to use, not just a helpful one,” said Dave. “I've never seen anything that you could drill in and find the root cause of a problem so quickly and so visually. That just absolutely blew my mind.”
Each Lightstep customer gets hands-on service from the beginning. Our Customer Success team is there to ensure your team is successful from instrumentation to each deployment to any performance regression.
“We definitely don't feel like a customer of Lightstep. It's more of a partnership. Anytime we need anything, regardless of who it is in the entire org, they've been there and ready and receptive and attentive to resolving any issues,” said Bryan.
“Lightstep is worth its weight in gold,” said Josh. “It does what it says it's going to do out of the box.”