Stepping It Up! Lightstep Feature Updates - May 2021
by Robin Whitmore
It's been a while since we've shouted about all the great releases at Lightstep! Take a look at all we've accomplished in the past few months.
Our public API now includes endpoints that let you work with your metric dashboards and alerts.
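As a sketch of what working with those endpoints might look like from a Node.js script, here's a hypothetical request builder for creating an alert. The endpoint path, payload shape, and header names are illustrative assumptions, not the documented contract; check the API reference for the real details.

```javascript
// Sketch: creating a metric alert through the public API.
// The endpoint path and payload shape below are illustrative
// assumptions -- consult the API reference for the real contract.
function buildAlertRequest(apiKey, project, alert) {
  return {
    url: `https://api.lightstep.com/public/v0.2/${project}/alerts`, // hypothetical path
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ data: { attributes: alert } }),
    },
  };
}

const req = buildAlertRequest('MY_API_KEY', 'my-project', {
  name: 'High error rate',
  // ...query and threshold fields would go here
});
// To actually send it (Node 18+): await fetch(req.url, req.options);
```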
Lightstep's new Microsatellites significantly simplify data ingestion by shifting most data processing and storage to the Lightstep SaaS. Instead of temporarily storing span data in the Satellite to begin trace analysis, Microsatellites forward all span data directly to the Lightstep SaaS for immediate analysis.
Microsatellites are easier to set up and manage and can reduce Satellite fleets by up to 90%.
Previously, you had to do sometimes-complicated math to ensure that Satellites had enough memory set aside for span indexing and enough recall to suit your needs. With Microsatellites, you no longer need to determine how many to run or how much retention to maintain: retention is always fixed at one hour.
You can configure your orchestration framework to scale Microsatellites based on CPU and memory utilization, which are more intuitive metrics with broader support across ecosystems (Kubernetes, ECS, etc.).
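In Kubernetes, for example, that scaling could be expressed with a HorizontalPodAutoscaler. This is a sketch only: the Deployment name, replica bounds, and utilization thresholds are assumptions, not recommended values.

```yaml
# Sketch: autoscaling a hypothetical "microsatellite" Deployment on
# CPU and memory utilization. Names and thresholds are illustrative.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: microsatellite-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: microsatellite
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
```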
Microsatellites are less sensitive to load balance issues.
You no longer need to worry that imbalanced Satellites will degrade the product experience with inconsistent recall across pools, incomplete trace assembly, or inaccurate time series. Even when load balancing isn't perfect, the product experience isn't degraded. And because retention is now configured in the Lightstep SaaS, retention is always consistent across all pools and projects.
- 1-hour retention for all projects. You no longer have to tune your Satellites to gain a larger recall window.
- Higher Key Ops and Streams limits
You can convert your Classic Satellites to Microsatellites with a configuration change. Contact Customer Success for more info!
By default, Lightstep displays times based on your browser's configured time zone. You can now change that to a specific time zone from the Account Settings page.
Error analytics solutions like Rollbar can connect errors seen in your system to specific lines of code that may be causing the issue, allowing developers to quickly remediate issues before they become widespread. However, the problem with many of these tools is that in a distributed system, different services owned by different teams often collect varied telemetry data, making it difficult to trace an issue through the stack.
We’ve built an experimental OpenTelemetry plugin that works with Rollbar to automatically create attributes from Rollbar metadata. Whenever Rollbar is called from your Node.js service, it collects the values of those attributes and adds them to the span.
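Conceptually, the plugin copies identifying Rollbar metadata onto the active OpenTelemetry span as attributes. The sketch below illustrates that mechanism with a stand-in span object; the attribute names (`rollbar.uuid`, `rollbar.environment`) are assumptions, and a real span would come from the OpenTelemetry SDK (e.g., `trace.getActiveSpan()` in `@opentelemetry/api`) rather than being constructed by hand.

```javascript
// Sketch of the mechanism: when Rollbar reports an error, copy
// identifying metadata onto the active span as attributes.
// Attribute names here are illustrative assumptions.
function annotateSpanWithRollbar(span, rollbarResponse) {
  const attrs = {
    'rollbar.uuid': rollbarResponse.uuid,
    'rollbar.environment': rollbarResponse.environment,
  };
  for (const [key, value] of Object.entries(attrs)) {
    if (value !== undefined) span.setAttribute(key, value);
  }
  return span;
}

// Minimal stand-in span for illustration; a real one comes from the SDK.
const fakeSpan = {
  attributes: {},
  setAttribute(k, v) { this.attributes[k] = v; },
};
annotateSpanWithRollbar(fakeSpan, {
  uuid: 'c0ffee12-3456-7890-abcd-ef0123456789',
  environment: 'production',
});
```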
When you view a trace in Lightstep, if a span contains a call to Rollbar, the Rollbar UUID appears in the Attributes panel. Learn how with our new Learning Path!
Many distributed systems include API calls to services in the AWS cloud. Instrumenting your own services with OpenTelemetry is great for monitoring performance inside your system, but observability becomes a black hole once a call is made out to the cloud. Issues in those cloud services might be causing latency or errors in your services, but there's no way to tell.
To solve this problem, you can use an OpenTelemetry AWS plugin that automatically instruments services in the AWS cloud with OpenTelemetry. With a single line of code, it instruments and then captures all requests made from your Node.js service to the AWS API. The resulting data can then be seen in Lightstep, removing the blind spots from your observability.
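Under the hood, auto-instrumentation like this works by wrapping the SDK client so every outgoing call records the service, operation, and region before the call proceeds. The sketch below demonstrates that wrapping pattern with a stand-in "client"; the attribute names follow the spirit of OpenTelemetry's RPC conventions but are assumptions here, and the real plugin does all of this for you.

```javascript
// Sketch of the mechanism behind auto-instrumentation: wrap a client
// so every method call records span-style attributes before running.
// Attribute names are illustrative assumptions.
function instrumentAwsClient(client, record) {
  return new Proxy(client, {
    get(target, prop) {
      const original = target[prop];
      if (typeof original !== 'function') return original;
      return (...args) => {
        record({
          'rpc.system': 'aws-api',
          'rpc.service': client.serviceName,
          'rpc.method': String(prop),
          'cloud.region': client.region,
        });
        return original.apply(target, args);
      };
    },
  });
}

// Stand-in "client" for illustration; a real one would be an AWS SDK client.
const spans = [];
const s3 = instrumentAwsClient(
  {
    serviceName: 'S3',
    region: 'us-east-1',
    putObject: (params) => ({ ok: true, key: params.Key }),
  },
  (attrs) => spans.push(attrs),
);
const result = s3.putObject({ Key: 'report.csv' });
```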
Learn how with our new Learning Path!
When you're using Change Intelligence to determine the cause of a metric deviation, you can now view the chart's query.
The query displays in a read-only format. If any global filters have been applied, you can see those by clicking the filter dropdown.
Lightstep metric charts now allow you to toggle the visibility of time series when the chart uses multiple queries or a formula.
Lightstep can now ingest your application, infrastructure, and cloud metrics! You can build dashboards, create charts from queries on your data, and configure alerts to let you know when things go wrong.
But most important, once you’ve set up Lightstep to ingest your metric data, you can use Change Intelligence to investigate what caused any deviations in those metrics. Change Intelligence determines the service that emitted a metric, searches for performance changes on Key Operations from that service at the same time as the deviation, and then uses trace data to determine what caused the change.
When you deploy new versions of your services, you always hope for the best but expect the worst. Likely, part of your workflow is to check performance after each deploy. But that means everyone has to remember to do it, and then use additional tools for monitoring and, if there's an issue, for regression investigation. Switching context like that can be hard. And even then, you may not have enough information to understand what code actually caused the change.
Without needing to leave your CI/CD workflow, Lightstep automatically determines if there are changes after a deploy and reports those changes directly to GitHub. Our Services Change Report GitHub action automatically checks the system health of the current environment once a deploy in GitHub succeeds, taking a Snapshot of performance at the time of the deploy and comparing it to the most recent Snapshot (or any Snapshot you choose) in Lightstep.
It lets you know if there are any latency, error rate, or operation rate issues for each service you’ve configured it to check. It also includes links to downstream dependencies for each service, operations on the service, and traces from the Snapshots, so you can immediately start remediation, right from GitHub.
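A post-deploy check like this could be wired up in a workflow along the following lines. This is a sketch only: the action reference and its inputs are hypothetical placeholders, so consult the action's own README for the real name and input names.

```yaml
# Sketch: run a change report after a successful deployment.
# The action reference and inputs below are hypothetical placeholders.
name: post-deploy-check
on: deployment_status
jobs:
  change-report:
    if: github.event.deployment_status.state == 'success'
    runs-on: ubuntu-latest
    steps:
      - uses: lightstep/services-change-report@v1            # hypothetical reference
        with:
          lightstep_api_key: ${{ secrets.LIGHTSTEP_API_KEY }} # hypothetical input
          service: my-service                                 # hypothetical input
```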
Outside of a post-deployment scenario, you can use this action anytime you need span data to provide observability information about your committed code. For example, you can configure the action to run whenever a particular label (like "bug") is added to a GitHub issue. Or, you can configure it to check for violations of your coding best practices!
Check out our new Learning Path for details on how to use our latest action.