LightStep On-Prem and Private-Managed Collectors


The LightStep data processing pipeline has two major components: the collector pool and the analysis engine. All spans generated by instrumented clients and servers are sent to a collector pool where they are processed and temporarily stored during trace assembly. The analysis engine records aggregate information about spans, directs the assembly process, and stores traces durably.

LightStep offers a complete SaaS solution encompassing both components. For LightStep Enterprise customers, we also offer two additional options, on-prem collectors and private-managed collectors, that provide additional security and improve performance. Both are hybrid configurations: the analysis engine is run by LightStep as a service in all cases.

SaaS Collectors

LightStep offers a SaaS pool for customers with smaller workloads and as an evaluation tool for Enterprise customers. These collectors provide only limited recall for assembling traces and are not suitable for larger workloads.

On-Prem Collectors

LightStep also offers the option for customers to run collectors on-premises. This gives you complete control over the resources used by the collector pool, and therefore over how many traces can be assembled and how much recall is available.

An on-prem pool also greatly reduces the amount of data leaving your network and the costs associated with network egress. Each instrumented client or server sends all spans to a collector, where statistics are computed and the spans are temporarily buffered. These buffered spans are provided to the LightStep analysis engine on demand as part of trace assembly. Since no filtering occurs in the instrumented clients or servers themselves, the amount of data that passes between your application and the collector pool is much larger than the amount of data that passes between the collector pool and the analysis engine.
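
As a rough illustration of this data flow, the following Python sketch models a collector that keeps running statistics for every span, buffers spans for a bounded time, and hands over only the spans of a requested trace. All names here (`Span`, `CollectorBuffer`, `ingest`, `fetch_trace`, `max_age_s`) and the age-based eviction policy are hypothetical and chosen for illustration; this is not the LightStep collector's implementation or API. The sketch also assumes spans arrive in non-decreasing timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str
    operation: str
    duration_ms: float
    timestamp: float  # arrival time in seconds


class CollectorBuffer:
    """Hypothetical model of a collector: aggregate every span, buffer briefly."""

    def __init__(self, max_age_s: float):
        self.max_age_s = max_age_s
        self.spans = deque()                 # buffered spans, oldest first
        self.counts = defaultdict(int)       # per-operation span counts
        self.total_ms = defaultdict(float)   # per-operation summed duration

    def ingest(self, span: Span) -> None:
        # Statistics cover every span, including ones that are later evicted.
        self.counts[span.operation] += 1
        self.total_ms[span.operation] += span.duration_ms
        self.spans.append(span)
        self._evict(now=span.timestamp)

    def _evict(self, now: float) -> None:
        # Drop spans older than the retention window (assumed policy).
        while self.spans and now - self.spans[0].timestamp > self.max_age_s:
            self.spans.popleft()

    def fetch_trace(self, trace_id: str) -> list:
        # Only the spans of the requested trace leave the pool.
        return [s for s in self.spans if s.trace_id == trace_id]
```

In this model, every span crosses the link between the application and the collector, while only the aggregate counters and the results of `fetch_trace` would cross the link to the analysis engine, which is why the second link carries far less data.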

Because it is hosted within your VPC or datacenter, an on-prem pool provides additional security over the SaaS pool. Collectors support an option to scrub data from spans before they are forwarded to the analysis engine. While no sensitive data should be sent to collectors, this feature provides a form of defense-in-depth to ensure that these data don’t leave your VPC or datacenter.
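
To make the idea of scrubbing concrete, here is a minimal Python sketch that replaces the values of selected tags before a span is forwarded. The span layout (a dictionary with a `tags` mapping), the function name `scrub_span`, and the `[scrubbed]` placeholder are all assumptions made for illustration; the actual collector option and its configuration are not described here and may work differently.

```python
import copy

REDACTED = "[scrubbed]"  # placeholder value (assumed, not a LightStep constant)


def scrub_span(span: dict, sensitive_keys: set) -> dict:
    """Return a copy of the span with sensitive tag values replaced."""
    cleaned = copy.deepcopy(span)  # leave the buffered original untouched
    tags = cleaned.get("tags", {})
    for key in tags:
        # Compare case-insensitively; sensitive_keys is expected in lowercase.
        if key.lower() in sensitive_keys:
            tags[key] = REDACTED
    return cleaned
```

For example, calling `scrub_span(span, {"user.email"})` on a span tagged with `user.email` returns a copy in which that tag's value is the placeholder, while other tags and the original span are left as they were.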

Collectors are straightforward to deploy (using an AMI or Docker image) and have no storage or other dependencies. However, running a collector pool does require additional operational work for your team, and you are also responsible for the computing costs associated with running these servers or instances. To balance these trade-offs, some customers choose to use the SaaS pool for evaluation and development and then deploy an on-prem pool for production systems.

Private-Managed Collectors

Private-managed collectors fall partway between the SaaS pool and an on-prem pool. Like the SaaS pool, these collectors are managed by LightStep engineers, but like an on-prem pool, they only handle data from your application. Technologies like AWS VPC peering can also provide additional security. (Note that there is a cost associated with data transfer across peering connections.)