
Monitoring NGINX with OpenTelemetry and Lightstep

NGINX is a commonly used web server, boasting performance that is reportedly up to 2.5x faster than Apache. Many enterprises use NGINX to host both internal and customer-facing web services. Because these web services are critical, continuous monitoring of the NGINX web server (especially when coupled with alerts) is essential to ensuring the performance and uptime of web applications.

This guide provides an overview of the metrics available from NGINX and how to start ingesting metrics from NGINX and send them to Lightstep for more intelligent analysis. Then, it looks at how to create charts in Lightstep to help with NGINX monitoring.

Key Metrics in NGINX

When NGINX runs with ngx_http_stub_status_module enabled, you gain access to basic information about the current status of your NGINX server. This module exposes the following key metrics:

  • Requests: The total number of client requests.

  • Accepts: The total number of accepted client connections.

  • Handled: The total number of handled connections.

  • Connections:

    • Writing: The current number of connections where NGINX is writing the response back to the client.

    • Waiting: The current number of idle connections waiting for a request.

    • Active: The current number of active connections, including those that are waiting for a request.

    • Reading: The current number of connections where NGINX is reading the request header.

The metric with the total number of requests is a single number that increases over time. It is only reset to zero when the NGINX server restarts. For this reason, simply monitoring the number is not very helpful. Instead, you would want to look at the rate at which this number is increasing. For example, consider the following time series data:

| Timestamp | Total number of requests |
| --------- | ------------------------ |
| 08:00:00  | 300                      |
| 08:00:30  | 312                      |
| 08:01:00  | 312                      |
| 08:01:30  | 372                      |
| 08:02:00  | 1,572                    |

From 8:00:00 to 8:01:00, the total number of requests increased by 12, meaning the server saw requests arrive at an average of 0.2 requests (12 / 60) per second over that one-minute window. From 8:01:00 to 8:02:00, the number of requests increased from 312 to 1,572, yielding 1,260 new requests in a single minute. On average, this means the server saw 21 requests (1,260 / 60) per second over that one-minute window. Over the entire two-minute window, the rate of incoming requests would be 10.6 requests (1,272 / 120) per second. Your DevOps team would want to be alerted if the rate of requests suddenly spiked, as this could indicate a DDoS attack underway.
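To make the arithmetic concrete, here is a minimal Node.js sketch (using hypothetical samples matching the table above) that computes the per-second request rate between consecutive counter readings:

// Hypothetical readings of the cumulative request counter, 60 seconds apart.
const samples = [
  { time: "08:00:00", total: 300 },
  { time: "08:01:00", total: 312 },
  { time: "08:02:00", total: 1572 },
];

// The rate over a window is the counter delta divided by the window length.
for (let i = 1; i < samples.length; i++) {
  const delta = samples[i].total - samples[i - 1].total;
  const rate = delta / 60; // requests per second
  console.log(`${samples[i - 1].time} to ${samples[i].time}: ${rate} req/s`);
}

Running this prints 0.2 req/s for the first window and 21 req/s for the second, matching the calculations above.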

The metrics representing current connections by state are also helpful for alerting your team to possible issues. For example, if you see a high number of connections in the writing state and very few in the waiting state, your server might be trying to process requests but is blocked as it waits for results from third-party, upstream services.

Now consider the case where metrics show a rapidly increasing number of accepted connections but an unchanging number of handled connections. This indicates a continuing influx of connection attempts that are not being handled by NGINX. These dropped connections may point to a wider problem that requires deeper investigation.
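As a rough illustration of that check, the following sketch (with hypothetical counter values) computes how many accepted connections went unhandled during an interval:

// Hypothetical cumulative counters scraped at the start and end of an interval.
const before = { accepts: 1000, handled: 1000 };
const after = { accepts: 1500, handled: 1100 };

// Connections that were accepted but never handled suggest NGINX is dropping them.
const dropped = (after.accepts - after.handled) - (before.accepts - before.handled);
if (dropped > 0) {
  console.log(`NGINX dropped ${dropped} connections during this interval`);
}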

To capture NGINX metrics and visualize them, leverage the OpenTelemetry Collector and Lightstep.

Configure OpenTelemetry Collector

Before you can begin using OpenTelemetry Collector to gather and ship metrics, you need to set up NGINX.

Set up the NGINX server

First, spin up an NGINX server locally, with a metrics scraping endpoint exposed. Ensure that NGINX has been built with the --with-http_stub_status_module configuration parameter so that the status module is available.
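You can confirm that your build includes the module by checking the configure arguments that nginx -V prints:

$ nginx -V 2>&1 | grep -o with-http_stub_status_module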

Next, create a new virtual host file configuration called 00-site-with-status, placing it in the /etc/nginx/sites-available folder. The file has the following contents:

server {
  listen 80;
  server_name localhost;
  location / {
    proxy_pass http://127.0.0.1:3000;
  }
  location /status {
    stub_status;
  }
}
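On Debian-based systems, NGINX only serves virtual hosts that are linked into the /etc/nginx/sites-enabled folder, so link the new file there and validate the configuration syntax (the paths below assume the default Debian layout):

$ sudo ln -s /etc/nginx/sites-available/00-site-with-status /etc/nginx/sites-enabled/
$ sudo nginx -t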

With this configuration file, you've opened up localhost at port 80, with requests to the root path proxied to the web application listening on port 3000. Let's assume that you've spun up a simple web application (for example, a simple Node.js Express application) that is listening on port 3000.
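If you don't already have a backend running, a minimal stand-in (a hypothetical app.js, assuming Node.js and the express package are installed) could look like this:

// app.js: a minimal stand-in backend for NGINX to proxy to
const express = require("express");
const app = express();

// Respond to requests proxied from NGINX's root location.
app.get("/", (req, res) => {
  res.send("Hello from the upstream application!");
});

app.listen(3000, () => console.log("Example app listening on port 3000"));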

The /status location exposes the metrics from the NGINX status module. With the configuration in place, restart the NGINX server with the following command:

$ sudo systemctl restart nginx

In the web browser, this is what you see when you visit http://localhost/status:

[Image: NGINX stub_status page at http://localhost/status]
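The page is plain text, and its contents look something like this (your numbers will differ):

Active connections: 1
server accepts handled requests
 12 12 34
Reading: 0 Writing: 1 Waiting: 0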


Install the Collector

The NGINX receiver is bundled with the contributor distribution of the OpenTelemetry Collector. To use it, you need to install the contributor distribution binary found on GitHub. For a Debian-based Linux system, install the collector as follows:

$ wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.72.0/otelcol-contrib_0.72.0_linux_amd64.deb
$ sudo dpkg -i otelcol-contrib_0.72.0_linux_amd64.deb

After installing the collector, verify that it's running with systemctl status otelcol-contrib:

● otelcol-contrib.service - OpenTelemetry Collector Contrib
     Loaded: loaded (/lib/systemd/system/otelcol-contrib.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2023-02-27 14:10:00 PST; 5s ago
   Main PID: 677031 (otelcol-contrib)
      Tasks: 13 (limit: 18868)
     Memory: 24.7M
     CGroup: /system.slice/otelcol-contrib.service
             ├─677031 /usr/bin/otelcol-contrib --config=/etc/otelcol-contrib/config.yaml
             └─677047 /usr/bin/dbus-daemon --syslog --fork --print-pid 4 --print-address 6 --session


Configure the collector

Next, configure the collector receiver to scrape metrics from NGINX. To do this, edit the collector configuration file found at /etc/otelcol-contrib/config.yaml. Add the NGINX receiver, setting it to retrieve metrics from the /status endpoint every 10 seconds.

receivers:
  nginx:
    endpoint: http://localhost/status
    collection_interval: 10s

processors:
  batch:

exporters:
  logging:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [nginx]
      processors: [batch]
      exporters: [logging]


For the processor, use the batch processor, which batches and compresses incoming data for more efficient exporting. For the exporter, start with the logging exporter, configured with verbosity set to detailed. After verifying that the collector is properly capturing metrics from NGINX, you'll change the exporter to send metrics to Lightstep.

Once the collector is configured, restart it:

$ sudo systemctl restart otelcol-contrib

To verify that the collector is receiving metrics from NGINX, run the following command:

$ journalctl -u otelcol-contrib -f
…
Feb 27 14:16:00 demo otelcol-contrib[1215451]: Resource SchemaURL:
Feb 27 14:16:00 demo otelcol-contrib[1215451]: ScopeMetrics #0
Feb 27 14:16:00 demo otelcol-contrib[1215451]: ScopeMetrics SchemaURL:
Feb 27 14:16:00 demo otelcol-contrib[1215451]: InstrumentationScope otelcol/nginxreceiver 0.68.0
Feb 27 14:16:00 demo otelcol-contrib[1215451]: Metric #0
Feb 27 14:16:00 demo otelcol-contrib[1215451]: Descriptor:
Feb 27 14:16:00 demo otelcol-contrib[1215451]:      -> Name: nginx.connections_accepted
Feb 27 14:16:00 demo otelcol-contrib[1215451]:      -> Description: The total number of accepted client connections
Feb 27 14:16:00 demo otelcol-contrib[1215451]:      -> Unit: connections
Feb 27 14:16:00 demo otelcol-contrib[1215451]:      -> DataType: Sum
Feb 27 14:16:00 demo otelcol-contrib[1215451]:      -> IsMonotonic: true
Feb 27 14:16:00 demo otelcol-contrib[1215451]:      -> AggregationTemporality: Cumulative
Feb 27 14:16:00 demo otelcol-contrib[1215451]: NumberDataPoints #0
Feb 27 14:16:00 demo otelcol-contrib[1215451]: StartTimestamp: 2023-02-27 21:15:50.558593496 +0000 UTC
Feb 27 14:16:00 demo otelcol-contrib[1215451]: Timestamp: 2023-02-27 21:16:00.583045553 +0000 UTC
Feb 27 14:16:00 demo otelcol-contrib[1215451]: Value: 1

After you've verified that NGINX metrics are being collected and logged, you're ready to send those metrics to Lightstep.

Send metrics from OpenTelemetry Collector to Lightstep

Before you can send metrics to Lightstep, you'll need a Lightstep account. Signing up for a new account is free and simple.

[Image: Lightstep sign-up page]

After logging in, navigate to Project Settings, and then to the Access Tokens page. Your OpenTelemetry Collector will need an access token to authenticate requests when exporting metrics data to Lightstep. Create a new access token and copy down its value.

[Image: creating an access token on the Access Tokens page]


Configure the collector to export to Lightstep

Returning to the collector configuration at /etc/otelcol-contrib/config.yaml, you'll configure a different exporter called otlp/lightstep. You’ll use Lightstep’s ingestion endpoint and then paste in your token. The resulting file should look like this:

receivers:
  nginx:
    endpoint: http://localhost/status
    collection_interval: 10s

processors:
  batch:

exporters:
  logging:
    verbosity: detailed
  otlp/lightstep:
    endpoint: ingest.lightstep.com:443
    headers: {"lightstep-access-token": "INSERT YOUR TOKEN HERE"}

service:
  pipelines:
    metrics:
      receivers: [nginx]
      processors: [batch]
      exporters: [otlp/lightstep]

Then, restart the collector.

$ sudo systemctl restart otelcol-contrib

Now that you're up and running with NGINX and OpenTelemetry Collector, you can begin working with metrics in Lightstep.

Working with metrics in Lightstep

Let’s look at some basic ways to use Lightstep in conjunction with NGINX metrics. If you’re interested in more detailed examples, you can go here.

Create a dashboard

First, you'll create a new dashboard to display charts related to NGINX metrics. Go to the Dashboards page, and then click on Create Dashboard. Next, provide a name and description for the new dashboard.

[Image: naming and describing the new dashboard]


Add a chart

Next, add a chart to the dashboard. Click on Add a chart.

[Image: Add a chart button]


The first chart will show the current number of connections, regardless of connection state. For this, you need to select the telemetry type that you'll be charting. Select “Metric.”

[Image: selecting the telemetry type]


Then, search for the metric you're looking for: nginx.connections_current.

[Image: searching for the nginx.connections_current metric]


The resulting chart shows the number of current connections over the last 60 minutes.

[Image: chart of current connections over the last 60 minutes]


If you want to focus on a smaller window of time, you can adjust the time range. For example, you can adjust it to show the last 10 minutes.

[Image: adjusting the time range to the last 10 minutes]


The scale of the chart adjusts, showing us data from the last 10 minutes only.

[Image: chart of current connections over the last 10 minutes]


Finally, you can save the chart to your dashboard. The resulting dashboard shows the first chart.

[Image: dashboard with one chart]


Filter metrics by an attribute

The chart you created was based on the nginx.connections_current metric. However, if you look closely at the verbose logging of that metric, you'll see that it's a single metric made up of four data points—one for each connection state (active, reading, writing, waiting). The first chart did not filter by state, but instead aggregated the values across the different states and simply displayed the max of the four values.

A more helpful display of current connections would use filtering to show the value for each state. To do this, add a new chart. You'll use the same nginx.connections_current metric, but this time use the Filter box and select state as the attribute key.

[Image: filtering current connections by the state attribute key]


Then, you can select the attribute value that you want to display.

[Image: searching for an attribute value]


For the first metric in this chart (metric a), select connections with active state.

[Image: metric a showing connections in the active state]


A single chart can display more than one metric, and it can also display formulas that operate on metrics. For this chart, you want to display four metrics—one for each connection state. Click on Plot another metric.

[Image: Plot another metric button]


Then, you'll add the nginx.connections_current metric with state = reading. This will be metric b on the chart.

[Image: metric b showing connections in the reading state]


Now, do the same for state = waiting (metric c).

[Image: metric c showing connections in the waiting state]


And again for state = writing (metric d).

[Image: metric d showing connections in the writing state]


The chart shows multiple bands of color, one for the current connection count in each state.

[Image: chart of current connections by state]


A key below the chart shows which color corresponds to which metric.

[Image: chart key mapping colors to metrics]


Lastly, save the chart and add it to the dashboard. After creating additional charts, your dashboard of multiple NGINX visualizations begins to take shape.

[Image: dashboard with four charts]

UQL and Alerts

So far, you've created charts using the Query Builder, which provides a simple and easy-to-use interface for selecting metrics and configurations. However, users who are familiar with the Lightstep Unified Query Language (UQL) can use the Query Editor to write queries directly. For example, your chart of current connections filtered by state would be represented in UQL like this:

[Image: current connections by state expressed as a UQL query]
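As a rough, hypothetical sketch (the screenshot above shows the authoritative form; this assumes UQL's pipeline of fetch, align, and group-by stages), such a query might resemble:

metric nginx.connections_current | latest | group_by ["state"], sum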


In addition, you can create alerts to notify you when metrics surpass certain thresholds. For example, a DevOps team monitoring their NGINX server may want to be alerted when nginx.connections_accepted increases faster than nginx.connections_handled, as this may indicate that NGINX is dropping connections. Notifications can be set up to use webhooks or third-party services like PagerDuty or Slack.

Conclusion

Making sure that your NGINX servers are performing properly is essential to delivering your web applications and services. For most enterprises, web applications and services are mission-critical to business. Therefore, proper monitoring of NGINX and quick alerting on issues is a must-have. With these practices in place, DevOps teams can respond quickly whenever an issue surfaces, and that will lead to increased uptime and reliability.

Ready to learn more? Schedule a demo today.




April 6, 2023
9 min read
Technical


About the author

Robin Whitmore
