Lightstep from ServiceNow Logo

Products

Solutions

Documentation

Resources

Lightstep from ServiceNow Logo
< all blogs

Monitoring PostgreSQL with OpenTelemetry and Lightstep

With the complexity of today’s enterprise systems—built as stacks of what may be hundreds of different components deployed across various clouds and regions—tracking down performance issues or downtime causes is a beast. Teams struggle to collect metric data from their disparate components in a uniform and consistent way, and they’re looking for a central place to handle analysis and alerting on that data.

In this series, we’re covering the OpenTelemetry CollectorOpenTelemetry Collector, a vendor-agnostic tool that can connect to nearly any component in your stack to receive, process, and export performance metrics. When you export those metrics to LightstepLightstep, you attach the versatility of OpenTelemetry to an observability platform that gives you powerful tools for monitoring, querying, and alerting.

We kick off this series with a walkthrough of how to set up monitoring of your PostgreSQL instance with OpenTelemetry and Lightstep. We’ll look briefly at some of the key metrics exposed by PostgreSQL and collected by OpenTelemetry. Then we’ll step through how to set up OpenTelemetry Collector for ingesting and exporting to Lightstep. Finally, we’ll introduce the use of dashboards, queries, and alerts in Lightstep.

Let’s begin with a brief look at key metrics in PostgreSQL.

Key Metrics in PostgreSQL

As a commonly used database, PostgreSQL is an essential component for most web applications. By continuously monitoring your PostgreSQL performance—especially if you couple this with alerts—your team can ensure database uptime and application functionality.

PostgreSQL captures a broad set of performance metrics through its statistics collectorstatistics collector, and these metrics help you to keep an eye on the health and performance of your instance. The OpenTelemetry Collector has a PostgreSQL receiverPostgreSQL receiver purpose-built to query the PostgreSQL statistics collector in order to scrape these metrics.

While the full list of PostgreSQL metrics produced by the receiver can be found herehere, let’s cover a few notable ones.


Connection counts

For each database, PostgreSQL captures a metric called numbackends, which is essentially the number of active connections to that database. The Collector tracks a metric called postgresql.backends, which tallies the number of active connections across all databases in the PostgreSQL instance.|

The Collector also retrieves the maximum number of client connections allowed by the current PostgreSQL configuration, exposing this metric as postgresql.connection.max.

Monitoring the number of active connections, especially relative to the total number of connections allowed, can help a team determine if their PostgreSQL instance is under a heavy and unsustainable load. Too many connections can be caused by long-running queries in an application, resulting in the creation of new connections rather than the use of existing ones.

Data size and usage

The Collector also exposes certain counts that can give an indication as to data size and usage. For example:

  • postgresql.database.count: The total number of user databases.

  • postgresql.table.count: The total number of tables across all user databases.

  • postgresl.db_size: The total amount of disk space used across all user databases.

  • postgresql.rows: The total number of data rows across all user databases.

  • postgresql.index.size: The total amount of disk space used by the index.


Transactions and operations

Metrics related to the performance and load of database operations are also available:

  • postgresl.operations: The total number of row operations across all user databases.

  • postgresql.commits: The total number of commits across all user databases.

  • postgresl.rollbacks: The total number of rollbacks across all user databases.

Tracking the number of transaction commits and rollbacks gives an indication of the current level of activity in a database, also exposing whether your applications might be rolling back transactions excessively due to errors.

Now that we’ve covered some of the notable metrics, let’s walk through the Collector setup.

Setting Up the Collector

There are many deployment modelsdeployment models for running the Collector. They include Docker, Kubernetes, and system installs like Debian. For our demo, we’ll use Debian Linux, and we’ll install a single Collector instance that runs on our local machine. After installing the Collector, we’ll configure it to monitor our PostgreSQL instance.

Install the Collector

Because the postgresql receiver is in beta, it is bundled with the contributor distribution of the Collector. Therefore, instead of installing the main distributionmain distribution of the Collector, we’ll install the contributor distribution binary found on GitHubcontributor distribution binary found on GitHub. We navigate to the latest release, find the binary that we will use on our system, and install it.

For our Debian Linux system, we run the following commands:


After installing the Collector, we verify that it’s running:

$ sudo systemctl status otelcol-contrib
    ● otelcol-contrib.service - OpenTelemetry Collector Contrib
        Loaded: loaded (/lib/systemd/system/otelcol-contrib.service; enabled; vendor preset: enabled)
        Active: active (running) since Tue 2022-12-27 20:43:26 PST; 5s ago
      Main PID: 677031 (otelcol-contrib)
         Tasks: 13 (limit: 18868)
        Memory: 24.7M
        CGroup: /system.slice/otelcol-contrib.service
                ├─677031 /usr/bin/otelcol-contrib --config=/etc/otelcol-contrib/config.yaml
                └─677047 /usr/bin/dbus-daemon --syslog --fork --print-pid 4 --print-address 6 --session



Set up PostgreSQL permissions for the Collector

For our example, we’ll scrape metrics from our local instance of PostgreSQL.

$ psql --version psql (PostgreSQL) 12.12 (Ubuntu 12.12-0ubuntu0.20.04.1)


Our Collector will need permissions in PostgreSQL to interact with the PostgreSQL statistics collector. To do this, we create a new PostgreSQL user, which we will call otel, with a password (otelpassword). We’ll create an empty database (also called otel) with otel as the owner, for simpler connection testing. Lastly, our otel user will need SELECT privileges on the pg_stat_database view in PostgreSQL.

postgres=# create user otel with password 'otelpassword'; CREATE ROLE postgres=# create database otel owner otel; CREATE DATABASE postgres=# grant SELECT on pg_stat_database to otel; GRANT


We can verify that we did this correctly by connecting to PostgreSQL as the otel user and attempting to read from pg_stat_database:

~$ psql -h localhost -p 5432 -U otel -W
Password: ************

psql (12.12 (Ubuntu 12.12-0ubuntu0.20.04.1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

otel=> select count(*) from pg_stat_database;
 count 
-------
  13 
(1 row)


With PostgreSQL properly prepped, we can configure our Collector.


Configure a Collector receiver

The Collector primarily consists of three components:

  • Receivers collect data from data sources, either by push or pull.

  • Processors operate on data between its point of collection and export, performing operations such as filtering or transformation.

  • Exporters ship the data out of the Collector, whether that be to a simple local logging console or to a remote backend like Lightstep.

To get started, we must configure our postgresql receiver by editing the Collector configuration file found at /etc/otelcol-contrib/config.yaml. Our simple configuration file, in its entirety, looks like this:

receivers:
  postgresql:
    username: otel
password: otelpassword

processors:
  batch:

exporters:
  logging:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [postgresql]
      processors: [batch]
      exporters: [logging]


We have configured the postgresql receiverpostgresql receiver with the username and password we set up in PostgreSQL. We will use the standard batch processorbatch processor, which batches incoming data and compresses it for more efficient exporting.

Finally, we’ll use the logging exporterlogging exporter, which is configured with verbosity set to detailed. We’ll do this initially just to verify that the Collector is properly capturing metrics from PostgreSQL, but we’ll change our exporter later.


Next, we restart the Collector.

$ sudo systemctl restart otelcol-contrib


To verify that the Collector is receiving metrics from our database, we run the following command:

$ journalctl -u otelcol-contrib -f
...
InstrumentationScope otelcol/postgresqlreceiver 0.68.0
Metric #0
Descriptor:
      -> Name: postgresql.backends
      -> Description: The number of backends.
      -> Unit: 1
      -> DataType: Sum
      -> IsMonotonic: false
      -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2022-12-28 05:43:05.738132562 +0000 UTC
Timestamp: 2022-12-28 05:43:35.772060409 +0000 UTC
Value: 3
Metric #1
Descriptor:
      -> Name: postgresql.commits
      -> Description: The number of commits.
      -> Unit: 1
      -> DataType: Sum
      -> IsMonotonic: true
      -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2022-12-28 05:43:05.738132562 +0000 UTC
Timestamp: 2022-12-28 05:43:35.772060409 +0000 UTC
Value: 159813
Metric #2
Descriptor:
      -> Name: postgresql.db_size
      -> Description: The database disk usage.
      -> Unit: By
      -> DataType: Sum
      -> IsMonotonic: false
      -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2022-12-28 05:43:05.738132562 +0000 UTC
Timestamp: 2022-12-28 05:43:35.772060409 +0000 UTC
Value: 8168303
...


We’ve verified that the Collector is receiving metrics from PostgreSQL. Now, we’re ready to export those metrics to Lightstep. Before we reconfigure the Collector with a different exporter, let’s get set up with Lightstep.


Obtain a Lightstep access token

Signing up for a new Lightstep accountSigning up for a new Lightstep account is free and simple.

Sign Up


After you’ve logged in, you can navigate to Project Settings, and then to the Access Tokens page. Create a new access token which your Collector will use for authentication when exporting its data to Lightstep.

Access Token

Configure the Collector to export to Lightstep

Now that we have a Lightstep access token, we will need to modify our Collector configuration (found at /etc/otelcol-contrib/config.yaml). It should look like this:

receivers:
  postgresql:
    username: otel
    password: otelpassword

processors:
  batch:

exporters:
  logging:
    verbosity: detailed
  otlp/lightstep:
    endpoint: ingest.lightstep.com:443
    headers: {"lightstep-access-token": "INSERT YOUR TOKEN HERE"}

service:
  pipelines:
    metrics:
      receivers: [postgresql]
      processors: [batch]
      exporters: [otlp/lightstep]

We’ve configured a new exporter, which we call otlp/lightstep. This otlp exporter uses the OpenTelemetry Protocol (OTLP); we give it a unique name for our use of the exporter with Lightstep by adding /lightstep to our exporter configuration. We use the standard Lightstep endpoint, and we paste in our access token from the previous step.

Finally, we restart the Collector.

$ sudo systemctl restart otelcol-contrib

Those are all of the steps we need to take on the Collector side. Now, we can move to Lightstep to start thinking about queries and alerts.


Working with Metrics in Lightstep

In this section, we’ll cover some basic usage of Lightstep. However, more detailed usage steps and examples can be found herehere.


Creating a dashboard

We can begin by creating a dashboard with charts of various PostgreSQL metrics that the Collector has captured. On the Dashboards page, click on Create Dashboard.

Dashboard Search


On the New Dashboard page, we can provide a name and a description for our dashboard.

Postgres Metrics


Adding a chart

Next, we click on Add a chart. For the new chart, we provide a title for the chart. We can use the query builder to build a query around a specific metric or set of metrics.

5. Add chart initial


With “Metric” selected, we can begin typing in the search box. We are shown suggestions that begin with postgresql, which show metrics that are coming through the Collector.

Query filter image


For example, we can choose postgresql.backends to plot the number of client connections to our PostgreSQL instance. After opening several terminals to make several connections to PostgreSQL, we begin to see data plotted like the following:

Number of connections data plot


We can adjust the time range for our query. For example, we can adjust the chart to show the last 15 minutes.

Time range chart


The resulting chart looks like this:

Time chart, view one

Adding multiple metrics to a chart

We can add another metric to this chart, such as postgresql.connection.max.

Multiple metrics chart


The resulting chart now shows both metrics, with active connections in blue (all in the 6-10 connections range) and max connections in purple (fixed at 100).

Line chart image

Working with formulas

Although seeing the number of active connections and max connections is helpful, it would be more useful for us to see the proportion of active connections to max connections. For this, we continue working within the present cart, but we click on Add a formula.

Add formula screenshot


The active connections metric is labeled (by Lightstep) as a, while the max connections metric is labeled as b. To express the percentage of our allowable connections that is currently active, we would write a formula like this:

Another view of formula percentages


Then, we toggle what our chart displays, showing only the resulting percentage value while hiding the raw metrics. We do this by checking the box for our formula but unchecking the boxes for metrics a and b.

A view of a formula chart


The resulting chart looks like this:

Image of 20 connections


Of course, this chart isn't very inspiring, as max connections just happen to be set to 100. Therefore, an active connection count of eight will result in an active connections proportion of 8%. However, if we were to modify the configuration of our PostgreSQL instance to set the max connections to 20, then our chart would look like this:

Image of percentage of connections active


Quite quickly, our proportion of active connections jumps from around 4% (which is 4 out of 100) to 20% (which is 4 out of 20).

We change the type of our chart to show Area rather than Line, and then we rename our chart to “% of Connections Active.” Finally, we save our chart. Our dashboard, with one chart, now looks like this:

Ingest metrics - postgres - dashboard with one chart


Adding multiple charts to a dashboard

After adding several charts, each with its own query, our dashboard begins to take shape.

Image of Postgres metrics dashboard

Working with UQL in the Query Editor

Returning to our “% of Connections Active” query, you’ll recall how we originally used the Query Builder to create this query. If you’re familiar with UQLUQL, you can also create your queries directly with the Query Editor. The equivalent of the query we created, in UQL, looks as follows:

Image of connections active

Creating an alert

Lastly, we can also create alerts on our queries. Within our chart, we click on Create an alert.

Image of create an alert button


We can configure our alert to trigger based on certain conditions related to our formula. For example, we can set a “Critical threshold” at 80% of allowable connections active. We can also set a “Warning threshold” at 60%.

Then, we can configure how Lightstep should notify us when these thresholds are reached. We can set up a notification to a generic webhook, Slack, or other destinationsdestinations.

Image of alert configuration

Conclusion

In this walkthrough, we guided you through the setup of the OpenTelemetry Collector to receive metrics from PostgreSQL and export them to Lightstep for querying, analysis, and alerting. Throughout this series, we’ll provide guides for connecting other components in your tech stack with the Collector and Lightstep.

Whether your organization is running a web application, an ML-backed data pipeline, or business-critical service integrations, you likely run PostgreSQL, and the performance and uptime of your PostgreSQL instances are essential to the success of your business. By monitoring key PostgreSQL metrics related to connections, resource usage, and transaction throughput, you can quickly detect and even preempt performance issues that might cripple your PostgreSQL-dependent systems and impact your operational efficiency.

For more detailed information on what we’ve covered, check out the following resources:

OpenTelemetry CollectorOpenTelemetry Collector

PostgreSQL receiver for OpenTelemetry CollectorPostgreSQL receiver for OpenTelemetry Collector

PostgreSQL Statistics CollectorPostgreSQL Statistics Collector

Exporting OpenTelemetry Collector data to Lightstep with OTLPExporting OpenTelemetry Collector data to Lightstep with OTLP

Want to learn more? Schedule a demoSchedule a demo.

Request a demo!Request a demo!
February 15, 2023
11 min read
Technical

Share this article

About the author

Robin Whitmore

Robin Whitmore

Read moreRead more

Three Terraform Mistakes, and How to Avoid Them

Adriana Villela | Nov 16, 2022

Learn how to fix three Terraform gotchas around module and provider configurations.

Learn moreLearn more

Defining Services for Incident Response

Darius Koohmarey | Oct 17, 2022

In Lightstep Incident Response, we believe in defining the service on every alert or incident. There are numerous benefits associated with this approach.

Learn moreLearn more

Events, Alerts, and Incidents - What's the Difference?

Darius Koohmarey | Oct 5, 2022

Understand the difference between events, alerts, and incidents. Lightstep Incident Response is the all-in-one platform that enables developers, DevOps, and site reliability engineers to respond quickly and effectively to incidents.

Learn moreLearn more
THE CLOUD-NATIVE RELIABILITY PLATFORM

Lightstep sounds like a lovely idea

Monitoring and observability for the world’s most reliable systems