Monitoring a Raspberry Pi with Grafana, InfluxDB and collectd

If you use a Raspberry Pi as a microserver, it might be a good idea to keep an eye on various metrics. In this article, I will show you how to set up collectd, a metrics collection daemon, InfluxDB, a time series database which will store the collected data, and finally Grafana, to display a dashboard.

[Screenshot: the final dashboard, with graphs and gauges showing CPU load, network traffic and used disk space.]
The final result will look something like this.

Preparations

I will assume you already have your Pi up and running, with shell access (for example via SSH). First, let us make sure we are running the latest software:

sudo apt update && sudo apt upgrade

InfluxDB — data storage

InfluxDB is a time-series database, meaning it is mostly used to store time-stamped data. It can be installed directly from the Raspbian repository:

sudo apt install influxdb influxdb-client

Now, open up /etc/influxdb/influxdb.conf in whatever text editor pleases you the most (e.g. nano or vim), and make the following changes:

First, we are going to move the storage directories into the tmpfs under /tmp, so values are kept in RAM and not written to the SD card. This should both speed up queries and extend the life of the SD card; keep in mind that anything under /tmp does not survive a reboot, so the metrics are ephemeral by design. To do that, change the following configuration settings:

[meta]
dir = "/tmp/influxdb/meta"

# ABRIDGED

[data]
dir = "/tmp/influxdb/data"
wal-dir = "/tmp/influxdb/wal"
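Whether /tmp is actually backed by RAM depends on your setup; a quick optional check is to ask findmnt for the file system type behind /tmp:

findmnt -n -o FSTYPE -T /tmp

If this prints tmpfs, the data will live in RAM; if it prints something like ext4, /tmp still sits on the SD card and you would have to mount a tmpfs there yourself.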

Next up, we will enable the HTTP endpoint, so Grafana can later query our data. Here, we bind to localhost only, because Grafana and InfluxDB run on the same Raspberry Pi and we do not want other hosts on the network to be able to scoop up our data. If you want to query this instance from remote hosts (for example when running a whole cluster of Pis), change the bind-address to 0.0.0.0.

[http]
enabled = true
bind-address = "127.0.0.1:8086"
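Once InfluxDB is running (we will start it in a moment), a quick way to confirm the HTTP endpoint answers is its /ping route, which returns HTTP 204 No Content on a healthy 1.x instance:

curl -i http://127.0.0.1:8086/ping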

Finally, as described in this post, we will enable the collectd listener, so InfluxDB knows on which port to accept collectd's data and how to interpret it:

[[collectd]]
enabled = true
port = 25826
database = "collectd_db"
typesdb = "/usr/share/collectd/types.db"
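One caveat: /usr/share/collectd/types.db is shipped by the collectd package, which we only install further down. If InfluxDB logs an error about a missing types.db at startup, simply restart it once collectd is installed:

sudo systemctl restart influxdb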

That’s it! Now we can start InfluxDB, enable it on startup and check if we can connect to it:

sudo systemctl enable influxdb
sudo systemctl start influxdb
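Once both commands succeed, you can also check that the collectd listener from above is bound to UDP port 25826 (ss is part of iproute2 and preinstalled on Raspbian):

sudo ss -ulpn | grep 25826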

If all goes well, we can fire up the client and it should connect to the database:

$ influx
Connected to http://localhost:8086 version 1.6.4
InfluxDB shell version: 1.6.4
>

Enter the following commands into the influx prompt to create the collectd database with a retention policy of 24h, meaning data older than a whole day will be dropped:

CREATE DATABASE collectd_db
CREATE RETENTION POLICY "twentyfour_hours" ON "collectd_db" DURATION 24h REPLICATION 1 DEFAULT

twentyfour_hours is the name of the policy, should you want to delete it later. The REPLICATION directive is only relevant for clustered setups but must be set nonetheless; here, we only want one copy of our data.
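Still inside the influx prompt, you can verify that the policy exists and is marked as the default:

SHOW RETENTION POLICIES ON collectd_db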

Press <CTRL><D> to exit the influx prompt, and continue with the next step.

collectd — metric collection

Our database is now ready to receive data, so let us feed it some! collectd will be responsible for periodically collecting metrics about our system and writing them into the database. It is also included in the Raspbian package repository and can be installed like this:

sudo apt install collectd collectd-utils

Once again, open the configuration file /etc/collectd/collectd.conf and ensure the following settings are not commented out:

Hostname "microserver314"

Interval 60
LoadPlugin syslog
LoadPlugin cpu
LoadPlugin cpufreq
LoadPlugin df
LoadPlugin disk
LoadPlugin entropy
LoadPlugin interface
LoadPlugin irq
LoadPlugin load
LoadPlugin memory
LoadPlugin network
LoadPlugin processes
LoadPlugin swap
LoadPlugin thermal
LoadPlugin users

You can choose any hostname you want, but make sure you later adapt your Grafana queries accordingly. Interval determines how often collectd will collect and write the data; here we specified 60 seconds, meaning collection happens every minute. Afterwards, we load a bunch of plugins, which among other things report CPU usage and frequency, disk space and disk I/O, network traffic, system load, memory and swap usage, process states, temperatures and logged-in users. For further details, consult the man page collectd.conf(5).

If you feel a certain plugin is unnecessary for your needs, just comment out its LoadPlugin statement and, if present, its <Plugin> block. Now that the plugins are loaded, we must configure some of them individually:

<Plugin df>
# Ignore uninteresting file systems
# to keep our DB from getting cluttered
    FSType rootfs
    FSType sysfs
    FSType proc
    FSType devpts
    FSType tmpfs
    FSType fusectl
    FSType cgroup
    IgnoreSelected true
</Plugin>
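If you are unsure which file system types are actually mounted on your Pi (and therefore what is worth ignoring), df can list them together with their type:

df -hT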

<Plugin "syslog">
# Skip messages with info label
    LogLevel "warning"
</Plugin>

Finally, we must tell collectd where to write all the data. We will set it to the host and port of InfluxDB’s collectd listener:

<Plugin "network">
    Server "127.0.0.1" "25826"
</Plugin>
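Before starting the daemon, you can let collectd parse the configuration and report syntax errors without actually running it (-t tests the config and exits):

sudo collectd -t -C /etc/collectd/collectd.conf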

And here as well, we start the service and configure it to auto-launch on boot:

sudo systemctl enable collectd
sudo systemctl start collectd

Optional: test that data arrives in the DB

Let us see if we can fetch the number of sleeping processes:

$ influx
Connected to http://localhost:8086 version 1.6.4
InfluxDB shell version: 1.6.4
> USE collectd_db
> SELECT * FROM processes_value WHERE ("type_instance" = 'sleeping') ORDER BY time DESC LIMIT 1
name: processes_value
time                host           type     type_instance value
----                ----           ----     ------------- -----
1612535387871343898 microserver314 ps_state sleeping      97
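A similar query works for any other plugin; memory usage, for instance, ends up in a measurement called memory_value under the same naming scheme (adjust the name if yours differs):

SELECT last("value") FROM "memory_value" WHERE "type_instance" = 'used'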

Seems to work! Moving on, we will make the data more accessible and visually pleasing.

Grafana - interactive visualization

We collect the data, we store it safely in memory, so what is missing? The dashboard, of course! Grafana is not in the official Raspbian repository, but luckily they provide official builds. I am using a Raspberry Pi 3, so the armhf variant is used here; if you are running a 64-bit OS (for example on a Raspberry Pi 4), you might want to install the arm64 version instead.

wget https://dl.grafana.com/oss/release/grafana_7.3.7_armhf.deb
sudo dpkg -i grafana_7.3.7_armhf.deb
sudo systemctl enable grafana-server
sudo systemctl start grafana-server

You can now access the web interface at http://<address-of-your-pi>:3000. The default login is username **admin** and password **admin**. In order to complete the last step of the data pipeline, we must tell Grafana about our InfluxDB instance. On the first start, you should be prompted to add a data source; otherwise, you can always go to Configuration (cog wheel in the menu bar) ⇒ Add data source.

[Screenshot: the Grafana data source dialog with the InfluxDB entry selected.]

Select InfluxDB

Then enter the following values:

URL: http://127.0.0.1:8086
Database: collectd_db

Then click “Save & Test”.
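As an aside, Grafana can also pick up data sources from provisioning files instead of the UI. A minimal sketch (assuming the InfluxDB setup from above; the file name is arbitrary), dropped into /etc/grafana/provisioning/datasources/ and followed by a restart of grafana-server, could look like this:

# /etc/grafana/provisioning/datasources/influxdb.yaml
apiVersion: 1
datasources:
  - name: InfluxDB
    type: influxdb
    access: proxy
    url: http://127.0.0.1:8086
    database: collectd_db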

Now you have two options: you can either create your own dashboard or just use one from the internet.

Option A: use a preconfigured dashboard

Lots of people on the internet make their dashboards available. Just take a look here, and you will be able to get a JSON file that you can import in your Grafana instance.

For example, here is mine: dashboard.json. However, chances are your situation is not exactly the same. For example, I have an external drive attached and its third partition mounted (sda3), so you might still need to adapt the dashboard to fit your needs.

Option B: create your own

That is the path I took, and it is actually pretty easy. Click on the plus to add a new dashboard, then add a new panel. As an example, we will create a panel that displays network traffic.

On the right side, in the tab Panel, make sure Visualization is set to Graph. You can also change the panel title to something more descriptive, e.g. “Ethernet traffic”. Now we must edit the query that produces the graph.

[Screenshot: the query editor in Grafana.]
You can use the query editor to put together the query graphically.
  1. On the left side in the lower box, the query builder is visible.
  2. Click on select measurement, and select interface_rx from the list.
  3. Click on the plus next to WHERE, and set instance to eth0.
  4. Click on the plus again, and set type to if_octets.
  5. Compute the derivative by clicking on the plus in the SELECT row, choosing Transformations ⇒ Derivative and setting the interval to 1s.
  6. By default, Grafana will try to average over the interval. To only display the latest values, click on mean and then remove it. Also remove the time($__interval) statement in the GROUP BY section.

When you are done, your query in text form (the little pencil switches editor modes) should look something like this:

SELECT derivative("value", 1s) FROM "interface_rx"
WHERE ("instance" = 'eth0' AND "type" = 'if_octets')
    AND $timeFilter

You can now add a second query with interface_rx changed to interface_tx to also display outgoing traffic.
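In text form, that second query differs only in the measurement name:

SELECT derivative("value", 1s) FROM "interface_tx"
WHERE ("instance" = 'eth0' AND "type" = 'if_octets')
    AND $timeFilter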

Finally, under Axes ⇒ Left Y ⇒ Unit, select bytes/sec (IEC) to show the correct unit. And that’s it!

What’s next?

With Grafana set up, you can customize your dashboard to really fit your needs. You could hook up a spare TV and put Grafana in kiosk mode, to monitor everything at a quick glance. You could feed further information streams into the system, maybe include an RSS feed? The possibilities are vast!
