Monitoring your home lab devices in the cloud for free

I was about to start installing a monitoring stack at home to monitor my Linux devices when I learned Grafana Cloud now has a forever free plan.

And let me tell you, it’s just what they say it is: “a free plan that’s actually useful”. Some of what you get for free:

  • 10,000 series for Prometheus metrics
  • 50 GB of logs
  • 14-day retention for metrics and logs

But more importantly, you don't need to maintain a time-series database or a Grafana instance in your environment.

In this post, we will describe the steps to get you up and running. In a nutshell, you need to:

1. Create an account

2. Install an agent on your server(s)

3. Select a pre-built dashboard

If all goes well, the end result should look something like this:

Linux Node dashboard for my Home Lab

1. Create an account

After you create an account, you get access to the Grafana Cloud Portal.

From here, you want to retrieve your service IDs and API key. If you click on Prometheus Details, you are presented with the following:

Copy your 5-digit User. Do the same for Loki, if you plan to export logs along with metrics.

The API Key can be generated following the link provided in the Password section or from the menu option on the left-hand side (you only need one key). Make sure the Role is MetricsPublisher.

2. Install an agent on your devices

To export metrics from a host, you need to install the Grafana Cloud Agent, which is “a subset of Prometheus built for hosted metrics that runs lean on memory and uses much of the same battle-tested code that has made Prometheus so awesome”.

Grafana provides great walkthroughs to help you install an agent, per service, on your host(s). This is perfect for a single host.

However, if you are planning to deploy this on multiple hosts and want some consistency, it is better to automate the process. Even more so if you need to install more than one agent per host and want them to start automatically on boot.

You can run an Ansible role to automate the installation of these agent(s). The Ansible role takes prometheus_user and grafana_api_key as input values, which you get from the previous step. The role installs and configures the latest version of the Grafana Cloud Agent from GitHub releases on all the target hosts. It also creates a Systemd service to manage the agent.

To install the role on your Ansible control machine, you can execute ansible-galaxy role install nleiva.grafana_agent. Then run a playbook that targets all the Linux hosts you want to install the agent(s) on. In my playbook, the input variables prometheus_user and grafana_api_key are sourced from a file, secret/vars-enc.yml.
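The role installation and playbook run described above can be sketched roughly as follows. This is a minimal sketch, not the exact playbook used for this post. The playbook file name grafana-agent.yml, the hosts: all target, and become: true are assumptions for illustration. The --ask-vault-pass flag assumes the variables file is encrypted with Ansible Vault, which its name suggests but which is not confirmed here.

```shell
# Write a minimal playbook. The file name, target group, and privilege
# escalation setting are assumptions; adjust them to your inventory.
cat > grafana-agent.yml <<'EOF'
---
- name: Install the Grafana Cloud Agent
  hosts: all
  become: true
  vars_files:
    - secret/vars-enc.yml   # defines prometheus_user and grafana_api_key
  roles:
    - role: nleiva.grafana_agent
EOF

# Install the role and run the playbook, but only when Ansible is
# available and the variables file is in place.
if command -v ansible-playbook >/dev/null 2>&1 && [ -f secret/vars-enc.yml ]; then
  ansible-galaxy role install nleiva.grafana_agent
  # Assumes secret/vars-enc.yml is encrypted with Ansible Vault.
  ansible-playbook grafana-agent.yml --ask-vault-pass
fi
```

The guard simply skips the run until secret/vars-enc.yml exists, so the snippet can be pasted safely before the credentials are ready.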

If the variable loki_user is passed to this role, it will also install the Promtail agent to scrape log messages from /var/log and journald, and create a Systemd service for it. Promtail is an agent for logs that is heavily inspired by Prometheus.

3. Select a pre-built dashboard

Now the fun part! Click Login next to the Grafana section in your Grafana Cloud Portal to access your Grafana instance.

If you enable the Linux Server Integration in the Integration Management section of your Grafana Cloud instance, you get a pre-configured Linux Node dashboard that displays CPU and memory usage, disk I/O, and other metrics from your hosts.

You can find the Dashboard in the Manage section of the Dashboards menu.

Alternatively, you can import a Dashboard, as long as Prometheus is the data source for it. For example, let’s import Dashboard ID 11074.

And just like that, after selecting the data source, you end up with something like this (It’s magic!):

Pre-built dashboard you can import with a click of a button

Optional: Create a custom dashboard panel

The Linux Node dashboard that comes with the Linux Server Integration does not include a panel for CPU temperature, which is something you might want to monitor if you have some Raspberry Pis, for example (locally, you can read it with vcgencmd measure_temp).

The Grafana Cloud Agent exports thermal data by default as the node_thermal_zone_temp metric, which is read from /sys/class/thermal. This means you should be able to visualize CPU temperature in Grafana without too much effort.
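To see the raw values behind this metric on a host, here is a small sketch that reads the same sysfs files directly. The kernel reports these temperatures in millidegrees Celsius, so the sketch divides by 1000. The helper name millideg_to_celsius is made up for this example.

```shell
# Convert millidegrees Celsius (as reported by the kernel) to degrees.
millideg_to_celsius() {
  awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print "<zone type>: <temperature> C" for every thermal zone on this host.
for zone in /sys/class/thermal/thermal_zone*; do
  # Skip hosts without thermal zones and zones that cannot be read.
  raw=$(cat "$zone/temp" 2>/dev/null) || continue
  printf '%s: %s C\n' "$(cat "$zone/type")" "$(millideg_to_celsius "$raw")"
done
```

The zone type printed here (for example cpu-thermal on a Raspberry Pi) is what shows up as the type label on the metric, which is handy when you filter the query later.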

In the Explore section of the Grafana instance, we can run a query to verify we are receiving this data. This is optional but helps you refine your query.

Adding a panel to a dashboard

To add this metric to an existing dashboard, click on Add panel in the top right corner of the dashboard where you want to include it.

Choose a title and a graph type, in this case, Gauge.

You can add some options, like a metric unit, the number of decimals to display, and set thresholds.

Last but not least, you need to define your query. In this case: node_thermal_zone_temp{job="integrations/node_exporter", instance="$instance", type=~"cpu-thermal|x86_pkg_temp"}

Conclusions

The Grafana Cloud free plan is an excellent option for monitoring your lab devices if you want to save time and money.

In the future, I'll be exploring how this can work with Performance Co-Pilot (PCP).

Solutions Architect at Red Hat (ex Cisco). Cloud, Go and Open Source enthusiast.
