Monitor your home lab devices with Grafana Cloud for free
I was about to install a monitoring stack at home to monitor my Linux devices when I learned Grafana Cloud now has a forever free plan.
And let me tell you, it’s just like they said it is: “A free plan that’s actually useful”. Some of what you get for free:
- 10,000 series for Prometheus metrics
- 50 GB of logs
- 14-day retention for metrics and logs
But more importantly, you don't need to maintain a time-series database or a Grafana instance in your environment.
In this post, we will describe the steps to get you up and running. In a nutshell, you need to:
1. Create an account
2. Install an agent on your server(s)
3. Select a pre-built dashboard
If all goes well, the end result should look something like this:
1. Create an account
After you create an account, you get access to the Grafana Cloud Portal.
From here, you want to retrieve your service IDs and API Key. If you click on Prometheus Details, you are presented with the following:
Copy your 5-digit User. Do the same for Loki, if you plan to export logs along with metrics.
The API Key can be generated by following the link provided in the Password section or from the menu option on the left-hand side (you only need one key). Make sure the Role is MetricsPublisher.
2. Install an agent on your devices
In order to export metrics from a host, you need to install the Grafana Cloud Agent, which is “a subset of Prometheus built for hosted metrics that runs lean on memory and uses much of the same battle-tested code that has made Prometheus so awesome”.
Grafana provides great walkthroughs to help you install an agent — per service — on your host(s). This is perfect for a single host.
However, if you are planning to deploy this on multiple hosts and want some consistency, it is better to automate the process, even more so if you need to install more than one agent per host and want them to start automatically on boot.
You can run an Ansible role to automate the installation of these agent(s). The Ansible role takes prometheus_user and grafana_api_key as input values, which you get from the previous step. The role installs and configures the latest version of the Grafana Cloud Agent from GitHub releases on all the target hosts. It also creates a systemd service to manage the agent.
To install the role on your Ansible control machine, you can execute ansible-galaxy role install nleiva.grafana_agent. Then run a playbook that targets all the Linux hosts you want to install the agent(s) on. In the following example, the input variables prometheus_user and grafana_api_key are sourced from a file: secret/vars-enc.yml.
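A minimal sketch of such a playbook could look like this. The role name, the two variables, and the vars file path come from the text above; the playbook file name, the hosts: all target, and become: true are my assumptions, so adjust them to your inventory and check the role's README before relying on them.

```yaml
---
# Hypothetical file name: grafana-agent.yml
- name: Install the Grafana Cloud Agent on Linux hosts
  hosts: all              # assumption: replace with your own inventory group
  become: true            # assumption: installing a systemd service needs root
  vars_files:
    # Expected to define prometheus_user and grafana_api_key
    - secret/vars-enc.yml
  roles:
    - role: nleiva.grafana_agent
```

Assuming the vars file is encrypted with Ansible Vault, you would run it with something like ansible-playbook grafana-agent.yml --ask-vault-pass.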
If the variable loki_user is passed to this role, it will also install the Promtail agent to scrape log messages from /var/log and journald, and create a systemd service for it. Promtail is an agent for logs that is heavily inspired by Prometheus.
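For reference, the variables file might hold something like the following before you encrypt it. Every value here is a placeholder, and writing the IDs as quoted strings is an assumption on my part; loki_user is only needed if you want Promtail installed as well.

```yaml
---
# secret/vars-enc.yml, shown unencrypted for illustration only.
# All values below are placeholders, not real credentials.
prometheus_user: "12345"                        # the User from Prometheus Details
grafana_api_key: "REPLACE_WITH_YOUR_API_KEY"    # key with the MetricsPublisher role
loki_user: "67890"                              # optional: the User from Loki Details
```

If you use Ansible Vault, ansible-vault encrypt secret/vars-enc.yml keeps the API key out of plain text in your repository.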
3. Select a pre-built dashboard
Now the fun part! Click Login next to the Grafana section in your Grafana Cloud Portal to access your Grafana instance.
If you enabled the Linux Server Integration in the Integration Management section of your Grafana Cloud instance, you will have a nice Linux Node dashboard pre-configured that displays CPU/Memory Usage and Disk I/O, along with other metrics from your hosts.
You can find the Dashboard in the Manage section of the Dashboards menu.
Alternatively, you can import a Dashboard, as long as Prometheus is the data source for it. For example, let’s import Dashboard ID 11074.
And just like that, after selecting the data source, you end up with something like this (It’s magic!):
Optional: Create a custom dashboard panel
The Linux Node dashboard that comes with the Linux Server Integration does not include a panel for CPU temperature, which is something you might want to monitor if you have some Raspberry Pis, for example (the value that vcgencmd measure_temp reports).
The Grafana Cloud Agent exports thermal data by default as the node_thermal_zone_temp metric, which is read from /sys/class/thermal. This means you should be able to visualize CPU temperature in Grafana without too much effort.
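If the metric does not show up for a host, it can help to confirm that the host exposes thermal zones at all. Here is a small sketch of a playbook that prints the raw readings; the play and task names and the hosts: all target are hypothetical. On Linux the temp files usually hold millidegrees Celsius, so 48312 would mean roughly 48.3 °C.

```yaml
---
# Hypothetical check, not part of the nleiva.grafana_agent role.
- name: Check which hosts expose thermal zones
  hosts: all              # assumption: replace with your own inventory group
  gather_facts: false
  tasks:
    - name: Read raw thermal zone temperatures
      # shell (not command) is needed so the wildcard is expanded
      ansible.builtin.shell: cat /sys/class/thermal/thermal_zone*/temp
      register: zone_temps
      changed_when: false   # read-only task, never reports a change
      failed_when: false    # hosts without thermal zones should not fail the play

    - name: Show the readings (an empty list means no thermal zones were found)
      ansible.builtin.debug:
        var: zone_temps.stdout_lines
```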
In the Explore section of the Grafana instance, we can run a query to verify we are receiving this data. This is optional but helps you refine your query.
Adding a panel to a dashboard
To add this metric to an existing dashboard, click Add panel in the top right corner of the dashboard where you want to include it.
Choose a title and a graph type, in this case, Gauge.
You can add some options, like a metric unit, the number of decimals to display, and set thresholds.
Last, but not least, you need to define your query, in this case node_thermal_zone_temp{job="integrations/node_exporter", instance="$instance", type=~"cpu-thermal|x86_pkg_temp"}.
Conclusions
The Grafana Cloud free plan is an excellent option for monitoring your lab devices if you want to save time and money.
In the future, I'll be exploring how this can work with Performance Co-Pilot (PCP).