Checkmk has an integrated graphing system with comprehensive features for visualizing and storing metrics. However, it can still be helpful to use Grafana as an external graphing system, for example if you already use Grafana with other data sources connected to it and want a single, unified dashboard.
Beginning with version 1.5.0p16, the Checkmk Enterprise Editions can be addressed directly as a data source in Grafana, so that individual metrics, or even entire graphs as predefined by Checkmk, can be displayed in Grafana. In addition, you can create your own graphs dynamically by using regular expressions to specify the set of hosts and services whose metrics should be included in the graph.
This article explains the basics of how to get your Checkmk metrics into Grafana and display them there. For detailed instructions on how to use and configure Grafana, see the documentation at Grafana Labs.
Since the plug-in for Grafana is developed in parallel to Checkmk, it is not included with Checkmk. You can, however, keep up to date with the latest development status through the repository on GitHub.
Listing the data source directly in the official Grafana Data Sources is still in progress. Once the plug-in has been officially incorporated you will be able to install it directly from Grafana’s interface.
There are two ways to add the plug-in to Grafana:
* Download the zip file from the GitHub project and manually copy its content into the Grafana plug-in directory.
* Clone the GitHub project directly into the plug-in directory.
Installing from the zip file is the simplest variant, and is preferred if git is not installed on the Grafana server and you cannot or do not want to install it.
To install the plug-in, simply download the zip file with the latest version and copy it, for example with scp, to the Grafana server:
Alternatively, you can also download the file directly from the command line. Note that you need to know the correct version for this. In the following command, the -O option explicitly sets the name under which the file is saved locally; without it, wget would derive the file name from the URL.
root@linux# wget -O grafana-checkmk-datasource-1.1.0.zip https://github.com/tribe29/grafana-checkmk-datasource/archive/1.1.0.zip
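Because the version number appears in both the URL and the local file name, it can help to derive both from a single variable. The following is a minimal sketch that assumes the URL pattern from the wget command above stays the same for other versions; the function names checkmk_plugin_url and checkmk_plugin_file are made up for illustration.

```shell
# Base URL of the GitHub project, taken from the wget example above.
PLUGIN_REPO="https://github.com/tribe29/grafana-checkmk-datasource"

# Print the download URL for a given plug-in version (hypothetical helper).
# Assumption: every release follows the same /archive/<version>.zip pattern.
checkmk_plugin_url() {
    printf '%s/archive/%s.zip\n' "$PLUGIN_REPO" "$1"
}

# Print the local file name to pass to wget -O (hypothetical helper).
checkmk_plugin_file() {
    printf 'grafana-checkmk-datasource-%s.zip\n' "$1"
}

# Usage (left commented out because it performs the actual download):
# VERSION=1.1.0
# wget -O "$(checkmk_plugin_file "$VERSION")" "$(checkmk_plugin_url "$VERSION")"
```

With this, switching to another release only requires changing the version in one place.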
The content is then extracted into the Grafana plug-in directory. This is usually /var/lib/grafana/plugins/:
root@linux# unzip grafana-checkmk-datasource-1.1.0.zip && mv grafana-checkmk-datasource-1.1.0 /var/lib/grafana/plugins/
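Since the plug-in directory is only usually located at this path, a small guard around the move step can catch a wrong path before anything is moved. This is a sketch only: install_plugin_dir is a hypothetical helper, and the default path is the one named in the text, which may differ on your system.

```shell
# Move an extracted plug-in directory into the Grafana plug-in directory,
# but only if both directories actually exist (hypothetical helper).
install_plugin_dir() {
    src="$1"                               # extracted plug-in directory
    dest="${2:-/var/lib/grafana/plugins}"  # default from the text; may differ

    if [ ! -d "$src" ]; then
        echo "Source directory not found: $src" >&2
        return 1
    fi
    if [ ! -d "$dest" ]; then
        echo "Plug-in directory not found: $dest" >&2
        return 1
    fi
    mv "$src" "$dest"/
}

# Usage after unzipping (commented out, as it changes the file system):
# install_plugin_dir grafana-checkmk-datasource-1.1.0
```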
Finally you can activate and set up the plug-in in the Grafana interface.
The variant described above has the fewest requirements and is easy to implement even for less experienced users. Getting the plug-in directly from the Git repository, however, has several advantages:
* Upgrading to a new version can be performed quickly with two commands: change into the plug-in directory and run git pull.
* You can use the current development version directly from the repository if you want to test a new feature:
git checkout develop
To use the plug-in from a clone of the repository, git must be installed on the Grafana server. The procedure is then straightforward. Simply clone the repository into the Grafana plug-in directory:
root@linux# cd /var/lib/grafana/plugins/ && git clone https://github.com/tribe29/grafana-checkmk-datasource.git
Cloning into 'grafana-checkmk-datasource'...
remote: Enumerating objects: 541, done.
remote: Total 541 (delta 0), reused 0 (delta 0), pack-reused 541
Receiving objects: 100% (541/541), 291.55 KiB | 0 bytes/s, done.
Resolving deltas: 100% (374/374), done.
Checking connectivity... done.
Afterwards the plug-in is available in the Grafana GUI, and from there can be activated and set up.
Since the master branch always shows the latest version, after a new release you just need to execute the following command to update the plug-in on the Grafana server:
root@linux# cd /var/lib/grafana/plugins/grafana-checkmk-datasource && git pull
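The update only works if the plug-in directory is a Git clone; a directory created from the zip file has no Git metadata. The following sketch wraps the update in a check for that case. The name update_plugin is hypothetical, and the default path assumes the clone location used above.

```shell
# Update a cloned plug-in, refusing to run if the directory is not a
# Git clone, for example because it was installed from the zip file.
update_plugin() {
    dir="${1:-/var/lib/grafana/plugins/grafana-checkmk-datasource}"

    if [ ! -d "$dir/.git" ]; then
        echo "Not a Git clone: $dir" >&2
        return 1
    fi
    # git -C runs the command as if started in the given directory.
    git -C "$dir" pull
}

# Usage (commented out because it contacts the remote repository):
# update_plugin
```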
After the necessary files have been installed you can activate the plug-in in Grafana. Go to the configuration and select the Data Sources tab. Here you can add a new data source using the Add data sources button:
The entry for Checkmk can be found under the category Others:
The configuration form for the data source is quite simple. Enter the URL for your instance and an automation user here. Important: in a distributed setup with multiple instances, specify the URL of the master instance here:
If you want to connect more than one Checkmk instance, you can optionally give each connection a unique name; otherwise simply leave the default name, Checkmk, as it is.
After you have saved the connection with the Save & Test button, it will be available as a data source in Grafana and you can configure your first graphs.
Dashboards are generated in Grafana using the ‘plus’ icon on the right side. With a click on Dashboard you can create something like this:
Checkmk automatically merges metrics into a graph to quickly compare content-related metrics. You can display the metrics from such a ready-made graph directly in Grafana. In an existing – or just created – dashboard, create a new Panel. Here you first select Add Query:
The Query should be Check_MK. You can then prefilter the query for a Checkmk instance (Site) — then select the desired Host, Service and Graph. The CPU utilization service is used here as an example:
The result is displayed immediately. As soon as you click on the Save icon, you will be prompted to specify a title for the Panel. Then you will be redirected directly to the dashboard:
Of course it is also possible to display individual metrics for a host. The procedure is very similar to that with predefined graphs – you just change the Mode to single metric, and instead of choosing a predefined graph, select the Metric for a service:
Again, save the Panel and view the result in the dashboard:
Especially in a dynamic cluster, you often want to be able to track the entire history of a metric across all participating hosts without having to adjust a graph each time a node is added or removed. As of Checkmk version 1.6.0p2 you also have the option to create graphs dynamically using regular expressions. This requires version 1.1.0 of the plug-in.
Change the Mode in a new Panel to combined Graph. The general setting options remain unchanged, but you can now combine metrics from one or more hosts and services. All of the regular expressions that you know from Checkmk are available here. Note that regular expressions can optionally be used for the hosts as well. The expression .* in the service field is only there for clarity; it would work without it.
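Before entering a pattern in the panel, it can be useful to preview roughly which names it would select. The sketch below uses grep -E on a list of names as an approximation only: Checkmk's own matching rules, for example how patterns are anchored, may differ from grep. The helper name preview_regex and the host names are invented for illustration.

```shell
# Print the names from standard input (one per line) that match the
# given extended regular expression. This only approximates how
# Checkmk evaluates the pattern (assumption, not Checkmk's own logic).
preview_regex() {
    grep -E -- "$1"
}

# Example with made-up host names: select all hosts named node<number>.
printf '%s\n' node01 node02 dbserver | preview_regex '^node[0-9]+$'
```

In this example the pattern selects node01 and node02 but not dbserver, so a newly added node03 would be picked up without changing the expression.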
In addition to the advanced filter options, with Aggregation you can specify the representation of the metrics in the graph, and with Graph, which graph should be used as a reference. Note that metrics for a host/service will only be displayed if the host/service also has this selected graph. The example graph looks like this:
From Checkmk version 1.6.0p2 and version 1.1.0 of the plug-in, it is also possible to control the metric names using variables and to show status changes of certain services as comments.
The plug-in normally uses the metric name as it has already been defined in Checkmk. You therefore do not have to define a human-readable alias yourself to avoid working with the cryptic metric names that the code uses internally.
However, if you want to use metrics from several hosts in a graph, it can quickly lead to confusion regarding the source of a metric. To solve this problem you can adjust the display name in a panel to always get clear information. You can choose from a number of variables:
The metric’s title as it would be represented in Checkmk.
The Checkmk instance on which the host and its metric are monitored.
The host that the metric is associated with.
The service to which the metric is assigned in Checkmk.
With these variables you can easily adjust the metric label even if you display several metrics in one graph. In the example below the following expression was used in the corresponding Label Format field:
The result looks like this, for example:
Grafana supports comments (annotations) in your graphs. A comment marks an event directly in the graph, which makes it possible to leave a note at a specific point in time. You can also have the status changes of one or more services displayed automatically by adding an Annotation Query.
You can access the configuration by clicking the ‘gear’ icon on the dashboard, and then opening the configuration for the Annotations:
Use the New-/Add Annotation Query button to create a new query. Set the Data source to Checkmk, and under Name enter the display name under which the query will later be shown in the dashboard. You also determine whether the annotation query is activated immediately (Enabled) and/or invisible (Hidden). The color of the comments can also be defined here if desired. In this example it was set to yellow, since this query is meant to show only WARN states:
The actual query then works in a similar way to creating a graph. You only have to explicitly determine the instance to be queried, since it is not possible to query all Checkmk instances here. Finally you determine the status of the service or services to be displayed:
Important: Limit the data to be displayed as much as possible, because comments are shown in all compatible graphs on the dashboard. If in doubt create several small annotation queries rather than one large one.
After you have added the configuration (Add button) and saved the new dashboard settings, go back to your dashboard. Depending on whether you activated the query directly during setup, you may already see automatically generated comments in your graphs: