1.1. Background and motivation
You may be wondering why you should integrate Prometheus into Checkmk at all. An important note at this point: our Prometheus integration is aimed at all users who already run Prometheus. Integrating Prometheus into Checkmk closes the gap between the two systems, so that you no longer have to keep an eye on two monitoring systems at once.
This enables you to correlate the data from the two systems, accelerate any error analysis and, at the same time, facilitate communication between Checkmk and Prometheus users.
Finally, context again
As a welcome side effect of this integration, your metrics from Prometheus will in many cases automatically receive a meaningful context thanks to Checkmk. For example, while Prometheus correctly shows you the amount of main memory used, in Checkmk you do not have to take any extra manual steps to find out what share of the total available memory this is. As banal as the example may be, it shows where Checkmk makes monitoring easier, even in the smallest detail.
1.2. Exporter or PromQL
Once the extension package for Prometheus has been activated, the exporters described in this article are available: Node Exporter, cAdvisor and kube-state-metrics.
If we do not yet support the exporter you need, experienced Prometheus users also have the option of sending self-defined queries to Prometheus directly via Checkmk. This is done using Prometheus’ own query language, PromQL.
2. Setting up the integration
2.1. Creating a host
Since Prometheus has no concept of hosts, first create a place where the desired metrics are gathered. This host forms the central point of contact for the special agent and later distributes the delivered data to the correct hosts in Checkmk. To do this, create a new host in the WATO module Hosts.
If the specified host name does not correspond to an FQDN, enter the IP address at which the Prometheus server can be reached.
Make all other settings for your environment and confirm your selection with Save & Finish.
2.2. Creating a rule for the Prometheus datasource
Before Checkmk can find metrics from Prometheus, you must first set up the special agent using the Prometheus rule set. You can find this in WATO via Host & Service Parameters > Datasource Programs > Prometheus. Regardless of which exporter you want to use, under TCP Port number enter the port through which your Prometheus server’s web frontend can be reached.
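Before saving the rule, it can help to confirm that the host and port you plan to enter really lead to the Prometheus web frontend. The following is a minimal sketch and not part of Checkmk: it only builds the URL of an instant query against Prometheus' HTTP API endpoint /api/v1/query. The host name prometheus.example.com, the port 9090 and the helper name build_query_url are assumptions chosen for illustration.

```python
from urllib.parse import urlencode


def build_query_url(host: str, port: int, promql: str) -> str:
    """Return the URL of an instant query against the Prometheus HTTP API."""
    # /api/v1/query is the Prometheus endpoint for instant queries;
    # urlencode() escapes characters such as braces and quotes in the query.
    return f"http://{host}:{port}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical server name; 9090 is assumed here as the web frontend port.
url = build_query_url("prometheus.example.com", 9090, "up")
print(url)
```

If the port is correct, requesting the printed URL from the Checkmk server (for example in a browser) should return a JSON document rather than a connection error.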
Integration using Node Exporter
If, for example, you want to integrate the hardware components of a Scrape Target from Prometheus, use the Node Exporter. Select Add new Scrape Target, and from the dropdown menu that opens, select Node Exporter.
Here you can also select which hardware or operating system instances the Node Exporter should query. The services created in this way use the same check plug-ins as are used for other Linux hosts. They therefore behave exactly like the services you already know, so you can quickly configure thresholds or work with graphs without having to adapt to anything new.
Integration using cAdvisor
The cAdvisor exporter enables the monitoring of Docker environments and returns usage and performance metrics.
Via the selection menu Entity level used to create Checkmk piggyback hosts you can determine whether and how the data from Prometheus should be collected in an already-aggregated form. You can choose from the following three options:
Both - Display the information for both pod and container levels
Container - Display the information on container level
Pod - Display the information for pod level
If you select either Both or Container, also define the name under which hosts are created for your containers. Three naming options are available, with Short as the default:
Short - Use the first 12 characters of the docker container ID
Long - Use the full docker container ID
Name - Use the container’s name
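The effect of the three naming options can be illustrated with a small sketch. It is not Checkmk's actual implementation; the function name piggyback_host_name, the lowercase option keys and the sample container are assumptions made for illustration only.

```python
def piggyback_host_name(container_id: str, container_name: str, option: str = "short") -> str:
    """Pick the host name for a container according to the chosen naming option."""
    if option == "short":
        return container_id[:12]  # first 12 characters of the container ID
    if option == "long":
        return container_id  # full container ID
    if option == "name":
        return container_name  # the container's name
    raise ValueError(f"unknown naming option: {option!r}")


# Hypothetical container: a 64-character ID and a human-readable name.
cid = "4f1e2d3c4b5a" + "0" * 52
print(piggyback_host_name(cid, "web-frontend"))          # Short (default)
print(piggyback_host_name(cid, "web-frontend", "long"))  # Long
print(piggyback_host_name(cid, "web-frontend", "name"))  # Name
```

Only the Name option yields a host name that does not depend on the container ID.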
Please note that your selection here affects the automatic creation and deletion of hosts according to your dynamic host configuration.
Integration using kube-state-metrics
Within a Kubernetes cluster, deployments, nodes and pods can be queried with the kube-state-metrics exporter. The mechanics here are largely the same as for the Node Exporter or the cAdvisor exporter described above: you select the metrics that you want to monitor. The only difference is the Cluster name field, which determines the name of the host under which the data for a cluster is displayed.
Integration via PromQL
As already mentioned, with the help of the special agent it is also possible to send requests to your Prometheus servers via PromQL. Enter the port via which Prometheus can be reached, and select Service creation using PromQL queries > Add new Service. Use the Service Name field to determine what the new service should be called in Checkmk.
Then select Add new PromQL query and use the Metric label field to specify the name of the metric to be imported into Checkmk. Now enter your query in the PromQL query field. Note that this query must return exactly one value.
For example, you can query Prometheus for the number of running and blocked processes. In Checkmk, the two metrics, Running and Blocked, are then combined in a service called Processes.
Important: At the moment it is not yet possible to assign thresholds to metrics in this way.
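To make the single-value requirement more concrete, here is a hedged sketch of how two queries for the Processes example might look and how a response can be checked outside of Checkmk. The metric names node_procs_running and node_procs_blocked assume that a Node Exporter is being scraped. The helper single_value and the sample responses are hypothetical and hand-written in the JSON format of Prometheus' /api/v1/query endpoint.

```python
import json

# Possible queries for the Processes example. Without sum(), Prometheus would
# return one series per scraped instance instead of a single value.
QUERIES = {
    "Running": "sum(node_procs_running)",
    "Blocked": "sum(node_procs_blocked)",
}


def single_value(response_text: str) -> float:
    """Extract the one value from an instant-query response, or raise ValueError."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError("query failed")
    data = body["data"]
    if data["resultType"] == "scalar":
        return float(data["result"][1])  # scalars arrive as [timestamp, "value"]
    if data["resultType"] != "vector":
        raise ValueError(f"unsupported result type: {data['resultType']}")
    samples = data["result"]
    if len(samples) != 1:
        raise ValueError(f"expected exactly one value, got {len(samples)}")
    return float(samples[0]["value"][1])


# Hand-written sample responses (assumed values, not real measurements).
one_series = '{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1700000000,"3"]}]}}'
two_series = '{"status":"success","data":{"resultType":"vector","result":[{"metric":{"instance":"a"},"value":[1700000000,"3"]},{"metric":{"instance":"b"},"value":[1700000000,"1"]}]}}'

print(single_value(one_series))
try:
    single_value(two_series)
except ValueError as err:
    print("rejected:", err)
```

A query that matches several series can usually be narrowed down with an aggregation such as sum() or with a label selector, so that only one value remains.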
Assigning a rule to the Prometheus host
Assign this rule explicitly to the host you just created, and confirm your entries with Save.
2.3. Service Discovery
Now that you have configured the special agent, it is time to run a service discovery on the Prometheus host.
3. Dynamic host configuration
3.1. General configuration
Monitoring Kubernetes clusters is probably one of the most common tasks that Prometheus performs. Containers that are orchestrated by Kubernetes and monitored with Prometheus are sometimes very short-lived. To integrate them into Checkmk as well without great effort, it is advisable to set up a dynamic host configuration. The data from the individual containers is forwarded to Checkmk as piggyback data.
Simply create a new connection using WATO > Hosts > Dynamic config > New connection, select Piggyback data as the connector type, and use Add new element to define the conditions under which new hosts should be created dynamically.
Please also consider whether your environment requires hosts to be deleted dynamically again when no more data for them arrives at Checkmk via the piggyback mechanism. Set the option Delete vanished hosts accordingly.
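Conceptually, such a connection compares the hosts it manages with the hosts for which piggyback data currently exists. The following sketch models this comparison in a simplified way. It is an illustration under assumptions, not Checkmk's implementation, and the function plan_host_changes and the host names are hypothetical.

```python
from typing import Set, Tuple


def plan_host_changes(
    managed_hosts: Set[str], piggyback_hosts: Set[str], delete_vanished: bool
) -> Tuple[Set[str], Set[str]]:
    """Return (hosts to create, hosts to delete) for one synchronization run."""
    # Every host with piggyback data that does not exist yet is created.
    to_create = piggyback_hosts - managed_hosts
    # Hosts without piggyback data are only removed if the option is enabled.
    to_delete = managed_hosts - piggyback_hosts if delete_vanished else set()
    return to_create, to_delete


# Hypothetical containers: "web" keeps sending data, "batch-job" has vanished,
# and "cache" appears for the first time.
create, delete = plan_host_changes({"web", "batch-job"}, {"web", "cache"}, delete_vanished=True)
print(sorted(create), sorted(delete))
```

With delete_vanished set to False, the second set stays empty and the host for "batch-job" would remain in the configuration.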
3.2. Special feature in interactions with cAdvisor
Containers usually receive a new ID when they are restarted. In Checkmk, the metrics from the host with the old ID are not automatically transferred to the host with the new ID. In most cases, such a transfer would not make any sense. For containers, however, it can be very useful: if a container is merely restarted, you probably do not want to lose its history. To achieve this, do not create the containers under their ID, but under their name (option Name - Use the container’s name in the Prometheus rule). With the Delete vanished hosts option in the dynamic host configuration, you can then still delete containers that no longer exist without having to fear that their history is lost as well. Instead, the history is continued under the identical container name, even if it is actually a different container that uses the same name.
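The difference between ID-based and name-based hosts can be sketched as follows. The sample data, the cpu field and the function collect_history are hypothetical. The sketch only models how samples are grouped by host name and says nothing about how Checkmk stores its metrics internally.

```python
from collections import defaultdict


def collect_history(samples, key):
    """Group metric samples by the host name derived from `key` ("id" or "name")."""
    history = defaultdict(list)
    for sample in samples:
        history[sample[key]].append(sample["cpu"])
    return dict(history)


# Hypothetical samples: the container "web" is restarted and receives a new ID.
samples = [
    {"id": "4f1e2d3c4b5a", "name": "web", "cpu": 12.0},
    {"id": "4f1e2d3c4b5a", "name": "web", "cpu": 15.0},
    {"id": "9a8b7c6d5e4f", "name": "web", "cpu": 11.0},  # after the restart
]

by_id = collect_history(samples, "id")      # two hosts, history is split
by_name = collect_history(samples, "name")  # one host, history continues
print(len(by_id), len(by_name))
```

Grouped by ID, the restart produces a second host that starts with an empty history. Grouped by name, all three samples end up under the same host.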