1. Foreword
There may still be changes to both this description and the new Kubernetes monitoring feature in Checkmk version 2.1.0 itself. Please see the Changes to this article on GitHub or our Werks regarding changes to the feature itself.
Since the Kubernetes integration of Checkmk is built natively in Kubernetes itself, we also refer directly to the README files in the GitHub repositories. In particular, the instructions for installing the agent are a direct source for reading up on the currently recommended procedures.
1.1. Getting started with Kubernetes monitoring
For an introduction to the new monitoring of Kubernetes, we recommend the two videos Kubernetes Monitoring with Checkmk and Detecting issues and configuring alerts for Kubernetes clusters.
1.2. Differences to the previous Kubernetes monitoring
Kubernetes monitoring in Checkmk has been rewritten from scratch. The amount of data that can be monitored has grown significantly. Since the technical basis for Kubernetes monitoring is fundamentally different in Checkmk 2.1.0, it is not possible to take over or convert existing monitoring data for your Kubernetes objects.
2. Introduction
Kubernetes has been the most widely used tool for container orchestration for quite some time. Checkmk helps you monitor your Kubernetes environments.
Starting with version 2.1.0, you can use Checkmk to monitor the following Kubernetes objects:
Cluster
Nodes
Deployments
Pods
DaemonSets
StatefulSets
For a complete listing of all available check plugins for Kubernetes monitoring, see our Catalog of Check Plug-ins.
3. Prerequisites in the cluster
To be able to monitor your Kubernetes cluster in Checkmk, first set up the prerequisites in your cluster. First and foremost, this means telling the cluster which pods/containers to deploy and how to configure them.
3.1. Setting up the Helm repository
Currently, we recommend installing Kubernetes monitoring with the tool helm, as it is also suitable for less experienced users and standardizes the management of configurations. Helm is a kind of package manager for Kubernetes. You can use it to include repositories as sources and easily add the Helm charts they contain to your cluster like packages.
To do this, first make the repository known. In the following example, we use the name tribe29 to make it easier to access the repository later. However, you can of course use any other name:
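As a minimal sketch, assuming the Helm chart is published from the tribe29/checkmk_kube_agent project and that the repository URL below is still current (please verify it against the installation instructions linked above), the registration looks like this:
user@host:~$ helm repo add tribe29 https://tribe29.github.io/checkmk_kube_agent
user@host:~$ helm repo update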
3.2. Adjustments to the configuration
With Helm, you do not have to write the necessary configuration files yourself. To set certain parameters across all of the generated configurations, you pass along a control file, the so-called values.yaml. As a starting point we recommend the template provided by us. Copy it and adapt it to your own environment.
Since we cannot know in advance how your Kubernetes cluster is set up, we have chosen the safest option for how the Checkmk collectors are started: by default, they do not expose any ports that can be reached from the outside. To be able to access the collectors later, you need to adjust these settings accordingly.
For simplicity, let’s take our template as a starting point. We support two communication paths by default: the query via Ingress and the query via NodePort. Depending on which variant you support in your cluster, the configuration will vary.
Provide communication via Ingress
If you use Ingress to control access to your services, adjust the already prepared parts in values.yaml accordingly. For a better overview, only the relevant section is shown in the following example. Set the value enabled to true and adjust the remaining values according to your environment:
ingress:
  enabled: true
  className: ""
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
  hosts:
    - host: checkmk-cluster-collector.local
      paths:
        - path: /
          pathType: Prefix
  tls: []
  #  - secretName: chart-example-tls
  #    hosts:
  #      - chart-example.local
Provide communication via NodePort
You can also provide access to the services directly through a port. This is necessary if you do not use Ingress. In the following example, too, only the relevant section is shown. Set the value type to NodePort and remove the comment in front of the value nodePort:
service:
  # if required specify "NodePort" here to expose the cluster-collector via the "nodePort" specified below
  type: NodePort
  port: 8080
  nodePort: 30035
3.3. Creating the configuration files
After customizing values.yaml or creating your own, use the following command to create all the configuration files necessary to set up your Kubernetes cluster for monitoring in Checkmk:
user@host:~$ helm upgrade --install --create-namespace -n cmk-monitoring checkmk tribe29/checkmk -f values.yaml
Since the command is not self-explanatory, we provide an explanation of each option below:
| command part | meaning |
|---|---|
| helm upgrade --install | This part is the basic command to send the configuration to the Kubernetes cluster. |
| --create-namespace | In Kubernetes you always specify to which namespace the configuration should be added. You need this option if the namespace does not exist yet. Helm will create it in this case. |
| -n cmk-monitoring | This option specifies the namespace to which the configuration should be added. cmk-monitoring is just an example of what it could be called. |
| checkmk | The name you assign to this Helm installation (the so-called release). checkmk is likewise only an example. |
| tribe29/checkmk | The first part of this option describes the repository you created with the command before. The second part, after the slash, is the package where the necessary information is located to be able to create the configuration of your Kubernetes monitoring. |
| -f values.yaml | Finally, specify the configuration file that you created or customized earlier. It contains all the customizations to be included in the configuration files created with helm. |
After you have run the command, your Kubernetes cluster is prepared to be monitored with Checkmk. The cluster itself will now ensure that the necessary pods and the containers within them are running and reachable.
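As a quick plausibility check, a generic kubectl query against the namespace chosen above rather than an official part of the procedure, you can list the pods that Helm has just created:
user@host:~$ kubectl get pods -n cmk-monitoring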
3.4. Alternative: Set up via manifest
Normally, it does not make sense for you to customize the manifests (configuration files) yourself: on the one hand, because you need detailed knowledge of the architecture of the Checkmk Kubernetes collectors for this and, on the other hand, because manual customization is much more error-prone. With helm, for example, you set up communication over TLS once, instead of adding it to all relevant places in the manifests yourself. However, if you do not use helm, or want control over all the details of the setup, you can still go this route.
To do so, first download the manifests we have pre-built from our corresponding repository at GitHub. We have split the whole configuration into several files to facilitate their maintenance or to provide more concise files for clearly defined purposes.
You need at least the following five files:
| filename | purpose |
|---|---|
| 00_namespace.yaml | Creates the namespace named checkmk-monitoring. |
| checkmk-serviceaccount.yaml | Creates the service account named checkmk and the cluster role named checkmk-metrics-reader in the namespace checkmk-monitoring. |
| cluster-collector.yaml | Here the eponymous cluster collector is created. Among other things, a service account named cluster-collector is created in the namespace checkmk-monitoring and the service accounts are assigned roles within the cluster. In addition, the deployment named cluster-collector is defined. |
| node-collector.yaml | Analogous to cluster-collector.yaml: among other things, the service accounts node-collector-machine-sections and node-collector-container-metrics are created in the namespace checkmk-monitoring, and the two DaemonSets node-collector-container-metrics and node-collector-machine-sections are defined. |
| service.yaml | Creates the service named cluster-collector in the namespace checkmk-monitoring as well as the service named cluster-collector-nodeport. The port for the NodePort is also specified here. |
If you don’t want to clone the whole repo right away - which you are free to do, of course - you can use the following command to download specifically the five files you need:
user@host:~$ URL='https://raw.githubusercontent.com/tribe29/checkmk_kube_agent/main/deploy/kubernetes/'; for i in 00_namespace checkmk-serviceaccount cluster-collector node-collector service; do wget "${URL}${i}.yaml"; done
If you also want to set up a network policy and a pod security policy, you additionally need the two files network-policy.yaml and pod-security-policy.yaml:
user@host:~$ URL='https://raw.githubusercontent.com/tribe29/checkmk_kube_agent/main/deploy/kubernetes/'; for i in network-policy pod-security-policy; do wget "${URL}${i}.yaml"; done
In the files cluster-collector.yaml and node-collector.yaml you have to fill four placeholders with concrete content. In both files you will find places where main_<YYYY.MM.DD> is written. Replace these placeholders with tags of our Kubernetes collector on Docker Hub. For example, you could use the following command to replace all occurrences of main_<YYYY.MM.DD> with the March 1, 2022 build tag of our container:
user@host:~$ sed -i 's/main_<YYYY.MM.DD>/main_2022.03.01/g' node-collector.yaml cluster-collector.yaml
For communication to the outside, a service of the type NodePort is needed. It allows access from outside the cluster and is permanently set to TCP port 30035 in the service.yaml file. If this port is already in use in your cluster, change it accordingly.
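A simple search and replace in the style of the sed command above is sufficient for this; the target port 32123 is purely an example:
user@host:~$ sed -i 's/30035/32123/g' service.yaml  # 32123 is just an example port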
Once you have made these settings, you can apply these manifest files collectively to your cluster. To do this, run the following command from the manifest location:
user@host:~$ kubectl apply -f .
namespace/checkmk-monitoring created
serviceaccount/checkmk created
clusterrole.rbac.authorization.k8s.io/checkmk-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/checkmk-metrics-reader-binding created
serviceaccount/cluster-collector created
clusterrolebinding.rbac.authorization.k8s.io/checkmk-cluster-collector created
clusterrolebinding.rbac.authorization.k8s.io/checkmk-token-review created
deployment.apps/cluster-collector created
serviceaccount/node-collector-machine-sections created
serviceaccount/node-collector-container-metrics created
clusterrole.rbac.authorization.k8s.io/node-collector-container-metrics-clusterrole created
podsecuritypolicy.policy/node-collector-container-metrics-podsecuritypolicy created
clusterrolebinding.rbac.authorization.k8s.io/node-collector-container-metrics-cluterrolebinding created
daemonset.apps/node-collector-container-metrics created
daemonset.apps/node-collector-machine-sections created
service/cluster-collector created
service/cluster-collector-nodeport created
You can also use kubectl to check whether the manifests have been applied correctly. To do this, use the following command to display all the pods in the checkmk-monitoring namespace:
user@host:~$ kubectl get pods -n checkmk-monitoring
Furthermore, you can also check all services within the namespace as follows:
user@host:~$ kubectl get svc -n checkmk-monitoring
4. Set up the monitoring in Checkmk
Next, in the GUI of Checkmk, we move on to setting up the special agent and a rule for automatically creating hosts for your Kubernetes objects. However, to set up the special agent, a few prerequisites have to be met first:
4.1. Store password (token) in Checkmk
The best way to store the password (token) of the service account is to store it in the password store of Checkmk. This is the most secure variant, because you can separate the storage and use of the password organizationally. Alternatively, enter it directly in plain text when creating the rule (see below).
If you have kept the default checkmk-monitoring as the namespace for monitoring your Kubernetes cluster, the following command line extracts the password directly from the output of kubectl get secrets:
user@host:~$ kubectl get secret $(kubectl get serviceaccount checkmk -o=jsonpath='{.secrets[*].name}' -n checkmk-monitoring) -n checkmk-monitoring -o=jsonpath='{.data.token}' | base64 --decode
eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJjaGVjay1tayIsI
mt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJjaGVjay1tay10b2tlbi16OWhicCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5
hbWUiOiJjaGVjay1tayIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjIxODE3OWEzLTFlZTctMTFlOS1iZjQzLTA4MDAyN2E1ZjE0MSIsInN1YiI6I
nN5c3RlbTpzZXJ2aWNlYWNjb3VudDpjaGVjay1tazpjaGVjay1tayJ9.gcLEH8jjUloTeaAj-U_kRAmRVIiETTk89ujViriGtllnv2iKF12p0L9ybT1fO-1Vx7XyU8jneQRO9lZw8JbhVmaPjrkEc8
kAcUdpGERUHmVFG-yj3KhOwMMUSyfg6wAeBLvj-y1-_pMJEVkVbylYCP6xoLh_rpf75JkAicZTDmhkBNOtSf9ZMjxEmL6kzNYvPwz76szLJUg_ZC636OA2Z47qREUtdNVLyutls7ZVLzuluS2rnfoP
JEVp_hN3PXTRei0F5rNeA01wmgWtDfo0xALZ-GfvEQ-O6GjNwHDlsqYmgtz5rC23cWLAf6MtETfyeEJjRqwituhqUJ9Jp7ZHgQ%
The password really is that long. If you work directly under Linux, you can append | xsel --clipboard at the end. The password is then not printed at all but copied directly to the clipboard (as if you had copied it with the mouse):
user@host:~$ kubectl get secret $(kubectl get serviceaccount checkmk -o=jsonpath='{.secrets[*].name}' -n checkmk-monitoring) -n checkmk-monitoring -o=jsonpath='{.data.token}' | base64 --decode | xsel --clipboard
Add the password to the Checkmk password store with Setup > General > Passwords > Add password, e.g. under the ID and the title kubernetes:

4.2. Import CA of the service account into Checkmk
In order for Checkmk to trust the Certificate Authority (CA) of the service account, you must store the CA certificate in Checkmk. You can read out the certificate - provided you have kept checkmk-monitoring as the namespace - with the following command:
user@host:~$ kubectl get secret $(kubectl get serviceaccount checkmk -o=jsonpath='{.secrets[*].name}' -n checkmk-monitoring) -n checkmk-monitoring -o=jsonpath='{.data.ca\.crt}' | base64 --decode
Copy everything here, including the lines BEGIN CERTIFICATE and END CERTIFICATE, and add the certificate in the Setup menu under Setup > General > Global settings > Site management > Trusted certificate authorities for SSL.
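If you prefer to work with a file instead of copying from the terminal, you can also redirect the decoded certificate; the file name ca.crt is only an example:
user@host:~$ kubectl get secret $(kubectl get serviceaccount checkmk -o=jsonpath='{.secrets[*].name}' -n checkmk-monitoring) -n checkmk-monitoring -o=jsonpath='{.data.ca\.crt}' | base64 --decode > ca.crt  # ca.crt is just an example file name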

4.3. Create Piggyback source host
Create a new host in Checkmk in the usual way and name it, for example, mykubernetesclusterhost. As the title and host name suggest, this host is used to collect the Piggyback data and also to map all services and metrics at the cluster level. Since this host only receives data via the special agent, set the IP address family option to No IP.
4.4. Set up dynamic host configuration
To ensure separation between the numerous Kubernetes objects and the rest of your monitoring environment, it is a good idea to first create a folder via Setup > Hosts > Add folder, in which the dynamic host configuration can automatically create all necessary hosts. Creating or using such a folder is optional, however.
However, it is absolutely necessary to set up a connector for the piggyback data. Via Setup > Hosts > Dynamic host management > Add connection you get to the page for the corresponding setup. First enter a title and then click show more under Connection Properties.
Next, click Add new element and under Create hosts in select the folder you created earlier.
In a Kubernetes environment, where monitorable and monitored objects naturally come and go, it is also recommended to enable the Automatically delete hosts without piggyback data option. What exactly this option does and under what circumstances hosts are then actually deleted is explained in the section Automatically deleting hosts in the article on dynamic host configuration.
Now enter the previously created Piggyback source host under Restrict source hosts and enable the Discover services during creation option.
The Connection Properties section of this new connector might look like the following afterwards:

4.5. Setting up the special agent
Now that all the prerequisites are in place in the cluster and in Checkmk, you can turn your attention to configuring the special agent. This can be found via Setup > Agents > VM, Cloud, Container > Kubernetes.
First of all, you need to assign a name for the cluster you want to monitor. You can choose this name freely. It is used to give a unique name to all objects that originate from exactly this cluster. For example, if you enter mycluster here, the names of the hosts of all pods from this cluster will later start with pod_mycluster. The next part of the host name will then always be the namespace in which the respective Kubernetes object exists.
Under Token, now select the previously created entry from the password store of Checkmk.
Under API server connection > Endpoint, Checkmk now requires the URL (or IP address) via which your Kubernetes API server can be reached. The port must also be specified here if the service is not provided via a virtual host. The easiest way to find out this address - if you do not already have it handy - depends on your Kubernetes environment. The following command gives you the endpoint of the API server as IP address and port, which you will find under server in the shortened output:
user@host:~$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://10.73.42.21:6443
  name: my-kubernetes
If the server is provided via a DNS record, the output will look more like this instead:
user@host:~$ kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://DFE7A4191DCEC150F63F9DE2ECA1B407.mi6.eu-central-1.eks.amazonaws.com
  name: xyz:aws:eks:eu-central-1:150143619628:cluster/my-kubernetes
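Alternatively - assuming your kubeconfig currently points at the cluster in question - kubectl cluster-info prints the API server address in its first line (the exact wording of that line varies between Kubernetes versions):
user@host:~$ kubectl cluster-info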
If you have stored the CA of your cluster - as described above - in Checkmk, you can select Verify the certificate under SSL certificate verification.
If your Kubernetes API server is only accessible via a proxy or special timeouts are required for the connection, you can enter them under HTTP proxy and TCP timeouts.
Next, you have the option to enrich the monitoring of your Kubernetes cluster with usage data collected by the Checkmk cluster collector. To do this, specify the protocol, URL, and port of the cluster collector under Collector NodePort/Ingress endpoint. If you set it up using our manifests, the port is 30035 by default. If you have customized the port in the service.yaml file, change the port here accordingly. You should be able to find out the URL or IP address of the NodePort from the description of the cluster-collector pod. Simply run the following command and look for the line starting with Node: in the output:
user@host:~$ kubectl describe pod $(kubectl get pods --no-headers -o custom-columns=":metadata.name") | grep -A5 Name:.*cluster-collector
Name: cluster-collector-5b7c8468cf-5t5hj
Namespace: checkmk-monitoring
Priority: 0
Node: minikube/172.16.23.2
Start Time: Wed, 03 Mar 2022 20:54:45 +0100
Labels: app=cluster-collector
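As a simple reachability test - this is our own suggestion rather than an official step - you can send a plain HTTP request to the node address and NodePort determined above; any HTTP response at all (even an error status) shows that the port is exposed:
user@host:~$ curl -i http://172.16.23.2:30035/  # address and port from the example above; adjust to your cluster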
With the options Collect information about… you can now finally select which objects within your cluster should be monitored. Our preselection covers the most relevant objects. If you decide to monitor the Pods of CronJobs as well, please refer to the inline help for this point.
Last but not least, you can choose whether you want to monitor only certain namespaces within your clusters or whether explicit namespaces should be excluded from monitoring. You specify this using the Monitor namespaces option.
Your rule might now look like the following:

Important: Under Conditions > Explicit hosts you must now re-enter the previously created host:

Next, save the rule and then perform a service discovery on this host. You will immediately see the first cluster-level services here:

Afterwards, activate all the changes you made and let the dynamic host configuration do the work from now on. It will generate all hosts for your Kubernetes objects after a short time.
5. Labels for Kubernetes objects
Checkmk automatically generates labels for Kubernetes objects such as clusters, deployments, or namespaces during service discovery.
All labels for Kubernetes objects that Checkmk generates automatically start with cmk/kubernetes/. For example, a pod always gets a label for the node (cmk/kubernetes/node:mynode), a label that simply shows that this object is a pod (cmk/kubernetes/object:pod), and a label for the namespace (cmk/kubernetes/namespace:mynamespace).
This makes it very easy to create filters and rules for all objects of the same type or in the same namespace.
6. Hardware/Software Inventory
Checkmk's Kubernetes monitoring also supports the HW/SW inventory.

7. Removing Checkmk
If you have deployed Checkmk to your cluster via our manifests, you can remove the created accounts, services and so on just as easily as they were set up. To do this, go back to the directory containing the YAML files and run the following command:
user@host:~$ kubectl delete -f .
namespace "checkmk-monitoring" deleted
serviceaccount "checkmk" deleted
clusterrole.rbac.authorization.k8s.io "checkmk-metrics-reader" deleted
clusterrolebinding.rbac.authorization.k8s.io "checkmk-metrics-reader-binding" deleted
serviceaccount "cluster-collector" deleted
clusterrolebinding.rbac.authorization.k8s.io "checkmk-cluster-collector" deleted
clusterrolebinding.rbac.authorization.k8s.io "checkmk-token-review" deleted
deployment.apps "cluster-collector" deleted
serviceaccount "node-collector-machine-sections" deleted
serviceaccount "node-collector-container-metrics" deleted
clusterrole.rbac.authorization.k8s.io "node-collector-container-metrics-clusterrole" deleted
podsecuritypolicy.policy "node-collector-container-metrics-podsecuritypolicy" deleted
clusterrolebinding.rbac.authorization.k8s.io "node-collector-container-metrics-cluterrolebinding" deleted
daemonset.apps "node-collector-container-metrics" deleted
daemonset.apps "node-collector-machine-sections" deleted
service "cluster-collector" deleted
service "cluster-collector-nodeport" deleted