Monitoring Kubernetes

1. Introduction

kubernetes logo

The great success of Docker has led to its use on an ever-larger scale. In contrast to virtual machines, such as those from VMware, the very low overhead of a container makes it ‘cheap’, and thus almost a mass product. It goes without saying that a good tool for orchestrating the containers is essential, and for the majority of users the Open Source tool Kubernetes is the tool of choice.

Important: Support for Kubernetes versions 1.18 and newer is currently limited, because the official Kubernetes Python client is not yet compatible with the newest Kubernetes API versions.

Checkmk supports the monitoring of Kubernetes. The focus is currently on the states and metrics that are especially interesting for the administrator, and a number of check plug-ins are available for this purpose.

2. Setting up the monitoring

2.1. Service account

To set up a Kubernetes cluster in Checkmk you first need a service account and a related cluster role in Kubernetes, so that Checkmk can access the API. We have prepared the file check_mk_rbac.yaml as a ready-to-use template for you, which you will find in the ‘Treasures’, in the share/doc/check_mk/treasures/kubernetes directory. The first part of this file looks something like this:

share/doc/check_mk/treasures/kubernetes/check_mk_rbac.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: check-mk
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: check-mk
  namespace: check-mk

We use check-mk here as both the name and the namespace.

Load this file onto your Kubernetes cluster with the kubectl command:

user@host:~$ kubectl apply -f check_mk_rbac.yaml
namespace/check-mk created
serviceaccount/check-mk created
clusterrole.rbac.authorization.k8s.io/check-mk created
clusterrolebinding.rbac.authorization.k8s.io/check-mk created

If you use the Google Kubernetes Engine, you may receive a response such as Error from server (Forbidden): error when creating "check_mk_rbac.yaml". In this case you must first extend your user’s permissions. This is done with the following command (replacing MYNAME with your Google user name):

user@host:~$ kubectl create clusterrolebinding MYNAME-cluster-admin-binding --clusterrole=cluster-admin --user=MYNAME@example.org
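
If you are unsure whether your account has sufficient rights in the first place, you can ask the API server directly. This check only assumes that your current kubectl context points to the cluster in question:

user@host:~$ kubectl auth can-i create clusterrolebindings
yes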

If all has gone well, you can query the new service account with the kubectl get serviceaccounts command:

user@host:~$ kubectl get serviceaccounts check-mk -n check-mk -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"check-mk","namespace":"check-mk"}}
  creationTimestamp: "2019-01-23T08:16:05Z"
  name: check-mk
  namespace: check-mk
  resourceVersion: "4004661"
  selfLink: /api/v1/namespaces/check-mk/serviceaccounts/check-mk
  uid: 218179a3-1ee7-11e9-bf43-080027a5f141
secrets:
- name: check-mk-token-z9hbp

There you will also find the name of the associated secret. This has the form check-mk-token-<ID> (here in the example check-mk-token-z9hbp). The ID for the secret is generated automatically by Kubernetes. You can then use get secrets to query the contents of the secret:

user@host:~$ kubectl get secrets check-mk-token-z9hbp -n check-mk -o yaml
apiVersion: v1
data:
  ca.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1ekNDQWMrZ0F3SUJBZ0lCQVRBTkJna3Foa2lHO...
  namespace: Y2hlY2stbWs=
  token: ZXlKaGJHY2lPaUpTVXpJMU5pSXNJbXRwWkNJNklpSjkuZXlKcGMzTWlPaUpyZFdKbGNtNWxkR1Z6TDNObG...
kind: Secret
metadata:
  annotations:
    kubernetes.io/service-account.name: check-mk
    kubernetes.io/service-account.uid: 218179a3-1ee7-11e9-bf43-080027a5f141
  creationTimestamp: "2019-01-23T08:16:06Z"
  name: check-mk-token-z9hbp
  namespace: check-mk
  resourceVersion: "4004660"
  selfLink: /api/v1/namespaces/check-mk/secrets/check-mk-token-z9hbp
  uid: 2183cee6-1ee7-11e9-bf43-080027a5f141
type: kubernetes.io/service-account-token
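
By the way, you do not have to pick the secret’s name out of the YAML by eye. Since the service account references its secret, a jsonpath query such as the following returns the name directly:

user@host:~$ kubectl get serviceaccounts check-mk -n check-mk -o jsonpath='{.secrets[0].name}'
check-mk-token-z9hbp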

The output will include the base64-encoded CA certificate (ca.crt) and the base64-encoded token (token) for the account. You can extract the certificate from the output of get secrets, e.g. with the following command, and at the same time convert it into the form you need for importing into Checkmk:

user@host:~$ kubectl get secrets check-mk-token-z9hbp -n check-mk -o jsonpath='{.data.ca\.crt}' | base64 --decode
-----BEGIN CERTIFICATE-----
MIIC5zCCAc+gAwIBAgIBATANBgkqhkiG9w0BAQsFADAVMRMwEQYDVQQDEwptaW5p
a3ViZUNBMB4XDTE4MDkxMDE2MDAwMVoXDTI4MDkwODE2MDAwMVowFTETMBEGA1UE
AxMKbWluaWt1YmVDQTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAK9Z
iG0gNZK5VU94a0E6OrUqxOQRdkv6S6vG3LnuozdgNfxsEetR9bMGu15DWaSa40JX
FbC5RxzNq/W9B2pPmkAlAguqHvayn7lNWjoF5P+31tucIxs3AOfBsLetyCJQduYD
jbe1v1/KCn/4YUzk99cW0ivPqnwVHBoMPUfVof8yA00RJugH6lMZL3kmOkD5AtRH
FTThW9riAlJATBofLfkgRnUEpfb3u1xF9vYEDwKkcV91ealZowJ/BciuxM2F8RIg
LdwF/vOh6a+4Cu8adTyQ8mAryfVPDhFBhbsg+BXRykhNzNDPruC+9wAG/50vg4kV
4wFpkPOkOCvB8ROYelkCAwEAAaNCMEAwDgYDVR0PAQH/BAQDAgKkMB0GA1UdJQQW
MBQGCCsGAQUFBwMCBggrBgEFBQcDATAPBgNVHRMBAf8EBTADAQH/MA0GCSqGSIb3
DQEBCwUAA4IBAQAeNwON8SACLl2SB8t8P4/heKdR3Hyg3hlAOSGjsyo396goAPS1
t6IeCzWZ5Z/LsF7o8y9g8A7blUvARLysmmWOre3X4wDuPvH7jrYt+PUjq+RNeeUX
5R1XAyFfuVcWstT5HpKXdh6U6HfzGpKS1JoFkySrYARhJ+MipJUKNrQLESNqdxBK
4gLCdFxutTTFYkKf6crfIkHoDfXfurMo+wyEYE4Yeh8KRSQWvaKTdab4UvMwlUbO
+8wFZRe08faBqyvavH31KfmkBLZbMMM5r4Jj0Z6a56qZDuiMzlkCl6rmKynQeFzD
KKvQHZazKf1NdcCqKOoU+eh6q6dI9uVFZybG
-----END CERTIFICATE-----
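
If you would rather work with a file, you can redirect the decoded certificate in the same step. The file name check-mk-ca.crt here is just an example:

user@host:~$ kubectl get secrets check-mk-token-z9hbp -n check-mk -o jsonpath='{.data.ca\.crt}' | base64 --decode > check-mk-ca.crt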

2.2. Importing a certificate into Checkmk

For Checkmk to accept the Kubernetes CA certificate, you must add it in the Setup menu at Setup > General > Global settings > Site Management > Trusted certificate authorities for SSL:

kubernetes ca

Without the correct import of the CA, the Checkmk service of the Kubernetes cluster will fail with bad handshake and certificate verify failed:

kubernetes ssl error

2.3. Storing the password (token) in Checkmk

The best way to store the service account’s password (token) is to use Checkmk’s password store. This is the safest option, since storing and using the passwords are kept organizationally separate. Alternatively, you can enter the password directly in plain text when creating the rule (see below).

The following command line extracts the password directly from the output of get secrets:

user@host:~$ kubectl get secrets check-mk-token-z9hbp -n check-mk -o jsonpath='{.data.token}' | base64 --decode
eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJjaGVjay1tayIsI
mt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJjaGVjay1tay10b2tlbi16OWhicCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5
hbWUiOiJjaGVjay1tayIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjIxODE3OWEzLTFlZTctMTFlOS1iZjQzLTA4MDAyN2E1ZjE0MSIsInN1YiI6I
nN5c3RlbTpzZXJ2aWNlYWNjb3VudDpjaGVjay1tazpjaGVjay1tayJ9.gcLEH8jjUloTeaAj-U_kRAmRVIiETTk89ujViriGtllnv2iKF12p0L9ybT1fO-1Vx7XyU8jneQRO9lZw8JbhVmaPjrkEc8
kAcUdpGERUHmVFG-yj3KhOwMMUSyfg6wAeBLvj-y1-_pMJEVkVbylYCP6xoLh_rpf75JkAicZTDmhkBNOtSf9ZMjxEmL6kzNYvPwz76szLJUg_ZC636OA2Z47qREUtdNVLyutls7ZVLzuluS2rnfoP
JEVp_hN3PXTRei0F5rNeA01wmgWtDfo0xALZ-GfvEQ-O6GjNwHDlsqYmgtz5rC23cWLAf6MtETfyeEJjRqwituhqUJ9Jp7ZHgQ%

The password really is that long. If you are working directly under Linux, you can also append a | xsel --clipboard to the command. The password is then not displayed, but copied directly to the clipboard (as if you had copied it with the mouse):

user@host:~$ kubectl get secrets check-mk-token-z9hbp -n check-mk -o jsonpath='{.data.token}' | base64 --decode | xsel --clipboard

Add the password to the Checkmk password store with Setup > General > Passwords > Add password e.g. under the ID kubernetes:

kubernetes password

2.4. Adding a Kubernetes cluster to the monitoring

Monitoring with Checkmk works on two levels here. The Kubernetes cluster itself is monitored as a host, while for the individual Kubernetes nodes we use the piggyback principle: each node is monitored as a separate host in Checkmk. The monitoring data for these hosts is not retrieved from Kubernetes separately, but is instead derived from the data of the Kubernetes cluster.

Because Kubernetes cannot be queried with the normal Checkmk agent, you need the special agent for Kubernetes, also known as a datasource program. In this case Checkmk does not contact the target host over TCP port 6556 as usual; instead it invokes a utility program that talks to the target system via the Kubernetes API.

The procedure is as follows:

  1. Create a host in Checkmk for the Kubernetes master (Kubernetes control plane).

  2. Create a rule that assigns the special agent for Kubernetes to this Kubernetes host.

The rule set can be found at Setup > Agents > VM, Cloud, Container > Kubernetes:

kubernetes rule

Use API server endpoint to specify the destination for calls to the Kubernetes API. A Custom URL has, for example, the form https://mykuber01.comp.lan:8443. However, you can also access the API via HTTP (insecure). If you select Hostname or IP address, Checkmk will use the name or IP address of the Kubernetes host you created earlier, with HTTPS as the protocol. You can then change the default port 443 and specify a Custom path prefix, i.e. a path that is appended to the URL. A path prefix is important for Rancher, for example, because several Kubernetes clusters can be included there, and the API of an individual cluster can then be reached at, e.g., /k8s/cluster/mycluster.

At Token you either enter the password in plain text (Explicit), or select it from the password store if you stored it there earlier.

At Retrieve information about, you can select Kubernetes objects such as pods, services, and deployments to monitor. These are each mapped as a host. We recommend that you let dynamic host configuration manage these hosts automatically.

The functions of the other options are best explained by the inline help.
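
Tip: once the rule has been saved, you can check from the command line of your Checkmk site whether the special agent actually delivers data. In this sketch, mysite and mykubernetesmaster stand for your site and the Kubernetes host created above:

OMD[mysite]:~$ cmk -d mykubernetesmaster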

If you now call up the service configuration of the Kubernetes host in the Setup menu, some of the services should already be found:

kubernetes monitoring cluster services

2.5. Monitoring the nodes

So that the nodes are also monitored, you must create them as hosts in the Setup as well. You can have the Dynamic Configuration Daemon (DCD) do this automatically, or you can simply create the hosts by hand.

It is important that the host names in Checkmk exactly match the names of the Kubernetes nodes. You can easily get these names from the Kubernetes host’s Nodes service.
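
You can also list these names with kubectl; the NAME column contains exactly the names that Checkmk expects:

user@host:~$ kubectl get nodes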

kubernetes monitoring node services

By the way, with the Setup > Agents > Access to Agents > Hostname translation for piggybacked hosts rule set you can define very flexible rules for translating the host names contained in the piggyback data. This means that you can use host names in Checkmk that do not match the names of the nodes.

Unless you have a Checkmk agent installed on the nodes themselves, you will need to set Checkmk agent / API integrations to No API integrations, no Checkmk agent.

2.6. Labels for Kubernetes objects

Checkmk creates labels for nodes, pods, deployments and services automatically during the service discovery. The labels are defined in the same way as in Docker and have the form cmk/kubernetes_object:OBJECT.
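
For example, the host of a node then carries the label cmk/kubernetes_object:node, and that of a pod the label cmk/kubernetes_object:pod.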

3. Hardware/software inventory

The Kubernetes integration in Checkmk also supports the HW/SW inventory.

kubernetes monitoring hw sw inventory

4. Removing Checkmk

If you want to remove Checkmk’s service account and cluster role from Kubernetes, this can be performed with the following command:

user@host:~$ kubectl delete -f check_mk_rbac.yaml
namespace "check-mk" deleted
serviceaccount "check-mk" deleted
clusterrole.rbac.authorization.k8s.io "check-mk" deleted
clusterrolebinding.rbac.authorization.k8s.io "check-mk" deleted

5. Kubernetes in OpenShift installations

5.1. Creating a project

logo openshift

OpenShift is a product line of container application platforms for cloud computing, developed by Red Hat and based, among other things, on Kubernetes.

Checkmk can also monitor an OpenShift-based Kubernetes. The procedure is very similar to that described above, but differs in some details when setting up the cluster for monitoring.

You can create your own project for monitoring in OpenShift. This can be performed from the command line with:

root@linux# oc new-project check-mk
Now using project "check-mk" on server "https://192.168.42.62:8443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app centos/ruby-25-centos7~https://github.com/sclorg/ruby-ex.git

to build a new example application in Ruby.

5.2. Next steps

The remaining steps for including the cluster in the monitoring are as described at the beginning of this article. However, on the command line you always use the OpenShift tool oc instead of the kubectl described above (e.g. when querying the service account and token). You can output the IP address and port of the cluster with the following command:

root@linux# oc status

To get the token for the user, use the following command — here with the user check-mk we use in this article:

root@linux# oc serviceaccounts get-token check-mk
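
The kubectl calls shown earlier translate accordingly. For example, querying the service account created in the check-mk project looks like this:

root@linux# oc get serviceaccounts check-mk -n check-mk -o yaml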

6. Kubernetes in Rancher installations

6.1. Creating a service account

With Rancher, setting up the monitoring in Checkmk is basically identical to the variant described above that uses Kubernetes directly. Here, too, you need a service account so that Checkmk can access the cluster. You create this account directly in the Rancher web interface, where you will subsequently also find its token and certificate, which you in turn import into Checkmk.

In Rancher, first navigate to Global > Security > Roles > Cluster to create a new role, checkmk:

rancher roles

For convenience, clone the Cluster Owner role:

rancher roles clone

Under Grant Resources revoke the Create, Delete, Patch and Update rights from the cloned role:

rancher roles clone rights

Now create a new checkmk Rancher user under Global > Users > Add User. In Global Permissions, select the User-Base option to grant the user only the minimum necessary read permissions:

rancher adduser

6.2. Assigning the cluster role

Next, switch to your cluster and click on Edit in the cluster menu at the top right. Here you can use Add Member to add the newly-created user checkmk with the corresponding role checkmk to the cluster:

rancher addmember

6.3. Next steps

Then log in to Rancher as the new user, go to the cluster and click on Kubeconfig File. Here you will find the three details that you need for the monitoring in Checkmk:

  • clusters > cluster > server: URL/path information for the Checkmk rule

  • clusters > cluster > certificate-authority-data: A base64-encoded certificate

  • users > user > token: The access password in the form of a bearer token

rancher kubeconfig

You still have to decode the certificate, for example on the command line with base64 --decode, or with one of the many online services. From here on, the setup in Checkmk corresponds to the procedure for pure Kubernetes, starting from the Importing a certificate into Checkmk section.
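
A minimal sketch of this decoding step, in which the shortened string stands in for the real certificate-authority-data value and the file name rancher-ca.crt is freely chosen:

user@host:~$ echo 'LS0tLS1CRUdJTi...' | base64 --decode > rancher-ca.crt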

7. Monitoring Kubernetes via the Event Console

7.1. Adding a Rancher cluster

If you manage your Kubernetes clusters with Rancher, you can use the Event Console to monitor events in Rancher. In the Rancher web interface you can easily activate the connection for an entire cluster or for individual projects.

Navigate either to your cluster or to a project under Project/Namespaces, and call up Tools > Logging there. The configuration is identical in both cases; only the page header (Cluster Logging or Project Logging) shows where you are. Select Syslog as the destination, and first enter the Endpoint, i.e. your Checkmk server’s IP address including port 514, for example 192.168.178.101:514.

Leave the protocol as UDP. Under Program, enter the desired name for the log as it should appear in the Event Console. Finally, define the log level under Log Severity. For testing, Notice is recommended here, so that entries reliably and immediately arrive in the system.

rancher syslog

A corresponding rule must be active in Checkmk’s Event Console so that the data also arrives in the monitoring. In the rule definition, in the Matching Criteria box, you can, for example, enter the log name just assigned in Rancher under Program for Match syslog application (tag), in order to filter on it for testing:

kubernetes ec rancher rule
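
Before real Rancher events arrive, you can test the whole chain with a manually sent syslog message, for example with the logger utility. The tag must match the name chosen under Program, here Rancher2:

user@host:~$ logger --udp -n 192.168.178.101 -P 514 -t Rancher2 'Hello Event Console'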

In the Checkmk monitoring you can now see the events of your cluster or project in the Events views, which you can reach via Monitor > Event Console and also via the Overview snapin:

rancher syslog events

The log name specified in the Rancher configuration under Program appears in the Application column.

7.2. Including other clusters

If your clusters were not set up with an administration tool such as Rancher, you can have them report to the Event Console using Fluentd. Fluentd is a universal Open Source logging solution that can collect data for Elasticsearch, for example, but also for the syslog format. You can easily run Fluentd as a container using a Kubernetes DaemonSet.

First, clone the Fluentd repository:

user@host:~$ git clone https://github.com/fluent/fluentd-kubernetes-daemonset

This contains various configuration files in YAML format, and the associated Docker files. To connect to Checkmk, you only need to set the value for SYSLOG_HOST in line 70 of the DaemonSet configuration fluentd-kubernetes-daemonset/fluentd-daemonset-syslog.yaml.

As SYSLOG_HOST, enter the host name or IP address of the syslog endpoint, i.e. of the Checkmk server, for example 192.168.178.101. Leave SYSLOG_PORT at 514 and SYSLOG_PROTOCOL at udp. The following excerpt shows the relevant lines of the file:

fluentd-kubernetes-daemonset/fluentd-daemonset-syslog.yaml
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-syslog
        env:
          - name: SYSLOG_HOST
            value: "192.168.178.101"
          - name: SYSLOG_PORT
            value: "514"
          - name: SYSLOG_PROTOCOL
            value: "udp"

Then apply the DaemonSet with the kubectl command:

user@host:~$ kubectl apply -f fluentd-kubernetes-daemonset/fluentd-daemonset-syslog.yaml

Depending on the cluster, it will take a little time until the Fluentd container is running on each node.
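
You can follow the progress with kubectl, assuming you have not changed the DaemonSet’s name (fluentd) and namespace (kube-system) given in the template:

user@host:~$ kubectl rollout status daemonset fluentd -n kube-system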

You will then need another Event Console rule to bring the data into the monitoring. It is constructed just like the rule for Rancher shown above, except that you now enter fluentd instead of Rancher2 for Match syslog application (tag) in the Matching Criteria box, in order to receive all events from the Fluentd instances for testing.

You will then find the result, just as described above, in the Checkmk monitoring in the Events views under Monitor > Event Console and in the Overview, this time with the new application name:

kubernetes ec fluentd events