Monitoring Linux - The new agent for Linux in detail

1. The new Linux agent

You can monitor Linux systems particularly well with Checkmk. This is not so much because the Checkmk development team feels 'at home' with Linux, but rather because Linux is a very open system and provides numerous well documented and easy to query interfaces for detailed monitoring.

Since most of the interfaces are not accessible over the network per se, the need to install a monitoring agent is unavoidable. That is why Checkmk has its own agent for monitoring Linux. This agent consists of a simple shell script that is minimalistic, transparent and secure.

In Checkmk version 2.1.0 there is now a newly enhanced Linux agent. More precisely, a new component has been added to the agent script check_mk_agent: the Agent Controller. The Agent Controller precedes the agent script, queries it and communicates with the Checkmk server on the script’s behalf. To do this, the Agent Controller registers with the Agent Receiver, which is also a new process running on the Checkmk server.

So, on the one hand, the new Linux agent takes over the agent script, and thus inherits that script’s advantages. On the other hand, the agent supplements the script in such a way that new functions can be added, such as TLS encryption of the communication, data compression — and also the reversal of the communication direction.

The registered, encrypted and compressed pull mode with the Agent Controller is available from version 2.1.0 for all Checkmk editions — as long as both the Checkmk server and the agent versions are 2.1.0 or higher.

The Agent Controller is started as a background process (daemon) by the init system systemd, so the agent requires a Linux distribution with systemd. Your host most likely meets this requirement, because most Linux distributions have used systemd as their init system since 2015.

However, the agent also offers a so-called legacy mode for Linux systems with a computer architecture other than x86_64, without RPM or DEB package management, or without the systemd init system. In this legacy mode, the new agent works like the old one, i.e. without the Agent Controller and thus without a registration at the Checkmk server.

The article you are currently reading covers the installation, configuration, and extensions of the Linux agent with the Agent Controller. It also shows you how to identify whether the agent needs to be set up on your Linux system in legacy mode without an Agent Controller. You will find all the information on that setup in the article covering monitoring Linux in legacy mode.

2. Architecture of the agent

The Checkmk agent consists of the agent script and the Agent Controller, which communicates with the Agent Receiver on the Checkmk server. See the general article on monitoring agents for details on the common architecture of the Linux agent and the Windows agent. This chapter is about the Linux-specific implementation.

The agent script check_mk_agent is responsible for collecting the monitoring data and calls existing system commands in sequence to do so. Because some of this information is only available with root privileges, check_mk_agent must be executed as root.

The agent script is minimalistic, secure, easily extensible, and transparent because it is a shell script where you can see what commands it calls.

The Agent Controller cmk-agent-ctl is the component within the agent that is responsible for transporting the data collected by the agent script. The controller is executed under the cmk-agent user, which has limited privileges, e.g. no login shell, and is used only for data transfer. The cmk-agent user is created during the installation of the agent package. The Agent Controller is started as a daemon of systemd and is coupled to it as a service. The controller listens on TCP port 6556 for incoming connections of the Checkmk site and queries the agent script via a Unix socket (of a systemd unit).

3. Installation

Checkmk provides several ways of installing the Linux agent — from a manual installation of the software package to the fully automated deployment including its update function. Some of these installation methods are only available in the Enterprise Editions:

Method	Description	CRE	CEE
Supplied RPM/DEB package	Simple installation of a standard agent with manual configuration via configuration files. The installation routine checks and configures `systemd` and `xinetd` in all editions — in this order.	X	X
RPM/DEB package from Agent Bakery	Configuration via GUI, individual configuration per host possible.		X
Automatic update	The package from the agent bakery is installed for the first time by hand or by script and will from then on be automatically updated.		X

3.1. Downloading RPM/DEB packages

You install the Linux agent by installing the RPM or the DEB package. Whether you need RPM or DEB depends on the Linux distribution on which you want to install the package:

Package format	File extension	Install on
RPM	`.rpm`	Red Hat Enterprise Linux (RHEL) based systems, SLES, Fedora, openSUSE, derivatives thereof
DEB	`.deb`	Debian, Ubuntu, all other DEB based distributions

Before installation, you need to get the package and copy it (for example with scp or WinSCP) to the host where you want the agent to run.

Getting a package via Checkmk GUI

In the Checkmk Raw Edition (CRE), you can find the agent’s Linux packages via Setup > Agents > Linux. In the Checkmk Enterprise Editions (CEE), you first get to the Agent Bakery in the Setup menu via Agents > Windows, Linux, Solaris, AIX, where you find the baked packages. From there, the Related > Linux, Solaris, AIX files menu item takes you to the list of agent files:

Figure: Download page with the RPM and DEB packages.

Everything you need is in the first box, named Packaged Agents: the ready-made RPM and DEB packages for installing the Linux agent with its default settings.

Getting a package via HTTP

Downloading the package to one machine and then copying it to the target machine using scp or WinSCP can be cumbersome. You can also download the package from the Checkmk server directly to the target system via HTTP. For this purpose, the agent file downloads are intentionally available without a login. After all, the files do not contain any secrets: anyone can download and install Checkmk themselves and thus access the files.

The easiest way to do this is with wget. You can get the URL from the browser. If you already know the name of the package, you can easily compose the URL yourself. Put /mysite/check_mk/agents/ in front of the filename, in the following example for the RPM package:

root@linux# wget http://mycmkserver/mysite/check_mk/agents/check-mk-agent-2.1.0b1-1.noarch.rpm

Tip: RPM can fetch the package from a URL itself. This lets you download and install with a single command:

root@linux# rpm -U http://mycmkserver/mysite/check_mk/agents/check-mk-agent-2.1.0b1-1.noarch.rpm
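The URL composition described above can be sketched as a small shell helper. This is only an illustration: the function name agent_package_url is hypothetical, and the server name, site name and file name in the example call are the placeholder values used in this article.

```shell
# Compose the download URL of an agent package.
# Arguments: Checkmk server name, site name, package file name.
agent_package_url() {
    printf 'http://%s/%s/check_mk/agents/%s\n' "$1" "$2" "$3"
}

# Example call with the placeholder values from this article:
# wget "$(agent_package_url mycmkserver mysite check-mk-agent-2.1.0b1-1.noarch.rpm)"
```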

Getting a package via the REST API

Checkmk’s REST API provides the following methods for downloading agent packages from the Checkmk server:

Downloading the provided agent.
Downloading a baked agent by host name and operating system.
Downloading a baked agent by hash of the agent and operating system.

Via the REST API you also have the option of fetching the package from the Checkmk server directly to the target machine. For example, the supplied DEB package of the Linux agent can be fetched with the following curl command:

root@linux# curl -OJG "http://mycmkserver/mysite/check_mk/api/1.0/domain-types/agent/actions/download/invoke" \
--header 'Accept: application/octet-stream' \
--header 'Authorization: Bearer automation myautomationsecret' \
--data-urlencode 'os_type=linux_deb'

This is just a simple example to demonstrate how this particular REST API endpoint works to download the agent. For details on this and other REST API endpoints, see the API documentation available in Checkmk via Help > Developer resources > REST API documentation.

3.2. Package installation

After you have fetched the RPM or the DEB package and — if necessary — copied it to the host to be monitored using scp, WinSCP or other means, the installation is accomplished with a single command.

The RPM package is installed under root with the command rpm -U:

root@linux# rpm -U check-mk-agent-2.1.0b1-1.noarch.rpm

By the way, the -U option stands for 'upgrade', but it also performs an initial installation correctly. This means that you can use this command to update an existing agent to the current version, and use the same command for future updates of the agent package.

The installation of the DEB package — and an update — is done under root with the command dpkg -i:

root@linux# dpkg -i check-mk-agent_2.1.0b1-1_all.deb
(Reading database ... 739920 files and directories currently installed.)
Preparing to unpack .../check-mk-agent_2.1.0b5-1_all.deb ...
Unpacking check-mk-agent (2.1.0b5-1) ...
Setting up check-mk-agent (2.1.0b5-1) ...

Deploying systemd units: check-mk-agent.socket check-mk-agent-async.service cmk-agent-ctl-daemon.service check-mk-agent@.service
Deployed systemd
Creating/updating cmk-agent user account ...

WARNING: The Agent Controller is operating in an insecure mode! To secure the connection run cmk-agent-ctl register.

Reloading xinetd
Activating systemd unit 'check-mk-agent.socket'...
Created symlink /etc/systemd/system/sockets.target.wants/check-mk-agent.socket → /lib/systemd/system/check-mk-agent.socket.
Activating systemd unit 'check-mk-agent-async.service'...
Created symlink /etc/systemd/system/multi-user.target.wants/check-mk-agent-async.service → /lib/systemd/system/check-mk-agent-async.service.
Activating systemd unit 'cmk-agent-ctl-daemon.service'...
Created symlink /etc/systemd/system/multi-user.target.wants/cmk-agent-ctl-daemon.service → /lib/systemd/system/cmk-agent-ctl-daemon.service.

Here the package was installed for the first time on a previously agentless host. The cmk-agent user has been created and systemd has been configured. We address the interim warning about insecure mode, i.e. legacy pull mode, in a moment.
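The two install commands in this section differ only in the package format. If you script the installation for a mixed set of hosts, a helper along the following lines could pick the matching command from the file extension. This is a minimal sketch: the function name agent_install_command is hypothetical, and it only prints the command instead of running it.

```shell
# Print the install command that matches the extension of a package file.
# Returns 1 if the extension is neither .rpm nor .deb.
agent_install_command() {
    case "$1" in
        *.rpm) printf 'rpm -U %s\n' "$1" ;;
        *.deb) printf 'dpkg -i %s\n' "$1" ;;
        *)     echo "unsupported package type: $1" >&2; return 1 ;;
    esac
}

# Example call:
# agent_install_command check-mk-agent-2.1.0b1-1.noarch.rpm
```

Printing rather than executing keeps the sketch safe to try out. Run the printed command as root once you have checked it.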

3.3. Installation using the Agent Bakery

The Checkmk Enterprise Editions (CEE) have a software module, the Agent Bakery, for automatically packaging customized agents. A detailed description of this can be found in the general article on the agents. The baked packages are installed in the same way as described above for the supplied packages.

3.4. Automatic updates

If you use the Agent Bakery, you can also set up automatic updates of the agent. These updates are described in their own article.

3.5. What follows after the installation?

If the Agent Controller could be configured with systemd during installation, the next step is the registration, which sets up TLS encryption so that the encrypted agent output can be decrypted by the Checkmk server and then displayed in the monitoring.

A special case applies when the agent is installed with the Agent Controller for the first time. In that case, the agent switches to the unencrypted legacy pull mode so that the Checkmk server is not cut off from the monitoring data and can continue to display it. This applies both to a new installation and to an update from an agent of version 2.0.0 or older.

You will receive a notice of the activated legacy pull mode in the command output during the installation of the agent. It will look something like this in the monitoring:

Figure: WARN state of the 'Check_MK' service in the monitoring because TLS is not yet active.

The Checkmk site recognizes from the agent output that the Agent Controller is present and thus TLS encryption is possible — but has not yet been enabled. The Check_MK Agent service changes to the WARN state and remains so until you register it. After registration, only the encrypted pull mode will be used for communication. The legacy pull mode is switched off and will remain so. However, it can be switched on again by command if necessary.

The situation is different if the Agent Controller could not be registered as a daemon with systemd during the installation. In such a case, host registration is not possible and the only available communication path remains the legacy mode. In the next chapter, you can determine whether you can proceed with registration by testing the Agent Controller and system environment.

Note: In the Checkmk Agent installation auditing rule set you will find various settings to check the state of the agent and make it visible in monitoring. Among other things, you can specify here which state the Check_MK Agent service should have if TLS configuration has not yet been performed.

4. Registration

4.1. Overview and prerequisites

Immediately after the installation of the new agent (including an update from an agent of version 2.0.0 or older), only unencrypted communication in legacy pull mode is possible. Exclusively encrypted data transmission becomes active only after a trust relationship has been established.

Therefore, perform the registration promptly after installation/update. This chapter shows how to do this.

Registration, and thus the establishment of the mutual trust relationship, is performed by a Checkmk user with access to the REST API. A good choice for this purpose is the automation user, which is automatically created with every Checkmk installation and whose password you can randomly generate.

Requirements for the host

Registering with the Agent Controller requires a Linux system with the init system systemd in version 219 or later and an x86_64 computer architecture. See the Testing the Agent Controller and system environment section to learn how to verify these prerequisites.

Requirements for the server

To register a host for monitoring, this host must be able to reach the REST API of the Checkmk server (port 443 or 80) and the Agent Receiver (port 8000 for the first site, 8001 for the second, and so on). Read the section Network environment for registration if your infrastructure cannot fulfill one of these requirements.

4.2. Adding a host to the Setup

First create the new host via Setup > Hosts > Add host. A host must of course exist in the configuration environment before it can be registered.

4.3. Testing the Agent Controller and system environment

The agent with the Agent Controller requires a Linux distribution with systemd, more precisely systemd in a version 219 or newer.

There is a good chance that your host meets this requirement, because most Linux distributions have used systemd as their init system since 2015, replacing other init systems such as SysVinit. Examples are SUSE Linux Enterprise Server from version 12, openSUSE from version 12.1, Red Hat Enterprise Linux from version 7, Fedora from version 15, Debian from version 8 and Ubuntu from version 15.04. Unfortunately, checking the version number of the distribution alone does not bring certainty, since systemd may be missing even on a current Linux system if that system has 'only' been updated over the years.

Therefore, check on the host on which the agent is to be installed whether systemd is running and in which version:

root@linux# systemctl --version
systemd 245 (245.4-4ubuntu3.15)

The above command output shows that systemd is installed in the correct version. If systemd is not running, or is running in a version that is too old, the Agent Controller cannot be used. Complete the setup as described in the article Monitoring Linux in legacy mode.
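The version comparison can also be automated. The following sketch reads the output of systemctl --version on standard input and succeeds only if the first line reports systemd in version 219 or newer. The function name systemd_version_ok is hypothetical, and the sketch assumes the output format shown above, in which the version number is the second word of the first line.

```shell
# Read the output of `systemctl --version` on stdin and succeed
# if the reported systemd version is at least 219.
systemd_version_ok() {
    awk 'NR == 1 && $1 == "systemd" { ok = ($2 + 0 >= 219) }
         END { exit (ok ? 0 : 1) }'
}

# Example call:
# systemctl --version | systemd_version_ok && echo "systemd is recent enough"
```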

Now check whether the Agent Controller can be started:

root@linux# cmk-agent-ctl --version

The version number should be shown in the output, for example:

cmk-agent-ctl 0.1.0

In rare cases, the following error message may appear:

bash: /usr/bin/cmk-agent-ctl: cannot execute binary file: Exec format error

The reason for this is that your Linux uses a different computer architecture than x86_64, for example the older 32-bit x86 or ARM. In this case, the Agent Controller cannot be used. Complete the setup as described in the article Monitoring Linux in legacy mode.
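You can also check the architecture before calling the Agent Controller at all. A minimal sketch, assuming that the standard command uname -m reports the machine architecture on your system; the function name agent_controller_arch_ok is hypothetical:

```shell
# Succeed if the given machine architecture is x86_64.
# Without an argument, the architecture of this host (uname -m) is used.
agent_controller_arch_ok() {
    [ "${1:-$(uname -m)}" = "x86_64" ]
}

# Example call:
# agent_controller_arch_ok || echo "Agent Controller not usable, set up legacy mode"
```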

The next step is to find out which program is waiting for requests on port 6556:

root@linux# ss -tulpn | grep 6556
tcp	LISTEN	0	1024	0.0.0.0:6556	0.0.0.0:*	users:(("cmk-agent-ctl",pid=1861810,fd=9))

Here it is cmk-agent-ctl, so the requirements for encrypted communication are fulfilled. If, however, systemd, xinetd or inetd appears within the parentheses, the prerequisites for using the Agent Controller are not met. In that case, also complete the setup as described in the article Monitoring Linux in legacy mode.
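To evaluate this check in a script, you can extract the process name from the ss output. The following sketch reads the output of ss -tulpn on standard input and prints the name of each process listening on the given port. The function name port_listener is hypothetical, and the parsing assumes the output format shown above.

```shell
# Read `ss -tulpn` output on stdin and print the name of each process
# that listens on the given port (default: 6556).
port_listener() (
    port="${1:-6556}"
    grep -E ":${port}[[:space:]]" | sed -n 's/.*users:(("\([^"]*\)".*/\1/p'
)

# Example call (as root, so that ss can show the process names):
# ss -tulpn | port_listener 6556
```

If the function prints cmk-agent-ctl, the Agent Controller owns the port. Any other name, or no output at all, points to legacy mode.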

4.4. Registering a host with the server

Registration is performed with the Agent Controller cmk-agent-ctl, which provides a command interface for configuring the connections. You can use the cmk-agent-ctl help command to display help on the options.

Next, go to the host that is to be registered. Here, with root privileges, make a request to the Checkmk site:

root@linux# cmk-agent-ctl register --hostname mynewhost \
    --server cmkserver --site mysite \
    --user automation --password 'test23'

The host name following the --hostname option must be exactly the same as the one used when the host was created in the Setup. The --server and --site options specify the name of the Checkmk server and the site. The server name may also be an IP address. The site name (here mysite) corresponds to the one you see in the URL path of the web interface. The final options are the name and password of the automation user. If you omit the --password option, the password will be requested interactively.

If the specified values were correct, you will be asked to confirm the identity of the Checkmk site to which you want to connect. For clarity here, we have abbreviated the server certificate to be confirmed:

Attempting to register at cmkserver:8000/mysite. Server certificate details:

PEM-encoded certificate:
---BEGIN CERTIFICATE---
MIIC6zCCAdOgAwIBAgIUXbSE8FXQfmFqoRNhG9NpHhlRJ40wDQYJKoZIhvcNAQEL
[...]
nS+9hN5ILfRI+wkdrQLC0vkHVYY8hGIEq+xTpG/Pxw==
---END CERTIFICATE---

Issued by:
	Site 'mysite' local CA
Issued to:
	localhost
Validity:
	From Thu, 10 Feb 2022 15:13:22 +0000
	To   Tue, 13 Jun 3020 15:13:22 +0000

Do you want to establish this connection? [Y/n]
> Y

Confirm with Y to complete the process.

If no error message is displayed, the encrypted connection has been established. All data will now be transmitted in compressed form via this connection.

4.5. Verifying the trust relationship

The cmk-agent-ctl status command should now show precisely one trust relationship with the Checkmk server:

root@linux# cmk-agent-ctl status
Connection: 12.34.56.78:8000/mysite
	UUID: d38e7e53-9f0b-4f11-bbcf-d19617971595
	Local:
		Connection type: pull-agent
		Certificate issuer: Site 'mysite' local CA
		Certificate validity: Mon, 21 Feb 2022 11:23:57 +0000 - Sat, 24 Jun 3020 11:23:57 +0000
	Remote:
		Connection type: pull-agent
		Registration state: operational
		Host name: mynewhost

Note: There can only ever be one trust relationship between host and site. For example, if you register the already registered host mynewhost under a different name (mynewhost2) but with the same IP address, then the new connection will replace the existing one. The connection from mynewhost to the site will be disconnected and no more agent data will be supplied to the monitoring from that host.

4.6. Registration by proxy

For easier registration of multiple hosts, any host on which the agent is installed can perform a registration on behalf of other hosts. The registration process exports a JSON file, which can then be transferred to the target host and imported there. As before, the host to be registered must already be set up on the site.

First, on any host in the Setup, the registration is performed by proxy. Here, of course, the Checkmk server comes in handy, as it is usually the first host to be set up. As with the example above, you can pass the password by option or be asked for it interactively if you omit the --password option. We redirect the JSON output to a file in the example:

root@linux# cmk-agent-ctl proxy-register \
    --hostname mynewhost3  \
    --server cmkserver --site mysite \
    --user automation > /tmp/mynewhost3.json

Next we transfer the /tmp/mynewhost3.json file to the host we registered for and import that file:

root@linux# cmk-agent-ctl import /tmp/mynewhost3.json

This process is also possible in a single step using a pipeline where the output of cmk-agent-ctl proxy-register is handed over as input to ssh hostname cmk-agent-ctl import:

root@linux# cmk-agent-ctl proxy-register --hostname mynewhost3 \
    --server cmkserver --site mysite \
    --user automation --password 'test23' | \
    ssh root@mynewhost3 cmk-agent-ctl import
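For several hosts, the pipeline above can be generated in a loop. The following sketch only prints one pipeline per host name so that you can review the commands before running them. The function name proxy_register_commands and the variables CMK_SERVER, CMK_SITE and CMK_USER are hypothetical; their defaults are the example values from this article. The --password option is left out on purpose, so the password is not stored in the generated lines.

```shell
# Print one proxy registration pipeline per host name given as argument.
# Nothing is executed. CMK_SERVER, CMK_SITE and CMK_USER are hypothetical
# variables that override the example values used in this article.
proxy_register_commands() (
    for host in "$@"; do
        printf 'cmk-agent-ctl proxy-register --hostname %s --server %s --site %s --user %s | ssh root@%s cmk-agent-ctl import\n' \
            "$host" "${CMK_SERVER:-cmkserver}" "${CMK_SITE:-mysite}" \
            "${CMK_USER:-automation}" "$host"
    done
)

# Example call:
# proxy_register_commands mynewhost3 mynewhost4
```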

4.7. Adding the host to the monitoring

Once the registration is complete, perform a connection test and a service discovery in the Checkmk server Setup. Then, as a final step, add the discovered services to the monitoring by activating the changes.

If the connection test fails, refer to the following chapter for information on testing and error diagnosis.

4.8. Deregistering a host

You can also deregister a host.

On a host connected to the Checkmk server, you can revoke the trust. In the following command, specify the Universally Unique Identifier (UUID) that the cmk-agent-ctl status command outputs:

root@linux# cmk-agent-ctl delete d38e7e53-9f0b-4f11-bbcf-d19617971595
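Instead of copying the UUID by hand, you can extract it from the status output. A minimal sketch, assuming the status format shown in the section on verifying the trust relationship; the function name connection_uuids is hypothetical:

```shell
# Read `cmk-agent-ctl status` output on stdin and print the UUID of
# every connection, one per line.
connection_uuids() {
    awk '$1 == "UUID:" { print $2 }'
}

# Example call:
# cmk-agent-ctl status | connection_uuids
```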

To delete all connections from the host and additionally restore legacy pull mode, enter the following command:

root@linux# cmk-agent-ctl delete-all --enable-insecure-connections

After that, the agent behaves as it did after the initial installation and before the first registration and sends its data unencrypted.

Complete the deregistration on the Checkmk server: In the Setup, on the Properties of host page, select the Host > Remove TLS registration menu item and confirm the prompt.

In case you prefer the command line: On the Checkmk server, for each connection of a host that is in monitoring, there is a soft link with the UUID that points to the folder with the agent output:

OMD[mysite]:~$ cd ~/var/agent-receiver/received-outputs
OMD[mysite]:~$ ls -l d38e7e53-9f0b-4f11-bbcf-d19617971595
lrwxrwxrwx 1 mysite mysite 67 Feb 23 07:18 d38e7e53-9f0b-4f11-bbcf-d19617971595 -> /omd/sites/mysite/tmp/check_mk/data_source_cache/push-agent/mynewhost

If you delete this soft link, you will need to re-register the host.
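Before deleting a soft link, it can help to confirm which host it belongs to. The following sketch resolves the link and prints the last path component of its target, which is the host name in the listing above. The function name registered_host_for_uuid is hypothetical, and the default directory is the path shown above, relative to the home directory of the site user.

```shell
# Print the host name that a registration UUID points to.
# Arguments: UUID, directory with the soft links (default: the
# received-outputs directory of the site shown above).
registered_host_for_uuid() (
    dir="${2:-$HOME/var/agent-receiver/received-outputs}"
    target=$(readlink "$dir/$1") || exit 1
    basename "$target"
)

# Example call as site user:
# registered_host_for_uuid d38e7e53-9f0b-4f11-bbcf-d19617971595
```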

5. Testing and troubleshooting

A modular system can fail to work as intended at many points.

Since version 2.1.0 introduced two new agent components, the Agent Controller on the host and the Agent Receiver on the Checkmk server, the number of points where something can go wrong has increased.

When troubleshooting, a structured approach is thus recommended. You can of course also use the step-by-step analysis described here to get to know the data collection and communication provided by Checkmk in more detail.

All of the diagnostic options that are available from the Checkmk server side are described in the general article on monitoring agents. But, of course, there are other diagnostics available when logged in directly to the monitored host itself.

We’ll work our way from the agent script, through the Agent Controller and TCP port 6556, to the Checkmk site in the following sections. In most cases, after correcting an error, you can restart the service discovery and complete the inclusion in monitoring.

5.1. Output from the agent script

The agent script is a simple shell script that obtains data on your system and outputs it as loosely formatted text. You can call this script directly from the command line. Since the output can be long, it is handy to pipe it into the pager less, which lets you scroll through the output. You can exit less with the Q key:

root@linux# check_mk_agent | less
<<<check_mk>>>
Version: 2.1.0b1
AgentOS: linux
Hostname: mynewhost
AgentDirectory: /etc/check_mk
DataDirectory: /var/lib/check_mk_agent
SpoolDirectory: /var/lib/check_mk_agent/spool
PluginsDirectory: /usr/lib/check_mk_agent/plugins
LocalDirectory: /usr/lib/check_mk_agent/local
AgentController: cmk-agent-ctl 0.1.0

This is how you determine whether all of the desired data is included in the output — for example, whether all of the installed plug-ins are providing their expected output.
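To get a quick overview of what the output contains, you can list only the section headers, i.e. the lines enclosed in <<< and >>>. The following sketch prints each section name once, without options such as :sep(0). The function name agent_sections is hypothetical.

```shell
# Read agent output on stdin and print each section name once,
# without the <<< >>> markers and without options such as :sep(0).
agent_sections() {
    sed -n 's/^<<<\([^:<>][^:>]*\).*>>>$/\1/p' | sort -u
}

# Example call:
# check_mk_agent | agent_sections
```

If a plug-in is installed but its section is missing from this list, continue with the debug mode described in the next section.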

By the way, you do not have to be root to call the agent. However, the output will then lack some information that requires root privileges to obtain (e.g. multipath information and the outputs of ethtool).

5.2. The agent script in debug mode

To prevent any error output from inactive plug-ins or commands from 'contaminating' the required data, the agent script generally suppresses the standard error channel (STDERR). If you are looking for a specific problem, you can re-enable STDERR by calling the agent script in a special debug mode with the -d option. This mode also prints all shell commands that the script executes.

To be able to work with less here, you have to combine standard output (STDOUT) and error channel with 2>&1:

root@linux# check_mk_agent -d 2>&1 | less

5.3. Network environment for registration

If registering a host fails even before a certificate is presented, knowing the communication paths can help you identify the problem and, of course, solve it.

After you enter the cmk-agent-ctl register command, the Agent Controller first asks the Checkmk server for the Agent Receiver port using the REST API. As a second step, a connection to the Agent Receiver is established to request the certificate. You can simulate the first request on the host with a program like curl:

root@linux# curl -v --insecure https://mycmkserver/mysite/check_mk/api/1.0/domain-types/internal/actions/discover-receiver/invoke

The --insecure parameter instructs curl to skip the certificate check, which matches the behavior of the Agent Controller in this step. The response is only a few bytes long and contains the port number of the Agent Receiver. For the first site this is usually 8000, for the second 8001, and so on. Common problems with this request are:

The Checkmk server is unreachable from the host.
The port used by the REST API differs from the default ports 443 (https) or 80 (http).

If the curl command fails, you may need to change routing or firewall settings to enable access, or add the internal CA to the trust chain of the host.

In case the host you are trying to register uses an HTTP proxy, curl will use it, but cmk-agent-ctl won’t do so with default settings. Use the additional --detect-proxy option to instruct cmk-agent-ctl to use a proxy configured via system settings.

Often, however, it is easier to look up the port of the Agent Receiver and note it down. To do so, run the following command on the Checkmk server, logged in as the site user:

OMD[mysite]:~$ omd config show | grep AGENT_RECEIVER
AGENT_RECEIVER: on
AGENT_RECEIVER_PORT: 8000
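If you need the port in a script, you can extract it from this output. A minimal sketch, assuming the key: value format shown above; the function name agent_receiver_port is hypothetical:

```shell
# Read `omd config show` output on stdin and print the Agent Receiver port.
agent_receiver_port() {
    awk -F': ' '$1 == "AGENT_RECEIVER_PORT" { print $2 }'
}

# Example call as site user:
# omd config show | agent_receiver_port
```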

Now you can specify the port when entering the command for registration. This skips the first request to the REST API. Communication then takes place directly with the Agent Receiver without any detours:

root@linux# cmk-agent-ctl register --hostname mynewhost \
    --server mycmkserver:8000 --site mysite \
    --user automation --password 'test23'

Port 8000 must also be reachable from the host. If it is not, you will get this error message:

ERROR [cmk_agent_ctl] Connection refused (os error 111)

As with port 443 (or 80) mentioned above, you can now adjust routing or firewall settings so that the host to be registered can reach the Checkmk server on the Agent Receiver’s port (8000, 8001, and so on).

Should security policies prohibit access to the Agent Receiver, you can still use registration by proxy on the Checkmk server.

5.4. The Agent Controller in dump mode

The Agent Controller provides its own dump subcommand that displays the full agent output as it arrives in the monitoring:

root@linux# cmk-agent-ctl dump | less
<<<check_mk>>>
Version: 2.1.0b1
AgentOS: linux
Hostname: mynewhost

This allows you to verify that the data from the agent script has arrived at the Agent Controller. However, this output does not yet prove that the agent is also reachable over the network.

In some cases, the output will look like this:

ERROR [cmk_agent_ctl] Error collecting monitoring data.

Caused by:
    No such file or directory (os error 2)

This would be the case when the Agent Controller daemon is not running in the background — immediately following an update, for example. Restart this background process:

root@linux# systemctl restart cmk-agent-ctl-daemon

If cmk-agent-ctl dump fails again, check if and which program is listening on port 6556:

root@linux# ss -tulpn | grep 6556
tcp	LISTEN	0	1024	0.0.0.0:6556 0.0.0.0:*	users:(("cmk-agent-ctl",pid=1861810,fd=9))

If the output is empty or a command other than cmk-agent-ctl appears within the parentheses, the system requirements for using the Agent Controller have not been met. In this case, complete the setup as described in the article Monitoring Linux in legacy mode.

5.5. Remote connection test

If it has been verified that the agent script and its installed plug-ins are executing as expected locally, you can next check via netcat (or nc) whether port 6556 is reachable via the external IP address of the host:

root@linux# echo | nc 10.76.23.189 6556
16

The output 16 indicates that the connection was successfully established and that the TLS handshake can now take place. Since everything else here is TLS encrypted, no more detailed check is possible.

If a remote connection test fails, it is usually due to the firewall setting. In this case, configure iptables or nftables to allow access to TCP port 6556 from the Checkmk server.

If the communication between agent and Checkmk server is still unencrypted (as in legacy pull mode), or is and will remain unencrypted (as in legacy mode), this command will give you the full unencrypted agent output instead of the 16.
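The possible outcomes of this test can be told apart by the first line of the reply. The following sketch classifies the reply along the lines described above: 16 means the port answers and expects TLS, a first line of <<<check_mk>>> means unencrypted agent output, and an empty reply means nothing was received. The function name classify_agent_reply and the four result words are hypothetical.

```shell
# Read the reply of `echo | nc HOST 6556` on stdin and print
# "tls", "plaintext", "empty" or "unknown".
classify_agent_reply() (
    first_line=$(head -n 1)
    case "$first_line" in
        16)                echo tls ;;
        '<<<check_mk>>>'*) echo plaintext ;;
        '')                echo empty ;;
        *)                 echo unknown ;;
    esac
)

# Example call with the example address from this section:
# echo | nc 10.76.23.189 6556 | classify_agent_reply
```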

Note: For more diagnostics to run on the Checkmk server, see the general article on monitoring agents.

If the output remains unencrypted even after trying to register, use grep to determine the status from the output:

OMD[mysite]:~$ echo | nc 10.76.23.189 6556 | grep -A1 cmk_agent_ctl_status
<<<cmk_agent_ctl_status:sep(0)>>>
{"version":"0.1.0","ip_allowlist":[],"allow_legacy_pull":false, ... }

If the allow_legacy_pull variable is set to false, the Agent Controller itself does not allow plain text output, but another service, for example xinetd, is responsible for TCP port 6556. This is a condition occasionally encountered after updating a system that does not meet the requirements for using the Agent Controller. In this case, first perform a deregistration and then complete the setup as described in the article Monitoring Linux in legacy mode.
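The value can also be read out directly instead of by eye. The following sketch is a plain text match, not a JSON parser, and assumes that the key appears in the form shown in the output above. The function name allow_legacy_pull is hypothetical.

```shell
# Read agent output on stdin and print the value of allow_legacy_pull
# ("true" or "false") from the first line that contains it.
allow_legacy_pull() {
    sed -n 's/.*"allow_legacy_pull":[[:space:]]*\([a-z]*\).*/\1/p' | head -n 1
}

# Example call with the example address from this section:
# echo | nc 10.76.23.189 6556 | allow_legacy_pull
```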

6. Security

6.1. Preliminary considerations

Security is an important criterion for any software, and monitoring is no exception. Since the monitoring agent is installed on every monitored server, a security problem here would have particularly serious consequences.

This is why security was emphasized in the design of Checkmk and has been an absolute principle since the earliest days of Checkmk: The agent does not read data from the network. Period. This means that it is impossible for an attacker to inject any commands or script components via the monitoring port 6556.

6.2. Transport Layer Security (TLS)

For an attacker, however, even a process list can be a first approach for drawing conclusions about worthwhile targets. Therefore, transport encryption between agent and Checkmk server with Transport Layer Security (TLS) is mandatory from Checkmk version 2.1.0. Here, the Checkmk server 'pings' the monitored host, which then establishes the TLS connection to the Checkmk server and transmits the agent output over it. Since only Checkmk servers with which a trust relationship exists can initiate this data transfer, there is no risk of data falling into the wrong hands.

6.3. Restricting access via IP addresses

Since only authorized Checkmk servers can retrieve data and unauthorized servers fail after a few bytes of handshake, the risk of a Denial of Service (DoS) attack is very low. For this reason, no further access restriction is currently planned. Of course you can block port 6556 against unauthorized access via iptables. Any rule that may exist and which has been transferred to clients via the Agent Bakery to restrict access to certain IP addresses is ignored by the Agent Controller.

6.4. Disabling built-in encryption

Especially after updating the agent, the built-in (symmetric) encryption performed by the agent script itself may still be active. If TLS encryption and built-in encryption are active at the same time, the entropy of the transmitted data is so high that the compression, which is active from version 2.1.0 onwards, cannot reduce the amount of data transmitted. In addition, the CPUs of both the host and the Checkmk server are burdened with extra encryption and decryption steps.

For this reason, you should disable the built-in encryption promptly after switching to TLS. In the CRE Checkmk Raw Edition you can do this by renaming the /etc/check_mk/encryption.cfg configuration file.
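As a sketch of the renaming step, the helper below moves the file out of the way if it exists. The function name and the .disabled suffix are arbitrary choices for this illustration. The directory argument exists only so that the helper can be tried out on a copy first.

```shell
# Disable the agent script's built-in encryption by renaming its config file.
# $1: configuration directory (default /etc/check_mk).
# The ".disabled" suffix and the function name are assumptions of this sketch.
disable_builtin_encryption() {
    conf="${1:-/etc/check_mk}/encryption.cfg"
    if [ -f "$conf" ]; then
        mv "$conf" "$conf.disabled"
    else
        echo "no $conf found, nothing to do" >&2
    fi
}

# Example (as root): disable_builtin_encryption
```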

In the CEE Checkmk Enterprise Editions you can change existing rules to Use TLS encryption in Setup > Agents > Access to agents > Encryption (Linux, Windows) in the Encryption (Linux, Windows) section and then re-bake the agent packages. After the next automatic agent update, encryption by the agent script is disabled, while encryption remains guaranteed by the Agent Controller. Note that after the automatic agent update, only registered hosts can provide monitoring data.

7. Disabling sections

The output from the Checkmk agent is divided into sections. Each of these sections contains related information and is usually simply the output of a diagnostic command. Sections always start with a section header. This is a line enclosed in <<< and >>>.
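To see which sections a host currently delivers, you can filter the agent output for section headers. The following is a minimal sketch. The function name list_agent_sections is hypothetical, and options after a colon in the header, such as :sep(0), are stripped.

```shell
# List the section names found in agent output read from stdin.
# A header such as <<<cmk_agent_ctl_status:sep(0)>>> is reduced to its name.
list_agent_sections() {
    sed -n 's/^<<<\([^:>]*\).*>>>$/\1/p' | LC_ALL=C sort -u
}

# Example (as root): check_mk_agent | list_agent_sections
```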

Except for Checkmk’s own sections, you can individually disable any of the 30+ sections that the agent generates by default. Specifically, this means that the corresponding commands will just not be executed by the agent, possibly saving computation time. Other reasons for disabling could be that you are simply not interested in certain information from a certain group of hosts, or that a certain host is providing erroneous values and you want to temporarily suspend retrieval of that data.

As a user of one of the CEE Checkmk Enterprise Editions you can simply create a rule via Setup > Agents > Windows, Linux, Solaris, AIX > Agent rules > Disabled sections (Linux agent). This rule will then be taken into account by the Agent Bakery.

In the Enterprise Editions you can disable sections by rule

You will generally find a separate checkbox for each section that can be disabled. For each selected checkbox you will then find a separate entry in the Agent Bakery configuration file /etc/check_mk/exclude_sections.cfg, once the newly baked agent has been installed on the selected hosts. For example, if you were to select Running processes and Systemd services, the appropriate configuration file would look like the following:

/etc/check_mk/exclude_sections.cfg

MK_SKIP_PS=yes
MK_SKIP_SYSTEMD=yes

Users of the CRE Checkmk Raw Edition can manually create the above /etc/check_mk/exclude_sections.cfg file and enter in it the sections that should be disabled. All sections that can be disabled are listed in the ~/share/check_mk/agents/cfg_examples/exclude_sections.cfg file.
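Creating the file by hand could look like the following sketch, which writes the two entries from the example above. The function name write_exclude_sections is hypothetical. Note that the helper overwrites an existing file, so merge by hand if you already maintain one.

```shell
# Write an exclude_sections.cfg that disables the process and systemd sections.
# $1: configuration directory (default /etc/check_mk).
# Caution: an existing file in that directory is overwritten.
write_exclude_sections() {
    cat > "${1:-/etc/check_mk}/exclude_sections.cfg" <<'EOF'
MK_SKIP_PS=yes
MK_SKIP_SYSTEMD=yes
EOF
}

# Example (as root): write_exclude_sections
```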

8. Extending the agent with plug-ins

8.1. What are agent plug-ins?

The /usr/bin/check_mk_agent agent script contains a whole set of sections which provide monitoring data for various check plug-ins which are then automatically found by the service detection. This includes all of the important monitoring for the operating system.

In addition, there is the possibility of extending the agent with agent plug-ins. These are small scripts or programs that are called by the agent and extend it with additional sections containing additional monitoring data. The Checkmk project delivers a whole series of such plug-ins, which — if they are correctly installed and configured — automatically deliver new services through a service detection.

Why aren’t these plug-ins simply hard-coded into the agent? For each of the plug-ins there will be one of the following reasons:

The plug-in is written in a programming language other than shell and therefore cannot be implemented inline (example: mk_logwatch).
The plug-in in any case needs a configuration, without which it would not work (example: mk_oracle).
The plug-in is so specialized that very few users would need it (example: plesk_domains).

8.2. Manual installation

The plug-ins supplied for Linux and Unix can all be found on the Checkmk server under share/check_mk/agents/plugins. They are also available from the agents download page in the Setup menu (as described in the installation chapter) in the Plugins box:

The beginning of the long list of available agent plug-ins

For all of the agent plug-ins we provide, there are matching check plug-ins that can evaluate the agent’s data and create services. These are already installed, so that newly found services can be detected and configured immediately.

Note: Before you install a plug-in on a host, take a look at its corresponding file. Often you will find important information there about the correct use of the plug-in.

The actual installation is then simple: Copy the file to /usr/lib/check_mk_agent/plugins. Make sure that it is executable. If not, use a chmod 755, otherwise the agent will not execute the plug-in. Note that especially if you do not transfer the files via scp but fetch them via HTTP from the download page, the execution permission will be lost.
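Copying and setting the permissions can be combined in one step with the install command from coreutils. This is a sketch, and the function name install_agent_plugin is hypothetical. The optional second argument only exists so that the helper can be tested against a scratch directory.

```shell
# Copy a plug-in into the agent's plug-in directory with mode 755.
# $1: plug-in file, $2: plug-in directory (default as described above).
install_agent_plugin() {
    install -m 755 "$1" "${2:-/usr/lib/check_mk_agent/plugins}/"
}

# Example (as root): install_agent_plugin ./mk_foobar
```

Because install sets the mode explicitly, it does not matter whether the execution permission was lost during the download.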

Once the plug-in is executable and located in the correct directory, it will be automatically invoked by the agent and a new section will be created in the agent output. This section usually has the same name as the plug-in. Complex plug-ins, such as mk_oracle for example, even create a whole series of new sections.

8.3. Configuration

Some plug-ins will require a configuration file in /etc/check_mk/ in order to work. For others, a configuration is optional and enables special features or customizations. Still others will work simply as they are. There are several sources of information on a plug-in:

The documentation for the associated check plug-ins in your Checkmk site, which you can access via Setup > Services > Catalog of check plugins.
Comments in the plug-in itself (often very helpful!).
A suitable article in this manual, on monitoring Oracle for example.

8.4. Asynchronous execution

Just as with MRPE, you can also run plug-ins asynchronously. This is very useful if the plug-ins have a long runtime and the obtained status data does not need to be regenerated every minute anyway.

Asynchronous execution is not configured via a file. Instead, you create a subdirectory under /usr/lib/check_mk_agent/plugins whose name is a number of seconds. Plug-ins in this directory are executed asynchronously, and the number of seconds also specifies the minimum waiting time before the plug-in is executed again. If the agent is queried again before this time expires, it uses cached data from the last time the plug-in was executed. This allows you to configure a longer interval for the plug-in than the typical one minute.

The following example shows how to change the my_foo_plugin plug-in from synchronous execution to asynchronous execution with an interval of 5 minutes (or 300 seconds):

root@linux# cd /usr/lib/check_mk_agent/plugins
root@linux# mkdir 300
root@linux# mv my_foo_plugin 300

Note: Some plug-ins automatically implement asynchronous execution. This includes mk_oracle. Install such plug-ins directly in /usr/lib/check_mk_agent/plugins.

8.5. Installation using the Agent Bakery

In the CEE Checkmk Enterprise Editions, the included plug-ins can be configured via the Agent Bakery. This takes care of both the installation of the actual plug-in and the correct creation of its configuration file, should one be needed.

Each plug-in is configured via an agent rule. You can find the appropriate rule sets in Setup > Agents > Windows, Linux, Solaris, AIX > Agent rules > Agent Plugins:

List of rules for the Enterprise Editions agent plug-ins

8.6. Manual execution

Since agent plug-ins are executable programs, you can also run them manually for testing and diagnostic purposes. However, there are plug-ins that need certain environment variables set by the agent to find their configuration file, for example. Set these variables manually before execution:

root@linux# export MK_LIBDIR=/usr/lib/check_mk_agent
root@linux# export MK_CONFDIR=/etc/check_mk
root@linux# export MK_VARDIR=/var/lib/check_mk_agent
root@linux# /usr/lib/check_mk_agent/plugins/mk_foobar
<<<foobar>>>
FOO BAR BLA BLUBB 17 47 11

Some plug-ins also use special call options for debugging. Simply take a look at the plug-in file.

9. Integration of classic (Nagios) check plug-ins

9.1. Executing plug-ins via MRPE

There are two good reasons to continue using Nagios plug-ins under Checkmk. If you have migrated your monitoring from a Nagios based solution to Checkmk, you can continue to use older check plug-ins for which there is no Checkmk equivalent yet. In many cases these are self-written plug-ins in Perl or shell.

The second reason for using Nagios plug-ins is true end-to-end monitoring. Let’s assume you have your Checkmk server, a web server and a database server distributed over a large data center. In such a case, the database server response times measured from the Checkmk server are not very meaningful. It is far more important to know these values for the connection between the web server and the database server.

The Checkmk agent provides a simple mechanism to meet these two requirements: MK’s Remote Plugin Executor or MRPE for short. The name is deliberately an analogy to the NRPE of Nagios, which performs the same task there.

The MRPE is built into the agent and is configured with a simple text file, which you create as /etc/check_mk/mrpe.cfg. There you specify one plug-in call per line — along with the name you want Checkmk to use for the service it automatically creates for it. Here is an example:

/etc/check_mk/mrpe.cfg

Foo_Application /usr/local/bin/check_foo -w 60 -c 80
Bar_Extender /usr/local/bin/check_bar -s -X -w 4:5

Note: The Nagios plug-ins must not be placed in the directory /usr/lib/check_mk_agent/plugins. This directory is reserved for the agent plug-ins. Apart from this directory, you are free to choose any location as long as the agent can find and run the plug-ins there.

If you now run the agent locally, you will find a new section called <<<mrpe>>> for each plug-in, which contains the name, exit code and output from the plug-in. You can check this with the following handy grep command:

root@linux# check_mk_agent | grep -A1 '^...mrpe'
<<<mrpe>>>
(check_foo) Foo_Application 0 OK - Foo server up and running
<<<mrpe>>>
(check_bar) Bar_Extender 1 WARN - Bar extender overload 6.012|bar_load=6.012

The 0 and 1 in the output stand for the exit codes of the plug-ins and follow the conventional scheme: 0 = OK, 1 = WARN, 2 = CRIT and 3 = UNKNOWN.
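The mapping from exit code to state can also be applied directly to the agent output. The following awk-based filter is a sketch. The function name mrpe_summary is hypothetical, and it assumes the line layout shown above (plug-in name in parentheses, service name, exit code, plug-in output).

```shell
# Print "service name: state" for every MRPE result in agent output on stdin.
mrpe_summary() {
    awk '
        /^<<<mrpe>>>$/ { in_mrpe = 1; next }   # start of an MRPE section
        /^<<</         { in_mrpe = 0 }         # any other section header ends it
        in_mrpe && NF >= 3 {
            split("OK WARN CRIT UNKNOWN", names, " ")
            state = ($3 ~ /^[0-3]$/) ? names[$3 + 1] : "?"
            print $2 ": " state
        }'
}

# Example (as root): check_mk_agent | mrpe_summary
```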

The rest will now be done automatically by Checkmk. Once you invoke a service discovery for the host, the two new services will show up as available. It will look like this:

One service is detected for each of the two MRPE plug-ins

By the way, due to the syntax of the file, the name cannot contain spaces. However, you can replace a space with %20 using the same syntax as in URLs (ASCII code 32 for space is hexadecimal 20):

/etc/check_mk/mrpe.cfg

Foo%20Application /usr/local/bin/check_foo -w 60 -c 80
Bar%20Extender /usr/local/bin/check_bar -s -X -w 4:5
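If you generate mrpe.cfg entries from a list of service names, the replacement can be automated. This is a minimal sketch that only handles spaces, as described above. The function name mrpe_service_name is hypothetical.

```shell
# Convert a service name into the form expected in mrpe.cfg by replacing
# every space with %20. Other characters are left untouched.
mrpe_service_name() {
    printf '%s\n' "$1" | sed 's/ /%20/g'
}

# Example: mrpe_service_name "Foo Application"
```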

9.2. Asynchronous execution

Note that all plug-ins you list in mrpe.cfg will be executed synchronously in order. The plug-ins should therefore not have too long an execution time. If one plug-in hangs, the execution of all others will be delayed. This can lead to the complete querying of the agent by Checkmk running into a timeout and preventing the host from being reliably monitored.

If you really need longer-running plug-ins, switch them to asynchronous execution to avoid such problems. To do this, specify a time in seconds for which a calculated result should remain valid, e.g. 300 for five minutes, by adding the expression (interval=300) after the service name in mrpe.cfg:

/etc/check_mk/mrpe.cfg

Foo_Application (interval=300) /usr/local/bin/check_foo -w 60 -c 80
Bar_Extender /usr/local/bin/check_bar -s -X -w 4:5

This has several consequences:

The plug-in will run in a background process and will no longer slow down the execution of the agent.
Because the agent does not wait for execution, the result is not delivered until the next call of the agent.
At the earliest after 300 seconds the plug-in will be executed again. Until then, the old result is reused.

This allows you to run tests that need more computing time, and to run them at longer intervals, without having to configure anything on the Checkmk server.

9.3. MRPE with the Agent Bakery

Users of the Enterprise Editions can also configure MRPE with the Agent Bakery. The rule set Setup > Agents > Windows, Linux, Solaris, AIX > Agent rules > Generic Options > Execute MRPE checks is responsible for this. There you can configure the same things as described above. The file mrpe.cfg will then be generated automatically by the Agent Bakery.

MRPEs can be conveniently configured using a rule in the Enterprise Editions

Baking the plug-ins

You can also have the check plug-ins included in the package being delivered. With this, the agent is then complete and does not need any manual installation of additional files. The whole thing works like this:

Create the directory local/share/check_mk/agents/custom on the Checkmk server.
Create a subdirectory there — e.g. my_mrpe_plugins.
In that subdirectory, create a further subdirectory named bin.
Copy your plug-ins into the bin folder.
Create a rule in Setup > Agents > Windows, Linux, Solaris, AIX > Agent rules > Generic Options > Deploy custom files with agent.
Select my_mrpe_plugins, save and bake!

The check plug-ins will now be installed into the default bin directory of your agent. By default this is /usr/bin. So when configuring the MRPE checks, use /usr/bin/check_foo instead of /usr/local/bin/check_foo.
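The file system part of the steps above (creating the directories and copying the plug-ins) can be sketched as a small helper. The function name stage_custom_plugins is hypothetical, and passing "$HOME" as the site directory assumes you are logged in as the site user. Creating the rule and baking still happen in the Setup.

```shell
# Stage check plug-ins for the "Deploy custom files with agent" rule.
# $1: site directory, $2: package name (e.g. my_mrpe_plugins),
# remaining arguments: plug-in files to copy into the bin subdirectory.
stage_custom_plugins() {
    site_dir=$1
    package=$2
    shift 2
    target="$site_dir/local/share/check_mk/agents/custom/$package/bin"
    # Create the directory tree, then copy the files as executables.
    mkdir -p "$target" && install -m 755 "$@" "$target/"
}

# Example (as site user): stage_custom_plugins "$HOME" my_mrpe_plugins /tmp/check_foo
```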

10. Hardware monitoring

Monitoring a Linux server as completely as possible of course also includes monitoring its hardware. The monitoring is done partly using the Checkmk agent directly, and partly via special plug-ins. In addition, there are still cases where you can implement monitoring via SNMP or even via a separate management board.

10.1. Monitoring SMART values

Modern hard drives almost always have S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology). This system continuously records data on the state of the HDD or SSD and Checkmk can retrieve these values with the smart plug-in and evaluate the most important of them. For the plug-in to work after installation, the following requirements must be met:

The smartmontools package must be installed. You can install this on all modern distributions via the respective package manager.
If the hard disks are connected to a RAID controller and this allows access to the SMART values, the respective tool must also be installed. Supported are tw_cli (3ware) and megacli (LSI).

If these requirements are met and the plug-in is installed, the data is automatically read and appended to the agent’s output. In Checkmk you can then also activate the new services directly:

SMART services found by service detection.

If cmd_timeout occasionally occurs, as seen in the screenshot, switch the plug-in to asynchronous execution with an interval of a few minutes.

10.2. Monitoring by means of IPMI

IPMI (Intelligent Platform Management Interface) is an interface for hardware management which also enables monitoring of the hardware. Checkmk uses freeipmi for this purpose to access the hardware directly and without a network. freeipmi is installed from the package sources and is then ready for immediate use, so that the data will be transmitted the very next time Checkmk is called.

If freeipmi is not available or there are other reasons not to install it, ipmitool can also be used. ipmitool is often already present on the system and only needs to be supplied with an IPMI device driver, such as that provided by the openipmi package. Again, you do not need to do anything else after an installation, and the data will be collected automatically by Checkmk.

For error diagnosis you can also run the tools manually in a host shell. Once you have installed the freeipmi package, you can check its functions with this:

root@linux# ipmi-sensors Temperature
32 Temperature_Ambient 20.00_C_(1.00/42.00) [OK]
96 Temperature_Systemboard 23.00_C_(1.00/65.00) [OK]
160 Temperature_CPU_1 31.00_C_(1.00/90.00) [OK]
224 Temperature_CPU_2 NA(1.00/78.00) [Unknown]
288 Temperature_DIMM-1A 54.00_C_(NA/115.00) [OK]
352 Temperature_DIMM-1B 56.00_C_(NA/115.00) [OK]
416 Temperature_DIMM-2A NA(NA/115.00) [Unknown]
480 Temperature_DIMM-2B NA(NA/115.00) [Unknown]

If ipmitool has been installed, you can check the output of its data with the following command:

root@linux# ipmitool sensor list
UID_Light 0.000 unspecified ok na na 0.000 na na na
Int._Health_LED 0.000 unspecified ok na na 0.000 na na na
Ext._Health_LED 0.000 unspecified ok na na 0.000 na na na
Power_Supply_1 0.000 unspecified nc na na 0.000 na na na
Fan_Block_1 34.888 unspecified nc na na 75.264 na na na
Fan_Block_2 29.792 unspecified nc na na 75.264 na na na
Temp_1 39.000 degrees_C ok na na -64.000 na na na
Temp_2 16.000 degrees_C ok na na -64.000 na na na
Power_Meter 180.000 Watts cr na na 384.00

10.3. Manufacturer-specific tools

Many large server manufacturers also provide their own tools for collecting the hardware information and making it available via SNMP. The following prerequisites must be met in order to retrieve this data and provide it to Checkmk:

An SNMP server is set up on the Linux host.
The manufacturer’s tool is installed (e.g. Dell’s OpenManage or Supermicro’s SuperDoctor).
The host is configured in Checkmk for monitoring via SNMP in addition to the Checkmk agent. See the article on monitoring with SNMP to learn how to do this.

The new services for hardware monitoring supported by this are then automatically detected and no further plug-ins are required.

10.4. Additional monitoring via the management board

A management board can be configured for each host and additional data can be retrieved via SNMP. The services detected in this way are then also assigned to the host.

Setting up the management board is very simple. Simply enter the protocol, the IP address and the access data for SNMP in the host’s properties and save these new settings:

The management board is configured for SNMP in the properties of the host in the Setup

With a service discovery, the newly discovered services will then be enabled as usual.

11. Uninstallation

As with an installation, uninstalling the agent is also done using the operating system’s package manager. Specify the name of the installed package here, not the filename of the original RPM/DEB file.

This is how you find out which DEB package is installed:

root@linux# dpkg -l | grep check-mk-agent
ii  check-mk-agent          2.1.0b1-1          all          Checkmk Agent for Linux

The uninstallation of the DEB package is then done using dpkg --purge:

root@linux# dpkg --purge check-mk-agent
(Reading database ... 739951 files and directories currently installed.)
Removing check-mk-agent (2.1.0b5-1) ...
Removing systemd units: check-mk-agent.socket, check-mk-agent-async.service, cmk-agent-ctl-daemon.service, check-mk-agent@.service
Deactivating systemd unit 'check-mk-agent.socket'...
Deactivating systemd unit 'check-mk-agent-async.service'...
Deactivating systemd unit 'cmk-agent-ctl-daemon.service'...
Reloading xinetd
Purging configuration files for check-mk-agent (2.1.0b5-1) ...

How to find out which RPM package is installed:

root@linux# rpm -qa | grep check-mk

Uninstallation of the RPM package is done as root with the rpm -e command.

12. Files and directories

12.1. File paths on the monitored host

File Path	Description
`/usr/bin/`	Installation directory for the agent script `check_mk_agent` and the Agent Controller `cmk-agent-ctl` on the target system.
`/usr/lib/check_mk_agent`	Base directory for extensions to the agent.
`/usr/lib/check_mk_agent/plugins`	Directory for plug-ins which should be automatically executed by the agent and extend its output with additional monitoring data. Plug-ins can be written in any available programming language.
`/usr/lib/check_mk_agent/local`	Directory for custom local checks.
`/var/lib/check_mk_agent`	Base directory for agent data.
`/var/lib/check_mk_agent/cache`	Cache data of individual sections is stored here and appended to the agent output on each execution as long as the cache data is valid.
`/var/lib/check_mk_agent/job`	Directory for monitored jobs. These will be appended to the agent output on each execution.
`/var/lib/check_mk_agent/spool`	Contains data created e.g. by cronjobs which have their own section. These are also appended to the agent output. You can read more about this in the article The spool directory.
`/var/lib/cmk-agent/registered_connections.json`	Contains a list of connections registered with the Agent Controller.
`/var/agent-receiver/received-outputs`	Contains for each connection its UUID as a soft link pointing to the folder containing the agent output.
`/etc/check_mk`	Storage of configuration files for the agent.
`/etc/check_mk/mrpe.cfg`	Configuration file for MRPE — for running classic Nagios compatible check plug-ins.
`/etc/check_mk/encryption.cfg`	Configuration for built-in encryption of agent data.
`/etc/check_mk/exclude_sections.cfg`	Configuration file for disabling certain sections of the agent.

12.2. File paths on the Checkmk server

File Path	Description
`local/share/check_mk/agents/custom`	Base directory for custom files to be delivered with a baked agent.
`share/check_mk/agents/cfg_examples/exclude_sections.cfg`	Example configuration file for disabling sections.
