1. Hosts, services and agents
So, Checkmk is ready. But before we start with the actual monitoring, we will briefly explain some important terms. First of all, there is the host. In Checkmk, a host is usually a server, a virtual machine (VM), a network device, an appliance — or generally something with an IP address that is monitored by Checkmk. However, there are also hosts without an IP address, for instance Docker containers. Each host always has one of the states UP, DOWN or UNREACH.
On each host a number of services are monitored. A service can be anything — for example, a file system, a process, a hardware sensor, a switch port — but it can also simply be a certain metric such as CPU utilisation or RAM consumption. Each service can have one of the states OK, WARN, CRIT or UNKNOWN.
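As background, the four service states are commonly represented by the numeric codes 0 to 3, the convention used by Nagios-compatible check plug-ins. The text above does not spell these numbers out, so the following is only a small sketch of that mapping; the function name service_state_name is made up for this illustration.

```shell
# Translate a numeric service state into its name.
# The numbers follow the Nagios-compatible plug-in convention (assumption:
# not stated in the text above). The function name is hypothetical.
service_state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARN ;;
        2) echo CRIT ;;
        3) echo UNKNOWN ;;
        *) echo "invalid state: $1" >&2; return 1 ;;
    esac
}
```

For example, service_state_name 2 prints CRIT, and any value outside 0 to 3 is rejected with a non-zero return code.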
In order for Checkmk to be able to request data from a host, an agent is usually necessary. This is a small programme that is installed on the host and which provides data on the state (or 'health') of the host on request. Servers running Windows, Linux or Unix can only be effectively monitored by Checkmk if you install a Checkmk agent there, i.e. an agent provided by us. In the case of network devices and many appliances, the manufacturer will usually have built in an agent that Checkmk can easily query using the standardised SNMP protocol. Cloud services such as Amazon Web Services (AWS) or Azure instead provide an interface ('API') that can be queried by Checkmk via HTTP.
2. Preliminary considerations for DNS
Even though Checkmk does not require name resolution of hosts, a well-maintained Domain Name System (DNS) makes configuration much easier and avoids errors, since Checkmk will then be able to resolve the host names on its own without you needing to enter IP addresses in Checkmk.
So setting up a monitoring system is a good opportunity to check whether your DNS is up to date and, if necessary, to add any missing entries.
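One possible way to carry out such a check is to run a list of planned host names through the system resolver and print the names that do not resolve. The following is a minimal sketch, assuming a Linux system on which getent is available; the function name find_unresolvable and the file name hosts.txt are invented for this example.

```shell
# Print every host name read from stdin that cannot be resolved.
# By default "getent hosts" is used as the resolver; any other command can
# be passed as arguments instead. The function name is hypothetical.
find_unresolvable() (
    if [ "$#" -eq 0 ]; then
        set -- getent hosts
    fi
    while IFS= read -r name; do
        # Skip empty lines in the input list.
        [ -n "$name" ] || continue
        # A non-zero exit code of the resolver means "not resolvable".
        "$@" "$name" </dev/null >/dev/null 2>&1 || printf '%s\n' "$name"
    done
)
```

With one host name per line in a file such as hosts.txt, a call like find_unresolvable < hosts.txt would list the names that still need a DNS entry.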
3. Folder structures for hosts
Checkmk manages your hosts in a hierarchical tree of folders, much like the files in your operating system. If you only monitor a handful of hosts, this may not seem so important to you. But remember that Checkmk is designed to monitor thousands and tens of thousands of hosts, so good order can be half the battle.
Before you add the first hosts to Checkmk, it is therefore worth thinking about the structure of these folders. On the one hand, the folder structure is useful for your own overview. More importantly, however, it can be used for the configuration of Checkmk. All configuration parameters of hosts can be defined on a folder and are then automatically inherited by its subfolders and by the hosts they contain. Setting up a well-considered folder structure from the beginning is therefore essential, especially, but not only, for the configuration of large environments.
Once you have created a folder structure, you can change it — but you must do so very carefully. Moving a host to another folder can have the effect of changing its parameters without you being aware of it.
The real consideration when building a folder structure that will be most useful to you is the criteria by which you want to organise the folders. The criteria can be different at each level of the tree. For example, you can distinguish by location in the first level and by technology in the second level.
The following classification criteria have proven themselves in practice:
Sorting by location is particularly obvious in larger companies, especially if you distribute the monitoring over several Checkmk servers. Each server then monitors a region or a state/country, for example. If your folders map this distribution, then you can define, for example, in the folder 'Munich' that all hosts in this folder are to be monitored from the Checkmk site 'muc'.
Alternatively, 'organisation' (i.e. the answer to the question 'Who is responsible for a host?') may be a more meaningful criterion, since location and responsibility may not always be the same. For example, it may be that one group of your colleagues is responsible for the administration of Oracle, regardless of the actual physical location of the corresponding hosts. If, for example, the folder 'Oracle' is intended for the hosts of the Oracle colleagues, it is easy to configure in Checkmk that all hosts below this folder are only visible to these colleagues and/or that they can even maintain their hosts there themselves.
Structuring by technology could, for example, provide a folder for Windows servers and one for Linux servers. This would simplify the implementation of a rule such as 'The sshd process must run on all Linux servers'. Another example is the monitoring of devices such as switches or routers via SNMP. Here, no Checkmk agent is used, but the devices are queried via the SNMP protocol. If these hosts are grouped in separate folders, you can make the settings necessary for SNMP, such as the 'Community', directly at the folder.
Since a folder structure can only rarely reflect the complexity of reality, Checkmk provides host tags as a supplementary way of structuring, but more on this in a separate chapter on fine-tuning the monitoring. For more information on the folder structure, among other things, see the article on host administration.
4. Creating folders
You can access the administration of folders and hosts via the navigation bar, the Setup menu, the Hosts topic and the Hosts entry. The Main directory page is then displayed:
Before we create the first folder, we will briefly discuss the structure of this page, since you will find the various elements on most Checkmk pages in the same or a similar format. Below the page title Main directory you will find the breadcrumb path, which shows you where you are currently located within the Checkmk interface. Below this, the menu bar is displayed, which summarises the possible actions on this page in menus and menu items. The menus in Checkmk are always context-specific, i.e. you will only find menu entries for actions that make sense on the current page.
Below the menu bar you will find the action bar, in which the most important actions in the menus are offered as buttons for direct clicking. You can hide the action bar with the button to the right of the Help menu and show it again with the corresponding button. When the action bar is hidden, its icons are displayed in the menu bar instead.
Since we are currently on an empty page (without folders and without hosts), the important actions for creating the first object are additionally offered via even larger buttons — so that the options offered by the page cannot be overlooked. These buttons will disappear after the first object has been created.
Now let’s get back to the reason why we are on this page: the creation of folders. One folder — the Main folder — exists in every freshly set up Checkmk system. It is called the Main directory, as you can see in the title of the page. Below the Main directory, we will now create the three folders Windows, Linux and Network as a simple exercise.
Create the first of the three folders by selecting one of the actions offered to create a folder (e.g. the Add subfolder button). On the new page Create new folder enter the folder name in the first box Basic settings:
In the above image, the Show less mode is active and only the entry that is absolutely necessary for creating a folder is displayed. Confirm the entry with Save.
Analogous to the Windows folder, create the other two folders Linux and Network. After that, the situation will look like this:
Tip: When you point the mouse at the tab or the top of a folder icon, the folder unfolds to reveal the icons you need to perform important actions with the folder (change the properties, move the folder or delete it, etc.).
One more tip: At the top right of each page you will find the information whether — and if so, how many — changes have already been accumulated in the meantime. Since we have created three folders, there are three changes, but they do not need to be activated yet. We will deal with activating changes in more detail below.
5. Adding the first host
Now everything is in place and ready to add the first host to the monitoring — and what could be more obvious than to monitor the Checkmk server itself? It won’t be able to report its own total failure of course, but this is still useful because it gives you not only an overview of CPU and RAM usage, but also several metrics and checks concerning the Checkmk system itself.
The procedure for including a Linux host (as well as a Windows host, by the way) is in principle always the following:
Download the agent
Install the agent
Create the host
After creating the host, the setup is completed by configuring the services and activating the changes, which we will describe next.
5.1. Downloading the agent
Since the Checkmk server is a Linux machine, you need the Checkmk agent for Linux.
In the Enterprise Editions, Setup > Agents > Windows, Linux, Solaris, AIX takes you to a page that also gives you access to the Agent Bakery with which you can 'bake' individually configured agent packages. In addition, however, a generic agent that you can download immediately is always offered:
The Raw Edition does not have an agent bakery. In this version, Setup > Agents > Linux will take you directly to the download page, where you will find pre-configured agents and agent plug-ins:
Download the package file. Choose the RPM file format for Red Hat, CentOS and SLES or the DEB file format for Debian and Ubuntu.
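If you are unsure which format applies to a given host, the distribution can be read from the ID field of /etc/os-release. The following is a small sketch that covers only the distributions named above; the function name agent_package_format is hypothetical, and other distributions are reported as unknown rather than guessed.

```shell
# Print "rpm" or "deb" depending on the ID field of an os-release file.
# Only the distributions named in the text are covered; the function name
# is hypothetical. The file defaults to /etc/os-release.
agent_package_format() (
    os_release=${1:-/etc/os-release}
    # Extract the value of the ID= line and strip optional double quotes.
    id=$(sed -n 's/^ID=//p' "$os_release" 2>/dev/null | tr -d '"')
    case "$id" in
        rhel|centos|sles) echo rpm ;;
        debian|ubuntu)    echo deb ;;
        *)                echo unknown; exit 1 ;;
    esac
)
```

Called without arguments, agent_package_format inspects the local system; a different file can be passed as the first argument.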
5.2. Installing the agent
For the following installation example, we assume that the downloaded package file is located in the /root directory, i.e. in the home directory of the root user. If you have downloaded the file to another directory, replace the /root directory with the actual directory in the following installation command. Similarly, replace the name of the package file with the name of the file you downloaded.
The package file is only needed during the installation and it can be deleted after completing the installation.
Note: In our example, the agent is installed on the Checkmk server, i.e. you do not need to copy the package file to another computer. If the downloaded file is not on the host targeted for the installation of the agent, you must first copy the file to the target host, for example with the command line tool scp. This is done in the same way as for the installation of the Checkmk software and as described for the Linux installation, e.g. for the installation under Debian and Ubuntu.
The installation is performed as root on the command line, for the RPM file with rpm, preferably with the -U option, which stands for upgrade and which ensures that the installation goes through without errors even if an old version of the agent is already installed:
root@linux# rpm -U /root/check-mk-agent-2.0.0b5-a38356026f314d52.noarch.rpm
Or for the DEB file with the dpkg -i command:
root@linux# dpkg -i /root/check-mk-agent_2.0.0b5-a38356026f314d52_all.deb
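The two installation commands above differ only in the tool that matches the file format. As a sketch, the choice can be derived from the file name suffix; the function name agent_install_command is hypothetical, and it only prints the command so that you can review it before running it as root.

```shell
# Print the installation command that matches the package file suffix.
# The command is only printed, not executed. Function name is hypothetical.
agent_install_command() {
    case "$1" in
        *.rpm) printf 'rpm -U %s\n' "$1" ;;
        *.deb) printf 'dpkg -i %s\n' "$1" ;;
        *)     echo "unsupported package file: $1" >&2; return 1 ;;
    esac
}
```

For the DEB file from the example, the function prints the same dpkg -i command as shown above.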
Important: The agent requires either the background programme systemd, which is standard on newer Linux distributions, or the auxiliary daemon xinetd. You can see which daemon is running on your computer from the output when you install the agent:
|Agent running …||Output|
Neither of the above two messages, instead:
If you have neither systemd nor xinetd, you will need to install xinetd. This can be done on Red Hat/CentOS with:
root@linux# yum install xinetd
on SLES with:
root@linux# zypper install xinetd
and on Debian/Ubuntu with:
root@linux# apt install xinetd
This completes the installation of the agent.
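Independently of the installer output, you can check in advance which of the two is present on a host. The following sketch makes two assumptions that are not taken from the text above: systemd is detected through the directory /run/systemd/system, which exists on systems booted with systemd, and xinetd is only checked for being installed, not for actually running. The function name agent_superserver is hypothetical.

```shell
# Print "systemd", "xinetd" or "none" for the local host.
# Assumptions: /run/systemd/system exists only when systemd is the running
# init system; xinetd is detected by its executable being on the PATH.
# The function name is hypothetical.
agent_superserver() {
    if [ -d /run/systemd/system ]; then
        echo systemd
    elif command -v xinetd >/dev/null 2>&1; then
        echo xinetd
    else
        echo none
    fi
}
```

If the function prints none, installing xinetd as described above is the next step.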
The Checkmk agent for Linux is an executable programme (shell script) that you can test very easily by invoking the check_mk_agent command:
root@linux# check_mk_agent
<<<check_mk>>>
Version: 2.0.0b5
AgentOS: linux
Hostname: mycmkserver
AgentDirectory: /etc/check_mk
DataDirectory: /var/lib/check_mk_agent
SpoolDirectory: /var/lib/check_mk_agent/spool
PluginsDirectory: /usr/lib/check_mk_agent/plugins
LocalDirectory: /usr/lib/check_mk_agent/local
...
The output from the check_mk_agent command is very long, so we have only listed the first few lines here.
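Because the full output is so long, a quick overview can help. The output is divided into sections that start with a header line of the form <<<check_mk>>>, as seen above. The following sketch lists just the section names; the function name list_agent_sections is made up for this example, and it assumes that every header stands alone on its own line.

```shell
# Read agent output from stdin and print only the section names,
# i.e. the text between "<<<" and ">>>" on header lines.
# The function name is hypothetical.
list_agent_sections() {
    sed -n 's/^<<<\(.*\)>>>$/\1/p'
}
```

On a host with the agent installed, check_mk_agent | list_agent_sections would print one section name per line, starting with check_mk.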
To test the accessibility of the agent from outside, you can try a connection on port 6556 from another computer via telnet. Here the agent should respond with this same output:
root@linux# telnet mycmkserver 6556
Trying 192.168.178.34...
Connected to mycmkserver.
Escape character is '^]'.
<<<check_mk>>>
Version: 2.0.0b5
AgentOS: linux
Hostname: mycmkserver
...
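If telnet is not installed, the same reachability test can be sketched with the /dev/tcp feature of bash. This is only an illustration under a few assumptions: bash and the timeout utility are available, and the function name agent_port_open is hypothetical. It only checks whether the TCP connection can be opened and does not read the agent output.

```shell
# Return 0 if a TCP connection to the given host and port can be opened.
# $1 = host name or IP address, $2 = port (defaults to the agent port 6556).
# Uses bash's /dev/tcp pseudo device; gives up after 5 seconds.
# The function name is hypothetical.
agent_port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "${2:-6556}" 2>/dev/null
}
```

A call such as agent_port_open mycmkserver && echo reachable then reports whether the port from the example above accepts connections.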
Note: By default, the agent is accessible from the entire network and can be queried without a password. However, since the agent does not accept any commands from the network, a would-be attacker cannot gain access. Information such as the list of current processes is visible, however. You can find out how to secure the agent in the article on the Linux agent.
5.3. Creating a host
After installing the agent on the host, you can add the host to the monitoring — in the already-prepared Linux folder. Just a reminder: In this example, the Checkmk server and the host to be monitored are identical.
In the Checkmk interface, open the same Main directory page where you have already created the three folders: Setup > Hosts > Hosts. There, change to the Linux folder by clicking on that folder.
Click Add host and the Add host page will open:
As with the creation of the three folders above, the Show less mode is still active. Therefore, Checkmk only shows the most important host parameters in the form, i.e. the ones that are necessary to create a host. If you are interested, you can see the rest by clicking the ellipsis (three dots) at each of the open boxes and opening the two collapsed boxes at the bottom of the page. As mentioned at the beginning, Checkmk is a complex system that has an answer to every question. That’s why you can configure so much on a host (but not only there).
Tip: On many pages — including this one — you can also display help texts for the parameters. To do this, select Show inline help from the Help menu. The selected setting remains active on other pages until you switch off the help again.
But now for the inputs for creating the first host. You only have to fill in one field, namely Hostname in the Basic settings.
You can freely assign this name. However, you should know that the host name is of central importance, because it serves as an internal ID (or key) for unambiguous identification of the host at all points in the monitoring. Since it is so important in Checkmk and is often used, you should think carefully about the naming of your hosts. A host name can be changed at a later date, but as this is a time-consuming process, you should avoid it.
It is best if the host can be resolved under its name in the DNS. If this is the case, you will be finished with this form. If not, or if you do not want to use DNS, you can also enter the IP address manually in the IPv4 Address field.
Note: To ensure that Checkmk can always run stably and with good performance, it maintains its own cache for the resolution of host names. For this reason, the failure of the DNS service does not lead to a failure of the monitoring. The details on host names, IP addresses and DNS can be found in the article on managing hosts.
Before we go any further, the initial host must first be completely created. We are not there yet — even though we are getting close to it.
Murphy’s law — "Everything that can go wrong will go wrong" — can unfortunately not be repealed for Checkmk. Things can go wrong, especially when you are trying them for the first time. Good tools for diagnosing errors are therefore important.
Already when creating a host, Checkmk offers not only to save the entries (host name and IP address) on the Create new host page, but also to test the connection to the host. In the action bar on the Create new host page you will find, among other things, the Save & go to connection tests button. Click on this button.
The Test connection to host page will appear and Checkmk will try to reach the host in various ways. For Linux and Windows hosts only the two upper boxes are interesting:
The output in the Agent box assures you that Checkmk can successfully communicate with the agent you have previously installed manually on the host.
In further boxes you can see how Checkmk tries to make contact via SNMP. This predictably leads to SNMP errors in this example, but this is very useful for network devices, which we will discuss below.
On this page you can try a different IP address in the Host Properties box if necessary, run the test again and even transfer the changed IP address directly to the host properties with Save & go to host properties.
Click this button (whether you have changed the IP address or not) and you will land on the Properties of host page.
6. Configuring services
Once the host itself has been included, the really interesting part begins — the configuration of its services. On the host properties page mentioned above, click Save & go to service configuration and the Services of host page will appear.
On this page you specify which services you want to monitor on the host. If the agent on the host is accessible and running correctly, Checkmk automatically finds a number of services and suggests them for monitoring (shown here in an abbreviated form):
For each of these services, there are in principle the following possibilities:
Undecided : You have not yet decided whether to monitor this service.
Monitored : The service is being monitored.
Disabled : You have basically chosen not to monitor the service.
Vanished : The service was being monitored, but it now no longer exists.
The page shows all services ordered by these categories into tables. As you have not yet configured a service, you will only see the Undecided table. To start with, it is easiest to click on Monitor undecided services. That will transfer all services directly into the monitoring, and all Undecided services will become Monitored services.
You can always visit this page later to adjust the configuration of the services. Sometimes new services are created by changes to a host, e.g. when you include a Logical Unit Number (LUN) as a file system or configure a new Oracle database instance. These services then reappear as Undecided, at which point you can include them in the monitoring individually or all at once.
Conversely, services can also disappear, for example because a file system has been removed. These services then appear in the monitoring as UNKNOWN and on this page as Vanished and can be removed from the monitoring with Remove vanished services.
The button Fix all does everything at once — adding missing services and removing vanished ones.
7. Activating changes
Checkmk initially saves all changes you make only in a temporary 'configuration environment' that does not yet influence the currently-operating monitoring. Only by 'activating the accumulated changes' will they be transferred to the monitoring. You can read more about the background to this in the article on configuring Checkmk.
As we mentioned above, on the top right of each page you will find information on how many changes have so far accumulated that have not yet been activated. Click on the link with the number of changes. This will take you to the Activate pending changes page, which lists, among other things, the changes that have not yet been activated at Pending changes:
Now click on the Activate on affected sites button to apply the changes.
Shortly after, you will be able to see the result in the sidebar in Overview, which now shows the number of hosts (1) and the number of services you previously selected. Also in the standard dashboard, which you can reach by clicking on the Checkmk logo in the top left of the navigation bar, you will now be able to see that the system has become filled with life.
You have now successfully transferred the first host and its services into the monitoring — Congratulations!
8. Monitoring Windows
Just as for Linux, Checkmk also has its own agent for Windows. This is packaged as an MSI package. You will find it in the same place as the Linux agent (in the Checkmk Raw Edition just next to it under Setup > Agents > Windows). Once you have downloaded the MSI package and copied it to your Windows computer, you can install it by double-clicking, as is usual with Windows.
Note: You may need to configure the Firewall settings in Windows to allow Checkmk access over the network.
Once the agent has been installed, you can add the host to the monitoring. Follow the same procedure as described above for the Linux host, but create the host in the designated Windows folder. Since Windows is structured differently from Linux, the agent will naturally find other services. For a detailed introduction to this subject, see the article on monitoring Windows.
9. Monitoring with SNMP
Professional quality switches, routers, printers and many other devices and appliances already have a built-in interface for monitoring from the manufacturer — the Simple Network Management Protocol (SNMP). Such devices can be monitored very easily with Checkmk — and you don’t even need to install an agent.
The basic procedure is always the same:
In the device’s management interface, enable SNMP for read access from the IP address of the Checkmk server.
Assign a Community when doing so. This is nothing more than a password for access. Since this is usually transmitted in plain text in the network, it only makes limited sense to choose a very complicated password. Most users simply use the same community for all devices within a company. This also greatly simplifies the configuration in Checkmk.
In Checkmk, create the host for the SNMP device as described above, this time in the designated Network folder.
In the host properties, in the Monitoring agents box, check Checkmk agent / API integrations and select No API integration, no Checkmk agent.
In the same Monitoring agents box, check SNMP and select SNMP v2 or v3.
If the Community is not public, under Monitoring agents again activate the SNMP credentials entry, select SNMP community (SNMP Versions 1 and 2c) and enter the Community in the input field below.
For the last three steps above (steps 4 to 6), the result should look like the following screenshot:
Tip: If you have created all SNMP devices in a separate folder, simply carry out the configuration of the Monitoring agents for the folder. This will automatically apply these settings to all of the hosts in this folder.
The rest runs as usual. If you want, you can take a look at the Test connection to host page with the Save & go to connection tests button. There you can immediately see whether access via SNMP works, here for a Cisco Catalyst 4500 switch, for example:
On the Properties of host page, click on Save & go to service configuration to display the list of all services. This naturally looks completely different from Linux or Windows. On all devices, by default Checkmk monitors all ports that are currently in use. You can customise this later as you wish. In addition, one service that is always OK shows you the general information about the device, and another service shows you the uptime.
A detailed description can be found in the article on monitoring via SNMP.
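Before creating the host in Checkmk, it can be useful to verify from the command line of the Checkmk server that the device answers with the chosen Community. The following sketch assumes that the Net-SNMP tool snmpget is installed, which the text above does not require, and queries the standard OID 1.3.6.1.2.1.1.1.0 (sysDescr.0) with SNMP v2c. The function name snmp_probe and the SNMPGET override variable are hypothetical.

```shell
# Query the system description (sysDescr.0) of a device via SNMP v2c.
# $1 = host name or IP address, $2 = community (defaults to "public").
# SNMPGET can be set to use a different executable than snmpget, e.g. for
# a dry run. Function name and variable are hypothetical.
snmp_probe() {
    "${SNMPGET:-snmpget}" -v2c -c "${2:-public}" "$1" 1.3.6.1.2.1.1.1.0
}
```

If the device is configured as described in the first two steps above, a call such as snmp_probe with the device's address and your Community should return its description; a timeout usually points to a wrong Community or a missing read permission for the Checkmk server.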
10. Clouds, containers and virtual machines
You can also monitor cloud services, containers and virtual machines (VM) with Checkmk, even if you do not have access to the actual servers. Checkmk uses the application programming interfaces (API) provided by the manufacturers for this purpose. These interfaces always use HTTP or HTTPS for access.
The basic principle is always the following:
Set up an account for Checkmk in the manufacturer’s management interface.
Create a host in Checkmk to access the API.
Set up a configuration for this host to access the API.
For the monitored objects such as VMs, EC2 instances, containers, etc., create additional hosts in Checkmk or automate their creation.