Checkmk

1. Introduction

In Checkmk you configure parameters for hosts and services by using rules. This feature makes Checkmk very effective in complex environments, and also brings a number of advantages to smaller installations. In order to clarify the principle of rule-based configuration we will compare it to the classic method:

1.1. The classic approach

As an example, let’s take the configuration of the thresholds for WARN and CRIT in the monitoring of file systems. With a database-oriented configuration, one would enter a line for every file system into a table:

Host        | Filesystem | Warning | Critical
myserver001 | /var       | 90%     | 95%
myserver001 | /sapdata   | 90%     | 95%
myserver001 | /var/log   | 90%     | 95%
myserver002 | /var       | 85%     | 90%
myserver002 | /opt       | 85%     | 90%
myserver002 | /sapdata   | 85%     | 95%
myserver002 | /var/trans | 100%    | 100%

This is relatively clear — but only because the table in this example is short. In practice there tend to be hundreds or thousands of file systems. Tools like copy & paste, and bulk operations can simplify the work, but the basic problem remains — how can you identify and implement a concept? What is the general rule? How should thresholds for future hosts be preset?

1.2. Rules-based is better!

A rules-based configuration however consists of the concept! We will replace the logic of the above table with a set of four rules. If we assume that myserver001 is a test system, and that in each case the first relevant rule applies to every file system, the result will be the same thresholds as in the table above:

  1. File systems with the mount point /var/trans have a 100/100% threshold.

  2. The /sapdata file system on myserver002 has a 85/95% threshold.

  3. File systems on test systems have a 90/95% threshold.

  4. All (unspecified) file systems have a 85/90% threshold.
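The four rules above can be tried out with a few lines of Python. This is only a minimal sketch of the "first relevant rule wins" principle, not Checkmk's implementation: the `Rule` class, the `HOST_TAGS` mapping and the `thresholds` function are hypothetical names chosen for this illustration, and the tag ID `test` stands in for "test system".

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple


@dataclass
class Rule:
    """Hypothetical rule model: a condition plus a (warn %, crit %) pair."""
    description: str
    matches: Callable[[str, str, Set[str]], bool]  # (host, mount point, host tags)
    levels: Tuple[int, int]


# The four rules from the text, in order of priority.
RULES = [
    Rule("/var/trans on any host", lambda h, m, t: m == "/var/trans", (100, 100)),
    Rule("/sapdata on myserver002",
         lambda h, m, t: h == "myserver002" and m == "/sapdata", (85, 95)),
    Rule("test systems", lambda h, m, t: "test" in t, (90, 95)),
    Rule("all other file systems", lambda h, m, t: True, (85, 90)),
]

# Assumption from the text: myserver001 is a test system.
HOST_TAGS = {"myserver001": {"test"}, "myserver002": set()}


def thresholds(host: str, mount_point: str) -> Tuple[int, int]:
    """Return the levels of the first rule whose condition matches."""
    tags = HOST_TAGS.get(host, set())
    for rule in RULES:
        if rule.matches(host, mount_point, tags):
            return rule.levels
    raise LookupError("no rule matched")  # unreachable while a catch-all rule exists
```

Calling `thresholds("myserver002", "/sapdata")` walks the list from the top and stops at the second rule, which is why the order of the rules matters.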

Granted, for only two hosts that doesn’t achieve much, but with only a few more hosts it can quickly make quite a big difference. The advantages of the rules-based configuration are obvious:

  • The concept is clearly recognisable and can be reliably implemented.

  • You can change the concept at any time without needing to handle thousands of data sets.

  • Exceptions are always still possible, but are documented in the form of rules.

  • The incorporation of new hosts is simple and less fault-prone.

In summary, then: less work — more quality! For this reason, with Checkmk you will find an abundance of rules for customising hosts and services — such as thresholds, monitoring settings, responsibilities, notifications, agent configuration and many more.

1.3. Types of rule sets

Within Setup Checkmk organises rules in Rule sets. Every rule set has the task of defining a specific parameter for hosts or services. Checkmk contains more than 700 rule sets! Here are some examples:

  • Host check command — defines how to determine whether hosts are UP.

  • Alternative display name for services — defines alternative names for services’ displays.

  • JVM memory levels — sets thresholds and other parameters for the monitoring of Java VMs’ memory usage.

Every rule set is responsible either for hosts or for services, never for both. If a parameter can be defined for hosts as well as services, there is a pair of rule sets, e.g., Normal check interval for host checks and Normal check interval for service checks.

A few rule sets, strictly speaking, don’t define parameters; rather, they create services. One example is the rules in the Active checks category. With these you can, e.g., set up an HTTP check for specific hosts. These rules are classified as host rules, because the existence of such a check on a host is deemed to be a characteristic of that host.

Further, there are rule sets that control the Service discovery. With these you can, for example, define via Windows service discovery for which Windows-services automatic checks should be created if they are found on a system. These are also host rules.

The bulk of the rule sets determine parameters for specific check plug-ins. An example is Network interfaces and switch ports. The settings in these rules are tailored very specifically to their respective plug-in. Such rule sets apply only to those services that are based on this plug-in. If you are uncertain which rule set is responsible for which services, the best way to find out is to navigate directly via the service to the relevant rule. How to do this will be explained later.

1.4. Host tags

One thing we have so far not mentioned: In the above example there is a rule for all test systems. Where is it actually defined which host is a test system?

In Checkmk, something like test system is known as a host tag. You can freely define your own tags via Setup > Hosts > Tags, and some tags are already predefined. Applying them to hosts is done either explicitly in the properties of the host or through inheritance in the folder hierarchy. How to do this is explained in the article on the hosts. How to create your own tags, and what the predefined tags are about will be explained later in this article.

2. Determining the correct rule sets

2.1. Host rule sets

If you wish to create a new rule that defines a parameter for one or more hosts, there are several ways to do so. The direct way is via the corresponding group in the setup menu, in this case Setup > Hosts > Host monitoring rules:

Setup menu with focus on the Host monitoring rules.

In the following view, all rule sets relevant for host monitoring are displayed. The numbers behind the names of these rule sets show the number of rules already defined:

View of the Host monitoring rules in the setup menu.

However, you can reach your goal somewhat faster via the search field. To do this, of course, you need to know approximately what the ruleset is called. Here is the result of a search for host checks as an example.

Extract of the result of a search for host checks.

Another way is via the menu item Hosts > Effective parameters in the properties of an existing host in the Setup, or via the rulesets icon in the list of hosts of a folder.

List of some hosts in the setup menu, with a highlighting of the button for host parameters.

There you will find not only all the rule sets that affect the host, but also the parameter currently effective for this host. In the example of Host check command no rule applies for the shown host and it is therefore set to the default PING (active check with ICMP echo request):

wato rules host rule sets

Click on Host check command in order to see the complete rule set.

If a rule already exists, instead of the Default value the number of the rule defining this parameter appears. Clicking on this takes you directly to the rule.

wato rules host rule sets2

2.2. Service rule sets

The path to the rule sets for services is similar. The general access is via the setup menu, in this case Setup > Services > Service monitoring rules or, more appropriately via the search field.

wato rules service monitoring rules

If you are not yet very experienced with the names of the rule sets, then the path via the service is simpler. Similarly to the hosts, there is also a page in which all of a service’s parameters are shown and where you have the possibility of directly accessing the applicable rule sets. You can access this parameter page with the icon rulesets symbol in a host’s list of services in the setup. The icon check parameters symbol takes you directly to the rule set that defines the parameter for the check plug-in for this service.

wato rules setup service list

By the way — the icon rulesets symbol for the parameter page is also found in the status window in every service’s context menu:

wato rules service context menu

2.3. Enforced services

In the setup menu you will also find an entry for Enforced Services. As the name suggests, you can use these rulesets to force services to be created on your hosts. Details can be found in the article about services. A small number of rulesets - such as Simple checks for BIOS/Hardware errors - can only be found under the enforced services. These are services which do not result from the service detection, but are created manually by you.

2.4. Rule sets in use

In each of the aforementioned lists of rule sets, whether in the Host monitoring rules or the Service monitoring rules, you can use Related > Used rulesets in the menu bar to display only the rule sets in which you have defined at least one rule. This is often a convenient way to get started if you want to make adjustments to your existing rules. Incidentally, some of the rules are already created when the Checkmk instance is created and are part of the sample configuration. These are also displayed here.

2.5. Ineffective rules

Monitoring is a complex matter. It can happen that there are rules which do not match a single host or service, either because you have made a mistake or because the matching hosts and services have disappeared. Such ineffective rules can be found in the aforementioned rule set listings via Related > Ineffective rulesets in the menu bar.

2.6. Obsolete rule sets

Checkmk is under constant development. Occasionally this means that some rule sets are replaced by others. If you have such rule sets in use, the easiest way to find them is to do a rule search. Open it via Setup > General > Rule search in the navigation bar. Then click in the menu bar on Rules > Refine search, select Search for deprecated rulesets as the option for Deprecated and select Search for rulesets that have rules configured as the option for Used. After an additional click on Search you get the desired overview.

Search for obsolete rule sets.

3. Creating and editing rules

The following image shows the Filesystems (used space and growth) rule set with four rules configured:

rules filesystem

New rules are created either with the Create rule in folder button, or by cloning an existing rule with the clone button. Cloning creates an identical copy of the rule that you can then edit with the edit icon. A new rule created using the Create rule in folder button will always appear at the end of the list of rules, whereas a cloned rule will be displayed as a copy below the original rule from which it was cloned.

The sequence in which the rules are listed can be changed with the drag buttons. The sequence is important because rules positioned higher in the list always have priority over those located lower.

The rules are stored in the same folders from which you also manage the hosts. A rule’s scope is restricted to the hosts in its folder or in subfolders. In the case of conflicting rules, the rule lower in the folder structure has priority. In this way, for example, users with rights limited to certain folders can create rules for their hosts without affecting the rest of the system. In a rule’s properties you can change its folder and thus ‘relocate’ it.
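The interplay of folder and list position can be pictured as a sort order. The following is a simplified sketch under stated assumptions, not Checkmk's internal logic: folders are modelled as tuples of folder names (the empty tuple is the main folder), and `effective_rule_order`, `folder` and `position` are hypothetical names.

```python
def effective_rule_order(rules, host_folder):
    """Return the rules that can affect a host, highest priority first.

    A rule is applicable if its folder is the host's folder or one of its
    parent folders. Rules in deeper folders come first; within one folder
    the position in the list decides.
    """
    applicable = [
        rule for rule in rules
        if host_folder[: len(rule["folder"])] == rule["folder"]
    ]
    return sorted(applicable, key=lambda rule: (-len(rule["folder"]), rule["position"]))


# Hypothetical example data: () is the main folder.
RULES = [
    {"name": "main: default levels", "folder": (), "position": 0},
    {"name": "linux: test systems", "folder": ("linux",), "position": 0},
    {"name": "linux: everything else", "folder": ("linux",), "position": 1},
    {"name": "windows: default levels", "folder": ("windows",), "position": 0},
]

# A host in the folder linux/db is never affected by rules stored in windows.
ORDER = [rule["name"] for rule in effective_rule_order(RULES, ("linux", "db"))]
```

In this example `ORDER` lists the two rules from the linux folder before the rule from the main folder, and the rule stored in the windows folder does not appear at all.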

3.1. The analysis mode with ‘traffic lights’

When you access a rule set via a host or service — for example, by using the icon rulesets or button check parameters symbols in the host or service — Checkmk shows you the rule set in the analysis mode:

rules filesystem analyze

This mode has two features. Firstly, a second button for setting rules appears — Create mount point specific rule for as an example here. With this you can create a new rule which has the appropriate current host or service already preselected. You can create an exceptional rule very easily and directly in this way. Secondly, a ‘traffic light’ symbol appears in every line, the colour of which shows whether and/or how this rule affects the current host, or respectively, service. The following conditions are possible:

icon rulenmatch

This rule has no effect on the current host or service.

icon rulematch

This rule matches and defines one or more parameters.

icon ruleimatch

The rule is applicable. But because another rule higher in the hierarchy has priority this rule is ineffective.

icon rulepmatch

This rule is applicable. Another rule higher in the hierarchy in fact has priority but doesn’t define all parameters, so that at least one parameter is defined by this lower rule.

The last condition, in which the rule is a partial match, can only occur for rule sets in which a rule can define multiple parameters by selecting individual check boxes. Theoretically, each of these parameters can be set by a different rule. More on this later.

4. Rule characteristics

4.1. General options

Every rule is assembled from three blocks. Everything in the first block, Rule options, is optional, and serves primarily for documentation:

rules props properties
  • The Description will be shown in the table of all rules in a rule set.

  • The Comment field can be used for a longer description. It only appears in a rule’s edit mode. Via the icon insertdate symbol you can insert a date stamp and your login name in the text.

  • The Documentation-URL is intended for a link to internal documentation that you maintain in another system (e.g., a CMDB). It will appear as the clickable icon url symbol in the rules table.

  • With the Do not apply this rule check box you can temporarily disable this rule. It will then be flagged as icon disabled in the table and is thus ineffective.

4.2. The defined parameters

The second block is different for every rule. The following image shows a widely-used type of rule (DB2 Tablespaces). Using check boxes you can determine which individual parameters the rule should define. As described earlier, Checkmk ascertains, separately for each individual parameter, which rules will set the parameters. The rule in the image simply defines one value, and leaves all other settings unaffected.

rules props value 1

Some rule sets define no parameters, rather they only decide which hosts are in and which are not. An example is the Hosts to be monitored rule set with which you can remove some hosts completely from the monitoring. The value area then looks like this:

rules props value 2

If you select Make the outcome of the rule positive here, this means that the affected hosts are included in the set. In our example, they will be monitored.

4.3. Conditions

In the third block, Conditions, you can define for which hosts or services the rules should apply. Here there are different conditions, all of which must be met in order for the rule to be applied. The conditions are therefore logically AND-linked:

rules props conditions 1

Condition type

Here you have the option of using normal conditions as well as predefined conditions. These are managed via Setup > General > Predefined conditions. Here you simply give fixed names to the rule matches that you need again and again, and from then on simply refer to them in the rules. You can even later change the content of these conditions centrally and all the rules will be automatically-adjusted to suit. In the following example the predefined condition No VM has been selected:

rules props conditions 2

Folder

With the Folder condition you define that the rule only applies to hosts in this folder — or subfolder. If the setting is Main Directory, this condition is applicable to all hosts. As described above, the folders have an effect on the rule’s sequence. Rules in lower folders always have priority over higher ones.

Host tags

Host tags restrict rules to hosts according to whether they have — or do not have — specific host tags. Here AND-links are also always used. Every other host tag condition in a rule reduces the number of hosts affected by the rule.

If you wish to make a rule applicable for two possible values of a tag (for example, both Productive system and Business critical in the Criticality group), you cannot do this with a single rule. You will require a copy of the rule for every value. Sometimes a negation can also help here. You can also define that a tag is not present as a condition (e.g., not test system). The so-called auxiliary tags are another possibility.

Because some users really use many host tags, we have designed this dialog so that not all tag groups are displayed by default. You have to specifically select the one needed for the condition. It works like this:

  1. In the selection box choose a tags group.

  2. Click Add tag condition — an entry for this group will then be added.

  3. Select is or is not.

  4. Select the desired comparison value.

rules props hosttags
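The AND-linking of host tag conditions, including the is / is not choice, can be sketched in a few lines. This is an illustration only: the function `tags_match` and the data layout (one tag ID per tag group) are assumptions made for this example.

```python
def tags_match(host_tags, conditions):
    """Check AND-linked tag conditions against one host.

    host_tags:  dict mapping tag group ID -> tag ID chosen for the host
    conditions: list of (group ID, operator, tag ID), operator "is" or "is not"
    """
    for group, operator, tag in conditions:
        has_tag = host_tags.get(group) == tag
        if operator == "is":
            if not has_tag:
                return False
        elif operator == "is not":
            if has_tag:
                return False
        else:
            raise ValueError(f"unknown operator: {operator!r}")
    return True  # every condition was met (also true for an empty list)


# Hypothetical host with one tag per group.
HOST = {"criticality": "prod", "networking": "lan"}
```

Each additional condition can only shrink the set of matching hosts, which is why a rule for "prod or critical" cannot be written as two positive conditions on the same group.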

Labels

You can also use the Labels for conditions in rules. Include conditions with Add label condition: choose either has or has not to formulate a positive or negative condition, and then enter the label in the usual form key:value. Please pay attention to the exact spelling here, and case-sensitivity where applicable, otherwise the condition will not work correctly.

rules props labels

Explicit hosts

This type of condition is intended for exception rules. Here you can list one or more host names. The rule will apply only to these hosts. Please note that if you check the Specify explicit host names box but enter no hosts, then the rule will be completely ineffective.

Via the Negate option you can define a reversed-exception. With this you can exclude explicitly-named hosts from the rule.

rules props explicithosts 1

Important: all host names entered here will be checked for an exact match. Checkmk is fundamentally case-sensitive!

You can change this behaviour to regular expressions by prefixing host names with a tilde (~). In this case, as always in the Setup:

  • The match is applied to the beginning of the host name

  • The match is not case-sensitive

A dot followed by an asterisk (.*) in a regular expression matches an arbitrary sequence of characters. The following example shows a condition which all hosts will match whose names contain the character sequence my (or My, MY, mY etc.):

rules props explicithosts 2
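The matching behaviour for explicit hosts described above can be approximated with Python's `re` module. This is a sketch under the assumptions stated in the text (exact, case-sensitive comparison by default; prefix match without case-sensitivity after a tilde); the function name `host_matches` is made up for this example.

```python
import re


def host_matches(host_name, entries, negate=False):
    """Return True if the host satisfies an 'explicit hosts' condition.

    entries: list of host names; an entry starting with "~" is treated as a
             regular expression that must match at the beginning of the name,
             ignoring case. All other entries must match exactly.
    negate:  invert the result, i.e. exclude the listed hosts.
    """
    def entry_matches(entry):
        if entry.startswith("~"):
            # re.match anchors the pattern at the start of the string.
            return re.match(entry[1:], host_name, re.IGNORECASE) is not None
        return entry == host_name

    result = any(entry_matches(entry) for entry in entries)
    return not result if negate else result
```

With the entry `~.*my`, the leading `.*` is what allows the sequence my to appear anywhere in the name; `~my` alone would only match names that begin with it.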

Explicit services

For rules that are applicable to services there is a last type of condition that defines a match on a service’s name, or respectively — for rules that set check parameters — the check item’s name. With what exactly the match will be made can be seen in the caption. In our example it is the name (instance) of a tablespace:

rules props explicitservices

Matching here is always done with regular expressions. Because the match is always applied to the start of the name, the expression .*temp is needed to match all tablespaces containing temp. The dollar sign at the end of transfer$ represents the end of the name and thereby forces an exact match. A tablespace with the name transfer2 will thus not match.

Please don’t forget: for rules concerning explicit services a match with the service name is required (e.g. Tablespace transfer). For check parameter rules a match with the item applies (e.g. transfer). The item is in fact the variable part of the service name, and determines to which tablespace it applies.
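The effect of the start anchoring and of the trailing dollar sign is easy to check with Python's `re.match`, which likewise applies a pattern at the beginning of a string. The helper `item_matches` is a hypothetical name, and case handling is deliberately not modelled in this sketch.

```python
import re


def item_matches(pattern, item):
    """Apply a pattern at the start of an item name, as described above."""
    return re.match(pattern, item) is not None


# Tablespace names used as examples; all of them are made up.
EXAMPLES = {
    (".*temp", "mytemp01"): item_matches(".*temp", "mytemp01"),
    ("temp", "mytemp01"): item_matches("temp", "mytemp01"),
    ("transfer", "transfer2"): item_matches("transfer", "transfer2"),
    ("transfer$", "transfer2"): item_matches("transfer$", "transfer2"),
    ("transfer$", "transfer"): item_matches("transfer$", "transfer"),
}
```

The pattern transfer without the dollar sign also matches transfer2, because only the beginning of the name is compared.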

Incidentally, there are services without an item. An example is CPU load. This exists only once for each host, so no item is required. It follows that rules for such check types have no condition on the item.

5. Types of rule analysis

In the introduction to the principle of rules, we stated that the first applicable rule determines the result of an analysis. That is not the whole truth: there are altogether three different types of analysis:

Analysis | Action
The first rule | The first rule that applies defines the value. Subsequent rules will not be analysed. This is the normal situation with rules that set simple parameters.
The first rule per parameter | Every individual parameter will be set by the first rule that defines this parameter (check box selected). This is the normal situation for all rules with subparameters that are activated with check boxes.
All rules | All applicable rules add elements to the results. This type is used for the allocation of hosts and services to host, service and contact groups for example.

This information is shown at the top of every rule set.

rules matching strategy
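The three types of analysis differ only in how the values of the matching rules are combined. The following sketch assumes that the values of all matching rules are already available in order of priority; the function names and the example values (`levels`, `trend_range`, the group names) are hypothetical.

```python
def first_rule(values):
    """'The first rule': the first matching rule defines the value."""
    return values[0] if values else None


def first_rule_per_parameter(values):
    """'The first rule per parameter': each key is taken from the first
    matching rule that defines it; later rules only fill the gaps."""
    merged = {}
    for value in values:
        for parameter, setting in value.items():
            merged.setdefault(parameter, setting)
    return merged


def all_rules(values):
    """'All rules': every matching rule contributes its elements."""
    result = []
    for value in values:
        for element in value:
            if element not in result:
                result.append(element)
    return result


# Values of the matching rules, highest priority first.
HOST_CHECK = first_rule(["PING", "TCP"])
FS_PARAMS = first_rule_per_parameter(
    [{"levels": (90, 95)}, {"levels": (80, 90), "trend_range": 24}]
)
CONTACT_GROUPS = all_rules([["linux-admins"], ["dba", "linux-admins"]])
```

In the second case the lower rule still contributes `trend_range`, which corresponds to the partial match shown in the analysis mode.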

6. Host tags in detail

As we have seen, the host tags are an important basis for defining rules. However, they are also useful in other places. In views, for example, there is a filter for host tags. The side bar element Virtual host tree can arrange your folders by host tags into a tree. And on the command line, with many commands you can select all hosts having the foo tag by using the @foo syntax.

So that everything makes sense you should set up your own host tags scheme that optimally suits your environment. But before we show you how you can define your own host tags within the Setup, we should now explain a few terms.

6.1. Tag groups, check box tags, themes and auxiliary tags

Host tags are organised in groups. A host can have at most one tag from each group. A good example of a custom group would be Datacenter, with the possible tags DC 1 and DC 2. With these every host will be assigned to one of the two data centres. Should you wish to add a host that is located in neither of the data centres you will need a third option for selection, for example, Not in a datacenter.

Some users have attempted to represent the application running on a host in a tag group. One such group was called, let’s say, Application, and had the tags ORACLE, SAP, MS Exchange, etc. This will work until the day a host has two applications, and that day will certainly come!

The correct solution for this situation would be to create a tag group for each application, each with only two options: yes or no. Checkmk simplifies this by allowing you to create tag groups with only a single tag. These will not be shown as a selection field in the host mask, rather they will be displayed as a check box. Selecting the check box sets the tag, otherwise the tag will not apply. Such tag groups are also called checkbox tags.

So that this doesn’t get confusing if you have many tag groups (e.g., because you have numerous different applications), you can collate the tag groups into Topics. All tag groups with the same topic will then be …​

  • …​ consolidated into their own box in the host details.

  • …​ displayed in the rule conditions as a list which can be expanded and collapsed using a small triangle icon.

The topics serve only as a visual aid and have no influence on the actual configuration.

Auxiliary tags solve the following problem: Imagine that you have defined an operating system tag group with the tags Linux, AIX, Windows 2008 and Windows 2012. Now you want to define a rule which will be valid for all Windows hosts. This cannot work, because in a situation as described above you can only ever choose one tag per group.

In order to get around this problem you can define a Windows auxiliary tag. Assign this auxiliary tag to both the Windows 2008 and the Windows 2012 tags. A host that possesses either of these tags will always automatically receive the Windows auxiliary tag from Checkmk. In the rules, Windows will appear as its own tag for formulating conditions.
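How an auxiliary tag is derived from the tag chosen for a host can be sketched as follows. The tag IDs `win2008`, `win2012`, `linux`, `aix` and `windows` as well as the names `AUX_TAGS` and `effective_tags` are assumptions for this illustration only.

```python
# Hypothetical assignment: tag ID -> auxiliary tags it brings along.
AUX_TAGS = {
    "win2008": {"windows"},
    "win2012": {"windows"},
    "linux": set(),
    "aix": set(),
}


def effective_tags(chosen_tags):
    """Return the chosen tags plus all auxiliary tags attached to them."""
    result = set(chosen_tags)
    for tag in chosen_tags:
        result |= AUX_TAGS.get(tag, set())
    return result
```

A rule condition on `windows` then covers hosts tagged `win2008` as well as hosts tagged `win2012`, without the host itself ever being assigned `windows` directly.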

6.2. Predefined tags

During installation, Checkmk furnishes you with numerous tag groups:

Tag group | ID | Function | Choices
Checkmk Agent / API integrations | agent | Defines what type of data the host receives from its agents. | cmk-agent, all-agents, special-agents, no-agent
Criticality | criticality | The system’s service level. For the Do not monitor this host tag a predefined rule is provided which will disable the host monitoring. The other tags are merely examples without function. You can however assign these to hosts and then use them in rules. | prod, critical, test, offline
Networking Segment | networking | Treat this tag group only as an example. For the WAN (high latency) tag an example rule is deposited which matches the thresholds for PING response times to the higher message latency in WAN. | lan, wan, dmz
IP address family | address_family | Defines whether the host should be monitored per IPv4 or IPv6, or both. The group has the status builtin and can not be modified. This is necessary as the tags are required internally by Checkmk during the creation of the configuration. | ip-v4-only, ip-v6-only, ip-v4v6, no-ip
SNMP | snmp_ds | This determines whether SNMP data is evaluated. | no-snmp, snmp-v2, snmp-v1
Piggyback | piggyback | This characteristic defines whether and how piggyback data is expected/processed for corresponding hosts. | auto-piggyback, piggyback, no-piggyback

Modifying predefined tag groups

You can theoretically customise the predefined tag groups as long as they are not marked as builtin. Modifications in Criticality or Networking Segment are non-critical as these are only provided as examples.

6.3. Editing tag groups

Creating your own tags is achieved using Setup > Hosts > Tags. Depending on the Checkmk version, in a freshly-installed system it will look something like this:

A list of all builtin tag groups.

A new tag group is created with the Add tag group button, which opens the following input mask:

rules hosttags config properties

The Tag group ID is used internally to identify the tag group. This ID must be unique and cannot be changed later. The standard syntax for permitted characters applies (only letters, digits, underscore).

The Title will be used everywhere in the GUI in connection with the tag group. Because this is purely a display text it can be changed at any time without affecting the existing configuration.

You can leave Topic blank. Your tag group will then be displayed with the predefined groups. You can also create your own topics and use these to arrange your tags clearly in a summary.

The Choices in the next box are naturally of most importance.

rules hosttags config choices

It is essential that each Tag ID is unique, not only within the group but also across all groups! In case of doubt you can simply work with prefixes, e.g., loc_dc01 instead of only dc01.

The sequence, which you can change as usual with the drag button, doesn’t just have a visual function: the first tag in the list is deemed to be the default value! That means that all hosts without an explicit setting for this tag group will be automatically set to this value.

Under Auxiliary tags you can apply auxiliary tags to a tag that should be automatically added to the host when this tag is selected.
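Two of the points above lend themselves to a quick check in code: tag IDs must not occur in more than one group, and the first tag of a group acts as the default. The sketch below uses a hypothetical `TAG_GROUPS` mapping (group ID to ordered list of tag IDs, with made-up `loc_` tags) and the invented helper names `duplicate_tag_ids` and `tag_for_host`.

```python
from collections import Counter

# Hypothetical tag scheme: group ID -> tag IDs in the order shown in the list.
TAG_GROUPS = {
    "datacenter": ["loc_none", "loc_dc01", "loc_dc02"],
    "criticality": ["prod", "critical", "test", "offline"],
}


def duplicate_tag_ids(groups):
    """Return all tag IDs that appear more than once across all groups."""
    counts = Counter(tag for tags in groups.values() for tag in tags)
    return sorted(tag for tag, count in counts.items() if count > 1)


def tag_for_host(groups, group_id, explicit_tags):
    """Return the host's tag for a group, falling back to the group's first
    tag when the host has no explicit setting for that group."""
    return explicit_tags.get(group_id, groups[group_id][0])
```

Reordering the list of a group therefore changes which value all hosts without an explicit setting receive, so the tag intended as the default belongs at the top.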

6.4. Editing Auxiliary tags

You can create new Auxiliary tags with the button for adding new auxiliary tags. As usual, you assign a fixed ID and an informative title in the following dialog. You can add a Topic in the same way as in the tag groups.

rules hosttags auxtag basic

The assignment and the usage of these auxiliary tags is then done directly in the options for the tag groups themselves.

6.5. Deleting and modifying existing tags and tag groups

Modifying an existing tag group configuration appears to be a simple operation at first — but that is unfortunately not always the case, as it can have big impacts on an existing configuration. Changes that solely affect the display or only add new selections are no problem and have no effect on the existing hosts and rules:

  • A change to the title or topic of tags and tag groups

  • Adding an additional tag to a tag group

All other changes can impact existing hosts or rules that use the affected tags. Checkmk does not simply forbid such changes; instead it attempts to adapt your existing configuration so that everything functions effectively again. What exactly that means depends on the type of operation.

Deleting tag groups

Information from the affected tags will be erased from all hosts. If the tag group is used as a condition in existing rules you will receive the following warning:

rules hosttags delete

Here you need to decide whether you wish to remove the conditions from the existing rules or whether you wish to delete the rules completely. Both actions can make sense, but Checkmk cannot decide which action is better for you. If uncertain, you should go through the rule set (linked via the warning) and manually delete or modify all of the conditions for the affected group as needed.

Deleting single tags

Deleting tags is achieved by editing the group, removing the tag and then saving the data. This action can trigger a similar warning to that when deleting a tag group.

Hosts that had set the affected tag will be automatically reset to the default value. This will always be (as described above) the top tag in the list.

Rules that have a negative condition for the tag simply lose this condition, without comment. If you have, for example, a rule for all hosts that don’t have the loc_dc2 tag, and you delete the loc_dc2 tag completely from the configuration, then this condition is obviously superfluous.

If however a positive condition with the tag exists, you will receive the above warning and must decide how to adapt the configuration.
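The consequences of deleting a single tag, as described above, can be summarised in a small simulation. It is a sketch of the described behaviour only, with invented names (`delete_tag`, the rule names `r1` and `r2`, and the tag `loc_none`); it says nothing about how Checkmk stores its configuration.

```python
def delete_tag(tags, hosts, rules, tag):
    """Simulate removing one tag from a tag group.

    tags:  ordered tag IDs of the group (the first one is the default)
    hosts: dict host name -> tag ID currently set for this group
    rules: dict rule name -> list of (operator, tag ID) conditions on this group
    Returns the remaining tags, the updated hosts, the updated rules and the
    names of rules that still need a manual decision.
    """
    remaining = [t for t in tags if t != tag]
    default = remaining[0]
    # Hosts that had the deleted tag fall back to the default value.
    new_hosts = {h: (default if t == tag else t) for h, t in hosts.items()}

    new_rules, undecided = {}, []
    for name, conditions in rules.items():
        if any(op == "is" and t == tag for op, t in conditions):
            # Positive condition: remove the condition or delete the rule?
            undecided.append(name)
            new_rules[name] = conditions
        else:
            # Negative conditions on the deleted tag are silently dropped.
            new_rules[name] = [(op, t) for op, t in conditions if t != tag]
    return remaining, new_hosts, new_rules, undecided


RESULT = delete_tag(
    ["loc_dc1", "loc_dc2", "loc_none"],
    {"host_a": "loc_dc2", "host_b": "loc_dc1"},
    {"r1": [("is not", "loc_dc2")], "r2": [("is", "loc_dc2")]},
    "loc_dc2",
)
```

In `RESULT`, `host_a` ends up with `loc_dc1`, rule `r1` has lost its condition, and `r2` is reported as needing a decision.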

Renaming Tag-IDs

Unlike the IDs of tag groups, the IDs of tags can in fact be changed retrospectively. This is, so to speak, an exception to the Checkmk principle that IDs once set are unchangeable. It can however be useful if you want to prepare an import of data from an existing system for which you need to accommodate a different tag scheme.

To rename tag-IDs, go into the tag group’s edit mode and simply change the IDs there, leaving the titles unaltered. This last point is important so that Checkmk recognises that a rename has occurred rather than an option being removed and a new one added.

Before Checkmk executes the changes to the configuration, it will inform you of the consequences:

rules hosttags rename

Checkmk will now update all relevant hosts, folders and rules as appropriate.

Please be aware that there can nonetheless be situations in which manual corrections need to be made in some locations. For example, Tag-IDs are components of URLs that call views which filter by tags. Checkmk cannot alter these URLs for you. Likewise, filter configurations in reports and dashboards cannot be automatically updated. It is therefore a good idea to give enough thought to the tag scheme at the outset, so that later renames can be minimised.

6.6. Constructing a tree-view from host tags

In Checkmk hosts are usually organised in folders, which results in a natural hierarchy. You can display this as a tree view via the Folders snap-in in the sidebar, and from there call up the standard view of the hosts, filtered per branch. The Tree of Folders snap-in complements this tree with filter options for topics and options for different views. You can also create such a tree view with host tags, and thus generate a ‘virtual’ hierarchy, using the Virtual Host Tree snap-in. In addition to the host tags you can also incorporate the folder structure in such trees. Both the number of virtual trees and the number of their branches are unlimited.

Suppose that you have created the three tag groups place, device class and operating system for your hosts. On the top tree level you will then see a selection of the locations, below these the device classes, and finally the operating systems. Each level of the hierarchy takes you straight to a view of all hosts which have these tags.
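The idea of a virtual hierarchy built from tag groups can be illustrated by grouping hosts into nested dictionaries. This is a conceptual sketch, not the snap-in's implementation; the host names, the tag values and the function `build_tree` are made up, and `levels` is assumed to contain at least one tag group.

```python
def build_tree(hosts, levels):
    """Arrange hosts in a nested dict following the given tag groups.

    hosts:  dict host name -> dict of tag group ID -> tag ID
    levels: tag group IDs from the top tree level down to the lowest one
    """
    tree = {}
    for name, tags in hosts.items():
        node = tree
        for group in levels[:-1]:
            node = node.setdefault(tags[group], {})
        # The lowest level holds the list of host names.
        node.setdefault(tags[levels[-1]], []).append(name)
    return tree


HOSTS = {
    "srv1": {"place": "munich", "device_class": "server", "os": "linux"},
    "srv2": {"place": "munich", "device_class": "server", "os": "windows"},
    "sw1": {"place": "berlin", "device_class": "switch", "os": "other"},
}

TREE = build_tree(HOSTS, ["place", "device_class", "os"])
```

Changing the order of `levels` yields a different tree from the same tags, which mirrors how the order of the tag groups determines the hierarchy of a virtual host tree.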

To create a virtual host tree, first add the snap-in using the button for adding snap-ins at the bottom left of the sidebar.

rules virtual host tree snapin

Then call up the settings via Setup > General > Global Settings > User Interface > Virtual Host Trees, and create a new tree with Create new virtual host tree configuration.

rules virtual host tree new

Then assign an ID and Title to the tree, and optionally exclude empty tree branches with the Exclude empty tag choices check box. Then with Add new element add the desired tag groups in the desired order. If you want the folder hierarchy as the top level, simply start with WATO folder tree. As usual, you can later change the order, and thus the hierarchy in the display, using the handles.

rules virtual host tree settings

Save and apply the changes — and the new tree structure will immediately provide several new views.

rules virtual host tree views