
1. Checkmk-specific terms

Activating changes

Changes to the configuration only take effect in the monitoring after they have been activated in a second step. Partitioning programs handle this in a similar way: configure, check, apply.

More under Activating changes.

Active check

An active check is a small program or script that establishes a direct connection to a service on the network or internet and queries the monitoring data from there. Active checks are used for network-based services like HTTP, SMTP or IMAP, e.g. check_http for querying web pages. An active check handles both the collection and the evaluation of the data. This differs from a check plug-in, which is sometimes called a passive check, as it only evaluates existing data.
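
The idea of a program that both collects and evaluates the data can be illustrated with a minimal sketch. It is not one of the active checks shipped with Checkmk: the function name check_tcp is hypothetical, and the numeric states follow the common plug-in convention (0 = OK, 1 = WARN, 2 = CRIT, 3 = UNKNOWN), which is assumed here.

```python
import socket

# Conventional plug-in states (an assumption of this sketch).
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3


def check_tcp(host: str, port: int, timeout: float = 5.0) -> tuple[int, str]:
    """Open a TCP connection (collection) and map the outcome to a state (evaluation)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} accepts connections"
    except OSError as exc:
        # Refused connections, timeouts and name resolution errors all end up here.
        return CRIT, f"CRIT - {host}:{port} not reachable: {exc}"
```

A real active check would print the text and exit with the state as its return code, so that the monitoring core can pick up both.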

Agent

An agent collects the data relevant for the monitoring from a host. This agent can be a small program installed on the host (the Checkmk agent), an SNMP agent running independently of Checkmk on the host, a special agent that obtains the information through an API provided by the target system — or an active check that queries network-based services.

More under Monitoring agents.

Agent Bakery

With the Agent Bakery in the Checkmk Enterprise Editions (CEE), agents can be individually packaged and optionally also distributed automatically.

More under The Agent Bakery.

Agent plug-in

An agent plug-in extends the functions of the standard agent supplied with Checkmk. It is a small program or script that is called by the Checkmk agent and that enhances the agent’s output with additional sections of monitoring data. An example of an agent plug-in is the Agent Updater.
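
What "additional sections" means can be sketched as follows. In the agent output, a section starts with a header line of the form <<<name>>>, followed by the data lines. The section name my_app and its content are hypothetical and only serve as an illustration.

```python
def render_section(name: str, rows: list[list[str]]) -> str:
    """Render one agent section: a <<<name>>> header followed by one line per row."""
    lines = [f"<<<{name}>>>"]
    lines += [" ".join(row) for row in rows]
    return "\n".join(lines) + "\n"


# A plug-in would simply write its section to standard output.
print(render_section("my_app", [["sessions", "42"], ["queue_length", "7"]]), end="")
```

A check plug-in on the Checkmk site would then be needed to turn this section into services.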

Agent Updater

In the Checkmk Enterprise Editions (CEE), the Agent Updater is an agent plug-in that enables agents to be updated automatically.

API integrations

When the Checkmk Setup refers to API integrations, it means monitoring data that uses the Checkmk agent’s data format but which originates from a different source. Such sources can be data source programs, special agents or hosts that piggyback their data. If data received via an API integration is to be used in monitoring, API integrations must be enabled in a host’s properties.

Automation user

A special account for querying and configuring Checkmk independently of the web interface, e.g. via API, command line, script or web service. By default, the automation user has a randomly chosen automation password (automation secret).
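
As a sketch of API access with an automation user, the following builds (but does not send) a request to the Checkmk REST API. The server name, site name, user name and secret are placeholders, and the assumptions made here are that the REST API is reachable below check_mk/api/1.0 of the site and accepts a Bearer header consisting of user name and secret.

```python
import urllib.request


def build_request(base_url: str, user: str, secret: str, endpoint: str) -> urllib.request.Request:
    """Prepare an authenticated GET request; nothing is sent over the network here."""
    request = urllib.request.Request(f"{base_url}/{endpoint.lstrip('/')}")
    # Assumed header format: "Bearer <user> <automation secret>"
    request.add_header("Authorization", f"Bearer {user} {secret}")
    request.add_header("Accept", "application/json")
    return request


# Placeholder values; replace them with those of your own site.
request = build_request(
    "https://monitoring.example.com/mysite/check_mk/api/1.0",
    "automation",
    "s3cr3t",
    "version",
)
```

Passing the prepared request to urllib.request.urlopen() would execute the call.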

Business Intelligence (BI)

Business Intelligence in Checkmk makes it possible to derive the overall status of a higher-level entity from many individual status values and to display it clearly. Such an entity can be an abstract grouping of individual components or a business-critical application. For example, the status of an email application, consisting of various hosts, switches and services such as SMTP and IMAP, can be captured in a single visualization. Entirely intangible and non-technical goals can also be modeled, for example the on-time availability of a product to be delivered. This goal lies in the future and depends on many factors: the supply chain, a functioning machine park, available personnel and so on. Any threats to this abstract goal can be captured via the BI module.

Check

A check in the context of Checkmk is the checking of a host or service according to predefined rules. A check plug-in is the component that performs this check and thus determines the status of hosts and services. In other words, the execution of a check plug-in results in a status of OK, DOWN, UNREACH, WARN, CRIT, PEND or UNKNOWN being returned.

More under Rules.

Check plug-in

Check plug-ins are modules written in Python that run on the Checkmk site and that create and evaluate the services of a host. For example, the check plug-in df, found within a site at ~/share/check_mk/checks/ (legacy) or ~/lib/python3/cmk/base/plugins/agent_based/, creates services for a host’s mounted file systems from the agent data available in the site, and checks those services against that data, for example how much free space is left.

Checkmk extension package (MKP)

MKP is Checkmk’s own file format for aggregating and distributing extensions, i.e. custom check plug-ins, agent plug-ins, time series graph definitions, notification scripts, views, dashboards and so on.

Configuration environment

The Checkmk web interface is divided into monitoring and configuration environments. The latter refers to the areas where rules are built, hosts and services are added and defined, users are managed or general options are specified. The configuration environment is accessed via the Setup menu of the navigation bar.

More under The user interface.

Contact

Contacts are Checkmk users who are responsible for specific hosts and services. The assignment of contacts to hosts and services is done via contact groups. Contacts can also be user accounts that exist purely for notifications, such as for forwarding to a ticket system.

More under Contact groups.

Dashboard

A dashboard is a freely-configurable overview consisting of views and/or so-called dashlets. These elements are available, for example, in the form of lists (such as host problems), time series graphs or small speedometers that visualize individual values such as a CPU temperature.

More under Dashboards.

Distributed monitoring

Checkmk distinguishes between a distributed monitoring and a distributed setup. Distributed monitoring means that the entire monitoring system consists of more than one Checkmk site and all of the data is displayed together in one place. Or in other words: The monitoring then consists of a central site and at least one remote site, and the data of the remote site is also displayed in the central site. Distributed monitoring can optionally be combined with a distributed setup.

More under Central status.

Distributed setup

Checkmk distinguishes between a distributed monitoring and a distributed setup. Distributed setup means that the entire monitoring system consists of more than one Checkmk site and that the configuration is done at a single location. Or in other words: The monitoring then consists of a central site and at least one remote site, and the configuration of the remote site comes from the central site. A distributed setup always includes distributed monitoring.

Edition

The Checkmk editions are the various software variants of Checkmk available for downloading and installation. They are the open source Checkmk Raw Edition (CRE), the Checkmk Enterprise Standard Edition (CSE) available by subscription, its sister version, the Checkmk Enterprise Free Edition (CFE), which is free but limited to 25 hosts, and the multi-tenant Checkmk Enterprise Managed Services Edition (CME).

Event Console (EC)

When monitoring hosts and services, Checkmk focuses on states. The Event Console is the module that, in contrast, takes care of events, i.e. messages from sources such as syslog or SNMP traps, and optionally also from the Windows Event Log, log files and your own applications. An example: A warning message from the SMTP service on a mail server would not change the state of its host or services, yet it is still relevant information that belongs in the monitoring. The Event Console can be used to record and display such events in Checkmk.

More under The Event Console.

Host

In Checkmk, a host is any standalone, physical or virtual system that is monitored by Checkmk. Usually these are things with their own IP address (servers, switches, SNMP devices, virtual machines), but they can also be, for example, Docker containers or other logical objects that do not have such an IP address. Each host always has one of the states UP, DOWN, UNREACH or PEND and always has at least one service.

Broken down even further: For Checkmk, internally, a host is simply a structuring element that contains elements to be monitored, i.e., services. Each host necessarily has at least one service to verify actual accessibility (such as PING or the Checkmk agent itself, i.e. the service Check_MK). In this respect, host means little more than the heading under which a number of services are grouped.

Host group

Hosts are primarily managed by folders in Checkmk. Host groups provide a second level of grouping across the folder structure to select hosts in monitoring, e.g. in table views. Host tags, labels and folders are used to assign hosts to such groups via rules. Hosts can also be explicitly assigned to a host group.

More under Host groups.

Host status

The host status describes the state of the host, i.e. whether it can be reached via the network (UP), does not respond to requests from the network (DOWN) or whether its access path is blocked by failed intermediate devices (switches, routers, etc.) (UNREACH). For freshly added hosts that have never been queried before, there is also the PEND state, which is not a state in the true sense.

More under Hosts and services.
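
The distinction between DOWN and UNREACH can be sketched as a small decision function. This is an illustration of the description above, not Checkmk's internal implementation. It assumes that the intermediate devices on the access path are known as a list of "parents", each of which is either up or not; the function and parameter names are hypothetical.

```python
def host_state(checked: bool, responds: bool, parents_up: list[bool]) -> str:
    """Derive a host state from its own reachability and that of its parents."""
    if not checked:
        return "PEND"      # freshly added, never queried before
    if responds:
        return "UP"
    if parents_up and not any(parents_up):
        return "UNREACH"   # every known access path is blocked by a failed device
    return "DOWN"          # at least one path works (or none is known), but no response
```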

Host tag

Host tags are keywords that can be assigned to hosts so these can be targeted later, for example, in monitoring for views or in the configuration for rules. Host tags are divided into groups, for example a tag group Operating systems can be set up with the tags Linux and Windows. Some tag groups are predefined, such as the type of Checkmk agent used or the IP address family used to record whether a host should be monitored over IPv4, IPv6, or both versions. The tags also have predefined values and a default which is assigned to each host as long as it has not been overwritten with another option from the group.

More under Host tags.

Label

Hosts can be given host tags, but they can also be given labels directly. A label consists of a key and a value, with the key (comparable to a tag group) preceding the colon and the value following it. Such arbitrary key-value pairs (os:linux, os:windows, foo:bar etc.) can be set directly on a host without the prior configuration of a tag group, and can be used later for filtering in rules and views and for other purposes. Unlike host tags, labels have neither a predefined set of values nor a default value, which makes them much more dynamic. In particular, Checkmk can automatically take objects generated by container and cloud systems like Kubernetes, Azure, or AWS into monitoring as hosts, and then enrich them with labels generated automatically from their metadata.

More under Labels.
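
How key-value labels can be used for filtering is shown in the following sketch. It only illustrates the concept; the function names are hypothetical and do not correspond to a Checkmk API.

```python
def parse_label(text: str) -> tuple[str, str]:
    """Split a label such as 'os:linux' into its key and value."""
    key, separator, value = text.partition(":")
    if not separator or not key or not value:
        raise ValueError(f"not a key:value label: {text!r}")
    return key, value


def has_labels(host_labels: dict[str, str], required: dict[str, str]) -> bool:
    """Return True if the host carries every required label with the same value."""
    return all(host_labels.get(key) == value for key, value in required.items())


host_labels = dict(parse_label(label) for label in ["os:linux", "foo:bar"])
```

Because a key can only hold one value per host, a dictionary is a natural representation here.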

Livestatus

Livestatus is the most important interface in Checkmk. Through it, Checkmk users get the fastest possible live access to all of the data for the hosts and services being monitored. For example, the data in the Overview snapin is retrieved directly through this interface. The fact that the data is fetched directly from RAM avoids slow disk accesses and gives fast access to monitoring without putting too much load on the system.
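
A minimal sketch of a Livestatus query from Python could look like this. It assumes that the query is sent as plain text over the site's Unix socket (commonly found at tmp/run/live below the site's home directory, which is an assumption here) and that the request is terminated by an empty line. The helper names are hypothetical.

```python
import socket


def build_query(table: str, columns: list[str], filters: tuple[str, ...] = ()) -> str:
    """Assemble a Livestatus query; the final empty line terminates the request."""
    lines = [f"GET {table}", "Columns: " + " ".join(columns)]
    lines += [f"Filter: {condition}" for condition in filters]
    lines.append("OutputFormat: json")
    return "\n".join(lines) + "\n\n"


def query_livestatus(socket_path: str, query: str) -> str:
    """Send the query to the Unix socket and return the raw response text."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8")
```

Run as the site user, query_livestatus("tmp/run/live", build_query("hosts", ["name", "state"])) would return the names and states of all hosts, provided the socket exists at that path.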

Local check

A local check is a (self-written) extension that runs as a script, written in any programming language, on the monitored host. Unlike regular checks, the status calculation runs directly on the host. The results are added to the regular agent output.

More under Local checks.
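
As a sketch, a local check written in Python only has to print one line per service. The line format assumed here is: status, service name in quotes, metrics (or a hyphen if there are none) and a free text. The service name, metric and threshold below are invented for the example.

```python
def local_check_line(state: int, service: str, metrics: dict[str, float], detail: str) -> str:
    """Format one local check result: <state> "<service>" <metrics> <detail>."""
    # Multiple metrics are separated by '|'; '-' stands for "no metrics".
    metric_text = "|".join(f"{name}={value}" for name, value in metrics.items()) or "-"
    return f'{state} "{service}" {metric_text} {detail}'


# Hypothetical example: the state is calculated on the monitored host itself.
backup_age_hours = 30
state = 0 if backup_age_hours < 24 else 1  # 0 = OK, 1 = WARN
print(local_check_line(state, "Backup age", {"age_hours": backup_age_hours},
                       f"Last backup is {backup_age_hours} hours old"))
```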

Metric

Measurable and calculable values for hosts and services, such as temperature, utilization or availability, which can be used for graphs, for example. Past values are stored in RRDs (Round Robin Databases) and by default retained for up to 4 years.

Monitoring environment

The Checkmk web interface is divided into monitoring and configuration environments. The former refers to the areas where the status of the monitored infrastructure is displayed; these include the inventory, dashboards, lists of hosts, services, events or problems, historical data and so on. The monitoring environment is accessed via the Monitor menu of the navigation bar.

More under The user interface.

Navigation bar

The navigation bar is the main navigation panel in the Checkmk interface, on the left side with, among other things, the Monitor, Setup and Customize menus.

More under The navigation bar.

Notification

With a notification, a Checkmk user is actively informed of problems or other monitoring events, via HTML email, SMS, Slack or similar. Who is notified and how is determined by the notification rules. For example, if Mr. Hirsch receives an email informing him that the filesystem / service on host myserver123 has changed from WARN to CRIT, it is because Mr. Hirsch is a contact for that host and a notification rule states that all contacts for the host should receive an email when one of its services changes to CRIT.

More under Notifications.

Physical appliance

The physical appliance is a 19" server with pre-installed firmware prepared for Checkmk that can be deployed immediately in data centers. It comes with a graphical configuration interface that eliminates the need for any Linux knowledge.

Piggyback

Some hosts that are part of the monitoring are not queried directly, either because they are not physical devices but virtual machines or containers, or because their data can only be provided by a third-party system. These third-party systems (the physical hosts) provide the data as an attachment to their own agent output, so to speak. So, for example, a Docker server would piggyback the container data along with its own data.
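
The "attachment" can be sketched as follows. The assumption is that piggyback data is embedded in the agent output between a header line with four angle brackets naming the target host and an empty four-bracket line that ends the block. The host name, section name and content are hypothetical.

```python
def render_piggyback(target_host: str, section: str, rows: list[str]) -> str:
    """Wrap one agent section so that it is assigned to another host."""
    lines = [
        f"<<<<{target_host}>>>>",  # the following data belongs to target_host
        f"<<<{section}>>>",
        *rows,
        "<<<<>>>>",                # end of the piggyback block
    ]
    return "\n".join(lines) + "\n"


# A Docker server could attach data for one of its containers like this.
print(render_piggyback("container-web01", "my_container_status", ["running 1"]), end="")
```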

Rule

Rules are the basis for configuring hosts and services in Checkmk. Rules in a rule set always control a single, focused aspect of a host or service. They can be provided with conditions, and can be 'stacked' on top of each other arbitrarily within a rule set. The evaluation then takes place from top to bottom, so that there can be standard rules when no condition applies, as well as very special rules that only affect a very specific host. Many rule sets in Checkmk already have predefined default values, so that rules only need to be created for deviating requirements.

More under Rules.
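
The top-to-bottom evaluation can be illustrated with a small sketch in which the first matching rule wins. This is a simplified model and not Checkmk's actual rule engine; the class, the condition format and the threshold values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    value: object
    # Conditions on host attributes; an empty dict matches every host.
    conditions: dict[str, str] = field(default_factory=dict)


def evaluate(rules: list[Rule], host_attributes: dict[str, str], default: object) -> object:
    """Walk the rule set from top to bottom and return the first matching value."""
    for rule in rules:
        if all(host_attributes.get(key) == value for key, value in rule.conditions.items()):
            return rule.value
    return default  # predefined default if no rule applies


# A special rule on top, a general rule without conditions below it.
warn_levels = [Rule(70, {"criticality": "prod"}), Rule(90)]
```

With this rule set, a host with the attribute criticality set to prod gets the value 70, while every other host falls through to the general rule and gets 90.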

Rule set

A rule set represents a specific aspect of a host or service, such as CPU utilization thresholds. Any number of individual rules can be created within each rule set. For example, the CPU utilization on Linux/UNIX rule set could contain two rules that set the service to WARN status at 90 percent on certain hosts and as low as 70 percent on others.

More under Types of rule sets.

Scheduled downtime

Scheduled downtimes are planned outages, for example for updates of certain hosts. During a scheduled downtime, notifications are temporarily suppressed and the affected hosts and services do not show up as problems. Scheduled downtimes are also taken into account in the availability calculation.

Service

A service is a logical object that summarizes one or more aspects of a host. Examples are the size, utilization and trends of file systems, CPU utilization, temperatures, the age and number of running programs, ports, sensors and so on. At any given time, each service in the monitoring has one of the states OK, WARN, CRIT, UNKNOWN or PEND, is always assigned to exactly one host and optionally contains one or more metrics.

Service discovery

As soon as a host is added to the monitoring, Checkmk automatically detects all available services that can be included in the monitoring — and keeps this list up to date even during operation. Service discovery can also be started manually at any time via the configuration of a host.

Service group

Like hosts, services can also be grouped together so that these groups can be filtered later in views or addressed specifically in the configuration. Groups can be formed by folders, host tags, host and service labels, and host and service names filtered via regular expressions.

More under Service groups.

Service status

A service’s status is always OK, WARN, CRIT or UNKNOWN, and describes the current state of the service according to predefined rules. For freshly added services that have never been queried before, there is also the PEND state, which is not a state in the strict sense.

More under Services.

Sidebar

The sidebar can be displayed from the navigation bar with a mouse click. Users can add various snapins to the sidebar to make navigation easier or to show important status data at a glance.

More under Sidebar.

Site

A site is a single, currently running Checkmk monitoring project. Checkmk can be run in parallel on the same server as multiple, independent sites, for example to test different Checkmk versions or editions, or to run a separate monitoring for (new) hosts that are not (yet) to be included in production monitoring.

More under Creating a site.

Snapin

Snapins, also called sidebar elements, are the individual building blocks that can be placed in the sidebar, for example Overview and Master control. Access to the snapins is provided by the plus icon at the bottom of the sidebar.

More under Sidebar.

SNMP

The 'Simple Network Management Protocol' is used to monitor and configure network devices such as routers, switches, or firewalls. Checkmk supports this protocol — but since it is comparatively inefficient, you should only use SNMP on devices that don’t support better monitoring alternatives, such as special agents.

More under SNMP.

Special agent

On some systems, the regular Checkmk agent cannot be installed and SNMP is not (satisfactorily) available. Instead, these systems provide management APIs based on Telnet, SSH or HTTP/XML. Via a special agent running on the Checkmk server, Checkmk queries these interfaces, integrating the host into Checkmk via API.

More under Special agents.

Time period

In Checkmk it is possible to restrict things like notifications, availability calculations and even the general execution of checks to certain times. Time periods can be used, for example, to define daily working hours, to specify vacations and holidays, or to separate weekends from weekdays. These time periods can then be used in rules.

More under Time periods.

View

In addition to the dashboards, the views are the most frequently used displays of hosts, services and other objects in the Checkmk interface. These are displayed as tables with attributes relevant to the current context. For example, All hosts and Host problems are views in monitoring. Supplied standard views can be customized in their display, and they can also serve as the basis for new views. It is also possible to create views from scratch.

Virtual appliance

The virtual appliance is a system created for VirtualBox or VMware ESXi with pre-installed firmware prepared for Checkmk. It includes a graphical configuration interface that eliminates the need for any Linux knowledge.

WATO

The 'Web Administration Tool' was the GUI tool for configuring Checkmk up until Checkmk version 1.6.0. With the introduction of WATO, for the first time users had the ability to customize Checkmk through a web interface instead of by using configuration files. WATO was replaced in version 2.0.0 by the Setup menu in the navigation bar.

More under Setup menu.

Werk

The Checkmk software development is organized in so-called 'Werks'. Each change, bug fix or new feature that will have an impact on the user’s experience is recorded in a separate Werk, along with notes on impacts and any possible incompatibilities. The list of Werks is available directly in Checkmk via the Help menu in the navigation bar and on the Checkmk home page.

More under Werks.
