The monitoring core is at the heart of the Checkmk-System. Its tasks are:
the regular initiation of checks and the collection of their results.
providing current states to the GUI
the recognition of changes to states and the notifications generated from these.
The architecture diagram below shows the core and its connections to the primary components of the Checkmk Enterprise Editions:
The Checkmk Raw Edition is a construction based on the core from the well-established Nagios Open-Source Projekt. This offers numerous useful functions and has been proven over many years by millions of users worldwide. This inherent flexibility is one of the reasons for the success of Nagios.
Alternatively, the core from Icinga can also be utilised. This is particularly popular in Germany, and is based on the same program code, but in recent years it has been developed independently.
Even though Nagios and Icinga perform exceptionally — being flexible, fast, stable and well-proven — there are still situations in which their limits are reached. Where a large number of hosts and services are being monitored, three problems in particular become evident:
The high CPU consumption when executing checks
The long restart time when changing a configuration
The fact that the system is not available during such a restart
Since Checkmk is being used in ever-larger environments, and in order to overcome the limitations of Nagios as described above, in 2013 we commenced a new development of our own core specifically for the Checkmk Enterprise Editions. The Checkmk Micro Core — or CMC — has not simply been created as a fork from Nagios, rather it has a complete code basis of its own. The CMC utilises a unique software architecture, and it has been perfectly-tailored for Checkmk. Its primary advantages are:
High efficiency when executing checks
This applies for active checks as well as Checkmk-based checks. In benchmarking, a desktop-PC (Core i7) achieved more than 600,000 checks per minute.
Rapid activation of changes
A configuration with 20,000 hosts and 600,000 services can be loaded in 0,5 seconds.
Configuration changes during live operations
Currently-running checks and live status connections are not interrupted. The procedure is undetectable to monitoring users.
Rapid availability queries
Through the use of special caches, availability analyses - even over long time periods — can be calculated without a noticeable waiting time.
The CMC makes use of numerous additional features, such as, e.g., recurring planned-downtimes and acknowledgements with automatic expiry times.
Other elements have also been optimised. For example, performance data is passed without detours directly from the core to the RRD cache, notifications are created in a 'KeepAlive'-mode, and host checks are executed by a built-in ICMP helper. All of these reduce costly process-creations and save CPU resources.
These characteristics bring numerous advantages — even in smaller installations:
The low CPU consumption enables virtualisation to subsitute for hardware in many cases.
The shock-free activation of changes allows frequent configuration changes.
Demands such as cloud-monitoring can thus be satisfied, since servers can be added and removed in quick succession.
The graphic below shows the CPU utilization for a Checkmk-Server before and after changing from Nagios to the CMC. This graphic has been kindly provided by the company DFi Service SA. At this point in time they were monitoring 1,205 hosts and 13,555 services on a server with 10 cores.
Another project shows similar results. The following graphs show a restructuring from a Nagios core to the CMC in an environment with 56,602 services on 2,230 monitored hosts on a virtual machine with two cores:
The magnitude of the difference in an individual case naturally depends on many factors. In the above case a smaller instance that was not restructured runs on the same server. Without this the difference in consumption would be even more noticeable.
Further articles on the CMC:
The CMC can of course also run classic Nagios checks both actively and passively.
Checkmk is and will remain compatible with Nagios, and will continue to fully-support the Nagios core. Likewise the Checkmk Enterprise Editions will continue to have Nagios as an optional core — but only to support a migration from the Raw Edition to the Enterprise Editions.
Switching between the two cores is simple, as long as your configuration has been created cleanly with WATO. Details on this can be found in the Migration to the CMC article. By default the Checkmk Enterprise Editions create new instances with the CMC as the core.