1. OMD - The Open Monitoring Distribution
The Checkmk Monitoring System uses the Open Monitoring Distribution (OMD). Founded by Mathias Kettner, OMD is an open source project which revolves around the convenient and flexible installation of a monitoring solution made up of various components. The abbreviation OMD might already be familiar to you as part of the RPM/DEB package installation.
An OMD-based installation is distinguished by a number of characteristics:
the ability to run multiple instances (or ‘sites’) in parallel
the ability to operate instances with differing versions of the monitoring software
an intelligent and easy to operate upgrade/downgrade-mechanism
uniform file paths — regardless of which Linux-platform is installed
a clear separation of data and software
a very simple installation — with no dependence on third-party software
a perfect preconfiguration of all components
2. Creating instances (or ‘sites’)
Perhaps the best thing about OMD is that it can manage any chosen number of monitoring instances on a server. These can also be referred to as Sites. Each ‘instance’ is a self-contained monitoring system which runs independently of the others.
An instance always has a distinct name, specified at its creation. This name is the same as that of the Linux-user which is created at the same time. The instance’s name conforms to the same conventions as user names under Linux.
The creation is performed with the omd create command, which must be executed as root:
root@linux# omd create mysite
Creating temporary filesystem /omd/sites/mysite/tmp...OK
Updating core configuration...
Generating configuration for core (type cmc)...Creating helper config...OK
OK
Restarting Apache...OK
Created new site mysite with version 1.6.0.
The site can be started with omd start mysite.
The default web UI is available at http://myServer/mysite/
The admin user for the web applications is cmkadmin with password: lEnM8dUV
For command line administration of the site, log in with 'omd su mysite'.
After logging in, you can change the password for cmkadmin with 'htpasswd etc/htpasswd cmkadmin'.
When creating the cmkadmin user, a password will be randomly generated and issued.
What takes place during the creation of an instance ‘mysite’?
An operating system user mysite and a group mysite will be created.
A new home directory /omd/sites/mysite will be created and assigned.
This home directory will be populated with configuration files and sub-directories.
A basic configuration will be created for the new instance.
Important: Avoid using a name which is already allocated in another service. A duplicated allocation can cause problems.
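A quick pre-check can help to avoid such clashes. The following is a minimal sketch, assuming a Linux system on which id and getent are available; the helper name site_name_free is made up for this illustration and is not part of OMD:

```shell
# Hypothetical helper (not an omd command): succeeds only if NAME is
# not yet in use as a user name or as a group name on this system.
site_name_free() {
    name=$1
    # 'id -u' succeeds if a user with this name exists
    if id -u "$name" >/dev/null 2>&1; then
        echo "user '$name' already exists" >&2
        return 1
    fi
    # 'getent group' succeeds if a group with this name exists
    if getent group "$name" >/dev/null 2>&1; then
        echo "group '$name' already exists" >&2
        return 1
    fi
    return 0
}
```

As root, it could be combined with the creation itself, for example: site_name_free mysite && omd create mysite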
2.1. User and group IDs
In some cases it is also desirable to specify the user/group ID of the new user to be created. This is performed with the -u and -g options, e.g.:
root@linux# omd create -u 6100 -g 180 mysite
An overview of the further options can be shown with omd create --help.
The most important options are:
-u UID | The new user will be created with the User-ID ‘UID’. |
-g GID | The new user’s group will be created with the Group-ID ‘GID’. |
--reuse | OMD assumes that the new user already exists, and does not create it. |
-t SIZE | The new site’s |
3. Instance User (Site User)
The further administration of the instance is always best performed with the rights of the newly-created user. Switching users is done with su:
root@linux# su - mysite
Please note that the ‘minus sign’ following the su is essential. It ensures that switching users processes ALL of the operations that take place during a normal login. In particular, all environment variables will be correctly set, and your session will continue as mysite in the instance’s home directory, /omd/sites/mysite.
As an instance user you can execute all important operations affecting this site. Entering the instance’s name then of course becomes unnecessary when issuing the relevant omd commands.
4. Starting and stopping instances
Your instance is now ready to be started, which can be done as root with omd start mysite. It is fundamentally better though to work with the instance as the instance user (site user):
OMD[mysite]:~$ omd start
Starting Livestatus Proxy-Daemon...OK
Starting rrdcached...OK
Starting CMC Rushing Ahead Daemon...OK
Starting Check_MK Micro Core...OK
Starting dedicated Apache for site mysite...OK
Initializing Crontab...OK
Unsurprisingly, stopping is achieved with omd stop:
OMD[mysite]:~$ omd stop
Removing Crontab...
Stopping dedicated Apache for site mysite....OK
Stopping Check_MK Micro Core...killing 15085...OK
Stopping CMC Rushing Ahead Daemon...killing 15071....OK
Stopping rrdcached...waiting for termination...OK
Stopping Livestatus Proxy-Daemon...killing 15049....OK
Starting and stopping an instance is nothing other than starting or stopping a collection of services. These can also be individually managed by specifying the name of the service, e.g.:
OMD[mysite]:~$ omd start apache
Starting dedicated Apache for site mysite...OK
The names of the various services can be found in the ~/etc/init.d directory. Please note the leading tilde: it represents the home directory of the instance user (the site directory). This is not the same as /etc/init.d!
Alongside start and stop, there are also the restart, reload and status commands.
Reloading Apache is, for example, always necessary following a manual change
to the Apache-configuration. Please note that this does not apply to the global
Apache-process on the Linux-server, but rather the site’s own dedicated
Apache-process:
OMD[mysite]:~$ omd reload apache
Reloading dedicated Apache for site mysite....OK
In order to maintain an overview of the state of the site following all of the starts and stops, simply use omd status:
OMD[mysite]:~$ omd status
liveproxyd: stopped
rrdcached: running
cmcrushd: running
cmc: stopped
apache: running
crontab: running
-----------------------
Overall state: partially running
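In scripts, the status listing can be filtered for services that are not running. The sketch below assumes the plain ‘name: state’ layout shown above; stopped_services is a hypothetical helper, not an omd subcommand, and the real output may be padded or coloured differently:

```shell
# Hypothetical helper: reads an 'omd status' listing on standard input
# and prints the name of every service reported as "stopped".
stopped_services() {
    awk -F': *' '
        /^-+$/           { exit }      # separator line: the summary follows, stop here
        $2 == "stopped"  { print $1 }  # service name is the text before the colon
    '
}
```

As a site user this would be used as: omd status | stopped_services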
5. Deleting instances
Deleting an instance is as easy as creating one, with the omd rm command. The instance will first be automatically stopped.
root@linux# omd rm mysite
PLEASE NOTE: This action removes all configuration files
and variable data of the site.
In detail the following steps will be done:
- Stop all processes of the site
- Unmount tmpfs of the site
- Remove tmpfs of the site from fstab
- Remove the system user <SITENAME>
- Remove the system group <SITENAME>
- Remove the site home directory
- Restart the system wide apache daemon
(yes/NO): yes
It goes without saying that this action also deletes all of the instance’s data!
If you are no fan of confirmation prompts, or wish to perform the deletion as part of a script, the deletion can be forced with the -f option.
Attention: here the -f must be placed before the rm:
root@linux# omd -f rm mysite
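Because the forced variant asks no questions, a script may want its own safety checks first. A minimal sketch, assuming sites live under /omd/sites as described above; remove_site is a hypothetical wrapper and only the final line is the actual omd call:

```shell
# Hypothetical wrapper around 'omd -f rm': refuses to run without a
# site name or when no matching site directory exists.
remove_site() {
    site=${1:-}
    if [ -z "$site" ]; then
        echo "usage: remove_site SITE" >&2
        return 2
    fi
    if [ ! -d "/omd/sites/$site" ]; then
        echo "no such site: $site" >&2
        return 1
    fi
    # -f must come before rm; this deletes ALL data of the site
    omd -f rm "$site"
}
```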
6. Configuring the components
As already mentioned, OMD is a system that integrates multiple software components into a monitoring system. Some of these components are optional, and for some there are alternatives or different operational settings. All of this can be conveniently configured with omd config, which offers both a scripting mode and an interactive mode. The latter can be opened by a site user with:
OMD[mysite]:~$ omd config

If you alter a setting while the site is running, omd config will immediately point out that the site must be stopped first, and will stop it if required.

Please don’t forget to restart the site once the work is complete. omd config will NOT do this for you automatically.
6.1. Script-interfaces
Those who don’t like the interactive mode, or prefer to work with scripts, can set the individual variables using commands. For this there is the omd config set command. The following example sets the CORE variable to cmc:
OMD[mysite]:~$ omd config set CORE cmc
As always, this can be performed as root if the site’s name is added as an argument:
root@linux# omd config mysite set CORE cmc
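Since a setting can only be changed while the site is stopped, and the site is not restarted automatically, a script typically wraps the change in a stop and a start. The following sketch is run as the site user; the helper names run and change_setting and the DRY_RUN variable are assumptions made for this example, and it assumes that each omd command returns 0 on success:

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop the site, change one variable, start it again.
# The chain ends at the first step that fails.
change_setting() {
    run omd stop &&
    run omd config set "$1" "$2" &&
    run omd start
}
```

With DRY_RUN=1 set, change_setting CORE cmc only prints the three commands, which allows the sequence to be reviewed before it is applied.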
The current configuration of all variables can be viewed using omd config show:
OMD[mysite]:~$ omd config show
APACHE_MODE: own
APACHE_TCP_ADDR: 127.0.0.1
APACHE_TCP_PORT: 5000
AUTOSTART: off
CMCRUSHD: on
CORE: cmc
[...]
6.2. Commonly used settings
There are numerous settings in omd config. The most important are:
Variable | Default | Function |
---|---|---|
CORE | cmc | Selection of the monitoring core. As well as the Checkmk Micro Core (CMC), the standard Nagios core is still available. In earlier versions this was set as the default. |
MKEVENTD | on | Activates the Checkmk Event Console, with which the syslog messages, SNMP-Traps and other events can be processed |
MKNOTIFYD | on | Enterprise Editions: Activates the notification spooler. Firstly, this forwards remotely-generated notifications to a central system. This will require mknotifyd on the central and remote sites respectively. An asynchronous delivery of messages can additionally be performed using this. |
AUTOSTART | on | Set this to |
LIVESTATUS_TCP | off | Allows external access to the status data for this site. A distributed monitoring can be constructed with this. The status of this instance can be incorporated into the central instance. Please only activate it in a secure network. |
7. Copying and renaming instances
It is sometimes useful to create a copy of an instance, for testing purposes or for the preparation of an update. Of course one could simply copy the /omd/sites/alt directory to /omd/sites/neu.
That will however not work because:
Many configuration files include the site’s name.
In addition, at numerous locations there are absolute data paths with the /omd/sites/alt prefix.
Not least, a user and a group with the site’s name, to which everything belongs, must be available.
To simplify the copying of an instance, there is the omd cp command, which takes all of these factors into consideration. Its use is very simple: as arguments, enter the name of the existing site followed by the name of the new one. For example:
root@linux# omd cp alt neu
The copy can only work if:
The site has been stopped.
No processes that belong to the instance user are running.
The above points ensure that at the time of the copy the instance is in a consistent state and cannot change during the action.
7.1. Limiting data volume
If a large number of hosts are being monitored, the volume of data to be copied
can be quite substantial. The greater part of this is the performance data which
is stored in RRD-files. But the log files containing historic events can also
produce larger data volumes. If the history is not required (for example,
if only testing is being performed), these can be omitted from the copy.
In such cases the following options can be added to omd cp:
--no-rrds | The copy will exclude performance data (RRDs) |
--no-logs | All log files and remaining historic data will be excluded |
-N | This is an abbreviation of `--no-rrds --no-logs` |
The position of the options is important. In the following example they come directly after cp, before the site names:
root@linux# omd cp --no-rrds alt neu
7.2. Renaming instances
Renaming an instance is performed with the omd mv command.
This functions similarly to the copy command and has the same prerequisites.
The options to restrict the data volume are not available since the data is only
being moved to another directory and is not being duplicated. For example:
root@linux# omd mv alt neu
7.3. Further options for cp and mv
Both operations will create new Linux users in exactly the same way as create does, thus some of the options for omd create are also available for use:
-u UID | The new user will be created with the User-ID UID. |
-g GID | The new user’s group will be created with the Group-ID GID. |
--reuse | OMD assumes that the new user already exists and does not create it. |
-t SIZE | The new site’s |
8. Showing changes with omd diff
When creating a new Checkmk instance, the omd create command populates the etc directory with numerous predefined configuration files. A number of directories will also be created under var and local.
Now it is probably the case that in the course of time a number of the files
will have been customised. When after a time you wish to determine which files
are no longer in the condition as originally supplied, the omd diff
command can provide the answer. Amongst other things, this is useful before an
update of Checkmk, since your changes could conflict with changes in
the default files.
In a request without additional arguments, all changed files will be listed:
OMD[mysite]:~$ omd diff
* Deleted var/log/nagios.log
* Changed content var/check_mk/wato/auth/auth.php
* Changed content etc/htpasswd
! Changed permissions etc/htpasswd
* Changed content etc/diskspace.conf
* Changed content etc/auth.secret
* Changed content etc/apache/apache.conf
You can also enter a query for a specific directory:
OMD[mysite]:~$ omd diff etc/apache
* Changed content etc/apache/apache.conf
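Before an update it can be handy to reduce this listing to the bare file names, for instance to archive or review them. A minimal sketch that relies only on the ‘Changed content’ lines shown above; changed_files is a hypothetical helper, and it assumes that the paths contain no spaces:

```shell
# Hypothetical helper: reads 'omd diff' output on standard input and
# prints the path of every file whose content has changed.
changed_files() {
    # the path is the last whitespace-separated field of the line
    awk '/Changed content/ { print $NF }'
}
```

As a site user this would be used as: omd diff etc | changed_files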
If you wish to see the changes in detail, simply enter the complete file name:
OMD[mysite]:~$ omd diff etc/apache/apache.conf
--- /dev/fd/63 2017-01-24 09:14:46.248968199 +0100
+++ /omd/sites/mysite/etc/apache/apache.conf 2017-01-24 09:12:37.705355164 +0100
@@ -66,8 +66,8 @@
StartServers 1
MinSpareServers 1
MaxSpareServers 5
-ServerLimit 128
-MaxClients 128
+ServerLimit 64
+MaxClients 64
MaxRequestsPerChild 4000
###
9. Backing-up and restoring instances
9.1. Backing-up instances with omd backup
The site management in Checkmk has a built-in mechanism for backing up and restoring Checkmk instances. The omd backup and omd restore commands form its basis: they pack all of an instance’s data into a tar archive and extract that data again for a restore.
From Version 1.4.0 Checkmk additionally uses the Backup WATO-module which makes a backup and restore possible without the command line, and which also enables the setting-up of regular backup jobs.
Backing up an instance with omd backup does not require root permissions. An instance user can perform this. Simply enter as an argument the name for the backup file to be created:
OMD[mysite]:~$ omd backup /tmp/mysite.tar.gz
Please note however:
The created file type is a gzip-compressed tar archive. Therefore use .tar.gz or .tgz as the file extension.
Do not store the backup in the instance directory, since this will of course be completely backed up. Every subsequent backup would then contain a copy of ALL of its predecessors!
If the backup’s target directory is not writable for an instance user, the backup can instead be performed as root. In this case an additional argument is always required specifying the name of the instance to be backed up:
root@linux# omd backup mysite /var/backups/mysite.tar.gz
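For regular backups a script usually generates the file name itself. The sketch below builds a dated target path and rejects targets inside the site directory, in line with the note above. backup_path is a hypothetical helper; it compares plain path strings only, so relative paths and symbolic links are not resolved:

```shell
# Hypothetical helper: prints DIR/SITE-YYYY-MM-DD.tar.gz, or fails if
# DIR lies inside the site's own directory /omd/sites/SITE.
backup_path() {
    site=$1
    dir=${2%/}    # drop one trailing slash, if present
    case "$dir/" in
        "/omd/sites/$site/"*)
            echo "refusing to place the backup inside the site directory" >&2
            return 1
            ;;
    esac
    printf '%s/%s-%s.tar.gz\n' "$dir" "$site" "$(date +%Y-%m-%d)"
}
```

As root, the result could then be passed on, for example: target=$(backup_path mysite /var/backups) && omd backup mysite "$target"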
The backup contains all of the instance’s data, except for the volatile data under tmp/. With tar tzf, or tar tvzf for a detailed listing, one can easily have a look at the file’s contents:
OMD[mysite]:~$ tar tvzf /tmp/mysite.tar.gz | less
lrwxrwxrwx mysite/mysite 0 2017-01-24 09:02 mysite/version -> ../../versions/2017.01.16.cee
drwxr-xr-x mysite/mysite 0 2017-01-24 09:12 mysite/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/share/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/share/nagvis/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/share/nagvis/htdocs/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/share/nagvis/htdocs/userfiles/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/share/nagvis/htdocs/userfiles/styles/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/share/nagvis/htdocs/userfiles/scripts/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/share/nagvis/htdocs/userfiles/templates/
drwxr-xr-x mysite/mysite 0 2017-01-24 09:02 mysite/local/share/nagvis/htdocs/userfiles/gadgets/
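The same listing can serve as a simple sanity check in a script. This sketch assumes, as in the listing above, that every entry in the archive starts with the site’s name; backup_readable is a hypothetical helper, and it only shows that the archive can be read, not that a restore will succeed:

```shell
# Hypothetical helper: succeeds if ARCHIVE can be listed completely by
# tar and contains entries below the directory SITE/.
backup_readable() {
    archive=$1
    site=$2
    # tar returns non-zero for a damaged or truncated archive
    listing=$(tar tzf "$archive" 2>/dev/null) || return 1
    printf '%s\n' "$listing" | grep -q "^$site/"
}
```

A call such as backup_readable /tmp/mysite.tar.gz mysite returns 0 only if both conditions hold.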
9.2. Backup without history
The lion’s share of an instance’s data is the performance data
retained in the RRDs. The monitoring history can also be very large. If neither
of these are absolutely required, with the following options the history data
can be omitted, thus making the backup smaller and faster running.
The options must be placed after the word ‘backup’:
--no-rrds | Omits backing up the RRD-databases (performance data) |
--no-logs | Omits the monitoring history stored in the log files |
-N | An abbreviation of |
Example:
OMD[mysite]:~$ omd backup -N /tmp/mysite.tar.gz
9.3. Backing up a running instance
A backup does not require the instance to be stopped, and can therefore be executed while the system is running. In order to ensure a consistent condition of the RRDs used for recording the performance data, the omd backup command automatically switches the Round-Robin-Cache to a mode in which running updates are written only to the journal, and no longer to the RRDs. The journal files are the last to be backed up, so that as much as possible of the performance data generated during the backup is also included in it.
9.4. Restore
The restoring of a backup is as simple as the backup itself. The omd restore command restores an instance from a backup. This is even possible for an instance user. The instance must be stopped for this procedure. The instance will not be newly created (which would require root permissions); rather, it will be completely emptied and then refilled:
OMD[mysite]:~$ omd stop
OMD[mysite]:~$ omd restore /tmp/mysite.tar.gz
Following the restore the instance can be restarted:
OMD[mysite]:~$ omd start
A restore can also be performed by root. If an instance with the same name already exists, this must first be deleted. This can be performed either with omd rm, or by simply including the --reuse option. Adding --kill additionally ensures that the existing instance is first stopped. It is not necessary to specify the instance’s name with restore, since this is contained in the backup:
root@linux# omd restore --reuse --kill /var/backup/mysite.tar.gz
root@linux# omd start mysite
When operating as root, you can restore the instance with a different name from that in the backup. Include the desired alternative name as an argument following the restore command:
root@linux# omd restore mysite2 /var/backup/mysite.tar.gz
Restoring site mysite2 from /tmp/mysite.tar.gz...
* Converted ./.modulebuildrc
* Converted ./.profile
* Converted .pip/pip.conf
* Converted etc/logrotate.conf
The long list of conversions found here has the same function as for the renaming of instances described earlier: The instance’s name is included in numerous configuration files, and with this these occurrences will be replaced automatically by the new name.
9.5. Live migration of instances with backup & restore
The omd backup and omd restore commands can, in the good old Unix tradition, also work with the standard input/output instead of files. Instead of a data path for the tar file, simply enter a hyphen (-).
In this way a pipe can be constructed and the data ‘streamed’ directly to another computer without requiring intermediate files. The larger the backup, the more advantageous this will be since no temporary space in the backed up server’s file system will be needed.
The following command backs up an instance to another computer using SSH:
root@linux# omd backup mysite - | ssh user@otherserver "cat > /var/backup/mysite.tar.gz"
If you want to reverse the SSH access, that is, log in TO the Checkmk instance FROM the backup server, that is also possible, as shown in the following example. For this, an SSH login as the instance user must first be permitted:
root@otherserver# ssh mysite@checkmkserver "omd backup -" > /var/backup/mysite.tar.gz
If you are clever, and combine the above with an omd restore which reads the data from the standard input, you can copy a complete, running instance from one server to another, without needing any additional space for a backup file:
root@otherserver# ssh mysite@checkmkserver "omd backup -" | omd restore -
And now, the same procedure with a reversed SSH-access — but this time from the source system to the target system:
root@linux# omd backup mysite - | ssh root@otherserver "omd restore -"