Checkmk
to checkmk.com

1. Mode of operation

IT-operations distinguish between two types of outage: planned and unplanned. The monitoring system initially cannot know if a detected outage was planned or not. With the concept of scheduled downtimes the system can be informed of a host’s or service’s planned outages by defining a scheduled downtime for the corresponding object. If a host or service is in such a maintenance period, it has the following effects:

  • In the views, an icon appears next to the affected hosts and services: Services are marked with a Icon for displaying the scheduled downtime for services. guiding cone, hosts with a Icon for displaying the scheduled downtime for hosts. blue pause icon. Services whose hosts are in downtime also get the blue pause icon. In the history, started downtimes are marked with Icon for indicating a started scheduled downtime. and finished downtimes with Icon for indicating a finished scheduled downtime..

  • Problem notifications are deactivated during the downtime.

  • The affected hosts/services are not identified as having a problem in the Overview.

  • Scheduled downtimes are specially taken into account in the availability analysis.

  • For information a special notification will be triggered at the start and end of a scheduled downtime.

2. Entering scheduled downtimes

Defining scheduled downtimes is achieved via commands. All actions pertaining to scheduled downtimes are available here in a single menu:

basics downtimes schedule

The Downtime Comment field must always be filled out. You can include a URL, such as http://www.example.com in this field, which will be replaced by a clickable link. There are multiple ways to define start and end time ranges. From the simple 2 hours, which defines the downtime as starting from now, to the entry of an explicit time range where a future maintenance can also be defined.

2.1. Flexible with max. duration

This option is useful when, e.g., you know that a host will enter a DOWN state for a few minutes, but the exact time of the event cannot be predicted. With this option the scheduled downtime does not begin automatically at a nominated time, rather first when a real Problem status appears for the host.

Example: You define a scheduled downtime as being from 14:00 to 16:00, and activate the flexible option with a duration of 30 minutes. At 14:00 the scheduled downtime will not activate, but rather is in a standby position. As soon as the host enters a DOWN or UNREACH state, the scheduled downtime (30 min) will begin and the ‘blue moon’ will appear. This will remain so for the duration of the time nominated in the option, regardless of the actual status of the host, and if need be beyond the end-time specified for the downtime.

Therefore with flexible scheduled downtimes the start/end time is only the time window in which the scheduled downtime can begin. If no problem status occurs within this time window the scheduled downtime will simply be skipped. These conditions of course also apply for services.

2.2. Also set downtime on child hosts

This option is useful for routers and switches, and also e.g., for virtualization hosts. In this way Checkmk will also automatically set a scheduled downtime on all directly-connected hosts, and also on indirectly-connected hosts if the Do this recursively box is selected.

2.3. Repeat this downtime on a regular basis…​

Here you can enter scheduled downtimes that repeat regularly. This will be explained in more detail in a separate section below.

2.4. Schedule downtimes on hosts

For example, if you set downtimes in the Services of Host view, the scheduled downtime would actually apply directly to all or all selected services. If this is not desired, you do not have to navigate to the Status of Host view first, but can use this option to set the downtime directly to the host.

basics downtimes schedule host

3. Displaying and deleting scheduled downtimes

Scheduled downtimes have their own view in Checkmk — this is accessed via Views > Other > Downtimes:

basics downtimes downtimeslist

As in every view, you can narrow the selection with the icon filter filter. With the commands commands, in this view you can remove one or more downtime(s), and even alter them retroactively (only in the commercial editions), e.g., if the times need to be extended when the downtime is proving to be longer than anticipated.

basics downtimes editdowntimes

3.1. History

The Monitor > History > Downtime history view does not display the current scheduled downtimes, rather their histories — thus all events with which a scheduled downtime began or ended (with a natural end or via a delete command).

basics downtimes downtimes history

4. Regularly-scheduled downtimes

CEE Some maintenance is performed regularly — a once-weekly automatic restart of a server, for example. Manually entering a scheduled downtime for each occasion would be time-consuming. If you would only like to silence the notifications, you could configure time periods and the Notification period for Hosts/Services rules set. These have various restrictions however — one important restriction being that global configuration permissions are required for setting time periods.

For this purpose the commercial editions offer the concept of automatic, periodically-recurring, scheduled downtimes. These can be set in two different ways.

4.1. Setting via commands

The first method works in the manner we are already familiar with for one-off scheduled downtimes — via a command — but with the additional Repeat this downtime on a regular basis every …​ check box.

basics downtimes periodic

With this you select the period when the maintenance should repeat. The first occurrence is entered as usual. The Custom time range button is available here. The periods are calculated from the start-time entered. The following possibilities are available:

hour

The scheduled downtime repeats hourly at the same time

day

Daily, at the same time every day

week

Recurs every seven days on the same weekday and time of day as on the first occasion

second week

Same as for weekly, but every 14 days

fourth week

Same as for weekly, but now every 28 days

same nth weekday (from beginning)

With this you can achieve results such as "every second Monday in the month". Here Checkmk takes the day of the week as the starting point, checks which day in the month it is, and bases the period on this day. If the starting date is the second Monday in the month, then a maintenance will be scheduled for the second Monday in every subsequent month.

same nth weekday (from end)

This is similar, except that it is calculated from the end of the month — for example "every last Friday in the month".

same day of the month

In this case the weekday is irrelevant. Here the date in the month is used. So, if the starting date is the 5th, the downtime will be scheduled to occur on the 5th of each month.

4.2. Definition using rules

An elegant alternative method for the configuration of periodic scheduled downtimes is to define them using rules. With host tags you can define things such as e.g., Every production Windows-server has a scheduled downtime every Sunday from 22:00 to 22:10.

You can in fact achieve almost the same results by using the host search to find all the affected servers, and then entering the scheduled downtime via a command. But this functions only with existing servers. If in the future a new host is added to the monitoring it will not be covered by this entry. Alternatively, if you work with rules this will not be a problem. A further advantage with rules is that the maintenance policy can be altered very easily at a later date — simply by modifying the rules.

The rules for recurring scheduled downtimes can be found under Setup > Host monitoring rules > Recurring downtimes for hosts respectively Setup > Service monitoring rules > Recurring downtimes for services.

basics downtimes recurring rule

5. Scheduled downtimes and availability

As mentioned at the beginning, scheduled downtimes have an effect when evaluating the availability analysis. By default all scheduled downtimes are calculated in their own ‘pot’ and shown in the Downtime column.

basics downtimes availability list

Precisely how scheduled downtimes are to be assessed can be defined via Availability > Change computation options:

basics downtimes availability option

Honor scheduled downtimes

Scheduled downtimes are included in the availability graphs and displayed as a separate column. This is the standard procedure.

Exclude scheduled downtimes

Scheduled downtimes are ignored completely when calculating availability. All availability statistics refer only to the remaining time. Therefore — excluding scheduled downtimes, for what percentage of the time was the object available?

Ignore scheduled downtimes

Scheduled downtimes will not be factored in — only the object’s actual states are relevant.

Under Phases there is the additional Treat phases of UP/OK as non-downtime option. If this option is selected, then if the object, despite being in maintenance, still has an OK or UP state, the times are not treated as scheduled downtimes. Thus only the maintenance time that resulted in a real outage will be included in the calculations.

On this page