1. Mode of operation
IT operations distinguish between two types of outage: planned and unplanned. On its own, the monitoring system cannot know whether a detected outage was planned. With scheduled downtimes you can inform the system of a host’s or service’s planned outages by defining a scheduled downtime for the corresponding object. While a host or service is in a scheduled downtime, the following applies:
In the views, an icon appears next to the affected hosts and services: Services are marked with a traffic cone, hosts with a blue pause icon. Services whose hosts are in downtime also get the blue pause icon. In the history, the start and the end of a downtime are each marked with their own icon.
Problem notifications are deactivated during the downtime.
The affected hosts/services are not identified as having a problem in the Overview.
Scheduled downtimes are specially taken into account in the availability analysis.
For information purposes, a special notification is triggered at the start and at the end of a scheduled downtime.
2. Entering scheduled downtimes
Defining scheduled downtimes is achieved via commands. All actions pertaining to scheduled downtimes are available here in a single box:
The Comment field must always be filled out.
You can include a URL, such as https://www.example.com, in this field; it will be replaced by a clickable link.
There are multiple ways to define the start and end of the time range: from a simple preset such as 2 hours, which starts the downtime immediately, to an explicit time range, which can also lie in the future.
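The two ways of entering a time range can be illustrated with a small Python sketch. It is not Checkmk code: the function name `resolve_downtime_range` and its parameters are hypothetical, and it only models the rules stated above (mandatory comment, relative duration starting now, or explicit range).

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def resolve_downtime_range(
    comment: str,
    now: datetime,
    duration: Optional[timedelta] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Return the (start, end) pair for a downtime request (hypothetical helper)."""
    if not comment.strip():
        # The Comment field must always be filled out.
        raise ValueError("a comment is required")
    if duration is not None:
        # Relative form such as "2 hours": the downtime starts immediately.
        return now, now + duration
    if start is None or end is None or end <= start:
        raise ValueError("an explicit range needs a start that lies before its end")
    # Explicit form: the range may lie entirely in the future.
    return start, end
```

Calling it with `duration=timedelta(hours=2)` corresponds to the 2 hours preset, while passing `start` and `end` corresponds to an explicit range.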
2.1. Regularly scheduled downtimes
Some maintenance is performed regularly — a once-weekly automatic restart of a server, for example. Manually entering a scheduled downtime for each occasion would be time-consuming. If you would only like to silence the notifications, you could configure time periods and the Notification period for Hosts/Services rule set. These have various restrictions however — one important restriction being that global configuration permissions are required for setting time periods.
For this purpose the commercial editions offer the concept of automatic, periodically-recurring, scheduled downtimes. These can be set in two different ways.
Setting using a command
The first way is via the Repeat option.
Here you select the interval at which the downtime repeats. Enter the first occurrence via Start and End. The interval is calculated from the start time entered here. The following options are available:
Option | Description
never | The scheduled downtime is not repeated, i.e. only executed once (default setting).
hour | The scheduled downtime repeats hourly at the same time.
day | Daily, at the same time every day.
week | Recurs every seven days on the same weekday and time of day as on the first occasion.
second week | Same as for weekly, but every 14 days.
fourth week | Same as for weekly, but now every 28 days.
same nth weekday (from beginning) | With this you can achieve results such as "every second Monday in the month". Here Checkmk takes the day of the week as the starting point, checks which day in the month it is, and bases the period on this day. If the starting date is the second Monday in the month, then a downtime will be scheduled for the second Monday in every subsequent month.
same nth weekday (from end) | This is similar, except that it is calculated from the end of the month, for example "every last Friday in the month".
same day of the month | In this case the weekday is irrelevant. Here the date in the month is used. So, if the starting date is the 5th, the downtime will be scheduled to occur on the 5th of each month.
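To make the repeat options more concrete, the following Python sketch computes the next occurrence of a downtime from the start time of the previous one. It is an illustration only, not Checkmk's implementation: the function names are hypothetical, and the handling of months in which the requested day does not exist (for example a fifth Monday or the 31st) is an assumption; this sketch simply skips such months.

```python
import calendar
from datetime import datetime, timedelta

FIXED_INTERVALS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(days=7),
    "second week": timedelta(days=14),
    "fourth week": timedelta(days=28),
}


def _following_month(year, month):
    return (year + 1, 1) if month == 12 else (year, month + 1)


def _day_in_month(start, year, month, mode):
    """Day of the month matching `start` under `mode`, or None if it does not exist."""
    first_weekday, days_in_month = calendar.monthrange(year, month)
    if mode == "same day of the month":
        day = start.day
    elif mode == "same nth weekday (from beginning)":
        n = (start.day - 1) // 7  # 0 = first, 1 = second, ...
        offset = (start.weekday() - first_weekday) % 7
        day = 1 + offset + 7 * n
    elif mode == "same nth weekday (from end)":
        start_days = calendar.monthrange(start.year, start.month)[1]
        n = (start_days - start.day) // 7  # 0 = last, 1 = second to last, ...
        last_weekday = (first_weekday + days_in_month - 1) % 7
        offset = (last_weekday - start.weekday()) % 7
        day = days_in_month - offset - 7 * n
    else:
        raise ValueError(f"unknown repeat mode: {mode}")
    return day if 1 <= day <= days_in_month else None


def next_occurrence(start, mode):
    """Start time of the next repetition, or None for a one-off downtime."""
    if mode == "never":
        return None
    if mode in FIXED_INTERVALS:
        return start + FIXED_INTERVALS[mode]
    year, month = _following_month(start.year, start.month)
    while True:
        day = _day_in_month(start, year, month, mode)
        if day is not None:
            return start.replace(year=year, month=month, day=day)
        # Assumption: months lacking the requested day are skipped.
        year, month = _following_month(year, month)
```

For a first occurrence on Monday, 8 January 2024 (the second Monday of that month), `next_occurrence(datetime(2024, 1, 8, 22, 0), "same nth weekday (from beginning)")` yields 12 February 2024, the second Monday of February.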
Setting using rules
An elegant alternative method for the configuration of periodic scheduled downtimes is to define them using rules. Using host tags you can define policies such as: Every production Windows server has a scheduled downtime every Sunday from 22:00 to 22:10.
You can in fact achieve almost the same result by using the host search to find all the affected servers, and then entering the scheduled downtime via a command. But this works only for hosts that already exist.
A host added to the monitoring at a later date will not be covered by such a downtime. With rules this is not a problem, since they also apply to new hosts. A further advantage of rules is that the maintenance policy can be altered very easily at a later date, simply by modifying the rules.
The rules for recurring scheduled downtimes can be found under Setup > Hosts > Host monitoring rules > Recurring downtimes for hosts and under Setup > Services > Service monitoring rules > Recurring downtimes for services.
2.2. Advanced options
In addition to the regular scheduled downtimes just described, there are other options for defining scheduled downtimes. These can be found in the Advanced options:
The option Only for hosts: Set child hosts in downtime is useful for routers and switches, but also for virtualization hosts, for example. Checkmk then also sets a scheduled downtime on all hosts directly connected to the host in question and, if Include indirectly connected hosts (recursively) is selected, on the hosts indirectly connected through it.
With the Only start downtime if host/service goes DOWN/UNREACH… option the scheduled downtime does not begin automatically at the specified time, but only once a real problem state occurs for the host. This option is useful when, e.g., you know that a host will enter a DOWN state for a few minutes, but the exact time of the event cannot be predicted.
Example: You define a scheduled downtime from 14:00 to 16:00, and activate the option Only start downtime if host/service goes DOWN/UNREACH during the defined start and end time with a duration of 30 minutes. At 14:00 the scheduled downtime will not become active, but will be on standby. As soon as the host enters a DOWN or UNREACH state, the scheduled downtime begins and the blue pause icon appears. The downtime then lasts for the duration specified in the option, regardless of the actual state of the host, and if need be beyond the end time specified for the downtime.
With such flexible scheduled downtimes, the start and end times therefore only define the time window in which the scheduled downtime can begin. If no problem state occurs within this time window, the scheduled downtime is simply skipped. The same of course applies to services.
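The behavior described in the example can be sketched in a few lines of Python. This is a simplified model, not Checkmk code: the function `flexible_downtime` is hypothetical, it only looks at the moment the problem state begins, and treating both window boundaries as inclusive is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def flexible_downtime(
    window_start: datetime,
    window_end: datetime,
    duration: timedelta,
    problem_at: Optional[datetime],
) -> Optional[Tuple[datetime, datetime]]:
    """Return the effective (start, end) of the downtime, or None if it is skipped."""
    if problem_at is None or not (window_start <= problem_at <= window_end):
        # No problem state began inside the window: the downtime never starts.
        return None
    # The downtime runs for the full duration, even past the end of the window.
    return problem_at, problem_at + duration
```

With a window from 14:00 to 16:00 and a duration of 30 minutes, a host going DOWN at 15:45 results in a downtime from 15:45 to 16:15, which extends beyond the end of the window.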
3. Activating scheduled downtimes
Click on Schedule downtime on service or Schedule downtime on host to activate the settings you have just defined for the relevant services or hosts.
If you are defining scheduled downtimes for services, for example in the Services of Host view, you can also click on Schedule downtime on host so that the scheduled downtime applies not to the services, but directly to the associated host.
4. Editing and removing scheduled downtimes
Scheduled downtimes have their own view in Checkmk — this is accessed via Monitor > Overview > Scheduled downtimes:
As in every view, you can narrow the selection with a filter. Using the commands in this view, you can remove one or more downtimes and even alter them retroactively (commercial editions only), for example to extend the end time when the maintenance is taking longer than anticipated.
5. History
The Monitor > History > Downtime history view does not display the current scheduled downtimes but their history, that is, all events with which a scheduled downtime began or ended (either naturally or via a remove command).
6. Scheduled downtimes and availability
As mentioned at the beginning, scheduled downtimes affect the availability analysis. By default all scheduled downtimes are accounted for in their own category and shown in the Downtime column.
Precisely how scheduled downtimes are to be assessed can be defined via Availability > Change computation options:
Option | Description
Honor scheduled downtimes | Scheduled downtimes are included in the availability graphs and displayed as a separate column. This is the default.
Exclude scheduled downtimes | Scheduled downtimes are ignored completely when calculating the 100 %. All availability percentages therefore refer only to the remaining times, in order to answer the question: What percentage of non-maintenance time was the object available?
Ignore scheduled downtimes | Scheduled downtimes will not be factored in; only the object’s actual states are relevant.
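The difference between the three options is easiest to see with numbers. The following Python sketch is a simplified model, not Checkmk's availability code: the function `availability`, the mode names and the segment format `(duration, is_up, in_downtime)` are assumptions made for this illustration.

```python
def availability(segments, mode="honor"):
    """Percentages per bucket for a list of (duration, is_up, in_downtime) segments."""
    total = sum(d for d, _, _ in segments)
    in_downtime = sum(d for d, _, dt in segments if dt)
    up_outside = sum(d for d, up, dt in segments if up and not dt)
    up_all = sum(d for d, up, _ in segments if up)

    if mode == "honor":
        # Downtime is its own bucket; up/down only count time outside downtimes.
        down_outside = total - in_downtime - up_outside
        return {
            "up": 100 * up_outside / total,
            "down": 100 * down_outside / total,
            "downtime": 100 * in_downtime / total,
        }
    if mode == "exclude":
        # Downtime is removed from the 100 % base entirely.
        base = total - in_downtime
        if base == 0:
            raise ValueError("no time left outside scheduled downtimes")
        return {"up": 100 * up_outside / base, "down": 100 * (base - up_outside) / base}
    if mode == "ignore":
        # Only the actual states count, whether or not a downtime was active.
        return {"up": 100 * up_all / total, "down": 100 * (total - up_all) / total}
    raise ValueError(f"unknown mode: {mode}")


# 15 h up, 5 h down during a scheduled downtime, 5 h down without one.
segments = [(15, True, False), (5, False, True), (5, False, False)]
print(availability(segments, "honor"))
print(availability(segments, "exclude"))
print(availability(segments, "ignore"))
```

In this example the object is reported as 60 % up when downtimes are honored or ignored, but as 75 % up when they are excluded, because the five hours of maintenance no longer count toward the total.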
Under Phases there is the additional Treat phases of UP/OK as non-downtime option. If this is selected, times in which an object is undergoing maintenance but is still OK or UP are not treated as scheduled downtimes. Thus only the part of a scheduled downtime during which a real outage occurred is included in the calculations.