Regular expressions in Checkmk

1. Introduction

Regular expressions — regex (or rarely regexp) — are used in Checkmk to specify service names and in many other situations. They are patterns that match a certain text or do not match (nonmatch). You can do many practical things with them, such as formulating flexible rules that apply to all services with foo or bar in their name.

Regular expressions are often confused with filename search patterns, since the special characters * and ?, as well as square and curly brackets, can exist in both.

In this article we will show you the most important functions of regular expressions, of course in the context of Checkmk. Since Checkmk uses two different components for regular expressions, sometimes the devil is in the detail. Essentially, the monitoring core uses the C library and all other components use Python 3. Where differences exist, we will explain them.

Tip: In Checkmk, regular expressions are allowed in input fields on various pages. If you are unsure, use the context-sensitive help via the Help menu (Help > Show inline help). There you can see whether regular expressions are permitted and how they can be used.

When working with older plug-ins or plug-ins from external sources, these may use Python 2 or Perl and deviate from the conventions described here.

In this article we will show you the most important capabilities of regular expressions — but by no means all of them. If the possibilities shown here do not go far enough, below you will find references where you can read all of the relevant details. And then there is always the internet.

If you want to programme your own plug-ins that, for example, use regular expressions to find anomalies in log files, you can use this article as a basis. However, when searching in large volumes of data, performance optimisation is an important aspect. If in doubt, always consult the documentation for the regex library being used.

2. Working with regular expressions

In this section we use concrete examples to show how to work with regular expressions, from simple matches of single characters or strings, to complex groups of characters.

2.1. Alphanumeric characters

With regular expressions, it is always a question of whether a pattern matches a certain text (e.g. a service name). The simplest application example is a chain of alphanumeric characters. These (and the minus sign used as a hyphen) simply match themselves in an expression.

When searching in the monitoring environment Checkmk is usually not case-sensitive. In most cases, the expression CPU load matches the text CPU load as well as cpu LoAd. Searching in the configuration environment, on the other hand, is usually case-sensitive. Justified exceptions to these standards are possible and are described in the inline help.

Attention: In input fields without a regular expression where an exact match is specified (mostly with host names), upper and lower case are always distinguished!

2.2. The point ( . ) as a wild card

In addition to the 'plain text' character strings, there are a number of characters and character strings that have 'magic' functions. The most important such character is the . (point). It matches exactly one arbitrary character:

Regular Expression	Match	No Match
`Me.er`	`Meier` `Meyer`	`Meyyer`
`.var.log`	`1var2log` `/var/log`	`/var//log`
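
The rows of this table can be reproduced at the Python prompt. The following is only a minimal sketch: the helper name matches() is hypothetical and not part of Checkmk, and re.search() is used on the assumption that an infix match is wanted.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Hypothetical helper: True if the pattern is found anywhere in the text.
    return re.search(pattern, text) is not None

# The point stands for exactly one character, so 'Meyyer' is one character too long.
for text in ("Meier", "Meyer", "Meyyer"):
    print("Me.er", text, matches("Me.er", text))

for text in ("1var2log", "/var/log", "/var//log"):
    print(".var.log", text, matches(".var.log", text))
```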

2.3. Repetition of characters

One would very often like to define that a sequence of characters of a certain length may occur. For this purpose one specifies the number of repetitions of the preceding character in curly brackets:

Regular Expression	Function	Match	No Match
`Ax{2,5}B`	`x` occurs at least twice but not more than five times	`AxxB` `AxxxxB`	`AxB` `AxxxxxxB`
`Ax{0,5}B`	`x` occurs at most five times, but it does not have to occur	`AB` `AxxxxxB`	`AxxxxxxB`
`Ax{3}B`	`x` occurs exactly three times	`AxxxB`	`AxxB` `AxxxxB`
`Ax{0,}B`	`x` can occur any number of times	`AB` `AxxxxxxB`
`Ax{1,}B`	`x` occurs at least once	`AxB` `AxxxxxB`	`AB`
`Ax{0,1}B`	`x` occurs no more than once	`AB` `AxB`	`AxxB`

There are abbreviations for the last three above conditions: * matches the preceding character any number of times, + matches at least one occurrence and ? matches at most one occurrence.
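
If you want to convince yourself that the abbreviations behave like the long forms, a short comparison in Python is enough. This is a sketch only; the names PAIRS, SAMPLES and verdicts() are chosen for illustration.

```python
import re

# Each pair: (long form with curly brackets, abbreviated form).
PAIRS = [(r"Ax{0,}B", r"Ax*B"), (r"Ax{1,}B", r"Ax+B"), (r"Ax{0,1}B", r"Ax?B")]
SAMPLES = ["AB", "AxB", "AxxB", "AxxxxxxB"]

def verdicts(pattern):
    # One True/False entry per sample text.
    return [re.search(pattern, text) is not None for text in SAMPLES]

for long_form, short_form in PAIRS:
    same = verdicts(long_form) == verdicts(short_form)
    print(long_form, short_form, "identical results:", same)
```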

You can also use the period . with the repeat operators to search for a sequence of arbitrary characters in a more defined way:

Regular Expression	Match	No Match
`State.*OK`	`State is OK` `State = OK` `StateOK`	`StatOK`
`State*OK`	`StateOK` `StatOK`	`State OK`
`a *= *5`	`a=5` `a = 5`	`a==5`
`State.+OK`	`State is OK` `State=OK` `State OK`	`StateOK`
`State.?OK`	`State=OK` `State OK` `StateOK`	`State is OK`
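
To see which part of a text the point with a repeat operator actually consumes, you can wrap it in round brackets and print the result. The sketch below uses a hypothetical helper consumed(); the round brackets are only added for display and are not part of the expressions in the table.

```python
import re

def consumed(pattern, text):
    # Hypothetical helper: the text matched inside the brackets, or None.
    m = re.search(pattern, text)
    return m.group(1) if m else None

for pattern in (r"State(.*)OK", r"State(.+)OK", r"State(.?)OK"):
    for text in ("State is OK", "State=OK", "StateOK"):
        print(pattern, repr(text), "->", repr(consumed(pattern, text)))
```

An empty string in the output means that the expression matched without the point consuming any character; None means that there was no match at all.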

2.4. Character classes, numbers and letters

Character classes allow certain sections of the character set to be matched, for example, "here must come a digit". To do this, place all of the characters to be matched within square brackets. With a minus sign you can also specify ranges. Note: The sequence in the 7-bit ASCII character set applies.

For example, [abc] stands for exactly one of the characters a, b or c, and [0-9] for any digit. Both can be combined. The whole class can also be negated: with a ^ as the first character inside the square brackets, [^abc] stands for any character except a, b and c.

Character classes can of course be combined with other operators. Let’s start with some abstract examples:

Character Class	Function
`[abc]`	Exactly one of the characters a, b, c.
`[0-9a-z_]`	Exactly one digit, lower case letter or underscore.
`[^abc]`	Any character except a, b, c.
`[ --]`	Exactly one character, ranging from a blank character to a hyphen, conforming to the ASCII standard. The following characters are in this range: `!"#$%&'()*+,`
`[0-9a-z]{1,20}`	A sequence of at least one and at most 20 letters and/or digits in any order.

Here are some practical examples:

Regular Expression	Match	No Match
`[0-7]`	`0` `5`	`9`
`[0-7]{2}`	`00` `53`	`183`
`M[ae]{1}[iy]{1}e?r`	`Meier` `Meyer` `Mayr`	`Myers`
`myhost_[0-9a-z_]{3}`	`myhost_1a3` `myhost_1_5`	`myhost_xy`
`[+0-9/ ()-]+`	`+49 89 998209700` `089 / 9982 097-00`	`089 : 9982 097-00` (here only the group before the colon is matched)
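
The remark on the last row is easy to check in Python. In this sketch the names PHONE and first_match() are hypothetical; re.search() returns the first stretch of text that consists only of characters from the class.

```python
import re

PHONE = r"[+0-9/ ()-]+"

def first_match(text):
    # Hypothetical helper: the first matching part of the text, or None.
    m = re.search(PHONE, text)
    return m.group(0) if m else None

print(repr(first_match("+49 89 998209700")))   # the whole number
print(repr(first_match("089 : 9982 097-00")))  # only the part before the colon

# re.fullmatch() demands that the entire text consists of allowed characters.
print(re.fullmatch(PHONE, "089 : 9982 097-00") is None)
```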

Note: If you need one of the characters -, [ or ], you will have to use a trick. Write the - (minus sign) at the end of the class — as already shown in the previous example. When evaluating the regular expressions the minus sign, if it is not in the middle of three characters, is not evaluated as an operator, but as exactly this character. If necessary insert a closing square bracket as the first character in the class, and an opening bracket as the second character. Since no empty classes are allowed, the closing square bracket is then interpreted as a normal character. A class with these special characters would look like this: []-], or respectively [][-] if the opening square bracket is also needed.
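
The two classes from this note can be tried out as follows. This is a small sketch using Python's re module; the sample text is made up.

```python
import re

sample = "a[b]c-d"

# []-] contains the closing square bracket and the minus sign.
print(re.findall(r"[]-]", sample))

# [][-] additionally contains the opening square bracket.
print(re.findall(r"[][-]", sample))
```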

2.5. Beginning and end — prefix, suffix and infix

In many cases it is necessary to distinguish between matches at the beginning, at the end or simply somewhere within a string. For a match of the beginning of a string (prefix match) use the ^ (circumflex), for the end (suffix match) use the $ (dollar sign). If neither of these operators is specified, most regular expression libraries use the infix-match as the default — it is searched for anywhere in the character string. For exact matches, use both ^ and $.

Regular Expression	Match	No Match
`/var`	`/var` `/var/log` `/usr/var`
`^/var`	`/var` `/var/log`	`/usr/var`
`/var$`	`/var` `/usr/var`	`/var/log`
`^/var$`	`/var`	`/var/log` `/usr/var`

Note: In monitoring and the Event Console, infix match is the standard. Expressions that occur anywhere in the text are found, i.e. the search for 'memory' also finds 'kernel memory'. In the Setup GUI, on the other hand, when comparing regular expressions with service names and other things, Checkmk as a rule checks whether the expression matches the beginning of the text (prefix match). This is usually what you are looking for.

If you do need an infix match in places where prefix match is provided, simply extend your regular expression with .* at the beginning to match any prefixed string:

Regular Expression	Match	No Match
`/var`	`/var` `/var/log`	`/usr/var`
`.*/var`	`/var` `/usr/var` `/var/log`
`/var$`	`/var`	`/var/log` `/usr/var`
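
In Python the two behaviours can be imitated with re.match(), which only looks at the beginning of the text, and re.search(), which looks everywhere. Using these two functions as stand-ins for the prefix and infix match in Checkmk is an assumption made for this sketch; the helper names are hypothetical.

```python
import re

PATHS = ["/var", "/var/log", "/usr/var"]

def prefix_hits(pattern):
    # re.match() anchors the expression at the beginning of the text.
    return [p for p in PATHS if re.match(pattern, p)]

def infix_hits(pattern):
    # re.search() finds the expression anywhere in the text.
    return [p for p in PATHS if re.search(pattern, p)]

print(prefix_hits(r"/var"))    # prefix match
print(prefix_hits(r".*/var"))  # a leading .* turns it into an infix match
print(prefix_hits(r"/var$"))   # prefix match plus $ is an exact match
print(infix_hits(r"/var$"))    # suffix match
```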

Tip: You can prefix any search at the beginning of a string with ^ and any search within a string with .*. The regular expression interpreters will ignore redundant symbols.

2.6. Masking special characters with a backslash

Since the point matches everything, it naturally also matches a point. If you now want to match exactly one point, you have to mask it with a \ (backslash). This applies analogously for all other special characters. These are: \ . * + ? { } ( ) [ ] | & ^ and $. Prefixing a special character with a backslash results in it being treated as a normal character:

Regular Expression	Match	No Match
`example\.com`	`example.com`	`example\.com` `example-com`
`How\?`	`How?`	`How\?` `How`
`C:\\Programs`	`C:\Programs`	`C:Programs` `C:\\Programs`

Attention Python: In Python source code, the backslash inside an ordinary string literal must itself be masked with another backslash. Each of the two backslashes in the regular expression therefore has to be masked again, which leads to a total of four backslashes:

Regular Expression	Match	No Match
`C:\\\\Programs`	`C:\Programs`	`C:Programs` `C:\\Programs`
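
If you write your own Python code, a raw string literal (prefix r) is a common way to avoid the doubling. The following sketch, with variable names chosen for illustration, shows that both spellings produce the same regular expression:

```python
import re

text = "C:\\Programs"    # the text C:\Programs, containing one backslash
four = "C:\\\\Programs"  # ordinary string literal: four backslashes
raw = r"C:\\Programs"    # raw string literal: two backslashes are enough

print(four == raw)                       # both are the same expression
print(re.search(raw, text) is not None)  # and it matches C:\Programs
print(re.search(raw, "C:Programs"))      # no backslash in the text: no match
```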

2.7. Alternative values

With the vertical line | you can define alternatives, i.e. use an OR operation: 1|2|3 matches 1, 2 or 3. If you need such alternatives in the middle of an expression, group them within round brackets:

Regular Expression	Match	No Match
`CPU load|Kernel|Memory`	`CPU load` `Kernel`	`CPU utilization`
`01|02|1[1-5]`	`01` `02` `11` to `15`	`05`
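
Why round brackets matter in the middle of an expression becomes clear in a short test. This is a sketch with a hypothetical helper hit(); the service names are invented for illustration.

```python
import re

def hit(pattern, text):
    # Hypothetical helper: True if the pattern is found anywhere in the text.
    return re.search(pattern, text) is not None

print(hit(r"CPU load|Kernel|Memory", "Kernel"))
print(hit(r"CPU load|Kernel|Memory", "CPU utilization"))

# Without brackets, the vertical line separates the complete expression
# into 'CPU load' and 'utilization' ...
print(hit(r"CPU load|utilization", "Disk utilization"))

# ... with brackets, the alternatives are confined to the group.
print(hit(r"CPU (load|utilization)", "Disk utilization"))
```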

2.8. Match groups

Match groups (or capture groups) fulfil two functions: The first function is the grouping of alternatives or partial matches, as shown in the previous example. Nested groupings are also possible. In addition, the repeat operators *, +, ? and {…} may be placed after a group in round brackets and then apply to the whole group. Thus the expression (/local)?/share matches both /local/share and /share.

The second function is to 'capture' matched character groups in variables. In the Event Console (EC), Business Intelligence (BI), in bulk renaming of hosts and in piggyback mappings, there is the possibility of using the text part corresponding to the regular expression in the first parenthesis as \1, the part corresponding to the second parenthesis as \2, and so on. The last example in the table shows the use of alternatives within a match group.

Regular Expression	Text to be matched	Group 1	Group 2
`([a-z]+)([123]+)`	`def231`	`def`	`231`
`server-(.*)\.local`	`server-lnx02.local`	`lnx02`
`server\.(intern|dmz|123)\.net`	`server.dmz.net`	`dmz`
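
The captured groups from this table can be displayed in Python with the groups() method of a match object. The helper captured() and the list EXAMPLES are hypothetical names used only in this sketch.

```python
import re

EXAMPLES = [
    (r"([a-z]+)([123]+)", "def231"),
    (r"server-(.*)\.local", "server-lnx02.local"),
    (r"server\.(intern|dmz|123)\.net", "server.dmz.net"),
]

def captured(pattern, text):
    # Hypothetical helper: tuple of all captured groups, or None without a match.
    m = re.search(pattern, text)
    return m.groups() if m else None

for pattern, text in EXAMPLES:
    print(pattern, text, "->", captured(pattern, text))
```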

The following image shows such a renaming of multiple hosts in a single action: all host names that match the regular expression server-(.*)\.local are replaced by \1.servers.local, where \1 stands for exactly the text 'captured' by the .* in the round brackets.

In the concrete example, server-lnx02.local is thus converted into lnx02.servers.local.
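
The effect of such a replacement can be imitated in Python with re.sub(). This is only a sketch of the principle and not the code Checkmk uses for bulk renaming; the function rename() and the second host name are hypothetical.

```python
import re

PATTERN = r"server-(.*)\.local"
REPLACEMENT = r"\1.servers.local"

def rename(host_name: str) -> str:
    # Hypothetical helper: host names that do not match remain unchanged.
    return re.sub(PATTERN, REPLACEMENT, host_name)

print(rename("server-lnx02.local"))
print(rename("db01.local"))
```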

3. Table of special characters

Here you will find a list summarising all of the special characters and regular expression functions used by Checkmk, as explained above:

Character	Function
`.`	Matches any character.
`\`	Evaluates the next special character as a normal character.
`{5}`	The previous character must occur exactly five times.
`{5,10}`	The previous character must occur at least five and at most ten times.
`*`	The previous character may occur any number of times (equivalent to `{0,}`).
`+`	The previous character may occur any number of times, but must occur at least once (equivalent to `{1,}`).
`?`	The previous character may occur zero times or once (equivalent to `{0,1}`).
`[abc]`	Stands for exactly one of the characters a, b or c.
`[0-9]`	Stands for exactly one of the characters 0, 1 … 9 (i.e. a digit).
`[0-9a-z_]`	Stands for exactly one digit, a lower case letter or the underscore.
`[^"']`	Stands for exactly one character except the single or double inverted comma.
`$`	Matches the end of a text.
`^`	Matches the beginning of a text.
`A|B|C`	Matches A or B or C.
`(A)`	Captures the subexpression A in a match group.
`\t`	Matches a tab stop (tabulator). This character often occurs in log files or CSV tables.
`\s`	Matches all spaces (ASCII uses 5 different types of space).

The following characters must be masked by a backslash, if they are to be used literally: \ . * + ? { } ( ) [ ] | & ^ $.

3.1. Unicode in Python 3

In particular, if proper names in comments or descriptive texts have been copied and pasted, and therefore Unicode characters or different types of spaces appear in the text, Python’s extended classes are very helpful:

Character	Function
`\t`	Matches a tab stop (tabulator), which often occurs in log files or CSV tables.
`\s`	Matches all spaces (Unicode supports 25 different spaces, ASCII 5).
`\S`	Inverse of `\s`, i.e. matches all characters that are not spaces.
`\w`	Matches all characters that are part of a word, i.e. letters, and in Unicode also accents, Chinese, Arabic or Korean glyphs. Attention: Digits also count as part of a word here.
`\W`	Inverse of `\w`, i.e. matches everything that is typically not part of a word (spaces, punctuation marks, emoticons, special mathematical characters).

In places in which Checkmk allows Unicode matching, \w is particularly useful when searching for similarly-spelled words in different languages, for example proper names that are sometimes written with and sometimes without an accent.

Regular Expression	Match	No Match
`\w{1,3}ni\w{1,2}el`	`Schnitzel` (German) `șnițel` (Romanian)	`šnicl` (Croatian) `Schnit’el` (with omission character)
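
The difference between Unicode and ASCII matching of \w can be made visible with the flag re.ASCII. In this sketch the special characters are written as Unicode escapes so that the example does not depend on the encoding of your editor; the names PATTERN, WORDS and hits() are hypothetical.

```python
import re

PATTERN = r"\w{1,3}ni\w{1,2}el"
WORDS = [
    "Schnitzel",
    "\u0219ni\u021bel",  # Romanian spelling with s-comma and t-comma
    "\u0161nicl",        # Croatian spelling with s-caron
    "Schnit\u2019el",    # with omission character
]

def hits(flags=0):
    # Hypothetical helper: all words in which the pattern is found.
    return [word for word in WORDS if re.search(PATTERN, word, flags)]

print(hits())          # Unicode matching, the default for Python 3 strings
print(hits(re.ASCII))  # \w restricted to ASCII letters, digits and underscore
```

With re.ASCII only the German spelling is found, because the Romanian letters no longer count as part of a word.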

4. Testing regular expressions

The logic of regular expressions is not always easy to understand, especially with nested match groups or when it is unclear in which order and from which end of the string matching takes place. Rather than relying on trial and error in Checkmk, there are two ways of testing regular expressions. The first is to use online services such as regex101.com, which display matches graphically and explain the order of evaluation in real time.

The second testing procedure is the Python prompt, which comes with every Python installation. On Linux and macOS, Python 3 is usually pre-installed. Because regular expressions at the Python prompt are evaluated exactly as in Checkmk, there are no discrepancies in interpretation, even with complex nesting. With the test in the Python interpreter you are always on the safe side.

After starting the interpreter, you have to import the module re. The constant re.IGNORECASE shown in the example is the flag that switches off the distinction between upper and lower case when it is passed to a matching function:

OMD[mysite]:~$ python3
Python 3.8.10 (default, Jun  2 2021, 10:49:15)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.IGNORECASE
re.IGNORECASE

To emulate the behaviour of C’s regular expressions, which are also used in many Python components, you can restrict matching to ASCII with the flag re.ASCII:

>>> re.ASCII
re.ASCII
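
The flags only take effect when they are handed over to a function of the re module. The following sketch shows this with re.search(); the sample texts and variable names are invented for illustration.

```python
import re

# Without a flag, upper and lower case are distinguished.
strict = re.search(r"cpu load", "CPU Load")

# The flag is passed as the third argument.
relaxed = re.search(r"cpu load", "CPU Load", re.IGNORECASE)

# Several flags are combined with the | operator. With re.ASCII the
# letters o-umlaut and sharp s no longer count as part of a word.
both = re.search(r"\w+", "Gr\u00f6\u00dfe", re.IGNORECASE | re.ASCII)

print(strict)
print(relaxed.group(0))
print(both.group(0))
```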

Now you can use the function re.match() to directly match a regular expression against a string and output the match groups: group(0) stands for the whole match, and group(1) for the part matched by the first sub-expression enclosed within round brackets. If the expression does not match, re.match() returns None and calling group() fails:

>>> x = re.match('M[ae]{1}[iy]{1}e?r', 'Meier')
>>> x.group(0)
'Meier'
>>> x = re.match('M[ae]{1}[iy]{1}e?r', 'Mayr')
>>> x.group(0)
'Mayr'
>>> x = re.match('M[ae]{1}[iy]{1}e?r', 'Myers')
>>> x.group(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
>>> x = re.match(r'server-(.*)\.local', 'server-lnx23.local')
>>> x.group(0)
'server-lnx23.local'
>>> x.group(1)
'lnx23'
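
In your own scripts you will usually want to avoid the error message for texts that do not match. One possible way, sketched here with a hypothetical helper whole_match(), is to check the result of re.match() for None first:

```python
import re

def whole_match(pattern, text):
    # Hypothetical helper: the matched text, or None if there is no match.
    m = re.match(pattern, text)
    return m.group(0) if m is not None else None

for name in ("Meier", "Mayr", "Myers"):
    print(name, "->", whole_match(r"M[ae]{1}[iy]{1}e?r", name))
```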

5. Additional external documentation

Ken Thompson, one of the creators of UNIX back in the 1960s, was the first to develop regular expressions in today’s form — among other things in the Unix command grep, which is still in use. Since then, numerous extensions and dialects of regular expressions have been created — including extended regexes, Perl-compatible regexes as well as a very similar variant in Python.

In filters in views Checkmk uses POSIX extended regular expressions (extended REs). These are evaluated in the monitoring core in C using the C-library’s regex function. You can find a complete reference for this in the Linux man page for regex(7):

OMD[mysite]:~$ man 7 regex

REGEX(7)                   Linux Programmer's Manual                   REGEX(7)

NAME
       regex - POSIX.2 regular expressions

DESCRIPTION
       Regular expressions ("RE"s), as defined in POSIX.2, come in two forms:
       modern REs (roughly those of egrep; POSIX.2 calls these "extended" REs)
       and obsolete REs (roughly those of ed(1); POSIX.2 "basic" REs).
       Obsolete REs mostly exist for backward compatibility in some old programs;

In all other places, all of the functions of Python’s regular expressions are available. This includes, among other things, the configuration rules, the Event Console (EC) and Business Intelligence (BI).

The regular expressions in Python are an extension of the extended REs and are very similar to those in Perl. They support, for example, the so-called negative lookahead, a non-greedy * asterisk, or an enforcement of upper/lower case distinction. The details of the capabilities of these regular expressions can be found in the Python online help for the re module, or in more detail in the Python online documentation:

OMD[mysite]:~$ pydoc3 re
Help on module re:

NAME
    re - Support for regular expressions (RE).

MODULE REFERENCE
    https://docs.python.org/3.8/library/re

    The following documentation is automatically generated from the Python
    source files. It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations. When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This module provides regular expression matching operations similar to
    those found in Perl. It supports both 8-bit and Unicode strings; both
    the pattern and the strings being processed can contain null bytes and
    characters outside the US ASCII range.

    Regular expressions can contain both special and ordinary characters.
    Most ordinary characters, like "A", "a", or "0", are the simplest
    regular expressions; they simply match themselves. You can
    concatenate ordinary characters, so last matches the string 'last'.
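
The three Python extensions named above (negative lookahead, non-greedy asterisk and enforced case distinction) can be tried out as follows. This is a sketch based on the syntax of Python's re module; the host names and texts are invented, and the inline group (?-i:...) requires Python 3.6 or newer.

```python
import re

# Negative lookahead: 'server' only if it is NOT followed by '-test'.
print(re.search(r"server(?!-test)", "server-test01"))
print(re.search(r"server(?!-test)", "server-prod01").group(0))

# Greedy asterisk takes as much as possible, non-greedy *? as little as possible.
print(re.search(r"<.*>", "<a><b>").group(0))
print(re.search(r"<.*?>", "<a><b>").group(0))

# (?-i:...) enforces case distinction for one part of the expression,
# even though re.IGNORECASE is set for the rest.
print(re.search(r"(?-i:CPU) load", "CPU LOAD", re.IGNORECASE).group(0))
print(re.search(r"(?-i:CPU) load", "cpu load", re.IGNORECASE))
```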

A very detailed explanation of regular expressions can be found in a Wikipedia article.
