1. Introduction
Livestatus is the most important interface in Checkmk. It is the fastest way to retrieve all data on the monitored hosts and services, including live data. The data in the Overview, for example, is retrieved directly via this interface. Because the data is read directly from RAM, slow hard drive access is avoided, which provides rapid access to the monitoring without putting much load on the system.
To give the data a structure, it is arranged in tables and columns. The hosts table includes, for example, the name and state columns, along with numerous others. Each line in the hosts table represents a host, each line in the services table a service, and so on. In this way the data can be easily searched and retrieved.
This article should help you to use this interface for your own queries, extensions and customisations. As an instance user you can – using copy and paste – directly test all of the queries and commands in this article.
2. The Livestatus Query Language (LQL)
2.1. Using the LQL in the shell
Livestatus is accessed over a Unix socket using the Livestatus Query Language (LQL). Its syntax is based on HTTP.
On the command line there are a number of ways of accessing the interface. One possibility is to use the printf and unixcat commands to send an instruction to the socket. The unixcat tool is already included in Checkmk for the instance user. Important: all inputs to the socket are case-sensitive, so the exact spelling must always be observed:
OMD[mysite]:~$ printf "GET hosts\nColumns: name\n" | unixcat ~/tmp/run/live
The interface expects the command and each header on a separate line. You can mark such a line break with \n. As an alternative to the command above, you can also use the lq script, which saves you a bit of work by filling in some fields automatically:
OMD[mysite]:~$ lq "GET hosts\nColumns: name"
Or you can start the interactive input stream and enter the command followed by its headers. A blank line executes the command with its headers, and a further blank line ends the socket access. Note that in the example, everything before the first blank line belongs to the command, and everything between the first and second blank lines is the response:
OMD[mysite]:~$ lq
GET hosts
Columns: name

myserver123
myserver124
myserver125

OMD[mysite]:~$
The following examples are always executed with the lq-command – in the direct form when the query is short, and as an entry stream for longer queries.
LQL commands
In the first examples you have already seen the first of two commands: with GET you can query all available tables. The command reference contains a complete listing of all available tables with descriptions, and this article also contains a general explanation on using Livestatus.
With COMMAND you can issue commands directly to the core, for example, to set a downtime, or to completely deactivate notifications. A list of all available commands can likewise be found in the command reference under Commands.
LQL headers
For every GET-command you can insert various headers in order to restrict the results from a query, to output only specific columns for a table, and much more. The following are the two most important headers:
Header | Description |
---|---|
Columns | Only the specified columns will be produced by a query. |
Filter | Only the entries which meet a specific condition will be produced. |
A list of all headers, each with a short description can be found here.
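The structure described above (one command line, then one header per line) can be sketched in Python. The helper name build_query and its parameters are assumptions made for this illustration and are not part of Checkmk:

```python
def build_query(table, columns=None, filters=None):
    """Assemble an LQL GET request: one command line, then one header per line."""
    lines = ["GET %s" % table]
    if columns:
        # Column names are separated by single spaces.
        lines.append("Columns: " + " ".join(columns))
    for condition in filters or []:
        # Each condition becomes its own Filter header.
        lines.append("Filter: " + condition)
    # Every line, including the last one, ends with a line break.
    return "\n".join(lines) + "\n"


query = build_query("services", ["host_name", "description", "state"], ["state = 2"])
print(query, end="")
```

The resulting string can be passed to the socket in the same way as the printf output shown earlier.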
Show available columns and tables
You will not be able to remember all of the tables and their columns, and access to this handbook (with the references in the online version) may not always be possible. It is however possible to quickly create a query which provides the desired information. To receive a list of all available tables, submit the following query and remove the duplicate lines from the output with sort -u. The first four lines of the output are shown as an example:
OMD[mysite]:~$ lq "GET columns\nColumns: table" | sort -u
columns
commands
comments
contactgroups
To query all columns of a table, you must of course specify that table. Substitute hosts with the desired table. Here as well, the first four lines of the output are shown as an example:
OMD[mysite]:~$ lq "GET columns\nFilter: table = hosts\nColumns: name"
accept_passive_checks
acknowledged
acknowledgement_type
action_url
2.2. Using LQL in Python
Since Checkmk is based very heavily on Python, scripts in this language are practical. The following script can be used as a basis for an access to the Livestatus socket:
#!/usr/bin/env python
# Sample program for accessing Livestatus from Python
import json
import os
import socket

# for local site only: file path to socket
address = "%s/tmp/run/live" % os.getenv("OMD_ROOT")
# for local/remote sites: TCP address/port for Livestatus socket
# address = ("localhost", 6557)

# connect to Livestatus
family = socket.AF_INET if isinstance(address, tuple) else socket.AF_UNIX
sock = socket.socket(family, socket.SOCK_STREAM)
sock.connect(address)

# send our request and let Livestatus know we're done
sock.sendall(b"GET status\nOutputFormat: json\n")
sock.shutdown(socket.SHUT_WR)

# receive the reply as raw bytes until Livestatus closes the connection
chunks = []
while True:
    data = sock.recv(4096)
    if not data:
        break
    chunks.append(data)
sock.close()

# decode once at the end so that multi-byte characters split across chunks stay intact
reply = b"".join(chunks).decode("utf-8")

# print the parsed reply
print(json.loads(reply))
2.3. Using the Livestatus-API
Checkmk provides an API for the Python, Perl and C++ programming languages, which simplifies the access to Livestatus. An example code is available for each language which explains its use. The paths to these examples can be found in the chapter Files and directories.
3. Simple queries
3.1. Column queries (Columns)
Without further headers, a query returns ALL information for ALL hosts. In practice however, you will probably only require specific columns. With the Columns header that has already been mentioned, the output can be limited to these columns. The individual column names are separated by a single space.
OMD[mysite]:~$ lq "GET hosts\nColumns: name address"
myserver123;192.168.0.42
myserver234;192.168.0.73
As can be seen, the individual values in a line are separated by a semicolon.
Important: If you use the Columns header, the line with the column names is suppressed in the output. It can be re-inserted with the ColumnHeaders header.
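In a script, the semicolon-separated lines can be mapped back to the requested column names. The following is a minimal sketch; the function name parse_rows is hypothetical, and it assumes that no value itself contains a semicolon:

```python
def parse_rows(output, columns):
    """Turn semicolon-separated Livestatus lines into one dict per row."""
    rows = []
    for line in output.splitlines():
        if not line:
            continue  # skip empty lines
        # Pair each value with the column name requested in the Columns header.
        rows.append(dict(zip(columns, line.split(";"))))
    return rows


sample = "myserver123;192.168.0.42\nmyserver234;192.168.0.73\n"
hosts = parse_rows(sample, ["name", "address"])
print(hosts[0]["address"])
```

All values arrive as strings in this format, so numeric columns such as state have to be converted explicitly.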
3.2. Setting a simple filter
To limit the query to specific lines, the columns can be filtered for specified contents. If only services with a specific status are to be searched for, this can be achieved with a filter:
OMD[mysite]:~$ lq "GET services\nColumns: host_name description state\nFilter: state = 2"
myserver123;Filesystem /;2
myserver234;ORA MYINST Processes;2
In the example all services with a CRIT state are searched for, and the host name, the service description and the state are output. Such filters can of course be combined, for example to restrict the results to those services which have a CRIT state and have not yet been acknowledged:
OMD[mysite]:~$ lq "GET services\nColumns: host_name description state\nFilter: state = 2\nFilter: acknowledged = 0"
myserver234;Filesystem /;2
As can be seen, you can also filter by columns which are not listed in Columns.
Operators and regular expressions
So far only exact matches on numbers have been filtered. A query can also filter numbers with ‘less than’, or filter character strings. The available operators can be found in the Operators chapter in the command reference. You can thus, for example, filter the columns with regular expressions:
OMD[mysite]:~$ lq "GET services\nColumns: host_name description state\nFilter: description ~~ exchange database|availability"
myserver123;Exchange Database myinst1;1
myserver123;Exchange Availability Service;0
myserver234;Exchange Database myinst3;0
With the right operator you can search the columns in various ways. Livestatus will always interpret such an expression as ‘can appear anywhere in the column’, as long as it has not been otherwise defined. Indicate the start of a line with, for example, the ^ character, and the end of a line with the $ character. A comprehensive list of all special characters in Checkmk regular expressions can be found in the article covering Regular expressions.
4. Complex queries
4.1. Filters for lists
Some columns in a table return not just a single value, but a whole list of them. So that such a list can be searched effectively, the operators have a different function in these cases. A complete list of the operators can be found in Operators for lists. For example, the operator >= has the function ‘contains’. With this you could, for example, search for specific contacts:
OMD[mysite]:~$ lq "GET hosts\nColumns: name address contacts\nFilter: contacts >= hhirsch"
myserver123;192.168.0.42;hhirsch,hhirsch,mfrisch
myserver234;192.168.0.73;hhirsch,wherrndorf
As can be seen in the above example, the contacts are listed in the contacts column, separated by commas. This clearly distinguishes them from the start of another column. A special feature of the equality operator is that it checks whether a list is empty:
OMD[mysite]:~$ lq "GET hosts\nColumns: name contacts\nFilter: contacts ="
myserver345;
myserver456;
4.2. Combining filters
Several filters have already been combined earlier. It is intuitive that the data must pass through all filters in order to be shown; the filters are thus linked by a logical and. To link particular filters with a logical or, add an Or: header after the filter lines, followed by an integer. This counter specifies how many of the preceding filter lines are to be combined with or. In this way groups can be formed and combined as required. The following is a simple example. Here two filters are combined so that all services which have either the state WARN or UNKNOWN are shown:
OMD[mysite]:~$ lq
GET services
Columns: host_name description state
Filter: state = 1
Filter: state = 3
Or: 2
myserver123;Log /var/log/messages;1
myserver123;Interface 3;1
myserver234;Bonding Interface SAN;3
OMD[mysite]:~$
The result of a combination can also be negated, and groups can in turn be combined into other groups. In the example, all services are shown whose state is not OK, and which are not both UNKNOWN and matching Filesystem in their description:
OMD[mysite]:~$ lq
GET services
Columns: host_name description state
Filter: state = 3
Filter: description ~ Filesystem
And: 2
Filter: state = 0
Or: 2
Negate:
myserver123;Log /var/log/messages;1
myserver123;Interface 3;1
myserver234;Filesystem /media;2
myserver234;Filesystem /home;2
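How the And, Or and Negate headers act on the preceding filters can be pictured as a stack. The following Python sketch only illustrates this logic under that assumption; it is not how Livestatus is implemented, and the function name evaluate as well as the sample services are hypothetical:

```python
def evaluate(headers, service):
    """Evaluate Filter/And/Or/Negate headers against one service, stack-style."""
    stack = []
    for name, arg in headers:
        if name == "Filter":
            stack.append(arg(service))  # arg is a predicate function
        elif name in ("And", "Or"):
            count = int(arg)  # number of preceding results to combine (>= 1)
            operands = stack[-count:]
            del stack[-count:]
            stack.append(all(operands) if name == "And" else any(operands))
        elif name == "Negate":
            stack.append(not stack.pop())
    # Whatever remains on the stack is linked with a logical and.
    return all(stack)


# The headers of the negated query above, expressed as predicates.
headers = [
    ("Filter", lambda s: s["state"] == 3),
    ("Filter", lambda s: "Filesystem" in s["description"]),
    ("And", 2),
    ("Filter", lambda s: s["state"] == 0),
    ("Or", 2),
    ("Negate", None),
]
services = [
    {"description": "Filesystem /media", "state": 2},
    {"description": "Filesystem /opt", "state": 3},
    {"description": "Uptime", "state": 0},
    {"description": "Interface 3", "state": 1},
]
shown = [s["description"] for s in services if evaluate(headers, s)]
print(shown)
```

Only the services that are neither OK nor an UNKNOWN Filesystem service remain in the result.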
4.3. Specifying an output format
The output format can be specified in two ways. One method is to redefine the separators used in the standard output. The other method is to output conforming to Python or JSON formats.
Customising csv
As already described, you can precisely customise the standard output format csv (lower case!) and define how the individual elements should be separated from each other. Checkmk recognises four different separators for structuring the data. Following the colon, specify the corresponding ASCII codes as decimal values, so that the header is structured as follows:
Separators: 10 59 44 124
These separators have the following functions:
Separator for the datasets: 10 (line break)
Separator for the columns in a data set: 59 (semicolon)
Separator for the elements in a list: 44 (comma)
Separator for the elements in a service list: 124 (vertical bar)
Each of these values can be selected to structure the output as desired. In the following example the individual columns in a data set have been separated with a tabulator (9) rather than a semicolon (59):
OMD[mysite]:~$ lq
GET services
Columns: host_name description state
Filter: description ~ Filesystem
Separators: 10 9 44 124
myserver123 Filesystem /opt 0
myserver123 Filesystem /var/some/path 1
myserver123 Filesystem /home 0
Important: The order of the separators is fixed and may not be altered.
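A script that sets its own separators has to split the output with the same characters. The following sketch converts the ASCII codes of a Separators header into characters; the function name parse_with_separators is an assumption for this example, and only the first two separators are applied:

```python
def parse_with_separators(output, separators="10 59 44 124"):
    """Split Livestatus output using the ASCII codes from a Separators header."""
    # Fixed order: datasets, columns, list elements, service list elements.
    dataset_sep, column_sep, _list_sep, _service_list_sep = (
        chr(int(code)) for code in separators.split()
    )
    return [
        record.split(column_sep)
        for record in output.split(dataset_sep)
        if record  # ignore the empty string after the final separator
    ]


sample = "myserver123\tFilesystem /opt\t0\nmyserver123\tFilesystem /home\t0\n"
rows = parse_with_separators(sample, "10 9 44 124")
print(rows)
```

A tab as column separator is convenient when values such as the plug-in output may themselves contain semicolons.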
Changing output formats
As well as producing output in csv, Livestatus can also output other formats. These have the advantage of being easier and cleaner to parse in higher-level programming languages. The output may be generated in the following formats:
Format | Description |
---|---|
python | Generates an output as a list compatible with Python 2.x. Text is formatted in Unicode. |
python3 | Likewise generates output as a list, and when doing so takes account of changes in the data type – for example, the automatic conversion of text to Unicode. |
json | The output will likewise be generated as a list, but only a JSON-compatible format will be used. |
CSV | Formats the output conforming to RFC-4180. |
csv | See customising |
Please do not confuse the CSV format with the csv output from Livestatus, which is used if no output format has been specified. Correct use of upper and lower case is thus absolutely essential. To select a format, specify the OutputFormat header instead of Separators:
OMD[mysite]:~$ lq
GET services
Columns: host_name description state
Filter: description ~ Filesystem
OutputFormat: json
[["myserver123","Filesystem /opt",0],
["myserver123","Filesystem /var/some/path",1],
["myserver123","Filesystem /home",0]]
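The advantage of the json format in a script is that the data types survive. As a small sketch, assuming a reply that is valid JSON and using hypothetical variable names, the rows can be turned into dictionaries like this:

```python
import json

# The columns in the order they were requested in the Columns header.
columns = ["host_name", "description", "state"]

# Sample reply as a string; a real reply would be read from the socket.
reply = '[["myserver123","Filesystem /opt",0],\n["myserver123","Filesystem /home",0]]'

services = [dict(zip(columns, row)) for row in json.loads(reply)]
print(services[0]["state"])
```

Unlike with the csv output, the state is an integer here and does not need to be converted.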
5. Retrieving statistics (Stats)
5.1. Introduction
There will be situations in which you have no interest in the state of a single service or group of services. Far more important is the number of services with a current WARN state, or the number of monitored databases. Livestatus is able to generate and output statistics with the Stats header.
5.2. Numbers
The Overview receives its data by retrieving statistics for hosts, services and events through Livestatus and displaying them in Checkmk’s interface. With direct access to Livestatus you can produce your own summary:
OMD[mysite]:~$ lq
GET services
Stats: state = 0
Stats: state = 1
Stats: state = 2
Stats: state = 3
34506;124;54;20
By the way, such statistics can be combined with all filters.
5.3. Grouping
Statistics can also be combined with and/or. The headers are then called StatsAnd and StatsOr. Use StatsNegate if the result should be negated. In the example the total number of hosts is output (the first Stats header), and in addition the output includes the count of hosts which are marked as stale and which are also not in a scheduled downtime (Stats headers 2 and 3 are linked with a logical AND):
OMD[mysite]:~$ lq
GET hosts
Stats: state >= 0
Stats: staleness >= 3
Stats: scheduled_downtime_depth = 0
StatsAnd: 2
734;23
Do not be confused by the various options for combining the results from filters and statistics. While the Filter header outputs all hosts that meet the conditions, statistics output the number of times each Stats condition applies.
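The difference can be pictured in plain Python. The host list below is invented sample data, and the snippet only mirrors the logic; it does not talk to Livestatus:

```python
hosts = [
    {"name": "myserver123", "state": 0},
    {"name": "myserver234", "state": 1},
    {"name": "myserver345", "state": 0},
]


def is_up(host):
    """The condition 'state = 0'."""
    return host["state"] == 0


# Like 'Filter: state = 0': the matching entries themselves are returned.
filtered = [host["name"] for host in hosts if is_up(host)]

# Like 'Stats: state = 0': only the number of matches is returned.
counted = sum(1 for host in hosts if is_up(host))

print(filtered, counted)
```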
5.4. Minimum, maximum, average, etc.
It is also possible to perform calculations on values and, for example, output an average value or a maximum value. A complete list of all of the possible operators can be found here.
In the following example the output will list the average, minimum and maximum times a host’s check plug-ins require for calculating a status:
OMD[mysite]:~$ lq
GET services
Filter: host_name = myserver123
Stats: avg execution_time
Stats: max execution_time
Stats: min execution_time
0.0107628;0.452087;0.008593
Calculations with metrics are handled in a somewhat special way. Here as well, all of the Stats header functions are available for use. They are however applied individually to each of a service’s metrics. In the following example the CPU usage metrics of a host group are added together:
OMD[mysite]:~$ lq
GET services
Filter: description ~ CPU utilization
Filter: host_groups >= cluster_a
Stats: sum perf_data
guest=0.000000 steal=0.000000 system=34.515000 user=98.209000 wait=23.008000
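What applying a function individually to each metric means can be reproduced on the client side. The following sketch adds up metrics of the same name across several perf_data strings. The function name sum_perf_data and the sample values are assumptions, and it expects plain numeric values without units:

```python
from collections import defaultdict


def sum_perf_data(perf_strings):
    """Add up metrics of the same name across several perf_data strings."""
    totals = defaultdict(float)
    for perf in perf_strings:
        for item in perf.split():
            name, _, rest = item.partition("=")
            # Keep only the value; drop any warn/crit/min/max after the first ';'.
            totals[name] += float(rest.split(";")[0])
    return dict(totals)


sample = ["user=1.5 system=0.5", "user=2.0;80;90 system=1.0"]
print(sum_perf_data(sample))
```

Letting Livestatus do the summing with Stats: sum perf_data avoids transferring every single service to the client first.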
6. Limiting an output (Limit)
The number of lines in an output can be intentionally limited. This can be useful if, for example, you only wish to see if you can get any sort of response to a Livestatus query, but want to avoid getting a multi-page output:
OMD[mysite]:~$ lq "GET hosts\nColumns: name\nLimit: 3"
myserver123
myserver234
myserver345
Note that this limit also applies when it is combined with other headers. If, for example, you use Stats to count how many hosts have an UP state and limit the output to 10, only the first 10 hosts will be taken into account.
7. Time limits (Timelimit)
Not only the count of lines to be output can be restricted – the maximum elapsed time that a query is permitted to run can also be limited. This option can prevent a Livestatus query blocking a connection forever if it gets hung up for some reason. The time restriction specifies a maximum time in seconds that a query is permitted to process:
OMD[mysite]:~$ lq "GET hosts\nTimelimit: 1"
8. Activating column headers (ColumnHeaders)
With ColumnHeaders the names of the columns can be added to the output. These are normally suppressed in order to simplify further processing:
OMD[mysite]:~$ lq "GET hosts\nColumns: name address groups\nColumnHeaders: on"
name;address;groups
myserver123;192.168.0.42;cluster_a,headnode
myserver234;192.168.0.43;cluster_a
myserver345;192.168.0.44;cluster_a
9. Authorisations (AuthUser)
If you want to make scripts available that are based on Livestatus, the user should probably only see the data for which they are authorised. Checkmk provides the AuthUser header for this purpose, with the restriction that it may not be used in the following tables:
columns
commands
contacts
contactgroups
eventconsolerules
eventconsolestatus
status
timeperiods
Conversely, this header may be used in all tables that access the hosts or services tables. Which of these entries a user is authorised for depends on the user’s contact groups. In this manner a query will only output data that the contact is also permitted to see. Note here the difference between the strict and loose permission settings:
OMD[mysite]:~$ lq "GET services\nColumns: host_name description contacts\nAuthUser: hhirsch"
myserver123;Uptime;hhirsch
myserver123;TCP Connections;hhirsch
myserver123;CPU utilization;hhirsch,kkleber
myserver123;File /etc/resolv.conf;hhirsch
myserver123;Kernel Context Switches;hhirsch,kkleber
myserver123;File /etc/passwd;hhirsch
myserver123;Filesystem /home;hhirsch
myserver123;Kernel Major Page Faults;hhirsch
myserver123;Kernel Process Creations;hhirsch
myserver123;CPU load;hhirsch,kkleber
10. Time delays (Wait)
With the Wait-header you can create queries for specific data sets without needing to know whether the prerequisites for the data have been satisfied. This can be useful when, for example, you need comparison data for a specific error situation, but you don’t want to put a continuous, unnecessary load on the system. Information will therefore only be retrieved when it is really required.
A full list of the Wait-headers can be found here.
In the following example the Disk IO SUMMARY service of an ESXi server is output as soon as the state of the CPU load service of a specific VM changes to CRIT. With the WaitTimeout header, the query is executed anyway if the condition has not been satisfied after 10000 milliseconds. This prevents the Livestatus connection being blocked for a long time:
OMD[mysite]:~$ lq
GET services
WaitObject: myvmserver CPU load
WaitCondition: state = 2
WaitTrigger: state
WaitTimeout: 10000
Filter: host_name = myesxserver
Filter: description = Disk IO SUMMARY
Columns: host_name description plugin_output
myesxserver;Disk IO SUMMARY;OK - Read: 48.00 kB/s, Write: 454.54 MB/s, Latency: 1.00 ms
A further application is to combine this with a command. You can issue a command and retrieve the results as soon as they are available. In the following example we want to query and display the current data from a service. For this, the command is submitted first, followed by a regular query.
The query checks whether the data from the Check_MK service is newer than a particular point in time. As soon as this condition has been satisfied, the state of the Memory service is output.
OMD[mysite]:~$ lq "COMMAND [$(date +%s)] SCHEDULE_FORCED_SVC_CHECK;myserver;Check_MK;$(date +%s)"
OMD[mysite]:~$ lq
GET services
WaitObject: myserver Check_MK
WaitCondition: last_check >= 1517914646
WaitTrigger: check
Filter: host_name = myserver
Filter: description = Memory
Columns: host_name description state
myserver;Memory;0
Important: Note that the time stamp used for last_check in the example MUST be substituted with an appropriate one. Otherwise the condition will always be satisfied and the output will be produced immediately.
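In a script the time stamp can be generated once and used in both the command and the wait condition, which avoids a stale value. The following is a minimal sketch; the function name forced_check_with_wait and its parameters are hypothetical, and the two strings still have to be sent to Livestatus:

```python
import time


def forced_check_with_wait(host, trigger_service, wanted_service, now=None):
    """Build the COMMAND line and the matching wait query with one time stamp."""
    now = int(time.time()) if now is None else int(now)
    command = "COMMAND [%d] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%d\n" % (
        now, host, trigger_service, now,
    )
    query = (
        "GET services\n"
        "WaitObject: %s %s\n"
        "WaitCondition: last_check >= %d\n"
        "WaitTrigger: check\n"
        "Filter: host_name = %s\n"
        "Filter: description = %s\n"
        "Columns: host_name description state\n"
    ) % (host, trigger_service, now, host, wanted_service)
    return command, query


command, query = forced_check_with_wait("myserver", "Check_MK", "Memory", now=1517914646)
print(command + query, end="")
```

The optional now parameter only exists so that the result is reproducible; in normal use it is left out and the current time is taken.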
11. Time zones (Localtime)
Many monitoring environments query hosts and services on a global level. In such cases the instances of a distributed monitoring can quickly end up working in different time zones. Since Checkmk uses Unix time, which is independent of time zones, this should not be a problem.
Should a server nevertheless be assigned to an incorrect time zone, this difference can be compensated for with the Localtime header. Provide the current time to the query as well. Checkmk will then autonomously round up to the next half-hour, and adjust for the difference. You can provide the time automatically if you invoke the query directly:
OMD[mysite]:~$ lq "GET hosts\nColumns: name last_check\nFilter: name = myserver123\nLocaltime: $(date +%s)"
myserver123;1511173526
Otherwise provide the result from date +%s if you want to use the input stream:
OMD[mysite]:~$ lq
GET hosts
Columns: name last_check
Filter: name = myserver123
Localtime: 1511173390
myserver123;1511173526
12. Status codes (ResponseHeader)
If you write an API you will probably want to receive a status code as a response, so that you can process the output better. The ResponseHeader header supports the values off (default) and fixed16. With fixed16 it provides a status message exactly 16 bytes long in the first line of the response. In the case of an error, the subsequent lines contain a comprehensive description of the error. These are thus also very useful for finding the error in a query.
The status report in the first line combines the following:
Bytes 1-3: The status code. The complete table of possible codes can be found here.
Byte 4: A simple blank character (ASCII-character: 32)
Bytes 5-15: The length of the actual response as an integer. Unnecessary bytes are filled by blank characters.
Byte 16: A line feed (ASCII-character: 10)
In the following example we execute a faulty query in which a column name is erroneously used as a header instead of in a filter.
OMD[mysite]:~$ lq "GET hosts\nName: myserver123\nResponseHeader: fixed16"
400          33
Columns: undefined request header
Important: In an error situation the output format is always an error message in text form. This applies regardless of any adaptations you may have made.
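The byte layout described above can be taken apart with a few lines of Python. This is a sketch based solely on that layout; the function name parse_fixed16 is an assumption for this example:

```python
def parse_fixed16(header):
    """Split a 16-byte fixed16 status line into (status code, response length)."""
    if len(header) != 16 or header[3:4] != b" " or header[15:16] != b"\n":
        raise ValueError("not a fixed16 response header: %r" % header)
    code = int(header[0:3].decode("ascii"))     # bytes 1-3
    length = int(header[4:15].decode("ascii"))  # bytes 5-15, padded with blanks
    return code, length


# Status code 400, a blank, the length right-aligned in 11 bytes, a line feed.
sample = b"400 " + b"33".rjust(11) + b"\n"
print(parse_fixed16(sample))
```

A script would typically treat any code other than 200 as an error and read the following length bytes as the error message.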
13. Keeping a connection alive (KeepAlive)
Particularly with scripts which establish a Livestatus connection over the network, you may want to keep the channel open to save the overhead generated when repeatedly establishing the connection. You can achieve this with the KeepAlive header, and in this way reserve a channel. By the way, following a command a Livestatus connection always stays open. No additional header needs to be specified for this.
Important: Because the channel is blocked to other processes for the duration of the connection, it can become a problem if no other connections are available for use. Other processes must then wait until a connection is free. In the standard configuration Checkmk holds 20 connections ready. If necessary, raise the maximum number of connections with Setup > General > Global Settings > Monitoring Core > Maximum concurrent Livestatus connections.
Always combine KeepAlive with the ResponseHeader header, in order to be able to correctly distinguish the individual answers from each other:
OMD[mysite]:~$ lq
GET hosts
ResponseHeader: fixed16
Columns: name
KeepAlive: on
200          33
myserver123
myserver234
myserver345
GET services
ResponseHeader: fixed16
Columns: host_name description last_check
Filter: description = Memory
200          58
myserver123;Memory;1511261122
myserver234;Memory;1511261183
Make sure that there is no empty line between the first answer and the second request. As soon as the KeepAlive header is omitted from a query, the connection will be closed as usual by the blank line following the next output.
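The reason for combining both headers becomes clear in a script: on a connection that stays open, the length in the status line is the only way to tell where one answer ends. The following sketch reads two answers from one stream. An io.BytesIO object stands in for the socket here, and the helper names frame and read_response are assumptions:

```python
import io


def frame(code, body):
    """Build a fixed16 response for the demonstration below."""
    data = body.encode("utf-8")
    return b"%03d %11d\n" % (code, len(data)) + data


def read_response(stream):
    """Read one answer: the 16-byte status line, then exactly 'length' bytes."""
    header = stream.read(16)
    if len(header) < 16:
        return None  # connection closed, no further answer
    code = int(header[0:3].decode("ascii"))
    length = int(header[4:15].decode("ascii"))
    return code, stream.read(length).decode("utf-8")


# Stand-in for sock.makefile("rb") on a real KeepAlive connection.
stream = io.BytesIO(
    frame(200, "myserver123\nmyserver234\n")
    + frame(200, "myserver123;Memory;1511261122\n")
)
first = read_response(stream)
second = read_response(stream)
print(first, second)
```

With a real socket, reading exactly the announced number of bytes also avoids waiting for data that never arrives.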
14. Log retrieval
14.1. Overview
With the table log
in Livestatus you have a direct access to the core’s monitoring history,
so that using the LQL you can conveniently filter for particular events.
The availability tables, for example, will be generated with the help of these tables.
In order to enhance the overview and to restrict a query thematically, you have access
to the following log classes:
Class | Description |
---|---|
0 | All messages not covered by other classes |
1 | Host and service alerts |
2 | Important program events |
3 | Notifications |
4 | Passive Checks |
5 | External commands |
6 | Initial or current status entries (e.g., after a log rotation) |
7 | Changes in the program’s status |
Just by using these log classes you can already restrict very well which type of entry is shown. The time range covered by the query should additionally be restricted. This is important since otherwise the instance’s complete history will be searched, which can slow the system down considerably due to the flood of information.
A further sensible restriction of the output is the selection of columns (Columns) which are to be shown for an entry. In the example below we search for all notifications that have been logged in the last hour:
OMD[mysite]:~$ lq "GET log\nFilter: class = 3\nFilter: time >= $(($(date +%s)-3600))\nColumns: host_name service_description time state"
myserver123;Memory;1511343365;0
myserver234;CPU load;1511343360;3
myserver123;Memory;1511343338;2
myserver234;CPU load;1511342512;0
Important: Note that shell substitutions such as those used in the example cannot be used in the interactive mode of the input stream, and always restrict the queries to a time range.
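In a script the lower bound of the time range can be calculated before the query is sent. The following is a minimal sketch; the function name log_query and its parameters are hypothetical:

```python
import time


def log_query(log_class, seconds_back, now=None):
    """Build a log query restricted to one class and a recent time range."""
    now = int(time.time()) if now is None else int(now)
    return (
        "GET log\n"
        "Filter: class = %d\n"
        "Filter: time >= %d\n"
        "Columns: host_name service_description time state\n"
    ) % (log_class, now - seconds_back)


# Notifications (class 3) of the last hour, with a fixed time for reproducibility.
print(log_query(3, 3600, now=1511343600), end="")
```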
14.2. Configuring the monitoring history
It is possible to influence the rotation of the files, and their maximum sizes. You can additionally specify how many lines of a file should be read in before Checkmk interrupts. All of this can affect the performance of your queries, depending on the instance’s construction. The following three parameters are available which can be found and customised in Setup > General > Global Settings > Monitoring Core:
Name | Description |
---|---|
History log rotation: Regular interval of rotations | Here it can be defined within which time range the history should be continued in a new file. |
History log rotation: Rotate by size (Limit of the size) | Independently of the time range, here the maximum size of a file is defined. The size represents a compromise between the possible read rate and the possible IOs. |
Maximum number of parsed lines per log file | When the specified number of lines have been read in, reading of the file will stop. This avoids time-outs if for any reason a file becomes very large. |
15. Checking availability
With the statehist table you can query the raw data on the availability of hosts and services, and therefore have access to all of the information used by the interface’s availability display. Always enter a time range, otherwise all available logs will be searched, which can put a heavy load on the system. The following additional specifics also apply:
The following additional specifics also apply:
The time for which a host/service had a particular state can be output as an absolute value in Unix time (seconds), and also relatively, as a proportion of the queried time range.
During times in which a host/service was not monitored the state will be -1.
Checking whether, when and for how long a host/service has been monitored is made possible in Checkmk through the logging of the initial status. Thus you can not only see which status existed at a specific time, but you can also retrace whether it was actually being monitored at that point in time. Important: This logging is also active with a Nagios-Core. Here it can be deactivated however:
log_initial_states=0
The example below shows what a query for the proportional allocation and the absolute times for a particular state looks like. A period of four hours (14400 seconds) has been specified as the time range, and the query is restricted to the availability of a service on a particular host:
OMD[mysite]:~$ lq
GET statehist
Columns: host_name service_description
Filter: time >= 1511421739
Filter: time < 1511436139
Filter: host_name = myserver123
Filter: service_description = Memory
Stats: sum duration_ok
Stats: sum duration_warning
Stats: sum duration_critical
Stats: sum duration_part_ok
Stats: sum duration_part_warning
Stats: sum duration_part_critical
myserver123;Memory;893;0;9299;0.0620139;0;0.645764
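The relationship between the absolute and the relative columns in this output can be checked with a short calculation. It assumes that the proportion is the duration divided by the length of the queried time range, which is consistent with the numbers shown above:

```python
# Time range from the two time filters of the query.
start, end = 1511421739, 1511436139
window = end - start

# Absolute durations in seconds, taken from the sample output.
duration_ok, duration_critical = 893, 9299

part_ok = duration_ok / window
part_critical = duration_critical / window
print(window, round(part_ok, 7), round(part_critical, 6))
```

Multiplying these proportions by 100 gives the percentages as they appear in an availability display.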
How a complete list of the available columns can be retrieved is explained in more detail in the Command reference.
16. Variables in Livestatus
At various locations in the Checkmk interface you can use variables to make context-based assignments. Some of this data is also retrievable via Livestatus. Because these variables must also be resolved, the affected columns are available twice in a table: once as a literal entry, and once with the variable substituted by the appropriate value. An example is the notes_url column, which outputs a URL containing the variable:
OMD[mysite]:~$ lq "GET hosts\nColumns: name notes_url"
myserver123;https://mymonitoring/heute/wiki/doku.php?id=hosts:$HOSTNAME$
If, however, you query the notes_url_expanded column instead, you will receive the macro’s actual value:
OMD[mysite]:~$ lq "GET hosts\nColumns: name notes_url_expanded"
myserver123;https://mymonitoring/heute/wiki/doku.php?id=hosts:myserver123
17. Using Livestatus via a network
17.1. Connections via TCP/IP
To access Livestatus via the network, you can connect the Unix socket of the Livestatus process to a TCP port. This way you can execute scripts on remote machines and collect the data directly where it is to be processed.
While the site is stopped, access via TCP can be enabled with the omd command:
OMD[mysite]:~$ omd config set LIVESTATUS_TCP on
Once the site has been started, Livestatus via TCP is usually active on the default port 6557. For Checkmk servers with multiple sites using Livestatus via TCP, the next higher unused port is chosen.
All settings such as port and authorized IP addresses can be configured via omd config.
Alternatively these settings can be made in the setup.
In Checkmk the SSL encryption of Livestatus communication is enabled by default:
Local SSL connection test
Livestatus uses a certificate that is automatically generated when the site is created. This certificate is located in the var/ssl/ca-certificates.crt file together with all other CA certificates trusted by the site. In order for the command line tool openssl s_client to be able to validate the certificate used by the Livestatus server, this file must be designated as the Certificate Authority file. We have heavily shortened the output of the command here; […] marks the omissions:
OMD[mysite]:~$ openssl s_client -CAfile var/ssl/ca-certificates.crt -connect localhost:6557
CONNECTED(00000003)
Can't use SSL_get_servername
depth=1 CN = Site 'mysite' local CA
verify return:1
depth=0 CN = mysite
verify return:1
---
Certificate chain
0 s:CN = mysite
i:CN = Site 'mysite' local CA
1 s:CN = Site 'mysite' local CA
i:CN = Site 'mysite' local CA
---
Server certificate
[...]
Start Time: 1664965470
Timeout : 7200 (sec)
Verify return code: 0 (ok)
Extended master secret: no
Max Early Data: 0
---
read R BLOCK
As soon as there is no further output, you can issue LQL commands interactively, and terminate the interaction with an empty line (press the return key twice).
If this works, you can also pipe Livestatus queries, and use the additional -quiet parameter to suppress debugging output:
OMD[mysite]:~$ echo -e "GET hosts\nColumns: name\n\n" | \
openssl s_client -quiet -CAfile var/ssl/ca-certificates.crt -connect localhost:6557
Can't use SSL_get_servername
depth=1 CN = Site 'mysite' local CA
verify return:1
depth=0 CN = mysite
verify return:1
myserver23
myserver42
myserver123
myserver124
The output preceding the four host names is written to STDERR by the openssl command.
It can be suppressed by appending 2>/dev/null.
Remote access to Livestatus
If you access Livestatus from remote machines, you should not use the entire list of certificates trusted by the Checkmk site on those machines. Instead, take only the site CA's certificate from the setup.
To do this, go to Global Settings > Site management > Trusted certificate authorities for SSL.
Here you can copy and paste the certificate used by the site CA.
Copy the complete text of the first certificate under Content of CRT/PEM file into a file; in our example we use /tmp/mysite_ca.pem.
If the remote host has now been enabled for Livestatus access, Livestatus queries via script will be possible with this certificate file:
user@host:~$ echo -e "GET hosts\nColumns: name\n\n" | \
openssl s_client -quiet -CAfile /tmp/mysite_ca.pem -connect cmkserver:6557
Note: The certificate file does not provide authentication; it only ensures transport encryption! Access protection is regulated exclusively via the IP addresses that are authorized to access the Livestatus port.
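For scripted access, the same connection can be sketched with Python's standard ssl module. This is only a minimal sketch and not the Checkmk Python API: the function names make_livestatus_context and query_livestatus_tls are hypothetical. Disabling the host name check is an assumption, made because the site certificate in the output above carries the site name (CN = mysite) rather than the server's host name. The certificate chain is still verified against the CA file.

```python
import socket
import ssl
from typing import Optional


def make_livestatus_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context that verifies the server chain against the site CA."""
    # With cafile set, only the certificates in that file are trusted.
    context = ssl.create_default_context(cafile=cafile)
    # Assumption: the certificate names the site, not the host, so only the
    # chain is verified and the host name check is switched off.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def query_livestatus_tls(host: str, port: int, cafile: str, query: str) -> str:
    """Send one LQL query over TLS and return the raw response text."""
    context = make_livestatus_context(cafile)
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        with context.wrap_socket(raw_sock) as tls_sock:
            # A blank line terminates the query.
            tls_sock.sendall(query.rstrip("\n").encode("utf-8") + b"\n\n")
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:  # server closed the connection after answering
                    break
                chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

With the example values from this section, a call could look like query_livestatus_tls("cmkserver", 6557, "/tmp/mysite_ca.pem", "GET hosts\nColumns: name").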
Livestatus with stunnel
If you want to make the encrypted remote Livestatus port available as a local unencrypted port, you can use the program stunnel, for example with the following configuration:
[pinning client]
client = yes
accept = 0.0.0.0:6557
connect = <myremotesiteip>:6557
verifyPeer = yes
CAfile = /etc/stunnel/myremotesite.pem
After restarting stunnel, unencrypted access to the local port is possible:
user@host:~$ echo -e "GET hosts\nColumns: name\n\n" | nc localhost 6557
SSL in scripts
If you want to use scripts to access Livestatus via SSL, avoid using openssl s_client.
The primary purpose of this tool is to test connection establishment and to debug certificate chains.
To see if the expected output is complete in the event of connection failures, we recommend evaluating the response header.
A well-maintained API that supports SSL and header evaluation is the one for Python, which can be found at share/doc/check_mk/livestatus/api/python.
Other suitable APIs are listed in the chapter covering Files and Directories.
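Evaluating the response header can be sketched as follows. The sketch assumes that the query carries the header ResponseHeader: fixed16, in which case Livestatus prefixes the response with exactly 16 bytes: a three-digit status code, a space, the body length padded to 11 characters, and a newline. The helper names parse_fixed16_header and read_checked_response are hypothetical and only illustrate the idea.

```python
from typing import BinaryIO, Tuple


def parse_fixed16_header(header: bytes) -> Tuple[int, int]:
    """Split a 16-byte response header into (status code, body length)."""
    if len(header) != 16 or not header.endswith(b"\n"):
        raise ValueError(f"incomplete response header: {header!r}")
    status = int(header[0:3].decode("ascii"))
    length = int(header[4:15].decode("ascii").strip())
    return status, length


def read_checked_response(stream: BinaryIO) -> bytes:
    """Read header and body from a binary stream and check both."""
    status, length = parse_fixed16_header(stream.read(16))
    body = stream.read(length)
    if len(body) != length:
        # The connection broke off before the announced length was reached.
        raise ConnectionError(f"expected {length} bytes, received {len(body)}")
    if status != 200:
        # For non-200 codes the body contains the error message.
        raise RuntimeError(f"Livestatus returned status {status}: {body!r}")
    return body
```

Any binary file-like object works as the stream, for example the result of calling makefile("rb") on a connected socket. Because the announced length is compared with the bytes actually received, a truncated response is detected instead of being silently processed.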
17.2. Connections via SSH
If access to Livestatus from outside your local network is required, access protection based on IP addresses alone may not be practical. The easiest way to gain authenticated access here is to use the Secure Shell.
With SSH, you have the ability to pass a command that will be executed on the remote server:
user@host:~$ ssh mysite@myserver 'lq "GET hosts\nColumns: name"'
myserver123
myserver234
Alternatively, you can forward the Livestatus port to the host you are currently working on via an SSH tunnel:
user@host:~$ ssh -L 6557:localhost:6557 mysite@myserver
If the connection has been established, in a second console session you can test whether access with openssl s_client is possible:
user@host:~$ openssl s_client -CAfile /tmp/mysite_ca.pem -connect localhost:6557
If this test is successful, any script you have written for direct Livestatus network access can be used on localhost.
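The first variant, passing an lq call to the remote server, can also be driven from a script. The following is a minimal sketch that assumes non-interactive (key-based) SSH login as the site user. The function names ssh_lq_argv and run_remote_query are hypothetical.

```python
import shlex
import subprocess
from typing import List


def ssh_lq_argv(site: str, server: str, query: str) -> List[str]:
    """Build the argument list for running lq on the remote site via ssh."""
    # The remote shell parses the command string, so the query is quoted for it.
    remote_command = "lq " + shlex.quote(query)
    return ["ssh", f"{site}@{server}", remote_command]


def run_remote_query(site: str, server: str, query: str) -> List[str]:
    """Run the query remotely and return the response lines."""
    result = subprocess.run(
        ssh_lq_argv(site, server, query),
        capture_output=True,
        text=True,
        check=True,  # raise if ssh or lq exits with an error
        timeout=30,
    )
    return result.stdout.splitlines()
```

As in the shell example, the query is passed with a literal \n between command and header, for instance run_remote_query("mysite", "myserver", r"GET hosts\nColumns: name").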
18. Setting commands
18.1. Overview
Livestatus can be used not only for data queries, but also for issuing commands directly to the core (CMC or Nagios).
A correct command always includes a time stamp. In principle this can be any value.
However, because the time stamp is also used in the logs to track when the command was processed, it is sensible to enter the time as precisely as possible.
Commands with a missing time stamp are discarded without an error message, leaving only a simple entry in the cmc.log!
To keep the time stamp as precise as possible, it is recommended not to enter the command in the interactive input stream, but to issue it directly on the command line. There you also have access to variables, so the actual current time can be provided:
OMD[mysite]:~$ lq "COMMAND [$(date +%s)] DISABLE_NOTIFICATIONS"
This format works both with the Nagios-Core in the Checkmk Raw Edition and with the CMC in the Checkmk Enterprise Editions. However, the commands of the two cores only partly overlap. A complete list of the commands for the Nagios-Core can be found directly on the Nagios website. The commands available for the CMC can be found in the Command reference.
18.2. Special features in Nagios
In the Nagios list of commands, the syntax is given in the following form:
#!/bin/sh
# This is a sample shell script showing how you can submit the CHANGE_CUSTOM_HOST_VAR command
# to Nagios. Adjust variables to fit your environment as necessary.
now=`date +%s`
commandfile='/usr/local/nagios/var/rw/nagios.cmd'
/bin/printf "[%lu] CHANGE_CUSTOM_HOST_VAR;host1;_SOMEVAR;SOMEVALUE\n" $now > $commandfile
As you have learned, Checkmk uses a much simpler format for issuing commands. To make the Nagios format compatible with Checkmk, you simply need the command, the time stamp, and where applicable, the variables:
OMD[mysite]:~$ lq "COMMAND [$(date +%s)] CHANGE_CUSTOM_HOST_VAR;host1;_SOMEVAR;SOMEVALUE"
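In a script, the command line can be assembled in the same way. The following sketch builds the string in the format shown above, taking the time stamp at the moment the command is created. The function name build_command is hypothetical.

```python
import time
from typing import Optional


def build_command(name: str, *args: str, timestamp: Optional[int] = None) -> str:
    """Return a Livestatus command line with time stamp and ';'-joined arguments."""
    if timestamp is None:
        # Take the current time as late as possible, just before sending.
        timestamp = int(time.time())
    parts = ";".join((name,) + args)
    return f"COMMAND [{timestamp}] {parts}\n"
```

For example, build_command("CHANGE_CUSTOM_HOST_VAR", "host1", "_SOMEVAR", "SOMEVALUE") yields the same line as the lq call above, terminated by a newline and ready to be written to the Livestatus socket.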
19. Files and directories
Path | Function |
---|---|
| The Unix-Socket through which queries and commands are submitted. |
| Script command that simplifies issuing queries and commands to the Livestatus Unix-Socket. |
| The CMC’s log file, in which along with other data the queries/commands are documented. |
| The CMC’s log file, in which all changes occurring during the core’s running time are entered – e.g., changes in the state of a host/service. |
| The |
| The Nagios-Core’s log file, in which along with other data the queries/commands are documented. |
| The |
| In this directory a number of examples of Livestatus queries can be found which you can try out. The examples are based on the |
| The API for Python is in this directory, as well as a number of examples. Also read the |
| The API for Perl can be found here. Here as well there is a |
| There are also example codes for the C++ programming language. The code for the API itself is likewise located in an uncompiled form here, so that you have the best insight into the API’s functionality. |