cmk.agent_based.v1

all_of(spec_0, spec_1, *specs)

Detect the device if all passed specifications are met

Parameters:

spec_0 (SNMPDetectSpecification) – A valid specification for SNMP device detection
spec_1 (SNMPDetectSpecification) – A valid specification for SNMP device detection

Return type:

SNMPDetectSpecification

Returns:

A valid specification for SNMP device detection

Example

>>> DETECT = all_of(exists("1.2.3.4"), contains("1.2.3.5", "foo"))

any_of(*specs)

Detect the device if any of the passed specifications are met

Parameters: specs (SNMPDetectSpecification) – Valid specifications for SNMP device detection
Return type: SNMPDetectSpecification
Returns: A valid specification for SNMP device detection

Example

>>> DETECT = any_of(exists("1.2.3.4"), exists("1.2.3.5"))

exists(oidstr)

Detect the device if the OID exists at all

Parameters: oidstr (str) – The OID that is required to exist
Return type: SNMPDetectSpecification
Returns: A valid specification for SNMP device detection

Example

>>> DETECT = exists("1.2.3")

equals(oidstr, value)

Detect the device if the value of the OID equals the given string

Parameters:

oidstr (str) – The OID to match the value against
value (str) – The expected value of the OID

Return type:

SNMPDetectSpecification

Returns:

A valid specification for SNMP device detection

Example

>>> DETECT = equals("1.2.3", "MySwitch")

startswith(oidstr, value)

Detect the device if the value of the OID starts with the given string

Parameters:

oidstr (str) – The OID to match the value against
value (str) – The expected start of the OID's value

Return type:

SNMPDetectSpecification

Returns:

A valid specification for SNMP device detection

Example

>>> DETECT = startswith("1.2.3", "Sol")

endswith(oidstr, value)

Detect the device if the value of the OID ends with the given string

Parameters:

oidstr (str) – The OID to match the value against
value (str) – The expected end of the OID's value

Return type:

SNMPDetectSpecification

Returns:

A valid specification for SNMP device detection

Example

>>> DETECT = endswith("1.2.3", "nix")

contains(oidstr, value)

Detect the device if the value of the OID contains the given string

Parameters:

oidstr (str) – The OID to match the value against
value (str) – The substring expected to be in the OID's value

Return type:

SNMPDetectSpecification

Returns:

A valid specification for SNMP device detection

Example

>>> DETECT = contains("1.2.3", "isco")

matches(oidstr, value)

Detect the device if the value of the OID matches the expression

Parameters:

oidstr (str) – The OID to match the value against
value (str) – The regular expression that the value of the OID should match

Return type:

SNMPDetectSpecification

Returns:

A valid specification for SNMP device detection

Example

>>> DETECT = matches("1.2.3.4", ".* Server")

not_exists(oidstr)

The negation of exists()

Return type: SNMPDetectSpecification

not_equals(oidstr, value)

The negation of equals()

Return type: SNMPDetectSpecification

not_contains(oidstr, value)

The negation of contains()

Return type: SNMPDetectSpecification

not_endswith(oidstr, value)

The negation of endswith()

Return type: SNMPDetectSpecification

not_matches(oidstr, value)

The negation of matches()

Return type: SNMPDetectSpecification

not_startswith(oidstr, value)

The negation of startswith()

Return type: SNMPDetectSpecification
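The combinators and predicates above can be pictured as functions over a snapshot of OID values. The following is a toy model only, not Checkmk's implementation: `Spec` and all `toy_*` names are hypothetical, the OID values are invented, and the comparisons are plain case-sensitive Python operations, which may differ from how a real specification is evaluated. Only the negation of exists() is modelled, because the text above does not say how the other negated forms treat a missing OID.

```python
import re
from typing import Callable, Mapping

# Hypothetical stand-in: a "specification" is a predicate over fetched OID values.
Spec = Callable[[Mapping[str, str]], bool]


def toy_exists(oid: str) -> Spec:
    return lambda values: oid in values


def toy_not_exists(oid: str) -> Spec:
    return lambda values: oid not in values


def toy_contains(oid: str, text: str) -> Spec:
    return lambda values: oid in values and text in values[oid]


def toy_matches(oid: str, pattern: str) -> Spec:
    # Assumption: the whole value has to match the expression.
    return lambda values: oid in values and re.fullmatch(pattern, values[oid]) is not None


def toy_all_of(*specs: Spec) -> Spec:
    return lambda values: all(spec(values) for spec in specs)


def toy_any_of(*specs: Spec) -> Spec:
    return lambda values: any(spec(values) for spec in specs)


# Invented sample data: OID -> value as reported by a device.
device = {"1.2.3.4": "Acme Server", "1.2.3.5": "foobar"}

detect = toy_all_of(toy_exists("1.2.3.4"), toy_contains("1.2.3.5", "foo"))
```

Calling `detect(device)` evaluates every nested predicate against the same snapshot, which is why the combinators can be nested freely.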

class Attributes

Bases: _AttributesTuple

Attributes to be written at a node in the HW/SW Inventory

check_levels(value, *, levels_upper=None, levels_lower=None, metric_name=None, render_func=None, label=None, boundaries=None, notice_only=False)

Generic function for checking a value against levels.

Parameters:

value (float) – The currently measured value
levels_upper (tuple[float, float] | None) – A pair of upper thresholds, i.e. warn and crit. If value is larger than these, the service goes to WARN or CRIT, respectively.
levels_lower (tuple[float, float] | None) – A pair of lower thresholds, i.e. warn and crit. If value is smaller than these, the service goes to WARN or CRIT, respectively.
metric_name (str | None) – The name of the datasource in the RRD that corresponds to this value or None in order not to generate a metric.
render_func (Callable[[float], str] | None) – A single argument function to convert the value from float into a human readable string.
label (str | None) – The label to prepend to the output.
boundaries (tuple[float | None, float | None] | None) – Minimum and maximum to add to the metric.
notice_only (bool) – Only show up in service output if not OK (otherwise in details). See notice keyword of Result class.

Return type:

Generator[Result | Metric, None, None]

Example

>>> result, metric = check_levels(
...     23.0,
...     levels_upper=(12., 42.),
...     metric_name="temperature",
...     label="Fridge",
...     render_func=lambda v: "%.1f°" % v,
... )
>>> print(result.summary)
Fridge: 23.0° (warn/crit at 12.0°/42.0°)
>>> print(metric)
Metric('temperature', 23.0, levels=(12.0, 42.0))
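How a value is mapped to a state by a pair of thresholds can be sketched in a few lines. This is an illustration under assumptions, not the code behind check_levels(): `toy_state` is a hypothetical name, states are plain strings, and treating a value that is exactly equal to an upper threshold as a violation is an assumption (the description above only says "larger than").

```python
from typing import Optional, Tuple


def toy_state(
    value: float,
    levels_upper: Optional[Tuple[float, float]] = None,
    levels_lower: Optional[Tuple[float, float]] = None,
) -> str:
    """Return "OK", "WARN" or "CRIT" for a value and optional (warn, crit) pairs."""
    if levels_upper is not None:
        warn, crit = levels_upper
        if value >= crit:  # assumption: reaching the threshold counts
            return "CRIT"
        if value >= warn:
            return "WARN"
    if levels_lower is not None:
        warn, crit = levels_lower
        if value < crit:
            return "CRIT"
        if value < warn:
            return "WARN"
    return "OK"


# Same numbers as in the example above: 23.0 lies between warn (12) and crit (42).
fridge_state = toy_state(23.0, levels_upper=(12.0, 42.0))
```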

check_levels_predictive(value, *, levels, metric_name, render_func=None, label=None, boundaries=None)

Generic function for checking a value against levels.

Parameters:

value (float) – Currently measured value
levels (dict[str, object]) – Predictive levels. These are used automatically. Lower levels are imposed if the passed dictionary contains “levels_lower” as key, upper levels are imposed if it contains “levels_upper”. If value is lower/higher than these, the service goes to WARN or CRIT, respectively.
metric_name (str) – Name of the datasource in the RRD that corresponds to this value
render_func (Callable[[float], str] | None) – Single argument function to convert the value from float into a human readable string.
label (str | None) – Label to prepend to the output.
boundaries (tuple[float | None, float | None] | None) – Minimum and maximum to add to the metric.

Return type:

Generator[Result | Metric, None, None]

get_average(value_store, key, time, value, backlog_minutes)

Return new average based on current value and last average

Parameters:

value_store (MutableMapping[str, Any]) – The Mapping that holds the last value. Usually this will be the value store provided by the API.
key (str) – Unique ID for storing this average until the next check
time (float) – Timestamp of new value
value (float) – The new value
backlog_minutes (float) – Averaging horizon in minutes

This function returns the new average value aₙ as the weighted sum of the current value xₙ and the last average:

aₙ = (1 - w)xₙ + waₙ₋₁

= (1-w) ∑ᵢ₌₀ⁿ wⁱxₙ₋ᵢ

This results in a so called “exponential moving average”.

The weight is chosen such that for long running timeseries the “backlog” (all recorded values in the last n minutes) will make up 50% of the weighted average.

Assuming k values in the backlog, compute their combined weight such that they sum up to the backlog weight b (0.5 in our case):

b = (1-w) ∑ᵢ₌₀ᵏ⁻¹ wⁱ => w = (1 - b) ** (1/k) (“geometric sum”)

For shorter timeseries we give the backlog more than those 50% weight with the advantages that

the initial value becomes irrelevant, and

for beginning timeseries we reach a meaningful value more quickly.

Return type: float
Returns: The computed average
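The relation between the weight w and the backlog weight b can be checked numerically. The sketch below takes the number of backlog values k as given; how k is derived from backlog_minutes and the timestamps is not covered here, and the function names are hypothetical.

```python
def backlog_weight(k: int, b: float = 0.5) -> float:
    """Weight w such that the k most recent values carry the share b of the average."""
    return (1.0 - b) ** (1.0 / k)


def backlog_share(w: float, k: int) -> float:
    """Combined weight of the k most recent values: (1 - w) * sum of w**i, i = 0 .. k-1."""
    return (1.0 - w) * sum(w**i for i in range(k))


def update_average(last_average: float, value: float, w: float) -> float:
    """One step of the exponential moving average: (1 - w) * x_n + w * a_(n-1)."""
    return (1.0 - w) * value + w * last_average


k = 10
w = backlog_weight(k)
share = backlog_share(w, k)  # comes out at b = 0.5, up to floating point error
```

The geometric sum collapses to 1 - w**k, so choosing w = (1 - b) ** (1/k) gives exactly the share b for the last k values.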

get_rate(value_store, key, time, value, *, raise_overflow=False)

Update value store.
Calculate rate based on current value and time and last value and time

Parameters:

value_store (MutableMapping[str, Any]) – The mapping that holds the last value. Usually this will be the value store provided by the API's get_value_store().
key (str) – Unique ID for storing the time/value pair until the next check
time (float) – Timestamp of new value
value (float) – The new value
raise_overflow (bool) – Raise a GetRateError if the rate is negative

This function returns the rate of a measurement rₙ as the quotient of the differences between the value and time provided to the current function call (xₙ, tₙ) and the value and time provided to the previous function call (xₙ₋₁, tₙ₋₁):

rₙ = (xₙ - xₙ₋₁) / (tₙ - tₙ₋₁)

Note that the function simply computes the quotient of the values and times given, regardless of any unit. You might as well pass something different than the time. However, this function is written with the use case of passing timestamps in mind.

A GetRateError will be raised if one of the following happens:

the function is called for the first time

the time has not changed

the rate is negative and raise_overflow is set to True (useful for instance when dealing with counters)

In general there is no need to catch a GetRateError, as it inherits from IgnoreResultsError.

Example

>>> # in practice: my_store = get_value_store()
>>> my_store = {}
>>> try:
...     rate = get_rate(my_store, 'my_rate', 10, 23)
... except GetRateError:
...     pass  # this fails the first time, because my_store is empty.
>>> my_store  # now remembers the last time/value
{'my_rate': (10, 23)}
>>> # Assume in the next check cycle (60 seconds later) the value has increased to 56.
>>> # get_rate uses the new and old values to compute (56 - 23) / (70 - 10)
>>> get_rate(my_store, 'my_rate', 70, 56)
0.55

Return type: float
Returns: The computed rate
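The behaviour described above (store the new pair, then either return the rate or raise) can be sketched with a plain dictionary. This is a hedged model, not the actual get_rate(): `ToyRateError` and `toy_get_rate` are hypothetical stand-ins, and storing the new pair before any error is raised is an assumption that merely agrees with the example above.

```python
from typing import Any, MutableMapping


class ToyRateError(RuntimeError):
    """Hypothetical stand-in for GetRateError."""


def toy_get_rate(
    store: MutableMapping[str, Any],
    key: str,
    time: float,
    value: float,
    *,
    raise_overflow: bool = False,
) -> float:
    last = store.get(key)
    store[key] = (time, value)  # assumption: remember the new pair first
    if last is None:
        raise ToyRateError("called for the first time")
    last_time, last_value = last
    if time == last_time:
        raise ToyRateError("the time has not changed")
    rate = (value - last_value) / (time - last_time)
    if raise_overflow and rate < 0:
        raise ToyRateError("the rate is negative")
    return rate
```

With `raise_overflow=False` a negative rate is simply returned, which suits gauges; for counters that wrap around, raising avoids reporting a misleading negative rate.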

get_value_store()

Get the value store for the current service from Checkmk

The returned value store object can be used to persist values between different check executions. It is a MutableMapping, so it can be used just like a dictionary.

Return type: MutableMapping[str, Any]

class HostLabel(name: str, value: str)

Bases: _KV

Representing a host label in Checkmk

This class creates a host label that can be yielded by a host_label_function as registered with the section.

>>> my_label = HostLabel("my_key", "my_value")

class IgnoreResults(value='currently no results')

Bases: object

A result to make the service go stale, but carry on with the check function

Yielding a result of type IgnoreResults will have a similar effect as raising an IgnoreResultsError, with the difference that the execution of the check function will not be interrupted.

yield IgnoreResults("Good luck next time!")
return

is equivalent to

raise IgnoreResultsError("Good luck next time!")

This is useful, for instance, if you want to initialize all counters before returning.

exception IgnoreResultsError

Bases: RuntimeError

Raising an IgnoreResultsError from within a check function makes the service go stale.

Example

>>> def check_db_table(item, section):
...     if item not in section:
...         # avoid a lot of UNKNOWN services:
...         raise IgnoreResultsError("Login to database failed")
...     # do your work here
>>>

class Metric

Bases: _MetricTuple

Create a metric for a service

Parameters:

name – The name of the metric.
value – The measured value.
levels – A pair of upper levels, i.e. warn and crit. This information is only used for visualization by the graphing system. It does not affect the service state.
boundaries – Additional information on the value domain for the graphing system.

If you create a Metric in this way, you may want to consider using check_levels().

Example

>>> my_metric = Metric("used_slots_percent", 23.0, levels=(80, 90), boundaries=(0, 100))

class OIDBytes(value: str)

Bases: _OIDSpecTuple

Class to indicate that the OID's value should be provided as a list of integers

Parameters: oid – The OID to fetch

Example

>>> _ = OIDBytes("2.1")

class OIDCached(value: str)

Bases: _OIDSpecTuple

Class to indicate that the OID's value should be cached

Parameters: oid – The OID to fetch

Example

>>> _ = OIDCached("2.1")

class OIDEnd

Bases: _OIDSpecTuple

Class to indicate the end of the OID string should be provided

When specifying an OID in an SNMPTree object, the parse function will be handed the corresponding value of that OID. If you use OIDEnd() instead, the parse function will be given the trailing portion of the OID (the part that you do not already know).
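What "the trailing portion of the OID" means can be shown with simple string handling. The helper below is hypothetical and only illustrates the idea; which prefix counts as already known in a real SNMPTree is an assumption here (a prefix supplied by the caller), and the OIDs are invented.

```python
def toy_oid_end(full_oid: str, known_prefix: str) -> str:
    """Return the part of full_oid that follows known_prefix."""
    prefix = known_prefix.rstrip(".") + "."
    if not full_oid.startswith(prefix):
        raise ValueError(f"{full_oid!r} does not start with {known_prefix!r}")
    return full_oid[len(prefix):]


# Invented example: the table index is the part that is not known in advance.
index = toy_oid_end(".1.2.3.4.5.6.7.8.17", ".1.2.3.4.5.6.7.8")
```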

regex(pattern, flags=0)

Cache compiled regexes.

For compatibility, this is part of the API. Note that there are two other ways to achieve better performance when dealing with regexes using the Python standard library's re module.

One option is to compile regexes using re.compile() and store them in a global constant.

The other is not to compile the patterns explicitly and to use the module-scope match functions like re.match(".*", "foobar"). That way, re will deal with memoizing.

Return type: Pattern[str]
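The two alternatives mentioned above use only the standard library and might look as follows. The patterns and function names are made up for illustration.

```python
import re
from typing import Optional, Tuple

# Option 1: compile once at import time and keep the pattern in a global constant.
_VERSION = re.compile(r"(\d+)\.(\d+)")


def parse_version(text: str) -> Optional[Tuple[int, int]]:
    match = _VERSION.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Option 2: use the module-level functions and let re cache the compiled pattern.
def looks_like_server(text: str) -> bool:
    return re.match(r".* Server", text) is not None
```

Option 1 makes the cost of compilation explicit and keeps the pattern in one place; option 2 is shorter and usually sufficient when only a handful of patterns are in use.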

class Result(*, state: State, summary: str, details: str | None = None)

class Result(*, state: State, notice: str, details: str | None = None)

Bases: _ResultTuple

A result to be yielded by check functions

This is the class responsible for creating service output and setting the state of a service.

Parameters:

state – The resulting state of the service.
summary – The text to be displayed in the service's summary view.
notice – A text that will only be shown in the summary if state is not OK.
details – The alternative text that will be displayed in the details view. Defaults to the value of summary or notice.

Note

You must specify exactly one of the arguments summary and notice!

When yielding more than one result, Checkmk will not only aggregate the texts, but also compute the worst state for the service and highlight the individual non-OK states in the output. You should always match the state to the output, and yield subresults:

>>> def my_check_function() -> None:
...     # the back end will compute the worst overall state:
...     yield Result(state=State.CRIT, summary="All the foos are broken")
...     yield Result(state=State.OK, summary="All the bars are fine")
>>>
>>> # run function to make sure we have a working example
>>> _ = list(my_check_function())

The notice keyword has the special property that it will only be displayed in the summary if the state passed to _this_ Result instance is not OK. Otherwise we assume it is sufficient to show the information in the details view:

>>> def my_check_function() -> None:
...     count = 23
...     yield Result(
...         state=State.WARN if count <= 42 else State.OK,
...         notice=f"Things: {count}",  # only appear in summary if count drops below 43
...         details=f"We currently have this many things: {count}",
...     )
>>>
>>> # run function to make sure we have a working example
>>> _ = list(my_check_function())

If you find yourself computing the state by comparing a metric to some thresholds, you probably should be using check_levels()!

class Service(*, item: str | None = None, parameters: Mapping[str, object] | None = None, labels: Sequence[ServiceLabel] | None = None)

Bases: _ServiceTuple

Class representing services that the discover function yields

Parameters:

item – The item of the service
parameters – The determined discovery parameters for this service
labels – A list of labels attached to this service

Example

>>> my_drive_service = Service(
...    item="disc_name",
...    parameters={},
... )

class ServiceLabel(name: str, value: str)

Bases: _KV

Representing a service label in Checkmk

This class creates a service label that can be passed to a ‘Service’ object. It can be used in the discovery function to create a new label like this:

>>> my_label = ServiceLabel("my_key", "my_value")

class SNMPTree(base: str, oids: Sequence[str | _OIDSpecTuple])

Bases: _SNMPTreeTuple

Specify an OID table to fetch

For every SNMPTree that is specified, the parse function will be handed a list of lists with the values of the corresponding OIDs.

Parameters:

base – The OID base string, starting with a dot.
oids – A list of OID specifications.

Example

>>> _ = SNMPTree(
...     base=".1.2.3.4.5.6",
...     oids=[
...         OIDEnd(),  # I want the end oids of every entry
...         "7.8",  # just a regular entry
...         OIDCached("123"),  # this is HUGE, please cache it
...         OIDBytes("42"),  # I expect bytes, give me a list of integers
...     ],
... )
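A parse function for the tree above would receive one row per table entry and one column per entry in oids, in the same order. The sketch below shows one way to consume such a table; the sample rows, `toy_parse`, and the interpretation of the byte column as a MAC-like address are all invented for illustration.

```python
from typing import Dict, List, Sequence, Union

Row = Sequence[Union[str, List[int]]]

# Invented sample data, columns in the order of `oids` above:
# OIDEnd(), "7.8", OIDCached("123"), OIDBytes("42")
sample_table: List[Row] = [
    ["1", "eth0", "big blob A", [0, 17, 34]],
    ["2", "eth1", "big blob B", [0, 17, 35]],
]


def toy_parse(string_table: Sequence[Row]) -> Dict[str, Dict[str, object]]:
    parsed: Dict[str, Dict[str, object]] = {}
    for oid_end, name, cached, raw_bytes in string_table:
        parsed[str(name)] = {
            "index": int(str(oid_end)),  # the trailing part of the OID
            "payload": cached,
            # the OIDBytes column arrives as a list of integers
            "address": ":".join(f"{b:02x}" for b in raw_bytes),
        }
    return parsed


section = toy_parse(sample_table)
```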

class State(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

States of check results

classmethod best(*args)

Returns the best of all passed states

You can pass an arbitrary number of arguments, and the return value will be the “best” of them, where

OK -> WARN -> UNKNOWN -> CRIT

Parameters: args (State | int) – Any number of one of State.OK, State.WARN, State.CRIT, State.UNKNOWN
Return type: State
Returns: The best of the input states, one of State.OK, State.WARN, State.CRIT, State.UNKNOWN.

Examples

>>> State.best(State.OK, State.WARN, State.CRIT, State.UNKNOWN)
<State.OK: 0>
>>> State.best(0, 1, State.CRIT)
<State.OK: 0>

classmethod worst(*args)

Returns the worst of all passed states.

You can pass an arbitrary number of arguments, and the return value will be the “worst” of them, where

OK < WARN < UNKNOWN < CRIT

Parameters: args (State | int) – Any number of one of State.OK, State.WARN, State.CRIT, State.UNKNOWN
Return type: State
Returns: The worst of the input states, one of State.OK, State.WARN, State.CRIT, State.UNKNOWN.

Examples

>>> State.worst(State.OK, State.WARN, State.CRIT, State.UNKNOWN)
<State.CRIT: 2>
>>> State.worst(0, 1, State.CRIT)
<State.CRIT: 2>
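The ordering OK < WARN < UNKNOWN < CRIT differs from the numeric order of the state values, which is why best() and worst() exist. A stand-in enum makes this visible. `ToyState`, `toy_best` and `toy_worst` are hypothetical names, and the value 3 for UNKNOWN is an assumption based on the usual monitoring convention; the examples above only show the values 0, 1 and 2.

```python
import enum
from typing import Union


class ToyState(enum.Enum):
    OK = 0
    WARN = 1
    CRIT = 2
    UNKNOWN = 3  # assumption: not shown in the examples above


# Severity order as described above, from best to worst.
_SEVERITY = (ToyState.OK, ToyState.WARN, ToyState.UNKNOWN, ToyState.CRIT)


def toy_worst(*states: Union[ToyState, int]) -> ToyState:
    return max((ToyState(s) for s in states), key=_SEVERITY.index)


def toy_best(*states: Union[ToyState, int]) -> ToyState:
    return min((ToyState(s) for s in states), key=_SEVERITY.index)
```

Comparing by position in `_SEVERITY` rather than by numeric value is what makes CRIT (2) rank worse than UNKNOWN (3).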

class TableRow

Bases: _TableRowTuple

TableRow to be written into a Table at a node in the HW/SW Inventory

exception GetRateError

Bases: IgnoreResultsError

The exception raised by get_rate(). If unhandled, this exception will make the service go stale.