1. Introduction

Checkmk includes an extensive module for monitoring Microsoft Azure, consisting of a connector to Azure and a comprehensive collection of check plug-ins that retrieve and evaluate various metrics and statuses for you.

In addition to general information about the costs that are incurred by your Azure environment and the current status of the Azure services in your region, you can monitor the following Azure products with all editions of Checkmk:

With Checkmk Cloud you can also include the following products in your monitoring system:

A complete listing of all available check plug-ins for monitoring Microsoft Azure can be found in our Catalog of check plug-ins. How to include your AKS (Azure Kubernetes Service) clusters in Checkmk is described in the article Monitoring Kubernetes.

2. Preparing Azure for Checkmk

2.1. Creating the app

To monitor Azure with Checkmk, you will need your subscription ID and your tenant ID (also known as the "Directory ID").

First, register Checkmk monitoring as an app so that you can work with the Azure API. The option for this can be found in the Azure portal at Microsoft Entra ID > Manage > App registrations > New registration:

azure register 1

Assign a name of your choice. In the example we use my-check-mk-app. This name is purely informational. The app itself is referenced via a UUID, which you will see in a later step. You don’t need to change anything in the Supported account types section. Setting the Redirect URI is optional.

After creating the app, select it from the list of apps. If it does not appear in the list, switch the selection from My apps to All apps. In the details for the app you will also find the Application (client) ID that you will need later. The Object ID is not required.

azure register 2

2.2. Assigning permissions to the app

For your new app to be able to access the monitoring data, you must now assign the necessary permissions to it. In the main navigation on the left, select All resources, and then select your subscription:

azure subscriptions

In this page’s navigation go to Access Control (IAM), select Add, and then Add role assignment:

azure access control

Now enter Reader under Role, select the Azure AD user, group, or service principal value under Assign access to, and enter your new app’s name in the Select option:

azure role assignment

2.3. Creating a key for the app

Now you need a key (a secret) with which Checkmk can log in to the API. You can create a key in the app settings under Certificates & secrets. Simply click New client secret in the Client secrets section.

azure register 5

In the following window Microsoft would like you to enter a name of your choice in the Description field. We have chosen my-check-mk-key here. Don’t forget to select the correct time frame for your needs at the Expires option.

azure register 6

The setup under Azure is now complete, and you should now have the following four pieces of information:

  1. Your subscription ID

  2. Your tenant ID (also known as the "Directory ID")

  3. The application ID (client ID) for the my-check-mk-app app

  4. The secret for the key my-check-mk-key for this app

If you do not have your tenant ID at hand, hover over your login name. The tooltip that appears shows the tenant ID under Directory:

azure register tenant id

You can find the subscription ID, for example, in Cost Management + Billing under My subscriptions. Note: Microsoft no longer displays this ID as a hash there, but as a human-readable name. You can use this new-style name in the usual way.
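To illustrate what the tenant ID, the application ID and the secret are used for, here is a minimal sketch of the OAuth 2.0 client credentials request that the Microsoft identity platform accepts for the Azure management API. It only builds the request and does not send it. The function name and the placeholder values are assumptions made for this example, and the sketch does not claim to show how the Checkmk special agent is implemented internally.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client credentials token request.

    Hypothetical helper for illustration only; nothing is sent over the network.
    """
    # The tenant ID selects the directory that issues the token.
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,          # Application (client) ID of the app
            "client_secret": client_secret,  # secret value of the key
            "scope": "https://management.azure.com/.default",
        }
    )
    return url, body


# Placeholder values, not real credentials.
url, body = build_token_request(
    tenant_id="00000000-0000-0000-0000-000000000000",
    client_id="11111111-1111-1111-1111-111111111111",
    client_secret="my-secret-value",
)
print(url)
```

The subscription ID is not part of the login itself. It determines which subscription's resources are queried once access has been granted.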

3. Setting up basic monitoring in Checkmk

3.1. Creating a host for Azure

Even though you are not dealing with a physical host in Azure, create a host for your Azure directory in Checkmk. You can choose the host name freely. Important: Because Azure is a service and therefore does not have an IP address or DNS name (the special agent handles the access itself), you must set the IP address family to No IP.

azure wato no ip

It is best to save with Save & Finish at this point, because of course the service discovery cannot work yet.

3.2. Configuring the Azure agent

Since Azure cannot be queried through the regular Checkmk agent, you now set up the Azure special agent, which is also known as a data source program. In this situation Checkmk does not contact the target host over TCP port 6556 as usual. Instead, it calls a utility that communicates with the target system via Azure’s application-specific API.

To do this, under Setup > Agents > VM, Cloud, Container > Microsoft Azure create a rule whose conditions apply exclusively to the Azure host that has just been created. There you will find the input fields for the IDs and the secret:

azure agent rule

Here you can also select the resource groups or resources that you want to monitor. If you have not activated the option for explicitly specified groups, all resource groups are monitored automatically.

3.3. Testing

If you now perform a service discovery on the Azure host, only a single service called Azure Agent Info should be detected:

azure services ok

If access to the API does not work (because of a wrong ID or bad permissions, for example), an error message from the Azure API appears in the status text of Azure Agent Info:

azure services fail

3.4. Making resource groups available as hosts

For clarity, Azure monitoring in Checkmk has been designed so that each Azure resource group is represented by a logical host in Checkmk. This is done with the help of the piggyback mechanism: the data that the special agent retrieves for the Azure host is redirected within Checkmk to these resource group hosts.
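The following minimal sketch shows the idea behind this redirection. In agent output, a line of the form `<<<<hostname>>>>` marks the start of data destined for another host, and `<<<<>>>>` ends that block. The section names and data lines in the sample are invented for this example and are not the real output of the Azure special agent, and the function itself is a simplified illustration rather than Checkmk's own code.

```python
def split_piggyback(agent_output: str, source_host: str) -> dict:
    """Group agent output lines by the host they are destined for."""
    lines_by_host = {}
    target = source_host
    for line in agent_output.splitlines():
        if line.startswith("<<<<") and line.endswith(">>>>"):
            name = line[4:-4]
            # An empty name (<<<<>>>>) switches back to the source host.
            target = name if name else source_host
            continue
        lines_by_host.setdefault(target, []).append(line)
    return lines_by_host


# Invented sample output with hypothetical section names.
sample = "\n".join(
    [
        "<<<example_agent_info>>>",
        "remaining-reads|11998",
        "<<<<Glastonbury>>>>",
        "<<<example_vm>>>",
        "vm-1|running",
        "<<<<>>>>",
        "<<<<Woodstock>>>>",
        "<<<example_storage>>>",
        "store-1|ok",
        "<<<<>>>>",
    ]
)

for host, lines in split_piggyback(sample, "my-azure").items():
    print(host, lines)
```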

The resource group hosts do not automatically appear in Checkmk. Create these hosts either manually or, optionally, with the Dynamic Configuration Daemon (DCD). Important: The names of the hosts must exactly match the names of the resource groups, and the match is case-sensitive. If you are uncertain about the exact spelling of the groups' names, you can read it directly from the Azure Agent Info service on the Azure host.

By the way, with the find_piggy_orphans auxiliary script from the treasures directory you can find all piggyback hosts for which data is available but which have not yet been created as hosts in Checkmk:

OMD[mysite]:~$ share/doc/check_mk/treasures/find_piggy_orphans
Glastonbury
Woodstock
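Because the match between resource group names and host names is case-sensitive, a host that differs only in capitalization will not receive any piggyback data. The following sketch, using a hypothetical helper function and invented names, separates resource groups that have no host at all from those whose host exists only in a different spelling. It is an illustration of the comparison, not a reimplementation of find_piggy_orphans.

```python
def classify_piggyback_names(resource_groups, host_names):
    """Return (missing, case_mismatch) for the given resource group names.

    missing:       resource groups without any matching host
    case_mismatch: resource group -> existing host that differs only in case
    """
    hosts = set(host_names)
    by_lowercase = {host.lower(): host for host in host_names}
    missing = []
    case_mismatch = {}
    for group in resource_groups:
        if group in hosts:
            continue  # exact, case-sensitive match: nothing to do
        if group.lower() in by_lowercase:
            case_mismatch[group] = by_lowercase[group.lower()]
        else:
            missing.append(group)
    return missing, case_mismatch


missing, case_mismatch = classify_piggyback_names(
    ["Glastonbury", "Woodstock", "Roskilde"],
    ["glastonbury", "Roskilde"],
)
print(missing)        # resource groups that still need a host
print(case_mismatch)  # hosts that need to be renamed
```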

Configure the resource group hosts without an IP address (analogous to the Azure host), and select No API integrations, no Checkmk agent as the agent and Always use and expect piggyback data as piggyback.

wato host no agent

If you now perform a service discovery on one of these resource group hosts, you will find there are additional services that specifically relate to this resource group:

azure services piggy
Tip

If you want to freely choose the names of the resource group hosts, you can define a conversion of resource group names to host names with the Setup > Agents > Access to Agents > Hostname translation for piggybacked hosts rule.
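What such a conversion does can be sketched as follows. The parameter names, the order in which the steps are applied and the example names are assumptions made for this illustration. They do not reproduce the actual fields or the exact behavior of the rule.

```python
import re


def translate_host_name(name, case=None, regex_rules=(), explicit=None):
    """Translate a piggyback name (here: a resource group) into a host name.

    Hypothetical illustration: optional case conversion, then the first
    matching regular expression, then an explicit one-to-one mapping.
    """
    if case == "lower":
        name = name.lower()
    elif case == "upper":
        name = name.upper()
    for pattern, template in regex_rules:
        match = re.fullmatch(pattern, name)
        if match:
            name = match.expand(template)
            break  # only the first matching rule is applied
    return (explicit or {}).get(name, name)


# Prefix every resource group with "azure-".
print(translate_host_name("Glastonbury", regex_rules=[(r"(.*)", r"azure-\1")]))
```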

4. Advanced configuration

4.1. Virtual machines (VMs)

If the virtual machines that you monitor via Azure are also set up as normal hosts in Checkmk, you can assign the Azure services associated with those VMs directly to the VM hosts in Checkmk instead of to the resource group hosts.

To do this, in the Azure rule, under the Map data relating to VMs option, select the Map data to the VM itself setting. For this to work the VM’s Checkmk host in monitoring must have exactly the same name as the corresponding VM in Azure.

4.2. Monitoring costs

The Microsoft Azure rule is preset so that Checkmk also monitors all costs incurred in your Azure environment. Specifically, the services display the costs incurred on the previous day. In this way you can quickly determine if there have been any changes.

Several services are created to get a better overview of exactly where costs have been incurred and to be able to set specific thresholds. The total costs at the level of your Azure directory are displayed for the Azure host that you created first. In addition, services are created for each host that represents a resource group. At both levels, Checkmk generates one service for the costs per so-called 'resource provider' (e.g. microsoft.compute and microsoft.network). The Costs Summary service then shows the total sum for the resource group or for the entire Azure directory.
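The two levels of aggregation can be sketched as follows. The record format, the function name and the cost values are invented for this example. The sketch only mirrors the structure described above: one value per resource provider and one summary, both per resource group and for the entire directory.

```python
from collections import defaultdict


def summarize_costs(records):
    """Aggregate (resource_group, resource_provider, cost) records."""
    per_group = defaultdict(lambda: defaultdict(float))
    directory = defaultdict(float)
    for group, provider, cost in records:
        per_group[group][provider] += cost  # level of the resource group host
        directory[provider] += cost         # level of the Azure host
    return {
        "directory": {
            "providers": dict(directory),
            "summary": sum(directory.values()),
        },
        "groups": {
            group: {"providers": dict(costs), "summary": sum(costs.values())}
            for group, costs in per_group.items()
        },
    }


# Invented cost records for the previous day.
records = [
    ("Glastonbury", "microsoft.compute", 1.5),
    ("Glastonbury", "microsoft.network", 0.25),
    ("Woodstock", "microsoft.compute", 2.0),
    ("Glastonbury", "microsoft.compute", 0.5),
]
result = summarize_costs(records)
print(result["directory"])
print(result["groups"]["Glastonbury"])
```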

You can use the Azure Usage Details (Costs) rule to define individual thresholds for all of these services.

If you do not wish to monitor costs, you must deactivate the Usage Details option in the Microsoft Azure rule.

4.3. Rate limit for API queries

Currently, the API queries that Checkmk needs for monitoring Azure are free of charge (unlike those for AWS). However, the number of queries permitted per time period is limited (the "rate limit"). The limit is 12,000 read requests per hour per application ID.

Due to the structure of the API, Checkmk requires at least one query per requested resource. Therefore the total number of queries scales linearly with the number of resources being monitored. If the query limit is reached or exceeded, the query fails with an HTTP status code 429 (Too Many Requests), and the Check_MK service for the Azure host is flagged as CRIT.

This rate limit results from Azure’s so-called "token bucket" algorithm. You start with a "credit" of 12,000 remaining queries, and each query consumes one of these. At the same time, 3.33 queries per second are added back to the credit (12,000 per hour divided by 3,600 seconds). The output of the Azure Agent Info service lets you see how many queries are currently left.

Specifically, this means that:

  • If your query rate is sufficiently low, the available queries are always just under 12,000.

  • If your rate is too high, the credit will slowly go down to 0 and then errors will occur sporadically in the query.
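Both cases can be reproduced with a small simulation of a token bucket that uses the figures given above: a capacity of 12,000 queries and a refill of 12,000 / 3,600 queries per second. The one-second time steps, the constant query rate and the function name are simplifying assumptions made for this sketch. Azure's actual accounting may differ in detail.

```python
def simulate_bucket(queries_per_second, duration_s,
                    capacity=12000.0, refill_per_second=12000 / 3600):
    """Simulate the credit in one-second steps.

    Returns (remaining_credit, failed_queries) after duration_s seconds.
    """
    credit = capacity
    failed = 0.0
    for _ in range(duration_s):
        # Refill first, but never above the capacity of the bucket.
        credit = min(capacity, credit + refill_per_second)
        served = min(queries_per_second, credit)
        failed += queries_per_second - served  # these would receive HTTP 429
        credit -= served
    return credit, failed


# Low rate: the credit stays just under 12,000 and nothing fails.
print(simulate_bucket(1.0, 3600))

# High rate: the credit is used up and queries start to fail.
print(simulate_bucket(10.0, 3600))
```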

In this case you can reduce the query rate by monitoring fewer resource groups or resources, or by increasing the check interval of the Check_MK active check on the Azure host so that it runs less often. This is possible with the Normal check interval for service checks rule.
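As a rough aid for choosing the interval, the following sketch estimates the hourly query volume and the shortest interval that stays within the limit of 12,000 read requests per hour. It assumes exactly one query per resource and per check cycle, which is only the lower bound mentioned above, so real values can be higher. The function names and the example figures are made up for this illustration.

```python
import math


def queries_per_hour(resources, interval_s, queries_per_resource=1):
    """Estimated number of API queries per hour for a given check interval."""
    return resources * queries_per_resource * 3600 / interval_s


def min_interval_seconds(resources, queries_per_resource=1, limit_per_hour=12000):
    """Shortest check interval (in seconds) that does not exceed the limit."""
    return math.ceil(resources * queries_per_resource * 3600 / limit_per_hour)


# 500 resources checked once per minute would exceed the limit ...
print(queries_per_hour(500, 60))
# ... so the interval would have to be at least this long:
print(min_interval_seconds(500))
```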

So that you can react in time, the Azure Agent Info service monitors the number of remaining queries and warns you in advance. By default, for the remaining queries the warning threshold is 50 %, and the critical threshold is at 25 %.
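How the default thresholds translate into a service state can be sketched as follows. The function name is hypothetical, and the sketch assumes that a state is triggered as soon as the remaining share falls below the respective threshold. Whether a value exactly at the threshold already triggers the state is not specified here.

```python
def remaining_reads_state(remaining, limit=12000, warn_pct=50.0, crit_pct=25.0):
    """Map the number of remaining queries to OK, WARN or CRIT."""
    percent_left = 100.0 * remaining / limit
    if percent_left < crit_pct:  # assumption: strictly below the threshold
        return "CRIT"
    if percent_left < warn_pct:
        return "WARN"
    return "OK"


for remaining in (11999, 5000, 2000):
    print(remaining, remaining_reads_state(remaining))
```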

5. Dashboards

For a convenient start into Azure monitoring, Checkmk Cloud includes two built-in dashboards: Azure VM instances and Azure storage accounts. Both can be found as menu items in the monitoring under Monitor > Cloud.

To give a clearer impression, here are two examples of how these dashboards are structured. First, the VM instances dashboard, in which you can see the current state on the left side and the chronological history of the most important metrics on the right side:

Dashboard for the Azure VM instances.

The dashboard for the storage accounts is structured very similarly. On the left-hand side, you will find current data for the respective storage accounts. On the right, the most important metrics are again displayed chronologically:

Dashboard for the Azure storage accounts.