With Checkmk you can monitor ESXi-Hosts and also its VMs. Thus, for example, on a host it is possible to query Disk-IO, datastore performance, the status of physical network interfaces, diverse hardware sensors, and much more. Checkmk likewise offers a series of check plug-ins for the VMs. A comprehensive list of these can be found in the Catalog of check plug-ins in the ‘VMWare ESX’ section.
Using the piggyback technique VM data will be displayed directly in its associated host. Thus the VM related data is found right where it is actually required, and where it can be compared to that reported by the VM’s OS:
Access to this data is achieved via the HTTP-based vSphere-API — not over the normal agents or SNMP. This means that no agent or other software needs to be installed on the ESXi-Hosts, and that the access is very simple to set up. Older systems — from Version 4.1 — are also supported.
2. Setting up
2.1. Setting up via the ESXi host system
The initial setup for monitoring a ESXi-server is very simple, and can be completed in less than five minutes. Before you can set up the access however, the following prerequisites must be satisfied:
You must have defined a user on the ESXi-server. It is sufficient that this user only has read access.
You must have defined the ESXi-server as the host in Checkmk, and configured it as an agent (Checkmk Agent). Tip: Select the host name so that it is the same as that known to the server itself.
Once the prerequisites have have been satisfied you can create a Rule in the Datasource programs > Check state of VMWare ESX via vSphere rule set. This will be assigned to the defined host, so that instead of the standard agent the Special Agent will be used for retrieving data from the VMware-monitoring.
Enter the user’s name and password as they have been defined on the ESXi-Server. The condition for the rule must be set on the host defined in Checkmk. After this the first installation will be complete and Checkmk can retrieve the data from the server.
Finally, go back to the host configuration, and execute a Service discovery. This should find a number of services:
Activate the changes as usual. If no services have been identified, you can search for errors in the configuration with the Diagnostic options, as described later in this article.
2.2. Setting up using vCenter
If a vCenter is available, instead of retrieving the monitoring data via the individual host systems you can also call up the vCenter. This method has various advantages and disadvantages:
Simple application in situations where VMs are assigned dynamically using vMotion.
No monitoring if vCenter is unavailable.
Monitoring of a cluster’s total RAM usage is possible.
No monitoring of hardware-specific data in the cluster’s nodes (e.g., RAM-disks and network cards).
A combination of both methods can also be utilised — then you can have the best of both worlds.
Configuring the vCenter
Similar prerequisites apply for this configuration as for the configuration over a single ESXi-Server:
A user with read access must be present on the vCenter.
The vCenter has been defined as a host and configured as a Checkmk Agent agent in Checkmk
If the ESXi-Servers have already been configured in Checkmk and you wish to combine the monitoring, then in vCenter their names will be the same as they are configured as hosts in Checkmk
As described earlier, create a rule for the VMware-monitoring’s special agent, in Type of Query select the vCenter, and set the condition to the appropriate host as defined in Checkmk:
With this the configuration will be completed. Execute a service discovery for the vCenter-host as described earlier.
Retrieving from ESXi-Hosts and vCenter
In order to avoid duplicated data retrieval when using a combination of both configuration methods, the rule for the vCenter can be configured to retrieve only specific data. One possibility is to access the Datastores and the Virtual Machines over the vCenter, and the other data directly from the ESXi-hosts. The license usage can be fetched in both configurations since the vCenter reports an overall status.
If you have already configured the ESXi-hosts, its rules will be adapted accordingly. Here only access to the Host Systems and Performance Counters is available, since these belong unalterably to a particular ESXi-server. The license status is applicable only to the retrieved ESXi-server.
2.3. Monitoring the VMs
By default, only the status of the VMs as services is created and assigned to the ESXi, or the vCenter respectively. There is however even more information available from these VMs — from RAM, or the Snapshots, for example. This data is stored as piggyback data and assigned directly to the hosts which correspond to the VMs in Checkmk.
In order to make this data visible, the VM must be defined as a host in Checkmk. You can of course install the Checkmk agent on the VM and take full advantage of its functions. The piggyback data will simply be added to that already available.
Naming the piggyback data
If the host name of the VM in Checkmk matches the name of the VM, the assignment works automatically. If not, there are various options in Checkmk to customize the piggyback name. The following options are available in the configuration rule itself:
From Version 1.4.0i1] it is possible to use the VM’s operating system’s host name, if this can be accessed via the vSphere-API
If the VM’s name includes blank characters, the name will be truncated after the first blank. Alternatively, the blanks can be replaced with underscores
If the host’s name is quite different in Checkmk, an explicit allocation can be performed with the help of the Access to agents > Hostname translation for piggybacked hosts rule.
If the host is configured in Checkmk and the names conform, you can activate the Display VM power state on check box in the configurations rule — select if and where the data is to be made available. Select The Virtual Machine here.
With a service discovery on the host(s) the new services will now be identified and can be activated. Be aware that the information from the services could differ from one another. The ESXi-Server will see a virtual machine’s RAM usage differently to how the machine’s own OS reports it.
3. Diagnostic options
When searching for the source of an error there are a number of ‘ports of call’. Since the data comes from the ESXi-/vCenter-Server, this is a logical place to start searching for the error. Later it is important that the the data gets to the Checkmk-Server, and can be correctly processed and displayed there.
3.2. Problems with an ESXi-/vCenter-Server configuration
curl command you can verify whether the server is accessible from
myESXhost.my-domain.net HTTP/1.1 200 OK Date: Fri, 4 Nov 2016 14:29:31 GMT Connection: Keep-Alive Content-Type: text/html X-Frame-Options: DENY Content-Length: 5426curl -Ik
Whether the access data has been entered correctly — and whether Checkmk can access the host — can be tested on the console with the Special-Agent. Use the
to receive a complete list of the available options. In the example, with the aid
grep the output was limited to a specific section and the first four
lines following it — you can omit this in order to receive a complete output,
or filter for another:
share/check_mk/agents/special/agent_vsphere --debug --user myesxuser --secret myesxpassword -D myESXhost | grep -A4 esx_vsphere_objects <<<esx_vsphere_objects:sep(9)>>> hostsystem myESXhost poweredOn hostsystem myESXhost2 poweredOn virtualmachine myVM123 myESXhost poweredOn virtualmachine myVM126 myESXhost poweredOn
Whether Checkmk can access the host can be verified on the console. Here the output is also limited to five lines:
cmk -d myESXhost | grep -A4 esx_vsphere_objects <<<esx_vsphere_objects:sep(9)>>> hostsystem myESXhost poweredOn hostsystem myESXhost2 poweredOn virtualmachine myVM123 myESXhost poweredOn virtualmachine myVM126 myESXhost poweredOn
Alternatively, you can carry out the test on the host’s diagnostic page in WATO:
If everything works up to this point the output should have been saved to a temporary directory. Whether such a file has been produced, and whether the content is correct can be determined with the following:
ll tmp/check_mk/cache/myESXhost -rw-r--r-- 1 mysite mysite 17703 Nov 4 15:42 myESXhost head -n5 tmp/check_mk/cache/myESXhost <<<esx_systeminfo>>> Version: 6.0 AgentOS: VMware ESXi <<<esx_systeminfo>>> vendor VMware, Inc.
3.3. Problems with piggyback data
Checkmk creates a directory containing a text file for each host. In this text file can be found the data which is to be allocated to the hosts.
ll tmp/check_mk/piggyback/ total 0 drwxr-xr-x 2 mysite mysite 60 Nov 4 15:51 myVM123/ drwxr-xr-x 2 mysite mysite 60 Nov 4 15:51 myVM124/ drwxr-xr-x 2 mysite mysite 60 Nov 4 15:51 myVM126/ drwxr-xr-x 2 mysite mysite 60 Nov 4 15:51 myESXhost2/ ll tmp/check_mk/piggyback/myVM123/ -rw-r--r-- 1 mysite mysite 1050 Nov 4 15:51 myESXhost
If these directories or files are absent they have not been created by the Special-Agents. You can see if the VM’s data is included in the agent’s output. Should this situation arise, look in the configuration rule for the ESXi-/vCenter-host to see if the data retrieval has been activated.
grep "<<<<myVM123>>>>" tmp/check_mk/cache/myESXhost <<<<myVM123>>>>
In the case of a very large number of such directories for piggyback data it can be very difficult to find those that have no allocation to a host. Here we provide a script with which unassigned piggyback hosts can easily be found:
From the script output it can be that Checkmk can’t find a host with the same name to which it can allocate the data. The piggyback names can however be altered in a number of ways.
4. Files and directories
WATO saves the piggyback data here. For each host a subfolder is created with the host’s name — this subfolder contains a text file with the host’s data. The filename is the name of the host providing the data.
Here the respective latest agent output from all hosts is temporarily saved. The content of a host’s file is identical to the cmk -d myhost command.
The special agent for executing a query of ESXi and vCenter servers. This script can also be executed manually for testing purposes.
A script for finding piggyback data that is not allocated to a host.