| 1 | +--- |
| 2 | +date: 2025-11-27T10:00:00.000Z |
| 3 | +title: A Brief Overview Of The check_vsphere Plugin |
| 4 | +tags: |
| 5 | +- omd |
| 6 | +- vsphere |
| 7 | +--- |
| 8 | + |
| 9 | +## What is it? |
| 10 | + |
| 11 | +[check\_vsphere](https://github.com/consol-monitoring/check_vsphere) |
| 12 | +is a plugin for Naemon, Icinga, and Nagios-compatible systems. |
| 13 | +It checks various aspects of ~VMware~Broadcom vCenter or ESX hosts. |
| 14 | + |
| 15 | +For a long time, this was done using `check_vmware_esx.pl` |
| 16 | +or `check_esx.pl`. However, Broadcom (formerly VMware) |
| 17 | +has decided to deprecate the Perl SDK for vCenter. |
| 18 | +Therefore, we decided to rewrite the parts our |
| 19 | +customers use in Python using the [pyVmomi](https://github.com/vmware/pyvmomi/) |
| 20 | +library. |
| 21 | + |
| 22 | +In this article, I will provide an overview of what the plugin |
| 23 | +can do and delve into some of its features. |
| 24 | + |
| 25 | +Development happens at |
| 26 | +[GitHub](https://github.com/consol-monitoring/check_vsphere). |
| 27 | +Feel free to open issues or pull requests. |
| 28 | + |
| 29 | +## Authentication |
| 30 | + |
| 31 | +Currently, only user/password-based authentication is supported. The |
| 32 | +common options needed to establish a connection are: |
| 33 | + |
| 34 | +* `-u USERNAME` |
| 35 | +* `-p PASSWORD` can be omitted in favor of the `VSPHERE_PASS` |
| 36 | + environment variable |
| 37 | +* `-s ADDR` hostname of the vCenter or ESX host |
| 38 | +* `-nossl` skips TLS verification |
| 39 | + |
| 40 | +So a command line has at least this basic structure: |
| 41 | + |
| 42 | +``` |
| 43 | +check_vsphere subcommand -u user -p pass -s addr [subcommand options] |
| 44 | +``` |
| 45 | + |
| 46 | +In this article, `[AUTH]` is shorthand for `-u user -p pass -s addr`. |
| 47 | + |
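When the check is launched from a wrapper script, the password can stay off the command line by passing it through `VSPHERE_PASS`. The following is a minimal Python sketch, assuming `check_vsphere` is on the `PATH`. The helper name `build_check` and the example values are hypothetical, and only the `-u`, `-s`, and `-nossl` options listed above are used:

```python
import os


def build_check(subcommand, user, addr, password, extra=(), verify_tls=True):
    """Return (argv, env) for a check_vsphere call without putting the password in argv."""
    argv = ["check_vsphere", subcommand, "-u", user, "-s", addr]
    if not verify_tls:
        argv.append("-nossl")  # skip TLS verification, as described above
    argv.extend(extra)
    # Copy the current environment and add the password variable the plugin reads.
    env = dict(os.environ, VSPHERE_PASS=password)
    return argv, env


# Hypothetical example values; replace them with your own account and host.
argv, env = build_check("list-metrics", "naemon@vsphere.local", "vcenter.example.com", "s3cret")
print(" ".join(argv))
# To actually run the check: subprocess.run(argv, env=env)
```

Keeping the password in the environment means it does not show up in the process list or in shell history.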
| 48 | +## Checks |
| 49 | + |
| 50 | +Here is a brief overview of some features. For the full list, see |
| 51 | +[the documentation](https://omd.consol.de/docs/plugins/check_vsphere/cmd/). |
| 52 | + |
| 53 | +### vSAN |
| 54 | + |
| 55 | +The [vsan](/docs/plugins/check_vsphere/cmd/vsan/) command |
| 56 | +offers two modes: |
| 57 | + |
| 58 | +* `healthtest` – shows exactly what you see under |
| 59 | + **Cluster → Monitor → vSAN → Skyline Health** in vCenter. |
| 60 | +* `objecthealth` – performs a detailed check of vSAN object health. |
| 61 | + |
| 62 | +Please try them out. They have not seen much use yet and may need some fine-tuning. |
| 63 | + |
| 64 | +### Host checks |
| 65 | + |
| 66 | +There are several host checks in `check_vsphere`: |
| 67 | + |
| 68 | +* **[host-runtime](/docs/plugins/check_vsphere/cmd/host-runtime/)** |
| 69 | + offers a few modes: |
| 70 | + * **status** – vCenter calculates an overall host status. This mode |
| 71 | + just maps the colors to exit codes (green → OK, yellow → warning, |
| 72 | + red → critical). |
| 73 | + * **con** – checks whether the host can still talk to the vCenter. |
| 74 | + * **health** – runs various health checks exposed by the API for the |
| 75 | + host (memory, voltage, fans, …) and reports any problems. |
| 76 | + * **temp** – walks through the temperature sensors and reports issues. |
| 77 | + The state is determined by the vCenter/ESX host itself. |
| 78 | +* **[host-nic](/docs/plugins/check_vsphere/cmd/host-nic/)** – |
| 79 | + verifies that all network interfaces are connected. |
| 80 | +* **[host-service](/docs/plugins/check_vsphere/cmd/host-service/)** – |
| 81 | + verifies that services such as ntp, DCUI, or vpxa are running on a host. |
| 82 | + |
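The color-to-state mapping used by the `status` mode can be pictured as a small lookup table. This is an illustrative Python sketch, not the plugin's source. The function name is hypothetical, the exit codes are the standard Nagios plugin codes, and treating any other color (such as gray) as UNKNOWN is an assumption:

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Mapping described above: green -> OK, yellow -> warning, red -> critical.
STATUS_TO_EXIT = {"green": OK, "yellow": WARNING, "red": CRITICAL}


def exit_code_for(overall_status):
    """Translate a vCenter overall status color into a plugin exit code."""
    # Assumption: colors not listed above (e.g. "gray") are reported as UNKNOWN.
    return STATUS_TO_EXIT.get(overall_status, UNKNOWN)


for color in ("green", "yellow", "red", "gray"):
    print(color, exit_code_for(color))
```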
| 83 | +### VM checks |
| 84 | + |
| 85 | +* **[media](/docs/plugins/check_vsphere/cmd/media/)** – |
| 86 | + spots VMs that still have a CD‑ROM attached. |
| 87 | +* **[vm-tools](/docs/plugins/check_vsphere/cmd/vmtools/)** – |
| 88 | + flags VMs without guest tools installed. |
| 89 | +* **[vm-net-dev](/docs/plugins/check_vsphere/cmd/vmnetdev/)** – |
| 90 | + finds VMs that contain unused network devices. |
| 91 | +* **[snapshots](/docs/plugins/check_vsphere/cmd/snapshots/)** – |
| 92 | + reports VMs with an unexpected number of snapshots or snapshots |
| 93 | + that are too old. |
| 94 | +* **[vm-guestfs](/docs/plugins/check_vsphere/cmd/vmguestfs/)** – |
| 95 | + monitors filesystem usage of VM volumes via vCenter. |
| 96 | + |
| 97 | +### PerfCounters |
| 98 | + |
| 99 | +#### Overview |
| 100 | + |
| 101 | +The vCenter has a variety of [performance |
| 102 | +counters](https://dp-downloads.broadcom.com/api-content/apis/API_VWSA_001/8.0U3/html/ReferenceGuides/vim.PerformanceManager.html). |
| 103 | +These counters may be related to VirtualMachines, HostSystems, Datacenters, |
| 104 | +ClusterComputeResources, and possibly more. |
| 105 | + |
| 106 | +`check_vmware_esx` had many hard-coded options for specific |
| 107 | +performance counters. We decided to generalize this so any |
| 108 | +performance counter can be checked with `check_vsphere`. |
| 109 | + |
| 110 | +To get a list of performance counters available on a vCenter, the |
| 111 | +`list-metrics` command can be used. |
| 112 | + |
| 113 | +``` |
| 114 | +check_vsphere list-metrics [AUTH] |
| 115 | +``` |
| 116 | + |
| 117 | +If you're coming from `check_vmware_esx`, |
| 118 | +[the documentation](/docs/plugins/check_vsphere/cmd/perf/#rosetta) has a list of |
| 119 | +all the performance counters that were supported by `check_vmware_esx` and their |
| 120 | +counterparts in `check_vsphere`. However, as mentioned earlier, you can check |
| 121 | +any performance counter. For example, to monitor the power consumption of an ESX |
| 122 | +host: |
| 123 | + |
| 124 | +``` |
| 125 | +check_vsphere perf [AUTH] --perfcounter power:power:average \ |
| 126 | + --vimtype HostSystem --vimname esx-hostname \ |
| 127 | + --critical 400 |
| 128 | +``` |
| 129 | + |
| 130 | +#### Instances |
| 131 | + |
| 132 | +`check_vmware_esx` and its related tools have a significant bug. |
| 133 | +Performance counters can have instances. For example, disk I/O counters |
| 134 | +are available for each disk, where each disk represents an instance of |
| 135 | +the counter. When you monitor this with `check_vmware_esx`, you only |
| 136 | +monitor a random disk and ignore all the others. Yes, we have been |
| 137 | +monitoring random disks for years. |
| 138 | + |
| 139 | +With `check_vsphere`, you can now check specific disks using the |
| 140 | +`--perfinstance` flag. The default instance is an empty string, a |
| 141 | +special value that selects the aggregate (average) across all |
| 142 | +instances. vSphere only provides this aggregate where it is |
| 143 | +meaningful: CPU usage, for example, can be aggregated over all |
| 144 | +cores, whereas an average across several different disks is |
| 145 | +generally not meaningful, so no aggregate exists for disk counters. |
| 146 | + |
| 147 | +You can also check every instance at once with `--perfinstance '*'`. In |
| 148 | +this case, the threshold is applied to each instance, and the most |
| 149 | +severe resulting state is returned. |
| 150 | + |
| 151 | +``` |
| 152 | +# check disk latency |
| 153 | +# the default perfinstance is '' which is the aggregate and not available |
| 154 | +# for this counter |
| 155 | +$ check_vsphere perf -s vcenter.example.com -u naemon@vsphere.local -nossl \ |
| 156 | + --vimname esx1.int.example.com --vimtype HostSystem \ |
| 157 | + --perfcounter disk:totalLatency:average |
| 158 | +UNKNOWN: Cannot find disk:totalLatency:average for the queried resources |
| 159 | + |
| 160 | +# On that error you may want to try --perfinstance '*' |
| 161 | +# now you see all instances for this counter |
| 162 | + |
| 163 | +$ check_vsphere perf -s vcenter.example.com -u naemon@vsphere.local -nossl \ |
| 164 | + --vimname esx1.int.example.com --vimtype HostSystem \ |
| 165 | + --perfcounter disk:totalLatency:average --perfinstance '*' |
| 166 | +OK: disk:totalLatency:average_naa.6000eb3810d426400000000000000277 has value 0 Millisecond |
| 167 | +disk:totalLatency:average_naa.600605b00ba8cb0022564867b8c8cc32 has value 2 Millisecond |
| 168 | +disk:totalLatency:average_naa.6000eb3810d4264000000000000000b2 has value 0 Millisecond |
| 169 | +disk:totalLatency:average_naa.600605b00ba8cb001fd947850523e56d has value 0 Millisecond |
| 170 | +disk:totalLatency:average_naa.600605b00ba8cb0029700b163217244e has value 6 Millisecond |
| 171 | +disk:totalLatency:average_naa.6000eb3810d4264000000000000002b3 has value 1 Millisecond |
| 172 | +| 'disk:totalLatency:average_naa.6000eb3810d426400000000000000277'=0.0ms;;;; |
| 173 | +'disk:totalLatency:average_naa.600605b00ba8cb0022564867b8c8cc32'=2.0ms;;;; |
| 174 | +... |
| 175 | + |
| 176 | +# you can also check a single instance specifically |
| 177 | +$ check_vsphere perf -s vcenter.example.com -u naemon@vsphere.local -nossl \ |
| 178 | + --vimname esx1.int.example.com --vimtype HostSystem \ |
| 179 | + --perfcounter disk:totalLatency:average --perfinstance naa.600605b00ba8cb0022564867b8c8cc32 |
| 180 | +OK: disk:totalLatency:average_naa.600605b00ba8cb0022564867b8c8cc32 has value 2 Millisecond |
| 181 | +| 'disk:totalLatency:average_naa.600605b00ba8cb0022564867b8c8cc32'=2.0ms;;;; |
| 182 | +``` |
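
The `--perfinstance '*'` behavior, applying one threshold to every instance and returning the most severe state, can be sketched in a few lines of Python. This is not the plugin's implementation: the function names are hypothetical, the thresholds are simple "greater than" limits rather than full Nagios range syntax, and the sample values are three of the disks from the output above.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def state_for(value, warning=None, critical=None):
    """Compare one value against simple upper limits."""
    if critical is not None and value > critical:
        return CRITICAL
    if warning is not None and value > warning:
        return WARNING
    return OK


def worst_state(instances, warning=None, critical=None):
    """Apply the same thresholds to every instance and return the most severe state."""
    return max((state_for(v, warning, critical) for v in instances.values()), default=OK)


# Latency in milliseconds per disk instance, taken from the sample output.
latency_ms = {
    "naa.6000eb3810d426400000000000000277": 0,
    "naa.600605b00ba8cb0022564867b8c8cc32": 2,
    "naa.600605b00ba8cb0029700b163217244e": 6,
}
print(worst_state(latency_ms, warning=5, critical=10))
```

With these hypothetical limits, the single disk at 6 ms raises the overall result to WARNING even though the other disks are fine.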