When we first started virtualizing, we monitored a handful of performance indicators: CPU usage, memory overhead, network saturation, and free space on storage. I like to equate those to the warning lights in the cars we drive: like a check engine light, they really only indicate serious problems.
In particular, storage is one area where you will potentially spend a lot of time monitoring, and hopefully less time troubleshooting. In general, I like to take a three-phased approach to storage. This approach is metric-driven, and it shapes every other area of the storage practice for virtualization.
Historically, I only monitored free space, the check engine light of storage as it were. I then advanced to monitoring latency (multiple types of latency) on datastores, which is a good indicator of a volume’s performance. Nowadays, I’ve zoomed in to look at IOPs for individual virtual machines. This is a good way to see what is really going on.
All of these measures matter, and others do as well. Generally speaking, though, free space is my general measurement, latency is the mid-level measurement, and IOPs is the detailed measurement. Applied to a specific virtual machine, IOPs give excellent visibility.
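To make the three levels concrete, here is a minimal Python sketch of how the measurements might be checked together. The function name and every threshold in it are hypothetical placeholders for illustration, not recommended values; pick limits that fit your own environment.

```python
from typing import List


def storage_flags(
    free_pct: float,
    latency_ms: float,
    vm_iops: float,
    min_free_pct: float = 20.0,    # hypothetical threshold
    max_latency_ms: float = 20.0,  # hypothetical threshold
    max_vm_iops: float = 1000.0,   # hypothetical threshold
) -> List[str]:
    """Return the measurements that deserve a closer look.

    The checks run from the general measurement (free space) through
    the mid-level one (latency) to the detailed one (per-VM IOPs).
    """
    flags = []
    if free_pct < min_free_pct:
        flags.append("free space")
    if latency_ms > max_latency_ms:
        flags.append("latency")
    if vm_iops > max_vm_iops:
        flags.append("iops")
    return flags


# A datastore with plenty of room and low latency can still host a busy VM.
print(storage_flags(free_pct=50.0, latency_ms=5.0, vm_iops=1643.0))
```

With these placeholder limits, only the per-VM IOPs check fires, which is the kind of problem free space and latency alone would not have surfaced.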
IOPs are simply input/output operations per second, and are nothing new in terms of measuring storage systems. While I hope to avoid the metric fetish phenomenon (over-measuring), it is important to have visibility into this specific value through the abstraction of virtualization.
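As a rough illustration of the definition, IOPs can be derived from two readings of a cumulative I/O counter taken some seconds apart. This is only a sketch: the `IoSample` type and its field names are assumptions made up for the example, not the output format of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class IoSample:
    """One reading of cumulative I/O counters (hypothetical structure)."""
    timestamp_s: float    # when the reading was taken, in seconds
    reads_completed: int  # total reads completed so far
    writes_completed: int # total writes completed so far


def iops_between(earlier: IoSample, later: IoSample) -> float:
    """Average I/O operations per second between two readings."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("the later sample must come after the earlier one")
    ops = (later.reads_completed - earlier.reads_completed) + (
        later.writes_completed - earlier.writes_completed
    )
    return ops / elapsed


# Two readings taken 10 seconds apart: 10,000 reads plus 6,430 writes.
first = IoSample(timestamp_s=0.0, reads_completed=1_000, writes_completed=500)
second = IoSample(timestamp_s=10.0, reads_completed=11_000, writes_completed=6_930)
print(iops_between(first, second))
```

Reads and writes are summed here for simplicity; keeping them separate is often useful, since they stress storage differently.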
There are a lot of ways to obtain the IOPs and other storage values. The vSphere Client, storage solutions, and monitoring tools can all report these values. One way to do this is with storage solutions, and in Figure A below, a VM running on a Tintri datastore is reporting its IOPs in a VMware virtual environment:
Figure A
In this example, the virtual machine leading this view, with a name ending in 007, is consuming 1,643 IOPs. This particular storage system is a hybrid rotational storage system with high-performance non-rotational storage resources. The Tintri moves the hot spots of a VMDK to the non-rotational storage. Consider that the average SATA drive may push fewer than 100 IOPs; at a quiet time, the one VM leading this view would need at least 17 such drives to satisfy its 1,643 IOPs just by itself. My next task is to figure out what virtual machine 007 is doing! This is the visibility we need for our storage in virtualized environments.
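The drive-count arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope estimate only: the function name is made up for the example, and it deliberately ignores RAID write penalties, caching, and the read/write mix, all of which change the real number.

```python
import math


def drives_needed(required_iops: float, iops_per_drive: float) -> int:
    """Whole drives needed to supply required_iops, rounding up."""
    if required_iops < 0 or iops_per_drive <= 0:
        raise ValueError("need non-negative demand and positive per-drive IOPs")
    return math.ceil(required_iops / iops_per_drive)


# The VM from Figure A, against drives that deliver 100 IOPs each.
print(drives_needed(1643, 100))

# The same VM against slower drives at an assumed 80 IOPs each.
print(drives_needed(1643, 80))
```

At exactly 100 IOPs per drive the answer is 17, and it climbs to 21 at 80 IOPs per drive, so a drive that pushes fewer than 100 IOPs only makes the picture worse.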
There are a lot of ways to get this visibility into your virtual machines; whichever you choose, use the right metrics (free space, latency, and IOPs). How do you measure your virtual machine storage in detail? Share your comments below.