This is a short post. I just figured I’d throw this out there and save people some time in case they’re stuck.
The main objective of DRS is to allocate the resources that the virtual machines request. A common misconception is that DRS is responsible for distributing virtual machines evenly across all hosts in the cluster. This is not the case. For example, I’ve seen many environments where one or two hosts in a cluster have way more virtual machines than the other hosts. This is because the dynamic entitlement (the resources the virtual machines request) is being met, so there’s no need to vMotion any virtual machine to another host.
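To make the point about VM counts concrete, here is a deliberately simplified toy model in Python. It is not how DRS is implemented: the function name and all the numbers are made up for illustration, and it only looks at CPU demand, whereas real DRS also weighs memory, reservations, shares, limits and the migration threshold.

```python
def needs_migration(host_capacity_mhz, vm_demands_mhz):
    """Toy check: True only when the VMs on a host demand more CPU
    than the host can deliver. The number of VMs is never considered."""
    return sum(vm_demands_mhz) > host_capacity_mhz


# Hypothetical hosts with 20000 MHz of CPU each.
crowded = needs_migration(20000, [500] * 30)  # 30 mostly idle VMs, 15000 MHz demanded
sparse = needs_migration(20000, [18000])      # 1 busy VM, 18000 MHz demanded

print(crowded, sparse)
```

In this sketch neither host needs a migration, even though one runs 30 virtual machines and the other runs one, because every VM is getting what it asks for.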
Sometimes we want certain virtual machines to be tied to certain hosts, and to do so we use DRS affinity rules.
Today I had an issue with one of my ESXi 5 hosts. To avoid interrupting some critical virtual machines (Exchange, BES, DCs, etc.), I decided to put the host into maintenance mode, let DRS handle the virtual machine moves, and then begin troubleshooting the host.
Every virtual machine moved over except for two SQL VMs. Upon some investigation I noticed the following under Faults in the DRS tab. Anyone who is not new to this will know right away what the problem is. Basically, these VMs were tied to the host through DRS rules and thus could not be moved.
To temporarily fix the issue, we just need to disable the DRS rule or rules to allow the virtual machines to be moved over to the other hosts. Once the ESXi host is fixed and out of maintenance mode, we can re-enable the rule and let DRS handle the rest.
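The sequence above (the evacuation is blocked, the rule is disabled, the VMs move, the rule is re-enabled) can be sketched with a small self-contained Python model. This does not talk to vCenter. The `HostRule` class, the `evacuate` function, and the host and VM names are all hypothetical, and the rule is modeled as a simple "must run on these hosts" constraint.

```python
from dataclasses import dataclass


@dataclass
class HostRule:
    """Hypothetical stand-in for a 'must run on' VM-to-host DRS rule."""
    name: str
    vms: set
    hosts: set
    enabled: bool = True


def evacuate(host, placement, rules, other_hosts):
    """Try to move every VM off `host`; return the VMs that are stuck."""
    stuck = []
    for vm, current in sorted(placement.items()):
        if current != host:
            continue
        # Start with every other host, then narrow by each enabled rule.
        allowed = set(other_hosts)
        for rule in rules:
            if rule.enabled and vm in rule.vms:
                allowed &= rule.hosts
        if allowed:
            placement[vm] = sorted(allowed)[0]
        else:
            stuck.append(vm)
    return stuck


placement = {"exchange01": "esx1", "sql01": "esx1", "sql02": "esx1", "dc01": "esx2"}
rule = HostRule("sql-on-esx1", vms={"sql01", "sql02"}, hosts={"esx1"})

# Rule enabled: the SQL VMs cannot leave esx1.
stuck_before = evacuate("esx1", placement, [rule], ["esx2", "esx3"])

# Temporarily disable the rule and retry the evacuation.
rule.enabled = False
stuck_after = evacuate("esx1", placement, [rule], ["esx2", "esx3"])

# Re-enable once the host is healthy and out of maintenance mode.
rule.enabled = True

print(stuck_before, stuck_after)
```

With the rule enabled, the two SQL VMs are the only ones left behind, which matches what I saw. With it disabled, nothing is stuck. In a real environment, re-enabling the rule is what prompts DRS to move the VMs back to their preferred host.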
