vMotion fails to migrate and EVC fails to enable

This week, while doing maintenance on several ESXi hosts (vSphere 6.5), I noticed that vMotion and DRS were not migrating some of the VMs on some hosts: vMotion failed to migrate, and EVC failed to enable.

The first step is to test the vMotion VMkernel network by pinging all ESXi hosts with the vmkping command.

Troubleshooting:

First, let us check the VMkernel adapters to identify the one used for vMotion.

With the vMotion VMkernel adapter (vmk1) and its IP address identified, let us ping the other ESXi hosts. If you don’t have all the IPs, you need to run the same check on every ESXi host to display its vMotion IP address.

In this case, the addresses ranged from 10.0.28.37 to 10.0.28.48 (all ESXi hosts in the same Cluster where the VMs should migrate).

Every vMotion VMkernel IP answered the ping from every ESXi host. No issues found here.
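Checking a whole range of addresses by hand is tedious, so the command lines can be generated. The following is a minimal sketch, assuming the vMotion adapter is vmk1 and the addresses form the contiguous range mentioned above. The helper name `vmotion_ping_commands` is hypothetical; it only builds the vmkping command lines, which still have to be run in the shell of each ESXi host.

```python
import ipaddress


def vmotion_ping_commands(first_ip, last_ip, vmk="vmk1"):
    """Return one vmkping command line per address in the inclusive range."""
    start = int(ipaddress.IPv4Address(first_ip))
    end = int(ipaddress.IPv4Address(last_ip))
    if end < start:
        raise ValueError("last_ip must not be lower than first_ip")
    # -I selects the VMkernel interface that sends the ping.
    return [
        f"vmkping -I {vmk} {ipaddress.IPv4Address(n)}"
        for n in range(start, end + 1)
    ]


# Range taken from the article; vmk1 is the vMotion adapter identified earlier.
commands = vmotion_ping_commands("10.0.28.37", "10.0.28.48")
for cmd in commands:
    print(cmd)
```

Running the printed commands on each host in turn gives the full host-to-host picture, since a ping that works in one direction does not prove the reverse path.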

The VMkernel adapter details show an MTU of 9000, which means Jumbo Frames are enabled. So I ran the same vmkping command with a Jumbo Frame sized packet to make sure all ESXi hosts could still ping each other. They could, so the issue was not at the network level.
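A Jumbo Frames test only means something if the packet really is large and cannot be fragmented on the way. With vmkping this is done with `-s` (ICMP payload size) and `-d` (set the don't-fragment bit). The sketch below, with hypothetical function names, works out the largest payload that fits in a given MTU (the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header) and builds the matching command line, again assuming vmk1 is the vMotion adapter.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_icmp_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def jumbo_ping_command(target_ip, mtu=9000, vmk="vmk1"):
    """Build a vmkping command that fails if the path cannot carry the MTU."""
    size = max_icmp_payload(mtu)
    # -d forbids fragmentation, -s sets the payload size in bytes.
    return f"vmkping -I {vmk} -d -s {size} {target_ip}"


print(jumbo_ping_command("10.0.28.37"))
```

If this ping fails while a plain vmkping succeeds, some device on the path (a physical switch port, a vSwitch, or a VMkernel adapter) is probably not configured for the larger MTU.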

No issues found in the vMotion network, so the problem had to be somewhere else.

I rechecked DRS and tried a manual hot vMotion. Migrating to some ESXi hosts worked without issues, but moving VMs to one or two particular ESXi hosts was not possible. Any hot migration to those ESXi hosts failed, while a cold migration worked without issues.

When I tried again to manually migrate a VM from a working host to one of the hosts that would not accept VMs, I saw this message:

“The target host does not support the virtual machine current hardware requirements. To resolve CPU incompatibilities, use a cluster with Enhanced vMotion Compatibility (EVC) enabled”

A strange error, and it is strange that I would need to enable EVC for this migration when all servers and CPUs in this Cluster are the same.

Strange as it was, I tried to enable EVC anyway, but none of the VMware EVC modes were compatible with the CPU, and I got: “The host’s CPU hardware does not support the cluster’s current Enhanced vMotion Compatibility mode. The host CPU lacks features required by that mode.”

Again, this is strange behavior, since all the Dell servers are the same model with the same CPU.

Then I started looking at other possible causes. Since this was a new Cluster and some updates had already been applied (but not all), I double-checked whether the Spectre/Meltdown and L1 Terminal Fault – VMM vulnerability patches were applied on all ESXi hosts.

Since these patches affect ESXi host CPU behavior and also the VMs’ Guest OS, I checked the build of each ESXi host.

The builds show a mismatch on two of the ten ESXi hosts in this particular Cluster.

vSphere 6.5 Build 893507 doesn’t have the cpu-microcode patch ESXi650-201808402-BG for the L1 Terminal Fault – VMM vulnerability. Since this patch affects the CPU, I am pretty sure this is the issue on this Cluster: VMs running on already patched hosts were not allowed to migrate to a host that did not yet have the patch applied. A microcode update can expose new CPU features to running VMs, which would explain why otherwise identical hosts looked incompatible to vMotion and EVC.

This was the main reason why vMotion failed to migrate and EVC failed to enable.
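Spotting the odd hosts is easy with ten servers, but the comparison can also be scripted. The following is a rough sketch, not the method used here: it assumes the build of each host has already been collected (for example with `vmware -v` in each host's shell) into a dictionary. The function name, host names, and build labels are all hypothetical placeholders.

```python
from collections import Counter


def hosts_off_majority_build(builds_by_host):
    """Return (majority_build, sorted hosts that are not on that build)."""
    if not builds_by_host:
        raise ValueError("no hosts given")
    # The build shared by most hosts is treated as the expected one.
    majority_build, _ = Counter(builds_by_host.values()).most_common(1)[0]
    outliers = sorted(
        host for host, build in builds_by_host.items() if build != majority_build
    )
    return majority_build, outliers


# Hypothetical inventory: eight hosts on one build, two on another.
inventory = {f"esxi-{n:02d}": "patched-build" for n in range(1, 9)}
inventory["esxi-09"] = "older-build"
inventory["esxi-10"] = "older-build"

majority, outliers = hosts_off_majority_build(inventory)
print(f"Majority build: {majority}")
print(f"Hosts on a different build: {outliers}")
```

Treating the most common build as the reference is only a heuristic; if most hosts are still unpatched, the patched ones will be reported as the outliers instead.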

The next step is to go to Update Manager, scan these ESXi hosts, and check for missing patches.

The scan lists the missing patches on those ESXi hosts, including the cpu-microcode patch. Next, I applied the patches on the faulty ESXi hosts, rebooted them, and tried a hot migration to those ESXi hosts again.

Now all VMs can migrate from all ESXi hosts inside the Cluster, and there are no more compatibility issues.

Conclusion:

Once again, this proves that we should ALWAYS apply patches to all ESXi hosts and keep them all up to date, particularly with security patches. We should not have mismatched vSphere builds inside a Cluster, or even inside a vCenter.

Particularly with the recent Spectre/Meltdown problems and the Intel CPU L1TF security issues, properly patching your ESXi hosts and also your VMs is very important.

In the past weeks, VMware informed customers about a bug in VMware Tools version 10.3.0 that was causing PSODs with Windows OS, and version 10.3.2 was released in the last few days to fix this issue.

Again, to have a clean and safe VMware environment, updates should always be applied as soon as possible to your vCenter, vSphere, and Virtual Machines (VMware Tools and Virtual Hardware).

©2018 ProVirtualzone. All Rights Reserved

By | 2018-09-20T18:09:08+02:00 September 18th, 2018|Virtualization|0 Comments

About the Author:

I have over 20 years’ experience in the IT industry and have been working with Virtualization for more than 10 years (mainly VMware). I am an MCP, VCP6.5-DCV, VMware vSAN Specialist, Veeam Vanguard 2018/2019, vExpert vSAN 2018/2019, and a vExpert for the last 4 years. My specialties are Virtualization, Storage, and Virtual Backups. I work for Elits, a Swedish consulting company, and am allocated to a Swedish multinational networking and telecommunications company as a Tech Lead, acting as a Senior ICT Infrastructure Engineer. I am a blogger and the owner of the blog ProVirtualzone.com.
