
vSphere 7 update 2/3 with HA issue using i40enu driver

In this blog post, I will talk about an HA problem that can happen when you update vSphere 7 to Update 2c or Update 3.

When you apply these updates on a vSphere 7 cluster with HA enabled, HA starts to misbehave and fail. It enables, disables, and then tries to reconfigure the HA settings every minute or two.

Examples:


Then HA stops with

After a couple of minutes, everything starts again with HA trying to enable and reconfigure.

Looking at the HA log (fdm.log), I get this:

Besides this issue, patching from or to 7.0 Update 2c/2d, or any subsequent Update 2 patch, fails. It is also not possible to apply vSphere 7 Update 3: staging and remediation will not work.

Sometimes we get: “host returned esxupdate code -1”

The i40en driver triggers this problem. This driver was renamed to i40enu in vSphere 7 Update 2, and vSphere 7 Update 3 changes the name back to i40en. For some reason, the update cannot replace i40enu with i40en, so patching with an image fails and ESXi ends up with both drivers installed, when only one (the proper one) should be present.

So to fix this, we need to remove the driver that was renamed in Update 2 (i40enu), reboot, and then apply vSphere 7 Update 3, which installs the proper i40en driver. After that, only one driver should be installed in ESXi.

VMware doesn’t have a fix for this issue yet; there is only a workaround, which is to remove i40enu.

How to fix this?

  • First, disable HA so that ESXi hosts stop the above tasks.
  • Second, put the ESXi host in maintenance mode before you start the following tasks.

First, we need to check which driver we have, and check the network interfaces to confirm they are the ones that use this driver (Intel(R) X710/XL710/XXV710/X722 adapters).
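The NIC check above can be scripted. This is only a minimal sketch: it assumes you have captured the output of `esxcli network nic list` as text, and that the first column is the NIC name and the third is the driver. The function name and the sample rows are hypothetical and only there for illustration.

```python
# Driver names from this post; an affected host can show either spelling.
I40_DRIVERS = {"i40en", "i40enu"}


def nics_using_i40(nic_list_output):
    """Return (nic_name, driver) pairs for NICs bound to i40en or i40enu.

    Expects the text printed by `esxcli network nic list`.
    Assumption: column 1 is the NIC name, column 3 is the driver.
    """
    matches = []
    for line in nic_list_output.splitlines():
        fields = line.split()
        # Skip the header, the dashed separator, and blank or short lines.
        if len(fields) < 3 or not fields[0].startswith("vmnic"):
            continue
        name, driver = fields[0], fields[2]
        if driver in I40_DRIVERS:
            matches.append((name, driver))
    return matches


# Made-up sample rows, not taken from a real host.
SAMPLE_NIC_LIST = """\
Name    PCI Device    Driver  Admin Status  Link Status  Speed  Duplex
------  ------------  ------  ------------  -----------  -----  ------
vmnic0  0000:18:00.0  ntg3    Up            Up            1000  Full
vmnic4  0000:3b:00.0  i40enu  Up            Up           10000  Full
"""

print(nics_using_i40(SAMPLE_NIC_LIST))
```

If the list comes back empty, the host does not use these Intel adapters and the workaround in this post most likely does not apply to it.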


Next, we will check both drivers in detail.
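To see whether a host really has both drivers, you can look at the output of `esxcli software vib list` (and `esxcli software vib get -n i40enu` for the details of one VIB). The sketch below parses the list output from text. It assumes the first two columns are Name and Version; the function names and the version strings in the sample are hypothetical.

```python
def installed_i40_vibs(vib_list_output):
    """Return {vib_name: version} for the i40en and i40enu VIBs.

    Expects the text printed by `esxcli software vib list`.
    Assumption: the first two whitespace-separated columns are
    Name and Version.
    """
    found = {}
    for line in vib_list_output.splitlines():
        fields = line.split()
        # Exact name match, so other VIBs with similar names are ignored.
        if len(fields) >= 2 and fields[0] in ("i40en", "i40enu"):
            found[fields[0]] = fields[1]
    return found


def has_driver_conflict(vib_list_output):
    """True when both driver names are installed at the same time."""
    return len(installed_i40_vibs(vib_list_output)) == 2


# Made-up sample rows; the versions are placeholders, not real builds.
SAMPLE_VIB_LIST = """\
Name    Version        Vendor  Acceptance Level  Install Date
------  -------------  ------  ----------------  ------------
i40en   1.0.0-example  VMW     VMwareCertified   2021-01-01
i40enu  2.0.0-example  VMW     VMwareCertified   2021-06-01
igbn    3.0.0-example  VMW     VMwareCertified   2021-06-01
"""

print(installed_i40_vibs(SAMPLE_VIB_LIST))
print(has_driver_conflict(SAMPLE_VIB_LIST))
```

A host that reports only one of the two names is not in the conflicting state described in this post.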


After confirming that both drivers are present, remove the recent one (i40enu) and leave the old one (i40en). vSphere 7 Update 3 will then apply the name change again, this time with only one driver installed.

Next, you can reboot the ESXi host. You can now apply vSphere 7 Update 3.
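The per-host steps above (maintenance mode, remove the VIB, reboot) can be written down as a small helper that only builds the command list and does not run anything. This is a sketch under assumptions: the function name is hypothetical, and the choice of removing i40enu follows this post, so check VMware KB 85982 before running it on your own hosts.

```python
def workaround_commands(installed_vibs):
    """Build the ESXi shell commands for one host, without running them.

    `installed_vibs` is any collection of VIB names found on the host.
    Returns an empty list unless BOTH i40en and i40enu are installed,
    so a host with a single driver never loses its only NIC driver.
    """
    names = set(installed_vibs)
    if not {"i40en", "i40enu"} <= names:
        return []
    return [
        # Maintenance mode first, as described earlier in the post.
        "esxcli system maintenanceMode set --enable true",
        # Remove the renamed driver (assumption: i40enu, per this post).
        "esxcli software vib remove -n i40enu",
        # Reboot so the change takes effect before applying Update 3.
        "reboot",
    ]


for command in workaround_commands(["i40en", "i40enu", "igbn"]):
    print(command)
```

Returning the commands instead of executing them is deliberate: you can review the list per host and paste it into an SSH session yourself.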

Post upgrade to 7.0 Update 3, do not apply 7.0U2x based baselines or the “critical host patches” baseline until a fix for this issue is made available in future patches. Otherwise, you may hit a similar issue again.

When all ESXi hosts are fixed and vSphere 7 Update 3 is applied, you can enable HA again, and all ESXi hosts should have the HA agents installed and enabled correctly.

Important note: This is a workaround, not a fix. VMware will release a fix or an update that resolves the issue when using Update 2 or Update 3.

You can find more information about this problem in VMware KB 85982.

Update 08/11/2021:

It seems that there are some known issues (too many) with vSphere 7 Update 3. Next is the list of all known issues published by VMware.

Since the release of vSphere 7.0 Update 3 there have been the following reported issues.

See VMware KB 86281 for all the known issues and their fixes or workarounds.

Final Notes:

If you are running into upgrade issues due to intel-nvme-vmd/iavmd drivers, please refer to VMware KB 85701 for the relevant resolution/workaround.

It seems that vSphere 7 is still not stable since it was launched. With every vSphere 7 update we have some issues. Some are minor (that is normal), but some are very serious and cause downtime in production.

All the above issues have a workaround and will only be fixed in the following vSphere 7 Update 3a. So, for now, apply the workaround and wait for the final fix.

I have another support ticket open for another issue in one of my clusters. Opening tickets is something I rarely do, but with vSphere 7 I have had more support tickets than with 6.5 and 6.7 together.

Share this article if you think it is worth sharing. If you have any questions or comments, comment here, or contact me on Twitter.

©2021 ProVirtualzone. All Rights Reserved

 

By | 2021-11-08T23:31:54+01:00 November 4th, 2021|VMware Posts, vSphere|2 Comments

About the Author:

I have over 20 years of experience in the IT industry. I have been working with Virtualization for more than 15 years (mainly VMware). I recently obtained certifications, including VCP DCV 2022, VCAP DCV Design 2023, and VCP Cloud 2023. Additionally, I have VCP6.5-DCV, VMware vSAN Specialist, vExpert vSAN, vExpert NSX, vExpert Cloud Provider for the last two years, vExpert for the last 7 years, and an old MCP. My specialties are Virtualization, Storage, and Virtual Backup. I am a Solutions Architect in the areas of VMware, Cloud, and Backup / Storage. I am employed by ITQ, a VMware partner, as a Senior Consultant. I am also a blogger, the owner of the blog ProVirtualzone.com, and recently a book author.

2 Comments

  1. Radim Pesa 08/11/2021 at 09:52

    Thanks for the post. There are serious bugs in 7.0U3 (https://kb.vmware.com/s/article/86287). KB 86100 can crash whole cluster. Hopefully solved in 7.0U3a. Be careful with it.
