PSOD after applying ESXi 6.5 Update 1 Rollup on HP DLs

Last week we upgraded some of our HP blades (HPE BL460c) from ESXi 6.0 to ESXi 6.5. We used VMware Update Manager (VUM) to upgrade these hosts with the HPE ESXi image VMware-ESXi-6.5.0-Update1-5969303-HPE-650.U1.

The upgrade ran without any issues. After the upgrade we ran a scan in VUM for any updates or patches released after this ESXi release, and that is when the problems started. There were two updates to apply: an HPE driver bundle (version hpe-driver-bundle-650.9.6.5) and the critical VMware rollup ESXi650-Update1 (both released on 27/07/2017). After applying those updates, the host failed with a PSOD.

Looking at the PSOD details, we see “sfcb-intelci IP addr 0x43911b41c000” and also a line with “DataAccess5Kbytes@(ixgben)”. From those details I suspected that a network adapter driver was the root cause of this PSOD, not only because of the PSOD details but also because the HPE driver bundle we were applying updates the ixgbe driver (for our Intel 82599 10 Gigabit Dual Port adapters on the blade backplane).
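A quick way to see which ixgbe-family driver packages (VIBs) are installed on a host is to filter the VIB list over SSH. The following is a minimal sketch, not part of the original troubleshooting: `list_ixgb_vibs` is a hypothetical helper name, and it assumes the listing has the VIB name in the first column and the version in the second, as in the `localcli software vib list` output quoted by readers in the comments.

```shell
# list_ixgb_vibs: hypothetical helper (not a VMware tool).
# Reads a VIB listing on stdin and prints "name version" for every
# VIB whose name contains "ixgb" (case-insensitive), e.g. net-ixgbe, ixgben.
list_ixgb_vibs() {
    awk 'tolower($1) ~ /ixgb/ { print $1, $2 }'
}

# On an ESXi host (assumes SSH access and that awk is available there):
#   localcli software vib list | list_ixgb_vibs
```

Because the helper only reads standard input, it can also be tried on a saved copy of the listing before touching a production host.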

So I rolled back the ESXi installation (check HERE how to) and applied the updates again, this time without the HPE driver bundle, only the VMware Update 1 Rollup.

But after applying only the VMware Update 1 Rollup I got the same PSOD. So the root cause was not the HPE driver bundle but the VMware update itself.

This time I needed to check the core dump file (check HERE how to extract it) to troubleshoot the driver behind this PSOD. From the dump I could identify that the problem was the ixgbe driver, but I could not understand why (these core dump files are always difficult to read).

Googling to see if anyone was having the same issue, I found some similar PSODs with the ESXi 6.5 Update 1 image, but none where only the 6.5 Update 1 Rollup was applied.

I discovered that these HPE servers have the same issue:

  • HPE DL360p Gen8
  • HPE DL380 Gen8
  • HPE DL380 Gen9
  • HPE DL380p Gen9
  • HPE BL460c Gen8
  • HPE BL460c Gen9

So the next step was to roll back ESXi again and open a VMware support ticket with the core dump file.

VMware's official statement is:

“The root cause is the Intel ixgbe driver, which is contained in the critical or non-critical standard baselines of the update manager.  As soon as the ESXi with these Baselines AND the HPE image is upgraded to 6.5 Update 1, a PSOD is the trigger.”

Currently, there are two ways to get the host to ESXi 6.5 Update 1 (without a PSOD):

  1. Update ONLY with the HPE image, as the driver is not included.
    Note: This is the option that we use, but the Update 1 Rollup still always shows up as pending in VUM. So this is not a real solution.
  2. Install the update but do not restart the host yet.

For option 2, connect to the host via SSH before rebooting and either:

  • Uninstall the ixgbe driver. Check HERE how to.
  • Update the driver to version 1.5.3. Download the ixgbe v1.5.3 driver from HERE.
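The two SSH actions for option 2 could look roughly like the sketch below. It is an illustration under assumptions, not the exact procedure from VMware support: the VIB name `net-ixgbe` is taken from the `vib list` output readers posted in the comments and should be verified on your own host, the bundle path is a placeholder, and `run`, `remove_ixgbe`, and `install_driver_bundle` are hypothetical helper names. By default nothing is changed; the commands are only printed unless `APPLY=1` is set.

```shell
# run: hypothetical dry-run wrapper. Prints the command unless APPLY=1.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "DRY-RUN: $*"
    fi
}

# Option A: remove the vmklinux ixgbe VIB.
# Assumption: the VIB is named "net-ixgbe" on this image; check the
# installed VIB list first and adjust the name if it differs.
remove_ixgbe() {
    run esxcli software vib remove -n net-ixgbe
}

# Option B: install the 1.5.3 driver from an offline bundle.
# $1 is the absolute path of the downloaded bundle zip on the host
# (placeholder; the real file name depends on the download).
install_driver_bundle() {
    run esxcli software vib install -d "$1"
}
```

For example, `install_driver_bundle /tmp/driver.zip` prints the `esxcli` command it would run, and `APPLY=1 remove_ixgbe` actually executes the removal. Either way, the host still needs the reboot afterwards for the driver change to take effect.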

VMware: “The development is already in the process of investigating the problem more closely and will contact the manufacturers in this regard to work out a fix”


  • All “vmklinux” drivers under 6.5 U1 can cause a PSOD during operation.
  • The native driver version 1.4.1 causes the PSOD on reboot.
  • The only driver known to work without problems is version 1.5.3.
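Given that 1.5.3 is the only version reported as safe, it may help to check the installed ixgben version before rebooting a patched host. A minimal sketch follows, assuming the VIB listing format shown in the comments (name in column one, version in column two); `ixgben_is_known_good` is a hypothetical helper name, and the "known good" rule encodes only the statement above, not an official compatibility list.

```shell
# ixgben_is_known_good: hypothetical helper.
# Reads a VIB listing on stdin; succeeds (exit 0) only if an "ixgben"
# VIB is present and its version is 1.5.3 (for example 1.5.3-1OEM...).
# Fails if ixgben is missing or has any other version such as 1.4.1.
ixgben_is_known_good() {
    awk '$1 == "ixgben" { found = 1; ok = ($2 ~ /^1\.5\.3(-|$)/) }
         END { exit !(found && ok) }'
}

# On an ESXi host (assumes SSH access):
#   if localcli software vib list | ixgben_is_known_good; then
#       echo "ixgben 1.5.3 present"
#   else
#       echo "ixgben missing or not 1.5.3, do not reboot yet"
#   fi
```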

Note: There is a third option: recreate a custom ESXi ISO with the ixgbe version 1.5.3 (I did not test this, but I may do so in the next days).

So either we wait for a fix from VMware for this ixgbe driver, or we apply the workaround. My recommendation is to apply the workaround (option 2). The problem with option 2 is that, if you have many ESXi hosts, performing these tasks can be very time consuming. If I have time in the next days, I will try to create a PowerCLI script to perform this task on all ESXi hosts.
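Until a PowerCLI script exists, a plain SSH loop is one way to repeat the same command on many hosts. The sketch below is an assumption-heavy illustration: `for_each_host` is a hypothetical helper, the host file format (one hostname per line, `#` for comments) is invented for the example, and it assumes SSH is enabled on every host and that you log in as root. It prints the SSH commands by default and only runs them when `APPLY=1` is set.

```shell
# for_each_host: hypothetical helper.
# $1 is a file with one ESXi hostname per line; the remaining arguments
# are the command to run on each host over SSH.
for_each_host() {
    hosts_file=$1
    shift
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case $host in
            ''|'#'*) continue ;;
        esac
        if [ "${APPLY:-0}" = "1" ]; then
            # -n keeps ssh from swallowing the rest of the host list.
            ssh -n "root@$host" "$@"
        else
            echo "DRY-RUN: ssh root@$host $*"
        fi
    done < "$hosts_file"
}
```

For example, `for_each_host hosts.txt 'localcli software vib list | grep -i ixgb'` shows what would be run on each host, which is a harmless first step before using the loop for the driver change itself.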

©2017 ProVirtualzone. All Rights Reserved
By | 2017-12-30T02:50:03+00:00 September 18th, 2017|Virtualization|13 Comments

About the Author:

I have over 20 years’ experience in the IT industry and have been working with virtualization for more than 10 years (mainly VMware). I have been an MCP, VCP and vExpert for the last 3 years. My specialties are virtualization, storage, and backups. I work for Elits, a Swedish consulting company, and am allocated to a Swedish multinational networking and telecommunications company as a Tech Lead, acting as a Senior ICT Infrastructure Engineer. I am a blogger and owner of the blog


  1. Eric September 21, 2017 at 10:32 am

    We had the same problem on a DL360p Gen9 with dual 560SFP+ (Intel 82599).

    The problem is related to both the ixgbe and ixgben drivers being installed, with the wrong one loaded.

    Step 1: Update the host to the HPE custom ISO 6.5 U1 (27/07/2017).
    After the update, connect via SSH and check which driver was installed by HPE:

    localcli software vib list | grep -i ixgb
    net-ixgbe 4.5.1-1OEM.600.0.0.2494585 INT VMwareCertified 2017-09-12

    If you install the rollup right now you will end up with a PSOD.

    Step 2: Remove the sfcb-intel CIM provider:
    /etc/init.d/sfcbd-watchdog stop
    esxcli software vib remove -n=intelcim-provider
    /etc/init.d/sfcbd-watchdog start

    Step 3: Start remediation. The host will reboot.

    Step 4: Connect via SSH and look at the installed drivers:
    localcli software vib list | grep -i ixgb
    net-ixgbe 4.5.1-1OEM.600.0.0.2494585 INT VMwareCertified 2017-09-12
    ixgben 1.4.1-2vmw.650.1.26.5969303 VMW VMwareCertified 2017-09-12
    ==> take a look at the supported driver:

    In our case, the ixgben driver was causing issues in the host events (vmkernel.log) because it was trying to enable Flow Control transmit frames, which is not supported by the driver. Beware that no alarm is triggered in vCenter (nor in Veeam ONE).
    Every 15s: (unsupported) Device 10fb does not support flow control autoneg

    Step 5: Enable the good driver and disable the wrong one:
    esxcli system module set -e=true -m=ixgbe
    esxcli system module set -e=false -m=ixgben
    Step 6: Reboot the host.

    Step 7 (optional): Reinstall the sfcb-intel CIM provider.

    • Luciano Patrao September 21, 2017 at 12:46 pm

      Hi Eric,

      Thank you for your reply and thank you for sharing your process.

      VMware's statement is that the ixgbe driver version that avoids this issue is 1.5.3.
      I did not test removing the sfcb provider, but I think the issue is not associated with it.

      Regarding enabling Flow Control, I remember there was a similar issue in 5.0 or 5.5. I don’t know if this is the same one.

      The issue with all the workarounds is that, if we apply the VUM vSphere 6.5 Update 1 rollup, the issue returns because it replaces the drivers. After we apply the update there is no way to stop the reboot.
      So the only option is, as VMware told us, to apply the workaround and not apply the rollup.

      In my last communication with VMware, they said: “The development is already in the process of investigating the problem more closely and will contact the manufacturers in this regard to work out a fix”

  2. Eric September 21, 2017 at 4:09 pm

    The resolution steps I posted are what VMware support provided to me.

    Removing the sfcb-intelci CIM provider allows you to reboot after the rollup update (because it is this process that triggers the PSOD when attempting an action not supported by the active driver, as shown in your PSOD screenshot).

    Then you can activate the ixgbe driver provided by HPE, in the case of the 6.5 U1 custom ISO: net-ixgbe 4.5.1-1OEM.600.0.0.2494585

    • Luciano Patrao September 21, 2017 at 5:53 pm

      Well, that is different from what they replied to me in the support ticket.

      But I will test it and, if it works in our case, I will update the article with that. Of course I will credit you by name for the solution 😉

  3. Phil October 9, 2017 at 8:48 am

    Good morning,

    Did you hear anything new from VMware?
    The problem still occurs with the HP ISO VMware ESXi 6.5.0 Update 1-5969303-HPE-650.U1 and the update to build 6765664.
    Thanks and best regards

    • Luciano Patrao October 11, 2017 at 3:07 am

      Hi Phil,

      I did not have any update from VMware regarding this issue. And honestly I didn’t have much time to test some of the workarounds. We have not applied the upgrade to our production systems for now. But it is still on my list to do some tests.
      On which HP servers did you install the latest HPE 6.5 ISO?

      Thank You

      Luciano Patrao

  4. Dominik Zorgnotti October 16, 2017 at 1:55 pm

    Hey folks,

    I do not want to take credit for your groundwork, but in my case a simpler way worked (it relates to your option 2):

    1) Install the ixgben driver 1.5.3 before you do anything else (no need to uninstall the old one)
    2) Reboot and verify operations
    3) Apply patches, including the rollup to ESXi 6.5 U1

    The native driver will force VUM to omit the ixgbe packages in the updates.
    I am currently in the verification phase to see if this brings any new errors.

    localcli software vib list | grep -i "ixg"
    ixgben 1.5.3-1OEM.600.0.0.2768847 INT VMwareCertified 2017-10-16
    net-ixgbe 4.5.1-1OEM.600.0.0.2494585 INT VMwareCertified 2017-07-27

    localcli system version get
    Product: VMware ESXi
    Version: 6.5.0
    Build: Releasebuild-6765664
    Update: 1
    Patch: 29

    If I find time I’ll write a more detailed blog post about it; until then I hope someone can use this info.

    • Luciano Patrao October 16, 2017 at 2:21 pm

      Hi Dominik,

      We are all here to share knowledge, so no problem; all information and examples are always welcome.

      Regarding your example, that was one of my first tries and it did not work. After I applied the rollup I got the PSOD. So that workaround did not work for me.

  5. Dominik Zorgnotti October 16, 2017 at 6:55 pm

    Thanks for the information. I will try to update some more hosts over the next days to find a common denominator.

  6. Maciek November 3, 2017 at 9:58 am

    There is a new ISO released today (2017-11-03) which can help resolve this issue:

    • Luciano Patrao November 3, 2017 at 2:10 pm

      Hi Maciek,

      Thanks, I have noticed.

      I plan to try it next week to see if it really fixes the issue.

      Thank you again for the update.

  7. Mike November 5, 2017 at 7:47 pm

    I just tried the new image and the host will now run for 1.5 minutes and then PSOD.

    HP ML110 w/ P400 controller running ESXi off USB

    • Luciano Patrao November 6, 2017 at 10:27 pm

      Hi Mike,

      Thanks for your update.

      I will only be able to test next weekend, since that is when it is possible to have some downtime for some of our systems.
