Today one of the ESXi hosts in my nested vSAN died with a PSOD. Since I had been having issues with one of my storage disks, I suspect that was the cause of this nested ESXi failure. I restored the nested VM from a Veeam backup, but afterward the ESXi host could no longer connect to vCenter or the vSAN Cluster.
When I tried to rejoin the ESXi host, I got an error message regarding the vpxa agent user: “vpxuser already exists”.
This happens because of the ESXi restore: vCenter still holds information about the previous state of this host, so when it tries to create the vpxuser agent account on the restored host, the account is already there.
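To confirm the leftover account, the local users on the restored host can be listed from the ESXi shell (a quick sanity check of my own, assuming SSH access is enabled; vpxuser is the account vCenter creates when it manages a host):

# List local ESXi accounts; a leftover vpxuser confirms the stale state
esxcli system account list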
To double-check the issue, I checked the vpxa.cfg file to see if there was any inconsistency in the data, and I double-checked the IP address (since the VM was previously deployed from a template).
The vpxa.cfg file is located at /etc/vmware/vpxa/vpxa.cfg:
<vpxa>
  <bundleVersion>1000000</bundleVersion>
  <datastorePrincipal>root</datastorePrincipal>
  <hostIp>192.168.1.92</hostIp>
  <hostKey>520e00c9-ad3d-681e-6bdb-c1e3ec89ec14</hostKey>
  <hostPort>443</hostPort>
  <licenseExpiryNotificationThreshold>15</licenseExpiryNotificationThreshold>
  <memoryCheckerTimeInSecs>30</memoryCheckerTimeInSecs>
  <serverIp>192.168.1.95</serverIp>
  <serverPort>902</serverPort>
</vpxa>
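To cross-check these values against the live configuration, something like this works from the ESXi shell (a minimal check, not part of the original procedure):

# Show the current vpxa configuration
cat /etc/vmware/vpxa/vpxa.cfg

# Confirm the management IP matches the <hostIp> entry above
esxcli network ip interface ipv4 get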
Everything seemed to be OK, so to fix this issue we need to uninstall the vpxa agent from the ESXi host and rejoin the host to vCenter.
To uninstall the vpxa agent from ESXi, we need to run the uninstaller script VMware-fdm-uninstall.sh, which is located in /opt/vmware/uninstallers/.
Follow this procedure:
[root@vSAN-03:~] cp /opt/vmware/uninstallers/VMware-fdm-uninstall.sh /tmp
[root@vSAN-03:~] chmod +x /tmp/VMware-fdm-uninstall.sh
[root@vSAN-03:~] /tmp/VMware-fdm-uninstall.sh
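Before rebooting, it is worth confirming that the agent processes are actually gone (my own sanity check, not a required step):

# No output means no vpxa/fdm processes are left running
ps | grep -E 'vpxa|fdm' | grep -v grep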
After the vpxa agent is removed from the ESXi host, we can reboot the host and try to rejoin it (not reconnect it) to vCenter again.
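The reboot itself can be done straight from the ESXi shell. Since this host was already out of the cluster, a plain reboot is enough (on a healthy host you would enter maintenance mode first):

# Reboot the host so it starts clean, without the stale agent state
reboot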
The ESXi host was rejoined to the vSAN Cluster, and vSAN then rebuilt the vSAN Cluster and HA.
The disks were also added back to vSAN, and everything is configured again.
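From the host side, the rejoin and the disk claim can be verified with esxcli (a sketch using the standard vSAN namespaces):

# Confirm the host is a member of the vSAN cluster again
esxcli vsan cluster get

# List the local disks claimed by vSAN
esxcli vsan storage list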
While I was fixing this ESXi host issue, the VMs were still running and there was no downtime in the vSAN Cluster, since this vSAN configuration can tolerate one host failure.
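While the rebuild runs, the resync progress can be followed from any host. On vSAN 6.6/6.7 and later there is an esxcli command for this (a version assumption on my part; older builds use RVC instead):

# Show a summary of objects currently resyncing
esxcli vsan debug resync summary get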
Afterward, vSAN is working again with all hosts.
Note: There are some alarms in my vSAN, but those are normal since this is a nested vSAN. With the health monitor enabled, we will get some warnings regarding this kind of configuration.
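To see exactly which health checks are firing, they can also be listed per host (assuming vSAN 6.6 or later, where the esxcli vsan health namespace is available):

# List the vSAN health checks and their current status
esxcli vsan health cluster list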
Note: Share this article if you think it is worth sharing.