Building a Nested Proxmox VE Cluster for Veeam Migrations: A Step-by-Step Guide

I recently set up a Proxmox VE cluster in my lab using VMware as the base. My goal was to test Veeam migrations and understand how Proxmox handles features such as failover and storage. I did most of the work through the command line because it gave me more control and helped me learn what’s happening behind the scenes. During setup, I encountered issues with repository updates and network configuration, but I found ways to resolve them. I also set up NFS storage and tested how the cluster handles failover when a node goes offline. In this post, I’ll share each step I took, the problems I faced, and how I fixed them.

Since numerous articles are already available on how to install Proxmox, I won’t cover that aspect here. It’s relatively straightforward, and countless step-by-step instructions are available online. So, I’ll start with my three Proxmox VE nodes, which are already up and running.

This blog post provides a detailed, step-by-step process, focusing on command-line interface (CLI) commands, supplemented by graphical user interface (GUI) steps, to configure a three-node Proxmox cluster.

Environment Overview

Setup:

  • Three Proxmox VE 8.x nodes (proxmox01, proxmox02, proxmox03), installed as VMs in vSphere 8.x.
  • Purpose: Testing Veeam migrations and failover scenarios.
  • Networking:
    • vmbr0: Management and cluster communication (192.168.1.x).
    • vmbr1: Dedicated for NFS storage traffic (192.168.10.x).
  • Storage: Synology NAS exporting /volume2/Proxmox_NFS over NFS.
  • Nested Environment: VMware settings tweaked to support nested virtualization and bonded interfaces.

Why two networks? Separating management and storage traffic prevents contention during high I/O operations and provides better security through network isolation.

Nested Environment Considerations

Since Proxmox was nested in VMware, I installed open-vm-tools for better integration:
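The install is the standard Debian package; `vmtoolsd` is the service it ships:

```shell
apt update
apt install -y open-vm-tools
systemctl status vmtoolsd   # confirm the service is running
```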

Why Nested and open-vm-tools?

Running Proxmox VE inside VMware vSphere requires open-vm-tools to optimize integration with the hypervisor. This provides:

  • Time synchronization: Ensures accurate clock alignment across nodes.
  • Graceful shutdown/reboot: Allows clean operations from vSphere.
  • IP reporting: Improves visibility in VMware.
  • Performance enhancements: Optimizes interaction with virtual hardware.

Important: open-vm-tools is only needed in nested environments and should not be installed on bare-metal Proxmox setups, as it serves no purpose without an underlying hypervisor such as VMware.


Proxmox VE Installation and Repository Setup

Proxmox VE is based on Debian, so I began by ensuring the system and kernel were fully updated. The first step was to check if the enterprise repository was enabled:
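On PVE 8.x (Debian bookworm) the enterprise repo lives in its own sources file:

```shell
cat /etc/apt/sources.list.d/pve-enterprise.list
# deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise
```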

Problem: This file pointed to the enterprise repo, causing a 401 Unauthorized error because I didn’t have a subscription.

Fix: I disabled the enterprise repo:
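Commenting out the entry is enough:

```shell
# Comment out the enterprise repo entry
sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/pve-enterprise.list
```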

Then I added the free no-subscription repo:
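For PVE 8.x on bookworm, the no-subscription repo is:

```shell
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" \
  > /etc/apt/sources.list.d/pve-no-subscription.list
```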

I also ran into the same 401 error with the Ceph repo:
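The fix is the same idea: point /etc/apt/sources.list.d/ceph.list at the no-subscription Ceph repo (quincy in my case; use whichever Ceph release your install ships with):

```shell
echo "deb http://download.proxmox.com/debian/ceph-quincy bookworm no-subscription" \
  > /etc/apt/sources.list.d/ceph.list
```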

Once the repos were sorted, I updated the system and upgraded everything:
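With the repos fixed, a full upgrade runs cleanly:

```shell
apt update
apt dist-upgrade -y
```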

Finally, I rebooted to make sure the new kernel was loaded:

After the reboot, I checked the kernel version:
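A quick check of the running kernel:

```shell
uname -r
```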

and confirmed it was the latest Proxmox kernel (something like 6.5.x-pve). This resolved the update errors and left the system in a clean state, ready for the next steps.


Network Configuration

  • vmbr0: Linked to ens192, handles management traffic (GUI, SSH, Corosync).
  • vmbr1: Linked to a bonded bond0 using ens224 and ens256 for NFS traffic.

Bonding Setup:
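On proxmox01 this went into /etc/network/interfaces; the sketch below is a typical active-backup layout for the interfaces named above (the 192.168.10.11 address is an assumption from my lab's storage range):

```text
auto bond0
iface bond0 inet manual
    bond-slaves ens224 ens256
    bond-mode active-backup
    bond-miimon 100
    bond-primary ens224

auto vmbr1
iface vmbr1 inet static
    address 192.168.10.11/24
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    mtu 9000
```

Proxmox uses ifupdown2, so `ifreload -a` applies the change live, and `cat /proc/net/bonding/bond0` shows which slave is active.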

Note: Same approach on proxmox02 and proxmox03, just change the IP.

Network Bonding Modes Explained

We chose active-backup (mode 1) bonding because:

  • Simplicity: Works without special switch configuration
  • Reliability: Clear failover behavior
  • Nested Environment Compatibility: Works reliably in VMware

Other modes we considered:

  • balance-rr (0): Packet round-robin – risk of out-of-order packets
  • 802.3ad (4): LACP – requires vSwitch configuration
  • balance-xor (2): Not necessarily better for our use case

Troubleshooting Tip: When bonding interfaces in a nested environment, ensure VMware port groups have “Promiscuous Mode” set to “Accept”.

VMware Network Configuration

For our nested Proxmox nodes to work properly, we configured the VMware vSwitches as follows:

Management Network (vmbr0):

  • Standard vSwitch
  • MTU 1500 (standard)
  • Promiscuous mode: Reject (default)

Storage Network (vmbr1):

  • Distributed vSwitch recommended
  • MTU 9000 (matching Proxmox config)
  • Security settings:
    • Promiscuous mode: Accept
    • MAC address changes: Accept
    • Forged transmits: Accept

Critical Note: The storage network requires relaxed security settings to allow proper bonding operation in the nested environment.

VMware vDS Configuration for Nested Setup

Since this is a nested environment, configure the VMware vDS:

  1. Go to Networking > vDS > Port Group > Edit Settings.
  2. Security:
    • Promiscuous Mode: Accept
    • MAC Address Changes: Accept
    • Forged Transmits: Accept
  3. Teaming and Failover:
    • Load Balancing: Route based on originating virtual port ID
    • Network Failover Detection: Link status only
    • Notify Switches: No
    • Failback: Yes

Why? These settings enable bonded interfaces to function in a nested environment by allowing MAC address changes and ensuring that traffic flows through virtual NICs.

Best Practices

  • Use Active-Backup bonding mode for simplicity and compatibility in nested setups.
  • Separate management and storage traffic with distinct subnets.
  • Verify connectivity with ping -I vmbr1 <NFS-IP>.
  • Consideration: In nested setups, bonding is limited if both vNICs map to the same physical NIC in VMware. Spread vNICs across different uplinks if possible.

Cluster Creation

I created a three-node cluster for high availability (HA) and migration purposes.

CLI Steps

On proxmox01:
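The cluster is created with pvecm (cluster name as used in the GUI steps below):

```shell
pvecm create proxcluster
```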

On proxmox02 and proxmox03:
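Each joining node points at an existing member and is prompted for its root password; 192.168.1.10 is assumed here to be proxmox01's management IP:

```shell
pvecm add 192.168.1.10
```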

Verify cluster:
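Quorum and membership can be checked with:

```shell
pvecm status
```

All three nodes should appear under the membership information.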

GUI Steps

  1. Go to Datacenter > Cluster.
  2. On proxmox01, click Create Cluster, name it proxcluster.
  3. On proxmox02 and proxmox03, click Join Cluster, enter proxmox01’s IP and credentials.

Initial Challenges

Problem: Nodes fail to join due to authentication errors.

Root Cause: We hadn’t copied SSH keys between nodes first.

Solution:

# From each joining node
ssh-copy-id root@proxmox01

Best Practices

Ensure all nodes have unique hostnames and that they resolve via /etc/hosts:

Add:
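Something along these lines on every node (the management IPs are assumptions from my lab's 192.168.1.x range):

```text
192.168.1.10 proxmox01
192.168.1.11 proxmox02
192.168.1.12 proxmox03
```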

Verify time synchronization:
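Corosync is sensitive to clock drift, so it is worth confirming NTP is active:

```shell
timedatectl status
chronyc sources   # PVE 8 uses chrony by default
```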



Configuring Shared NFS Storage

Why NFS?

We chose NFS for our shared storage because:

  1. Simplicity: Easy to set up and manage
  2. Compatibility: Works well with Proxmox’s features
  3. Performance: Good enough for our testing needs
  4. Snapshot Support: When backed by ZFS on the NAS

Shared storage enables VM migration and HA.

CLI Steps

Install the NFS client on each node:
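On Debian-based systems the NFS client lives in the nfs-common package:

```shell
apt install -y nfs-common
```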

Verify the NFS export:
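showmount (part of nfs-common) lists what the NAS exports; the server IP is the storage address used later in the GUI steps:

```shell
showmount -e 192.168.10.198
```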

Error Encountered: The original NFS path /volume2/Proxmox NFS had a space, causing mount failures.
Fix: Renamed it to /volume2/Proxmox_NFS on the NFS server.

GUI Steps

  1. Go to Datacenter > Storage > Add > NFS.
  2. Enter:
    • ID: VMS-Storage
    • Server: 192.168.10.198
    • Export: /volume2/Proxmox_NFS
    • Content: Disk image, ISO image, VZDump backup file
    • Nodes: All
    • Advanced: Default for Preallocation and NFS Version
  3. Click Add.

NFS Storage Options

Best Practices

  • Avoid spaces in NFS paths (e.g., use Proxmox_NFS instead of Proxmox NFS).
  • Test mounts with showmount -e before adding.
  • Use NFS 4.1 for better performance if supported by your NAS.
  • Monitor the NFS server to avoid bottlenecks.

Verification of the new storage on each node, from the command line:
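A quick check with pvesm (the storage ID from the GUI steps) should show VMS-Storage as active on every node:

```shell
pvesm status
```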



Creating a Test VM for Storage and Migration Tests

To confirm everything was working properly, I created a test virtual machine (VM) immediately after configuring the shared NFS storage. The primary goal was to verify that the NFS-backed storage was functional and could be used for VM disks. Beyond that, I also wanted to test how well live migration would work between the Proxmox nodes.

This test VM lets me simulate a real workload. I used it to:

  • Verify that the storage was mounted and accessible by all cluster nodes.
  • Test live migrations between nodes (proxmox01, proxmox02, proxmox03) to confirm shared storage and network configurations were solid.
  • Get ready to configure and test HA, which I will do in the next steps.

After creating the VM, I ran migrations to and from each node. The tests measured the migration times and speeds, providing a baseline for later performance tuning. It was also an opportunity to observe how quickly the VM state was transferred across nodes and to ensure that no data was lost or corrupted during the process.

This is the VM summary board in Proxmox.

Testing Live Migration

After configuring the cluster and setting up shared NFS storage, we tested live migration to confirm everything worked as expected. We used VM 100 as our test VM.

Process

In the Proxmox GUI:

  • Right-click the test VM (VM 100).
  • Click Migrate.
  • Select the target node (e.g., proxmox03).
  • Confirm and start the migration.
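The same migration can be triggered from the CLI (VM 100 and the target node from the steps above):

```shell
qm migrate 100 proxmox03 --online
```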

Migration Log

Highlights from the migration log:

  • Start time: 02:52:23
  • VM memory: 3.9 GiB
  • Peak transfer rates: Around 100–170 MiB/s
  • Downtime limit: 100 ms

Rolling Back to Test Reverse Migration

To make sure migration worked both ways, we repeated the same test:

  • Migrated the VM from proxmox01 to proxmox02.
  • Migrated again from proxmox02 to proxmox03.
  • Finally, migrated it back to proxmox01.

Result: All migrations completed successfully without downtime, confirming that the cluster’s shared storage and network setup were working perfectly for live migrations.

This thorough testing ensured our HA setup was reliable and ready for real-world use.


High Availability (HA) Configuration

High Availability (HA) is essential for ensuring that virtual machines remain online, even if one of the Proxmox nodes fails. It automatically restarts VMs on healthy nodes, ensuring services continue to run without manual intervention. Before setting up HA, I used the test VM I created earlier to verify that live migration and failover between nodes would work properly.

CLI Steps

  1. Create an HA group:
  2. Add the test VM to the HA group:
  3. Check the current HA status:

    Expected output:
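For reference, the CLI equivalents of the three steps might look like this (group name default and VM ID 100 as used elsewhere in this post):

```shell
# 1. Create an HA group spanning all three nodes
ha-manager groupadd default --nodes proxmox01,proxmox02,proxmox03

# 2. Add the test VM to the group and request it started
ha-manager add vm:100 --group default --state started

# 3. Check the current HA status
ha-manager status
```

The status output should show quorum OK, a master/lrm line per node, and the vm:100 service in the started state.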

HA Configuration in GUI Steps

  1. In the Proxmox web interface, go to Datacenter > HA > Groups, and create a new group named default, adding all three nodes.
  2. Then go to Datacenter > HA > Resources, click Add, and add the test VM to the default group. Set its state to started.



Testing Failover Scenarios

To test how HA behaves during failures, I simulated a node failure:

CLI Steps

  1. Shut down the primary node (proxmox01):
  2. Check logs on another node to monitor the failover process:
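Roughly what I ran; corosync and pve-ha-lrm are the standard Proxmox cluster services:

```shell
# On proxmox01: simulate a hard node failure
shutdown -h now

# On proxmox02: follow cluster and HA manager logs
journalctl -u corosync -u pve-ha-lrm -f
```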

Failover Timeline

  • Total failover time: About 3 minutes
  • Cluster detection: Around 5 seconds
  • VM restart: Roughly 3 seconds
One thing I noticed during these tests is that if you’re coming from a VMware environment, don’t expect Proxmox HA to behave the same way as vSphere or vCenter HA. It’s not as fast or as seamless; it takes a bit more time for Proxmox to detect a node failure and trigger the migration. This means you may experience more downtime for your VMs than you’re accustomed to. At least in my testing, I didn’t find a way to make it faster, although perhaps there’s a different configuration I missed, as someone new to Proxmox.

Another thing to note is that if you’re connected to a node that goes down (like through the web GUI), you’ll also lose your session to the cluster. No virtual or management IP address remains active when a node is down, so you must reconnect manually to another node. It’s something to keep in mind when planning your setup.


Improving HA Performance

Although the initial failover was successful, I wanted to reduce the 3-minute delay. By adjusting Corosync’s timeout settings, I was able to speed up detection and reaction times.

CLI Steps

  1. Edit the Corosync configuration:

    Inside the totem section, I added:

  2. Here’s the final relevant part:
  3. Apply the new settings:
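As a sketch, the edit looks like this; the timeout values are examples rather than a recommendation, so test carefully:

```shell
# 1. Edit on ONE node only; /etc/pve/corosync.conf is synced cluster-wide.
#    Remember to increment config_version at the top of the file.
nano /etc/pve/corosync.conf

# 2. Example additions inside the totem section:
#      token: 1000
#      token_retransmits_before_loss_const: 4

# 3. Apply the new settings
systemctl restart corosync
```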

Note: These settings should be edited on a single node, as the configuration is synced across the cluster. Lowering these timeouts helps the cluster react more quickly to failures, but be cautious: if they’re too low, they can cause false failovers due to brief network hiccups.

Summary of HA Best Practices

  • Always test HA with a real VM to confirm everything works as expected.
  • Monitor logs (corosync and pve-ha-lrm) to see how fast failover happens.
  • Balance performance and stability when tuning Corosync — start with moderate values and test carefully.
  • Nested Environment Note: HA testing in nested environments helps identify quirks that may not be apparent on physical servers.

This approach provided me with a significantly faster failover time and increased confidence in the cluster’s stability during node outages.


Conclusion

Setting up this Proxmox VE cluster was a thorough and eye-opening experience. Along the way, I encountered some common pitfalls, including issues with repository configurations and networking. One of the bigger challenges was dealing with spaces in names – both for NFS exports and VM names – which Proxmox doesn’t handle well, causing some frustrating errors. Networking also required careful setup, especially in a nested environment, and HA required some extra work to run reliably.

In my opinion, Proxmox has made significant progress over the last few years, but it still lags behind VMware’s vSphere and vCenter in large enterprise environments. For small and medium-sized setups, Proxmox is a solid alternative; however, for critical workloads that require rock-solid high availability (HA), you may want to consider other options for now. Organizations should also recognize that community support alone is insufficient for production environments; it’s best to invest in professional support directly from Proxmox or a trusted partner.


Another area that Proxmox should improve is cluster management during failures. Currently, if the node you’re connected to goes offline, you lose access to the GUI and must reconnect manually to a different node. A management or virtual IP that stays up even when a node goes down would make the cluster easier to manage and more professional. This seems like something Proxmox could implement fairly easily, and it would be a great step towards making it a more enterprise-ready platform.

Overall, Proxmox has considerable potential and could become a strong competitor in the next few years, particularly with continued investment and community support. I’m looking forward to seeing how it evolves. Now, I’ll start working on my VMware migrations to Proxmox using Veeam, and in future posts, I’ll share step-by-step details of that process.

Share this article if you think it is worth sharing. If you have any questions or comments, comment here, or contact me on Twitter (yes, for me it is not X, but still Twitter).

©2025 ProVirtualzone. All Rights Reserved
June 3rd, 2025 | Hypervisor, Proxmox, Veeam

About the Author:

I have over 20 years of experience in the IT industry and have been working with virtualization for more than 15 years (mainly VMware). I hold certifications including VCP DCV 2022, VCAP DCV Design 2023, and VCP Cloud 2023. Additionally, I have VCP6.5-DCV, VMware vSAN Specialist, vExpert vSAN, vExpert NSX, vExpert Cloud Provider for the last two years, vExpert for the last 7 years, and an old MCP. My specialties are virtualization, storage, and virtual backup. I am a Solutions Architect in the areas of VMware, Cloud, and Backup/Storage, employed by ITQ, a VMware partner, as a Senior Consultant. I am also a blogger, owner of the blog ProVirtualzone.com, and recently a book author.
