Proxmox High Availability with UniFi Link Aggregation
Building a Proxmox HA cluster with TrueNAS and UniFi for automatic failover and zero downtime.
High Availability ensures that critical virtual machines keep running even if one or more of your Proxmox nodes fail. This is especially important for services like DNS and DHCP, or any always-on workload such as containers, home automation, or dashboards. When a node failure is detected, HA automatically restarts the affected VMs on another node, minimizing downtime and manual intervention.
Without HA, you'd have to monitor your cluster and manually bring up VMs on other nodes when something goes wrong. HA automates this failover process.
What You Will Need
To enable HA in Proxmox:
- Three or more Proxmox nodes: This is required to maintain quorum.
- Shared storage: All nodes need access to the same VM disk images to support live migration and failover.
What Is Quorum?
Quorum is the mechanism that ensures the cluster can make consistent and valid decisions. It prevents "split-brain" scenarios, where nodes disagree about cluster state. Proxmox uses the corosync service to maintain cluster consensus. At least 3 nodes are required so that a majority (quorum) can be reached even if one node fails.
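To make the majority rule concrete, here is a small sketch of the arithmetic. It assumes the default of one vote per node and no external vote (such as a QDevice); the function names are my own, not anything from Proxmox.

```python
def votes_needed(total_nodes: int) -> int:
    """Smallest strict majority when every node has exactly one vote."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """Nodes that can fail while the survivors still hold a majority."""
    return total_nodes - votes_needed(total_nodes)


if __name__ == "__main__":
    for n in range(2, 6):
        print(f"{n} nodes: need {votes_needed(n)} votes, "
              f"survives {tolerated_failures(n)} failure(s)")
```

With two nodes, losing either one drops the cluster below a majority, which is why three is the practical minimum. Note that four nodes tolerate no more failures than three.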
My Setup
Storage Node: TrueNAS with NFS
Set up a TrueNAS server as dedicated shared storage. You can use an old PC or server-grade hardware for this. Ideally, the machine should have redundant storage (RAIDZ1/RAIDZ2 or mirrored vdevs), and the more network interfaces, the merrier.
1. Install TrueNAS:
- Flash the latest TrueNAS ISO to a USB stick.
- Boot the storage machine from it and follow the installer to install TrueNAS.
- After installation, log in via the web UI and configure the basics: change the default password, set up authentication, and enable email/Slack alerts.
2. Configure Networking:
- Set a static IP address on the main interface.
- Optionally configure DNS and gateway manually.
3. Create a Storage Pool:
- Go to Storage > Pools > Add.
- Select your disks and set up redundancy (RAIDZ, Mirror, etc.).
- Name the pool something like proxmox-nfs.
4. Enable NFS:
- Go to Sharing > Unix (NFS).
- Click Add and select a dataset in your pool.
- Set Mapall User to root (for simplicity in homelabs).
- Allow network access to the subnet your Proxmox nodes reside in.
5. Add NFS Share in Proxmox:
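- In the Proxmox web UI: Datacenter > Storage > Add > NFS. Storage defined at the Datacenter level applies to the whole cluster, so you only need to add it once.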
- Point it to the TrueNAS IP and shared path.
- Enable desired content types: Disk image, ISO, Container, etc.
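If you prefer the shell, the same storage can be added with Proxmox's pvesm tool. The sketch below only builds and prints the command so you can review it before running it on a node. The storage ID, IP address, and export path are placeholders I made up, and the content list assumes that images, iso, and rootdir correspond to the Disk image, ISO, and Container options in the UI.

```python
import shlex


def nfs_storage_command(storage_id, server, export,
                        content=("images", "iso", "rootdir")):
    """Build a `pvesm add nfs` command as an argument list."""
    return [
        "pvesm", "add", "nfs", storage_id,
        "--server", server,          # TrueNAS IP address
        "--export", export,          # dataset path shared over NFS
        "--content", ",".join(content),
    ]


if __name__ == "__main__":
    # Placeholder values: replace with your own storage ID, IP, and path.
    cmd = nfs_storage_command("truenas-nfs", "192.168.1.50",
                              "/mnt/proxmox-nfs/vms")
    print(shlex.join(cmd))
```

Printing rather than executing keeps the script safe to run anywhere; paste the output into a root shell on one node once it looks right.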
Move VMs to Shared Storage
- Go to each VM > Hardware > Hard Disk > Disk Action > Move Storage.
- Select the newly added NFS storage.
- Wait for the disk to be copied across.
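How long the copy takes depends mostly on disk size and link speed. As a rough, best-case estimate (assuming the network is the bottleneck and ignoring protocol overhead and disk speed, both of which will make it slower in practice), the arithmetic looks like this. The 100 GiB disk and 1 Gbit/s link are example numbers, not measurements from my setup.

```python
def copy_time_seconds(disk_gib: float, link_gbps: float,
                      efficiency: float = 1.0) -> float:
    """Best-case seconds to push a disk image across a network link.

    disk_gib   -- disk size in GiB (2**30 bytes)
    link_gbps  -- link speed in Gbit/s (10**9 bits per second)
    efficiency -- fraction of line rate actually achieved (1.0 = ideal)
    """
    total_bytes = disk_gib * 1024 ** 3
    bytes_per_second = link_gbps * 1e9 / 8 * efficiency
    return total_bytes / bytes_per_second


if __name__ == "__main__":
    seconds = copy_time_seconds(100, 1)
    print(f"100 GiB over 1 Gbit/s: about {seconds / 60:.0f} minutes at best")
```

Lower the efficiency argument (for example 0.8) to get a more conservative figure.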
Set Up HA in Proxmox
- Navigate to Datacenter > HA > Groups
- Create a group of nodes to participate in HA.
- Omit nodes you don't want HA-enabled.
- Add VMs to HA:
- Datacenter > HA > Resources > Add
- Select the VM and assign it to your HA group.
Only add VMs that are portable (i.e., don’t rely on GPU passthrough or local storage). For example, I keep my gaming VM off HA since it relies on a GPU in a specific host.
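The same group and resource setup can also be scripted with the ha-manager CLI. This is a sketch that only prints the commands; the group name, node names, and VM IDs are made up for illustration. It assumes a Proxmox release that still has HA groups, so check `ha-manager help` on your version before relying on it.

```python
import shlex


def ha_commands(group, nodes, vmids):
    """Build ha-manager commands: one group, then one resource per VM."""
    commands = [
        ["ha-manager", "groupadd", group, "--nodes", ",".join(nodes)],
    ]
    for vmid in vmids:
        commands.append([
            "ha-manager", "add", f"vm:{vmid}",
            "--group", group,
            "--state", "started",   # HA should keep this VM running
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical names and IDs: substitute your own.
    for cmd in ha_commands("ha-nodes", ["pve1", "pve2", "pve3"], [100, 101]):
        print(shlex.join(cmd))
```

Leave non-portable VMs (like a GPU passthrough guest) out of the ID list, just as you would in the UI.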
Test HA
- Shut down or unplug a node.
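- Watch as Proxmox fences the failed node and restarts the affected VMs on another node. Expect this to take a minute or two, because the cluster first has to be sure the node is really gone.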
Note: VMs don’t automatically move back to their original nodes. You’ll need to do this manually if you want to redistribute the load.
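If you want a number for how long a VM was actually unreachable during the test, a simple probe loop is enough. The sketch below is not part of Proxmox: it tries a TCP connection to a service on the VM once per second and reports the longest gap. The host and port are whatever you pass on the command line, and the ten-minute duration is an arbitrary default.

```python
import socket
import sys
import time


def is_up(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """Longest span, in seconds, from a first failed probe to the next success.

    samples is a time-ordered list of (timestamp, reachable) tuples. An outage
    still in progress at the end is measured up to the last sample.
    """
    longest = 0.0
    down_since = None
    for timestamp, reachable in samples:
        if not reachable and down_since is None:
            down_since = timestamp
        elif reachable and down_since is not None:
            longest = max(longest, timestamp - down_since)
            down_since = None
    if down_since is not None:
        longest = max(longest, samples[-1][0] - down_since)
    return longest


def watch(host, port, duration=600.0, interval=1.0):
    """Probe host:port every `interval` seconds for `duration` seconds."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), is_up(host, port)))
        time.sleep(interval)
    return samples


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python watch_vm.py <vm-ip> <tcp-port>
    results = watch(sys.argv[1], int(sys.argv[2]))
    print(f"longest outage: {longest_outage(results):.0f}s")
```

Start it from a machine outside the cluster, then pull the plug on the node. Pick a port the VM really listens on (22 for SSH, for instance).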
Enable Link Aggregation with UniFi
Shared storage is now your single point of failure, so it’s worth hardening it.
Why Link Aggregation?
If the storage server’s NIC or network cable fails, all your HA resilience goes out the window. Link Aggregation bonds multiple physical network interfaces into one logical interface, giving you redundancy and potentially higher throughput.
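The "potentially" matters: a LAG spreads traffic per flow, not per packet, so a single connection stays on one member link. The sketch below illustrates the idea with a made-up hash (CRC32 over the addresses and ports). Real switches and operating systems use their own hash policies, so treat this purely as an illustration, and the IP addresses as placeholders.

```python
import zlib


def pick_link(src_ip, dst_ip, src_port, dst_port, links):
    """Pin a flow to one member link using an illustrative hash.

    Every packet of the same flow hashes to the same link, which keeps
    packets in order but caps one flow at a single link's speed.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return links[zlib.crc32(key) % len(links)]


if __name__ == "__main__":
    links = ["igb0", "igb1"]
    # Hypothetical Proxmox node IPs talking NFS (port 2049) to TrueNAS.
    for node_ip in ("192.168.1.11", "192.168.1.12", "192.168.1.13"):
        link = pick_link(node_ip, "192.168.1.50", 40000, 2049, links)
        print(f"{node_ip} -> {link}")
```

In other words, several nodes hitting the storage at once can add up to more than one link's worth of traffic, but one node copying one disk will not go faster than a single link. The redundancy benefit applies either way.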
On TrueNAS:
- Go to: Network > Interfaces > Add Interface
- Type: Link Aggregation
- Choose interfaces (e.g., igb0, igb1)
- Mode: LACP (recommended if your switch supports it)
- Add your static IP again to the LAGG interface
- Click Apply (but not Test yet)
On UniFi Switch:
- Open UniFi controller
- Go to Devices > Your Switch
- Click the first port the TrueNAS server is connected to
- Under Port Profile Override, choose Manual
- In Operation, choose Aggregate
- Add the adjacent port (e.g., port 23 if TrueNAS is in port 22)
Enable and Test:
- Back in TrueNAS, click Test
- Set the timeout to 120 seconds
- Quickly go back to UniFi and Apply the changes
- Wait for the LAGG interface to come up
- If successful, click Make Permanent in TrueNAS
If the test fails, the config will automatically roll back. That’s why the 120s safety window is handy.
Final Thoughts
Proxmox HA with shared storage and LAG-backed networking gives you a solid platform that can gracefully survive node reboots, maintenance, and unexpected outages. With TrueNAS acting as shared storage and UniFi providing network redundancy, your homelab (or SMB cluster) is far more resilient. I’ve been running this setup for about a month through kernel updates, node shutdowns, and deliberate chaos, and so far I’ve had zero downtime.