Building a 3-node Proxmox HA cluster — the homelab high-availability path

A 3-node Proxmox cluster with HA is achievable under $1,500 in 2025. Craft Computing's CEPH tutorial + Lempa's LXC framing + r/homelab dashboard threads = the working build.

Charles Lin · August 12, 2025

Craft Computing’s July 26, 2025 video, “Proxmox CEPH Cluster Tutorial - I’m never going back!”, is the canonical 2025 guide for the multi-node Proxmox + Ceph pattern that makes homelab high availability actually viable. The video walks the complete stack: three identical nodes, Ceph distributed storage, live VM migration, automatic failover.

A 3-node Proxmox HA cluster is achievable for under $1,500 in 2025. Hardware prices have dropped enough; the software stack has matured enough; and as the r/homelab “Here’s my attempt at a dashboard to show HA and Proxmox data” thread (3,046 upvotes, Aug 7) demonstrates, the community has matured around the operational patterns too. This guide is the working build from over a year of running one.

Why 3 nodes specifically

Proxmox HA requires at least 3 nodes for quorum. In a two-node cluster, neither node holds a majority once the link between them drops, so the system can’t reliably decide which node should keep running, and letting both continue would risk split-brain. Three nodes give you majority voting: even if one node fails, the remaining two have quorum and can keep services running.

Five or seven nodes give you more redundancy but at significantly higher cost. For homelab use, 3 is the sweet spot — minimum viable HA, manageable power draw, fits in a small rack.
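The majority rule is easy to work through with a few lines of shell arithmetic. This is a minimal sketch, assuming one vote per node and no external QDevice; the function names are illustrative and not part of any Proxmox tool.

```shell
# Votes needed for quorum: a strict majority of all votes.
# Assumes one vote per node and no QDevice.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Node failures the cluster can absorb while staying quorate.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for nodes in 2 3 5 7; do
    echo "$nodes nodes: quorum=$(quorum_votes "$nodes"), tolerates $(tolerated_failures "$nodes") failure(s)"
done
```

A two-node cluster tolerates zero failures under this rule, which is why three is the practical minimum.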

The hardware build (~$1,500 total)

The recommended build that I ran for the past year:

Per node (×3):

Mini-PC or SFF desktop: Minisforum MS-01, Beelink SER8, or used Dell OptiPlex Micro 7050 (~$300-400)
32GB RAM (matched DDR4/DDR5): ~$80-120 per node
2x 1TB NVMe SSD: one for boot, one for Ceph OSD (~$100 per node)
2.5GbE NIC (often built-in on MS-01; USB or PCIe add-on otherwise): ~$30 if needed

Cluster networking:

2.5GbE managed switch: 8-port for $80-150
Ethernet cables, rack mount, UPS if you want one: ~$100

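Total: ~$1,200-1,500 for three nodes + switch, assuming used or discounted parts. The low ends of the ranges listed above already sum to about $1,520 (3 × $480 per node plus an $80 switch) before cables, rack mount, or UPS, so hitting the target means beating those per-item prices.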

The build philosophy: identical hardware per node (Proxmox HA likes this), separate disks for boot vs OSD, fast networking (Ceph hates slow networks), and modest CPU per node.

The software stack

1. Install Proxmox VE 8.x on each node from USB. Identical install configuration (same hostname pattern, same network config, same NTP setup).

2. Create the cluster. From node 1:

pvecm create homelab-cluster

From nodes 2 and 3:

pvecm add 10.0.0.1  # IP of node 1
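Before moving on, it is worth confirming that all three nodes joined and the cluster is quorate. The sketch below is one way to check, assuming the status output contains a "Quorate: Yes" line; that layout is an assumption based on typical output, and is_quorate is a hypothetical helper name.

```shell
# Succeeds if the text on stdin contains a "Quorate: Yes" line.
# The field name and layout are assumed from typical `pvecm status` output.
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Only query the cluster when pvecm is actually installed on this machine.
if command -v pvecm >/dev/null 2>&1; then
    if pvecm status | is_quorate; then
        echo "cluster is quorate"
    else
        echo "cluster is NOT quorate; check the corosync links" >&2
    fi
    pvecm nodes   # lists cluster members; expect three entries
fi
```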

3. Install Ceph. Proxmox has a wizard. Pick one OSD per node (your second NVMe), let Ceph manage replication. The default 3-replica setup means each piece of data lives on all three nodes — survives single-node failure with no data loss.
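The replication arithmetic is worth doing before you size VM disks. This is a rough sketch, assuming a replicated pool with equal-sized OSDs; it ignores filesystem overhead and the free space Ceph needs for recovery, and usable_gb is an illustrative helper, not a Ceph command.

```shell
# Rough usable capacity of a replicated pool: raw capacity / replica count.
# Ignores overhead and the headroom Ceph needs to rebalance after a failure.
usable_gb() {
    osd_gb=$1
    osd_count=$2
    replicas=$3
    echo $(( osd_gb * osd_count / replicas ))
}

# Three 1 TB OSDs (one per node) with 3 replicas.
echo "raw:    $(( 1000 * 3 )) GB"
echo "usable: $(usable_gb 1000 3 3) GB"
```

With three replicas, three 1TB OSDs give roughly 1TB of usable space, not 3TB.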

4. Configure HA groups. Decide which VMs should auto-restart on failure. Assign them to an HA group. Set the group to allow migration to any of the 3 nodes.
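The same step can be done from the command line with ha-manager on PVE 8.x. The sketch below only prints the commands unless DRY_RUN=0 is set; the group name homelab, the node names pve1 to pve3, and VM ID 100 are all placeholders for your own values.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder names: group "homelab", nodes pve1..pve3, VM ID 100.
run ha-manager groupadd homelab --nodes pve1,pve2,pve3
run ha-manager add vm:100 --group homelab --state started --max_restart 1 --max_relocate 1
run ha-manager status
```

Running in dry-run mode first lets you review exactly what will be changed before touching a live cluster.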

5. Test failover. Pull the power on one node. The cluster should:

Detect the failed node within ~30 seconds
Restart HA-enabled VMs on remaining nodes
Keep network/storage available throughout

If this works, you have HA. If it doesn’t, debug your Ceph health and quorum config before adding production workloads.
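To put a number on the failover test, you can time how long a guest stays unreachable. This is a minimal sketch of a generic retry helper; wait_until is a hypothetical name, and the guest IP in the usage comment is a placeholder. It measures from the moment you start it, so launch it right as you pull the power.

```shell
# wait_until <limit_seconds> <command...>
# Retries the command once per second. Prints the elapsed seconds on
# success; returns 1 if the limit passes first.
wait_until() {
    limit=$1
    shift
    start=$(date +%s)
    until "$@" >/dev/null 2>&1; do
        if [ $(( $(date +%s) - start )) -ge "$limit" ]; then
            return 1
        fi
        sleep 1
    done
    echo $(( $(date +%s) - start ))
}

# Usage (placeholder guest IP), run from a machine outside the cluster:
#   wait_until 300 ping -c 1 -W 1 10.0.0.50
```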

Why the LXC pattern from Lempa matters here

Christian Lempa’s July 28 video, “The BEST alternative to Docker and VMs! // Proxmox LXC”, frames the runtime layer. For most homelab services (Pi-hole, Plex, Home Assistant, Vaultwarden), LXC containers on Proxmox are dramatically more resource-efficient than full VMs.

In a 3-node HA cluster, LXC matters because:

Faster failover. Container restart is seconds; VM restart can be a minute or more.
Better resource utilization. You can pack 30-50 LXC containers in the RAM that holds 5-10 VMs.
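Migration is fast. Proxmox migrates LXC containers in restart mode (stop, move, start) rather than live; with the root disk on shared Ceph storage that takes seconds. VM live migration keeps the guest running but takes longer.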
HA still works for containers. The Proxmox HA layer treats LXC and VMs uniformly.

The 2025 pattern: VMs for things that need them (Windows, specific kernels, security isolation); LXC for everything else.
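The packing claim is simple division once you pick per-guest memory sizes. The numbers below are assumptions for illustration only: a 20 GB budget left for guests on a 32 GB node, 2-4 GB per VM, and 512 MB per container. Real footprints vary by workload.

```shell
# How many guests of a given size fit in a RAM budget (integer division).
guests_that_fit() {
    echo $(( $1 / $2 ))
}

budget_mb=20480   # assumed: about 20 GB of a 32 GB node left for guests

echo "VMs at 4096 MB each: $(guests_that_fit "$budget_mb" 4096)"
echo "VMs at 2048 MB each: $(guests_that_fit "$budget_mb" 2048)"
echo "LXC at 512 MB each:  $(guests_that_fit "$budget_mb" 512)"
```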

Network architecture

The cluster network design matters more than people credit. Three networks:

1. Management network (your home LAN). SSH, web UI, normal traffic. Typically 1GbE.

2. Cluster network (corosync + Ceph public). Inter-node traffic for cluster coordination. Dedicated 2.5GbE recommended; isolated from your home LAN traffic.

3. Ceph cluster network (Ceph private). OSD-to-OSD traffic. Ideally also 2.5GbE+, separate from #2. For homelab, often combined with #2 on shared 2.5GbE.
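If you do split the traffic, the separation is declared when the cluster and Ceph are initialized. The sketch below prints the relevant commands unless DRY_RUN=0 is set. The subnets are assumptions: 10.0.1.0/24 for corosync and Ceph public traffic, 10.0.2.0/24 for Ceph OSD traffic. Substitute your own addressing.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Node 1: bind corosync to the dedicated cluster network (assumed 10.0.1.0/24).
run pvecm create homelab-cluster --link0 10.0.1.1

# Nodes 2 and 3: join via node 1, each giving its own cluster-network address.
run pvecm add 10.0.0.1 --link0 10.0.1.2

# Ceph: public traffic on the cluster network, OSD replication on its own
# subnet (assumed 10.0.2.0/24). Omit --cluster-network to share one network.
run pveceph init --network 10.0.1.0/24 --cluster-network 10.0.2.0/24
```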

Lempa’s OPNSense tutorial (Aug 29) is the adjacent piece: most serious homelab clusters live behind an OPNSense (or pfSense) firewall that handles segmentation between management/cluster/internal networks. The “throw everything on the home LAN” pattern doesn’t scale once HA matters.

The Reddit homelab-HA discourse

The r/homelab “How long will your lab run without you?” thread (464 upvotes, Aug 9) is the canonical operational-resilience conversation:

“If something fails while you’re on vacation, does the lab recover?”

Top comments split into three patterns:

“With HA + UPS + remote management, weeks to months.” Users with 3-node HA clusters report unattended runtime measured in months.
“Without HA, days at most.” Single-node setups fail unpredictably; “single-bad-disk takes everything down” is the recurring failure mode.
“It’s about monitoring, not just HA.” HA recovers from hardware failure but doesn’t fix configuration drift, full disks, or expired certificates. Operational maturity matters as much as HA topology.

The r/Proxmox “Enterprise Proxmox considerations from a homelab user” thread (43 upvotes, Jun 19) is the bridge: homelab users who realized the patterns they were running at home generalized to small-team production work. The 3-node Proxmox HA cluster pattern is now legitimate small-business infrastructure, not just hobby tier.

The r/homelab “Dashboard for HA and Proxmox data” thread (3,046 upvotes) shows the operational layer: e-paper displays cycling through Proxmox stats, Home Assistant data, weather. The 2025 homelab isn’t about building infrastructure; it’s about operating it like a production service. HA is now table stakes for serious homelabbers.

What can go wrong (operational notes from a year)

Ceph degradation under network stress. If your cluster network gets noisy or congested, Ceph health goes yellow/red. Diagnosis: ceph -s and check OSD heartbeats. Fix: better network isolation, sometimes a switch upgrade.
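For a quick scripted check, the first word of the health output is enough to tell yellow from red. This is a minimal sketch; health_to_code is a hypothetical helper that maps the status word to an exit code you could feed into an alerting script.

```shell
# Map the first word of `ceph health` output to an exit code:
# 0 = HEALTH_OK, 1 = HEALTH_WARN, 2 = HEALTH_ERR, 3 = unrecognized.
health_to_code() {
    read -r status _rest
    case "$status" in
        HEALTH_OK)   return 0 ;;
        HEALTH_WARN) return 1 ;;
        HEALTH_ERR)  return 2 ;;
        *)           return 3 ;;
    esac
}

# Only query Ceph when the CLI is actually installed on this machine.
if command -v ceph >/dev/null 2>&1; then
    code=0
    ceph health | health_to_code || code=$?
    echo "ceph health code: $code"
    ceph health detail   # names the failing checks
    ceph osd tree        # shows which OSDs are down and on which host
fi
```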

Quorum loss during weird failures. A 3-node cluster requires 2 nodes for quorum. Lose 2 nodes simultaneously and the surviving node loses quorum: the cluster freezes rather than risk split-brain. Mitigation: UPS per node, redundant power supply on switch, don’t do hardware work on multiple nodes at once.

Storage migration is slow at first. Moving large VMs between nodes saturates your cluster network. Plan migrations for low-activity windows; consider 10GbE if you do this often.

OSD failures are normal. SSDs die. With 3-replica Ceph, single OSD failures are non-events — you swap the disk, Ceph rebuilds. But you need monitoring (Prometheus + Grafana + ceph-exporter is the canonical pattern) to catch them.

Creator POV vs Reddit dissent

Craft Computing’s POV is “this is the future of homelab”: once you’ve run Ceph + Proxmox HA, going back to single-node feels primitive. His “I’m never going back” framing captures the appeal.

Lempa’s POV is more pragmatic: HA matters when uptime matters; for many homelabs, simpler single-node setups are fine. His content emphasizes LXC efficiency and network segmentation as the foundational layers that pay off whether or not you go HA.

The Reddit dissent splits productively:

The “HA is overkill for homelab” camp: present and valid for many setups. If you’re hosting media services for your household, you don’t need HA; you need backups and a tolerance for occasional downtime.

The “Proxmox HA is brittle” camp: historically true, less true in 2025. PVE 8.x and Ceph Reef have matured significantly, and older critiques don’t carry over.

The “k3s instead of Proxmox HA” camp — for users who want Kubernetes-shaped abstractions. Valid alternative; different operational model. Proxmox HA wins for VM/LXC workloads; k3s wins for container-orchestration workloads.

What this means for working homelabbers in August 2025

Three practical positions:

1. If you’re running services that affect other people (family, small team) and care about uptime, build HA. $1,500 is now a reasonable price for the operational maturity that gets you weeks of unattended runtime.

2. If you’re experimenting and learning, single-node Proxmox is still the right starting point. Add HA once you have services that matter; don’t over-architect early.

3. Whatever path you’re on, invest in monitoring before HA. Prometheus + Grafana + alerts catches problems HA can’t fix. Operational discipline beats topology choices.

The honest critique

What this guide doesn’t cover:

Off-site backup remains essential. 3-node Ceph protects against single-node failure; it doesn’t protect against fire, flood, or whole-cluster ransomware. Use Restic / Kopia / Borg to remote storage.
HA doesn’t fix bad config. A bad systemd unit fails on all 3 nodes simultaneously. HA assumes the failures are hardware-level.
Power and cooling matter. Three nodes pulling 60W each = 180W continuous. Real money over a year. Plan for it.
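The running cost is straightforward to estimate. The sketch below takes the 60W-per-node figure above and an electricity price of $0.15/kWh, which is purely an assumption; plug in your own tariff and measured draw.

```shell
# Annual energy in kWh for N nodes drawing a constant wattage.
# 8760 = hours in a non-leap year.
annual_kwh() {
    awk -v w="$1" -v n="$2" 'BEGIN { printf "%.1f\n", w * n * 8760 / 1000 }'
}

# Annual cost for a given energy total and price per kWh.
annual_cost() {
    awk -v kwh="$1" -v price="$2" 'BEGIN { printf "%.2f\n", kwh * price }'
}

kwh=$(annual_kwh 60 3)
echo "energy: $kwh kWh/year"
echo "cost:   \$$(annual_cost "$kwh" 0.15)/year at an assumed \$0.15/kWh"
```

At 180W continuous, the cluster uses about 1,577 kWh per year, before counting the switch or any UPS losses.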

For most working homelabbers reading this in mid-August 2025: a 3-node Proxmox HA cluster is now a reasonable mid-tier homelab investment. The hardware is affordable, the software stack is mature, the community patterns are documented, and the operational ceiling is now “small-business production” not “expensive enthusiast experiment.”

For broader homelab infrastructure context, see our Proxmox VE review and the ZFS pool design guide for storage patterns that complement this build.

Sources

Every reference behind this piece. If we make a claim, it's because at least one of these said so — or we lived it ourselves.

YouTube: Craft Computing, "Proxmox CEPH Cluster Tutorial - I'm never going back!"
YouTube: Christian Lempa, "The BEST alternative to Docker and VMs! // Proxmox LXC"
YouTube: Christian Lempa, "Build your own HomeLab Firewall! // OPNSense Tutorial"
Docs: Proxmox VE High Availability documentation, Proxmox Server Solutions
Blog: r/homelab, "Here's my attempt at a dashboard to show HA and Proxmox data" (3,046 upvotes)
Blog: r/homelab, "How long will your lab run without you?" (464 upvotes)
Blog: r/Proxmox, "Enterprise Proxmox considerations from a homelab user" (43 upvotes)
Firsthand: Built and operated a 3-node Proxmox HA cluster with Ceph for over a year