Backup Strategy
CharlieHub uses a 3-2-1 backup strategy with PBS (Proxmox Backup Server) for efficient off-site backups.
Full backup documentation: Backup & Recovery
PBS documentation: PBS Service
Last Updated: 2026-02-04
Strategy Overview
| Requirement | Implementation |
|---|---|
| 3 copies | Ceph 5-OSD 3x replication + PBS (France) + vzdump (UK) |
| 2 media types | Local NAS (UK) + PBS (France) |
| 1 off-site | PBS on pbs-fr (France) |
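The 3-2-1 rule in the table above can be expressed as a small check. This is a minimal sketch: the `Copy` class, the `satisfies_3_2_1` function, and the medium labels (`ceph-rbd`, `nfs-nas`, `pbs-datastore`) are hypothetical names chosen for illustration, not part of any CharlieHub tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str    # storage name as used in this document
    medium: str  # hypothetical label for the storage technology
    site: str    # physical location of the copy


def satisfies_3_2_1(copies, primary_site):
    """True if there are >= 3 copies, >= 2 media types and >= 1 off-site copy."""
    media = {c.medium for c in copies}
    offsite = [c for c in copies if c.site != primary_site]
    return len(copies) >= 3 and len(media) >= 2 and len(offsite) >= 1


copies = [
    Copy("ceph-pool", "ceph-rbd", "UK"),
    Copy("px3-nas", "nfs-nas", "UK"),
    Copy("pbs-fr", "pbs-datastore", "FR"),
]

print(satisfies_3_2_1(copies, primary_site="UK"))  # True
```

Dropping `pbs-fr` from the list makes the check fail, because that copy is the only off-site one.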
Backup Methods
| Method | Storage | Transfer | Recovery Time |
|---|---|---|---|
| PBS (Primary) | pbs-fr (France) | Incremental 1-5 GB | 10-20 min |
| Vzdump (UK) | px3-nas | Full backup | 5-10 min |
| Ceph | ceph-pool | Automatic | Instant |
Why PBS?
PBS uses incremental deduplication. After the initial full backup, nightly transfers drop from 40-100 GB to 1-5 GB, making WAN backups practical.
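Why deduplication shrinks the nightly transfer can be illustrated with a toy content-addressed chunk store. This is a sketch of the general technique only, not PBS's implementation: the 4-byte chunk size, the in-memory `store` dictionary, and the function names are assumptions made for the demo.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny for the demo


def backup(data, store, chunk_size=CHUNK_SIZE):
    """Store only chunks the store has not seen; return (index, bytes_uploaded)."""
    index = []
    uploaded = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            # New content: this is the only data that has to cross the WAN.
            store[digest] = chunk
            uploaded += len(chunk)
        index.append(digest)
    return index, uploaded


def restore(index, store):
    """Rebuild the original data from the ordered list of chunk digests."""
    return b"".join(store[digest] for digest in index)


store = {}
night1 = b"AAAABBBBCCCCDDDD"
night2 = b"AAAABBBBXXXXDDDD"  # only the third chunk changed

index1, sent1 = backup(night1, store)
index2, sent2 = backup(night2, store)

print(sent1, sent2)  # 16 4
```

The first run uploads everything, the second only the changed chunk, and both snapshots remain fully restorable from the shared store.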
Backup Schedule Summary
Backups are staggered by node to prevent Ceph I/O contention:
| Time Window | Node | Method | Storage |
|---|---|---|---|
| 22:00-00:00 | px3-suzuka | PBS | pbs-fr (France) |
| 00:30-02:30 | px2-monza | PBS | pbs-fr (France) |
| 03:00-05:00 | px1-silverstone | PBS | pbs-fr (France) |
| 05:30-06:00 | px2, px3 | Vzdump | px3-nas (UK) |
Ceph Scrub Window
Ceph scrubs are restricted to 09:00-17:00 UTC, so they never overlap with the backup windows.
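The no-overlap claim can be checked mechanically. The sketch below compares the backup windows from the schedule table with the scrub window, handling windows that cross midnight. It assumes all times share one timezone (the schedule table does not state one), and the helper names are hypothetical.

```python
DAY = 24 * 60  # minutes in a day


def to_minutes(hhmm):
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def segments(window):
    """Split a (start, end) window into same-day minute ranges."""
    start, end = (to_minutes(t) for t in window)
    if end > start:
        return [(start, end)]
    # The window crosses (or ends at) midnight: split it in two.
    return [(start, DAY), (0, end)]


def overlaps(a, b):
    return any(
        s1 < e2 and s2 < e1
        for s1, e1 in segments(a)
        for s2, e2 in segments(b)
    )


BACKUP_WINDOWS = {
    "px3-suzuka PBS": ("22:00", "00:00"),
    "px2-monza PBS": ("00:30", "02:30"),
    "px1-silverstone PBS": ("03:00", "05:00"),
    "px2/px3 vzdump": ("05:30", "06:00"),
}
SCRUB_WINDOW = ("09:00", "17:00")

conflicts = [
    name for name, window in BACKUP_WINDOWS.items()
    if overlaps(window, SCRUB_WINDOW)
]
print(conflicts)  # []
```

The same `overlaps` helper can be run pairwise over `BACKUP_WINDOWS` to confirm the node staggering itself has no collisions.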
Data Protection Layers
┌─────────────────────────────────────────────────────────────────────────┐
│ LAYER 1: CEPH REPLICATION │
│ Each write is stored on 3 of the 5 OSDs across 3 UK nodes (size=3) │
│ Recovery: Instant (data already available) │
│ RPO: 0 (synchronous) │
└─────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ LAYER 2: PBS INCREMENTAL (FRANCE) │
│ Daily, node-staggered across 22:00-05:00, to pbs-fr │
│ Deduplication reduces transfer to 1-5 GB/night │
│ Retention: 7 daily + 4 weekly + 2 monthly │
│ Recovery: 10-20 minutes │
│ RPO: Up to 24 hours │
└─────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ LAYER 3: UK LOCAL VZDUMP │
│ Daily at 05:30 (px2/px3) to px3-nas │
│ Full backup for fast local restore │
│ Recovery: 5-10 minutes │
│ RPO: Up to 25 hours │
└─────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ LAYER 4: PBS WEEKLY ARCHIVE │
│ Sunday at 07:00 to pbs-fr │
│ Retention: 8 weekly, 3 monthly │
│ RPO: Up to 7 days │
└─────────────────────────────────────────────────────────────────────────┘
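The Layer 2 retention of 7 daily + 4 weekly + 2 monthly can be pictured with a simplified pruning model. This is a sketch under stated assumptions, not PBS's exact prune algorithm: each rule keeps the newest backup in each period it has not yet covered, rules run in daily, weekly, monthly order, and `select_keep` is a hypothetical function name.

```python
from datetime import date, timedelta


def select_keep(backup_dates, keep_daily=7, keep_weekly=4, keep_monthly=2):
    """Return the set of backup dates a simplified retention policy keeps."""
    newest_first = sorted(set(backup_dates), reverse=True)
    keep = set()
    rules = [
        (keep_daily, lambda d: d.isoformat()),                 # one per day
        (keep_weekly, lambda d: tuple(d.isocalendar()[:2])),   # (ISO year, ISO week)
        (keep_monthly, lambda d: (d.year, d.month)),           # one per month
    ]
    for limit, period_of in rules:
        # Periods already represented by a kept backup are not counted again.
        covered = {period_of(d) for d in keep}
        added = 0
        for d in newest_first:
            if added >= limit:
                break
            period = period_of(d)
            if period in covered:
                continue
            covered.add(period)
            keep.add(d)
            added += 1
    return keep


# 90 consecutive nightly backups ending on 2026-02-04
backups = [date(2026, 2, 4) - timedelta(days=n) for n in range(90)]
kept = select_keep(backups)
print(len(kept))  # 13
```

With 90 nightly backups the model keeps 13 snapshots (7 + 4 + 2) and everything else becomes eligible for pruning.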
Recovery Time Objectives (RTO)
| Scenario | Method | RTO |
|---|---|---|
| VM data corruption | Ceph RBD rollback | < 1 minute |
| VM configuration error | Vzdump restore (UK) | 5-10 minutes |
| Single node failure | Ceph automatic (HA failover) | 1-2 minutes |
| UK secondary restore | From px3-nas | 5-10 minutes |
| PBS restore (France) | From pbs-fr | 10-20 minutes |
| UK site failure | Restore on px5 from PBS | 1-2 hours |
PBS Benefits
| Metric | Before (vzdump NFS) | After (PBS) |
|---|---|---|
| Nightly WAN transfer | 40-100 GB | 1-5 GB |
| Storage used | ~1.5 TB | ~500 GB |
| Restore time (France) | 30-60 min | 10-20 min |
| Resume on failure | No | Yes |
| Deduplication | No | Yes |
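To put the WAN transfer figures in context, the sketch below converts them into transfer times. The 50 Mbit/s link speed is purely an assumption for illustration (the actual UK to France bandwidth is not stated in this document), and protocol overhead is ignored.

```python
def transfer_minutes(size_gb, link_mbit_per_s):
    """Minutes needed to move size_gb (decimal GB) over a link of the given speed."""
    bits = size_gb * 8 * 1000**3
    seconds = bits / (link_mbit_per_s * 1000**2)
    return seconds / 60


ASSUMED_LINK_MBIT = 50  # assumption, not a measured value

for label, size_gb in [
    ("vzdump NFS, low", 40),
    ("vzdump NFS, high", 100),
    ("PBS, low", 1),
    ("PBS, high", 5),
]:
    minutes = transfer_minutes(size_gb, ASSUMED_LINK_MBIT)
    print(f"{label}: {minutes:.0f} min")
```

At the assumed speed, a 100 GB transfer takes roughly 4.4 hours, longer than a two-hour node window, while 5 GB takes about 13 minutes.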
Critical VMs
These VMs have all protection layers enabled, except VM 1118, which is stopped and no longer backed up:
| VMID | Name | Ceph | PBS (France) | UK Local |
|---|---|---|---|---|
| 1112 | prod-database | ✅ | ✅ | ✅ |
| 1113 | prod-iot-platform | ✅ | ✅ | ✅ |
| 1118 | isp-monitor (STOPPED - migrated to Mint) | ❌ | ❌ | ❌ |
| 3102 | homelab-monitor | ✅ | ✅ | ✅ |
Storage Summary
| Storage | Type | Location | Capacity | Purpose |
|---|---|---|---|---|
| pbs-fr | PBS | CT 5101 (France) | ~1.1 TB free | Primary off-site |
| px3-nas | NFS | UK | 1.8 TB | Fast UK restore |
| ceph-pool | RBD | 5 OSDs UK | ~2.9 TB | Live VM storage |
Hub2 Backups
hub2 (central dedicated server) is backed up daily via rsync over WireGuard.
| Time | Target | Retention |
|---|---|---|
| 03:00 UTC | px3 (UK) | 7 daily |
| 03:00 UTC | px5 (FR) | 7 daily |
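One common way to get "7 daily" retention from rsync is dated snapshot directories that hard-link unchanged files to the previous day. The sketch below assumes that layout; the document does not say how hub2's job is actually implemented, and the source path, destination root, and function names are all hypothetical. It only builds the command and the prune list, and runs nothing.

```python
from datetime import date, timedelta


def rsync_command(source, dest_root, day):
    """Build an rsync argv that writes a dated snapshot of source under dest_root."""
    today_dir = f"{dest_root}/{day.isoformat()}"
    # A relative --link-dest is resolved against the destination directory.
    previous = f"../{(day - timedelta(days=1)).isoformat()}"
    return ["rsync", "-a", "--delete", f"--link-dest={previous}", source, today_dir]


def snapshots_to_prune(snapshot_names, keep=7):
    """Return ISO-dated snapshot names older than the newest `keep` entries."""
    newest_first = sorted(snapshot_names, reverse=True)
    return sorted(newest_first[keep:])


# Hypothetical paths; px3 is one of the targets listed in the table above.
cmd = rsync_command("/srv/", "px3:/backups/hub2", date(2026, 2, 4))
print(" ".join(cmd))

existing = [(date(2026, 2, 4) - timedelta(days=n)).isoformat() for n in range(9)]
print(snapshots_to_prune(existing))  # ['2026-01-27', '2026-01-28']
```

Because ISO dates sort lexicographically, plain string sorting is enough to find the oldest snapshots.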
For full details: