WireGuard Reachability Monitor¶
Lightweight UDP-based monitoring service that continuously checks the reachability of WireGuard VPN endpoints. Detects network issues, DNS failures, and port unavailability with structured logging and failure alerting.
Overview¶
| Property | Value |
|---|---|
| Location | hub2 (OVH Dedicated Server) |
| Container | charliehub_wg_monitor |
| Service Type | Observability (monitoring only) |
| Check Interval | Every 5 minutes |
| Endpoints | UK (51821), FR (51820) |
| Version | 1.0 (Feb 2026) |
!!! info "Purpose"
    Provides early warning of WireGuard endpoint unavailability. Complements WAN Watcher's IP detection with active UDP reachability probes to catch network-level issues (firewall blocks, ISP changes, port failures) before VPN clients experience disconnections.
Architecture¶
┌─────────────────────────────────────┐
│ WG Monitor Service (Docker)         │
├─────────────────────────────────────┤
│ Every 5 minutes:                    │
│ 1. Resolve uk-vpn.charliehub.net    │
│ 2. Resolve fr-vpn.charliehub.net    │
│ 3. UDP probe to (ip):51821 (UK)     │
│ 4. UDP probe to (ip):51820 (FR)     │
│ 5. Track consecutive failures       │
│ 6. Alert on 3 consecutive failures  │
│ 7. Log results (structured)         │
└─────────────────────────────────────┘
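Steps 1 and 2 could be sketched roughly as follows. The helper name `resolve_ipv4` is hypothetical and not taken from the service's source; it returns `None` on a DNS failure so that the caller can count the check as failed.

```python
import socket
from typing import Optional


def resolve_ipv4(hostname: str) -> Optional[str]:
    """Return the first IPv4 address for hostname, or None if resolution fails."""
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_DGRAM
        )
    except (socket.gaierror, UnicodeError):
        # gaierror: name does not resolve; UnicodeError: malformed hostname
        return None
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port)
    return infos[0][4][0] if infos else None
```

Calling `resolve_ipv4("uk-vpn.charliehub.net")` would yield the address to probe, while a name that cannot be resolved yields `None` instead of raising.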
How It Works¶
UDP Reachability Detection¶
The service uses UDP socket probes to test endpoint reachability:
import socket

def probe_udp_port(ip: str, port: int, timeout: int = 3) -> bool:
    """Return False only when the host answers with ICMP port unreachable."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # connect() is required: the kernel only reports ICMP errors
            # (as ConnectionRefusedError) on a connected UDP socket
            sock.connect((ip, port))
            sock.send(b"")   # Send empty datagram
            sock.recv(1024)  # Wait for a reply or an ICMP error
        except socket.timeout:
            return True   # No response but no ICMP error = reachable
        except ConnectionRefusedError:
            return False  # ICMP port unreachable
        return True  # Got a reply
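What constitutes "reachable":
- ✅ The datagram is sent without error
- ✅ Socket timeout with no ICMP error
- ✅ No response to receive (UDP is connectionless, and WireGuard does not reply to unauthenticated packets)
- ❌ ConnectionRefusedError (ICMP port unreachable)

Because silence counts as reachable, a firewall that drops packets without returning an ICMP error is not detected by this probe.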
Consecutive Failure Tracking¶
Prevents false alerts from transient packet loss:
- Each probe result is tracked per endpoint
- Consecutive failures counter increments on each failure
- Counter resets to 0 on success
- Alert triggered on the 3rd consecutive failure (10 minutes after the first failed check at the default interval, so 10 to 15 minutes after an outage begins)
- Single successful probe resets counter and clears alert status
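The rules above can be sketched as a small per-endpoint state object. This is an illustration only: the class name `EndpointState`, the event strings, and the choice to emit the alert once (rather than on every failure past the threshold) are assumptions, not details confirmed by the service.

```python
from dataclasses import dataclass


@dataclass
class EndpointState:
    """In-memory failure state for one endpoint (hypothetical structure)."""
    threshold: int = 3
    consecutive_failures: int = 0
    alerting: bool = False

    def record(self, reachable: bool) -> str:
        """Update the counter and return 'ok', 'recovered', 'failure', or 'alert'."""
        if reachable:
            was_alerting = self.alerting
            self.consecutive_failures = 0  # any success resets the counter
            self.alerting = False
            return "recovered" if was_alerting else "ok"
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.alerting:
            self.alerting = True  # assumption: alert once per outage
            return "alert"
        return "failure"


state = EndpointState(threshold=3)
events = [state.record(r) for r in (False, False, False, False, True, True)]
print(events)
# ['failure', 'failure', 'alert', 'failure', 'recovered', 'ok']
```

Keeping one such object per endpoint means a UK outage never affects the FR counter.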
Parallel Checking¶
Both endpoints checked simultaneously using threads:
Start checks at T=0
├── Thread 1: UK probe (0-3s)
├── Thread 2: FR probe (0-3s)
└── Wait for both
└── Log results (T=3s total, not 6s)
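A minimal sketch of this pattern, assuming a thread pool from the standard library (the service may equally use `threading.Thread` directly). The function `check_all`, the stand-in `slow_probe`, and the 192.0.2.x documentation addresses are placeholders for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


def check_all(
    endpoints: Dict[str, Endpoint],
    probe: Callable[[str, int], bool],
) -> Dict[str, bool]:
    """Run probe(ip, port) for every endpoint concurrently and collect results."""
    with ThreadPoolExecutor(max_workers=max(1, len(endpoints))) as pool:
        futures = {
            name: pool.submit(probe, ip, port)
            for name, (ip, port) in endpoints.items()
        }
        # result() blocks until that probe finishes; the probes themselves overlap
        return {name: fut.result() for name, fut in futures.items()}


def slow_probe(ip: str, port: int) -> bool:
    """Stand-in for a real UDP probe that waits out its timeout."""
    time.sleep(0.3)
    return True


start = time.monotonic()
results = check_all(
    {"UK": ("192.0.2.1", 51821), "FR": ("192.0.2.2", 51820)}, slow_probe
)
elapsed = time.monotonic() - start
print(results, f"elapsed={elapsed:.2f}s")
```

Total elapsed time is close to one probe duration rather than the sum of both, which is the same effect as the 3s versus 6s comparison above.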
Configuration¶
Environment variables in docker-compose.yml:
environment:
- UK_VPN_DOMAIN=uk-vpn.charliehub.net
- FR_VPN_DOMAIN=fr-vpn.charliehub.net
- UK_WG_PORT=51821
- FR_WG_PORT=51820
- CHECK_INTERVAL=300
- PROBE_TIMEOUT=3
- CONSECUTIVE_FAILURE_THRESHOLD=3
| Variable | Default | Description |
|---|---|---|
| UK_VPN_DOMAIN | uk-vpn.charliehub.net | UK endpoint FQDN |
| FR_VPN_DOMAIN | fr-vpn.charliehub.net | FR endpoint FQDN |
| UK_WG_PORT | 51821 | UK WireGuard listen port |
| FR_WG_PORT | 51820 | FR WireGuard listen port |
| CHECK_INTERVAL | 300 | Polling interval (seconds) |
| PROBE_TIMEOUT | 3 | UDP probe timeout (seconds) |
| CONSECUTIVE_FAILURE_THRESHOLD | 3 | Alert threshold (failures) |
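On the Python side, these variables might be read along the following lines. This is a sketch under assumptions: `MonitorConfig` and `load_config` are hypothetical names, and only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MonitorConfig:
    uk_domain: str
    fr_domain: str
    uk_port: int
    fr_port: int
    check_interval: int     # seconds
    probe_timeout: int      # seconds
    failure_threshold: int  # consecutive failures before alerting


def load_config(env: Mapping[str, str] = os.environ) -> MonitorConfig:
    """Read settings from env, falling back to the documented defaults."""
    return MonitorConfig(
        uk_domain=env.get("UK_VPN_DOMAIN", "uk-vpn.charliehub.net"),
        fr_domain=env.get("FR_VPN_DOMAIN", "fr-vpn.charliehub.net"),
        uk_port=int(env.get("UK_WG_PORT", "51821")),
        fr_port=int(env.get("FR_WG_PORT", "51820")),
        check_interval=int(env.get("CHECK_INTERVAL", "300")),
        probe_timeout=int(env.get("PROBE_TIMEOUT", "3")),
        failure_threshold=int(env.get("CONSECUTIVE_FAILURE_THRESHOLD", "3")),
    )


defaults = load_config({})
overridden = load_config({"CHECK_INTERVAL": "60"})
print(defaults.check_interval, overridden.check_interval)  # prints: 300 60
```

Passing the environment as a mapping keeps the loader easy to test; a non-numeric value raises `ValueError` at startup instead of failing later inside a check.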
Customization¶
To change polling interval or thresholds:
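# One-off override from the shell (only takes effect if docker-compose.yml
# references the variable, e.g. CHECK_INTERVAL=${CHECK_INTERVAL:-300})
CHECK_INTERVAL=60 docker compose up -d wg-monitor
# Or edit the value in docker-compose.yml and apply
docker compose up -d wg-monitor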
Logging & Monitoring¶
Log Formats¶
Startup (initial configuration):
2026-02-13T09:53:50 [INFO] Starting WireGuard Reachability Monitor
2026-02-13T09:53:50 [INFO] UK endpoint: uk-vpn.charliehub.net:51821
2026-02-13T09:53:50 [INFO] FR endpoint: fr-vpn.charliehub.net:51820
2026-02-13T09:53:50 [INFO] Check interval: 300s
2026-02-13T09:53:50 [INFO] Probe timeout: 3s
2026-02-13T09:53:50 [INFO] Alert threshold: 3 consecutive failures
Reachability - Normal:
2026-02-13T09:53:53 [INFO] ✅ UK WireGuard reachable: uk-vpn.charliehub.net (185.122.194.5:51821)
2026-02-13T09:53:53 [INFO] ✅ FR WireGuard reachable: fr-vpn.charliehub.net (78.116.21.175:51820)
2026-02-13T09:53:53 [INFO] Summary: UK ✅ OK | FR ✅ OK
Reachability - First failure:
2026-02-13T10:00:00 [WARNING] ⚠️ UK WireGuard unreachable: uk-vpn.charliehub.net (185.122.194.5:51821) - failure #1
Reachability - Threshold reached (alert):
2026-02-13T10:10:00 [ERROR] 🚨 ALERT: UK WireGuard unreachable for 3 checks. Endpoint uk-vpn.charliehub.net (185.122.194.5:51821) not responding to UDP probes.
Reachability - Recovery:
2026-02-13T10:15:00 [INFO] ✅ UK WireGuard RECOVERED: uk-vpn.charliehub.net (185.122.194.5:51821) reachable
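The `timestamp [LEVEL] message` layout shown in these samples can be produced with the standard `logging` module. The sketch below is an assumption about how such output could be configured, not the service's actual setup; the logger name `wg_monitor_example` is hypothetical, and the in-memory buffer stands in for stdout so the output can be inspected.

```python
import io
import logging


def make_logger(stream) -> logging.Logger:
    """Create a logger that writes 'YYYY-MM-DDTHH:MM:SS [LEVEL] message' lines."""
    logger = logging.getLogger("wg_monitor_example")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter(
            fmt="%(asctime)s [%(levelname)s] %(message)s",
            datefmt="%Y-%m-%dT%H:%M:%S",
        )
    )
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger(buffer)
log.info("Starting WireGuard Reachability Monitor")
log.info("Check interval: 300s")
print(buffer.getvalue(), end="")
```

In a container, passing `sys.stdout` as the stream would make these lines visible through `docker logs`.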
Service Management¶
Viewing Logs¶
# View recent logs
docker logs charliehub_wg_monitor --tail 50
# Follow logs in real-time
docker logs charliehub_wg_monitor -f
# View with timestamps
docker logs charliehub_wg_monitor --timestamps
Health Check¶
The container defines a Docker health check:
# Check container health
docker ps | grep wg_monitor
# Should show: (healthy)
# Verify the Python runtime in the container can create a socket
docker exec charliehub_wg_monitor python3 -c "import socket; socket.socket()"
# Exit code 0 = healthy
Manual Restart¶
To trigger an immediate reachability check, restart the container. Failure counters are held in memory, so a restart also resets them:
docker restart charliehub_wg_monitor
Testing & Failure Simulation¶
Test 1: Verify Normal Operation¶
Expected state: Both endpoints reachable
# View logs
docker logs charliehub_wg_monitor --tail 10
# Should show: Summary: UK ✅ OK | FR ✅ OK
Test 2: Simulate DNS Failure¶
To test DNS resolution failure:
# In docker-compose.yml, temporarily change:
UK_VPN_DOMAIN=nonexistent.invalid
# Redeploy
docker compose up -d wg-monitor
# Expect on the 3rd check (~10 min after redeploy, since the first check runs at startup):
# [ERROR] 🚨 ALERT: UK WireGuard unreachable for 3 checks. Cannot resolve...
Test 3: Simulate Port Block¶
To test port unavailability (requires access to WireGuard endpoint):
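# On UK WireGuard host, temporarily reject port 51821.
# Use "reject", not "deny": deny drops packets silently (no ICMP error),
# which the probe counts as reachable.
sudo ufw reject 51821/udp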
# Monitor will show:
# [WARNING] ⚠️ UK WireGuard unreachable: ... - failure #1
# [WARNING] ⚠️ UK WireGuard unreachable: ... - failure #2
# [ERROR] 🚨 ALERT: UK WireGuard unreachable for 3 checks. ...
# Re-enable port
sudo ufw allow 51821/udp
# Next successful probe (~5 min):
# [INFO] ✅ UK WireGuard RECOVERED: ... reachable
Test 4: Verify Parallel Execution¶
Check container logs for timing:
docker logs charliehub_wg_monitor --timestamps | grep -E "WireGuard (reachable|unreachable)"
# UK and FR lines from the same check should be within ~3 seconds of each other
Troubleshooting¶
Both endpoints showing as unreachable¶
Check 1: DNS resolution
docker exec charliehub_wg_monitor nslookup uk-vpn.charliehub.net
docker exec charliehub_wg_monitor nslookup fr-vpn.charliehub.net
# Should resolve to public IPs (185.x.x.x, 78.x.x.x)
Check 2: Network connectivity
# Verify container can reach external hosts
docker exec charliehub_wg_monitor ping -c 1 8.8.8.8
# Should succeed (ICMP allowed to external networks)
Check 3: Port accuracy
# Verify ports match WireGuard configuration
docker exec charliehub_wg_monitor cat /etc/wireguard/wg-uk.conf | grep ListenPort
# Should show: ListenPort=51821
One endpoint reachable, other failing¶
For UK endpoint failing:
# Check wan-watcher updated the endpoint correctly
docker logs charliehub_wan_watcher | grep "UK"
# Verify WireGuard is listening on the port
ssh uk-host 'sudo ss -lun | grep 51821'
# Should show: UNCONN 0 0 0.0.0.0:51821
For FR endpoint failing:
# Verify FR endpoint is configured
docker exec charliehub_wg_monitor cat /etc/wireguard/wg-fr.conf | grep Endpoint
Container not starting¶
# Check container logs for startup errors
docker logs charliehub_wg_monitor
# Should show: Starting WireGuard Reachability Monitor
# Verify image exists
docker images | grep wg-monitor
# Should show: charliehub-wg-monitor
# Check permissions on mounted files
ls -la /opt/charliehub/wg-monitor/
Design Decisions¶
Why UDP Probes (Not TCP)?¶
WireGuard listens on UDP only. TCP probes would always fail:
| Protocol | Port 51821 | Port 51820 | Result |
|---|---|---|---|
| TCP | ❌ Refused | ❌ Refused | Would always fail |
| UDP | ✅ Open | ✅ Open | Correct detection |
Why Parallel Threads?¶
- Both endpoints checked simultaneously
- Total check time: ~3 seconds (not 6)
- Independent checks (one failure doesn't block the other)
- Scales to additional endpoints easily
Why Consecutive Failure Tracking?¶
- Prevents false alerts from transient packet loss
- 3-check threshold spans 10 minutes from the first to the third failed check (up to ~15 minutes from the start of an outage)
- Only alerts on sustained problems
- Recovery is immediate (1 successful probe = reset)
Why Separate Service (Not in WAN Watcher)?¶
- Separation of concerns: DNS updates vs. reachability monitoring
- Independent restart cycles: Can restart independently
- Can be disabled without affecting DNS updates
- Easier to test and troubleshoot
Impact & Safety¶
Non-Invasive Observability¶
✅ Does NOT modify:
- WireGuard configurations
- DNS records
- CT1119 (WireGuard endpoints)
- WAN Watcher behavior

✅ Does ONLY:
- Send empty UDP datagrams to ports 51821 and 51820
- Resolve domain names (standard DNS queries)
- Log results to stdout
- Track state in memory
Rollback¶
If wg-monitor causes issues:
# Stop the service
docker compose down wg-monitor
# Revert uncommitted changes to docker-compose.yml (optional)
git checkout docker-compose.yml
# Remove directory (optional)
sudo rm -rf /opt/charliehub/wg-monitor
# Verify other services unaffected
docker compose up -d
Impact of rollback: Zero impact. wg-monitor is standalone observability, not required for core functionality.
Related¶
- WAN IP Watcher - Detects WAN IP changes and updates endpoints
- WireGuard VPN - VPN tunnels being monitored
- VPN WG Manager - WireGuard endpoint management on CT1119
- Domain Manager API - DNS record management