WireGuard Reachability Monitor

Lightweight UDP-based monitoring service that checks the reachability of WireGuard VPN endpoints every 5 minutes. It detects network issues, DNS failures, and port unavailability, with structured logging and failure alerting.

Overview

| Property | Value |
|---|---|
| Location | hub2 (OVH Dedicated Server) |
| Container | charliehub_wg_monitor |
| Service Type | Observability (monitoring only) |
| Check Interval | Every 5 minutes |
| Endpoints | UK (51821), FR (51820) |
| Version | 1.0 (Feb 2026) |

!!! info "Purpose"
    Provides early warning of WireGuard endpoint unavailability. Complements WAN Watcher's IP detection with active UDP reachability probes to catch network-level issues (firewall blocks, ISP changes, port failures) before VPN clients experience disconnections.

Architecture

┌─────────────────────────────────────┐
│  WG Monitor Service (Docker)        │
├─────────────────────────────────────┤
│  Every 5 minutes:                   │
│  1. Resolve uk-vpn.charliehub.net   │
│  2. Resolve fr-vpn.charliehub.net   │
│  3. UDP probe to (ip):51821 (UK)    │
│  4. UDP probe to (ip):51820 (FR)    │
│  5. Track consecutive failures      │
│  6. Alert on 3 consecutive failures │
│  7. Log results (structured)        │
└─────────────────────────────────────┘
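Steps 1 and 2 of the diagram resolve each endpoint's hostname before any probe is sent. The following is a minimal sketch of that step, not the service's actual code: the helper name `resolve_ipv4` is hypothetical, and it assumes that only IPv4 addresses are probed and that a resolution error is reported to the caller as `None`.

```python
import socket
from typing import Optional


def resolve_ipv4(hostname: str) -> Optional[str]:
    """Return the first IPv4 address for hostname, or None if resolution fails.

    Hypothetical helper for illustration; the real service may resolve differently.
    """
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_DGRAM
        )
    except socket.gaierror:
        return None  # DNS failure: let the caller count this as a failed check
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return infos[0][4][0] if infos else None


if __name__ == "__main__":
    print(resolve_ipv4("localhost"))
    print(resolve_ipv4("nonexistent.invalid"))
```

Returning `None` instead of raising keeps a DNS failure on the same path as a failed probe, so it can feed the same consecutive-failure counter.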

How It Works

UDP Reachability Detection

The service uses UDP socket probes to test endpoint reachability:

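import socket

def probe_udp_port(ip: str, port: int, timeout: int = 3) -> bool:
    """Return False only when the probe hits an ICMP or network error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # connect() ties the socket to one peer so the kernel reports ICMP
            # errors for it; on Linux an unconnected UDP socket does not
            # normally receive them.
            sock.connect((ip, port))
            sock.send(b"")             # Send empty datagram
            sock.recv(1024)            # Wait for a reply or an ICMP error
        except socket.timeout:
            return True                # No ICMP error = reachable
        except ConnectionRefusedError:
            return False               # ICMP port unreachable
        except OSError:
            return False               # e.g. network or host unreachable
    return True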

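What constitutes "reachable":

  • ✅ The UDP send succeeds
  • ✅ The socket times out without an ICMP error
  • ✅ No response is received (UDP is stateless)
  • ❌ ConnectionRefusedError (ICMP port unreachable)

WireGuard does not answer unauthenticated packets, so a timeout is the expected result for a healthy endpoint. A firewall that silently drops the probe looks the same, so this check detects rejected or closed ports but not silent drops.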

Consecutive Failure Tracking

Prevents false alerts from transient packet loss:

  • Each probe result is tracked per endpoint
  • Consecutive failures counter increments on each failure
  • Counter resets to 0 on success
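  • Alert triggered on the 3rd consecutive failure: 10 minutes after the first failed check at the default 5-minute interval, and at most ~15 minutes after the outage begins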
  • Single successful probe resets counter and clears alert status
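The counter rules above can be sketched as a small state machine. This is an illustrative sketch only: the class name `FailureTracker`, the event strings, and the choice to alert once per outage (rather than on every failed check past the threshold) are assumptions, not details taken from the service.

```python
from dataclasses import dataclass


@dataclass
class FailureTracker:
    """Per-endpoint consecutive-failure counter (hypothetical names)."""

    threshold: int = 3
    failures: int = 0
    alerting: bool = False

    def record(self, reachable: bool) -> str:
        """Update state with one probe result and return the event to log."""
        if reachable:
            recovered = self.alerting
            self.failures = 0        # a single success resets the counter
            self.alerting = False    # and clears the alert status
            return "recovered" if recovered else "ok"
        self.failures += 1
        if self.failures >= self.threshold and not self.alerting:
            self.alerting = True     # assumption: alert once per outage
            return "alert"
        return "failure"


tracker = FailureTracker(threshold=3)
events = [tracker.record(r) for r in (False, False, False, False, True, True)]
print(events)
# ['failure', 'failure', 'alert', 'failure', 'recovered', 'ok']
```

One tracker instance per endpoint keeps the UK and FR counters independent.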

Parallel Checking

Both endpoints are checked simultaneously using threads:

Start checks at T=0
├── Thread 1: UK probe (0-3s)
├── Thread 2: FR probe (0-3s)
└── Wait for both
    └── Log results (T=3s total, not 6s)
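One way to get this timing is a thread pool that submits every check first and only then waits for the results. The sketch below is an assumption about the approach, not the service's code: `run_checks` and `slow_check` are hypothetical names, and the probes are replaced by stand-ins that sleep so the example runs without network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every check in its own thread and collect the results by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        # Submit everything before waiting, so the checks overlap in time.
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        return {name: fut.result() for name, fut in futures.items()}


def slow_check(result: bool, delay: float = 0.3) -> Callable[[], bool]:
    """Stand-in for a UDP probe: sleeps for `delay` seconds, then returns `result`."""
    def check() -> bool:
        time.sleep(delay)
        return result
    return check


start = time.monotonic()
results = run_checks({"UK": slow_check(True), "FR": slow_check(False)})
elapsed = time.monotonic() - start
print(results, f"{elapsed:.2f}s")
```

The elapsed time should be close to one delay rather than the sum of both, which mirrors the "3s total, not 6s" timing above. Note that `fut.result()` re-raises any exception from a check, so real probes should return `False` on errors rather than raise.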

Configuration

Environment variables in docker-compose.yml:

environment:
  - UK_VPN_DOMAIN=uk-vpn.charliehub.net
  - FR_VPN_DOMAIN=fr-vpn.charliehub.net
  - UK_WG_PORT=51821
  - FR_WG_PORT=51820
  - CHECK_INTERVAL=300
  - PROBE_TIMEOUT=3
  - CONSECUTIVE_FAILURE_THRESHOLD=3

| Variable | Default | Description |
|---|---|---|
| UK_VPN_DOMAIN | uk-vpn.charliehub.net | UK endpoint FQDN |
| FR_VPN_DOMAIN | fr-vpn.charliehub.net | FR endpoint FQDN |
| UK_WG_PORT | 51821 | UK WireGuard listen port |
| FR_WG_PORT | 51820 | FR WireGuard listen port |
| CHECK_INTERVAL | 300 | Polling interval (seconds) |
| PROBE_TIMEOUT | 3 | UDP probe timeout (seconds) |
| CONSECUTIVE_FAILURE_THRESHOLD | 3 | Alert threshold (failures) |
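
As an illustration of how these variables and defaults might be consumed, here is a minimal sketch. The `Config` dataclass, its field names, and `load_config` are hypothetical; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Config:
    """Settings for the monitor (hypothetical structure, documented defaults)."""

    uk_domain: str
    fr_domain: str
    uk_port: int
    fr_port: int
    check_interval: int      # seconds
    probe_timeout: int       # seconds
    failure_threshold: int   # consecutive failures before alerting


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read settings from the environment, falling back to the documented defaults."""
    return Config(
        uk_domain=env.get("UK_VPN_DOMAIN", "uk-vpn.charliehub.net"),
        fr_domain=env.get("FR_VPN_DOMAIN", "fr-vpn.charliehub.net"),
        uk_port=int(env.get("UK_WG_PORT", "51821")),
        fr_port=int(env.get("FR_WG_PORT", "51820")),
        check_interval=int(env.get("CHECK_INTERVAL", "300")),
        probe_timeout=int(env.get("PROBE_TIMEOUT", "3")),
        failure_threshold=int(env.get("CONSECUTIVE_FAILURE_THRESHOLD", "3")),
    )


print(load_config({"CHECK_INTERVAL": "60"}))
```

Passing the environment as a mapping makes the loader easy to test without touching the real process environment.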

Customization

To change polling interval or thresholds:

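# One-off override from the shell. This takes effect only if docker-compose.yml
# reads the value as ${CHECK_INTERVAL:-300} rather than a literal 300.
CHECK_INTERVAL=60 docker compose up -d wg-monitor

# Or edit the value in docker-compose.yml and apply
docker compose up -d wg-monitor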

Logging & Monitoring

Log Formats

Startup (initial configuration):

2026-02-13T09:53:50 [INFO] Starting WireGuard Reachability Monitor
2026-02-13T09:53:50 [INFO]   UK endpoint: uk-vpn.charliehub.net:51821
2026-02-13T09:53:50 [INFO]   FR endpoint: fr-vpn.charliehub.net:51820
2026-02-13T09:53:50 [INFO]   Check interval: 300s
2026-02-13T09:53:50 [INFO]   Probe timeout: 3s
2026-02-13T09:53:50 [INFO]   Alert threshold: 3 consecutive failures

Reachability - Normal:

2026-02-13T09:53:53 [INFO] ✅ UK WireGuard reachable: uk-vpn.charliehub.net (185.122.194.5:51821)
2026-02-13T09:53:53 [INFO] ✅ FR WireGuard reachable: fr-vpn.charliehub.net (78.116.21.175:51820)
2026-02-13T09:53:53 [INFO] Summary: UK ✅ OK | FR ✅ OK

Reachability - First failure:

2026-02-13T10:00:00 [WARNING] ⚠️  UK WireGuard unreachable: uk-vpn.charliehub.net (185.122.194.5:51821) - failure #1

Reachability - Threshold reached (alert):

2026-02-13T10:10:00 [ERROR] 🚨 ALERT: UK WireGuard unreachable for 3 checks. Endpoint uk-vpn.charliehub.net (185.122.194.5:51821) not responding to UDP probes.

Reachability - Recovery:

2026-02-13T10:15:00 [INFO] ✅ UK WireGuard RECOVERED: uk-vpn.charliehub.net (185.122.194.5:51821) reachable
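
The timestamp and level layout in these lines can be reproduced with Python's standard `logging` module. The sketch below is an assumption about how such output could be produced, not the service's actual logging setup; the logger name `wg_monitor` and the helper `make_logger` are hypothetical, and the emoji are omitted from the sample messages.

```python
import logging
import sys


def make_logger(name: str = "wg_monitor") -> logging.Logger:
    """Build a stdout logger formatted as 'YYYY-MM-DDTHH:MM:SS [LEVEL] message'."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter(
            fmt="%(asctime)s [%(levelname)s] %(message)s",
            datefmt="%Y-%m-%dT%H:%M:%S",
        )
    )
    logger = logging.getLogger(name)
    logger.handlers = [handler]   # replace any handlers left from earlier calls
    logger.setLevel(logging.INFO)
    logger.propagate = False      # avoid duplicate lines via the root logger
    return logger


log = make_logger()
log.info("Summary: UK OK | FR OK")
log.warning("UK WireGuard unreachable: uk-vpn.charliehub.net - failure #1")
```

Writing to stdout keeps the lines visible through `docker logs` without any extra log driver configuration.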

Service Management

Viewing Logs

# View recent logs
docker logs charliehub_wg_monitor --tail 50

# Follow logs in real-time
docker logs charliehub_wg_monitor -f

# View with timestamps
docker logs charliehub_wg_monitor --timestamps

Health Check

The container reports its status through a Docker health check:

# Check container health
docker ps | grep wg_monitor
# Should show: (healthy)

# Verify the Python runtime inside the container can create a socket
docker exec charliehub_wg_monitor python3 -c "import socket; socket.socket()"
# Exit code 0 = runtime OK (this does not exercise the probe logic itself)

Manual Restart

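To trigger an immediate reachability check, restart the container. Because failure counts are held in memory, this also resets the consecutive-failure counters: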

docker restart charliehub_wg_monitor

Testing & Failure Simulation

Test 1: Verify Normal Operation

Expected state: Both endpoints reachable

# View logs
docker logs charliehub_wg_monitor --tail 10
# Should show: Summary: UK ✅ OK | FR ✅ OK

Test 2: Simulate DNS Failure

To test DNS resolution failure:

# In docker-compose.yml, temporarily change:
UK_VPN_DOMAIN=nonexistent.invalid

# Redeploy
docker compose up -d wg-monitor

# Expect on the 3rd failed check (~10 min after redeploy at the default interval):
# [ERROR] 🚨 ALERT: UK WireGuard unreachable for 3 checks. Cannot resolve...

Test 3: Simulate Port Block

To test port unavailability (requires access to WireGuard endpoint):

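# On UK WireGuard host, temporarily reject traffic to port 51821.
# Use reject rather than deny: deny drops packets silently, which the probe
# counts as reachable, while reject answers with ICMP port unreachable.
sudo ufw reject 51821/udp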

# Monitor will show:
# [WARNING] ⚠️  UK WireGuard unreachable: ... - failure #1
# [WARNING] ⚠️  UK WireGuard unreachable: ... - failure #2
# [ERROR] 🚨 ALERT: UK WireGuard unreachable for 3 checks. ...

# Re-enable port
sudo ufw allow 51821/udp

# Next successful probe (~5 min):
# [INFO] ✅ UK WireGuard RECOVERED: ... reachable

Test 4: Verify Parallel Execution

Check container logs for timing:

docker logs charliehub_wg_monitor | grep -E "WireGuard (reachable|unreachable)"
# UK and FR lines from the same check should be stamped within ~3 seconds of each other

Troubleshooting

Both endpoints showing as unreachable

Check 1: DNS resolution

docker exec charliehub_wg_monitor nslookup uk-vpn.charliehub.net
docker exec charliehub_wg_monitor nslookup fr-vpn.charliehub.net
# Should resolve to public IPs (185.x.x.x, 78.x.x.x)

Check 2: Network connectivity

# Verify container can reach external hosts
docker exec charliehub_wg_monitor ping -c 1 8.8.8.8
# Should succeed (ICMP allowed to external networks)

Check 3: Port accuracy

# Verify ports match WireGuard configuration
docker exec charliehub_wg_monitor cat /etc/wireguard/wg-uk.conf | grep ListenPort
# Should show: ListenPort=51821

One endpoint reachable, other failing

For UK endpoint failing:

# Check wan-watcher updated the endpoint correctly
docker logs charliehub_wan_watcher | grep "UK"

# Verify WireGuard is listening on the port
ssh uk-host 'sudo ss -lun | grep 51821'
# Should show: UNCONN 0 0 0.0.0.0:51821

For FR endpoint failing:

# Verify FR endpoint is configured
docker exec charliehub_wg_monitor cat /etc/wireguard/wg-fr.conf | grep Endpoint

Container not starting

# Check container startup logs
docker logs charliehub_wg_monitor
# Should show: Starting WireGuard Reachability Monitor

# Verify image exists
docker images | grep wg-monitor
# Should show: charliehub-wg-monitor

# Check permissions on mounted files
ls -la /opt/charliehub/wg-monitor/

Design Decisions

Why UDP Probes (Not TCP)?

WireGuard listens on UDP only. TCP probes would always fail:

| Protocol | Port 51821 | Port 51820 | Result |
|---|---|---|---|
| TCP | ❌ Refused | ❌ Refused | Would always fail |
| UDP | ✅ Open | ✅ Open | Correct detection |

Why Parallel Threads?

  • Both endpoints checked simultaneously
  • Total check time: ~3 seconds (not 6)
  • Independent checks (one failure doesn't block the other)
  • Scales to additional endpoints easily

Why Consecutive Failure Tracking?

  • Prevents false alerts from transient packet loss
  • 3-check threshold at the 5-minute interval = a failure must persist across at least 10 minutes of checks
  • Only alerts on sustained problems
  • Recovery is immediate (1 successful probe = reset)

Why Separate Service (Not in WAN Watcher)?

  • Separation of concerns: DNS updates vs. reachability monitoring
  • Independent restart cycles: either service can be restarted without affecting the other
  • Can be disabled without affecting DNS updates
  • Easier to test and troubleshoot

Impact & Safety

Non-Invasive Observability

Does NOT modify:

  • WireGuard configurations
  • DNS records
  • CT1119 (WireGuard endpoints)
  • WAN Watcher behavior

Does ONLY:

  • Send empty UDP datagrams to ports 51821 and 51820
  • Resolve domain names (standard DNS queries)
  • Log results to stdout
  • Track state in memory

Rollback

If wg-monitor causes issues:

# Stop the service
docker compose down wg-monitor

# Discard uncommitted wg-monitor changes in docker-compose.yml (optional)
git checkout docker-compose.yml

# Remove directory (optional)
sudo rm -rf /opt/charliehub/wg-monitor

# Verify other services unaffected
docker compose up -d

Impact of rollback: Zero impact. wg-monitor is standalone observability, not required for core functionality.