Security Maintenance: Quarterly Credential Rotation
Overview
This document describes the quarterly credential rotation process for CharlieHub infrastructure. Regular credential rotation is a critical security practice that:
- Limits exposure window - If a credential is compromised, it's valid for at most 3 months
- Reduces insider risk - Former team members' access automatically expires
- Demonstrates compliance - Required for SOC2, ISO27001, and other security frameworks
- Maintains operational readiness - Ensures rotation procedures work before they're needed in emergencies
Policy: All 13 credentials across CharlieHub infrastructure must be rotated quarterly on the last Sunday of March, June, September, and December at 02:00 UTC.
2026 Rotation Schedule
| Quarter | Date | Time (UTC) | Window | Notes |
|---|---|---|---|---|
| Q1 | March 29, 2026 | 02:00 | 2:00-2:30 AM | Spring rotation |
| Q2 | June 28, 2026 | 02:00 | 2:00-2:30 AM | Summer rotation |
| Q3 | September 27, 2026 | 02:00 | 2:00-2:30 AM | Fall rotation |
| Q4 | December 27, 2026 | 02:00 | 2:00-2:30 AM | Winter rotation |
Each rotation targets a 30-minute maintenance window with less than 5 minutes of actual downtime.
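The dates in the schedule follow from the "last Sunday of the quarter-end month" policy, so they can be recomputed for later years. The sketch below is one way to do that with integer arithmetic only; `last_sunday` is a hypothetical helper name and is not part of the CharlieHub scripts.

```shell
# Sketch: compute the last Sunday of a month (Gregorian calendar).
# last_sunday is a hypothetical helper, not part of rotate-secrets.sh.
last_sunday() (
  year=$1
  month=${2#0}            # accept "03" as well as "3"

  # Days in the month, with the leap-year rule for February.
  case $month in
    4|6|9|11) last=30 ;;
    2)
      if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
        last=29
      else
        last=28
      fi ;;
    *) last=31 ;;
  esac

  # Sakamoto's day-of-week method: 0 = Sunday ... 6 = Saturday.
  set -- 0 3 2 5 0 3 5 1 4 6 2 4
  shift $((month - 1))
  offset=$1
  y=$year
  if [ "$month" -lt 3 ]; then
    y=$((y - 1))
  fi
  dow=$(( (y + y / 4 - y / 100 + y / 400 + offset + last) % 7 ))

  # Step back from the last day of the month to the preceding Sunday.
  printf '%04d-%02d-%02d\n' "$year" "$month" $((last - dow))
)
```

For example, `last_sunday 2026 3` prints `2026-03-29`, which matches the Q1 row of the schedule.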
Credentials Requiring Rotation (13 Total)
Phase 1: Internal Credentials (LOW RISK, 10 min)
These are stored in .env files, easily rotated, and container restarts handle the update:
| Credential | File | Line | Type | Rotated Via |
|---|---|---|---|---|
| DOMAIN_MANAGER_API_KEY | .env | 39 | API Key | rotate-secrets.sh |
| CHARLIEHUB_DB_PASSWORD | .env | 32-33 | Password | rotate-secrets.sh |
| AUTHELIA_JWT_SECRET | .env | 36 | JWT Secret | rotate-secrets.sh |
| GRAFANA_ADMIN_PASSWORD | .env | 48 | Password | rotate-secrets.sh |
| CODE_SERVER_PASSWORD | .env | 51 | Password | rotate-secrets.sh |
| UNIFI_PASSWORD | .env | 44 | Password | rotate-secrets.sh |
| SMTP_PASSWORD | .env | 20 | App Password | rotate-secrets.sh |
| POSTGRES_PASSWORD (domain-manager) | domain-manager/.env | 16 | Password | rotate-secrets.sh |
| API_KEY (domain-manager) | domain-manager/.env | 20 | API Key | rotate-secrets.sh |
- Downtime: ~2 min (container restarts)
- Risk: LOW - Fully automated, easy rollback
- Rollback time: < 3 minutes
Phase 2: Configuration File Secrets (MEDIUM RISK, 15 min)
These are embedded in configuration files and require file edits:
| Credential | File | Purpose | Rotated Via |
|---|---|---|---|
| Authelia encryption_key | authelia/config/configuration.yml:93 | Session encryption | rotate-secrets.sh |
| Authelia hmac_secret | authelia/config/configuration.yml:109 | HMAC signing | rotate-secrets.sh |
| Parking DB password | docker-compose.yml:437 | Database auth | rotate-secrets.sh |
| CBRE DB password | docker-compose.yml:449 | Database auth | rotate-secrets.sh |
- Downtime: ~2 min (container restarts)
- Risk: MEDIUM - Requires file edits, but still automated
- Rollback time: < 3 minutes (restore from backup)
Phase 3: External Dependency Credentials (HIGH RISK, 20 min)
These require action in external systems BEFORE restarting containers:
| Credential | System | Purpose | Manual Steps |
|---|---|---|---|
| OVH_APP_KEY | OVH API | Domain management | Regenerate in OVH console |
| OVH_APP_SECRET | OVH API | Domain management | Regenerate in OVH console |
| OVH_CONSUMER_KEY | OVH API | Domain management | Complete new request |
| SMTP_PASSWORD | Gmail | Alert email delivery | Generate new app password |
| UNIFI_PASSWORD | UniFi Controller | Network monitoring | Change user password |
- Downtime: 0 (external changes don't affect uptime)
- Risk: HIGH - Requires external system access
- Rollback time: ~10 minutes (requires external system action)
Pre-Rotation Checklist (1 Week Before)
- [ ] Monday - Schedule maintenance window, notify team
- [ ] Tuesday - Test rotation script in dry-run mode
  sudo /opt/charliehub/scripts/rotate-secrets.sh --internal --dry-run
- [ ] Wednesday - Review SAFEGUARDS.md security policy
- [ ] Thursday - Document any recent credential changes
- [ ] Friday - Perform backup of entire system
Actions:
# Full dry-run test
sudo /opt/charliehub/scripts/rotate-secrets.sh --all --dry-run
# Review generated secrets (dry-run shows what would change)
# Verify container restart order makes sense
# Test rollback procedure with a fake backup
Day-Before Checklist (Saturday)
- [ ] Verify test results - Dry-run completed successfully?
- [ ] Check external access - Can you reach OVH, Gmail, UniFi consoles?
- [ ] Prepare external credentials - Have temporary access ready
- [ ] Notify stakeholders - Confirm maintenance window acceptable
- [ ] Take full system backup
Actions:
# Create pre-rotation system backup
sudo tar -czf /var/backups/charliehub-pre-rotation-$(date +%Y%m%d).tar.gz \
/opt/charliehub/.env* \
/opt/charliehub/docker-compose.yml \
/opt/charliehub/authelia/config/
# Verify backup integrity (name today's archive explicitly; a glob breaks once older backups exist)
tar -tzf /var/backups/charliehub-pre-rotation-$(date +%Y%m%d).tar.gz | head -10
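Listing the first ten entries shows that the archive is readable, but not that every required file is in it. A minimal sketch of a stricter check follows; `verify_backup` is a hypothetical helper, and the member names passed to it are assumptions about how the archive was created.

```shell
# Sketch: confirm that every expected path is present in a backup archive.
# verify_backup is a hypothetical helper, not part of the CharlieHub scripts.
verify_backup() (
  archive=$1
  shift
  # Fail early if the archive cannot be read at all.
  listing=$(tar -tzf "$archive") || exit 1
  for member in "$@"; do
    # -F: literal string, -x: match the whole line, -q: no output
    if ! printf '%s\n' "$listing" | grep -Fxq -- "$member"; then
      echo "missing from backup: $member" >&2
      exit 1
    fi
  done
)
```

GNU tar strips the leading `/` when it stores absolute paths, so for the backup created above the members would be named like `opt/charliehub/.env` and `opt/charliehub/docker-compose.yml`, and those are the names to pass to `verify_backup`.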
Rotation Day Execution (Sunday, 02:00 UTC)
T-0:30 - Pre-Rotation Window (01:30 UTC)
Status checks:
# Verify all services healthy before starting
curl -s http://172.19.0.5:8001/health | jq .
docker ps | grep charliehub | wc -l # Should show ~8 containers
# Check DNS resolution
dig charliehub.net +short # Should return IP
dig '*.charliehub.net' +short # Should resolve subdomains (quoted so the shell does not expand the *)
Notifications:
- Send to #ops Slack channel: "Starting credential rotation at 02:00 UTC"
- Monitor status page for any active incidents
- Ensure no deployments in progress
T+0:00 - Rotation Execution (02:00 UTC)
Phase 1: Internal Secrets (02:00-02:10)
# Run rotation for internal secrets only
sudo /opt/charliehub/scripts/rotate-secrets.sh --internal
# This will:
# 1. Back up current .env files
# 2. Generate new secrets for all internal credentials
# 3. Update .env files (temporarily writable)
# 4. Restart containers in safe order:
#    - PostgreSQL (5s)
#    - Redis (2s)
#    - Domain Manager & UniFi API (10s parallel)
#    - Authelia (8s)
#    - Grafana & Code Server (5s parallel)
# 5. Validate service health
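Steps 2 and 3 (generate a secret, then write it into a .env file) can be sketched as below. This is an illustration under stated assumptions, not the contents of rotate-secrets.sh: `gen_secret` and `set_env_var` are hypothetical helper names, and the .env file is assumed to hold plain `KEY=value` lines.

```shell
# Sketch of steps 2 and 3. gen_secret and set_env_var are hypothetical
# helpers; the real logic lives in rotate-secrets.sh and may differ.

# 32 random bytes from /dev/urandom, printed as 64 hex characters.
gen_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Replace KEY=... in FILE (or append it if absent). The new content is
# written to a temp file first so a failure cannot leave a half-written .env.
set_env_var() (
  file=$1
  tmp=$(mktemp "${file}.XXXXXX") || exit 1
  # Key and value are passed through the environment so that characters
  # such as "/" or "&" in the secret need no escaping.
  if KEY=$2 VALUE=$3 awk '
       BEGIN { key = ENVIRON["KEY"]; value = ENVIRON["VALUE"] }
       index($0, key "=") == 1 { print key "=" value; found = 1; next }
       { print }
       END { if (!found) print key "=" value }
     ' "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    exit 1
  fi
)
```

A call such as `set_env_var /opt/charliehub/.env GRAFANA_ADMIN_PASSWORD "$(gen_secret)"` would rotate one entry. The replacement file takes the permissions of the temp file, so the intended mode (for example 440) has to be reapplied afterwards.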
Status checks (every 30 seconds):
# Monitor logs
tail -f /var/log/charliehub/secret-rotation-*.log
# Monitor containers (refresh every 30 seconds)
watch -n 30 'docker ps | grep -E "charliehub|authelia|grafana"'
# Check API health (refresh every 30 seconds)
watch -n 30 'curl -s http://172.19.0.5:8001/health | jq .'
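When the checks need to gate the next phase rather than be watched by eye, a polling loop with a bounded number of attempts is one option. The sketch below assumes that the health endpoint answers with a 2xx status when the service is ready; `wait_for_health` is a hypothetical helper name.

```shell
# Sketch: poll a health endpoint until it answers successfully or we give up.
# wait_for_health is a hypothetical helper, not part of rotate-secrets.sh.
wait_for_health() (
  url=$1
  attempts=${2:-10}   # how many times to try
  delay=${3:-30}      # seconds between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (4xx/5xx).
    if curl -fsS --max-time 5 "$url" > /dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      exit 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempts" >&2
  exit 1
)
```

For instance, `wait_for_health http://172.19.0.5:8001/health 10 30 || echo "do not start Phase 2"` would wait up to five minutes before reporting failure.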
Expected timeline:
- T+0:10 (02:10 UTC): All internal secrets rotated, containers restarting
- T+0:15 (02:15 UTC): Services returning to health
- T+0:20 (02:20 UTC): Full health check passing
Phase 2: Configuration Secrets (02:20-02:25)
# Rotate configuration file secrets
sudo /opt/charliehub/scripts/rotate-secrets.sh --config
Changes made:
- Authelia encryption keys rotated
- Database passwords in docker-compose.yml updated
- Containers restarted again with new config
Phase 3: External Credentials (02:25-02:30, MANUAL)
# Display checklist
sudo /opt/charliehub/scripts/rotate-secrets.sh --external
Operator manually:
- OVH API Rotation (5 min):
  - Open https://www.ovh.com/manager/
  - Go to Account → API Credentials
  - Create new "CharlieHub" application
  - Save new APP_KEY, APP_SECRET, CONSUMER_KEY
  - Update /opt/charliehub/domain-manager/.env lines 8-10
  - Keep old credentials valid for 24h grace period
- Gmail App Password (3 min):
  - Open https://myaccount.google.com
  - Go to Security → App Passwords
  - Generate new password for "CharlieHub Alerts"
  - Copy 16-char app password
  - Update /opt/charliehub/.env line 20
  - Test: curl -X POST smtp.gmail.com:587 ...
- UniFi Controller (3 min):
  - Open https://10.44.1.1:8443
  - Go to Settings → Admins
  - Edit api-service user
  - Set new password (20+ chars, save securely)
  - Update /opt/charliehub/.env line 44
  - Test API connectivity
T+0:30 - Rotation Complete (02:30 UTC)
Post-rotation validation:
# Full health check
curl http://172.19.0.5:8001/health && echo "API: OK"
curl -u admin:NEW_GRAFANA_PASSWORD http://grafana:3000/api/health && echo "Grafana: OK"
# Test each service
docker exec charliehub-postgres psql -U charliehub -c "SELECT version();" && echo "PostgreSQL: OK"
docker exec domain-manager curl -s http://localhost:8001/health | jq . && echo "Domain Manager: OK"
# Check logs for errors
docker logs charliehub-postgres | grep -i error | head -5
docker logs authelia | grep -i error | head -5
docker logs domain-manager | grep -i error | head -5
Post-Rotation Verification (Immediate)
Immediate Checks (T+0:30-1:00)
Service availability (should all pass):
# Test each domain
for domain in charliehub.net auth.charliehub.net grafana.charliehub.net; do
  echo "Testing $domain..."
  curl -I "https://$domain" 2>&1 | grep "200\|301\|302"
done
# Test authentication
curl -I https://charliehub.net/auth # Should redirect to Authelia
curl -I https://auth.charliehub.net # Should load login
# Test API
curl -X POST https://api.charliehub.net/domains \
  -H "Authorization: Bearer $(grep '^DOMAIN_MANAGER_API_KEY=' /opt/charliehub/.env | cut -d= -f2-)" \
  -H "Content-Type: application/json" \
  -d '{}' | jq .
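Reading a key out of a .env file has two easy-to-miss pitfalls: an unanchored match also hits similarly named variables, and splitting on `=` truncates values that themselves contain `=`. The sketch below avoids both; `env_value` is a hypothetical helper and assumes plain `KEY=value` lines without quoting or `export` prefixes.

```shell
# Sketch: print the value of KEY from a .env file.
# env_value is a hypothetical helper, not part of the CharlieHub scripts.
env_value() {
  # Remove only the leading "KEY=" so "=" characters inside the value survive;
  # the ^ anchor and trailing "=" keep KEY from matching KEY_OLD and similar names.
  sed -n "s/^$2=//p" "$1" | head -n 1
}
```

With this helper, the bearer token in the API test could be supplied as `$(env_value /opt/charliehub/.env DOMAIN_MANAGER_API_KEY)`.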
Log review:
# Check for errors in past 30 minutes
journalctl -S "30 minutes ago" | grep -i "error\|fatal\|critical" | head -20
# Check docker logs
docker logs charliehub-postgres 2>&1 | tail -20
docker logs authelia 2>&1 | tail -20
docker logs domain-manager 2>&1 | tail -20
24-Hour Checks (After Monday 02:00 UTC)
- [ ] All domains still resolving correctly
- [ ] HTTPS certificates not expiring (check traefik logs)
- [ ] Database backups running normally
- [ ] Monitoring alerts not firing (except expected)
- [ ] User SSO logins working (test with a real user)
- [ ] API requests completing normally (check domain-manager logs)
- [ ] Email alerts being sent (check SMTP logs)
Commands:
# Check certificate expiry (-v is needed: the TLS details are only printed in verbose mode)
curl -vI https://charliehub.net 2>&1 | grep -i "expire\|valid"
# Check domain manager API activity
docker logs domain-manager 2>&1 | grep -i "request\|error" | tail -50
# Check Authelia auth logs
docker logs authelia 2>&1 | grep "Login\|Auth\|success" | tail -20
# Check SMTP connectivity
telnet smtp.gmail.com 587
1-Week Checks
- [ ] No increase in support tickets about access issues
- [ ] All monitoring dashboards showing normal patterns
- [ ] System performance metrics unchanged
- [ ] Zero authentication-related incidents
- [ ] External integrations (OVH, monitoring) working
Emergency Rollback Procedures
If rotation fails or services become unhealthy:
Automatic Rollback (Built-in)
The rotation script includes automatic rollback on failure:
# If validation fails, script automatically:
# 1. Stops container restarts
# 2. Restores .env files from backup
# 3. Restarts containers with old credentials
# 4. Validates services return to health
# 5. Logs all actions to audit trail
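The back up, apply, validate, restore sequence listed above can be sketched for a single file as follows. This is an illustration of the pattern only: `with_rollback` is a hypothetical helper, the apply and validate commands are supplied by the caller, and rotate-secrets.sh may be structured differently.

```shell
# Sketch of the back-up / apply / validate / restore pattern.
# with_rollback is a hypothetical helper, not part of rotate-secrets.sh.
with_rollback() (
  file=$1
  apply=$2       # command that rewrites the file, called as: apply FILE
  validate=$3    # command that returns 0 when services are healthy
  backup="$file.backup-$(date -u +%Y%m%d_%H%M%S)"

  # Never touch the file unless the backup succeeded.
  cp -p "$file" "$backup" || exit 1

  if "$apply" "$file" && "$validate"; then
    echo "rotated; backup kept at $backup"
    exit 0
  fi

  # Apply or validation failed: put the previous credentials back.
  cp -p "$backup" "$file"
  echo "rolled back $file from $backup" >&2
  exit 1
)
```

In a real rotation the validate command would also have to restart the affected containers before checking their health, which this sketch leaves to the caller.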
Manual Rollback (If Automatic Fails)
# List available backups
ls -lh /opt/charliehub/.env-backups/
# Restore from specific backup
sudo /opt/charliehub/scripts/restore-secrets.sh .env.backup-20260329_020000
# Verify restoration
grep "CHARLIEHUB_DB_PASSWORD" /opt/charliehub/.env
# Recreate containers so they re-read .env ("restart" alone keeps the old environment)
cd /opt/charliehub && docker-compose up -d --force-recreate
# Validate health
curl http://172.19.0.5:8001/health | jq .
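During an incident it helps to resolve the newest backup without reading the directory listing by eye. The sketch below assumes the `.env.backup-YYYYMMDD_HHMMSS` naming shown above, which sorts chronologically as plain text; `latest_backup` is a hypothetical helper name.

```shell
# Sketch: print the file name of the newest .env backup in a directory.
# latest_backup is a hypothetical helper, not part of restore-secrets.sh.
latest_backup() (
  dir=${1:-/opt/charliehub/.env-backups}
  newest=
  # Glob results are sorted, and the timestamp format sorts chronologically,
  # so the last match is the most recent backup.
  for f in "$dir"/.env.backup-*; do
    [ -e "$f" ] || continue   # the pattern stays literal when nothing matches
    newest=$f
  done
  [ -n "$newest" ] || exit 1
  basename "$newest"
)
```

The result can be passed straight to the restore script, for example `sudo /opt/charliehub/scripts/restore-secrets.sh "$(latest_backup)"`.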
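Complete System Rollback
# If full system rollback needed from backup created pre-rotation
# (tar stored the paths without the leading /, e.g. opt/charliehub/.env, so extract at /)
sudo tar -xzf /var/backups/charliehub-pre-rotation-20260329.tar.gz -C /
# Verify files restored
ls -la /opt/charliehub/.env
# Ensure permissions correct (regular files only; a bare .env* glob also matches the .env-backups/ directory)
sudo find /opt/charliehub -maxdepth 1 -type f -name '.env*' -exec chmod 440 {} +
# Recreate all services so they pick up the restored files
cd /opt/charliehub && docker-compose up -d --force-recreate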
Troubleshooting Common Issues
Issue: Service fails to start with wrong password error
Error message:
ERROR: relation "domain_manager" does not exist
FATAL: password authentication failed for user "charliehub"
Solution:
# Rollback immediately
sudo /opt/charliehub/scripts/restore-secrets.sh .env.backup-LATEST
# Verify old password in effect
grep CHARLIEHUB_DB_PASSWORD /opt/charliehub/.env
# Recreate containers so they re-read .env ("restart" alone keeps the old environment)
cd /opt/charliehub && docker-compose up -d --force-recreate
# Test connectivity
docker exec charliehub-postgres psql -U charliehub -c "SELECT 1;"
Issue: Authelia failing to start after rotation
Error message:
level=error msg="Encryption key is invalid"
Solution:
# The encryption key format must be hex, 32 bytes (64 hex chars)
# Verify in configuration.yml
grep "encryption_key:" /opt/charliehub/authelia/config/configuration.yml
# If wrong, check backup
cat /opt/charliehub/.env-backups/configuration.yml.backup-LATEST | grep encryption_key
# Restore if needed
sudo chmod 644 /opt/charliehub/authelia/config/configuration.yml
sudo cp /opt/charliehub/.env-backups/configuration.yml.backup-LATEST \
/opt/charliehub/authelia/config/configuration.yml
sudo chmod 440 /opt/charliehub/authelia/config/configuration.yml
# Restart
docker-compose restart authelia
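The format requirement stated above (64 hexadecimal characters, i.e. 32 bytes) can be checked before Authelia is restarted. A minimal sketch, where `is_hex_key` is a hypothetical helper name:

```shell
# Sketch: succeed only if the argument is exactly 64 hexadecimal characters.
# is_hex_key is a hypothetical helper, not part of the CharlieHub scripts.
is_hex_key() {
  # Length check first: 32 bytes are 64 hex characters.
  [ "${#1}" -eq 64 ] || return 1
  # Reject the key if any character falls outside 0-9, a-f, A-F.
  case $1 in
    *[!0-9a-fA-F]*) return 1 ;;
  esac
  return 0
}
```

Pass it the value read from configuration.yml with any surrounding quotes removed, for example `is_hex_key "$key" || echo "encryption_key has the wrong format"`.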
Issue: External credential failed to update (OVH, Gmail, UniFi)
Solution:
# CharlieHub will continue working with old credentials for 24h grace period
# This gives time to fix external systems
# For OVH: If new credentials invalid, revert to previous app credentials
# For Gmail: If app password wrong, generate new one again
# For UniFi: If password wrong, reset on controller, try again
# Monitor logs for connection failures
docker logs domain-manager 2>&1 | grep -i "ovh\|error"
journalctl -S "1 hour ago" | grep -i "smtp\|auth"
# Once external credentials are fixed, recreate the containers so they pick them up
# ("restart" alone does not re-read .env)
docker-compose up -d --force-recreate domain-manager # For OVH changes
docker-compose up -d --force-recreate authelia # For any auth changes
Issue: Services remain unhealthy after rotation
Diagnosis:
# Check container logs
docker logs charliehub-postgres | tail -50
docker logs authelia | tail -50
docker logs domain-manager | tail -50
# Check which containers are running
docker ps
# Check if containers are restarting
docker ps -a | grep -E "Exit|Restarting"
# Check docker resource constraints
docker stats --no-stream
Recovery:
# If out of memory or resources, rebuild images
cd /opt/charliehub && docker-compose down
docker system prune -a # Warning: removes all unused images
docker-compose up -d
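# If configuration is corrupted, restore entire config
# (name one archive explicitly; extract at / because the archive stores opt/charliehub/... paths)
sudo tar -xzf /var/backups/charliehub-pre-rotation-20260329.tar.gz -C /
# Recreate everything so the restored files take effect
cd /opt/charliehub && docker-compose up -d --force-recreate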
Audit and Compliance
Audit Trail
All rotation activities are logged to multiple sources:
Rotation logs:
# Main rotation log
tail -f /var/log/charliehub/secret-rotation-20260329_020000.log
# Git hooks log (if used)
tail -f /var/log/charliehub/git-hooks.log
# System sudo log (sudo is not a systemd unit, so match on the command name instead of -u)
sudo journalctl _COMM=sudo | grep rotate-secrets
Generating rotation report:
# Generate complete audit report
charliehub-audit-report
# This shows:
# - All sudo commands run during rotation
# - All file changes made
# - All container restarts
# - All service health checks
# - Timestamps and operators
# Save report to file
charliehub-audit-report > /var/backups/rotation-audit-20260329.txt
Compliance Requirements
For SOC2 / ISO27001:
Evidence to retain:
- [ ] Pre-rotation dry-run log showing what would change
- [ ] Actual rotation log showing what did change
- [ ] Health check results before and after
- [ ] Operator sign-off (Slack message, email, or git log)
- [ ] Any rollbacks or issues encountered
- [ ] Time taken (should be < 30 minutes)
Annual review:
- [ ] 4 successful rotations completed (Q1, Q2, Q3, Q4)
- [ ] Average rotation time tracked
- [ ] Any issues or incidents documented
- [ ] Rollback procedures tested quarterly
Documentation and Runbooks
For Operators
- This page - Complete rotation procedure
- /opt/charliehub/OPERATOR_HOWTO.md - How to make infrastructure changes
- /opt/charliehub/scripts/rotate-secrets.sh - Automated rotation script
- /opt/charliehub/scripts/restore-secrets.sh - Manual recovery script
- /opt/charliehub/.env-backups/ - Timestamped backup directory
For Developers/Team
- /opt/charliehub/SAFEGUARDS.md - Security policy (includes rotation requirement)
- /opt/charliehub/AGENT_TROUBLESHOOTING.md - Error handling
- Slack #ops channel - Real-time updates during rotation
Related Procedures
- SSL certificate rotation - Automated via Let's Encrypt / Traefik
- Database backup - Daily, automated via cron job
- System updates - Scheduled monthly, separate maintenance window
- Pre-commit hooks - Prevent secrets entering git (automatic on every commit)
Contact and Escalation
Primary contact: ops@charliehub.net
Slack channel: #ops (for real-time rotation updates)
On-call: Check PagerDuty rotation schedule
Escalation path:
- Issues during rotation → Immediately notify team lead
- External system failures → Contact system owner (OVH, Gmail, etc.)
- Data corruption → Activate incident response (see SAFEGUARDS.md)
Appendix A: Script Reference
rotate-secrets.sh
# View usage
/opt/charliehub/scripts/rotate-secrets.sh --help
# Phases available
--internal # Rotate .env files only (low risk, 10 min)
--config # Rotate config files (medium risk, 15 min)
--external # Rotate external dependencies (high risk, manual)
--all # Rotate everything (default)
# Options
--dry-run # Preview changes without applying
# Examples
sudo /opt/charliehub/scripts/rotate-secrets.sh --internal
sudo /opt/charliehub/scripts/rotate-secrets.sh --all --dry-run
sudo /opt/charliehub/scripts/rotate-secrets.sh --config
restore-secrets.sh
# View available backups
ls -lh /opt/charliehub/.env-backups/
# Restore from specific backup
sudo /opt/charliehub/scripts/restore-secrets.sh .env.backup-20260329_020000
# Backups retained for 90 days
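How the 90-day retention is enforced is not described here. If it ever has to be checked or applied by hand, a sketch along the following lines would list the backups that have aged out; `expired_backups` is a hypothetical helper, and the file-name pattern is assumed from the backup names shown above.

```shell
# Sketch: list .env backups whose modification time is past the retention period.
# expired_backups is a hypothetical helper, not part of restore-secrets.sh.
expired_backups() {
  # $1: backup directory, $2: retention in days (defaults follow this runbook).
  # -mtime +N selects files last modified more than N days ago.
  find "${1:-/opt/charliehub/.env-backups}" -maxdepth 1 -type f \
    -name '.env.backup-*' -mtime +"${2:-90}"
}
```

Reviewing the output of `expired_backups` before deleting anything keeps the step reversible; nothing in the sketch removes files.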
charliehub-audit-report
# Generate complete audit trail
charliehub-audit-report
# Include in compliance reports
charliehub-audit-report > /var/backups/rotation-audit-$(date +%Y%m%d).txt
# Follow-up checks
charliehub-audit-report | grep "ERROR\|FAILED"
charliehub-audit-report | grep "rotation\|rotate"
Version History
- 2026-02-09 - Initial creation, quarterly schedule set, procedures documented
- 2026-Q1 - First rotation scheduled for March 29, 2026