Security Maintenance: Quarterly Credential Rotation

Overview

This document describes the quarterly credential rotation process for CharlieHub infrastructure. Regular credential rotation is a critical security practice that:

  • Limits exposure window - If a credential is compromised, it's valid for at most 3 months
  • Reduces insider risk - Former team members' access automatically expires
  • Demonstrates compliance - Required for SOC2, ISO27001, and other security frameworks
  • Maintains operational readiness - Ensures rotation procedures work before they're needed in emergencies

Policy: All 13 credentials across CharlieHub infrastructure must be rotated quarterly on the last Sunday of March, June, September, and December at 02:00 UTC.


2026 Rotation Schedule

| Quarter | Date | Time (UTC) | Window (UTC) | Notes |
|---------|------|------------|--------------|-------|
| Q1 | March 29, 2026 | 02:00 | 02:00-02:30 | Spring rotation |
| Q2 | June 28, 2026 | 02:00 | 02:00-02:30 | Summer rotation |
| Q3 | September 27, 2026 | 02:00 | 02:00-02:30 | Fall rotation |
| Q4 | December 27, 2026 | 02:00 | 02:00-02:30 | Winter rotation |

Each rotation targets a 30-minute maintenance window with less than 5 minutes of actual downtime.
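The dates in the schedule follow the "last Sunday of March, June, September, and December" rule from the policy. As a minimal sketch of how those dates could be derived, assuming GNU date (`date -d`) is available; the helper name `last_sunday` is hypothetical and not part of the CharlieHub scripts:

```shell
# Hypothetical helper: print the last Sunday of a month as YYYY-MM-DD.
# Assumes GNU date (relative date arithmetic via -d).
last_sunday() {
  local year=$1 month=$2 first last dow
  first=$(printf '%04d-%02d-01' "$year" "$month")
  # Last day of the month = first day of the next month minus one day
  last=$(date -u -d "$first +1 month -1 day" +%Y-%m-%d)
  # %w is the weekday number (0 = Sunday), so step back that many days
  dow=$(date -u -d "$last" +%w)
  date -u -d "$last -$dow days" +%Y-%m-%d
}

# Print the four rotation dates for a given year
for month in 3 6 9 12; do
  last_sunday 2026 "$month"
done
```

For 2026 this prints the four dates listed in the schedule, which makes it usable as a cross-check when the schedule for a later year is drawn up.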


Credentials Requiring Rotation (13 Total)

Phase 1: Internal Credentials (LOW RISK, 10 min)

These are stored in .env files and are easy to rotate; restarting the affected containers applies the new values:

| Credential | File | Line | Type | Rotated Via |
|------------|------|------|------|-------------|
| DOMAIN_MANAGER_API_KEY | .env | 39 | API Key | rotate-secrets.sh |
| CHARLIEHUB_DB_PASSWORD | .env | 32-33 | Password | rotate-secrets.sh |
| AUTHELIA_JWT_SECRET | .env | 36 | JWT Secret | rotate-secrets.sh |
| GRAFANA_ADMIN_PASSWORD | .env | 48 | Password | rotate-secrets.sh |
| CODE_SERVER_PASSWORD | .env | 51 | Password | rotate-secrets.sh |
| UNIFI_PASSWORD | .env | 44 | Password | rotate-secrets.sh |
| SMTP_PASSWORD | .env | 20 | App Password | rotate-secrets.sh |
| POSTGRES_PASSWORD (domain-manager) | domain-manager/.env | 16 | Password | rotate-secrets.sh |
| API_KEY (domain-manager) | domain-manager/.env | 20 | API Key | rotate-secrets.sh |

  • Downtime: ~2 min (container restarts)
  • Risk: LOW - Fully automated, easy rollback
  • Rollback time: < 3 minutes

Phase 2: Configuration File Secrets (MEDIUM RISK, 15 min)

These are embedded in configuration files and require file edits:

| Credential | File | Purpose | Rotated Via |
|------------|------|---------|-------------|
| Authelia encryption_key | authelia/config/configuration.yml:93 | Session encryption | rotate-secrets.sh |
| Authelia hmac_secret | authelia/config/configuration.yml:109 | HMAC signing | rotate-secrets.sh |
| Parking DB password | docker-compose.yml:437 | Database auth | rotate-secrets.sh |
| CBRE DB password | docker-compose.yml:449 | Database auth | rotate-secrets.sh |

  • Downtime: ~2 min (container restarts)
  • Risk: MEDIUM - Requires file edits, but still automated
  • Rollback time: < 3 minutes (restore from backup)

Phase 3: External Dependency Credentials (HIGH RISK, 20 min)

These require action in external systems BEFORE restarting containers:

| Credential | System | Purpose | Manual Steps |
|------------|--------|---------|--------------|
| OVH_APP_KEY | OVH API | Domain management | Regenerate in OVH console |
| OVH_APP_SECRET | OVH API | Domain management | Regenerate in OVH console |
| OVH_CONSUMER_KEY | OVH API | Domain management | Complete new request |
| SMTP_PASSWORD | Gmail | Alert email delivery | Generate new app password |
| UNIFI_PASSWORD | UniFi Controller | Network monitoring | Change user password |

  • Downtime: 0 (external changes don't affect uptime)
  • Risk: HIGH - Requires external system access
  • Rollback time: ~10 minutes (requires external system action)


Pre-Rotation Checklist (1 Week Before)

  • [ ] Monday - Schedule maintenance window, notify team
  • [ ] Tuesday - Test rotation script in dry-run mode
    sudo /opt/charliehub/scripts/rotate-secrets.sh --internal --dry-run
    
  • [ ] Wednesday - Review SAFEGUARDS.md security policy
  • [ ] Thursday - Document any recent credential changes
  • [ ] Friday - Perform backup of entire system

Actions:

# Full dry-run test
sudo /opt/charliehub/scripts/rotate-secrets.sh --all --dry-run

# Review generated secrets (dry-run shows what would change)
# Verify container restart order makes sense
# Test rollback procedure with a fake backup


Day-Before Checklist (Saturday)

  • [ ] Verify test results - Dry-run completed successfully?
  • [ ] Check external access - Can you reach OVH, Gmail, UniFi consoles?
  • [ ] Prepare external credentials - Have temporary access ready
  • [ ] Notify stakeholders - Confirm maintenance window acceptable
  • [ ] Take full system backup

Actions:

# Create pre-rotation system backup
sudo tar -czf /var/backups/charliehub-pre-rotation-$(date +%Y%m%d).tar.gz \
  /opt/charliehub/.env* \
  /opt/charliehub/docker-compose.yml \
  /opt/charliehub/authelia/config/

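# Verify backup integrity (name the archive created above; a wildcard that
# matches several archives would make tar treat the extras as member names)
sudo tar -tzf /var/backups/charliehub-pre-rotation-$(date +%Y%m%d).tar.gz | head -10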
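Listing the first ten entries confirms the archive is readable, but not that every required file is in it. A minimal sketch of a stricter check follows; the function name `verify_backup` is hypothetical, and the member paths assume tar stored them without the leading slash (its default when given absolute paths):

```shell
# Hypothetical helper: fail if any expected member is missing from a .tar.gz
# Usage: verify_backup ARCHIVE MEMBER...
verify_backup() {
  local archive=$1 listing member missing=0
  shift
  # List the archive once; a corrupt or unreadable archive returns 2
  listing=$(tar -tzf "$archive") || return 2
  for member in "$@"; do
    # -x: whole-line match, -F: literal string (no regex surprises from ".")
    if ! printf '%s\n' "$listing" | grep -qxF -- "$member"; then
      echo "missing from backup: $member" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (not executed here):
# verify_backup /var/backups/charliehub-pre-rotation-20260328.tar.gz \
#   opt/charliehub/.env opt/charliehub/docker-compose.yml
```

A non-zero exit status can then block the rotation from starting until a complete backup exists.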


Rotation Day Execution (Sunday, 02:00 UTC)

T-0:30 - Pre-Rotation Window (01:30 UTC)

Status checks:

# Verify all services healthy before starting
curl -s http://172.19.0.5:8001/health | jq .
docker ps | grep charliehub | wc -l  # Should show ~8 containers

# Check DNS resolution
dig charliehub.net +short  # Should return IP
dig '*.charliehub.net' +short  # Quoted so the shell does not expand *; resolves only if a wildcard record exists

Notifications:

  • Send to #ops Slack channel: "Starting credential rotation at 02:00 UTC"
  • Monitor status page for any active incidents
  • Ensure no deployments in progress

T+0:00 - Rotation Execution (02:00 UTC)

Phase 1: Internal Secrets (02:00-02:10)

# Run rotation for internal secrets only
sudo /opt/charliehub/scripts/rotate-secrets.sh --internal

# This will:
# 1. Back up current .env files
# 2. Generate new secrets for all internal credentials
# 3. Update .env files (temporarily writable)
# 4. Restart containers in safe order:
#    - PostgreSQL (5s)
#    - Redis (2s)
#    - Domain Manager & UniFi API (10s parallel)
#    - Authelia (8s)
#    - Grafana & Code Server (5s parallel)
# 5. Validate service health

Status checks (every 30 seconds):

# Monitor logs
tail -f /var/log/charliehub/secret-rotation-*.log

# Monitor containers
watch -n 1 'docker ps | grep -E "charliehub|authelia|grafana"'

# Check API health
watch -n 1 'curl -s http://172.19.0.5:8001/health | jq .'

Expected timeline:

  • T+0:10 (02:10 UTC): All internal secrets rotated, containers restarting
  • T+0:15 (02:15 UTC): Services returning to health
  • T+0:20 (02:20 UTC): Full health check passing
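Rather than watching the terminal until services recover, the wait can be scripted. The following is a minimal sketch of a polling loop; `wait_until` is a hypothetical helper, not something rotate-secrets.sh is known to provide:

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Success on any attempt ends the wait immediately
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt follows
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example call (not executed here): poll the health endpoint every 30 s, up to 20 times
# wait_until 20 30 curl -fsS http://172.19.0.5:8001/health
```

With 20 attempts at 30-second intervals the loop gives up after roughly ten minutes, which matches the slack between the end of Phase 1 and the start of Phase 2.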

Phase 2: Configuration Secrets (02:20-02:25)

# Rotate configuration file secrets
sudo /opt/charliehub/scripts/rotate-secrets.sh --config

Changes made:

  • Authelia encryption keys rotated
  • Database passwords in docker-compose.yml updated
  • Containers restarted again with new config

Phase 3: External Credentials (02:25-02:30, MANUAL)

# Display checklist
sudo /opt/charliehub/scripts/rotate-secrets.sh --external

Operator manually:

  1. OVH API Rotation (5 min):
     • Open https://www.ovh.com/manager/
     • Go to Account → API Credentials
     • Create new "CharlieHub" application
     • Save new APP_KEY, APP_SECRET, CONSUMER_KEY
     • Update /opt/charliehub/domain-manager/.env lines 8-10
     • Keep old credentials valid for 24h grace period

  2. Gmail App Password (3 min):
     • Open https://myaccount.google.com
     • Go to Security → App Passwords
     • Generate new password for "CharlieHub Alerts"
     • Copy 16-char app password
     • Update /opt/charliehub/.env line 20
     • Test: curl -X POST smtp.gmail.com:587 ...

  3. UniFi Controller (3 min):
     • Open https://10.44.1.1:8443
     • Go to Settings → Admins
     • Edit api-service user
     • Set new password (20+ chars, save securely)
     • Update /opt/charliehub/.env line 44
     • Test API connectivity
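The manual steps above each end with updating a line in a .env file. Editing by line number is fragile if the file has shifted, so here is a minimal sketch that updates by key name instead. `set_env_value` is a hypothetical helper; it assumes simple KEY=VALUE lines and needs write access to the directory holding the file:

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing the existing
# line for KEY or appending one if the key is absent.
# Usage: set_env_value FILE KEY VALUE
set_env_value() {
  local file=$1 key=$2 value=$3 tmp
  # Temp file in the same directory so the final mv is an atomic rename
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  # Pass key and value through the environment: awk -v would interpret
  # backslash escapes in the value, ENVIRON does not
  ENV_KEY=$key ENV_VALUE=$value awk '
    BEGIN { FS = "="; k = ENVIRON["ENV_KEY"]; v = ENVIRON["ENV_VALUE"] }
    $1 == k { if (!seen) print k "=" v; seen = 1; next }
    { print }
    END { if (!seen) print k "=" v }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Match the 440 mode this runbook uses for .env files
  chmod 440 "$tmp"
  mv "$tmp" "$file"
}

# Example call (not executed here):
# sudo bash -c '. ./helpers.sh; set_env_value /opt/charliehub/.env SMTP_PASSWORD "new-app-password"'
```

The file name `helpers.sh` in the example call is a placeholder. Passing the secret on the command line leaves it in shell history, so reading it from a prompt may be preferable in practice.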

T+0:30 - Rotation Complete (02:30 UTC)

Post-rotation validation:

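# Full health check (-f makes curl fail on HTTP errors, so "OK" prints only on success)
curl -fsS http://172.19.0.5:8001/health && echo "API: OK"
# Grafana's /api/health needs no login; /api/user succeeds only with valid credentials
curl -fsS -u admin:NEW_GRAFANA_PASSWORD http://grafana:3000/api/user && echo "Grafana: OK"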

# Test each service
docker exec charliehub-postgres psql -U charliehub -c "SELECT version();" && echo "PostgreSQL: OK"
docker exec domain-manager curl -s http://localhost:8001/health | jq . && echo "Domain Manager: OK"

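# Check recent logs for errors (2>&1 because docker logs relays container stderr on stderr)
docker logs --since 30m charliehub-postgres 2>&1 | grep -i error | head -5
docker logs --since 30m authelia 2>&1 | grep -i error | head -5
docker logs --since 30m domain-manager 2>&1 | grep -i error | head -5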


Post-Rotation Verification (Immediate)

Immediate Checks (T+0:30-1:00)

Service availability (should all pass):

# Test each domain
for domain in charliehub.net auth.charliehub.net grafana.charliehub.net; do
    echo "Testing $domain..."
    curl -sI "https://$domain" | grep -E "^HTTP/[0-9.]+ (200|301|302)"
done

# Test authentication
curl -I https://charliehub.net/auth  # Should redirect to Authelia
curl -I https://auth.charliehub.net  # Should load login

# Test API (cut -f2- keeps the whole value even if the key contains "=")
curl -X POST https://api.charliehub.net/domains \
    -H "Authorization: Bearer $(grep '^DOMAIN_MANAGER_API_KEY=' /opt/charliehub/.env | cut -d= -f2-)" \
    -H "Content-Type: application/json" \
    -d '{}' | jq .
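Several checks in this runbook read a single value out of a .env file. A minimal sketch of a reusable reader follows; `get_env_value` is a hypothetical helper and assumes unquoted KEY=VALUE lines with key names made of letters, digits, and underscores:

```shell
# Hypothetical helper: print the value stored for KEY in an env file.
# Usage: get_env_value FILE KEY
get_env_value() {
  local file=$1 key=$2
  # Strip the "KEY=" prefix from matching lines and keep the rest verbatim,
  # so a value that itself contains "=" is returned whole
  sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Example call (not executed here):
# api_key=$(get_env_value /opt/charliehub/.env DOMAIN_MANAGER_API_KEY)
```

Storing the value in a variable also keeps the key out of the visible command line of later curl calls.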

Log review:

# Check for errors in past 30 minutes
journalctl -S "30 minutes ago" | grep -i "error\|fatal\|critical" | head -20

# Check docker logs
docker logs charliehub-postgres 2>&1 | tail -20
docker logs authelia 2>&1 | tail -20
docker logs domain-manager 2>&1 | tail -20

24-Hour Checks (After Monday 02:00 UTC)

  • [ ] All domains still resolving correctly
  • [ ] HTTPS certificates not expiring (check traefik logs)
  • [ ] Database backups running normally
  • [ ] Monitoring alerts not firing (except expected)
  • [ ] User SSO logins working (test with a real user)
  • [ ] API requests completing normally (check domain-manager logs)
  • [ ] Email alerts being sent (check SMTP logs)

Commands:

# Check certificate expiry (-v prints the TLS certificate details, including "expire date")
curl -vI https://charliehub.net 2>&1 | grep -i "expire date"

# Check domain manager API activity
docker logs domain-manager 2>&1 | grep -i "request\|error" | tail -50

# Check Authelia auth logs
docker logs authelia 2>&1 | grep "Login\|Auth\|success" | tail -20

# Check SMTP connectivity
telnet smtp.gmail.com 587

1-Week Checks

  • [ ] No increase in support tickets about access issues
  • [ ] All monitoring dashboards showing normal patterns
  • [ ] System performance metrics unchanged
  • [ ] Zero authentication-related incidents
  • [ ] External integrations (OVH, monitoring) working

Emergency Rollback Procedures

If rotation fails or services become unhealthy:

Automatic Rollback (Built-in)

The rotation script includes automatic rollback on failure:

# If validation fails, script automatically:
# 1. Stops container restarts
# 2. Restores .env files from backup
# 3. Restarts containers with old credentials
# 4. Validates services return to health
# 5. Logs all actions to audit trail

Manual Rollback (If Automatic Fails)

# List available backups
ls -lh /opt/charliehub/.env-backups/

# Restore from specific backup
sudo /opt/charliehub/scripts/restore-secrets.sh .env.backup-20260329_020000

# Verify restoration
cat /opt/charliehub/.env | grep "CHARLIEHUB_DB_PASSWORD"

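# Recreate containers so they load the restored .env
# (docker-compose restart does not re-read environment variables)
cd /opt/charliehub && docker-compose up -d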

# Validate health
curl http://172.19.0.5:8001/health | jq .
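During an incident it helps not to type a timestamp by hand. A minimal sketch for picking the newest backup follows; `latest_backup` is a hypothetical helper and relies on the `YYYYMMDD_HHMMSS` suffix shown above, which sorts chronologically as plain text:

```shell
# Hypothetical helper: print the path of the newest timestamped backup.
# Usage: latest_backup DIR PREFIX
latest_backup() {
  local dir=$1 prefix=$2
  # Regular files only, top level only; the last name in sorted order is the newest
  find "$dir" -maxdepth 1 -type f -name "${prefix}*" | sort | tail -n 1
}

# Example call (not executed here): restore-secrets.sh is shown taking a file name,
# so strip the directory part first
# backup=$(basename "$(latest_backup /opt/charliehub/.env-backups .env.backup-)")
# sudo /opt/charliehub/scripts/restore-secrets.sh "$backup"
```

The function prints nothing when no backup matches, so the caller should check for an empty result before restoring.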

Complete System Rollback

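# If full system rollback needed from backup created pre-rotation.
# The archive was created from absolute paths, so tar stored them relative to /
# (opt/charliehub/...). Extract at / rather than inside /opt/charliehub/.
# Use the date the backup was taken (the Saturday before rotation).
sudo tar -xzf /var/backups/charliehub-pre-rotation-20260328.tar.gz -C /

# Verify files restored
ls -la /opt/charliehub/.env

# Ensure permissions correct (regular files only; .env* also matches the .env-backups/ directory)
sudo find /opt/charliehub -maxdepth 1 -type f -name '.env*' -exec chmod 440 {} +

# Recreate all services so they pick up the restored configuration
cd /opt/charliehub && docker-compose up -d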

Troubleshooting Common Issues

Issue: Service fails to start with wrong password error

Error message:

ERROR: relation "domain_manager" does not exist
FATAL: password authentication failed for user "charliehub"

Solution:

# Rollback immediately
sudo /opt/charliehub/scripts/restore-secrets.sh .env.backup-LATEST

# Verify old password in effect
grep CHARLIEHUB_DB_PASSWORD /opt/charliehub/.env

# Recreate containers so they load the restored .env
# (docker-compose restart does not re-read environment variables)
cd /opt/charliehub && docker-compose up -d

# Test connectivity
docker exec charliehub-postgres psql -U charliehub -c "SELECT 1;"

Issue: Authelia failing to start after rotation

Error message:

level=error msg="Encryption key is invalid"

Solution:

# The encryption key format must be hex, 32 bytes (64 hex chars)
# Verify in configuration.yml
grep "encryption_key:" /opt/charliehub/authelia/config/configuration.yml

# If wrong, check backup
cat /opt/charliehub/.env-backups/configuration.yml.backup-LATEST | grep encryption_key

# Restore if needed
sudo chmod 644 /opt/charliehub/authelia/config/configuration.yml
sudo cp /opt/charliehub/.env-backups/configuration.yml.backup-LATEST \
    /opt/charliehub/authelia/config/configuration.yml
sudo chmod 440 /opt/charliehub/authelia/config/configuration.yml

# Restart (run from the compose project directory)
cd /opt/charliehub && docker-compose restart authelia
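The format rule stated above (32 bytes written as 64 hex characters) can be checked before Authelia is restarted. The following is a minimal sketch; both function names are hypothetical and not taken from rotate-secrets.sh:

```shell
# Hypothetical helper: generate 32 random bytes as 64 lowercase hex characters
gen_hex_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Hypothetical helper: succeed only if the argument is a 64-character hex string
is_hex_key() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{64}$'
}

# Example: generate a candidate key and validate it before writing it anywhere
candidate=$(gen_hex_key)
if is_hex_key "$candidate"; then
  echo "candidate key has the expected format"
fi
```

Where OpenSSL is installed, `openssl rand -hex 32` produces a key of the same shape.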

Issue: External credential failed to update (OVH, Gmail, UniFi)

Solution:

# CharlieHub will continue working with old credentials for 24h grace period
# This gives time to fix external systems

# For OVH: If new credentials invalid, revert to previous app credentials
# For Gmail: If app password wrong, generate new one again
# For UniFi: If password wrong, reset on controller, try again

# Monitor logs for connection failures
docker logs domain-manager 2>&1 | grep -i "ovh\|error"
journalctl -S "1 hour ago" | grep -i "smtp\|auth"

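# Once external credentials are fixed, recreate the containers so they pick up
# the updated .env values (a plain restart does not re-read environment variables)
cd /opt/charliehub
docker-compose up -d domain-manager  # For OVH changes
docker-compose up -d authelia  # For any auth changes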

Issue: Services remain unhealthy after rotation

Diagnosis:

# Check container logs
docker logs charliehub-postgres | tail -50
docker logs authelia | tail -50
docker logs domain-manager | tail -50

# Check which containers are running
docker ps

# Check if containers are restarting
docker ps -a | grep -E "Exit|Restarting"

# Check docker resource constraints
docker stats --no-stream

Recovery:

# If out of memory or resources, rebuild images
cd /opt/charliehub && docker-compose down
docker system prune -a  # Warning: removes all unused images
docker-compose up -d

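# If configuration is corrupted, restore entire config from the newest pre-rotation backup
# (archive paths are relative to /, so extract there; pass tar exactly one archive)
latest=$(ls -t /var/backups/charliehub-pre-rotation-*.tar.gz | head -n 1)
sudo tar -xzf "$latest" -C /

# Recreate everything so the restored settings take effect
cd /opt/charliehub && docker-compose up -d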


Audit and Compliance

Audit Trail

All rotation activities are logged to multiple sources:

Rotation logs:

# Main rotation log
tail -f /var/log/charliehub/secret-rotation-20260329_020000.log

# Git hooks log (if used)
tail -f /var/log/charliehub/git-hooks.log

# System sudo log (sudo is not a systemd unit, so match on the command name instead of -u)
sudo journalctl _COMM=sudo | grep rotate-secrets

Generating rotation report:

# Generate complete audit report
charliehub-audit-report

# This shows:
# - All sudo commands run during rotation
# - All file changes made
# - All container restarts
# - All service health checks
# - Timestamps and operators

# Save report to file
charliehub-audit-report > /var/backups/rotation-audit-20260329.txt

Compliance Requirements

For SOC2 / ISO27001:

Evidence to retain:

  • [ ] Pre-rotation dry-run log showing what would change
  • [ ] Actual rotation log showing what did change
  • [ ] Health check results before and after
  • [ ] Operator sign-off (Slack message, email, or git log)
  • [ ] Any rollbacks or issues encountered
  • [ ] Time taken (should be < 30 minutes)

Annual review:

  • [ ] 4 successful rotations completed (Q1, Q2, Q3, Q4)
  • [ ] Average rotation time tracked
  • [ ] Any issues or incidents documented
  • [ ] Rollback procedures tested quarterly
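The "time taken" evidence and the average tracked at annual review both need a duration per rotation. A minimal sketch for computing it from recorded start and end timestamps follows; the function names are hypothetical, the timestamps are illustrative rather than real results, and GNU date is assumed:

```shell
# Hypothetical helper: whole minutes elapsed between two UTC timestamps.
# Usage: rotation_minutes "YYYY-MM-DD HH:MM:SS" "YYYY-MM-DD HH:MM:SS"
rotation_minutes() {
  local start end
  # Convert both timestamps to epoch seconds (assumes GNU date -d)
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) / 60 ))
}

# Hypothetical helper: succeed if the rotation stayed under the 30-minute target
within_window() {
  [ "$(rotation_minutes "$1" "$2")" -lt 30 ]
}

# Illustrative timestamps only
if within_window "2026-03-29 02:00:00" "2026-03-29 02:27:30"; then
  echo "rotation finished inside the 30-minute window"
fi
```

Appending each quarter's value to a small log file would be enough to compute the yearly average at review time.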


Documentation and Runbooks

For Operators

  • This page - Complete rotation procedure
  • /opt/charliehub/OPERATOR_HOWTO.md - How to make infrastructure changes
  • /opt/charliehub/scripts/rotate-secrets.sh - Automated rotation script
  • /opt/charliehub/scripts/restore-secrets.sh - Manual recovery script
  • /opt/charliehub/.env-backups/ - Timestamped backup directory

For Developers/Team

  • /opt/charliehub/SAFEGUARDS.md - Security policy (includes rotation requirement)
  • /opt/charliehub/AGENT_TROUBLESHOOTING.md - Error handling
  • Slack #ops channel - Real-time updates during rotation
  • SSL certificate rotation - Automated via Let's Encrypt / Traefik
  • Database backup - Daily, automated via cron job
  • System updates - Scheduled monthly, separate maintenance window
  • Pre-commit hooks - Prevent secrets entering git (automatic on every commit)

Contact and Escalation

Primary contact: ops@charliehub.net
Slack channel: #ops (for real-time rotation updates)
On-call: Check PagerDuty rotation schedule

Escalation path:

  • Issues during rotation → Immediately notify team lead
  • External system failures → Contact system owner (OVH, Gmail, etc.)
  • Data corruption → Activate incident response (see SAFEGUARDS.md)


Appendix A: Script Reference

rotate-secrets.sh

# View usage
/opt/charliehub/scripts/rotate-secrets.sh --help

# Phases available
--internal    # Rotate .env files only (low risk, 10 min)
--config      # Rotate config files (medium risk, 15 min)
--external    # Rotate external dependencies (high risk, manual)
--all         # Rotate everything (default)

# Options
--dry-run     # Preview changes without applying

# Examples
sudo /opt/charliehub/scripts/rotate-secrets.sh --internal
sudo /opt/charliehub/scripts/rotate-secrets.sh --all --dry-run
sudo /opt/charliehub/scripts/rotate-secrets.sh --config

restore-secrets.sh

# View available backups
ls -lh /opt/charliehub/.env-backups/

# Restore from specific backup
sudo /opt/charliehub/scripts/restore-secrets.sh .env.backup-20260329_020000

# Backups retained for 90 days

charliehub-audit-report

# Generate complete audit trail
charliehub-audit-report

# Include in compliance reports
charliehub-audit-report > /var/backups/rotation-audit-$(date +%Y%m%d).txt

# Follow-up checks
charliehub-audit-report | grep "ERROR\|FAILED"
charliehub-audit-report | grep "rotation\|rotate"

Version History

  • 2026-02-09 - Initial creation, quarterly schedule set, procedures documented
  • 2026-Q1 - First rotation scheduled for March 29, 2026