Commit graph

8 commits

Author SHA1 Message Date
207e1dcff0 chore: project cleanup and docs update
All checks were successful
CI/CD / syntax-check (push) Successful in 1m29s
CI/CD / deploy (push) Successful in 16m39s
- Remove Syncthing mention from authelia comment in docker-compose
- Fix backup.sh.j2 comment: hourly → every 6 hours
- Update CLAUDE.md: add docs update rule, fix backup schedule note
- Update STATUS.md: dash.csrx.ru fixed, PTR pending, backup schedule, mail hostnames
- Update BACKLOG.md: mark DNS/PTR/backup-schedule done, add SnappyMail domain task
- Update DECISIONS.md: fix backup section (no --storage-class COLD, correct schedule)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 17:00:35 +07:00
634d50c25d chore(backup): change schedule from hourly to every 6 hours
All checks were successful
CI/CD / syntax-check (push) Successful in 1m10s
CI/CD / deploy (push) Successful in 15m50s
Runs at 00:00, 06:00, 12:00, 18:00. Removes old hourly cron entry.
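
For illustration, a minimal sketch of a schedule with these run times. The crontab line in the comment is an assumption about how the entry might look (the script path is taken from the earlier backup commit); the helper function backup_hours is hypothetical and only enumerates the hours that an hour field of "*/6" matches.

```shell
# Hypothetical crontab entry (assumed form, not copied from the repository):
#   0 */6 * * * /usr/local/bin/backup-services

# Print the run times implied by minute=0 and an hour step of 6.
backup_hours() {
  h=0
  while [ "$h" -lt 24 ]; do
    printf '%02d:00\n' "$h"
    h=$((h + 6))
  done
}

backup_hours
```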

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 15:30:55 +07:00
ebac7d807e fix(backup): remove unsupported --storage-class COLD for Timeweb S3
All checks were successful
CI/CD / syntax-check (push) Successful in 1m32s
CI/CD / deploy (push) Successful in 17m3s
Timeweb S3 doesn't support per-object storage class via API parameter.
Cold storage is configured at bucket level in Timeweb control panel.
Also: make S3 upload failures explicit (exit 1) instead of silently ignoring them.
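
One way to make an upload failure explicit could look like the sketch below. The wrapper name upload_or_fail, the variable names, and the bucket path in the usage comment are assumptions for illustration, not the contents of backup.sh.j2.

```shell
# upload_or_fail CMD [ARGS...]
# Run the given command; if it fails, report the error and stop the script
# with exit status 1 instead of continuing silently.
upload_or_fail() {
  if ! "$@"; then
    echo "ERROR: S3 upload failed: $*" >&2
    exit 1
  fi
}

# Possible use inside a backup script (placeholder variables):
# upload_or_fail aws s3 cp "$archive" "s3://$S3_BUCKET/data/" --endpoint-url "$S3_ENDPOINT"
```

Because the function calls exit, a failed upload ends the whole script, so cron or CI sees a non-zero status.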

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 20:22:50 +07:00
624b85cd15 feat(backup): hourly schedule, cold S3 storage at data/ prefix
Some checks failed
CI/CD / syntax-check (push) Successful in 1m13s
CI/CD / deploy (push) Has been cancelled
- Change cron from daily 03:00 to every hour (minute=0)
- Change S3 path from main/ to data/ as requested
- Change storage class from STANDARD to COLD (Timeweb cold storage)
- Update S3 pruning to match new data/ prefix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 19:44:34 +07:00
bf59b75c8f fix: redesign backup archive structure + enable Outline email auth
Some checks failed
CI/CD / syntax-check (push) Successful in 1m13s
CI/CD / deploy (push) Has been cancelled
Backup (backup.sh.j2):
- Creates a single data_YYYY-MM-DD_HH-MM.tar.gz archive
- Unified data/ layout: databases/ (pg_dump .sql.gz) + volumes/ (docker volumes)
- Includes RESTORE.md with step-by-step instructions inside the archive
- S3 uploads to main/ prefix instead of flat root
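
The archive naming and layout described above can be sketched roughly as follows. The function names and the staging directory convention (a parent directory containing data/ with databases/, volumes/, and RESTORE.md) are assumptions; the actual template may build the archive differently.

```shell
# Build a name of the form data_YYYY-MM-DD_HH-MM.tar.gz from the current time.
archive_name() {
  printf 'data_%s.tar.gz\n' "$(date +%Y-%m-%d_%H-%M)"
}

# build_archive STAGING_PARENT OUT_DIR
# Pack STAGING_PARENT/data (assumed to hold databases/, volumes/, RESTORE.md)
# into a single gzip-compressed tarball and print its path.
build_archive() {
  staging_parent=$1
  out_dir=$2
  name=$(archive_name)
  tar -czf "$out_dir/$name" -C "$staging_parent" data
  printf '%s\n' "$out_dir/$name"
}
```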

Outline (tools role):
- Add SMTP_HOST/PORT/FROM vars to env.j2 template (required for email magic-link auth to activate)
- Add outline_smtp_* defaults to roles/tools/defaults/main.yml
- Without SMTP_HOST, the email auth plugin is disabled and clicking login does nothing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 16:20:11 +07:00
92d2c845d8 feat: add n8n, outline routes, remove syncthing, fix backup awscli
Some checks failed
CI/CD / syntax-check (push) Successful in 1m14s
CI/CD / deploy (push) Failing after 10m51s
- Add n8n to tools server (n8n.csrx.ru)
- Add cross-server Traefik routes: wiki.csrx.ru + n8n.csrx.ru → tools
- Remove Syncthing (replaced by Outline wiki)
- Fix awscli install: download static binary (apt/pip broken on Ubuntu 24.04)
- Add n8n secrets to vault (encryption key + JWT secret)
- Improve CI/CD workflow: syntax-check both playbooks, deploy both servers
- Update site.yml: unified single-command deploy for all servers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 06:19:39 +07:00
fc6b1c0cec feat: Timeweb S3 offsite backup uploads
Some checks failed
CI/CD / syntax-check (push) Successful in 39s
CI/CD / deploy (push) Has been cancelled
- Add vault_s3_access_key / vault_s3_secret_key to Ansible Vault
- Expose via s3_access_key / s3_secret_key in all/main.yml
- Add s3_endpoint + s3_bucket to backup role defaults
- Install awscli via apt in backup role tasks
- Extend backup.sh.j2: upload *.gz to S3 after local backup,
  prune S3 objects older than backup_retention_days
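
A rough sketch of how age-based S3 pruning might be done, assuming the listing comes from "aws s3 ls" (whose lines start with a date, a time, and a size, followed by the key). The function name expired_keys and the variables in the usage comment are hypothetical, and keys containing spaces are not handled.

```shell
# expired_keys CUTOFF_DATE
# Read "aws s3 ls" style lines on stdin and print the keys whose listing
# date (YYYY-MM-DD, first column) sorts before CUTOFF_DATE.
# "PRE <prefix>/" lines are skipped because "PRE" sorts after any digit.
expired_keys() {
  awk -v cutoff="$1" '$1 < cutoff { print $4 }'
}

# Possible use (GNU date; placeholder variables):
# cutoff=$(date -d "-${RETENTION_DAYS} days" +%F)
# aws s3 ls "s3://$S3_BUCKET/" --endpoint-url "$S3_ENDPOINT" | expired_keys "$cutoff" |
#   while read -r key; do
#     aws s3 rm "s3://$S3_BUCKET/$key" --endpoint-url "$S3_ENDPOINT"
#   done
```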

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:58:58 +07:00
6ebd237894 feat: major infrastructure improvements
Some checks failed
CI/CD / deploy (push) Has been cancelled
CI/CD / syntax-check (push) Successful in 1m7s
Reliability:
- Add swap role (2GB, swappiness=10, idempotent via /etc/fstab)
- Add mem_limit to plane-worker (512m) and plane-beat (256m)
- Add health checks to all services (traefik, vaultwarden, forgejo,
  plane-*, syncthing, prometheus, grafana, loki)
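
The swap role is described as idempotent via /etc/fstab. As a shell analogue of that idea (the role itself is Ansible, and its tasks are not shown here), the sketch below appends a line to a file only when that exact line is missing. The helper name ensure_line and the fstab entry in the usage comment are assumptions.

```shell
# ensure_line FILE LINE
# Append LINE to FILE unless an identical line is already present,
# so repeated runs leave the file unchanged.
ensure_line() {
  file=$1
  line=$2
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Possible use (assumed swap file path):
# ensure_line /etc/fstab '/swapfile none swap sw 0 0'
```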

Code quality:
- Remove Traefik Docker labels (file provider used, labels were dead code)
- Add comment explaining file provider architecture

Observability:
- Add AlertManager with Telegram notifications
- Add Prometheus alert rules: CPU, RAM, disk, swap, container health
- Add Loki + Promtail for centralized log aggregation
- Add Loki datasource to Grafana
- Enable Traefik /ping endpoint for health checks

Backups:
- Add backup role: pg_dump for forgejo + plane DBs, tar for
  vaultwarden and forgejo data
- 7-day retention, daily cron at 03:00
- Backup script at /usr/local/bin/backup-services
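
Local retention of this kind is often implemented with find. The sketch below is one possible approach, not the script's actual code; the function name prune_local and the backup directory in the usage comment are assumptions. Note that find's "-mtime +N" matches files whose age, counted in whole 24-hour periods, is greater than N, and that -delete is a GNU/BSD extension.

```shell
# prune_local DIR DAYS
# Delete *.gz files in DIR whose modification time is more than DAYS
# whole days in the past.
prune_local() {
  find "$1" -type f -name '*.gz' -mtime +"$2" -delete
}

# Possible use (assumed directory):
# prune_local /var/backups/services 7
```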

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:28:16 +07:00