Commit graph

25 commits

Author SHA1 Message Date
1f03022086 fix: correct invalid PromQL in ContainerHighMemory alert rule
Some checks failed
CI/CD / syntax-check (push) Successful in 53s
CI/CD / deploy (push) Failing after 57s
Cannot use comparison operators inside label matchers {}.
Move the > 0 filter outside braces as a scalar filter on the
denominator — idiomatic Prometheus way to exclude unlimited containers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:59:56 +07:00
fc6b1c0cec feat: Timeweb S3 offsite backup uploads
Some checks failed
CI/CD / syntax-check (push) Successful in 39s
CI/CD / deploy (push) Has been cancelled
- Add vault_s3_access_key / vault_s3_secret_key to Ansible Vault
- Expose via s3_access_key / s3_secret_key in all/main.yml
- Add s3_endpoint + s3_bucket to backup role defaults
- Install awscli via apt in backup role tasks
- Extend backup.sh.j2: upload *.gz to S3 after local backup,
  prune S3 objects older than backup_retention_days

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:58:58 +07:00
a344998405 feat: add uptime-kuma pull, logrotate deploy task, logrotate package
Some checks failed
CI/CD / syntax-check (push) Successful in 41s
CI/CD / deploy (push) Failing after 39s
- Add uptime_kuma_image to image pull loop in services/tasks/main.yml
- Add logrotate deploy task to services/tasks/configs.yml
- Add logrotate package to base_packages

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:54:24 +07:00
aa9706bbc4 feat: comprehensive security hardening
Some checks failed
CI/CD / syntax-check (push) Successful in 43s
CI/CD / deploy (push) Failing after 59s
Traefik:
- Enable access logs → /var/log/traefik/access.log (needed for CrowdSec)
- Add global security headers middleware: HSTS, X-Frame-Options, CSP,
  nosniff, XSS filter, referrer policy, permissions policy
- Add rate limiting: default 100/s, API 30/s, admin 10/s (strict)
- Add Authelia ForwardAuth middleware for SSO integration

CrowdSec (new service):
- Analyzes Traefik access logs + auth.log in real time
- Community IP reputation blocklist (crowdsecurity/traefik + http-cve)
- Firewall bouncer: bans malicious IPs at kernel level (iptables)

Authelia (new service, auth.csrx.ru):
- 2FA/SSO portal with TOTP (Google Authenticator)
- Protects: traefik.csrx.ru, sync.csrx.ru, /god-mode/ in Plane
- Session: 12h expiry, 30m inactivity, Redis backend
- argon2id password hashing

Container security:
- Add security_opt: no-new-privileges to traefik, vaultwarden,
  forgejo, grafana, authelia

CI/CD security:
- Remove hardcoded server IP 87.249.49.32 from workflow
- Use SSH_KNOWN_HOSTS secret instead of ssh-keyscan (prevents MITM)
- Added SSH_KNOWN_HOSTS secret to Forgejo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:44:54 +07:00
a42ff4afc7 feat: configure Telegram alerting in AlertManager
All checks were successful
CI/CD / syntax-check (push) Successful in 1m28s
CI/CD / deploy (push) Successful in 6m52s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:31:35 +07:00
6ebd237894 feat: major infrastructure improvements
Some checks failed
CI/CD / deploy (push) Has been cancelled
CI/CD / syntax-check (push) Successful in 1m7s
Reliability:
- Add swap role (2GB, swappiness=10, idempotent via /etc/fstab)
- Add mem_limit to plane-worker (512m) and plane-beat (256m)
- Add health checks to all services (traefik, vaultwarden, forgejo,
  plane-*, syncthing, prometheus, grafana, loki)

Code quality:
- Remove Traefik Docker labels (file provider used, labels were dead code)
- Add comment explaining file provider architecture

Observability:
- Add AlertManager with Telegram notifications
- Add Prometheus alert rules: CPU, RAM, disk, swap, container health
- Add Loki + Promtail for centralized log aggregation
- Add Loki datasource to Grafana
- Enable Traefik /ping endpoint for health checks

Backups:
- Add backup role: pg_dump for forgejo + plane DBs, tar for
  vaultwarden and forgejo data
- 7-day retention, daily cron at 03:00
- Backup script at /usr/local/bin/backup-services

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:28:16 +07:00
972a76db4c feat: add monitoring stack (Prometheus + Grafana + cAdvisor + Node Exporter)
All checks were successful
CI/CD / syntax-check (push) Successful in 3m0s
CI/CD / deploy (push) Successful in 6m51s
- Adds monitoring Docker network (internal)
- Prometheus scrapes node-exporter (host metrics) and cAdvisor (containers)
  with 30-day retention
- Grafana exposed at dashboard.csrx.ru with pre-provisioned datasource
  and two dashboards: Node Exporter Full (1860) and cAdvisor (14282)
- Vault secret: vault_grafana_admin_password

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:05:34 +07:00
9e4ac718d9 docs: add infrastructure plan and Claude agent guide
All checks were successful
CI/CD / syntax-check (push) Successful in 2m38s
CI/CD / deploy (push) Successful in 9m44s
- infrastructure-plan.md: server resource analysis (1 vCPU / 2GB RAM
  critically overloaded), two-server architecture recommendation
- claude-agent.md: how to run Claude Code as an autonomous infra agent
  via Anthropic API + Telegram bot interface

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 02:19:38 +07:00
efbbc3cac5 fix: add plane-admin, plane-space and configure instance URLs
Some checks failed
CI/CD / syntax-check (push) Successful in 2m31s
CI/CD / deploy (push) Has been cancelled
New Plane stable requires 3 frontend services:
- plane-admin (nginx:80) for /god-mode/ routes
- plane-space (node:3000) for /spaces/ routes
- plane-web (nginx:80) for all other routes

Also add APP/ADMIN/SPACE_BASE_URL env vars to plane-api so the
setup wizard knows where to redirect.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 02:14:34 +07:00
66c03ffc04 fix: update plane backend for new stable image requirements
All checks were successful
CI/CD / syntax-check (push) Successful in 2m49s
CI/CD / deploy (push) Successful in 8m54s
makeplane/plane-backend:stable now requires:
- AMQP_URL: Celery broker URL (defaults to amqp://localhost, broken)
  → set to redis://plane-redis:6379/ to reuse existing Redis
- GUNICORN_WORKERS: must be set explicitly (empty string causes crash)
  → set to 2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 01:32:11 +07:00
679d3ed010 fix: update plane-web for nginx-based stable image
All checks were successful
CI/CD / syntax-check (push) Successful in 2m41s
CI/CD / deploy (push) Successful in 11m24s
makeplane/plane-frontend:stable now uses nginx (not Next.js/node).
Remove `command: node web/server.js` override and update Traefik
port from 3000 to 80.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 00:44:00 +07:00
6afd298730 fix: commit encrypted vault file so CI can decrypt it
All checks were successful
CI/CD / syntax-check (push) Successful in 2m0s
CI/CD / deploy (push) Successful in 7m43s
vault.yml was in .gitignore so CI jobs had no vault variables.
The file is AES-256 encrypted — safe to commit to a private repo.
The password stays in ~/.vault-password-file (still gitignored).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 00:19:44 +07:00
2eba9b451e ci: trigger deploy (CI key now authorized on server)
Some checks failed
CI/CD / syntax-check (push) Successful in 2m3s
CI/CD / deploy (push) Failing after 7m2s
2026-03-21 23:42:42 +07:00
48f34e3e93 ci: fix ansible-galaxy --quiet flag (not supported)
Some checks failed
CI/CD / syntax-check (push) Successful in 2m9s
CI/CD / deploy (push) Failing after 3m0s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 23:31:33 +07:00
9bfb702322 ci: fix syntax-check vault password, update CI deploy key
Some checks failed
CI/CD / syntax-check (push) Successful in 2m24s
CI/CD / deploy (push) Failing after 2m4s
- Add vault password step to syntax-check job (ansible needs it even for --syntax-check)
- Regenerate CI deploy SSH key (old private key was lost, new pair generated)
- Add VAULT_PASSWORD and SSH_PRIVATE_KEY secrets to Forgejo via API

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 23:22:17 +07:00
43a870954a ci: test with re-registered runner (https://git.csrx.ru)
Some checks failed
CI/CD / syntax-check (push) Failing after 2m30s
CI/CD / deploy (push) Has been skipped
2026-03-21 23:16:20 +07:00
a04e709f30 ci: test after Forgejo URL fix
Some checks failed
CI/CD / syntax-check (push) Failing after 43s
CI/CD / deploy (push) Has been skipped
2026-03-21 23:00:41 +07:00
6a2c38b4bf Fix act_runner: use public Forgejo URL for job container access
Some checks failed
CI/CD / syntax-check (push) Failing after 48s
CI/CD / deploy (push) Has been skipped
Job containers run on runner-jobs network (internet only), so they
can't reach forgejo:3000 (backend-only). Use public https://git.csrx.ru
so both runner and job containers can reach Forgejo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 22:53:25 +07:00
6580e42f53 Fix CI workflow: remove container directive, use runner image directly
Some checks failed
CI/CD / syntax-check (push) Failing after 2m13s
CI/CD / deploy (push) Has been skipped
- Remove container: python:3.12-slim (lacked Node.js for actions/checkout)
- Use runner's ubuntu-latest image which has Node.js + Python pre-installed
- Fix deploy job if condition (remove ${{ }} wrapper)
- Enable debug logging in act_runner config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 22:34:56 +07:00
3107a2c5ad ci: test runner with debug logs
Some checks failed
CI/CD / syntax-check (push) Failing after 7s
CI/CD / deploy (push) Has been skipped
2026-03-21 22:33:37 +07:00
07cbba7759 ci: trigger workflow to test runner
Some checks failed
CI/CD / syntax-check (push) Failing after 22s
CI/CD / deploy (push) Has been skipped
2026-03-21 22:04:54 +07:00
1d2ba7f7ea Fix act_runner network name to use Docker Compose prefix
Some checks failed
CI/CD / syntax-check (push) Failing after 2s
CI/CD / deploy (push) Has been skipped
Docker Compose prefixes project name to network names:
runner-jobs → services_runner-jobs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 21:54:14 +07:00
d2d5f12d5a Add Forgejo Actions CI/CD with act_runner
Some checks failed
CI/CD / syntax-check (push) Failing after 12s
CI/CD / deploy (push) Has been skipped
- Add gitea/act_runner:0.3.0 to docker-compose stack on runner-jobs network
- Add act_runner config template and directory provisioning
- Add FORGEJO_RUNNER_TOKEN to env template
- Add CI deploy SSH public key to authorized_keys via base role
- Create .forgejo/workflows/deploy.yml: syntax-check on PR, deploy on push to master
- Add .claude/launch.json with ansible-playbook configurations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 21:28:15 +07:00
652737239d Add Forgejo SSH port 2222 and open in UFW
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 19:43:22 +07:00
a1b97f3e4b Initial commit
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 19:39:26 +07:00