109 commits

Author SHA1 Message Date
ccd7c44293 chore: delete dead templates, remove duplicate MinIO task, update CLAUDE.md
Some checks failed
CI/CD / deploy (push) Failing after 10m23s
CI/CD / syntax-check (push) Successful in 1m9s
- Delete grafana provisioning templates (grafana/loki removed)
- Delete env.outline.j2 (Outline replaced by Docmost)
- Remove duplicate MinIO bucket creation Ansible task (plane-createbuckets
  compose service handles this more reliably)
- Update CLAUDE.md: single server, correct domains, remove tools references
2026-03-27 19:24:14 +07:00
5f44441bd1 fix: remove grafana_admin_password from env.j2, delete dead prometheus templates
Some checks failed
CI/CD / syntax-check (push) Successful in 1m2s
CI/CD / deploy (push) Has been cancelled
These files referenced variables removed in the previous refactor commit,
causing deploy failure (undefined variable: grafana_admin_password).
2026-03-27 19:21:51 +07:00
339f0e8484 terraform: remove tools server and outline bucket, add docmost bucket
Some checks failed
CI/CD / syntax-check (push) Successful in 1m21s
CI/CD / deploy (push) Has been cancelled
- Remove twc_server.tools (tools server decommissioned)
- Remove twc_s3_bucket.outline (walava-outline deleted)
- Add twc_s3_bucket.docmost (walava-docmost, ID 481385)
- Update main server comment, remove tools_ip output
- Removed from state: twc_server.tools, twc_s3_bucket.outline
- Imported into state: twc_s3_bucket.docmost
2026-03-27 19:18:26 +07:00
44ccdf4882 refactor: remove tools server, Vaultwarden, monitoring stack; rename plane→hub
Some checks failed
CI/CD / syntax-check (push) Successful in 1m30s
CI/CD / deploy (push) Failing after 6m8s
- Remove tools server entirely (roles/tools, playbooks/tools.yml, CI deploy step)
- Remove Vaultwarden (already absent from compose, clean up vars)
- Remove node-exporter, cadvisor, promtail from main stack
- Remove grafana/uptime-kuma Traefik routes (pointed to tools)
- Remove monitoring network from docker-compose
- Remove tools vault vars (grafana_admin_password, alertmanager telegram)
- Rename domain_plane: plane.walava.io → hub.walava.io
- Update CI workflow to only deploy main server
- Update STATUS.md and BACKLOG.md to reflect current state
2026-03-27 19:05:19 +07:00
0a85e6fd2d fix: add plane-createbuckets to init MinIO uploads bucket on fresh deploy
Some checks failed
CI/CD / syntax-check (push) Successful in 1m20s
CI/CD / deploy (push) Has been cancelled
2026-03-27 18:55:13 +07:00
8feff04136 fix: use curl for Docmost health check, check Docker health status
All checks were successful
CI/CD / deploy (push) Successful in 13m46s
CI/CD / syntax-check (push) Successful in 1m0s
- wget not available in Docmost Node.js image → switch to curl
- Ansible now checks Docker health status instead of exec-ing into container
- Increased healthcheck start_period to 60s and retries to 10 for DB migrations
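The move from exec-ing into the container to reading Docker's own health status can be sketched as below. This is a minimal Python illustration, not the Ansible task from this commit: the function name `health_status` and the sample payload are assumptions. It parses the JSON array that `docker inspect <container>` prints and reads `State.Health.Status`, which Docker reports as `starting`, `healthy`, or `unhealthy`.

```python
import json


def health_status(inspect_output: str) -> str:
    """Return the health status found in `docker inspect` JSON output.

    Falls back to "none" when the container defines no healthcheck.
    """
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    state = containers[0].get("State", {})
    health = state.get("Health")
    return health["Status"] if health else "none"


# Hypothetical, trimmed-down payload for illustration only.
sample = '[{"Name": "/docmost", "State": {"Status": "running", "Health": {"Status": "starting"}}}]'
print(health_status(sample))
```

Reading the status from outside the container avoids depending on tools such as wget or curl being present in the image.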

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 10:09:14 +07:00
9f85988c1f fix: increase Docmost health check retries to 30 (5 min total)
Some checks failed
CI/CD / deploy (push) Failing after 13m46s
CI/CD / syntax-check (push) Successful in 59s
First deploy needs time for DB migrations and initial setup.
30×10s = 300s gives enough buffer for cold start.
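The retry budget above can be sketched as a simple polling loop. This is an illustrative Python sketch under stated assumptions, not the Ansible task itself: `wait_until_healthy`, `max_wait_seconds`, and the injectable `sleep` parameter are hypothetical names.

```python
import time
from typing import Callable


def max_wait_seconds(retries: int, delay: float) -> float:
    """Upper bound on time spent waiting between attempts."""
    return retries * delay


def wait_until_healthy(check: Callable[[], bool], retries: int = 30,
                       delay: float = 10.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `check` up to `retries` times, sleeping `delay` seconds after each failed attempt."""
    for _ in range(retries):
        if check():
            return True
        sleep(delay)
    return False


print(max_wait_seconds(30, 10))
```

With 30 retries and a 10-second delay the loop gives up after at most 300 seconds of waiting, which is the 5-minute buffer named in the commit title.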

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 09:48:25 +07:00
29ba8a64ba fix: remove outline-mcp commented block with undefined Jinja2 vars
Some checks failed
CI/CD / syntax-check (push) Failing after 54s
CI/CD / deploy (push) Has been skipped
Ansible evaluates Jinja2 expressions even in YAML comments, causing
'outline_mcp_image is undefined' error. Removed the entire block since
outline-mcp is no longer relevant (replaced Outline with Docmost).
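Why a YAML comment does not protect a template expression can be shown with a toy example. The sketch below is a deliberately simplified text-level renderer written for illustration; it is not the real Jinja2 engine, and `render` and the sample strings are assumptions. It shows the general behaviour: the renderer scans raw text, so a leading `#` means nothing to it, while Jinja2's own `{# ... #}` comment syntax removes the expression before expansion.

```python
import re

EXPR = re.compile(r"\{\{\s*(\w+)\s*\}\}")
JINJA_COMMENT = re.compile(r"\{#.*?#\}", re.DOTALL)


def render(text: str, variables: dict) -> str:
    """Toy stand-in for a strict template renderer (NOT the real Jinja2 engine)."""
    text = JINJA_COMMENT.sub("", text)  # template-level comments are dropped first

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"'{name}' is undefined")
        return str(variables[name])

    # Every remaining {{ ... }} is expanded; a leading '#' is just text here.
    return EXPR.sub(substitute, text)


yaml_commented = "# image: {{ outline_mcp_image }}\n"
jinja_commented = "{# image: {{ outline_mcp_image }} #}\n"

try:
    render(yaml_commented, {})
except KeyError as err:
    print("YAML comment still fails:", err)

print(repr(render(jinja_commented, {})))
```

Deleting the block, as this commit does, sidesteps the problem entirely; wrapping it in a template-level comment would be the alternative if the block needed to stay.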

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 09:39:56 +07:00
472c2b944b feat: replace Outline with Docmost
Some checks failed
CI/CD / syntax-check (push) Successful in 1m0s
CI/CD / deploy (push) Failing after 5m1s
- Replace outline/outline-db/outline-redis with docmost/docmost-db/docmost-redis
- Update Traefik route: wiki → http://docmost:3000
- Update S3 bucket: walava-outline → walava-docmost (new bucket created: 481385)
- Remove env.outline.j2 deploy task (Docmost config is inline in compose)
- Update backup script: outline.sql.gz → docmost.sql.gz
- Update CORS task for walava-docmost bucket
- Add vault_docmost_app_secret + vault_docmost_db_password secrets
- Remove outline_mcp_image (no longer needed)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 09:31:51 +07:00
f0c3fbbe1b fix: auto-bootstrap Outline team on fresh install
All checks were successful
CI/CD / deploy (push) Successful in 15m8s
CI/CD / syntax-check (push) Successful in 1m4s
On a fresh DB Outline shows a blank login page because there is no team
and emailSigninEnabled = false. Add idempotent Ansible tasks that:
1. Create the 'Visual' team if none exists
2. Set guestSignin=true so email magic-link login works
Triggered by: server rebuild lost Outline DB (no backup existed).
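The idempotent create-then-enable pattern described above can be sketched as follows. This is a minimal illustration using an in-memory SQLite database; the table name `teams`, the column `guest_signin`, and the function `ensure_team` are hypothetical and do not reflect Outline's actual schema or the Ansible tasks in this commit.

```python
import sqlite3


def ensure_team(conn: sqlite3.Connection, name: str) -> bool:
    """Create a team only if none exists, then enable guest sign-in.

    Returns True when something changed, False when already in the desired state.
    """
    changed = False
    (count,) = conn.execute("SELECT COUNT(*) FROM teams").fetchone()
    if count == 0:
        conn.execute("INSERT INTO teams (name, guest_signin) VALUES (?, 0)", (name,))
        changed = True
    cur = conn.execute("UPDATE teams SET guest_signin = 1 WHERE guest_signin = 0")
    changed = changed or cur.rowcount > 0
    conn.commit()
    return changed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT, guest_signin INTEGER)")
print(ensure_team(conn, "Visual"))  # first run changes state
print(ensure_team(conn, "Visual"))  # second run is a no-op
```

Because each step checks the current state before acting, running it on every deploy is safe: a fresh database gets bootstrapped and an existing one is left untouched.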

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 09:14:14 +07:00
aa8d5082d3 fix: new CI deploy key + plane-api longer startup timeout
Some checks failed
CI/CD / syntax-check (push) Successful in 1m2s
CI/CD / deploy (push) Failing after 8m13s
- Rotate ci_deploy_pubkey to new ed25519 key (old key lost after
  server rebuild; Forgejo secret SSH_PRIVATE_KEY updated to match)
- Increase plane-api start_period 60s→120s, retries 5→10 to give
  Django time to run DB migrations after backup restore

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 08:38:51 +07:00
3b875f57d2 fix: disable discord-bot and walava-web until images exist in registry
Some checks failed
CI/CD / syntax-check (push) Successful in 3m0s
CI/CD / deploy (push) Failing after 1m39s
These custom images (discord-bot, walava-web) are built by their own
repos' CI/CD and pushed to git.walava.io registry. On a fresh server
Forgejo hasn't run yet so images don't exist — bootstrap chicken/egg.
Re-enable after Forgejo is up and images are pushed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 06:26:14 +07:00
f4688ed8be fix: disable outline-mcp until image is built and pushed to registry
outline-mcp uses git.walava.io/jack/outline-mcp:latest which doesn't
exist in Forgejo registry yet (Forgejo itself wasn't running).
Comment out the service; re-enable after building the image.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 06:05:30 +07:00
7aa5574098 chore: remove mon from inventory, update server descriptions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 04:29:26 +07:00
f704ede1cd chore: rename servers to main and tools in Timeweb
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 04:27:23 +07:00
8a3aaa2fca feat: Terraform infra-as-code + delete mon server + fix S3/Outline
Terraform: imported main (7004701) + tools (7076013) into state,
destroyed mon (7076015, 188.225.79.34). State: No changes.

S3: fix endpoint s3.timeweb.cloud → s3.twcstorage.ru (actual Timeweb
endpoint), remove AWS_S3_ACL=private (Timeweb doesn't support per-object
ACLs — was causing Outline upload failures).

Vault: added vault_timeweb_token.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 04:26:33 +07:00
862eac5f11 feat: add Terraform config for Timeweb Cloud infrastructure
Manages main + tools servers and S3 buckets (walava-backup, walava-outline).
Includes mon server resource for import + destroy workflow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 04:15:27 +07:00
fde51352d7 feat: migrate monitoring to tools server, fix Outline S3 uploads
Monitoring stack (Prometheus, AlertManager, Grafana, Loki, Uptime Kuma)
moved from main to tools server. Prometheus now scrapes main exporters
over network (ip_main:9100/8080). Promtail pushes logs to ip_tools:3100.
Traefik routes for dash/status.walava.io updated to ip_tools. discord-bot
PROMETHEUS_URL updated to http://ip_tools:9090.

Outline S3 fix: remove AWS_S3_ACL=private (Timeweb doesn't support
per-object ACLs — caused upload failures). Add CORS configuration task
for browser-side presigned uploads.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 04:10:28 +07:00
d6015b76a3 fix: add proxy network to Outline and n8n for outbound internet access
Outline needs proxy network for SMTP (Resend) and S3 (Timeweb).
n8n needs proxy network for external API calls in workflows.
Both were only on backend (internal:true) so DNS/TCP to internet was blocked.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 03:54:19 +07:00
036b80501f ci: retrigger deploy after image retag
2026-03-27 03:33:33 +07:00
521c806ed9 docs: update STATUS.md — reflect walava.io migration and service layout
2026-03-27 03:11:54 +07:00
36be9fb33d chore: remove SMTP relay, clean up tools role after Outline/n8n migration to main
- Remove smtp-relay (postfix) container — Outline now on main, uses Resend directly
- Remove UFW port 1025 rule (SMTP relay no longer needed)
- Remove postfix-relay from image pull list
- Clean up tools role: remove Outline/n8n/env.j2, simplify tasks/main.yml
- tools docker-compose now empty (pending monitoring migration)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 03:10:56 +07:00
489791403c feat: migrate Outline + n8n to main server, rename S3 buckets to walava-*
- Add Outline, outline-db, outline-redis, n8n, outline-mcp containers to main docker-compose
- Add env.outline.j2 template with Resend SMTP and S3 (walava-outline bucket)
- Update Traefik routes: wiki → outline:3000, auto → n8n:5678 (local, not cross-server)
- Rename S3 buckets: visual-backup → walava-backup, visual-outline → walava-outline
- Extend backup.sh.j2: add Outline DB, n8n, Plane MinIO to backup scope
- Add outline_image, n8n_image, outline_mcp_image to services/defaults
- Remove Authelia config deployment tasks from configs.yml
- Add outline-internal and n8n-internal networks to docker-compose

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 03:04:54 +07:00
fba7eb68ea fix: add SMTP relay on main server for Outline email auth
Some checks failed
CI/CD / deploy (push) Blocked by required conditions
CI/CD / syntax-check (push) Has been cancelled
tools-server (85.193.83.9) has outbound SMTP ports 465/587 blocked by VPS
provider. Added tecnativa/postfix-relay container on main server that relays
to smtp.resend.com:587. Outline now uses ip_main:1025 as SMTP host.

- UFW rule: allow port 1025 from ip_tools only
- Remove stale authelia_image from docker pull list

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:35:30 +07:00
e754d54e81 chore: add outline-mcp to tools stack, clean up stale authelia vars
Some checks are pending
CI/CD / syntax-check (push) Waiting to run
CI/CD / deploy (push) Blocked by required conditions
- Add outline-mcp service to tools docker-compose (was running unmanaged)
- Update OUTLINE_URL from csrx.ru → walava.io via domain_wiki variable
- Bind port 8765 to 127.0.0.1 only (was 0.0.0.0 — security improvement)
- Add vault_outline_mcp_api_key to vault + alias in main.yml
- Remove stale authelia_* aliases from main.yml (authelia removed)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 22:54:14 +07:00
d635522199 feat: remove Authelia, protect dashboard with basic auth
Some checks are pending
CI/CD / syntax-check (push) Waiting to run
CI/CD / deploy (push) Blocked by required conditions
Authelia was unused overhead — only traefik-dashboard and plane /god-mode/
were behind it. Dashboard now uses traefik-auth (basic auth). /god-mode/
uses rate-limit-strict only.

Removes: authelia + authelia-redis containers, authelia-internal network,
authelia_data volume, authelia router/service/forwardAuth middleware.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 22:50:41 +07:00
2770cb61ef fix: CF_DNS_API_TOKEN env var name for Traefik ACME + n8n domain update
Some checks are pending
CI/CD / syntax-check (push) Waiting to run
CI/CD / deploy (push) Blocked by required conditions
- Fix env var CLOUDFLARE_DNS_API_TOKEN → CF_DNS_API_TOKEN (lego requirement)
- n8n env already uses domain_n8n variable (auto.walava.io)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 22:44:05 +07:00
fb769b2f8c feat: migrate domain from csrx.ru to walava.io
Some checks failed
CI/CD / syntax-check (push) Successful in 1m44s
CI/CD / deploy (push) Failing after 20m21s
- domain_base changed to walava.io
- domain_n8n now auto.walava.io
- Added domain_landing for walava.io root
- Added walava-web landing page container + Traefik route
- Updated Cloudflare token/zone_id for walava.io account
- Updated ACME email to walava@tutamail.com
- Fixed discord-bot image to use domain_base variable
- DNS records already created in Cloudflare

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 22:17:00 +07:00
cbab48fb03 chore: change admin email to walava@tutamail.com
All checks were successful
CI/CD / syntax-check (push) Successful in 1m5s
CI/CD / deploy (push) Successful in 15m48s
Updates ACME/Let's Encrypt contact email and unattended-upgrades config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 17:43:40 +07:00
8b140473b4 feat: add Resend SMTP for Outline email auth
Some checks failed
CI/CD / syntax-check (push) Successful in 1m8s
CI/CD / deploy (push) Has been cancelled
Configures smtp.resend.com as SMTP provider for Outline magic links.
Domain csrx.ru needs verification in Resend dashboard.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 17:38:35 +07:00
c1a71b7f50 fix: add remove_orphans to docker compose tasks
All checks were successful
CI/CD / syntax-check (push) Successful in 1m33s
CI/CD / deploy (push) Successful in 14m0s
Ensures removed services (vaultwarden, mailserver, snappymail)
are automatically stopped on next deploy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 07:00:17 +07:00
4090d8289b fix: add username/icon_url to Forgejo Discord webhook config
All checks were successful
CI/CD / syntax-check (push) Successful in 1m8s
CI/CD / deploy (push) Successful in 13m10s
Prevents the 'meta json: readObjectStart' error on fresh deploys.
Existing hooks already fixed via direct DB update.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 06:10:56 +07:00
9715ac900a test: final webhook check
Some checks failed
CI/CD / syntax-check (push) Successful in 1m37s
CI/CD / deploy (push) Has been cancelled
2026-03-26 06:00:13 +07:00
11d4660eba test: webhook after Forgejo restart
Some checks failed
CI/CD / deploy (push) Blocked by required conditions
CI/CD / syntax-check (push) Has been cancelled
2026-03-26 05:59:17 +07:00
2c3115e672 test: Discord webhook meta fixed
Some checks failed
CI/CD / syntax-check (push) Successful in 59s
CI/CD / deploy (push) Has been cancelled
2026-03-26 05:58:03 +07:00
16f7043fe9 test: Discord webhook with meta fix
Some checks failed
CI/CD / syntax-check (push) Successful in 1m3s
CI/CD / deploy (push) Has been cancelled
2026-03-26 05:56:50 +07:00
f3bbeb06e3 test: check Discord webhook delivery
Some checks failed
CI/CD / deploy (push) Blocked by required conditions
CI/CD / syntax-check (push) Has been cancelled
2026-03-26 05:56:06 +07:00
7cdbcd5301 test: trigger Discord deploy notification
Some checks failed
CI/CD / deploy (push) Has been cancelled
CI/CD / syntax-check (push) Successful in 58s
2026-03-26 05:54:44 +07:00
4b00804f3e fix: use forgejo_api_token for webhook creation, cover both repos
Some checks failed
CI/CD / syntax-check (push) Successful in 1m6s
CI/CD / deploy (push) Has been cancelled
- Add vault_forgejo_api_token (Personal Access Token with write:repository)
- Ansible task now creates Discord webhook on both jack/infra and jack/discord-bot
- Webhooks already created manually for this deploy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 05:53:25 +07:00
f3f665a5be fix: add DISCORD_APP_ID env var to discord-bot container
Some checks failed
CI/CD / syntax-check (push) Successful in 1m29s
CI/CD / deploy (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 05:42:55 +07:00
0315ee6a72 feat: add Discord bot service + workflow_dispatch trigger
All checks were successful
CI/CD / syntax-check (push) Successful in 1m5s
CI/CD / deploy (push) Successful in 14m7s
- Add discord-bot container to docker-compose (uses git.csrx.ru registry image)
- Inject DISCORD_BOT_TOKEN via .env, bot accesses Docker socket + Prometheus
- Add vault_discord_bot_{token,app_id,public_key}, aliases in main.yml
- Add workflow_dispatch to deploy.yml so /deploy bot command works

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 05:27:42 +07:00
f6f283944f vault: add OpenRouter API key, remove Syncthing remnant
Some checks failed
CI/CD / syntax-check (push) Successful in 1m6s
CI/CD / deploy (push) Has been cancelled
- Added vault_openrouter_api_key for n8n AI automations
- Added openrouter_api_key alias in main.yml
- Removed vault_syncthing_basic_auth_htpasswd (Syncthing was removed)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 05:16:44 +07:00
1b063c3947 fix(uptime-kuma): add proxy network for internet access to Discord/Telegram
Some checks failed
CI/CD / syntax-check (push) Successful in 1m7s
CI/CD / deploy (push) Has been cancelled
Container was on backend (internal: true) only — couldn't resolve
discord.com for webhook notifications. Added proxy network which
has outbound internet access.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 05:01:27 +07:00
d83ead2cbe feat(discord): integrate alerts and deploy notifications
Some checks failed
CI/CD / syntax-check (push) Successful in 1m3s
CI/CD / deploy (push) Has been cancelled
- Add discord_webhook_alerts and discord_webhook_deploys to vault + main.yml
- AlertManager: send alerts to both Telegram and Discord #alerts channel
- Forgejo: auto-create Discord webhook on repo pushes → #deploys channel

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 04:58:12 +07:00
a620bb381c fix: remove all remaining Vaultwarden references after service removal
Some checks failed
CI/CD / syntax-check (push) Successful in 1m1s
CI/CD / deploy (push) Has been cancelled
- tasks/main.yml: remove vaultwarden_image from image pull list
- tasks/directories.yml: remove vaultwarden/data directory creation
- backup.sh.j2: remove Vaultwarden backup/restore section and stop command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 04:49:12 +07:00
58e9a0f08b fix: remove vaultwarden_admin_token and DOMAIN_VAULT from env.j2
Some checks failed
CI/CD / syntax-check (push) Successful in 1m3s
CI/CD / deploy (push) Failing after 6m54s
Leftover after Vaultwarden removal caused CI/CD deploy to fail with
'vaultwarden_admin_token is undefined' during .env template rendering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 04:38:12 +07:00
40c8d291ca fix(plane): add WEB_URL and NEXT_PUBLIC_API_BASE_URL to plane-web container
Some checks failed
CI/CD / syntax-check (push) Successful in 1m6s
CI/CD / deploy (push) Failing after 5m49s
Without these env vars Next.js SSR renders with wrong base URL causing
React hydration error #418 — server/client HTML mismatch on first render.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 04:13:35 +07:00
75bed6bb04 feat: remove mail stack and Vaultwarden
Some checks failed
CI/CD / syntax-check (push) Successful in 1m15s
CI/CD / deploy (push) Has been cancelled
Removed services:
- docker-mailserver (Postfix + Dovecot)
- SnappyMail webmail
- Vaultwarden password manager

Removed infrastructure:
- certbot + Cloudflare DNS-01 TLS for mx.csrx.ru
- UFW rules for ports 25/587/993/465
- mail-internal and webmail-internal Docker networks
- SMTP config from Outline env
- vault, mail Traefik routes
- All related vault secrets and variables

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 04:06:29 +07:00
207e1dcff0 chore: project cleanup and docs update
All checks were successful
CI/CD / syntax-check (push) Successful in 1m29s
CI/CD / deploy (push) Successful in 16m39s
- Remove Syncthing mention from authelia comment in docker-compose
- Fix backup.sh.j2 comment: hourly → every 6 hours
- Update CLAUDE.md: add docs update rule, fix backup schedule note
- Update STATUS.md: dash.csrx.ru fixed, PTR pending, backup schedule, mail hostnames
- Update BACKLOG.md: mark DNS/PTR/backup-schedule done, add SnappyMail domain task
- Update DECISIONS.md: fix backup section (no --storage-class COLD, correct schedule)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 17:00:35 +07:00
634d50c25d chore(backup): change schedule from hourly to every 6 hours
All checks were successful
CI/CD / syntax-check (push) Successful in 1m10s
CI/CD / deploy (push) Successful in 15m50s
Runs at 00:00, 06:00, 12:00, 18:00. Removes old hourly cron entry.
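The six-hour schedule can be checked with a small calculation. The actual cron entry is not shown in this log; the sketch below assumes it is equivalent to the standard expression `0 */6 * * *`, and `next_run` is a hypothetical helper that computes the next slot after a given time.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, every_hours: int = 6) -> datetime:
    """Return the next top-of-hour slot whose hour is a multiple of `every_hours`."""
    candidate = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    while candidate.hour % every_hours != 0:
        candidate += timedelta(hours=1)
    return candidate


# Using this commit's timestamp as the starting point.
print(next_run(datetime(2026, 3, 23, 15, 30, 55)))
```

Starting from any time of day, the result always falls on 00:00, 06:00, 12:00, or 18:00, matching the slots listed in the commit message.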

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 15:30:55 +07:00