infra/roles/services/tasks/main.yml
jack 6ebd237894
Some checks failed
CI/CD / deploy (push) Has been cancelled
CI/CD / syntax-check (push) Successful in 1m7s
feat: major infrastructure improvements
Reliability:
- Add swap role (2GB, swappiness=10, idempotent via /etc/fstab)
- Add mem_limit to plane-worker (512m) and plane-beat (256m)
- Add health checks to all services (traefik, vaultwarden, forgejo,
  plane-*, syncthing, prometheus, grafana, loki)

Code quality:
- Remove Traefik Docker labels (file provider used, labels were dead code)
- Add comment explaining file provider architecture

Observability:
- Add AlertManager with Telegram notifications
- Add Prometheus alert rules: CPU, RAM, disk, swap, container health
- Add Loki + Promtail for centralized log aggregation
- Add Loki datasource to Grafana
- Enable Traefik /ping endpoint for health checks

Backups:
- Add backup role: pg_dump for forgejo + plane DBs, tar for
  vaultwarden and forgejo data
- 7-day retention, daily cron at 03:00
- Backup script at /usr/local/bin/backup-services

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:28:16 +07:00

77 lines
2.5 KiB
YAML

---
- import_tasks: directories.yml
- import_tasks: configs.yml
- name: Pull Docker images one by one
ansible.builtin.command: docker pull {{ item }}
loop:
- "{{ traefik_image }}"
- "{{ vaultwarden_image }}"
- "{{ forgejo_image }}"
- "{{ forgejo_db_image }}"
- "{{ plane_frontend_image }}"
- "{{ plane_admin_image }}"
- "{{ plane_space_image }}"
- "{{ plane_backend_image }}"
- "{{ plane_db_image }}"
- "{{ plane_redis_image }}"
- "{{ plane_minio_image }}"
- "{{ syncthing_image }}"
- "{{ act_runner_image }}"
- "{{ prometheus_image }}"
- "{{ node_exporter_image }}"
- "{{ cadvisor_image }}"
- "{{ grafana_image }}"
- "{{ alertmanager_image }}"
- "{{ loki_image }}"
- "{{ promtail_image }}"
register: pull_result
changed_when: "'Status: Downloaded newer image' in pull_result.stdout"
retries: 5
delay: 30
until: pull_result.rc == 0
- name: Deploy Docker Compose stack
community.docker.docker_compose_v2:
project_src: "{{ services_root }}"
state: present
pull: never
retries: 3
delay: 15
register: compose_result
until: compose_result is succeeded
notify: Stack deployed
- name: Wait for MinIO to be ready
ansible.builtin.command: docker exec plane-minio curl -sf http://localhost:9000/minio/health/live
register: minio_ready
changed_when: false
retries: 15
delay: 10
until: minio_ready.rc == 0
- name: Get plane-internal network name
ansible.builtin.shell: >
docker inspect plane-minio |
python3 -c "import sys,json; d=json.load(sys.stdin)[0];
print([k for k in d['NetworkSettings']['Networks'] if 'plane-internal' in k][0])"
register: plane_internal_network
changed_when: false
- name: Create MinIO uploads bucket via mc container
# minio/mc entrypoint = mc, поэтому нужен --entrypoint sh
# access-key = имя пользователя MinIO (plane-minio), secret-key = пароль
ansible.builtin.shell: |
docker run --rm \
--entrypoint sh \
--network "{{ plane_internal_network.stdout | trim }}" \
-e MC_ACCESS="{{ plane_minio_password }}" \
minio/mc:RELEASE.2025-05-21T01-59-54Z \
-c 'mc alias set local http://plane-minio:9000 plane-minio "{{ plane_minio_password }}" 2>/dev/null \
&& mc mb --ignore-existing local/uploads \
&& echo "Bucket created or already exists"'
register: minio_bucket
changed_when: "'Bucket created' in minio_bucket.stdout"
retries: 5
delay: 10
until: minio_bucket.rc == 0