diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
deleted file mode 100644
index 29c7a88..0000000
--- a/.github/workflows/deploy.yml
+++ /dev/null
@@ -1,51 +0,0 @@
-name: Deploy to Production
-
-on:
-  push:
-    branches: [master]
-  workflow_dispatch: # Manual trigger
-
-# Only run one deploy at a time
-concurrency:
-  group: production-deploy
-  cancel-in-progress: false
-
-jobs:
-  # Run CI checks first
-  ci:
-    uses: ./.github/workflows/ci.yml
-
-  deploy:
-    name: Deploy to cardano402.com
-    needs: ci
-    runs-on: ubuntu-latest
-    timeout-minutes: 10
-    if: github.ref == 'refs/heads/master'
-
-    steps:
-      - name: Deploy via SSH
-        uses: appleboy/ssh-action@v1.2.2
-        with:
-          host: ${{ secrets.DEPLOY_HOST }}
-          username: ${{ secrets.DEPLOY_USER }}
-          key: ${{ secrets.DEPLOY_SSH_KEY }}
-          port: ${{ secrets.DEPLOY_PORT }}
-          script: |
-            set -euo pipefail
-            cd /opt/cardano402
-            git pull origin master
-            docker compose -f docker-compose.prod.yml up -d --build facilitator
-
-            # Poll /health for up to 60s instead of a fixed sleep.
-            for i in $(seq 1 30); do
-              HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/health || echo 000)
-              if [ "$HTTP_CODE" = "200" ]; then
-                echo "Deploy successful -- health check passed after ${i} polls"
-                exit 0
-              fi
-              sleep 2
-            done
-
-            echo "Deploy failed -- /health did not return 200 within 60s (last code: ${HTTP_CODE})"
-            docker compose -f docker-compose.prod.yml logs facilitator --tail 40
-            exit 1
diff --git a/docs/deployment.md b/docs/deployment.md
index 06276d0..969f5cb 100644
--- a/docs/deployment.md
+++ b/docs/deployment.md
@@ -155,6 +155,14 @@ Image details:
 - **Size:** ~180 MB
 - **Health check:** Built-in (`wget` to `/health` every 30s)
 
+## Production deploys are manual by design
+
+There is no auto-deploy on merge. The VPS is reachable only over Tailscale and SSH is closed to the public internet — adding a GitHub-Actions deploy key would require widening the firewall to GitHub's runner IP ranges, which is a worse security posture than `bash deploy.sh` from a tailnet-attached laptop. See [`operations.md` § Manual deploy procedure](operations.md#manual-deploy-procedure) for the canonical runbook.
+
+CI (`.github/workflows/ci.yml`) still runs on every push and PR — lint, typecheck, test, build, docker build, security audit. It only runs inside the GitHub-Actions runner and makes no outbound SSH connection.
+
+If auto-deploy ever becomes desirable again, the right approach is the [Tailscale GitHub Action](https://github.com/tailscale/github-action), which attaches the runner to your tailnet for the deploy duration without opening any public port. Deferred until there's actual need.
+
 ## Bare Metal Deployment
 
 If you prefer running without Docker:
diff --git a/docs/operations.md b/docs/operations.md
index 75e719c..9ab907f 100644
--- a/docs/operations.md
+++ b/docs/operations.md
@@ -15,6 +15,50 @@
 4. Start server: `pnpm dev`
 5. Verify: `curl http://localhost:3000/health`
 
+## Manual deploy procedure
+
+Production deploys run manually from a tailnet-attached laptop (the VPS is Tailscale-only, no public SSH). The canonical "phased deploy" pattern used for any change that touches `docker-compose.prod.yml` or `Dockerfile`:
+
+```bash
+# On the VPS, in /opt/cardano402
+git pull origin master
+
+# Phase 1 — preserve current image as a rollback tag
+docker tag cardano402:latest cardano402:rollback-$(date +%Y-%m-%d)
+
+# Phase 2 — build the new image (no production impact)
+docker compose -f docker-compose.prod.yml build --no-cache facilitator
+
+# Phase 3 — smoke-test on a side port (no production impact)
+docker run --rm -d --name cardano402-smoke -p 127.0.0.1:3001:3000 \
+  -v /opt/cardano402/config/config.json:/app/config/config.json:ro \
+  --network cardano402_default \
+  -e NODE_ENV=production -e MAINNET=true \
+  cardano402:latest
+sleep 8 && curl -s http://127.0.0.1:3001/health && docker stop cardano402-smoke
+
+# Phase 4 — swap (~30s downtime, watch for healthy)
+docker compose -f docker-compose.prod.yml up -d facilitator
+for i in $(seq 1 30); do
+  [ "$(docker inspect cardano402 --format '{{.State.Health.Status}}')" = "healthy" ] && break
+  sleep 2
+done
+
+# Phase 5 — verify
+curl -s http://localhost:3000/health
+docker inspect cardano402 --format 'mem_limit: {{.HostConfig.Memory}} restartCount: {{.RestartCount}}'
+docker logs --since 5m cardano402 2>&1 | grep -iE '"level":(50|40)' | head -5
+```
+
+**Rollback** if Phase 4 or 5 reveals a problem:
+
+```bash
+docker tag cardano402:rollback- cardano402:latest
+docker compose -f docker-compose.prod.yml up -d facilitator
+```
+
+For routine deploys (no Dockerfile or compose change), `bash deploy.sh` runs the same pull + build + restart sequence in one shot — it skips the phased smoke-test gate, so use the phased procedure above whenever the change could affect container behavior.
+
 ## Production Deployment (Docker)
 
 ### 1. Create production config