Tutorial: Supervise Your First Three Processes in 5 Minutes

This walkthrough takes a freshly built gula binary from "I cloned the repo" to "three cooperating processes are running with dependency ordering, a readiness probe, a memory limit, and graceful shutdown" — without leaving the terminal.

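If you have not built Gula yet, see Installation first. This page assumes a target/release/gula binary on a Linux host. The commands below invoke gula by name, so either add target/release to your PATH or substitute the full path to the binary.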

What you will build

A tiny three-process stack:

  1. config_writer — a one-shot setup job that writes a config file.
  2. api — a long-running fake API that starts once the config has been written, then becomes ready when its readiness probe finds a marker file.
  3. worker — a long-running consumer that only starts after api is ready, with a memory limit that triggers a restart on overrun.

By the end you will have used:

  • DAG ordering with both depends_on_completion and depends_on_ready.
  • A readiness_probe.
  • A memory_threshold_action.
  • Graceful shutdown via Ctrl-C.
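To make the ordering concrete, the two dependency edges in this tutorial can be fed to coreutils `tsort`, which prints one valid start order for a DAG. This is only an illustration of the order the edges imply; it says nothing about how Gula schedules internally, and `start_order` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: derive a start order from the tutorial's dependency edges.
# Each input line is "<dependency> <dependent>".
start_order() {
    tsort <<'EDGES'
config_writer api
api worker
EDGES
}

start_order
```

For this three-node chain there is only one valid order: config_writer, then api, then worker.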

Step 1 — Make a working directory

mkdir -p /tmp/gula-tutorial && cd /tmp/gula-tutorial
mkdir -p logs scripts

Step 2 — Drop in three trivial scripts

These stand in for real services. Each is just enough to exercise the supervisor.

cat > scripts/config_writer.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
echo "writing config..."
sleep 1
echo 'api_port=8080' > /tmp/gula-tutorial/app.conf
echo "done"
EOF

cat > scripts/api.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
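# Remove any stale marker left behind by a previous run.
rm -f /tmp/gula-tutorial/api.ready
# Install the cleanup trap first so a signal during boot still removes the marker.
trap 'rm -f /tmp/gula-tutorial/api.ready; exit 0' TERM INT
# Pretend we boot for a moment, then expose a "health" file.
sleep 2
: > /tmp/gula-tutorial/api.ready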
echo "api up"
while true; do sleep 1; done
EOF

cat > scripts/worker.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
echo "worker started"
trap 'echo "worker stopping"; exit 0' TERM INT
i=0
while true; do
    echo "tick $i"
    i=$((i + 1))
    sleep 1
done
EOF

chmod +x scripts/*.sh
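Before handing the scripts to a supervisor, it can help to confirm that each one is executable and parses cleanly. The following is a minimal sketch using `bash -n` (parse without executing); `check_scripts` is a hypothetical helper, not part of Gula.

```shell
#!/usr/bin/env bash
# check_scripts DIR: fail if any *.sh in DIR is missing, not executable,
# or does not pass a bash syntax check.
check_scripts() {
    local dir="$1" status=0 f
    for f in "$dir"/*.sh; do
        # If the glob matched nothing, $f is the literal pattern.
        [ -e "$f" ] || { echo "no scripts found in $dir" >&2; return 1; }
        if [ ! -x "$f" ]; then
            echo "not executable: $f" >&2
            status=1
        fi
        if ! bash -n "$f"; then
            echo "syntax error: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage from /tmp/gula-tutorial:
if check_scripts scripts; then
    echo "scripts look fine"
fi
```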

Step 3 — Write the Gula config

Save this as gula.yaml:

# yaml-language-server: $schema=https://raw.githubusercontent.com/unstablecursor/gula/main/schemas/gula.schema.json
logs_dir: "logs"

processes:
  - name: "config_writer"
    command: "/tmp/gula-tutorial/scripts/config_writer.sh"
    timeout_seconds: 30

  - name: "api"
    command: "/tmp/gula-tutorial/scripts/api.sh"
    timeout_seconds: 86400
    depends_on_completion: ["config_writer"]
    readiness_probe:
      command: "test -f /tmp/gula-tutorial/api.ready"
      interval_ms: 250
      timeout_seconds: 30

  - name: "worker"
    command: "/tmp/gula-tutorial/scripts/worker.sh"
    timeout_seconds: 86400
    depends_on_ready: ["api"]
    memory_limit_bytes: 134217728   # 128 MiB (128 * 1024 * 1024)
    memory_threshold_action: restart
    restart_policy:
      max_restarts: 3
      backoff_base_seconds: 1.0

And save the system config as gula_config.yaml:

cleanup_timeout_seconds: 5
metrics_flush_interval_ticks: 100

Step 4 — Validate before you run

gula validate --config gula.yaml --sys-config gula_config.yaml

You should see Configuration is valid. If not, the error message names the field and the reason — fix it and re-run.

Step 5 — Run

gula run --config gula.yaml --sys-config gula_config.yaml

What you should observe:

  1. config_writer runs first and exits 0 within ~1 s.
  2. api starts, sleeps 2 s, then creates /tmp/gula-tutorial/api.ready. The readiness probe (test -f ...) flips to success.
  3. worker only starts after api reports ready. It then prints tick N every second.
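The readiness behaviour in item 2 can be approximated by hand with a polling loop: run the probe command at a fixed interval until it succeeds or a deadline passes. This is a sketch of the general pattern under the assumption that interval_ms and timeout_seconds mean what their names suggest; it is not Gula's implementation, and `wait_until_ready` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# wait_until_ready PROBE INTERVAL_SECONDS TIMEOUT_SECONDS
# Runs PROBE through sh -c until it exits 0 (ready) or the timeout elapses.
wait_until_ready() {
    local probe="$1" interval="$2" timeout="$3" deadline
    deadline=$(( $(date +%s) + timeout ))
    while [ "$(date +%s)" -lt "$deadline" ]; do
        if sh -c "$probe"; then
            return 0            # probe succeeded: ready
        fi
        sleep "$interval"       # fractional values work with GNU sleep
    done
    return 1                    # timed out without a successful probe
}

# Mirrors the api probe in gula.yaml (250 ms interval, 30 s timeout):
#   wait_until_ready "test -f /tmp/gula-tutorial/api.ready" 0.25 30
```

The deadline is tracked in whole seconds, so the effective timeout is accurate to about one second, which is enough for a manual check.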

Step 6 — Inspect logs and metrics

In a second shell:

ls /tmp/gula-tutorial/logs
tail -f /tmp/gula-tutorial/logs/worker.log
column -ts, /tmp/gula-tutorial/logs/worker.csv | head

Each managed process gets its own <name>.log and <name>.csv. The supervisor's own usage is in gula_metrics.csv.
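For a quick look at how much data a metrics file holds, the sketch below prints its first line and the number of remaining rows. It assumes the first line of each CSV is a header row, which this page does not state; no column names are assumed, and `csv_summary` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# csv_summary FILE: show the first line (assumed to be a header) and the
# number of rows after it.
csv_summary() {
    local file="$1"
    echo "columns: $(head -n 1 "$file")"
    echo "samples: $(( $(wc -l < "$file") - 1 ))"
}

# Only runs if the tutorial's metrics file exists.
if [ -f /tmp/gula-tutorial/logs/worker.csv ]; then
    csv_summary /tmp/gula-tutorial/logs/worker.csv
fi
```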

Step 7 — Trigger a graceful shutdown

In the first shell, press Ctrl-C once.

Gula records the signal, forwards the same signal to every process group, interrupts any pending restart back-off, and opens a 5 s grace window (from cleanup_timeout_seconds). Only processes that are still running when the window closes are sent SIGKILL. You should see the worker's "worker stopping" trap message before the supervisor returns.

Exit code 0 means "clean signal shutdown or all processes succeeded".

Where to go next