38 changes: 38 additions & 0 deletions README.md

---

<h2 id="deploy">Deploy</h2>

One-click templates for managed hosts. Each one ships a self-contained
Dockerfile that pulls `@agentmemory/agentmemory` from npm and copies
the iii engine binary in from the official `iiidev/iii` Docker Hub
image — no pre-built agentmemory image required. Persistent storage
mounts at `/data`; the first-boot entrypoint overwrites the
npm-bundled iii config (which binds `127.0.0.1`) with a deploy-tuned
one that binds `0.0.0.0` and uses absolute `/data` paths, generates
the HMAC secret, then drops privileges from `root` to `node` via
`gosu` before exec'ing the agentmemory CLI.

<p>
<a href="https://fly.io/launch?repo=https://github.com/rohitg00/agentmemory&path=deploy/fly"><img src="https://img.shields.io/badge/Deploy%20to-fly.io-8b5cf6?style=for-the-badge&logo=fly.io&logoColor=white" alt="Deploy to fly.io" /></a>
<a href="https://railway.com/new/template?template=https%3A%2F%2Fgithub.com%2Frohitg00%2Fagentmemory&rootDirectory=deploy%2Frailway"><img src="https://img.shields.io/badge/Deploy%20to-Railway-0B0D0E?style=for-the-badge&logo=railway&logoColor=white" alt="Deploy to Railway" /></a>
</p>

Render's one-click deploy button requires `render.yaml` at the repository root, which we deliberately keep clean. Use the Render Blueprint flow documented in [`deploy/render/`](./deploy/render/README.md) to point at the in-repo blueprint manually.

Full setup details (HMAC capture, viewer SSH tunnel, rotation, backup,
cost floors) live in [`deploy/`](./deploy/README.md):

- [`deploy/fly`](./deploy/fly/README.md) — single machine with
`auto_stop_machines = "stop"`; cheapest idle.
- [`deploy/railway`](./deploy/railway/README.md) — Hobby plan flat fee,
volume in the dashboard.
- [`deploy/render`](./deploy/render/README.md) — Blueprint flow,
automatic disk snapshots on paid plans.
- [`deploy/coolify`](./deploy/coolify/README.md) — self-hosted on your
own VPS via [Coolify](https://coolify.io/self-hosted); same Docker
Compose stack, you own the host and the data.

Only port `3111` is published. The viewer on `3113` stays bound to
loopback inside the container — every template's README documents the
SSH-tunnel pattern for reaching it.

---

<h2 id="why-agentmemory"><picture><source media="(prefers-color-scheme: dark)" srcset="assets/tags/light/section-why.svg"><img src="assets/tags/section-why.svg" alt="Why agentmemory" height="32" /></picture></h2>

Every coding agent forgets everything when the session ends. You waste the first 5 minutes of every session re-explaining your stack. agentmemory runs in the background and eliminates that entirely.
100 changes: 100 additions & 0 deletions deploy/README.md
# One-click deploy templates

Stand up agentmemory on managed infrastructure without rolling your own
Docker host. Each template ships a self-contained Dockerfile that pulls
`@agentmemory/agentmemory` from npm at build time and copies the iii
engine binary in from the official `iiidev/iii` image — no pre-built
agentmemory image required. Storage mounts at `/data`; an HMAC secret
is generated by the first-boot entrypoint and persisted to the volume.
The entrypoint overwrites the npm-bundled iii config with a
deploy-tuned one that binds `0.0.0.0` and uses absolute `/data` paths,
then drops privileges from `root` to `node` via `gosu` before
exec'ing the agentmemory CLI.
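The first-boot sequence described above can be sketched roughly as follows. This is a hedged illustration, not the shipped entrypoint: the config file name, config keys, and the final `agentmemory start` invocation are assumptions; the `/data/.hmac` path, `openssl rand -hex 32`, and the `gosu`-based privilege drop are from this doc.

```shell
#!/bin/sh
# Hypothetical first-boot sketch; see lead-in for which parts are assumed.
set -eu

# Replace the npm-bundled iii config (binds 127.0.0.1) with a deploy-tuned
# one: bind 0.0.0.0 and point every path at the /data volume.
write_deploy_config() {
  data_dir="$1"
  printf 'bind = "0.0.0.0:3111"\ndata_dir = "%s"\n' "$data_dir" \
    > "$data_dir/iii-config.toml"
}

# Generate the HMAC secret exactly once; later boots reuse the file silently.
ensure_hmac() {
  data_dir="$1"
  if [ ! -f "$data_dir/.hmac" ]; then
    umask 077                                   # secret file lands as 0600
    openssl rand -hex 32 > "$data_dir/.hmac"
    # Printed exactly once so the operator can capture it from deploy logs.
    echo "AGENTMEMORY_SECRET=$(cat "$data_dir/.hmac")"
  fi
}

# The real entrypoint would then hand /data to the unprivileged user and
# drop root before starting the CLI, e.g.:
#   chown -R node:node /data
#   exec gosu node agentmemory start
```

On a second boot `ensure_hmac` finds `/data/.hmac` already present and prints nothing, which is why the secret appears in the deploy logs only once.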

| Platform | Pitch | Cost floor |
|----------|-------|------------|
| [fly.io](./fly/README.md) | Single machine with auto-stop. Cheapest idle cost on a managed host; cold-start on first request after sleep. | ~$0.15/month at full idle |
| [Railway](./railway/README.md) | Push from GitHub, volume in the dashboard. Easiest managed dashboard flow. | $5/month (Hobby plan flat fee) |
| [Render](./render/README.md) | Blueprint-driven; persistent disk attaches automatically. Most "set it and forget it." | $7.25/month (Starter web + 1 GB disk) |
| [Coolify](./coolify/README.md) | Self-hosted on your own VPS. Same Docker Compose stack, you own the host and the data. | VPS cost only (Hetzner CX22 ~€3.79/month) |

## What every template guarantees

- **Volume mounted at `/data`.** Matches the path the engine has used
since v0.9.10.
- **HMAC secret generated on first boot** via `openssl rand -hex 32`,
written to `/data/.hmac` with `chmod 600`, and printed to stdout
exactly once so the operator can capture it from the deploy logs.
Subsequent boots load the secret from the file. The secret is never
committed to a config file or set as a platform env var.
- **Only port 3111 is exposed publicly.** The viewer on port 3113
stays bound to the container's localhost. Reach it via SSH tunnel
(see each platform's README).
- **TLS upstream of the container.** Every managed platform terminates
TLS at its edge proxy; the templates publish a single internal port
(`3111`) to that proxy, never to the host. Integration plugins
configured with `AGENTMEMORY_REQUIRE_HTTPS=1` will refuse to send the
bearer over plaintext HTTP to a non-loopback host, so a
misconfigured TLS layer fails loud instead of silently leaking the
secret.
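The last guarantee can be mirrored on the client side. A hedged sketch of the refuse-plaintext rule — the `memory_curl` wrapper is ours for illustration, not a shipped tool; only the `AGENTMEMORY_REQUIRE_HTTPS` behavior it imitates is from this doc:

```shell
# Hypothetical wrapper mirroring AGENTMEMORY_REQUIRE_HTTPS=1:
# only send the bearer over https:// or to a loopback host.
memory_curl() {
  url="$1"; secret="$2"
  case "$url" in
    https://*|http://127.0.0.1*|http://localhost*) ;;
    *)
      echo "refusing to send bearer over plaintext to $url" >&2
      return 1
      ;;
  esac
  curl -fsS -H "Authorization: Bearer $secret" "$url"
}
```

A misconfigured TLS layer then fails loud at the client too, instead of silently leaking the secret over plaintext HTTP.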

## Pick a platform

- Pick **fly.io** if you want the lowest idle cost and don't mind a
cold-start latency hit on the first request after sleep.
- Pick **Railway** if you want a clicky dashboard flow and a flat
monthly bill.
- Pick **Render** if you want the most "set it and forget it"
Blueprint flow with automatic disk snapshots on paid plans.
- Pick **Coolify** if you already run a VPS and want a self-hosted
control plane — same Docker Compose stack, no third-party host has
your memories.

All four give you the same agentmemory API at the same port (3111)
with the same auth model. Migrating between them later is a `tar` of
`/data` and a re-import — see each platform's README for the exact
commands.
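The migration itself is plain `tar`. A sketch, assuming you can reach `/data` on both sides; the helper names are ours, and the `docker exec` flow in the comments is illustrative:

```shell
# Hypothetical helpers: snapshot /data on the old host, unpack on the new one.
snapshot_data() { tar czf "$2" -C "$1" . ; }                 # dir -> archive
restore_data()  { mkdir -p "$2" && tar xzf "$1" -C "$2" ; }  # archive -> dir

# Typical flow (the /data path is from this doc, the rest is illustrative):
#   docker exec <container> tar czf - -C /data . > agentmemory-data.tgz
#   ...copy the archive to the new host, then inside the new container:
#   tar xzf agentmemory-data.tgz -C /data
```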

## Optional: LLM + embedding provider keys

Every template runs out of the box without any LLM or embedding key —
search falls back to BM25-only mode and synthetic (zero-LLM)
compression keeps memories indexable. To unlock LLM-powered
compression and hybrid (BM25 + vector) recall, add one of the
following to your platform's environment variables (Fly:
`flyctl secrets set`; Railway / Render / Coolify: dashboard
*Variables / Environment* tab):

| Variable | Purpose |
|---------------------------|----------------------------------------------------------|
| `ANTHROPIC_API_KEY` | LLM-backed compression + summarization |
| `GEMINI_API_KEY` | LLM provider alternative |
| `OPENROUTER_API_KEY` | LLM provider alternative |
| `OPENAI_API_KEY` | Embedding provider (text-embedding-3-small by default) |
| `VOYAGE_API_KEY` | Embedding provider alternative |
| `AGENTMEMORY_AUTO_COMPRESS=true` | Run LLM compression on every observation batch |
| `AGENTMEMORY_INJECT_CONTEXT=true` | Inject recalled memories back into agent prompts |

The defaults are intentionally conservative: provider keys default to
absent (no third-party calls), `AGENTMEMORY_AUTO_COMPRESS` is off,
and `AGENTMEMORY_INJECT_CONTEXT` is off. Opt in only after you've
confirmed your provider quota can absorb the workload.

## Cold-start budget

Measured in fly.io's `iad` region with a 1 GB volume:

```
machine image prepared : 5.1 s
volume mount + format : 2.5 s
firecracker boot : 1.0 s
entrypoint + chown : 0.5 s
iii-engine ready : 3.0 s
agentmemory worker reg : 2.0 s
─────────────────────────────────
healthcheck passes : ~9-10 s
```

Every template's health-check `grace_period` (or compose
`start_period`) is set to 30 s for a 3x safety margin. Tune lower
once you've measured your own platform's image-pull characteristics.
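On fly.io, for instance, that margin lives in the health-check block of `fly.toml`. A hedged fragment — key names vary across fly.toml schema revisions, so verify against your version before copying:

```toml
# Illustrative fly.toml fragment; check your fly.toml schema.
[[services]]
  internal_port = 3111
  protocol = "tcp"

  [[services.http_checks]]
    path = "/agentmemory/livez"
    grace_period = "30s"   # 3x the ~10 s measured cold start
    interval = "15s"
    timeout = "5s"
```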
32 changes: 32 additions & 0 deletions deploy/coolify/Dockerfile
ARG III_VERSION=0.11.2

FROM iiidev/iii:${III_VERSION} AS iii-image

FROM node:22-slim

ARG AGENTMEMORY_VERSION=0.9.12
ARG III_VERSION=0.11.2
ARG III_SDK_VERSION=0.11.2

RUN apt-get update \
&& apt-get install -y --no-install-recommends openssl ca-certificates tini gosu curl \
&& rm -rf /var/lib/apt/lists/*

COPY --from=iii-image /app/iii /usr/local/bin/iii

WORKDIR /opt/agentmemory
RUN printf '{"name":"agentmemory-deploy","version":"1.0.0","private":true,"overrides":{"iii-sdk":"%s"}}\n' "${III_SDK_VERSION}" > package.json \
&& npm install "@agentmemory/agentmemory@${AGENTMEMORY_VERSION}" --omit=optional --no-fund --no-audit \
&& ln -s /opt/agentmemory/node_modules/.bin/agentmemory /usr/local/bin/agentmemory

ENV AGENTMEMORY_III_VERSION=${III_VERSION} \
TINI_SUBREAPER=1

COPY --chmod=0755 entrypoint.sh /usr/local/bin/agentmemory-entrypoint.sh

EXPOSE 3111

HEALTHCHECK --interval=30s --timeout=5s --start-period=30s --retries=3 \
CMD curl -fsS http://127.0.0.1:3111/agentmemory/livez || exit 1

ENTRYPOINT ["/usr/bin/tini", "--", "/usr/local/bin/agentmemory-entrypoint.sh"]
132 changes: 132 additions & 0 deletions deploy/coolify/README.md
# Deploy agentmemory on Coolify

[Coolify](https://coolify.io/self-hosted) is an open-source, self-hosted
Heroku/Render alternative that you run on your own VPS. This template
deploys agentmemory as a Coolify *Application* backed by a Docker
Compose stack — Coolify handles TLS termination, persistent volume
provisioning, log aggregation, and the deploy webhook for you.

## What you get

- A public HTTPS endpoint serving the agentmemory REST API behind
Coolify's built-in Traefik/Caddy proxy. The container port (`3111`)
is exposed to the proxy network only — never bound to the host — so
TLS termination and domain routing stay under proxy control.
- A persistent Docker volume backing `/data` for memories, BM25 index,
and stream backlog. Coolify auto-prefixes the volume name with the
application's UUID so the data survives redeploys.
- An HTTP health-check at `/agentmemory/livez` declared in the
Dockerfile (`HEALTHCHECK` directive). Coolify reuses it for
rolling-deploy decisions.

## One-time setup

1. **Open your Coolify dashboard** and click **+ New → Application**.
2. **Source**: pick *Public Repository*. Paste:

   ```
   https://github.com/rohitg00/agentmemory
   ```

   Branch: `main`.
3. **Build Pack**: select *Docker Compose*.
4. **Base Directory**: `deploy/coolify`
5. **Compose Path**: `docker-compose.yml`
6. Click **Save**, then on the application settings screen set a
**Domain** in the form `https://<your-fqdn>:3111` (the `:3111`
suffix tells Coolify's proxy which container port to forward to;
it still serves over 443/80 publicly).
7. Click **Deploy**.

That's it. Coolify clones the repo, builds the Dockerfile under
`deploy/coolify/`, provisions the `agentmemory-data` named volume on
the host, attaches Traefik (or Caddy) for the public domain, and starts
the service. The container is reachable only through the proxy — there
is no published host port.

## Capture the HMAC secret

Once the deploy logs show the service is up, open the application's
**Logs** tab in Coolify and search for `AGENTMEMORY_SECRET=`. You will
see exactly one line of the form `AGENTMEMORY_SECRET=<64 hex chars>`.
Copy it into your client environment (`~/.bashrc`, Claude Desktop
config, etc.). The secret is never printed again on subsequent boots.
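If you save the deploy logs to a file, the capture can be scripted. A small sketch — `extract_secret` is our helper, not a shipped command; only the `AGENTMEMORY_SECRET=` banner format is from this doc:

```shell
# Pull the one-time secret banner out of a saved deploy log.
extract_secret() {
  grep -m1 'AGENTMEMORY_SECRET=' "$1" | sed 's/.*AGENTMEMORY_SECRET=//'
}
```

For example, `extract_secret deploy.log` (file name illustrative) prints just the 64 hex characters, ready to paste into your client config.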

## Verify the deployment

```bash
curl "https://<your-coolify-domain>/agentmemory/livez"
# {"status":"ok"}
```

For an authenticated call, your client must send
`Authorization: Bearer <secret>`.

## Viewer access (port 3113 stays internal)

The viewer port is not exposed by the compose file on purpose — it
holds the unauthenticated admin surface in older releases and the
proxied surface in current ones, neither of which belongs on the open
internet. Two paths to reach it:

**Option A — SSH into the Coolify host.** Coolify gives you SSH
access to the underlying VPS. The container's 3113 is never published
on the host, so a plain `ssh -L` tunnel has nothing to connect to;
instead, SSH in and curl the viewer from inside the container:

```bash
ssh <user>@<coolify-host>
# inside the SSH session, find the container:
docker ps --filter name=agentmemory --format "{{.Names}}"
# curl the viewer from inside the container:
docker exec -it <container-name> sh -c "curl http://localhost:3113"
```

For a real tunnel, bind the container's 3113 to the host's loopback by
adding a `ports:` entry of `- "127.0.0.1:3113:3113"` to the service in
`docker-compose.yml` and redeploying; after that,
`ssh -L 3113:127.0.0.1:3113 <user>@<coolify-host>` forwards the viewer
straight to your laptop.

**Option B — expose 3113 as a second Coolify domain protected by HTTP
basic auth.** Coolify's per-service routing supports adding a second
public endpoint with basic-auth middleware. Useful if you want to
share the viewer with a teammate without giving them SSH.

## Rotate the HMAC secret

```bash
ssh <user>@<coolify-host>
docker exec -it <container-name> sh -c "rm /data/.hmac"
exit
```

Then click **Redeploy** in the Coolify dashboard. The next boot prints
a fresh secret to the logs.

## Back up `/data`

Coolify exposes the named volume on the host filesystem under
`/var/lib/docker/volumes/<project-id>_agentmemory-data/_data`. Back it
up with your existing host-level snapshot tooling (Restic, Borg,
`rsync`, BTRFS snapshots, etc.) or via Coolify's built-in *Backups*
feature for Docker volumes.

## Cost floor and resources

- **Hardware**: the agentmemory container idles at ~150 MB RSS, climbs
to ~400 MB under steady traffic. The bundled iii engine adds another
~80 MB. A 1 vCPU / 1 GB VPS is comfortably enough for a personal
install.
- **VPS providers commonly paired with Coolify**: Hetzner CX22
(~€3.79/month), DigitalOcean Basic Droplet ($6/month), Vultr Cloud
Compute ($6/month). Coolify itself is free.
- **Volume storage**: tied to whatever block storage the VPS provides;
typically pennies per GB-month.

## Known caveats

- The Dockerfile builds on the Coolify host on every deploy. First
deploy takes ~2 minutes; cached layers shrink subsequent rebuilds to
under 30 seconds. Pin `AGENTMEMORY_VERSION` and `III_VERSION` in
`docker-compose.yml`'s `build.args` block to lock a specific release.
- Coolify's *Persistent Storage* tab will show `agentmemory-data` as a
managed volume — do not delete it from the dashboard if you want
your memories to survive a redeploy.
- arm64 hosts work as long as the pinned `iiidev/iii` tag is published
  multi-arch: the `COPY --from=iii-image` stage resolves to the build
  platform's variant, so the matching binary is copied in automatically.
30 changes: 30 additions & 0 deletions deploy/coolify/docker-compose.yml
services:
  agentmemory:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        AGENTMEMORY_VERSION: "0.9.12"
        III_VERSION: "0.11.2"
        III_SDK_VERSION: "0.11.2"
    restart: unless-stopped
    environment:
      - SERVICE_FQDN_AGENTMEMORY_3111
    expose:
      - "3111"
    volumes:
      - agentmemory-data:/data
    healthcheck:
      test: ["CMD-SHELL", "curl -fsS http://127.0.0.1:3111/agentmemory/livez || exit 1"]
      interval: 30s
      timeout: 5s
      start_period: 30s
      retries: 3
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

volumes:
  agentmemory-data: