diff --git a/README.md b/README.md index 0a46a3a..67b583d 100644 --- a/README.md +++ b/README.md @@ -87,6 +87,8 @@ For all eight live scenarios with prompts you can copy-paste, see the **[Demo co ![Architecture](docs/architecture.svg) +> Looking ahead: a Kubernetes-friendly architecture with object-storage-backed user data and squashfs-packaged skills is being designed in [docs/requirements/](docs/requirements/). Docker Compose remains the primary supported path. + ## Ways to try it | Path | URL | What you need | Best for | diff --git a/docs/requirements/README.md b/docs/requirements/README.md new file mode 100644 index 0000000..e857551 --- /dev/null +++ b/docs/requirements/README.md @@ -0,0 +1,44 @@ +# Requirements & Architecture Plans + +This folder collects forward-looking design documents for Open Computer Use. +Unlike the rest of `docs/`, which describes how the project works **today**, +files here describe the architecture we are **planning to ship**. + +## Why this folder exists + +We want contributors and integrators to know where the project is going +before code lands. A new deployment target (Kubernetes), a new storage +model (object-store backed user data) or a new runtime contract +(`RuntimeBackend` abstraction) is far easier to review when the design is +written down up front, separate from any single PR. + +A document in `requirements/` is a **commitment to a direction**, not a +finished spec. We expect each one to be revised as prototypes land. When a +plan is fully delivered the document either moves to the main `docs/` tree +(now describing reality) or is archived with a note pointing at the code. + +## What's here + +| File | Status | Topic | +|------|--------|-------| +| [`k8s-architecture.md`](k8s-architecture.md) | Draft | Target architecture for Kubernetes deployments — runtime backends, storage tiering, isolation tiers | +| [`roadmap.md`](roadmap.md) | Draft | Phased delivery plan, what each phase changes, what stays compatible | + +## What this folder is **not** + +- Not a backlog of bugs or feature requests — those go to GitHub Issues. +- Not user-facing documentation — see `docs/INSTALL.md`, `docs/FEATURES.md`, + `docs/CLOUD.md` for that. +- Not authoritative until the corresponding code ships. If a doc here + conflicts with the running system, the running system wins. + +## How to contribute to a plan + +1. Open a GitHub Discussion or Issue referencing the document. +2. PRs that change a plan should explain **what changed and why** in the + PR description, not just diff the markdown. +3. Prototypes that validate (or invalidate) a plan are welcome — link the + PR back to the document so the next reader sees the evidence. + +The current Docker Compose deployment continues to be supported through +every phase below. No phase forces existing operators to migrate. diff --git a/docs/requirements/k8s-architecture.md b/docs/requirements/k8s-architecture.md new file mode 100644 index 0000000..00e0146 --- /dev/null +++ b/docs/requirements/k8s-architecture.md @@ -0,0 +1,260 @@ +# Target Architecture: Kubernetes-friendly Open Computer Use + +> **Status:** Draft. No code in this document has shipped yet. The Docker +> Compose deployment (`docker-compose.yml`, `docker-compose.webui.yml`) +> remains the primary supported path until phases below land. + +This document describes the target architecture for running Open Computer +Use on Kubernetes alongside the existing Docker Compose stack. It is the +reference for the phased delivery plan in [`roadmap.md`](roadmap.md). 
+ +## Goals + +1. **Run unchanged on any Kubernetes cluster** — managed (EKS, GKE, AKS), + self-hosted, or bare-metal. No dependency on a specific cloud provider. +2. **Preserve the Docker Compose path** — single-node operators keep the + existing experience; nothing they rely on is removed. +3. **Avoid exotic infrastructure** — no `ReadWriteMany` (RWX) storage, + no proprietary CSI drivers, no cluster-wide privileged daemons. +4. **Stay open-source friendly** — every dependency has a public, + permissively licensed implementation that contributors can run locally. +5. **Leave a clear path to stronger isolation** — Phase 2 can opt into + VM-class sandboxing without rewriting the orchestrator. + +## Non-goals (for the initial Kubernetes work) + +- Building our own object-storage service. Use S3-compatible backends + (AWS S3, MinIO, Cloudflare R2, Backblaze B2, Ceph RGW, …). +- Building a custom guest agent. The existing entrypoint plus MCP server + inside the workspace image is sufficient. +- Live migration of running workspaces between nodes. +- L7 egress filtering. `NetworkPolicy` covers L3/L4; richer policy is + out of scope until an enterprise need shows up. + +## Architecture overview + +``` + ┌──────────────────────────────┐ + │ Open WebUI (or any MCP host)│ + └──────────────┬───────────────┘ + │ MCP / HTTP + ▼ + ┌──────────────────────────────┐ + │ computer-use-server │ + │ (orchestrator, FastAPI) │ + │ │ + │ RuntimeBackend interface │ + │ ├── DockerBackend │ + │ └── K8sBackend │ + │ │ + │ Warm-pool manager │ + │ Skill registry / squashfs │ + │ S3 client (boto3) │ + └─┬──────────────┬─────────────┘ + │ │ + K8s API │ │ S3 API + ▼ ▼ + ┌────────────────────┐ ┌──────────────────────┐ + │ Workspace Pod │ │ Object Store │ + │ (one per chat) │ │ - skills/*.squashfs │ + │ │ │ - chats//* │ + │ rootfs (eph/PVC) │ │ - lifecycle TTL │ + │ /opt/skills/* ←┼──┘ │ + │ /mnt/user-data/* ←│ via FUSE sidecar │ + │ │ │ + │ entrypoint + MCP │ │ + └────────────────────┘ │ + ▼ + cleanup via + bucket lifecycle +``` + +## Storage model + +The single most important architectural decision is how data is laid out +across four tiers, each with a clear lifetime and access pattern. + +| Tier | Purpose | Lifetime | Access | Implementation | +|------|---------|----------|--------|----------------| +| **1. Image** | Base OS, language runtimes, browsers, MCP server | Pinned to image tag | RO, baked in | OCI image, pulled and cached by node kubelet | +| **2. Skills** | Runtime tools: pptx, docx, xlsx, sub-agent, … | Per-version, immutable | RO, mounted | Each skill packaged as a `squashfs` blob, fetched from object store at workspace start | +| **3. Workspace home** | Per-chat working directory (`/home/assistant`) | Per-chat | RW, exclusive | Ephemeral by default (Pod ephemeral storage); optional RWO `PersistentVolumeClaim` for chats that must survive a pod restart | +| **4. User data** | `uploads`, `outputs`, archived results | Per-chat | RO/WO depending on subdir | Object storage with chat-scoped key prefix, mounted via FUSE sidecar (e.g. `rclone mount`, `mountpoint-s3`) | + +### Why these choices + +- **No RWX storage anywhere.** Every mount is either RO (skills, uploads, + tool-results) or single-writer (workspace home, outputs). Single-writer + patterns work on `ReadWriteOnce` block storage that every cloud and + every CSI driver supports out of the box. +- **Skills are immutable artifacts.** A skill at version *v* is the same + bytes everywhere. 
Promoting a new version is a registry push, not a
+  filesystem mutation. This makes hot-reload semantics simple ("attach
+  the new blob to the next workspace that starts") and makes per-tenant
+  skill sets trivial (different blob list in `WorkspaceSpec`).
+- **User data is namespaced by path, not by volume.** A single bucket
+  holds all chats; per-chat isolation is the bucket prefix
+  `chats/<chat_id>/`. Cleanup is a lifecycle policy on the bucket; we
+  never have to enumerate K8s `PersistentVolumeClaim` objects for it.
+- **Workspace home is the only thing that may need real persistence.**
+  Most chats are short-lived enough that ephemeral storage is correct.
+  Long-lived "saved" chats can opt into a RWO PVC; the orchestrator
+  controls this via `WorkspaceSpec.home_persistence`.
+
+### Object store compatibility
+
+Every storage interaction is over the S3 protocol with a configurable
+endpoint:
+
+```
+S3_ENDPOINT_URL=http://minio:9000                              # local docker-compose
+S3_ENDPOINT_URL=https://s3.amazonaws.com                       # AWS
+S3_ENDPOINT_URL=https://<account_id>.r2.cloudflarestorage.com  # R2
+S3_ENDPOINT_URL=https://storage.googleapis.com                 # GCS interop
+```
+
+The Docker Compose stack ships a MinIO service so single-node operators
+get the same code paths as cloud deployments without registering for an
+external account.
+
+## Runtime backends
+
+The orchestrator talks to its workspace runtime through a `RuntimeBackend`
+interface. Two implementations ship:
+
+- **`DockerBackend`** — the existing Docker-socket-based code path,
+  refactored behind the interface. Default on Compose.
+- **`K8sBackend`** — Kubernetes Python client; creates `Pod`s (or
+  optionally `Deployment`s with `replicas=0/1`) and uses
+  `connect_get_namespaced_pod_exec` for the same `exec` channel the
+  Docker backend uses today.
+
+```python
+class RuntimeBackend(Protocol):
+    async def ensure_workspace(self, chat_id: str, spec: WorkspaceSpec) -> Workspace: ...
+    async def get_workspace(self, chat_id: str) -> Workspace | None: ...
+    async def get_address(self, chat_id: str) -> str | None: ...  # routable IP for CDP/ttyd proxy
+    async def exec(self, chat_id: str, cmd: list[str], **opts) -> ExecResult: ...
+    async def exec_stream(self, chat_id: str, cmd: list[str]) -> AsyncIterator[bytes]: ...
+    async def remove(self, chat_id: str) -> None: ...
+    async def list_workspaces(self) -> list[Workspace]: ...
+```
+
+`WorkspaceSpec` is a backend-agnostic description:
+
+```python
+@dataclass
+class WorkspaceSpec:
+    image: str
+    env: dict[str, str]
+    cpu: float                  # vCPU
+    memory: str                 # "2Gi"
+    skills: list[SkillRef]      # squashfs blobs to attach
+    user_data_namespace: str    # chat_id; becomes object-store prefix
+    home_persistence: Literal["ephemeral", "persistent"]
+    runtime_class: str | None   # e.g. "kata-fc" in Phase 2
+```
+
+The browser CDP and terminal proxies in `app.py` already work against a
+routable IP. They keep working unchanged: the Docker backend returns the
+container's network address, the Kubernetes backend returns
+`pod.status.podIP`. No proxy logic moves.
+
+## Workspace lifecycle and warm pool
+
+Cold-creating a Kubernetes Pod typically takes several seconds even after
+image pull is cached, dominated by scheduling, CSI volume attach, and
+container start. To match the responsiveness of `docker run`, the
+`K8sBackend` maintains a small **warm pool** of pre-started, idle Pods.
+
+- The pool runs *N* (default 2–5) workspace Pods with a generic identity
+  and no chat assigned.
+- When a new chat arrives, the orchestrator atomically claims one Pod
+  from the pool by relabelling it (`chat-id=<chat_id>`), injects per-chat
+  environment, and returns it as ready. Time-to-ready is dominated by
+  the relabel + env injection round-trip, typically a few hundred
+  milliseconds.
+- A background task replenishes the pool to size *N*.
+- On image change (new workspace tag), the pool is drained and rebuilt.
+
+This is independent of the storage tier and works with both the
+ephemeral and persistent home choices.
+
+## Isolation tiers
+
+Two runtime classes are supported, selected per workspace via
+`WorkspaceSpec.runtime_class`:
+
+| Tier | When to use | Trade-offs |
+|------|-------------|------------|
+| `runc` (default) | Trusted code, internal teams, dev/test | Shared kernel, fastest cold start, broadest compatibility |
+| `kata-fc` (Kata Containers on Firecracker) | Untrusted code, public multi-tenant deployments | Real VM boundary per workspace, requires Kata installed on nodes; slightly slower start, occasional driver/feature gaps |
+
+`gVisor` is intentionally not in the matrix: its compatibility envelope
+is too narrow for the workloads Open Computer Use runs (Chromium with
+sandbox flags, Playwright, browser downloads). If a deployment needs
+hardware-level isolation, `kata-fc` is the right tool.
+
+Selecting a runtime class is just setting the `PodSpec.runtimeClassName`
+field — no code changes are required beyond plumbing the value through
+`WorkspaceSpec`. Cluster admins install the runtime once via DaemonSet
+or pick a managed offering that includes it.
+
+## Network and security
+
+- **Per-namespace `NetworkPolicy`** denies workspace-to-workspace
+  traffic and workspace-to-Kubernetes-API traffic by default. Egress
+  to the public internet plus the orchestrator's port is allowed.
+- **`ResourceQuota` and `LimitRange`** cap blast radius per namespace.
+- **ServiceAccount per workspace** (or one shared, RBAC-empty SA) so the
+  workspace cannot enumerate or modify cluster state.
+- **`securityContext`**: `runAsNonRoot`, `allowPrivilegeEscalation: false`,
+  drop all capabilities, `seccompProfile: RuntimeDefault`. The Docker
+  setup's `security_opt: no-new-privileges` translates directly.
+- **Secrets** for API keys (Anthropic, vision, GitLab, …) come from
+  `Secret` objects via `envFrom`, created per chat or shared per
+  namespace depending on the tenancy model.
+
+## Cleanup
+
+The current `cron/` reaper container moves into the orchestrator process
+as a background asyncio task. It uses
+`RuntimeBackend.list_workspaces()` to enumerate and
+`RuntimeBackend.remove(chat_id)` to terminate, so the same code drives
+both backends.
+
+Object-store cleanup is handled out-of-band by the bucket's lifecycle
+policy (default: expire `chats/*` after 7 days). The orchestrator never
+walks the bucket itself.
+
+## What does **not** change
+
+- The MCP tool surface and JSON-RPC protocol.
+- The system-prompt rendering pipeline.
+- The browser CDP and terminal WebSocket proxies in `app.py`.
+- `cli_runtime.py` and the multi-CLI sub-agent code path.
+- The workspace `Dockerfile`. The same image is used by both backends;
+  the registry it is pulled from is a deployment concern.
+- The Open WebUI integration (`openwebui/` directory).
+
+## Open questions
+
+These decisions are deferred until prototypes provide evidence:
+
+- **Squashfs mount mechanism on `runc`**: kernel `mount -t squashfs`
+  needs `CAP_SYS_ADMIN`, while `squashfuse` works in user space at the
+  cost of needing FUSE in the Pod.
Validate which is preferred for
+  Phase 1 once we have a measurement.
+- **FUSE sidecar choice for user data**: `rclone mount`, `mountpoint-s3`,
+  and `geesefs` have different write semantics. `mountpoint-s3` only
+  supports sequential writes, which suits `outputs/` but may break some
+  tools that write atomic temp files. To be measured before commitment.
+- **Warm pool sizing heuristics**: static *N* is fine to start; whether
+  to scale with cluster load is a Phase 2 question.
+- **PVC pool vs. per-chat dynamic provisioning** for the persistent
+  home option. Pool reduces volume-attach latency at the cost of
+  pre-allocated capacity.
+
+Each open question becomes a small prototype PR with measurements
+attached. None blocks the Phase 1 refactor that exposes
+`RuntimeBackend`.
diff --git a/docs/requirements/roadmap.md b/docs/requirements/roadmap.md
new file mode 100644
index 0000000..8aac902
--- /dev/null
+++ b/docs/requirements/roadmap.md
@@ -0,0 +1,180 @@
+# Roadmap: Kubernetes-friendly Open Computer Use
+
+> **Status:** Draft. Phases below are ordered, but each ships
+> independently. The Docker Compose stack continues to be supported in
+> every phase. Operators are never forced to migrate.
+
+This roadmap delivers the architecture in
+[`k8s-architecture.md`](k8s-architecture.md) in small, reviewable steps.
+Every phase produces a useful artifact on its own; if work pauses
+between phases, the project is still in a coherent state.
+
+## Guiding principles
+
+- **No flag day.** Each phase is additive and gated by configuration.
+  Existing users see no behavioural change unless they opt in.
+- **Refactor before features.** The `RuntimeBackend` interface lands
+  before any new backend, so the diff that adds Kubernetes is small and
+  isolated.
+- **Prove with prototypes.** Open questions in
+  [`k8s-architecture.md`](k8s-architecture.md#open-questions) are
+  resolved with measured prototypes, not by debate.
+- **Single workspace image.** Both backends pull from the same
+  registry. We do not maintain a separate "Kubernetes image".
+
+## Phase 1 — Runtime abstraction (no behaviour change)
+
+**Goal:** Make the orchestrator backend-agnostic without altering
+runtime behaviour. Existing Docker Compose users see no difference.
+
+- Extract the `RuntimeBackend` `Protocol` and the `WorkspaceSpec` /
+  `Workspace` data classes from the current `docker_manager.py`.
+- Move the existing logic into `runtime/docker_backend.py` as the
+  default implementation. Same code, new home.
+- Plumb a `RUNTIME_BACKEND` environment variable; `docker` is the
+  default and only valid value at the end of this phase.
+- Update `app.py`, `mcp_tools.py`, `cli_runtime.py` callers to go
+  through `runtime.get_backend()` instead of importing
+  `docker_manager` directly.
+- Tests pass unchanged.
+
+**Exit criteria:** All current functionality works against the new
+abstraction. No new dependencies. PR is a pure refactor.
+
+## Phase 2 — Object-store backed user data (Compose first)
+
+**Goal:** Replace per-chat host bind mounts under
+`/tmp/computer-use-data/<chat_id>/` with S3-compatible object storage.
+Land it on Docker Compose first, where the change is contained, and
+keep the `K8sBackend` work decoupled.
+
+- Add a MinIO service to `docker-compose.yml`. Use it as the default
+  backend for local development and CI.
+- Introduce `S3_ENDPOINT_URL`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`,
+  `S3_BUCKET_DATA` configuration. The orchestrator uses `boto3` against
+  the configured endpoint.
+- Refactor uploads/outputs handlers in `app.py` to read and write
+  through the S3 client; the public HTTP API
+  (`/api/uploads/...`, `/files/...`) is unchanged.
+- Workspace containers receive `uploads`/`outputs`/`tool-results`
+  through a FUSE sidecar (or init-pull, behind a config flag) that
+  scopes the mount to `chats/<chat_id>/`.
+- `BASE_DATA_DIR` becomes either a local path (legacy) or an S3 prefix
+  (new path), selected by configuration.
+
+**Exit criteria:** Compose deployment runs end-to-end against MinIO.
+Bucket lifecycle policy replaces filesystem cleanup for user data. The
+host-bind code path is still available for one release as a fallback.
+
+## Phase 3 — Skills as squashfs blobs
+
+**Goal:** Replace per-skill ZIPs and host-path bind mounts with
+immutable squashfs artifacts in object storage.
+
+- Build skills as `.squashfs` blobs at release time
+  (`mksquashfs skills/<name> skills/<name>.squashfs`).
+- Push blobs to the same object store, under `skills/<name>/<version>/`.
+- Update `skill_manager.py` to fetch the blob list per user / per
+  tenant, attach blobs at workspace start (Compose: bind into the
+  container; Kubernetes: lands in Phase 4).
+- Drop the ZIP-cache + atomic-replace mechanism in
+  `skill_manager.py`. Hot-reload becomes "next workspace start picks
+  up the new version".
+- Document the immutability contract: a skill version is the same
+  bytes everywhere, forever.
+
+**Exit criteria:** All bundled skills ship as squashfs blobs. Compose
+users transparently get the new pipeline. Per-user skill selection
+(today's `get_user_skills_sync`) continues to work.
+
+## Phase 4 — Kubernetes backend
+
+**Goal:** Ship the `K8sBackend` and a Helm chart that operators can
+use to deploy the full stack on any Kubernetes cluster.
+
+- Implement `K8sBackend` against `kubernetes-asyncio`. Pod creation,
+  exec, log streaming, removal.
+- Mount squashfs skill blobs in workspace Pods (`squashfuse` sidecar
+  or init-container, depending on the squashfs-mount open question in
+  [`k8s-architecture.md`](k8s-architecture.md#open-questions)).
+- Mount user-data prefix via FUSE sidecar.
+- Expose CDP and ttyd through the existing orchestrator-side
+  WebSocket proxies. Pod IP is the routable target; no per-Pod
+  Service is created.
+- Helm chart `charts/open-computer-use/` covering: orchestrator,
+  Open WebUI, MinIO (optional), workspace Pod template (ConfigMap),
+  `NetworkPolicy`, RBAC, `ResourceQuota`.
+- CI test against `kind` running the same end-to-end test suite that
+  Compose uses.
+
+**Exit criteria:** A new operator can `helm install` the chart and
+get the same UX as `docker compose up`, on any cluster. Both
+backends share more than 90 % of their code path.
+
+## Phase 5 — Warm pool + cleanup migration
+
+**Goal:** Bring cold-start latency on Kubernetes down to parity with
+Compose, and replace the standalone cleanup cron.
+
+- Implement the warm-pool manager described in
+  [`k8s-architecture.md`](k8s-architecture.md#workspace-lifecycle-and-warm-pool).
+- Move the reaper loop from `cron/` into the orchestrator as an
+  asyncio background task. It calls
+  `RuntimeBackend.list_workspaces()` so the same code drives both
+  backends.
+- Surface metrics (`prometheus_client`): pool size, claim latency
+  histogram, workspace lifetime, reap counts.
+
+**Exit criteria:** A pool hit gives sub-second time-to-ready in a real
+cluster. Compose users get cleanup-via-orchestrator and can remove
+the `cleanup` service from their compose file.
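+
+For illustration only, here is a minimal sketch of the in-orchestrator
+reaper task this phase introduces, written against the `RuntimeBackend`
+protocol from [`k8s-architecture.md`](k8s-architecture.md). The
+`Workspace` fields (`chat_id`, `last_activity`) and the two timing
+constants are placeholders, not decided values:
+
+```python
+import asyncio
+import time
+
+WORKSPACE_TTL = 30 * 60   # seconds of inactivity before a workspace is reaped (illustrative)
+REAP_INTERVAL = 60        # how often the loop wakes up (illustrative)
+
+async def reaper_loop(backend: "RuntimeBackend") -> None:
+    """Periodically remove idle workspaces through the backend-agnostic protocol."""
+    while True:
+        now = time.time()
+        for ws in await backend.list_workspaces():
+            # Skip warm-pool Pods (no chat assigned); reap only chats idle past the TTL.
+            if ws.chat_id and now - ws.last_activity > WORKSPACE_TTL:
+                await backend.remove(ws.chat_id)
+        await asyncio.sleep(REAP_INTERVAL)
+```
+
+Because the loop only touches the protocol, the same code drives the
+Docker and Kubernetes backends; on Compose it is what replaces the
+standalone `cleanup` container.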
+ +## Phase 6 — Optional: VM-class isolation + +**Goal:** Offer hardware-level isolation for deployments that execute +fully untrusted code, without changing the orchestrator API. + +- Document supported runtime classes (`kata-fc`, possibly others). +- Add `WorkspaceSpec.runtime_class`; `K8sBackend` plumbs it to + `PodSpec.runtimeClassName`. +- Provide example values files for clusters with Kata Containers + preinstalled. +- Validate Chromium / Playwright / ttyd compatibility under + `kata-fc`. Document any flags or limitations. + +This phase is opt-in. Most deployments will keep `runc`. Operators +who need a real VM boundary can flip a single field. + +**Exit criteria:** A cluster admin who installs Kata Containers can +set `runtime_class: kata-fc` in chart values and run workloads in +microVMs without further code changes. + +## Tracking + +- Each phase opens a tracking GitHub Issue with the corresponding + document section linked. +- Phase status (`planned` / `in progress` / `delivered`) is reflected + in the table below as PRs land. +- When a phase is fully delivered, its content moves into the main + `docs/` tree (e.g. `INSTALL.md` gets a Kubernetes section) and the + draft section here is shortened to a pointer. + +| Phase | Topic | Status | +|-------|-------|--------| +| 1 | `RuntimeBackend` refactor | Planned | +| 2 | Object-store user data (Compose) | Planned | +| 3 | Skills as squashfs blobs | Planned | +| 4 | Kubernetes backend + Helm chart | Planned | +| 5 | Warm pool + cleanup migration | Planned | +| 6 | Optional VM-class isolation | Planned | + +## What stays compatible across all phases + +- `docker compose up` keeps working from Phase 1 through Phase 6. +- The MCP protocol surface is untouched. +- Existing skill packages keep working through Phase 2; Phase 3 + changes the *packaging format* but not the *authoring experience*. +- The Open WebUI integration is unchanged. It only sees the + orchestrator HTTP API, which does not move. + +If a phase ever requires a breaking change for Compose users, it will +be called out explicitly in `CHANGELOG.md` with a migration path.