In hosted mode, the current architecture keeps per-user app state in a process-local dictionary named _session_cache, keyed by a session_id stored in the Flask session cookie. This breaks when Gunicorn runs with more than one worker. With a single worker it should be fine for a modest number of users (roughly 10-25 concurrent).
Currently:
- Upload flow writes temp files and stores lightweight session info in the Flask session.
- A session_id is generated and used as key into _session_cache.
- Each key points to a TaterApp instance with loaded docs/widgets.
- Callbacks resolve current app at request time via session_id lookup.
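The flow above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `TaterApp` stand-in and its `docs` field, and the helper names `start_session` and `get_app_for_request`, are hypothetical, and a plain dict stands in for the Flask session so the sketch runs without Flask.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaterApp:
    """Stand-in for the real TaterApp; the `docs` field is hypothetical."""
    docs: List[str] = field(default_factory=list)


# Process-local cache: every worker process gets its own copy of this dict.
_session_cache: Dict[str, TaterApp] = {}


def start_session(session: dict, docs: List[str]) -> str:
    """Generate a session_id, keep it in the cookie session, cache the app."""
    session_id = uuid.uuid4().hex
    session["session_id"] = session_id  # only the small identifier goes in the cookie
    _session_cache[session_id] = TaterApp(docs=docs)
    return session_id


def get_app_for_request(session: dict) -> Optional[TaterApp]:
    """Resolve the app at request time; None means this process has no entry."""
    session_id = session.get("session_id")
    if session_id is None:
        return None
    return _session_cache.get(session_id)
```

A callback would call `get_app_for_request(session)` first and handle the `None` case (for example by asking the user to re-upload).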
Benefits:
- Very low complexity.
- Fast lookup and no network round trips.
- Easy callback model for Dash because state is in memory.
Limitations:
- Multi-worker mismatch: with Gunicorn workers > 1, each worker has its own _session_cache, so a request may land on a worker that does not have that user’s TaterApp.
- Horizontal scaling mismatch: multiple containers/pods do not share this cache.
- Volatile state: process restart clears cache; users lose in-memory session state.
- Memory growth: no eviction/TTL means a long-lived server can accumulate stale sessions.
- Session affinity dependency: behavior appears fine only when traffic consistently returns to the same process.
- Concurrency risk: mutable in-memory objects can be touched by concurrent requests/threads unless carefully controlled.
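The multi-worker mismatch can be shown with a toy model. The sketch below does not start Gunicorn; the `Worker` class is a hypothetical stand-in that gives each simulated worker its own private cache, which is the property that causes the problem.

```python
class Worker:
    """Models one worker process: each instance has its own private cache."""

    def __init__(self, name):
        self.name = name
        self._session_cache = {}

    def upload(self, session_id, app_state):
        # The upload request populates only the cache of the worker that served it.
        self._session_cache[session_id] = app_state

    def handle_callback(self, session_id):
        # A later callback sees only this worker's cache.
        return self._session_cache.get(session_id)


workers = [Worker("w1"), Worker("w2")]

# The upload lands on the first worker.
workers[0].upload("abc123", {"docs": ["report.txt"]})

# A callback routed to the same worker finds the state...
hit = workers[0].handle_callback("abc123")
# ...but one routed to the other worker does not.
miss = workers[1].handle_callback("abc123")
```

Here `hit` holds the uploaded state and `miss` is `None`, which is the intermittent failure a user would see when requests are spread across workers.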
Implications:
- Safe default is single worker/process for hosted mode.
- Rolling deploys or restarts will disrupt active annotation sessions.
- Autoscaling without shared session storage will produce intermittent “works/fails” behavior.
For now:
- Run one Gunicorn worker.
- Add a cache cleanup policy (TTL + max entries).
- Add explicit logging/metrics for cache hits/misses and session rebuilds.
- Document that hosted mode is single-process unless shared state is added.
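One possible shape for the cleanup policy and the hit/miss counters is sketched below. The `SessionCache` class, its defaults (one hour TTL, 50 entries), and the logger name are assumptions for illustration, not existing code. The clock is injectable so expiry can be tested without sleeping.

```python
import logging
import threading
import time
from collections import OrderedDict

logger = logging.getLogger("session_cache")  # hypothetical logger name


class SessionCache:
    """In-process cache with an idle TTL, a max size, and hit/miss counters."""

    def __init__(self, ttl_seconds=3600.0, max_entries=50, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = OrderedDict()  # session_id -> (last_used, app), oldest first
        self.hits = 0
        self.misses = 0

    def put(self, session_id, app):
        with self._lock:
            self._evict_expired()
            self._entries[session_id] = (self._clock(), app)
            self._entries.move_to_end(session_id)
            # Enforce the size cap by dropping the least recently used entries.
            while len(self._entries) > self._max:
                evicted_id, _ = self._entries.popitem(last=False)
                logger.info("evicted session %s (max entries)", evicted_id)

    def get(self, session_id):
        with self._lock:
            self._evict_expired()
            entry = self._entries.get(session_id)
            if entry is None:
                self.misses += 1
                logger.info("cache miss for session %s", session_id)
                return None
            _, app = entry
            # Refresh the timestamp so active sessions are not expired.
            self._entries[session_id] = (self._clock(), app)
            self._entries.move_to_end(session_id)
            self.hits += 1
            return app

    def _evict_expired(self):
        now = self._clock()
        # Entries are ordered by last use, so stop at the first fresh one.
        while self._entries:
            session_id, (last_used, _) = next(iter(self._entries.items()))
            if now - last_used <= self._ttl:
                break
            del self._entries[session_id]
            logger.info("evicted session %s (TTL)", session_id)
```

The lock only protects the cache's own bookkeeping; it does not make the cached TaterApp objects themselves safe to mutate from concurrent requests. For the single-worker setting, Gunicorn accepts `--workers 1` on the command line.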
Future:
- Move session app state out of process memory:
  - Redis or database-backed session/app state.
  - Shared file/object store for temp artifacts.
  - Keep only small identifiers in cookie session.
  - Rebuild or fetch TaterApp state deterministically per request/session.
- Then scale workers/replicas safely.
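A rough sketch of the shared-state direction, under stated assumptions: the `StateStore` interface, the key prefix `tater:session:`, and the idea that TaterApp state can be reduced to a JSON-serializable dict are all hypothetical. `InMemoryStore` is only a local stand-in; in a real deployment the store would be something shared by all workers, such as Redis or a database.

```python
import json
from typing import Optional, Protocol


class StateStore(Protocol):
    """Minimal key-value interface a shared store would need to provide."""

    def get(self, key: str) -> Optional[bytes]: ...

    def set(self, key: str, value: bytes) -> None: ...


class InMemoryStore:
    """Local stand-in for testing; NOT shared across processes."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def _key(session_id: str) -> str:
    return f"tater:session:{session_id}"  # hypothetical key scheme


def save_state(store: StateStore, session_id: str, state: dict) -> None:
    """Persist the serializable part of the app state under the session_id."""
    store.set(_key(session_id), json.dumps(state).encode("utf-8"))


def load_state(store: StateStore, session_id: str) -> Optional[dict]:
    """Fetch state for a request; None means the session is unknown or expired."""
    raw = store.get(_key(session_id))
    if raw is None:
        return None
    return json.loads(raw)
```

With this split, any worker that receives a request can call `load_state` and rebuild the app from the result, so the cookie only needs to carry the session_id. A production version would likely also set an expiry on each key.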