Live Stream Inference Engine – Machine Learning Framework
LSIE-MLF is a containerized monorepo for real-time multimodal inference during live-stream sessions. It combines tethered mobile audio/video capture with external telemetry, synchronizes those inputs into fixed-duration segments, and runs ML analysis for transcription, facial action units, acoustic features, semantic evaluation, and downstream analytics.
The repository is organized around a clear runtime split:
- `api` handles external ingress and lightweight application logic
- `orchestrator` assembles synchronized inference segments
- `worker` executes compute-heavy ML tasks and analytics
- shared packages provide schemas and reusable ML utilities
| Container | Responsibility | Image | Port |
|---|---|---|---|
| `api` | REST endpoints and webhook ingress | `python:3.11-slim` | 8000 |
| `worker` | Celery task consumers for ML inference and analytics | `nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04` | – |
| `orchestrator` | Segment assembly, synchronization, dispatch | same image as `worker` | – |
| `redis` | Message broker / queue | `redis:7-alpine` | internal |
| `postgres` | Persistent analytical store | `postgres:16-alpine` | internal |
| `stream_scrcpy` | USB media capture container | `ubuntu:24.04` + scrcpy | – |
| Module | Primary responsibility |
|---|---|
| A – Hardware & Transport | Capture raw audio/video from tethered mobile hardware |
| B – Ground Truth Ingestion | Accept external event/telemetry inputs |
| C – Orchestration & Synchronization | Align timestamps, assemble segments, attach context |
| D – Multimodal ML Processing | Run transcription, facial, acoustic, and semantic inference |
| E – Experimentation & Analytics | Persist metrics, run analytics, manage experimentation state |
| F – Context Enrichment | Run asynchronous metadata enrichment workflows |
```
Android device
  -> stream_scrcpy
  -> orchestrator
  -> worker
  -> postgres

External telemetry / webhook ingress
  -> api
  -> redis
  -> orchestrator
  -> worker
  -> postgres
```
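The orchestrator hop above, turning timestamped capture and telemetry into fixed-duration segments, can be sketched in pure Python. The names here (`Sample`, `assemble_segments`) are illustrative assumptions, not the actual API in `services/worker/pipeline/`:

```python
from dataclasses import dataclass


@dataclass
class Sample:
    ts: float        # capture timestamp, seconds since session start
    payload: bytes   # raw media chunk or telemetry blob


def assemble_segments(samples: list[Sample], duration_s: float) -> dict[int, list[Sample]]:
    """Bucket timestamped samples into fixed-duration segments keyed by window index.

    Samples are sorted by timestamp first, so each segment preserves capture order.
    """
    segments: dict[int, list[Sample]] = {}
    for sample in sorted(samples, key=lambda s: s.ts):
        window = int(sample.ts // duration_s)
        segments.setdefault(window, []).append(sample)
    return segments
```

Each segment can then be dispatched to the worker queue as one unit, which is what keeps downstream inference aligned across modalities.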
```
/
├── docker-compose.yml       # Root service topology and shared runtime wiring
├── services/
│   ├── api/                 # API Server: routes, ingress, app wiring
│   ├── worker/              # Celery workers, ML execution, analytics
│   │   └── pipeline/        # Orchestration + analytics pipeline code
│   └── stream_ingest/       # Capture container entrypoints and device ingest
├── packages/
│   ├── ml_core/             # Shared ML utilities and math
│   └── schemas/             # Pydantic models and schema contracts
├── data/
│   ├── raw/                 # Transient/debug media buffers
│   ├── interim/             # Intermediate processing artifacts
│   ├── processed/           # Structured outputs prior to ingestion
│   └── sql/                 # PostgreSQL schema and seed data
└── requirements/
    ├── base.txt             # Shared dependencies
    ├── api.txt              # API-only dependencies
    └── worker.txt           # Worker/orchestrator + heavy ML dependencies
```
- All Dockerfiles under `services/` build from the monorepo root
- Shared code in `packages/` is available to all services
- Heavy ML dependencies belong in worker-side requirements, not the API image
```bash
cp .env.example .env
# Edit .env with the required credentials and runtime settings
```

Recommended local prerequisites:
- Docker Engine
- Docker Compose v2
- NVIDIA Container Toolkit (for GPU-backed inference)
- ADB / Android device connectivity if using live USB capture
```bash
docker compose up --build
```

Once the stack is running, the API server is available at http://localhost:8000.
To stop the stack:

```bash
docker compose down
```

To remove volumes as well:

```bash
docker compose down -v
```

The Operator Console is a PySide6 desktop app that runs on the operator's host, not in a container. It polls the API Server's `/api/v1/operator/*` aggregate routes.
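As a small sketch of how those aggregate routes resolve against the configured base URL (the `overview` surface name below is an assumption for illustration):

```python
from urllib.parse import urljoin

OPERATOR_PREFIX = "/api/v1/operator/"


def operator_route(base_url: str, surface: str) -> str:
    """Join the console's configured base URL with an aggregate-route path.

    urljoin with an absolute path replaces any path component on base_url,
    so trailing slashes on the configured URL are handled uniformly.
    """
    return urljoin(base_url, OPERATOR_PREFIX + surface)
```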
```bash
pip install -r requirements/cli.txt
python -m services.operator_console
```

Environment variables (all optional; sensible defaults apply):
| Variable | Purpose |
|---|---|
| `LSIE_OPERATOR_API_BASE_URL` | API Server base URL (default `http://localhost:8000`) |
| `LSIE_OPERATOR_API_TIMEOUT_SECONDS` | Per-request timeout in seconds (default 5) |
| `LSIE_OPERATOR_ENVIRONMENT_LABEL` | Free-text label shown in the status line (e.g. dev, staging) |
| `LSIE_OPERATOR_*_POLL_MS` | Per-surface poll cadences (overview, sessions, health, …); see `services/operator_console/config.py` for the full list |
The console ships six pages: Overview, Live Session, Experiments, Physiology, Health, and Sessions. Page behavior traces to the spec:

- Live Session's reward explanation uses `p90_intensity`, `semantic_gate`, `gated_reward`, `n_frames_in_window`, and `baseline_b_neutral` (§7B).
- Physiology surfaces `fresh`/`stale`/`absent`/`no-rmssd` as four distinct states (§4.C.4).
- Co-modulation `null` is rendered as a legitimate `null-valid` outcome with its `null_reason`, not as an error (§7C).
- Health distinguishes `degraded`/`recovering`/`error` with operator-action hints on the error summary card (§12).
| If you need to change... | Start here |
|---|---|
| API routes, request handling, webhook ingress | `services/api/` |
| Worker task execution or ML runtime behavior | `services/worker/` |
| Segment assembly, synchronization, analytics pipeline | `services/worker/pipeline/` |
| Shared inference helpers or math | `packages/ml_core/` |
| Schemas and data contracts | `packages/schemas/` |
| Database tables / initialization SQL | `data/sql/` |
| Dependency placement | `requirements/base.txt`, `requirements/api.txt`, `requirements/worker.txt` |
Use the `requirements/` split intentionally:

- put shared packages in `requirements/base.txt`
- put API-only packages in `requirements/api.txt`
- put worker/orchestrator and ML-heavy packages in `requirements/worker.txt`
This keeps the API image lightweight and improves Docker layer caching.
Before opening a PR, run the full local check suite. It mirrors `.github/workflows/ci.yml` exactly, so a green local run predicts a green CI run:

```bash
bash scripts/check.sh    # macOS / Linux / Git Bash on Windows
pwsh scripts/check.ps1   # PowerShell on Windows
```

The gate runs ruff lint, ruff format, mypy (strict, over `packages/`, `services/`, and `tests/`), pytest, the §0.3 canonical-terminology audit, `docker compose config`, schema consistency, and the dependency-pin check. Any drift between the local script and CI is treated as a bug in the script.
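As an illustration of what the dependency-pin check enforces, a simplified version might flag any requirement line that lacks an exact `==` pin. This is a sketch, not the actual check script:

```python
import re

# Accepts names with extras, e.g. "uvicorn[standard]==0.30.0"
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\]]+==[^=]+$")


def unpinned(requirement_lines: list[str]) -> list[str]:
    """Return requirement lines that do not pin an exact version with '=='.

    Blank lines and comments are ignored; everything else must match NAME==VERSION.
    """
    return [
        s
        for line in requirement_lines
        if (s := line.strip()) and not s.startswith("#") and not PINNED.match(s)
    ]
```

Exact pins are what make the API image small and the Docker layer cache stable, which is why the gate treats a range specifier as a failure rather than a warning.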
At a minimum, changes touching worker or analytics code should be validated against the full worker test path.
Raw media and inbound telemetry should be treated as processing inputs, not long-term application records. Persistent storage is intended for structured analytical outputs, experiment state, and derived metrics.
Keep README-level guidance brief and put detailed governance, retention, and security rules in the technical specification and implementation docs.
LSIE-MLF is implemented against the specification `docs/tech-spec-v3.2.pdf`.
This README is intentionally operational. It explains how the repository is organized, how to run it locally, and where to make changes. Detailed contracts, mathematical formulas, failure handling, and version history belong in the specification and amendment log.
If this README and the specification differ, the specification is authoritative.
Confidential. All rights reserved.