Commit a4a8b32
docs: add in-depth worker documentation
Cover architecture, BERTopic pipeline, quality metrics, multilingual support, error handling, deployment, development workflow, and API contract. MDX format matching docs.dev.faculytics for future integration.
1 parent 0258124 commit a4a8b32

9 files changed

Lines changed: 886 additions & 0 deletions

File tree

docs/architecture.mdx

Lines changed: 90 additions & 0 deletions
---
title: "Architecture"
description: "File structure, module responsibilities, and data flow through the topic modeling worker."
---

## File Structure

```
src/
├── handler.py      # RunPod entry point — validation, orchestration, error handling
├── config.py       # Constants: model name, device, version, default hyperparameters
├── models.py       # Pydantic request/response schemas (mirrors Zod DTOs in API)
├── topic_model.py  # BERTopic pipeline: UMAP → HDBSCAN → c-TF-IDF → KeyBERTInspired
└── evaluate.py     # Quality metrics: NPMI, diversity, silhouette, embedding coherence
```

## Module Responsibilities

### `handler.py` — Entry Point

The RunPod serverless handler. Performs:

1. **Input parsing** — extracts `input` from the RunPod event envelope, validates with Pydantic
2. **Parameter merging** — overlays request params onto RUN 012 defaults
3. **Auto-scaling** — adjusts `min_topic_size` and `umap_n_neighbors` for small datasets
4. **Validation** — checks minimum item count, embedding dimensionality, zero vectors
5. **Orchestration** — calls `run_bertopic()`, `extract_topic_info()`, `get_assignments()`, `compute_metrics()`
6. **Error routing** — domain errors return `status: "failed"` (no BullMQ retry); unexpected exceptions propagate to RunPod (triggers retry)

### `config.py` — Configuration

Static configuration, no environment variables:

```python
LABSE_MODEL = "sentence-transformers/LaBSE"
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
WORKER_VERSION = "1.0.0"

# RUN 012 defaults — proven optimal from experimentation
DEFAULT_PARAMS = {
    "min_topic_size": 15,
    "nr_topics": 20,
    "umap_n_neighbors": 20,
    "umap_n_components": 10,
}
```

### `models.py` — Schemas

Pydantic models that mirror the Zod schemas in `api.faculytics/src/modules/analysis/dto/topic-model-worker.dto.ts`. All models use `ConfigDict(extra="ignore")` to tolerate envelope fields (`jobId`, `version`, `type`, `metadata`, `publishedAt`) without validation errors.

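A minimal sketch of that pattern. The field names follow the documented request schema, but this is an illustrative model, not the worker's full `models.py`:

```python
from pydantic import BaseModel, ConfigDict


class TopicModelItem(BaseModel):
    # extra="ignore" drops unknown envelope fields instead of raising
    model_config = ConfigDict(extra="ignore")

    submissionId: str
    text: str
    embedding: list[float]


# Envelope fields such as jobId are silently discarded during validation
item = TopicModelItem.model_validate({
    "submissionId": "abc",
    "text": "too fast",
    "embedding": [0.1, 0.2],
    "jobId": "ignored-envelope-field",
})
```

With the default `extra="ignore"` behavior shown here, `item` carries only the declared fields; the same payload would raise a `ValidationError` under `extra="forbid"`.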
### `topic_model.py` — BERTopic Pipeline

The core ML pipeline. See [Pipeline](/docs/pipeline) for details.

### `evaluate.py` — Quality Metrics

Computes five quality metrics on the fitted model. See [Metrics](/docs/metrics) for details.

## Data Flow

```mermaid
flowchart TD
    A[RunPod event] --> B[handler.py]
    B --> C{Validate input}
    C -- Invalid --> D[Return status: failed]
    C -- Valid --> E[Auto-scale params for small datasets]
    E --> F[Extract texts, embeddings, submission IDs]
    F --> G{Zero vector check}
    G -- All zero --> D
    G -- Some zero --> H[Filter zero vectors]
    G -- None zero --> I[run_bertopic]
    H --> I
    I --> J[extract_topic_info]
    J --> K{0 topics?}
    K -- Yes --> D
    K -- No --> L[get_assignments]
    L --> M[compute_metrics]
    M --> N[Return TopicModelResponse]
```

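The zero-vector branch of the flow can be illustrated with a small numpy sketch. The function name and signature are assumptions for illustration, not the handler's actual API:

```python
import numpy as np


def filter_zero_vectors(embeddings, texts, submission_ids):
    """Drop items whose embedding is the all-zero vector; fail the run
    (a domain error, no retry) if nothing survives the filter."""
    emb = np.asarray(embeddings, dtype=np.float32)
    keep = ~np.all(emb == 0.0, axis=1)
    if not keep.any():
        raise ValueError("All embeddings are zero vectors")
    kept_texts = [t for t, k in zip(texts, keep) if k]
    kept_ids = [s for s, k in zip(submission_ids, keep) if k]
    return emb[keep], kept_texts, kept_ids
```

The three parallel sequences (embeddings, texts, submission IDs) must be filtered together so assignments can still be joined back to submissions by index.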
## Global State

The LaBSE model is loaded once at module import time (container start) and shared across all handler invocations:

```python
embed_model = SentenceTransformer(LABSE_MODEL, device=DEVICE)
```

This avoids cold-start latency on subsequent requests. The model is ~1.8 GB and is baked into the Docker image during build.

docs/contract.mdx

Lines changed: 139 additions & 0 deletions
---
title: "API Contract"
description: "Request and response schemas for the topic modeling worker — field definitions, types, and examples."
---

**Source of truth:** `api.faculytics/src/modules/analysis/dto/topic-model-worker.dto.ts` (Zod schemas)

**Worker schemas:** `src/models.py` (Pydantic, must stay in sync with Zod)

## Endpoint

`POST {TOPIC_MODEL_WORKER_URL}`

When deployed on RunPod, the actual endpoint is:

```
POST https://api.runpod.ai/v2/<endpoint-id>/runsync
Headers: { Authorization: Bearer <RUNPOD_API_KEY> }
Body: { input: <request payload> }
```

The RunPod envelope (`input` wrapper, `output` unwrapping) is handled by the API's `RunPodBatchProcessor`.

## Request

```json
{
  "items": [
    {
      "submissionId": "uuid-string",
      "text": "The pace was too fast, couldn't follow along.",
      "embedding": [0.123, -0.456, 0.789, "... (768 floats)"]
    }
  ],
  "params": {
    "min_topic_size": 15,
    "nr_topics": 20,
    "umap_n_neighbors": 20,
    "umap_n_components": 10
  }
}
```

### Request Fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `items` | array | Yes | — | Submissions that passed the sentiment gate |
| `items[].submissionId` | string | Yes | — | Unique submission identifier |
| `items[].text` | string | Yes | — | Pre-cleaned qualitative comment (`cleanedComment`) |
| `items[].embedding` | number[768] | Yes | — | Pre-computed LaBSE 768-dim embedding |
| `params` | object | No | RUN 012 defaults | BERTopic hyperparameters |
| `params.min_topic_size` | int | No | 15 | Minimum documents per topic cluster |
| `params.nr_topics` | int | No | 20 | Target topic count (merges until reached) |
| `params.umap_n_neighbors` | int | No | 20 | UMAP local neighborhood size |
| `params.umap_n_components` | int | No | 10 | UMAP output dimensions |

The worker uses `ConfigDict(extra="ignore")` on all Pydantic models, so additional envelope fields sent by the API (`jobId`, `version`, `type`, `metadata`, `publishedAt`) are silently ignored during validation.

## Response — Success

```json
{
  "version": "1.0.0",
  "status": "completed",
  "topics": [
    {
      "topicIndex": 0,
      "rawLabel": "0_fast_rushed_pace",
      "keywords": ["fast", "rushed", "pace", "speed", "hurry", "quick", "follow", "slow", "behind", "catch"],
      "docCount": 45
    }
  ],
  "assignments": [
    {
      "submissionId": "uuid-string",
      "topicIndex": 0,
      "probability": 0.7234
    }
  ],
  "metrics": {
    "npmi_coherence": 0.1523,
    "topic_diversity": 0.8200,
    "outlier_ratio": 0.1150,
    "silhouette_score": 0.2341,
    "embedding_coherence": 0.6102
  },
  "outlierCount": 12,
  "completedAt": "2026-03-21T10:35:00.000Z"
}
```

## Response — Failure

```json
{
  "version": "1.0.0",
  "status": "failed",
  "error": "Received 8 items, need at least 15 (min_topic_size) for topic modeling",
  "completedAt": "2026-03-21T10:35:00.000Z"
}
```

### Response Fields

| Field | Type | Present | Description |
| --- | --- | --- | --- |
| `version` | string | Always | Worker version (from `config.WORKER_VERSION`) |
| `status` | `"completed"` \| `"failed"` | Always | Outcome status |
| `topics` | array | On success | Discovered topic clusters |
| `topics[].topicIndex` | int | — | BERTopic topic ID (0, 1, 2, ...) |
| `topics[].rawLabel` | string | — | Auto-generated label (e.g., `"0_fast_rushed_pace"`) |
| `topics[].keywords` | string[] | — | Top 10 keywords from KeyBERTInspired |
| `topics[].docCount` | int | — | Documents in this cluster |
| `assignments` | array | On success | Per-document topic assignments |
| `assignments[].submissionId` | string | — | Matches input `submissionId` |
| `assignments[].topicIndex` | int | — | Assigned topic index |
| `assignments[].probability` | number (0-1) | — | Assignment confidence (4 decimal places) |
| `metrics` | object | On success | Model quality metrics (see [Metrics](/docs/metrics)) |
| `outlierCount` | int | On success | Documents assigned to topic -1 |
| `error` | string | On failure | Human-readable error message |
| `completedAt` | ISO datetime | Always | Processing completion timestamp |

## API-Side Processing

After receiving the response, the `TopicModelProcessor` in the API:

1. Validates the response against `topicModelWorkerResponseSchema` (Zod)
2. Creates `Topic` entities for each topic (with `rawLabel`, `keywords`, `docCount`)
3. Creates `TopicAssignment` entities — filters out assignments with probability ≤ 0.01
4. Marks the highest-probability assignment per submission as `isDominant`
5. Persists metrics on the `TopicModelRun` entity
6. Calls the orchestrator to advance the pipeline to topic labeling

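The assignment-filtering and dominant-marking steps amount to a threshold filter plus a per-submission argmax. The API implements this in TypeScript; the following is an equivalent Python sketch with an illustrative function name:

```python
def select_assignments(assignments: list[dict]) -> list[dict]:
    """Drop assignments with probability <= 0.01, then mark the
    highest-probability assignment per submission as dominant."""
    kept = [a for a in assignments if a["probability"] > 0.01]

    # Find the best assignment per submissionId
    best: dict[str, dict] = {}
    for a in kept:
        sid = a["submissionId"]
        if sid not in best or a["probability"] > best[sid]["probability"]:
            best[sid] = a

    for a in kept:
        a["isDominant"] = a is best[a["submissionId"]]
    return kept
```

Because outliers never appear in `assignments`, every surviving submission ends up with exactly one dominant assignment.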
## Notes

- Outlier documents (topic -1) are **not** included in the `assignments` array
- The `rawLabel` is later enriched with a human-readable `label` by the topic labeling stage (LLM)
- Embeddings must be 768-dim LaBSE vectors — the same model used by the embedding worker

docs/deployment.mdx

Lines changed: 118 additions & 0 deletions
---
title: "Deployment"
description: "Docker image build, RunPod serverless configuration, and production deployment."
---

The worker is deployed as a RunPod serverless endpoint running on GPU instances.

## Docker Image

The Dockerfile uses RunPod's PyTorch base image with CUDA support:

```dockerfile
FROM runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04

COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

WORKDIR /app

COPY pyproject.toml .python-version ./
COPY uv.loc[k] ./

RUN uv sync --frozen --no-dev --no-install-project || uv sync --no-dev --no-install-project

# Bake LaBSE into image (~1.8 GB) to avoid cold-start download
RUN uv run python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/LaBSE')"

COPY src/ src/

CMD ["uv", "run", "python", "-m", "src.handler"]
```

### Key Build Decisions

- **LaBSE baked in** — the model is downloaded during build (`~1.8 GB`). This eliminates cold-start latency from model downloads on fresh containers.
- **uv for dependency management** — faster than pip, with lockfile support. Falls back to a non-frozen install if no lockfile exists.
- **No dev dependencies** — `--no-dev` keeps the image lean (no pytest, ruff).
- **Source copied last** — Docker layer caching means dependency installation only reruns when `pyproject.toml` or `uv.lock` change.

### Building

```bash
docker build -t topic-worker .
```

The image is ~8-10 GB due to CUDA runtime + PyTorch + LaBSE model.

### Pushing to Registry

```bash
docker tag topic-worker <registry>/topic-worker:latest
docker push <registry>/topic-worker:latest
```

## RunPod Configuration

### Serverless Endpoint Setup

1. Create a serverless endpoint on [RunPod](https://www.runpod.io/)
2. Point it to the Docker image in your registry
3. Configure GPU type (any CUDA-capable GPU works; 16GB+ VRAM recommended)
4. Set the endpoint URL in the API's `.env`:

```bash
TOPIC_MODEL_WORKER_URL=https://api.runpod.ai/v2/<endpoint-id>/runsync
RUNPOD_API_KEY=<your-key>
```

### Request Flow

```
API    → POST /v2/<endpoint-id>/runsync
         Body: { input: { items: [...], params: {...} } }
         Headers: { Authorization: Bearer <RUNPOD_API_KEY> }

RunPod → Starts container (or uses warm instance)
       → Calls handler({ input: { items: [...], params: {...} } })

Worker → Returns result dict

RunPod → Wraps in { id, status: "COMPLETED", output: <result> }
       → Returns to API
```

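The API side of this flow can be sketched as a small helper that assembles the `/runsync` call. The helper itself is illustrative (the real logic lives in the API's `RunPodBatchProcessor`, in TypeScript):

```python
def build_runsync_request(endpoint_id: str, api_key: str, items: list, params: dict):
    """Assemble the RunPod /runsync call: URL, auth header, and the
    request payload wrapped in RunPod's `input` envelope."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    headers = {"Authorization": f"Bearer {api_key}"}
    body = {"input": {"items": items, "params": params}}
    return url, headers, body
```

Sending the request (e.g. `requests.post(url, headers=headers, json=body)`) and unwrapping the `output` field of the response are then the only remaining steps.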
### Scaling

| Setting | Recommended |
| --- | --- |
| Min workers | 0 (scale to zero when idle) |
| Max workers | 1-2 (topic modeling is a batch operation, not high-throughput) |
| Idle timeout | 30s (keep warm for short periods between pipeline stages) |
| Execution timeout | 300s (matches `BULLMQ_TOPIC_MODEL_HTTP_TIMEOUT_MS`) |

## Configuration

The worker has no environment variables — all configuration is in `src/config.py`:

| Config | Value | Purpose |
| --- | --- | --- |
| `LABSE_MODEL` | `sentence-transformers/LaBSE` | Embedding model for KeyBERTInspired |
| `DEVICE` | `cuda` or `cpu` (auto-detected) | PyTorch device |
| `WORKER_VERSION` | `1.0.0` | Returned in responses, stored on `TopicModelRun.workerVersion` |
| `DEFAULT_PARAMS` | RUN 012 values | Hyperparameter defaults |

## Dependencies

Core runtime dependencies (`pyproject.toml`):

| Package | Version | Purpose |
| --- | --- | --- |
| `runpod` | ≥ 1.7.0 | RunPod serverless handler framework |
| `pydantic` | ≥ 2.0 | Request/response validation |
| `sentence-transformers` | ≥ 3.0 | LaBSE model loading |
| `bertopic` | ≥ 0.16.0 | Topic modeling pipeline |
| `umap-learn` | ≥ 0.5.6 | Dimensionality reduction |
| `hdbscan` | ≥ 0.8.33 | Density-based clustering |
| `scikit-learn` | ≥ 1.4.0 | Silhouette score, CountVectorizer |
| `gensim` | ≥ 4.3.0 | NPMI coherence computation |
| `numpy` | ≥ 1.26.0 | Array operations |