Embedding Worker — Faculytics

LaBSE embedding worker for the Faculytics analysis pipeline. It receives text over HTTP and returns 768-dimensional, L2-normalized embeddings, using sentence-transformers with the ONNX backend for CPU-optimized inference.
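Because the returned vectors are L2-normalized, the dot product of two embeddings equals their cosine similarity. The sketch below illustrates that property with toy 2-dimensional vectors and the standard library only; `l2_normalize` and `dot` are hypothetical helper names, not part of this worker.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Toy vectors standing in for 768-dimensional embeddings.
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

# For unit-length vectors the dot product is the cosine similarity.
similarity = dot(a, b)
```

A consumer of this API can therefore compare embeddings with a plain dot product, without dividing by the vector norms.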

API Contract

`POST /embeddings`

Request:

{
  "jobId": "uuid",
  "version": "1.0",
  "type": "embedding",
  "text": "The professor explains concepts clearly.",
  "metadata": {
    "submissionId": "uuid",
    "facultyId": "faculty-001",
    "versionId": "version-001"
  },
  "publishedAt": "2026-03-14T00:00:00.000Z"
}
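A request body with this shape could be assembled on the client side roughly as follows. This is a minimal sketch using only the standard library; `build_embedding_request` is a hypothetical helper, and generating `jobId` as a random UUID is an assumption based on the `"uuid"` placeholder above.

```python
import json
import uuid
from datetime import datetime, timezone


def build_embedding_request(text, submission_id, faculty_id, version_id):
    """Return a dict matching the request shape documented above."""
    # ISO 8601 UTC timestamp with millisecond precision and a trailing "Z".
    published_at = (
        datetime.now(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "jobId": str(uuid.uuid4()),  # assumption: any UUID string is accepted
        "version": "1.0",
        "type": "embedding",
        "text": text,
        "metadata": {
            "submissionId": submission_id,
            "facultyId": faculty_id,
            "versionId": version_id,
        },
        "publishedAt": published_at,
    }


payload = build_embedding_request(
    "The professor explains concepts clearly.",
    submission_id=str(uuid.uuid4()),
    faculty_id="faculty-001",
    version_id="version-001",
)
body = json.dumps(payload)  # JSON string to send as the POST body
```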

Success response (HTTP 200):

{
  "jobId": "uuid",
  "version": "1.0",
  "status": "completed",
  "result": {
    "embedding": [0.01, 0.02, "... (768 floats)"],
    "modelName": "LaBSE"
  },
  "completedAt": "2026-03-14T00:01:00.000Z"
}

Error response (HTTP 200; domain errors are reported in the body so that BullMQ does not retry them):

{
  "jobId": "uuid",
  "version": "1.0",
  "status": "failed",
  "error": "description",
  "completedAt": "2026-03-14T00:01:00.000Z"
}
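Since a failed job still arrives with HTTP 200, a caller has to branch on the `status` field rather than on the HTTP status code. The following sketch shows one way to do that; `extract_embedding` and `EmbeddingJobFailed` are hypothetical names, and the length check simply reuses the 768 dimensions stated in this README.

```python
EXPECTED_DIM = 768  # embedding size stated in this README


class EmbeddingJobFailed(Exception):
    """Raised when the worker reports status "failed" in a 200 response."""


def extract_embedding(response_body):
    """Return the embedding from a parsed response body, or raise."""
    status = response_body.get("status")
    if status == "failed":
        # Domain error: retrying with the same input is not expected to help.
        raise EmbeddingJobFailed(response_body.get("error", "unknown error"))
    if status != "completed":
        raise ValueError(f"unexpected status: {status!r}")
    embedding = response_body["result"]["embedding"]
    if len(embedding) != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} floats, got {len(embedding)}")
    return embedding
```

Keeping the failure as a distinct exception type lets the caller mark the job as failed without triggering a retry, which is the behavior the HTTP 200 convention is meant to support.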

`GET /health`

Returns `200` with `{"status": "ok", "model": "LaBSE"}` when ready, `503` otherwise.
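A readiness probe against this endpoint might look like the sketch below. It uses only the standard library; `is_ready` and `check_health` are hypothetical helpers, and the default base URL assumes the worker is running locally on the default port.

```python
import json
import urllib.request


def is_ready(status_code, body):
    """Interpret a /health response: ready only on 200 with status "ok"."""
    return status_code == 200 and body.get("status") == "ok"


def check_health(base_url="http://localhost:8000", timeout=2.0):
    """Return True if the worker reports ready, False on 503 or any I/O error."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return is_ready(resp.status, json.loads(resp.read()))
    except (OSError, ValueError):
        # urllib raises HTTPError (an OSError subclass) for 503 responses,
        # and URLError for connection failures; bad JSON raises ValueError.
        return False
```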

Quick Start

Local development

# Install dependencies
uv sync

# Run dev server
uv run uvicorn src.main:app --reload

# Run tests
uv run pytest

# Lint & format
uv run ruff check src/ tests/
uv run ruff format src/ tests/

Docker

docker build -t embedding-worker .
docker run -p 8000:8000 embedding-worker

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `HOST` | `0.0.0.0` | Server bind address |
| `PORT` | `8000` | Server port |
| `MODEL_NAME` | `sentence-transformers/LaBSE` | Hugging Face model ID |
| `MODEL_BACKEND` | `onnx` | Inference backend (`onnx` or `torch`) |
| `LOG_LEVEL` | `INFO` | Python log level |
| `OPENAPI_MODE` | `false` | Enable Swagger UI at `/docs` |

Copy `.env.sample` to `.env` to get started.
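To illustrate how these variables and defaults fit together, here is a standard-library stand-in for the settings object. The project itself uses pydantic-settings (see `src/config.py` under Architecture), so `WorkerSettings`, `load_settings`, and the boolean parsing rule below are assumptions made for this sketch rather than the actual implementation.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerSettings:
    host: str
    port: int
    model_name: str
    model_backend: str
    log_level: str
    openapi_mode: bool


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default) with README defaults."""
    env = os.environ if env is None else env
    backend = env.get("MODEL_BACKEND", "onnx")
    if backend not in ("onnx", "torch"):
        raise ValueError(f"MODEL_BACKEND must be 'onnx' or 'torch', got {backend!r}")
    # Assumption: these strings count as true; the real parser may differ.
    openapi = env.get("OPENAPI_MODE", "false").strip().lower() in ("1", "true", "yes")
    return WorkerSettings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        model_name=env.get("MODEL_NAME", "sentence-transformers/LaBSE"),
        model_backend=backend,
        log_level=env.get("LOG_LEVEL", "INFO"),
        openapi_mode=openapi,
    )
```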

Architecture

src/
├── config.py       # pydantic-settings configuration
├── models.py       # Pydantic request/response schemas (camelCase aliases)
├── embedding.py    # EmbeddingService: model loading and inference
└── main.py         # FastAPI app, lifespan, routes

- Model loading happens once at startup via FastAPI's lifespan context manager.
- The ONNX backend provides 2-4x faster CPU inference compared to PyTorch.
- Domain errors return HTTP 200 with `status: "failed"` to prevent BullMQ from retrying bad input; only unexpected server failures return 5xx.
- Contract compliance: Pydantic schemas use camelCase field aliases matching the Zod schemas in the NestJS API.
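The split between domain errors and unexpected failures can be sketched without any framework. In the example below, `handle_job`, `DomainError`, and the `embed` callable are hypothetical, and treating empty text as a domain error is an assumption, since this README does not list which inputs are rejected. Anything other than `DomainError` is deliberately left uncaught so that the web framework would turn it into a 5xx response.

```python
from datetime import datetime, timezone


class DomainError(Exception):
    """Bad input that a retry cannot fix."""


def _now_iso():
    """Current UTC time as ISO 8601 with milliseconds and a trailing "Z"."""
    now = datetime.now(timezone.utc)
    return now.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def handle_job(payload, embed):
    """Return (http_status, body); embed is a callable mapping text to a vector."""
    base = {"jobId": payload.get("jobId"), "version": payload.get("version", "1.0")}
    try:
        text = payload.get("text")
        if not isinstance(text, str) or not text.strip():
            # Assumption: empty or missing text is treated as a domain error.
            raise DomainError("text must be a non-empty string")
        vector = embed(text)
    except DomainError as exc:
        # Reported with HTTP 200 so the queue does not retry bad input.
        return 200, {**base, "status": "failed", "error": str(exc),
                     "completedAt": _now_iso()}
    return 200, {**base, "status": "completed",
                 "result": {"embedding": vector, "modelName": "LaBSE"},
                 "completedAt": _now_iso()}
```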
