Microservice for audio ingestion, metadata extraction, and AI enrichment. Receives audio assets, validates them, extracts technical metadata, and orchestrates AI capabilities like transcription. Exposes job status and results through a stable API.
- Upload by remote URL or presigned S3 flow
- Async job processing with BullMQ and Redis
- Metadata extraction separated from AI enrichment
- Partial success model for resilient processing
- Clean architecture layering across domain, application, infrastructure, and HTTP
- Test suite with integration and service-level coverage
```
Client → HTTP API (Express) → AudioService (fast path: validate + enqueue)
                                       ↓
                             BullMQ Queue (Redis)
                                       ↓
                 AudioJobWorker → S3 + metadata + transcription
                                       ↓
                         MongoDB (job state + results)
                                       ↓
Client ← HTTP API (polling) ← AudioService (read path)
```
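The fast path can be sketched as a pure application-layer function: validate, persist a pending job, enqueue, and return immediately so the worker does the heavy lifting. Names like `JobRepo` and `JobQueue` are illustrative ports, not the repository's actual interfaces.

```typescript
import { randomUUID } from "node:crypto";

type Capability = "transcription";

interface CreateJobRequest {
  sourceUrl: string;
  capabilities: Capability[];
}

interface AudioJob {
  _id: string;
  sourceUrl: string;
  capabilities: Capability[];
  status: "pending" | "processing" | "ready" | "failed";
}

// Ports the application layer depends on (implemented in infrastructure).
interface JobRepo {
  insert(job: AudioJob): void;
}
interface JobQueue {
  enqueue(jobId: string): void;
}

const KNOWN_CAPABILITIES: Capability[] = ["transcription"];

function validate(req: CreateJobRequest): void {
  new URL(req.sourceUrl); // throws on a malformed URL
  for (const c of req.capabilities) {
    if (!KNOWN_CAPABILITIES.includes(c)) {
      throw new Error(`unknown capability: ${c}`);
    }
  }
}

// Fast path: no download, no metadata work; just validate and enqueue.
function createJob(req: CreateJobRequest, repo: JobRepo, queue: JobQueue): AudioJob {
  validate(req);
  const job: AudioJob = {
    _id: randomUUID(),
    sourceUrl: req.sourceUrl,
    capabilities: req.capabilities,
    status: "pending",
  };
  repo.insert(job);
  queue.enqueue(job._id);
  return job; // the controller responds 202 with this body
}
```

Because the expensive steps happen in the worker, the HTTP handler stays fast and the client gets a job id to poll.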
Layers:
| Layer | Directory | Responsibility |
|---|---|---|
| Domain | `src/domain/` | Pure interfaces, types, port definitions. Zero external dependencies. |
| Application | `src/application/` | Use case orchestration. Depends only on domain ports. |
| Infrastructure | `src/infrastructure/` | MongoDB repos, S3 storage, BullMQ queue/worker, OpenAI adapter. |
| HTTP | `src/http/` | Express controllers, validators, routes. |
All audio endpoints live under /v1/audio.
```
POST /v1/audio/jobs
  { "sourceUrl": "https://...", "capabilities": ["transcription"] }
  → 202 { "data": { "_id": "...", "status": "pending", ... } }

POST /v1/audio/jobs/upload-url
  { "filename": "episode.mp3", "contentType": "audio/mpeg", "capabilities": ["transcription"] }
  → 201 { "data": { "uploadUrl": "...", "storageKey": "...", "jobId": "..." } }

POST /v1/audio/jobs/:id/confirm
  → 202 { "data": { "_id": "...", "status": "pending", ... } }

POST /v1/audio/jobs/batch
  { "items": [{ "sourceUrl": "...", "capabilities": ["transcription"] }, ...] }
  → 202 { "data": { "batchId": "...", "jobs": [...] } }

GET /v1/audio/jobs/:id            → 200 job object
GET /v1/audio/jobs/:id/result     → 200 metadata + enrichment status
GET /v1/audio/transcripts/:jobId  → 200 transcript

GET /health → 200 (liveness)
GET /ready  → 200/503 (readiness, checks MongoDB)
```
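A hypothetical client for the presigned-upload flow, written against the endpoints above. The `http` function is injected so the sketch stays self-contained; in a real client it would be `globalThis.fetch`. The uploaded file bytes step is elided.

```typescript
// Minimal shape of the HTTP function this sketch needs.
type Http = (
  url: string,
  init?: { method?: string; body?: string },
) => Promise<{ status: number; json: () => Promise<any> }>;

async function uploadAndWait(base: string, filename: string, http: Http) {
  // 1. Ask the service for a presigned S3 URL and a job id.
  const res = await http(`${base}/v1/audio/jobs/upload-url`, {
    method: "POST",
    body: JSON.stringify({
      filename,
      contentType: "audio/mpeg",
      capabilities: ["transcription"],
    }),
  });
  const { data } = await res.json();

  // 2. PUT the file bytes to data.uploadUrl (elided here).

  // 3. Confirm the upload so the job is enqueued.
  await http(`${base}/v1/audio/jobs/${data.jobId}/confirm`, { method: "POST" });

  // 4. Poll until the job leaves pending/processing.
  while (true) {
    const poll = await http(`${base}/v1/audio/jobs/${data.jobId}`);
    const job = (await poll.json()).data;
    if (job.status === "ready" || job.status === "failed") return job;
    await new Promise((r) => setTimeout(r, 1000));
  }
}
```

Polling interval and timeout policy are up to the caller; webhooks are not part of the current API.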
When API_KEY is configured, requests to /v1/* must include:

```
Authorization: Bearer <API_KEY>
```

Job status: pending → processing → ready | failed
Enrichment: pending → processing → completed | failed

- `ready` means metadata was extracted successfully; capabilities may still be pending or failed.
- `failed` happens only when the file can't be downloaded or metadata can't be extracted.
- Capability failures (e.g. transcription) do NOT fail the job (partial success model).
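The lifecycle above can be captured in a couple of types plus one predicate; this is a sketch mirroring the rules stated here, not the repository's actual types.

```typescript
type JobStatus = "pending" | "processing" | "ready" | "failed";
type EnrichmentStatus = "pending" | "processing" | "completed" | "failed";

interface JobView {
  status: JobStatus;
  // Per-capability enrichment state, e.g. { transcription: "failed" }.
  enrichments: Record<string, EnrichmentStatus>;
}

// Partial success in one place: a ready job with a failed capability is
// still ready. Only download/metadata failures fail the job itself.
function isPartialSuccess(job: JobView): boolean {
  return (
    job.status === "ready" &&
    Object.values(job.enrichments).some((s) => s === "failed")
  );
}
```

Clients should therefore check both the job status and each capability's enrichment status before deciding a job is fully done.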
- Node.js 20+
- MongoDB 7+
- Redis 7+
```
# Required
MONGO_URL=mongodb://127.0.0.1:27017/media
S3_REGION=us-east-1
S3_BUCKET=your-bucket
S3_ACCESS_KEY_ID=your-key
S3_SECRET_ACCESS_KEY=your-secret

# Optional
APP_PORT=3000
API_KEY=your-api-key
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
QUEUE_NAME=audio-processing
QUEUE_CONCURRENCY=3
QUEUE_MAX_RETRIES=3
QUEUE_RETRY_DELAY=5000
PRESIGNED_URL_EXPIRATION=3600
MAX_TRANSCRIPTION_FILE_SIZE=26214400
OPENAI_API_KEY=sk-...
OPENAI_WHISPER_MODEL=whisper-1
```

Example request:

```
curl -X POST http://localhost:3000/v1/audio/jobs \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer your-api-key' \
  -d '{
    "sourceUrl": "https://example.com/audio/episode.mp3",
    "capabilities": ["transcription"]
  }'
```

Development:

```
# Start MongoDB + Redis
docker compose -f docker-compose.dev.yaml up -d

# Install dependencies
npm install

# Run in dev mode
npm run dev
```

Production:

```
npm run build
npm start
```

Docker:

```
docker compose up
```

Tests:

```
npm test
```

Project structure:

```
src/
  domain/          # Core entities and ports
  application/     # Use-case orchestration
  infrastructure/  # MongoDB, S3, BullMQ, OpenAI adapters
  http/            # Express routes, controllers, validators
  providers/       # Capability registry
  shared/          # Middleware, errors, health checks
```
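The required/optional split above implies a config loader that fails fast on missing required variables and applies the listed defaults. A minimal sketch in plain TypeScript (the actual service validates env with Zod, and its config module may differ):

```typescript
// Fail fast on missing required vars; apply documented defaults otherwise.
function loadConfig(env: Record<string, string | undefined>) {
  const required = [
    "MONGO_URL",
    "S3_REGION",
    "S3_BUCKET",
    "S3_ACCESS_KEY_ID",
    "S3_SECRET_ACCESS_KEY",
  ];
  for (const key of required) {
    if (!env[key]) throw new Error(`missing required env var: ${key}`);
  }
  return {
    mongoUrl: env.MONGO_URL!,
    appPort: Number(env.APP_PORT ?? 3000),
    redisHost: env.REDIS_HOST ?? "127.0.0.1",
    redisPort: Number(env.REDIS_PORT ?? 6379),
    queueName: env.QUEUE_NAME ?? "audio-processing",
    queueConcurrency: Number(env.QUEUE_CONCURRENCY ?? 3),
    queueMaxRetries: Number(env.QUEUE_MAX_RETRIES ?? 3),
    queueRetryDelay: Number(env.QUEUE_RETRY_DELAY ?? 5000),
    presignedUrlExpiration: Number(env.PRESIGNED_URL_EXPIRATION ?? 3600),
    // 26214400 bytes = 25 MiB
    maxTranscriptionFileSize: Number(env.MAX_TRANSCRIPTION_FILE_SIZE ?? 26214400),
  };
}
```

Failing at startup on missing configuration is cheaper than failing mid-job inside the worker.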
AI providers are behind capability-oriented interfaces. The application layer depends on abstract ports, not vendor SDKs.
Currently implemented:
- Transcription: OpenAI Whisper API (`whisper-1`)
Planned:
- Summarization
- Topic/keyword extraction
- Language detection
- Embeddings
To add a new provider, implement the corresponding interface in src/infrastructure/ and register it in src/server.ts.
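The registration step can be sketched as a capability registry: providers implement a capability-oriented interface and are looked up by name. The names below (`CapabilityProvider`, `CapabilityRegistry`) are illustrative; the actual interfaces in `src/domain/` and the wiring in `src/server.ts` may differ.

```typescript
interface CapabilityProvider {
  readonly capability: string;
  run(storageKey: string): Promise<unknown>;
}

class CapabilityRegistry {
  private providers = new Map<string, CapabilityProvider>();

  register(p: CapabilityProvider): void {
    this.providers.set(p.capability, p);
  }

  get(capability: string): CapabilityProvider {
    const p = this.providers.get(capability);
    if (!p) throw new Error(`no provider for capability: ${capability}`);
    return p;
  }
}

// A planned capability could then be added without touching the
// application layer; this stub stands in for a real adapter.
const languageDetection: CapabilityProvider = {
  capability: "language-detection",
  run: async () => ({ language: "en" }),
};
```

New capabilities become a matter of writing one adapter and one `register` call.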
- Runtime: Node.js 20, TypeScript
- Framework: Express
- Database: MongoDB (Mongoose)
- Queue: BullMQ + Redis
- Storage: AWS S3 (presigned URLs)
- AI: OpenAI Whisper API
- Validation: Zod
- Tests: Vitest + Supertest