facuzarate04/media-ms

Audio Intelligence Service

Microservice for audio ingestion, metadata extraction, and AI enrichment. Receives audio assets, validates them, extracts technical metadata, and orchestrates AI capabilities like transcription. Exposes job status and results through a stable API.

Highlights

  • Upload by remote URL or presigned S3 flow
  • Async job processing with BullMQ and Redis
  • Metadata extraction separated from AI enrichment
  • Partial success model for resilient processing
  • Clean architecture layering across domain, application, infrastructure, and HTTP
  • Test suite with integration and service-level coverage

Architecture

Client → HTTP API (Express) → AudioService (fast path: validate + enqueue)
                                         ↓
                                   BullMQ Queue (Redis)
                                         ↓
                                   AudioJobWorker → S3 + metadata + transcription
                                         ↓
                                   MongoDB (job state + results)
                                         ↓
Client ← HTTP API (polling) ← AudioService (read path)

Layers:

Layer            Directory             Responsibility
Domain           src/domain/           Pure interfaces, types, port definitions. Zero external dependencies.
Application      src/application/      Use-case orchestration. Depends only on domain ports.
Infrastructure   src/infrastructure/   MongoDB repos, S3 storage, BullMQ queue/worker, OpenAI adapter.
HTTP             src/http/             Express controllers, validators, routes.

API

All audio endpoints live under /v1/audio.

Job creation

POST /v1/audio/jobs
{ "sourceUrl": "https://...", "capabilities": ["transcription"] }
→ 202 { "data": { "_id": "...", "status": "pending", ... } }

POST /v1/audio/jobs/upload-url
{ "filename": "episode.mp3", "contentType": "audio/mpeg", "capabilities": ["transcription"] }
→ 201 { "data": { "uploadUrl": "...", "storageKey": "...", "jobId": "..." } }

POST /v1/audio/jobs/:id/confirm
→ 202 { "data": { "_id": "...", "status": "pending", ... } }
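The presigned flow above is three round trips: request an upload URL, PUT the bytes to S3, then confirm. A hedged client-side sketch (the fetch function is injected so the flow can be exercised without a live service; baseUrl, apiKey, and the helper name are placeholders):

```typescript
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: unknown },
) => Promise<{ ok: boolean; status: number; json(): Promise<any> }>;

async function uploadViaPresignedUrl(
  baseUrl: string,
  apiKey: string,
  file: { name: string; contentType: string; bytes: Uint8Array },
  fetchFn: FetchLike = fetch as unknown as FetchLike,
): Promise<string> {
  const headers = {
    "content-type": "application/json",
    authorization: `Bearer ${apiKey}`,
  };

  // 1. Ask the service for a presigned S3 URL plus a job id.
  const res = await fetchFn(`${baseUrl}/v1/audio/jobs/upload-url`, {
    method: "POST",
    headers,
    body: JSON.stringify({
      filename: file.name,
      contentType: file.contentType,
      capabilities: ["transcription"],
    }),
  });
  const { data } = await res.json();

  // 2. PUT the bytes straight to S3 (no auth header; the URL is presigned).
  await fetchFn(data.uploadUrl, {
    method: "PUT",
    headers: { "content-type": file.contentType },
    body: file.bytes,
  });

  // 3. Confirm so the service enqueues processing.
  await fetchFn(`${baseUrl}/v1/audio/jobs/${data.jobId}/confirm`, {
    method: "POST",
    headers,
  });
  return data.jobId;
}
```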

POST /v1/audio/jobs/batch
{ "items": [{ "sourceUrl": "...", "capabilities": ["transcription"] }, ...] }
→ 202 { "data": { "batchId": "...", "jobs": [...] } }

Query

GET /v1/audio/jobs/:id           → 200 job object
GET /v1/audio/jobs/:id/result    → 200 metadata + enrichment status
GET /v1/audio/transcripts/:jobId → 200 transcript
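Since results are read by polling, a small helper loop is the typical client pattern. A sketch with the job lookup injected (in a real client it would wrap GET /v1/audio/jobs/:id; interval and attempt counts are illustrative defaults):

```typescript
type JobStatus = "pending" | "processing" | "ready" | "failed";

async function pollUntilDone(
  getJob: (id: string) => Promise<{ status: JobStatus }>,
  id: string,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { status } = await getJob(id);
    // "ready" and "failed" are the terminal job states.
    if (status === "ready" || status === "failed") return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job ${id} not terminal after ${maxAttempts} attempts`);
}
```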

System

GET /health → 200 (liveness)
GET /ready  → 200/503 (readiness, checks MongoDB)
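The readiness contract above reduces to: 200 if MongoDB answers, 503 if not. A sketch with the database ping injected (the real route would wire in a Mongoose ping; the function name is illustrative):

```typescript
// Returns the HTTP status code the /ready endpoint should respond with.
async function readiness(pingDb: () => Promise<void>): Promise<number> {
  try {
    await pingDb();
    return 200; // MongoDB reachable: ready for traffic
  } catch {
    return 503; // dependency down: signal not-ready to the orchestrator
  }
}
```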

Auth

When API_KEY is configured, requests to /v1/* must include:

Authorization: Bearer <API_KEY>

Job lifecycle

Job status:    pending → processing → ready | failed
Enrichment:    pending → processing → completed | failed
  • ready means metadata was extracted successfully; capabilities may still be pending or failed.
  • failed occurs only when the file cannot be downloaded or metadata extraction fails.
  • Capability failures (e.g. transcription) do NOT fail the job — partial success model.
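The partial-success rule above can be expressed as a small pure function over the two status sets (types mirror the statuses listed; the helper name is illustrative, not part of the service's API):

```typescript
type JobStatus = "pending" | "processing" | "ready" | "failed";
type EnrichmentStatus = "pending" | "processing" | "completed" | "failed";

function summarizeResult(
  job: JobStatus,
  enrichments: Record<string, EnrichmentStatus>,
) {
  const withStatus = (s: EnrichmentStatus) =>
    Object.keys(enrichments).filter((k) => enrichments[k] === s);
  return {
    job,
    // A "ready" job with a failed capability is the partial-success case:
    // metadata succeeded, so the job itself is never failed by enrichment.
    partialSuccess: job === "ready" && withStatus("failed").length > 0,
    completed: withStatus("completed"),
    failed: withStatus("failed"),
  };
}
```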

Setup

Prerequisites

  • Node.js 20+
  • MongoDB 7+
  • Redis 7+

Environment variables

# Required
MONGO_URL=mongodb://127.0.0.1:27017/media
S3_REGION=us-east-1
S3_BUCKET=your-bucket
S3_ACCESS_KEY_ID=your-key
S3_SECRET_ACCESS_KEY=your-secret

# Optional
APP_PORT=3000
API_KEY=your-api-key
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
QUEUE_NAME=audio-processing
QUEUE_CONCURRENCY=3
QUEUE_MAX_RETRIES=3
QUEUE_RETRY_DELAY=5000                # milliseconds
PRESIGNED_URL_EXPIRATION=3600         # seconds
MAX_TRANSCRIPTION_FILE_SIZE=26214400  # bytes (25 MB)
OPENAI_API_KEY=sk-...
OPENAI_WHISPER_MODEL=whisper-1
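A hand-rolled sketch of how these variables might be loaded with defaults applied (the actual service validates env with Zod; the Config shape and loader name here are illustrative, covering only a few of the variables):

```typescript
interface Config {
  mongoUrl: string;
  appPort: number;
  redisHost: string;
  redisPort: number;
  queueConcurrency: number;
}

function loadConfig(env: Record<string, string | undefined>): Config {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`missing required env var ${name}`);
    return value;
  };
  return {
    mongoUrl: required("MONGO_URL"), // required: fail fast at startup
    appPort: Number(env.APP_PORT ?? 3000),
    redisHost: env.REDIS_HOST ?? "127.0.0.1",
    redisPort: Number(env.REDIS_PORT ?? 6379),
    queueConcurrency: Number(env.QUEUE_CONCURRENCY ?? 3),
  };
}
```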

Example request

curl -X POST http://localhost:3000/v1/audio/jobs \
  -H 'content-type: application/json' \
  -H 'authorization: Bearer your-api-key' \
  -d '{
    "sourceUrl": "https://example.com/audio/episode.mp3",
    "capabilities": ["transcription"]
  }'

Development

# Start MongoDB + Redis
docker compose -f docker-compose.dev.yaml up -d

# Install dependencies
npm install

# Run in dev mode
npm run dev

Production

npm run build
npm start

Docker

docker compose up

Tests

npm test

Project structure

src/
  domain/           # Core entities and ports
  application/      # Use-case orchestration
  infrastructure/   # MongoDB, S3, BullMQ, OpenAI adapters
  http/             # Express routes, controllers, validators
  providers/        # Capability registry
  shared/           # Middleware, errors, health checks

Provider strategy

AI providers are behind capability-oriented interfaces. The application layer depends on abstract ports, not vendor SDKs.

Currently implemented:

  • Transcription: OpenAI Whisper API (whisper-1)

Planned:

  • Summarization
  • Topic/keyword extraction
  • Language detection
  • Embeddings

To add a new provider, implement the corresponding interface in src/infrastructure/ and register it in src/server.ts.
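An illustrative sketch of such a capability registry, assuming a port like TranscriptionProvider; actual names in src/providers/ and src/server.ts may differ:

```typescript
interface TranscriptionProvider {
  transcribe(audioKey: string): Promise<{ text: string }>;
}

class CapabilityRegistry {
  private providers = new Map<string, TranscriptionProvider>();

  register(capability: string, provider: TranscriptionProvider): void {
    this.providers.set(capability, provider);
  }

  resolve(capability: string): TranscriptionProvider {
    const provider = this.providers.get(capability);
    // Unknown capabilities fail loudly instead of silently skipping.
    if (!provider) throw new Error(`no provider registered for "${capability}"`);
    return provider;
  }
}

// In server wiring one would register the concrete adapter, e.g.:
//   registry.register("transcription", new OpenAIWhisperProvider(...));
```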

Tech stack

  • Runtime: Node.js 20, TypeScript
  • Framework: Express
  • Database: MongoDB (Mongoose)
  • Queue: BullMQ + Redis
  • Storage: AWS S3 (presigned URLs)
  • AI: OpenAI Whisper API
  • Validation: Zod
  • Tests: Vitest + Supertest
