Deployment Guide

This guide is for operators deploying and managing a Sentinel instance. It covers infrastructure, configuration, database setup, lexicon management, electoral phases, deployment stages, OAuth, and monitoring.

Architecture overview

Sentinel has three runtime components:

┌─────────────┐     ┌──────────────────┐     ┌─────────┐
│  Your App   │────>│  Sentinel API    │────>│ Postgres │
│  (client)   │<────│  (FastAPI)       │<────│ pgvector │
└─────────────┘     │                  │     └──────────┘
                    │                  │────>┌─────────┐
                    └──────────────────┘<────│  Redis   │
                                            └──────────┘

Sentinel API — FastAPI application serving moderation, admin, and monitoring endpoints
PostgreSQL with pgvector — Stores lexicon entries, releases, embeddings, appeals, transparency data, model artifacts. Required for production.
Redis — Distributed rate limiting, hot-trigger caching, and optional moderation result caching. Optional; Sentinel degrades gracefully if unavailable.

Default moderation routing is active for en, sw, and sh. Additional language-pack artifacts may exist in the repository for staged rollout but are not automatically active in hot-path enforcement.

Without Postgres, Sentinel runs in file-based fallback mode (lexicon loaded from data/lexicon_seed.json, appeals stored in-memory). This is suitable for development but not production.

Environment variables

This section focuses on API runtime and operator-facing variables. A few script-only actor variables are intentionally omitted.

Core

Variable	Required	Default	Description
`SENTINEL_API_KEY`	Yes	—	API key for authenticating `POST /v1/moderate` requests
`SENTINEL_DATABASE_URL`	No	—	Postgres connection string (e.g., `postgresql://user:pass@host:5432/sentinel`). Enables lexicon DB, vector search, appeals, transparency.
`SENTINEL_REDIS_URL`	No	—	Redis connection string. Enables distributed rate limiting, hot-trigger caching, and optional moderation result caching.
`SENTINEL_POLICY_CONFIG_PATH`	No	auto-detected	Path to policy configuration file (`config/policy/default.json` when present)

Electoral and deployment

Variable	Required	Default	Description
`SENTINEL_ELECTORAL_PHASE`	No	`null` (from config)	Override electoral phase: `pre_campaign`, `campaign`, `silence_period`, `voting_day`, `results_period`
`SENTINEL_DEPLOYMENT_STAGE`	No	`supervised`	Override deployment stage: `shadow`, `advisory`, `supervised`

Rate limiting

Variable	Required	Default	Description
`SENTINEL_RATE_LIMIT_PER_MINUTE`	No	`120`	Max requests per API key per minute
`SENTINEL_RATE_LIMIT_STORAGE_URI`	No	—	Rate limit storage URI (alternative to `SENTINEL_REDIS_URL`)

Redis hot triggers

Variable	Required	Default	Description
`SENTINEL_REDIS_HOT_TRIGGER_KEY_PREFIX`	No	`sentinel:hot-triggers`	Redis key prefix for cached hot-trigger terms
`SENTINEL_REDIS_HOT_TRIGGER_TTL_SECONDS`	No	—	Optional TTL for hot-trigger cache keys
`SENTINEL_REDIS_SOCKET_TIMEOUT_SECONDS`	No	`0.05`	Redis socket timeout in seconds

Moderation result cache

Variable	Required	Default	Description
`SENTINEL_RESULT_CACHE_ENABLED`	No	`false`	Enable Redis-backed caching of `POST /v1/moderate` responses (adds `X-Cache: HIT
`SENTINEL_RESULT_CACHE_TTL_SECONDS`	No	`60`	Cache TTL in seconds

OAuth (admin endpoints)

Variable	Required	Default	Description
`SENTINEL_OAUTH_TOKENS_JSON`	No	—	JSON object mapping static tokens to `{client_id, scopes}`
`SENTINEL_OAUTH_JWT_SECRET`	No	—	JWT signing secret. If set, enables JWT bearer auth instead of static tokens.
`SENTINEL_OAUTH_JWT_ALGORITHM`	No	`HS256`	JWT algorithm
`SENTINEL_OAUTH_JWT_AUDIENCE`	No	—	Expected JWT audience claim (optional verification)
`SENTINEL_OAUTH_JWT_ISSUER`	No	—	Expected JWT issuer claim (optional verification)

ML and vector matching

Variable	Required	Default	Description
`SENTINEL_VECTOR_MATCH_ENABLED`	No	`true`	Enable/disable vector similarity search
`SENTINEL_VECTOR_MATCH_THRESHOLD`	No	`0.82`	Minimum cosine similarity for vector matches
`SENTINEL_VECTOR_STATEMENT_TIMEOUT_MS`	No	`60`	Postgres statement timeout for vector queries
`SENTINEL_LID_MODEL_PATH`	No	—	Path to FastText language identification model
`SENTINEL_LID_CONFIDENCE_THRESHOLD`	No	`0.80`	Minimum confidence for language detection
`SENTINEL_EMBEDDING_PROVIDER`	No	`hash-bow-v1`	Embedding provider ID
`SENTINEL_CLASSIFIER_PROVIDER`	No	`none-v1`	Multi-label classifier provider ID
`SENTINEL_CLAIM_SCORER_PROVIDER`	No	`claim-heuristic-v1`	Claim scorer provider ID

Classifier shadow mode

Variable	Required	Default	Description
`SENTINEL_CLASSIFIER_SHADOW_ENABLED`	No	`0`	Enable shadow classifier predictions (only in SHADOW/ADVISORY stages)
`SENTINEL_SHADOW_PREDICTIONS_PATH`	No	—	File path for shadow prediction logs
`SENTINEL_CLASSIFIER_TIMEOUT_MS`	No	`40`	Classifier timeout in milliseconds
`SENTINEL_CLASSIFIER_MIN_SCORE`	No	`0.55`	Minimum score to emit a shadow prediction
`SENTINEL_CLASSIFIER_CIRCUIT_FAILURE_THRESHOLD`	No	`3`	Consecutive failures before circuit breaker opens
`SENTINEL_CLASSIFIER_CIRCUIT_RESET_SECONDS`	No	`120`	Seconds before circuit breaker resets

Docker Compose setup (development/staging)

# Start Postgres + Redis
docker compose up -d --build postgres redis

# Run migrations
make apply-migrations

# Load seed lexicon
make seed-lexicon

# Start API
export SENTINEL_API_KEY='your-key'
make run

The included docker-compose.yml defines API, PostgreSQL 16 (pgvector), and Redis services. The command above starts only Postgres and Redis so you can run make run locally with hot reload.

Production deployment guidance

For production:

Reverse proxy: Place Sentinel behind nginx, Caddy, or a cloud load balancer. Terminate TLS at the proxy.
Managed database: Use a managed PostgreSQL service with pgvector support. Ensure pg_trgm and vector extensions are available.
Replicas: Sentinel is stateless (all state lives in Postgres/Redis). Run multiple API replicas behind a load balancer.
TLS: All traffic between clients and the API should use HTTPS. Database connections should use SSL.
Admin endpoint isolation: Consider running admin endpoints on a separate internal network or port, not exposed to the public internet.
Secrets: Use a secrets manager for SENTINEL_API_KEY, SENTINEL_DATABASE_URL, and OAuth credentials. Do not pass secrets as command-line arguments.

Database migrations

Sentinel includes migration files in migrations/. Run them with:

make apply-migrations

Or directly:

python scripts/apply_migrations.py --database-url "$SENTINEL_DATABASE_URL"

Migration	Description
`0001_lexicon_entries.sql`	Core lexicon entries table
`0002_lexicon_releases.sql`	Lexicon release lifecycle (draft/active/deprecated)
`0003_lexicon_release_audit.sql`	Audit trail for release state transitions
`0004_async_monitoring_core.sql`	Async monitoring queue tables
`0005_lexicon_release_audit_proposal_promote.sql`	Proposal promotion audit support
`0006_retention_legal_hold_primitives.sql`	Data retention and legal hold primitives
`0007_lexicon_entry_embeddings.sql`	pgvector embeddings for lexicon entries
`0008_appeals_core.sql`	Appeals state machine tables
`0009_appeals_original_decision_id_backfill.sql`	Backfill original decision IDs on appeals
`0010_monitoring_queue_event_uniqueness.sql`	Queue event deduplication
`0011_lexicon_entry_metadata_hardening.sql`	Metadata validation constraints
`0012_model_artifact_lifecycle.sql`	Model artifact version tracking
`0013_multi_model_embeddings.sql`	Multi-model embedding storage and indexes (v2)

Migrations are ordered and tracked via Alembic revision history. Running make apply-migrations repeatedly is safe.

Lexicon lifecycle

The lexicon is Sentinel's primary enforcement mechanism. Terms are organized into versioned releases with a governed lifecycle.

Release states

Draft  ──>  Active  ──>  Deprecated

Only one release can be active at a time (enforced by a database unique index). The active release is what the moderation endpoint uses.

Seed lexicon

Sentinel ships with a 7-term demonstration seed (data/lexicon_seed.json):

Term	Action	Label	Language
kill	BLOCK	INCITEMENT_VIOLENCE	en
burn them	BLOCK	INCITEMENT_VIOLENCE	en
mchome	BLOCK	ETHNIC_CONTEMPT	sw
hunt you down	BLOCK	HARASSMENT_THREAT	en
deal with them	REVIEW	DOGWHISTLE_WATCH	en
wataona	REVIEW	DOGWHISTLE_WATCH	sw
rigged	REVIEW	DISINFO_RISK	en

This seed is a demonstration dataset only. Production deployment requires building a comprehensive lexicon with domain-expert annotation covering the specific hate speech, incitement, and disinformation patterns relevant to your context.

Lifecycle commands

# Load seed and activate (first-time setup)
make seed-lexicon

# Create a new release
make release-create VERSION=hatelex-v2.2

# Ingest terms into a draft release
make release-ingest VERSION=hatelex-v2.2 INPUT=data/lexicon_seed.json

# Validate a release against quality gates
make release-validate VERSION=hatelex-v2.2

# Activate a release (deactivates the previous active release)
make release-activate VERSION=hatelex-v2.2

# Deprecate an old release
make release-deprecate VERSION=hatelex-v2.1

Electoral phase configuration

Sentinel adjusts moderation sensitivity based on the electoral cycle. Five phases are supported:

Phase	Vector threshold	No-match action	Behavior
`pre_campaign`	0.82	ALLOW	Baseline sensitivity
`campaign`	0.85	ALLOW	Slightly tightened
`silence_period`	0.88	REVIEW	Unmatched content goes to review
`voting_day`	0.90	REVIEW	Maximum sensitivity
`results_period`	0.88	REVIEW	Maintained high sensitivity

During silence_period, voting_day, and results_period, the no_match_action changes to REVIEW, meaning content that doesn't match any lexicon entry still gets flagged for human review.

Setting the phase

Set the electoral phase via environment variable:

export SENTINEL_ELECTORAL_PHASE=campaign

Or in the policy config file (config/policy/default.json):

{
  "electoral_phase": "campaign"
}

The environment variable takes precedence over the config file. If neither is set, no phase-specific overrides are applied.

Updating the phase at runtime

For operator workflows that need a phase change without restarting the API, Sentinel also exposes POST /admin/policy/phase (OAuth scope admin:policy:write).

This override is in-process only. In multi-worker or multi-replica deployments, use environment/config management or a shared store rather than relying on per-process overrides.

Phase safety constraint

Phase overrides cannot lower the BLOCK toxicity threshold below the baseline value. This prevents accidental weakening of the most critical moderation threshold during heightened periods.

Deployment stages

Deployment stages control enforcement behavior. Use them to roll out Sentinel safely.

Stage	BLOCK behavior	REVIEW behavior	ALLOW behavior	Use case
`shadow`	Downgraded to ALLOW	Downgraded to ALLOW	No change	Observe decisions without enforcement; log-only mode
`advisory`	Downgraded to REVIEW	No change	No change	Enforcement active but no content is blocked; human review required for all blocks
`supervised`	Full enforcement	No change	No change	Production mode; all actions enforced as-is

Recommended rollout path

SHADOW — Deploy Sentinel and route traffic. Log all decisions but enforce nothing. Analyze decision quality.
ADVISORY — Enable enforcement but cap at REVIEW. Human moderators see what Sentinel would block. Build confidence.
SUPERVISED — Full enforcement. Sentinel blocks content autonomously based on lexicon matches.

Set the stage:

export SENTINEL_DEPLOYMENT_STAGE=advisory

Default is supervised. The policy version string encodes the active stage (e.g., policy-2026.11@campaign#advisory).

OAuth setup for admin endpoints

Admin endpoints (appeals, transparency, release proposals, internal monitoring) require OAuth bearer token authentication.

Option 1: Static token registry (simple)

Create a JSON file mapping tokens to client identities and scopes:

{
  "token-abc-123": {
    "client_id": "admin-dashboard",
    "scopes": ["admin:appeal:read", "admin:appeal:write", "admin:transparency:read"]
  },
  "token-def-456": {
    "client_id": "ci-pipeline",
    "scopes": ["admin:proposal:read", "admin:proposal:review"]
  }
}

Set the environment variable:

export SENTINEL_OAUTH_TOKENS_JSON='{"token-abc-123": {"client_id": "admin-dashboard", "scopes": ["admin:appeal:read", "admin:appeal:write"]}}'

Option 2: JWT bearer tokens

For production, configure JWT validation:

export SENTINEL_OAUTH_JWT_SECRET='your-jwt-signing-secret'
export SENTINEL_OAUTH_JWT_ALGORITHM='HS256'          # optional, default HS256
export SENTINEL_OAUTH_JWT_AUDIENCE='sentinel-admin'   # optional
export SENTINEL_OAUTH_JWT_ISSUER='your-auth-server'   # optional

JWTs must include client_id (or sub) and scopes (or scope as space-delimited string) claims.

OAuth scopes

Scope	Grants access to
`admin:appeal:read`	List appeals, reconstruct appeal audit trail
`admin:appeal:write`	Create appeals, transition appeal states
`admin:transparency:read`	Aggregate transparency reports and read the audit stream
`admin:transparency:export`	Raw appeals data export
`admin:transparency:identifiers`	Include identifier fields (`request_id`, `original_decision_id`) in exports
`admin:policy:write`	Update the effective electoral phase override
`admin:proposal:read`	View release proposal permissions
`admin:proposal:review`	Submit, approve, reject, promote release proposals
`internal:queue:read`	Internal monitoring queue metrics

Monitoring

Health check

curl http://localhost:8000/health
# {"status": "ok"}

curl http://localhost:8000/health/live
# {"status": "ok"}

curl http://localhost:8000/health/ready
# {"status": "ready", "checks": {"lexicon": "ok", "db": "ok", "redis": "ok"}}

Metrics

# JSON format
curl http://localhost:8000/metrics

# Prometheus text format
curl http://localhost:8000/metrics/prometheus

The metrics endpoint returns action counts, HTTP status counts, latency histogram buckets, and validation error counts.

For public deployments, do not expose these endpoints on the public internet. Use network controls or a reverse proxy to restrict /metrics* to internal monitoring systems. The repository includes docker-compose.hardened.yml, which routes only /health and /v1/moderate through a proxy and blocks operator surfaces by default.

Structured logging

Sentinel propagates X-Request-ID headers through all requests. If the client provides one, Sentinel uses it; otherwise one is generated. Use this ID to correlate logs across your infrastructure.

Internal monitoring

With the internal:queue:read OAuth scope:

curl -H "Authorization: Bearer $TOKEN" http://localhost:8000/internal/monitoring/queue/metrics

Returns queue depth and SLA breach snapshot data.

Go-live readiness gate

Before production rollout, run the readiness gate:

python scripts/check_go_live_readiness.py --bundle-dir releases/go-live/<release-id>

This validates that all required artifacts (lexicon release, policy config, migration state, launch profile) are present and consistent. See templates/go-live/ for the template bundle structure.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deployment Guide

Architecture overview

Environment variables

Core

Electoral and deployment

Rate limiting

Redis hot triggers

Moderation result cache

OAuth (admin endpoints)

ML and vector matching

Classifier shadow mode

Docker Compose setup (development/staging)

Production deployment guidance

Database migrations

Lexicon lifecycle

Release states

Seed lexicon

Lifecycle commands

Electoral phase configuration

Setting the phase

Updating the phase at runtime

Phase safety constraint

Deployment stages

Recommended rollout path

OAuth setup for admin endpoints

Option 1: Static token registry (simple)

Option 2: JWT bearer tokens

OAuth scopes

Monitoring

Health check

Metrics

Structured logging

Internal monitoring

Go-live readiness gate

FilesExpand file tree

deployment.md

Latest commit

History

deployment.md

File metadata and controls

Deployment Guide

Architecture overview

Environment variables

Core

Electoral and deployment

Rate limiting

Redis hot triggers

Moderation result cache

OAuth (admin endpoints)

ML and vector matching

Classifier shadow mode

Docker Compose setup (development/staging)

Production deployment guidance

Database migrations

Lexicon lifecycle

Release states

Seed lexicon

Lifecycle commands

Electoral phase configuration

Setting the phase

Updating the phase at runtime

Phase safety constraint

Deployment stages

Recommended rollout path

OAuth setup for admin endpoints

Option 1: Static token registry (simple)

Option 2: JWT bearer tokens

OAuth scopes

Monitoring

Health check

Metrics

Structured logging

Internal monitoring

Go-live readiness gate