Configuration

Nick edited this page Nov 20, 2025 · 3 revisions

PATAS is configured through Pydantic Settings. Every setting can be supplied as an environment variable or in a .env file.
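To make the lookup order concrete, here is a minimal standard-library sketch of merging a .env file with the process environment. Pydantic Settings gives real environment variables priority over values read from a .env file, and the sketch mirrors that. `parse_env_text` and `resolve_settings` are hypothetical helper names, not part of PATAS, and the parser only handles plain KEY=value lines (no quoting, no inline comments).

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and full-line comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_settings(
    keys: Iterable[str],
    env_text: str,
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Resolve each key from the environment first, then from the .env text."""
    environ = os.environ if environ is None else environ
    file_values = parse_env_text(env_text)
    resolved: Dict[str, str] = {}
    for key in keys:
        if key in environ:
            # A real environment variable wins over the .env file.
            resolved[key] = environ[key]
        elif key in file_values:
            resolved[key] = file_values[key]
    return resolved
```

Keys that appear in neither source are simply left out, which is where a real settings class would fall back to its declared defaults.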


Core Settings

Database

DATABASE_URL=sqlite:///./data/patas.db
# Or for PostgreSQL (use instead of the SQLite URL above):
# DATABASE_URL=postgresql+asyncpg://user:pass@localhost/patas
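Both URLs appear to follow the dialect+driver://user:pass@host/database layout used by SQLAlchemy-style connection strings. As a hedged illustration, the sketch below splits such a URL with the standard library so a deployment script can check which backend is configured. `describe_database_url` is a hypothetical helper, not a PATAS function.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def describe_database_url(url: str) -> Dict[str, Optional[str]]:
    """Split a dialect+driver://... URL into its main parts."""
    parts = urlsplit(url)
    # "postgresql+asyncpg" -> dialect "postgresql", driver "asyncpg".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "host": parts.hostname,
        # Drop the leading slashes so SQLite keeps its relative file path.
        "database": parts.path.lstrip("/") or None,
    }
```

For the SQLite example the host is `None` and the database is the relative path `./data/patas.db`.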

Safety Profile

AGGRESSIVENESS_PROFILE=balanced  # conservative, balanced, aggressive

Profiles:

Profile       min_precision  max_coverage  min_sample_size  max_ham_hits
conservative  0.95           0.05          100              5
balanced      0.90           0.10          50               10
aggressive    0.85           0.20          30               20

Recommendations:

  • conservative: For production auto-actions (low false positive rate)
  • balanced: For signals and research (default)
  • aggressive: Signals only, not for hard bans
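The thresholds in the table can be read as a gate that a candidate rule must pass. The sketch below encodes the table and one plausible interpretation of it: precision and sample size are lower bounds, coverage and ham hits are upper bounds. That interpretation, along with the names `ProfileThresholds`, `PROFILES`, and `rule_passes`, is an assumption made for illustration and is not taken from the PATAS source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileThresholds:
    min_precision: float
    max_coverage: float
    min_sample_size: int
    max_ham_hits: int


# Values copied from the profile table above.
PROFILES = {
    "conservative": ProfileThresholds(0.95, 0.05, 100, 5),
    "balanced": ProfileThresholds(0.90, 0.10, 50, 10),
    "aggressive": ProfileThresholds(0.85, 0.20, 30, 20),
}


def rule_passes(profile: str, precision: float, coverage: float,
                sample_size: int, ham_hits: int) -> bool:
    """Return True if a rule's metrics satisfy every threshold of the profile.

    Assumption: "min_" columns are lower bounds and "max_" columns are upper bounds.
    """
    limits = PROFILES[profile]
    return (
        precision >= limits.min_precision
        and coverage <= limits.max_coverage
        and sample_size >= limits.min_sample_size
        and ham_hits <= limits.max_ham_hits
    )
```

Under this reading, a rule with precision 0.92, coverage 0.08, 60 samples, and 8 ham hits passes `balanced` but fails `conservative`.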

LLM Configuration (Optional)

Pattern Discovery

LLM_PROVIDER=openai  # openai, local, none, disabled
LLM_MODEL=gpt-4o-mini
# For local, set LLM_BASE_URL to e.g. http://localhost:8000/v1
LLM_BASE_URL=
LLM_API_KEY=your-key-here  # Required for OpenAI, optional for local
LLM_TIMEOUT_SECONDS=30.0

Notes:

  • The LLM is used only for offline pattern discovery
  • It is not used for real-time classification
  • It can be disabled (LLM_PROVIDER=none or LLM_PROVIDER=disabled)
  • For on-premise deployments, use LLM_PROVIDER=local with LLM_BASE_URL
  • See Local Model Integration for details
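The provider rules above can be checked before a discovery run starts. The following is a minimal sketch of such a check, not PATAS's own validation. It assumes a missing LLM_PROVIDER means "none", and, because this page shows both LLM_API_KEY and OPENAI_API_KEY, it accepts either for the OpenAI provider. `check_llm_config` is a hypothetical helper name.

```python
from typing import List, Mapping


def check_llm_config(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    # Assumption: an unset provider is treated as "none".
    provider = env.get("LLM_PROVIDER", "none").strip().lower()
    problems: List[str] = []
    if provider in ("none", "disabled"):
        return problems
    if provider == "openai":
        # Assumption: either key name is acceptable for OpenAI.
        if not (env.get("LLM_API_KEY") or env.get("OPENAI_API_KEY")):
            problems.append("an API key is required when LLM_PROVIDER=openai")
    elif provider == "local":
        if not env.get("LLM_BASE_URL"):
            problems.append("LLM_BASE_URL must point at the local endpoint")
    else:
        problems.append("unknown LLM_PROVIDER: %r" % provider)
    return problems
```

A local deployment would pass with `LLM_PROVIDER=local` and `LLM_BASE_URL=http://localhost:8000/v1`, with no API key set.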

Semantic Mining

ENABLE_SEMANTIC_MINING=true
EMBEDDING_PROVIDER=openai  # openai, local, none
EMBEDDING_MODEL=text-embedding-3-small
# For local, set EMBEDDING_BASE_URL to e.g. http://localhost:8080/v1
EMBEDDING_BASE_URL=
# Required for OpenAI, optional for local
EMBEDDING_API_KEY=
EMBEDDING_TIMEOUT_SECONDS=30.0
SEMANTIC_SIMILARITY_THRESHOLD=0.75
SEMANTIC_MIN_CLUSTER_SIZE=3

Notes:

  • Semantic mining is enabled by default
  • Uses the embedding engine to cluster similar messages
  • Requires OPENAI_API_KEY if EMBEDDING_PROVIDER=openai
  • For on-premise deployments, use EMBEDDING_PROVIDER=local with EMBEDDING_BASE_URL
  • See Local Model Integration for details
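To show how a similarity threshold and a minimum cluster size typically interact, here is a small greedy clustering sketch over embedding vectors. It is only an illustration of the two parameters: the page does not say which clustering algorithm PATAS uses, and `cosine_similarity` and `cluster_embeddings` are hypothetical names.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_embeddings(
    vectors: Sequence[Sequence[float]],
    threshold: float = 0.75,      # mirrors SEMANTIC_SIMILARITY_THRESHOLD
    min_cluster_size: int = 3,    # mirrors SEMANTIC_MIN_CLUSTER_SIZE
) -> List[List[int]]:
    """Greedily group vector indices, then drop clusters that are too small."""
    clusters: List[List[int]] = []
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            # Compare against the cluster's first member, used as its seed.
            if cosine_similarity(vectors[cluster[0]], vector) >= threshold:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return [c for c in clusters if len(c) >= min_cluster_size]
```

Raising the threshold produces tighter, smaller groups, and raising the minimum size discards more of them, so the two settings trade recall against noise.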

Pattern Mining

PATTERN_MINING_CHUNK_SIZE=10000

Sets how many messages are processed per chunk during pattern mining.
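Chunked processing keeps memory use bounded on large message sets. As a generic sketch (not the PATAS implementation), the hypothetical helper `iter_chunks` below yields lists of at most `chunk_size` items from any iterable.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(items: Iterable[T], chunk_size: int = 10000) -> Iterator[List[T]]:
    """Yield successive lists of up to chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

With the default of 10000, a stream of 25000 messages is processed as two full chunks followed by one chunk of 5000.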


Privacy Settings

PRIVACY_MODE=STANDARD  # STANDARD or STRICT

STANDARD Mode:

  • External LLM providers can be used (if configured)
  • Full logging available
  • Message texts included in reports

STRICT Mode:

  • External LLM providers disabled by default
  • Minimal logging (only IDs and counts)
  • Message texts not stored in logs

See Privacy and Data Protection for details.
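The logging difference between the two modes can be sketched as a function that decides which fields reach a log record. This is an illustration only: the field names and the helper `build_log_fields` are hypothetical, and the real behaviour is described in Privacy and Data Protection.

```python
from typing import Any, Dict


def build_log_fields(privacy_mode: str, message_id: str, text: str,
                     match_count: int) -> Dict[str, Any]:
    """Return the fields that may be logged under the given privacy mode."""
    mode = privacy_mode.strip().upper()
    if mode not in ("STANDARD", "STRICT"):
        raise ValueError("PRIVACY_MODE must be STANDARD or STRICT")
    # IDs and counts are logged in both modes.
    fields: Dict[str, Any] = {"message_id": message_id, "match_count": match_count}
    if mode == "STANDARD":
        # Message text is only logged outside STRICT mode.
        fields["text"] = text
    return fields
```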


API Settings

API_HOST=0.0.0.0
API_PORT=8000
API_RELOAD=false  # Auto-reload in development
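Environment values always arrive as strings, so the port and the reload flag need converting before use. The sketch below shows one way to do that with the values above as fallbacks. Treating those values as defaults is an assumption, and `parse_api_settings` is a hypothetical helper, not part of PATAS.

```python
from typing import Mapping, Tuple

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def parse_api_settings(env: Mapping[str, str]) -> Tuple[str, int, bool]:
    """Return (host, port, reload) parsed from string settings."""
    host = env.get("API_HOST", "0.0.0.0")
    port = int(env.get("API_PORT", "8000"))
    if not 1 <= port <= 65535:
        raise ValueError("API_PORT must be between 1 and 65535")
    flag = env.get("API_RELOAD", "false").strip().lower()
    if flag in _TRUE:
        reload_enabled = True
    elif flag in _FALSE:
        reload_enabled = False
    else:
        raise ValueError("API_RELOAD must be a boolean value")
    return host, port, reload_enabled
```

Note that 0.0.0.0 binds to all network interfaces, so a development machine may prefer 127.0.0.1.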

Logging

LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR
LOG_FILE=logs/app.log
LOG_RETENTION_DAYS=30
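One common way to honour a retention period is daily log rotation that keeps a fixed number of old files. The sketch below maps the three settings onto the standard library's `TimedRotatingFileHandler`. Whether PATAS rotates logs this way is an assumption, and `build_file_handler` is a hypothetical helper.

```python
import logging
import os
from logging.handlers import TimedRotatingFileHandler


def build_file_handler(log_file: str = "logs/app.log", level: str = "INFO",
                       retention_days: int = 30) -> TimedRotatingFileHandler:
    """Create a handler that rotates at midnight and keeps retention_days files."""
    directory = os.path.dirname(log_file)
    if directory:
        os.makedirs(directory, exist_ok=True)
    handler = TimedRotatingFileHandler(
        log_file,
        when="midnight",
        backupCount=retention_days,  # older rotated files are deleted
        delay=True,                  # open the file on first write only
    )
    handler.setLevel(level.upper())
    return handler
```

The handler can then be attached with `logging.getLogger().addHandler(build_file_handler())`.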

Complete Example

# Database
DATABASE_URL=sqlite:///./data/patas.db

# Safety Profile
AGGRESSIVENESS_PROFILE=conservative

# LLM (optional)
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
OPENAI_API_KEY=your-key-here

# Semantic Mining
ENABLE_SEMANTIC_MINING=true
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small

# Privacy
PRIVACY_MODE=STRICT

# API
API_HOST=0.0.0.0
API_PORT=8000

# Logging
LOG_LEVEL=INFO
LOG_RETENTION_DAYS=30
