
CONFIGURATION_GUIDE

Nick edited this page Mar 10, 2026 · 1 revision

PATAS Configuration Guide

A complete guide to configuring PATAS.


Environment Variables

PATAS uses Pydantic Settings for configuration. Every setting can be supplied through an environment variable or a .env file.
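To illustrate how environment variables map onto typed settings, here is a minimal standard-library sketch. It is not PATAS's actual settings class (which is built on Pydantic Settings); the `Settings` dataclass, the `load_settings` helper, and the chosen subset of fields are hypothetical, and the defaults for `privacy_mode` and `api_port` are assumptions taken from the example values in this guide.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical subset of PATAS settings, for illustration only."""
    aggressiveness_profile: str = "balanced"  # documented default
    enable_semantic_mining: bool = True       # documented default
    privacy_mode: str = "STANDARD"            # assumed default
    api_port: int = 8000                      # assumed default


def _to_bool(raw: str) -> bool:
    # Accept the common truthy spellings found in .env files.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    d = Settings()
    return Settings(
        aggressiveness_profile=env.get("AGGRESSIVENESS_PROFILE", d.aggressiveness_profile),
        enable_semantic_mining=_to_bool(
            env.get("ENABLE_SEMANTIC_MINING", str(d.enable_semantic_mining))
        ),
        privacy_mode=env.get("PRIVACY_MODE", d.privacy_mode),
        api_port=int(env.get("API_PORT", d.api_port)),
    )
```

Passing a plain dict instead of `os.environ` makes the mapping easy to try out, for example `load_settings({"API_PORT": "9000"})`.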

PATAS v2 Settings

Aggressiveness Profile

AGGRESSIVENESS_PROFILE=balanced  # conservative, balanced, aggressive

Profiles:

Profile        min_precision  max_coverage  min_sample_size  max_ham_hits
conservative   0.95           0.05          100              5
balanced       0.90           0.10          50               10
aggressive     0.85           0.20          30               20

Recommendations:

  • conservative: For production auto-actions (low false positive rate)
  • balanced: For signals and research (the default)
  • aggressive: Signals only, not for hard bans
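The profile thresholds can be read as a gate on rule metrics. The sketch below encodes the table values and checks a rule against them. The comparison semantics (precision and sample size as lower bounds, coverage and ham hits as upper bounds) are an assumption inferred from the `min_`/`max_` prefixes, and `Profile`, `PROFILES`, and `meets_profile` are hypothetical names, not PATAS identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    min_precision: float
    max_coverage: float
    min_sample_size: int
    max_ham_hits: int


# Threshold values copied from the profile table above.
PROFILES = {
    "conservative": Profile(0.95, 0.05, 100, 5),
    "balanced": Profile(0.90, 0.10, 50, 10),
    "aggressive": Profile(0.85, 0.20, 30, 20),
}


def meets_profile(profile_name, precision, coverage, sample_size, ham_hits):
    """Return True if the metrics satisfy every threshold (assumed semantics)."""
    p = PROFILES[profile_name]
    return (
        precision >= p.min_precision
        and coverage <= p.max_coverage
        and sample_size >= p.min_sample_size
        and ham_hits <= p.max_ham_hits
    )
```

Under these assumptions, a rule with precision 0.92 measured on 60 samples would pass `balanced` but fail `conservative`, which is the trade-off the recommendations describe.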

LLM Configuration

LLM_PROVIDER=openai  # openai, none, disabled
LLM_MODEL=gpt-4o-mini
PATAS_OPENAI_API_KEY=your-key-here  # or OPENAI_API_KEY

Notes:

  • LLM is used only for offline pattern discovery
  • Not used for real-time classification
  • Can be disabled (LLM_PROVIDER=none)

Semantic Mining

ENABLE_SEMANTIC_MINING=true
EMBEDDING_PROVIDER=openai  # openai, local, none
EMBEDDING_MODEL=text-embedding-3-small
SEMANTIC_SIMILARITY_THRESHOLD=0.75
SEMANTIC_MIN_CLUSTER_SIZE=3

Notes:

  • Semantic mining is enabled by default
  • Uses an embedding engine to cluster similar messages
  • Requires OPENAI_API_KEY if EMBEDDING_PROVIDER=openai
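To show what the similarity threshold and minimum cluster size control, here is a small standard-library sketch of threshold-based clustering over embedding vectors. It is a generic greedy scheme chosen for brevity, not PATAS's clustering algorithm; `cosine` and `cluster` are hypothetical helpers, and the defaults simply mirror the example values of SEMANTIC_SIMILARITY_THRESHOLD and SEMANTIC_MIN_CLUSTER_SIZE.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster(vectors, threshold=0.75, min_cluster_size=3):
    """Greedy clustering: each vector joins the first cluster whose seed
    (its first member) is at least `threshold` similar, else starts a new one."""
    clusters = []  # lists of indices into `vectors`; element 0 is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    # Discard clusters that are smaller than the minimum size.
    return [c for c in clusters if len(c) >= min_cluster_size]
```

In this sketch, raising the threshold splits messages into more, smaller groups, and groups below the minimum size are discarded, so a very high threshold can leave no clusters at all.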

Pattern Mining

PATTERN_MINING_CHUNK_SIZE=1000

Notes:

  • Sets how many messages are processed per chunk
  • Larger chunks mean fewer LLM calls but higher memory use
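The trade-off can be made concrete with a short sketch. It assumes one LLM call per chunk, which this guide implies but does not state; `iter_chunks` and `chunk_count` are hypothetical helpers.

```python
import math
from typing import Iterator, List, Sequence


def iter_chunks(messages: Sequence[str], chunk_size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `chunk_size` messages."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(messages), chunk_size):
        yield list(messages[start:start + chunk_size])


def chunk_count(total_messages: int, chunk_size: int) -> int:
    """Number of chunks (and, by assumption, LLM calls) for a dataset."""
    return math.ceil(total_messages / chunk_size)
```

For 2,500 messages, a chunk size of 1000 gives 3 chunks while a chunk size of 500 gives 5, at the cost of holding up to 1000 messages in memory at once instead of 500.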

Privacy Mode

PRIVACY_MODE=STANDARD  # STANDARD or STRICT

STRICT mode:

  • External LLM providers are disabled by default
  • Logs do not store full message texts
  • No telemetry or external calls

STANDARD mode:

  • Full functionality
  • LLM can be used (if configured)
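The difference between the two modes can be sketched as a pair of small functions. This is an illustration of the behaviour described above, not PATAS's implementation: `effective_llm_provider`, `loggable`, and the `EXTERNAL_PROVIDERS` set are hypothetical, and the sketch ignores any override that "disabled by default" may allow.

```python
EXTERNAL_PROVIDERS = {"openai"}  # assumption: providers that call out to a third party


def effective_llm_provider(privacy_mode: str, llm_provider: str) -> str:
    """Resolve which LLM provider would actually be used (assumed behaviour)."""
    mode = privacy_mode.strip().upper()
    if mode not in {"STANDARD", "STRICT"}:
        raise ValueError(f"unknown PRIVACY_MODE: {privacy_mode!r}")
    provider = llm_provider.strip().lower()
    if provider in {"none", "disabled"}:
        return "none"
    if mode == "STRICT" and provider in EXTERNAL_PROVIDERS:
        return "none"  # STRICT turns external providers off
    return provider


def loggable(text: str, privacy_mode: str) -> str:
    """In STRICT mode, log only the message length instead of its text."""
    if privacy_mode.strip().upper() == "STRICT":
        return f"<redacted: {len(text)} chars>"
    return text
```

With these helpers, `effective_llm_provider("STRICT", "openai")` resolves to `"none"`, while the same provider under STANDARD is left unchanged.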

Database

DATABASE_URL=sqlite+aiosqlite:///./data/spamapi.db

Supported Databases:

  • SQLite: sqlite+aiosqlite:///./data/spamapi.db
  • PostgreSQL: postgresql+asyncpg://user:pass@localhost/dbname
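Both URLs follow the `dialect+driver://...` shape. As a quick way to see which dialect and async driver a given DATABASE_URL selects, the following sketch splits the URL with the standard library; `describe_database_url` is a hypothetical helper, and credentials are deliberately left out of the result.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a 'dialect+driver://...' URL into its main components."""
    parts = urlsplit(url)
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "host": parts.hostname,            # None for file-based SQLite
        "database": parts.path[1:] or None,  # drop the single leading '/'
    }
```

For the SQLite example this reports the `aiosqlite` driver and the relative path `./data/spamapi.db`; for the PostgreSQL example it reports `asyncpg`, host `localhost`, and database `dbname`.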

API Settings

API_HOST=0.0.0.0
API_PORT=8000
API_RELOAD=false  # Auto-reload in development

Example .env File

# Aggressiveness Profile
AGGRESSIVENESS_PROFILE=balanced

# LLM (optional)
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
OPENAI_API_KEY=sk-...

# Semantic Mining
ENABLE_SEMANTIC_MINING=true
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
SEMANTIC_SIMILARITY_THRESHOLD=0.75

# Pattern Mining
PATTERN_MINING_CHUNK_SIZE=1000

# Privacy
PRIVACY_MODE=STANDARD

# Database
DATABASE_URL=sqlite+aiosqlite:///./data/spamapi.db

# API
API_HOST=0.0.0.0
API_PORT=8000
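For reference, the KEY=VALUE format used in this file can be read with a few lines of Python. The parser below is a deliberately naive sketch (no quoting or escape handling, and any `#` starts a comment); real loaders such as the one behind Pydantic Settings handle more cases. `parse_env` is a hypothetical helper.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; '#' starts a comment."""
    values = {}
    for line in text.splitlines():
        # Drop full-line and trailing comments (naive: ignores quoted '#').
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```

All values come back as strings, so numeric settings such as API_PORT still need converting before use.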

Production Recommendations

Conservative Profile for Production

AGGRESSIVENESS_PROFILE=conservative
PRIVACY_MODE=STRICT
LLM_PROVIDER=none  # or use on-prem LLM

With LLM (if needed)

AGGRESSIVENESS_PROFILE=balanced
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
OPENAI_API_KEY=sk-...
ENABLE_SEMANTIC_MINING=true

Without LLM (recommended for first deployment)

AGGRESSIVENESS_PROFILE=conservative
LLM_PROVIDER=none
ENABLE_SEMANTIC_MINING=false
PRIVACY_MODE=STRICT

Configuration via Code

from app.config import settings

# Access settings
print(settings.aggressiveness_profile)
print(settings.llm_provider)
print(settings.privacy_mode)

Check Configuration

# Via CLI
patas --help

# Via API
curl http://localhost:8000/api/v1/health

Troubleshooting

LLM Does Not Work

  1. Check OPENAI_API_KEY
  2. Check LLM_PROVIDER (must be openai, not none)
  3. Check network access to OpenAI API
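The first two checks can be automated with a small diagnostic. This sketch only inspects the variables named in this guide and does not test network access; `diagnose_llm` is a hypothetical helper, not part of PATAS.

```python
import os
from typing import List, Mapping


def diagnose_llm(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of likely LLM configuration problems (empty if none found)."""
    problems = []
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider != "openai":
        shown = provider if provider else "unset"
        problems.append(f"LLM_PROVIDER is {shown}; expected openai")
    # Either variable name is accepted, per the LLM Configuration section.
    if not (env.get("PATAS_OPENAI_API_KEY") or env.get("OPENAI_API_KEY")):
        problems.append("neither PATAS_OPENAI_API_KEY nor OPENAI_API_KEY is set")
    return problems
```

Running `print(diagnose_llm())` in the same environment as PATAS lists anything that looks misconfigured.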

Semantic Mining Does Not Work

  1. Check ENABLE_SEMANTIC_MINING=true
  2. Check EMBEDDING_PROVIDER and OPENAI_API_KEY
  3. Check SEMANTIC_SIMILARITY_THRESHOLD (too high = fewer clusters)

Rules Not Being Promoted

  1. Check AGGRESSIVENESS_PROFILE (too strict = fewer promotions)
  2. Check rule metrics via GET /api/v1/rules?include_evaluation=true
  3. Ensure rules were evaluated (POST /api/v1/rules/eval-shadow)
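Steps 2 and 3 can be scripted with the standard library. The sketch below only builds the two requests; the base URL is an assumption taken from the API_PORT example, and whether the eval-shadow endpoint expects a request body or authentication is not covered in this guide, so the empty body is a placeholder.

```python
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: default API_PORT on the local machine


def rules_with_evaluation_request(base_url: str = BASE_URL) -> Request:
    """GET request that lists rules together with their evaluation metrics."""
    return Request(f"{base_url}/api/v1/rules?include_evaluation=true", method="GET")


def eval_shadow_request(base_url: str = BASE_URL) -> Request:
    """POST request that triggers shadow evaluation (empty body is a placeholder)."""
    return Request(f"{base_url}/api/v1/rules/eval-shadow", data=b"", method="POST")
```

Either request can be sent with `urllib.request.urlopen(request)` once the API is running.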

