

Nick edited this page Mar 10, 2026 · 1 revision

PATAS Integration Guide


Step-by-Step Integration

Step 1: Installation and Configuration

1.1. Install dependencies:

poetry install
# or
pip install -r requirements.txt

1.2. Configuration:

cp .env.example .env
# Edit .env with your settings

1.3. Start API server:

poetry run uvicorn app.api.main:app --host 0.0.0.0 --port 8000
# or
poetry run patas-api

1.4. Verification:

curl http://localhost:8000/api/v1/health

Step 2: Basic Integration

2.1. Ingest messages:

import requests

messages = [
    {
        "id": "msg_001",
        "text": "Your message text",
        "is_spam": True,
        "meta": {"sender": "user123", "source": "chat456"}
    }
]

response = requests.post(
    "http://localhost:8000/api/v1/messages/ingest",
    json={"messages": messages}
)
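
The ingest call above fails silently if a message is malformed, so it can help to validate client-side first. A minimal sketch — the required fields (`id`, `text`, `is_spam`) and their types are assumed from the example payload above:

```python
def validate_message(msg):
    """Return a list of problems; an empty list means the message looks ingestable.

    Required fields and types are assumed from the example payload:
    id (str), text (str), is_spam (bool).
    """
    problems = []
    for field, expected_type in (("id", str), ("text", str), ("is_spam", bool)):
        if field not in msg:
            problems.append(f"missing field: {field}")
        elif not isinstance(msg[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems

# Drop invalid messages before calling /messages/ingest
messages = [{"id": "msg_001", "text": "Your message text", "is_spam": True}]
valid = [m for m in messages if not validate_message(m)]
```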

2.2. Pattern mining:

response = requests.post(
    "http://localhost:8000/api/v1/patterns/mine",
    json={"days": 7, "use_llm": False, "min_spam_count": 10}
)

2.3. Get rules:

response = requests.get(
    "http://localhost:8000/api/v1/rules",
    params={"status": "active", "include_evaluation": True}
)
rules = response.json()

2.4. Export rules:

response = requests.get(
    "http://localhost:8000/api/v1/rules/export",
    params={"backend": "sql"}
)
sql_rules = response.text

Step 3: Applying rules in your system

3.1. SQL rules:

# Get SQL rules
response = requests.get(
    "http://localhost:8000/api/v1/rules/export",
    params={"backend": "sql"}
)
sql_rules = response.text

# Apply to your database
import sqlite3
conn = sqlite3.connect("your_database.db")
cursor = conn.cursor()

for rule_sql in sql_rules.split(";"):
    rule_sql = rule_sql.strip()
    if rule_sql:
        cursor.execute(rule_sql)
        results = cursor.fetchall()
        # Process results

3.2. ROL rules:

response = requests.get(
    "http://localhost:8000/api/v1/rules/export",
    params={"backend": "rol"}
)
rol_rules = response.json()

# Apply ROL rules in your system
for rule in rol_rules["rules"]:
    # Your application logic
    pass

Integration Patterns

Pattern 1: Batch Processing

For large datasets, use batch processing:

def process_large_dataset(messages_batches):
    # 1. Ingest in batches
    for batch in messages_batches:
        requests.post(
            "http://localhost:8000/api/v1/messages/ingest",
            json={"messages": batch}
        )
    
    # 2. Mine patterns
    requests.post(
        "http://localhost:8000/api/v1/patterns/mine",
        json={"days": 7}
    )
    
    # 3. Evaluate and promote
    requests.post("http://localhost:8000/api/v1/rules/eval-shadow")
    requests.post("http://localhost:8000/api/v1/rules/promote")
    
    # 4. Export
    response = requests.get(
        "http://localhost:8000/api/v1/rules/export",
        params={"backend": "sql"}
    )
    return response.text

Pattern 2: Continuous Pipeline

For continuous processing, run the pipeline on a schedule:

import schedule  # third-party: pip install schedule
import time
import requests

def daily_pattern_mining():
    # Ingest new messages
    requests.post("http://localhost:8000/api/v1/messages/ingest", json={...})
    
    # Mine patterns
    requests.post("http://localhost:8000/api/v1/patterns/mine", json={"days": 1})
    
    # Evaluate and promote
    requests.post("http://localhost:8000/api/v1/rules/eval-shadow")
    requests.post("http://localhost:8000/api/v1/rules/promote")
    
    # Export and apply rules in your system
    # (export_and_apply_rules is your own helper, not part of PATAS)
    export_and_apply_rules()

# Start daily
schedule.every().day.at("02:00").do(daily_pattern_mining)

while True:
    schedule.run_pending()
    time.sleep(60)

Pattern 3: Real-time Integration

For real-time integration, buffer incoming messages and ingest them in batches:

from queue import Queue
import threading
import time

import requests

message_queue = Queue()

def ingest_worker():
    while True:
        messages = []
        # Collect a batch from the queue (up to 1,000 messages)
        while len(messages) < 1000 and not message_queue.empty():
            messages.append(message_queue.get())

        if messages:
            requests.post(
                "http://localhost:8000/api/v1/messages/ingest",
                json={"messages": messages}
            )
        else:
            # Sleep briefly when the queue is empty to avoid a busy loop
            time.sleep(1)

# Start worker
threading.Thread(target=ingest_worker, daemon=True).start()

# Adding messages to queue
def on_new_message(message):
    message_queue.put(message)

Performance Optimization

1. Batch Sizes

Recommendations:

  • /api/v1/analyze: Up to 10,000 messages
  • /api/v1/messages/ingest: Up to 5,000 messages at a time
  • Use pagination for large datasets
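
These limits can be enforced with a small chunking helper before calling the batch endpoints. A minimal sketch; the 5,000 default matches the ingest recommendation above:

```python
def chunked(items, size=5000):
    """Yield successive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Usage: ingest a large dataset in recommended batch sizes
# for batch in chunked(all_messages):
#     requests.post(
#         "http://localhost:8000/api/v1/messages/ingest",
#         json={"messages": batch}
#     )
```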

2. Async Processing

import asyncio
import aiohttp

async def analyze_batch_async(session, messages):
    async with session.post(
        "http://localhost:8000/api/v1/analyze",
        json={"messages": messages, "run_mining": True}
    ) as response:
        return await response.json()

async def process_multiple_batches(messages_batches):
    async with aiohttp.ClientSession() as session:
        tasks = [
            analyze_batch_async(session, batch)
            for batch in messages_batches
        ]
        results = await asyncio.gather(*tasks)
        return results

3. Caching

from functools import lru_cache
import requests

@lru_cache(maxsize=100)
def get_active_rules():
    response = requests.get(
        "http://localhost:8000/api/v1/rules",
        params={"status": "active"}
    )
    return response.json()
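
Note that `lru_cache` on a zero-argument function caches the first result indefinitely, so rule updates on the server are never picked up. A time-based alternative (a sketch; the 60-second TTL is an arbitrary choice):

```python
import time

def ttl_cached(ttl_seconds=60):
    """Decorator: cache a zero-argument function's result for ttl_seconds."""
    def decorator(fn):
        state = {"value": None, "expires": 0.0}
        def wrapper():
            now = time.monotonic()
            if now >= state["expires"]:
                state["value"] = fn()
                state["expires"] = now + ttl_seconds
            return state["value"]
        return wrapper
    return decorator

# Usage (replaces the lru_cache version above):
# @ttl_cached(ttl_seconds=60)
# def get_active_rules():
#     response = requests.get(
#         "http://localhost:8000/api/v1/rules",
#         params={"status": "active"}
#     )
#     return response.json()
```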

Monitoring

1. Health checks

import requests
import time

def monitor_health():
    while True:
        try:
            response = requests.get(
                "http://localhost:8000/api/v1/health",
                timeout=5
            )
            if response.json()["status"] == "ok":
                print("PATAS is healthy")
            else:
                print("PATAS is unhealthy")
        except Exception as e:
            print(f"Health check failed: {e}")
        
        time.sleep(60)

# Start monitoring
monitor_health()

2. Rule Metrics

def monitor_rule_metrics():
    response = requests.get(
        "http://localhost:8000/api/v1/rules",
        params={"status": "active", "include_evaluation": True}
    )
    rules = response.json()
    
    # send_alert is your own notification helper (not provided by PATAS)
    for rule in rules:
        eval_data = rule.get("evaluation")
        if eval_data:
            precision = eval_data.get("precision", 0)
            ham_hits = eval_data.get("ham_hits", 0)
            
            # Alert if precision drops
            if precision < 0.90:
                send_alert(f"Rule {rule['id']} precision is {precision:.2%}")
            
            # Alert if ham hits increase
            if ham_hits > 10:
                send_alert(f"Rule {rule['id']} has {ham_hits} ham hits")

3. Prometheus Metrics

If prometheus_client is installed, PATAS exports metrics:

# Metrics available at /metrics endpoint (if configured)
# or via prometheus_client registry
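
The Prometheus text exposition format is line-oriented (`metric_name value`, with optional labels in braces), so scrape output can be sanity-checked with a small parser. A sketch that extracts unlabeled samples only; the `/metrics` path is taken from the comment above and depends on your configuration:

```python
def parse_metrics(text):
    """Parse simple Prometheus text-format lines into {name: float}.

    Skips comments (# HELP / # TYPE) and labeled samples for brevity.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            try:
                samples[parts[0]] = float(parts[1])
            except ValueError:
                continue
    return samples

# Usage against a running server:
# text = requests.get("http://localhost:8000/metrics").text
# print(parse_metrics(text))
```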

Best Practices

  1. Use batch endpoints for large datasets
  2. Add retry logic for production
  3. Monitor rule metrics regularly
  4. Use the conservative profile for production auto-actions
  5. Test rules in shadow mode before applying them
  6. Log all operations for debugging
  7. Use async for multiple concurrent requests
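
Practice 2 (retry logic) can be sketched as exponential backoff around any of the requests calls in this guide; the attempt count and delays below are illustrative:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(); on failure, retry with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Usage:
# response = with_retries(lambda: requests.post(
#     "http://localhost:8000/api/v1/messages/ingest",
#     json={"messages": batch},
#     timeout=10
# ))
```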

Troubleshooting

Problem: Rules not being promoted

Solution:

  1. Check metrics: GET /api/v1/rules?include_evaluation=true
  2. Check AGGRESSIVENESS_PROFILE (too strict = fewer promotions)
  3. Ensure rules were evaluated: POST /api/v1/rules/eval-shadow

Problem: Pattern mining is slow

Solution:

  1. Reduce PATTERN_MINING_CHUNK_SIZE
  2. Disable LLM if not needed: use_llm=false
  3. Use lower-level endpoints for large datasets

Problem: LLM not working

Solution:

  1. Check OPENAI_API_KEY
  2. Check LLM_PROVIDER (must be openai, not none)
  3. Check network access

Additional Resources
