
API Quickstart

Nick edited this page Nov 18, 2025 · 1 revision

PATAS API Quickstart

Quick guide to integrating PATAS into your system via the REST API.


Base URL

http://localhost:8000/api/v1

For on-premise deployments, use your internal PATAS server URL.


Authentication

Currently, the PATAS API does not require authentication for local development. For production deployments, implement API key authentication.


Main Endpoint: /api/v1/analyze

The /api/v1/analyze endpoint bundles the complete workflow:

  1. Ingest messages
  2. Run pattern mining (optional)
  3. Evaluate rules (optional)
  4. Export rules (optional)

Perfect for: small-to-medium batch analysis and quick prototyping.

Request

POST /api/v1/analyze
Content-Type: application/json

{
  "messages": [
    {
      "id": "msg_001",
      "text": "Your message text here",
      "is_spam": true,
      "meta": {"lang": "en"}
    }
  ],
  "run_mining": true,
  "run_evaluation": true,
  "export_backend": "sql"
}

Response

{
  "patterns": [
    {
      "id": 1,
      "type": "URL",
      "description": "URL pattern: http://spam.com",
      "group_size": 45,
      "sources_count": 3,
      "senders_count": 2,
      "similarity_reason": "Messages contain the same suspicious URL: http://spam.com",
      "example_report_ids": ["msg_001", "msg_002", "msg_003"],
      "bot_likelihood": 0.85,
      "sql_query": "SELECT id, message_content, sender, source FROM reports WHERE LOWER(message_content) LIKE '%http://spam.com%'"
    }
  ],
  "rules": [
    {
      "id": 1,
      "pattern_id": 1,
      "status": "candidate",
      "sql_expression": "SELECT id FROM messages WHERE ...",
      "precision": 0.95,
      "coverage": 0.15,
      "hits_total": 100
    }
  ],
  "export": "SELECT id FROM messages WHERE ...",
  "meta": {
    "ingested_count": 10,
    "patterns_created": 3,
    "rules_created": 3,
    "evaluation_count": 3,
    "timings": {
      "ingest_seconds": 0.123,
      "mining_seconds": 2.456,
      "evaluation_seconds": 0.789,
      "total_seconds": 3.368
    }
  }
}

Pattern Groups and SQL Output

The /api/v1/analyze endpoint returns pattern groups: clusters of similar spam messages, each with detailed statistics and a ready-to-run SQL query.

Pattern Response Fields

Each pattern in the response includes:

  • group_size: Number of messages in this pattern group
  • sources_count: Number of unique sources/chats where these messages appeared
  • senders_count: Number of unique senders who sent these messages
  • similarity_reason: Human-readable explanation of why messages are similar (e.g., "Job offers with identical structure and varied salary amounts")
  • example_report_ids: Sample message IDs from the group (up to 10)
  • bot_likelihood: Bot probability score (0.0-1.0), where higher values indicate automated/spam behavior
  • sql_query: A SQL SELECT query against the generic reports table that matches this pattern

SQL Query Format

The sql_query field contains a SQL SELECT query that can be executed against a generic reports table with the following schema:

CREATE TABLE reports (
    id VARCHAR PRIMARY KEY,
    message_content TEXT NOT NULL,
    sender VARCHAR,
    source VARCHAR,
    country VARCHAR,
    has_media BOOLEAN,
    label_spam BOOLEAN,
    created_at TIMESTAMP
);

Field Mapping:

  • PATAS messages.text → reports.message_content
  • PATAS messages.meta.sender → reports.sender
  • PATAS messages.meta.source → reports.source
  • PATAS messages.meta.country → reports.country
  • PATAS messages.meta.has_media → reports.has_media
  • PATAS messages.is_spam → reports.label_spam
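As one illustrative way to apply the mapping above, here is a short Python sketch that converts a PATAS message into a reports row. The sender, source, country, and has_media keys inside meta are assumptions inferred from the mapping, not a documented message schema:

```python
# Sketch: convert a PATAS message dict into a row for the generic
# `reports` table, following the field mapping above.
def message_to_report(msg):
    meta = msg.get("meta", {})
    return {
        "id": msg["id"],
        "message_content": msg["text"],
        "sender": meta.get("sender"),          # assumed meta key
        "source": meta.get("source"),          # assumed meta key
        "country": meta.get("country"),        # assumed meta key
        "has_media": meta.get("has_media", False),
        "label_spam": msg.get("is_spam", False),
    }

row = message_to_report({
    "id": "msg_001",
    "text": "Buy now! Click here: http://spam.com",
    "is_spam": True,
    "meta": {"lang": "en", "sender": "user_42"},
})
print(row["message_content"])
```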

Example SQL Query:

SELECT id, message_content, sender, source 
FROM reports 
WHERE LOWER(message_content) LIKE '%http://spam.com%'

This query can be used to:

  • Identify all reports matching the pattern
  • Apply blocking rules in your system
  • Generate reports and analytics
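A minimal sketch of running a returned sql_query locally, using SQLite as the stand-in store. This assumes the query sticks to portable constructs like LOWER and LIKE, as the example above does; queries using other dialect features may need adaptation:

```python
import sqlite3

# Build a local copy of the generic `reports` table and run the
# pattern's sql_query against it.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE reports (
        id TEXT PRIMARY KEY,
        message_content TEXT NOT NULL,
        sender TEXT,
        source TEXT,
        country TEXT,
        has_media INTEGER,
        label_spam INTEGER,
        created_at TEXT
    )
""")
conn.executemany(
    "INSERT INTO reports (id, message_content, sender, source) VALUES (?, ?, ?, ?)",
    [
        ("msg_001", "Buy now! Click here: http://spam.com", "u1", "chat_a"),
        ("msg_002", "Visit HTTP://SPAM.COM today", "u2", "chat_b"),
        ("msg_003", "Hello, how are you?", "u3", "chat_a"),
    ],
)

sql_query = (
    "SELECT id, message_content, sender, source FROM reports "
    "WHERE LOWER(message_content) LIKE '%http://spam.com%'"
)
matches = conn.execute(sql_query).fetchall()
print(len(matches))  # 2 — the LOWER() makes the match case-insensitive
```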

Use Cases

  1. Batch Analysis: Send a batch of labeled reports, get pattern groups with SQL queries
  2. Rule Generation: Use sql_query to create blocking rules in your database
  3. Pattern Discovery: Understand why messages are similar via similarity_reason
  4. Bot Detection: Use bot_likelihood to prioritize automated spam patterns
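As a small illustration of the bot-detection use case, this sketch filters and sorts pattern groups by bot_likelihood. The 0.8 threshold is an arbitrary example for review triage, not a PATAS default:

```python
# Sketch: prioritize pattern groups for review by bot_likelihood.
patterns = [
    {"id": 1, "bot_likelihood": 0.85, "group_size": 45},
    {"id": 2, "bot_likelihood": 0.30, "group_size": 12},
    {"id": 3, "bot_likelihood": 0.95, "group_size": 8},
]

THRESHOLD = 0.8  # assumption: tune to your review capacity
priority = sorted(
    (p for p in patterns if p["bot_likelihood"] >= THRESHOLD),
    key=lambda p: p["bot_likelihood"],
    reverse=True,
)
print([p["id"] for p in priority])  # [3, 1]
```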

Example: cURL

curl -X POST http://localhost:8000/api/v1/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "id": "test_001",
        "text": "Buy now! Click here: http://spam.com",
        "is_spam": true,
        "meta": {"lang": "en"}
      }
    ],
    "run_mining": true,
    "run_evaluation": true,
    "export_backend": "sql"
  }'

Example: Python

import requests

# Prepare messages
messages = [
    {
        "id": "msg_001",
        "text": "Buy now! Click here: http://spam.com",
        "is_spam": True,
        "meta": {"lang": "en"}
    }
]

# Send request
response = requests.post(
    'http://localhost:8000/api/v1/analyze',
    json={
        'messages': messages,
        'run_mining': True,
        'run_evaluation': True,
        'export_backend': 'sql'
    }
)

# Process results
result = response.json()

print(f"Discovered {len(result['patterns'])} patterns")
print(f"Generated {len(result['rules'])} rules")

# Export SQL rules
if result.get('export'):
    print("\nSQL Rules:")
    print(result['export'])

Example: JavaScript/Node.js

const fetch = require('node-fetch');

// Prepare messages
const messages = [
  {
    id: 'msg_001',
    text: 'Buy now! Click here: http://spam.com',
    is_spam: true,
    meta: { lang: 'en' }
  }
];

// Send request
fetch('http://localhost:8000/api/v1/analyze', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages,
    run_mining: true,
    run_evaluation: true,
    export_backend: 'sql'
  })
})
  .then(res => res.json())
  .then(result => {
    console.log(`Discovered ${result.patterns.length} patterns`);
    console.log(`Generated ${result.rules.length} rules`);
    
    if (result.export) {
      console.log('\nSQL Rules:');
      console.log(result.export);
    }
  });

Lower-Level Endpoints (For Large-Scale Pipelines)

For large-scale or continuous pipelines, use the lower-level endpoints:

1. Ingest Messages

POST /api/v1/messages/ingest
Content-Type: application/json

{
  "messages": [...]
}

2. Run Pattern Mining

POST /api/v1/patterns/mine
Content-Type: application/json

{
  "from_timestamp": "2025-01-01T00:00:00Z",
  "to_timestamp": "2025-01-07T00:00:00Z"
}

3. Evaluate Rules

POST /api/v1/rules/eval-shadow
Content-Type: application/json

{
  "rule_ids": [1, 2, 3],
  "days": 7
}

rule_ids is optional: list specific rule IDs, or pass an empty list to evaluate all shadow rules.

4. Promote Rules

POST /api/v1/rules/promote

5. Export Rules

GET /api/v1/rules/export?backend=sql
GET /api/v1/rules/export?backend=rol

6. List Patterns

GET /api/v1/patterns?limit=100&offset=0

7. List Rules

GET /api/v1/rules?status=active&include_evaluation=true
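A minimal sketch of a thin client wrapping these endpoints. Only the paths and payload shapes are taken from this page; the class name and helper methods are illustrative, and the sketch builds requests without sending them:

```python
from urllib.parse import urlencode

class PatasClient:
    """Illustrative request builder for the lower-level endpoints."""

    def __init__(self, base_url="http://localhost:8000/api/v1"):
        self.base_url = base_url

    def ingest_request(self, messages):
        return ("POST", f"{self.base_url}/messages/ingest",
                {"messages": messages})

    def mine_request(self, from_ts, to_ts):
        return ("POST", f"{self.base_url}/patterns/mine",
                {"from_timestamp": from_ts, "to_timestamp": to_ts})

    def eval_shadow_request(self, rule_ids=None, days=7):
        # Empty rule_ids evaluates all shadow rules.
        return ("POST", f"{self.base_url}/rules/eval-shadow",
                {"rule_ids": rule_ids or [], "days": days})

    def export_url(self, backend="sql"):
        return f"{self.base_url}/rules/export?{urlencode({'backend': backend})}"

client = PatasClient()
method, url, payload = client.mine_request(
    "2025-01-01T00:00:00Z", "2025-01-07T00:00:00Z")
print(method, url)
```

Pair these builders with any HTTP library (requests, httpx, aiohttp) depending on whether your pipeline is synchronous or async.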

Request Limits

Recommended limits:

  • Batch size: Up to 10,000 messages per /api/v1/analyze request
  • Message length: Up to 8,192 characters per message
  • Rate: Adjust based on your server capacity

For larger datasets:

  • Use lower-level endpoints with pagination
  • Process in batches
  • Use async processing if available
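The batching advice above can be sketched as follows. The actual POST is left as a comment so the chunking logic stands alone; BASE_URL is a placeholder for your server:

```python
# Sketch: split a large message list into batches under the
# recommended 10,000-message limit before ingesting.
def batches(items, size=10_000):
    for start in range(0, len(items), size):
        yield items[start:start + size]

messages = [{"id": f"msg_{i:06d}", "text": "example", "is_spam": True}
            for i in range(25_000)]

sizes = []
for batch in batches(messages):
    sizes.append(len(batch))
    # requests.post(f"{BASE_URL}/messages/ingest", json={"messages": batch})

print(sizes)  # [10000, 10000, 5000]
```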

Response Formats

Export Backends

SQL (export_backend: "sql"):

SELECT id FROM messages WHERE text LIKE '%spam-pattern%';
SELECT id FROM messages WHERE text REGEXP 'spam-regex';

ROL (export_backend: "rol"):

{
  "rules": [
    {
      "id": 1,
      "pattern": "...",
      "action": "block"
    }
  ]
}

Error Handling

400 Bad Request

{
  "detail": "Empty messages list"
}

500 Internal Server Error

{
  "detail": "Pattern mining failed: ..."
}

Best Practice: Always check response status and handle errors gracefully.
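A hedged sketch of that practice in client code, using the {"detail": ...} error shape shown above. The helper name is illustrative:

```python
# Sketch: check the status code before trusting the payload, and
# surface the API's "detail" field in the raised error.
def handle_analyze_response(status_code, body):
    if status_code == 200:
        return body["patterns"], body["rules"]
    detail = body.get("detail", "unknown error")
    if status_code == 400:
        raise ValueError(f"Bad request: {detail}")
    raise RuntimeError(f"Server error ({status_code}): {detail}")

patterns, rules = handle_analyze_response(
    200, {"patterns": [{"id": 1}], "rules": []})
print(len(patterns))  # 1
```

With requests, feed it `response.status_code` and `response.json()`.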


Typical Workflow

For Small Batches (< 10,000 messages)

  1. Use /api/v1/analyze with all options enabled
  2. Get patterns, rules, and export in one response
  3. Deploy exported rules to your filtering system

For Large Batches (> 10,000 messages)

  1. Ingest messages in batches via /api/v1/messages/ingest
  2. Mine patterns via /api/v1/patterns/mine
  3. Evaluate rules via /api/v1/rules/eval-shadow
  4. Promote good rules via /api/v1/rules/promote
  5. Export active rules via /api/v1/rules/export

For Continuous Pipelines

  1. Set up scheduled ingestion (cron, task queue, etc.)
  2. Run pattern mining periodically (daily, weekly)
  3. Evaluate shadow rules regularly
  4. Auto-promote based on metrics (or manual review)
  5. Export and deploy active rules

API Documentation

Full OpenAPI documentation available at:

http://localhost:8000/docs

Interactive Swagger UI for exploring all endpoints.


Tip: Start with /api/v1/analyze for quick integration, then move to lower-level endpoints as your needs grow.
