Public API Quickstart
Quick guide to integrating PATAS into your system via the REST API.
Base URL:

```
http://localhost:8000/api/v1
```

For production deployments, replace this with your PATAS server URL.
Currently, the PATAS API does not require authentication for local development. For production deployments, implement API key authentication.
The `/api/v1/analyze` endpoint bundles the complete workflow:
- Ingest messages
- Run pattern mining (optional)
- Evaluate rules (optional)
- Export rules (optional)
Perfect for small-to-medium batch analysis, RapidAPI integration, and quick prototyping.
Request:

```http
POST /api/v1/analyze
Content-Type: application/json

{
  "messages": [
    {
      "id": "msg_001",
      "text": "Your message text here",
      "is_spam": true,
      "meta": {"lang": "en"}
    }
  ],
  "run_mining": true,
  "run_evaluation": true,
  "export_backend": "sql"
}
```

Response:

```json
{
  "patterns": [
    {
      "id": 1,
      "type": "URL",
      "description": "URL pattern: http://spam.com",
      "group_size": 45,
      "sources_count": 3,
      "senders_count": 2,
      "similarity_reason": "Messages contain the same suspicious URL: http://spam.com",
      "example_report_ids": ["msg_001", "msg_002", "msg_003"],
      "bot_likelihood": 0.85,
      "sql_query": "SELECT id, message_content, sender, source FROM reports WHERE LOWER(message_content) LIKE '%http://spam.com%'"
    }
  ],
  "rules": [
    {
      "id": 1,
      "pattern_id": 1,
      "status": "candidate",
      "sql_expression": "SELECT id FROM messages WHERE ...",
      "precision": 0.95,
      "coverage": 0.15,
      "hits_total": 100
    }
  ],
  "export": "SELECT id FROM messages WHERE ...",
  "meta": {
    "ingested_count": 10,
    "patterns_created": 3,
    "rules_created": 3,
    "evaluation_count": 3,
    "timings": {
      "ingest_seconds": 0.123,
      "mining_seconds": 2.456,
      "evaluation_seconds": 0.789,
      "total_seconds": 3.368
    }
  }
}
```

The `/api/v1/analyze` endpoint returns pattern groups: groups of similar spam messages with detailed statistics and SQL queries.
Each pattern in the response includes:
- `group_size`: Number of messages in this pattern group
- `sources_count`: Number of unique sources/chats where these messages appeared
- `senders_count`: Number of unique senders who sent these messages
- `similarity_reason`: Human-readable explanation of why the messages are similar (e.g., "Job offers with identical structure and varied salary amounts")
- `example_report_ids`: Sample message IDs from the group (up to 10)
- `bot_likelihood`: Bot probability score (0.0-1.0); higher values indicate automated/spam behavior
- `sql_query`: SQL SELECT query against the generic `reports` table that matches this pattern
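For example, these fields can drive a simple review queue that surfaces the most likely bot-generated groups first. A minimal sketch, where the `triage` helper and its threshold are illustrative, not part of PATAS:

```python
# Illustrative pattern entry, using values from the example response above
pattern = {
    "id": 1,
    "group_size": 45,
    "bot_likelihood": 0.85,
    "similarity_reason": "Messages contain the same suspicious URL: http://spam.com",
}

def triage(patterns, threshold=0.8):
    """Return likely-automated patterns, largest groups first (illustrative threshold)."""
    hits = [p for p in patterns if p["bot_likelihood"] >= threshold]
    return sorted(hits, key=lambda p: p["group_size"], reverse=True)

for p in triage([pattern]):
    print(f"pattern #{p['id']} ({p['group_size']} msgs): {p['similarity_reason']}")
```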
The `sql_query` field contains a SQL SELECT query that can be executed against a generic `reports` table with the following schema:

```sql
CREATE TABLE reports (
  id VARCHAR PRIMARY KEY,
  message_content TEXT NOT NULL,
  sender VARCHAR,
  source VARCHAR,
  country VARCHAR,
  has_media BOOLEAN,
  label_spam BOOLEAN,
  created_at TIMESTAMP
);
```

Field Mapping:

- PATAS `messages.text` → `reports.message_content`
- PATAS `messages.meta.sender` → `reports.sender`
- PATAS `messages.meta.source` → `reports.source`
- PATAS `messages.meta.country` → `reports.country`
- PATAS `messages.meta.has_media` → `reports.has_media`
- PATAS `messages.is_spam` → `reports.label_spam`
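Under this mapping, converting a PATAS message into a `reports` row might look like the following sketch (the `to_report_row` helper is hypothetical):

```python
def to_report_row(msg):
    """Map a PATAS message dict onto the generic reports-table columns."""
    meta = msg.get("meta", {})
    return {
        "id": msg["id"],
        "message_content": msg["text"],
        "sender": meta.get("sender"),
        "source": meta.get("source"),
        "country": meta.get("country"),
        "has_media": meta.get("has_media"),
        "label_spam": msg["is_spam"],
    }

row = to_report_row({
    "id": "msg_001",
    "text": "Buy now! Click here: http://spam.com",
    "is_spam": True,
    "meta": {"sender": "user_42", "source": "chat_a"},
})
print(row["message_content"], row["label_spam"])
```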
Example SQL Query:
SELECT id, message_content, sender, source
FROM reports
WHERE LOWER(message_content) LIKE '%http://spam.com%'This query can be used to:
- Identify all reports matching the pattern
- Apply blocking rules in your system
- Generate reports and analytics
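For example, a returned `sql_query` can be executed directly against such a table. A sketch with an in-memory SQLite database standing in for the real reports store:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE reports (
    id TEXT PRIMARY KEY,
    message_content TEXT NOT NULL,
    sender TEXT,
    source TEXT
)""")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?, ?)",
    [("msg_001", "Click here: http://spam.com", "user_1", "chat_a"),
     ("msg_002", "see you tomorrow", "user_2", "chat_b")],
)

# sql_query as returned in the example pattern above
sql_query = ("SELECT id, message_content, sender, source FROM reports "
             "WHERE LOWER(message_content) LIKE '%http://spam.com%'")
matches = conn.execute(sql_query).fetchall()
print([row[0] for row in matches])  # ['msg_001']
```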
- Batch Analysis: Send a batch of labeled reports, get pattern groups with SQL queries
- Rule Generation: Use `sql_query` to create blocking rules in your database
- Pattern Discovery: Understand why messages are similar via `similarity_reason`
- Bot Detection: Use `bot_likelihood` to prioritize automated spam patterns
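Similarly, the rule objects in the response carry `precision` and `coverage` metrics that can gate promotion. A sketch, where the thresholds are illustrative, not PATAS defaults:

```python
def promotable(rules, min_precision=0.9, min_coverage=0.05):
    """Pick rule candidates whose evaluation metrics clear the (illustrative) bars."""
    return [r["id"] for r in rules
            if r["status"] == "candidate"
            and r["precision"] >= min_precision
            and r["coverage"] >= min_coverage]

# Shapes mirror the `rules` array in the example response above
rules = [{"id": 1, "status": "candidate", "precision": 0.95, "coverage": 0.15},
         {"id": 2, "status": "candidate", "precision": 0.60, "coverage": 0.40}]
print(promotable(rules))  # [1]
```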
```bash
curl -X POST http://localhost:8000/api/v1/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "id": "test_001",
        "text": "Buy now! Click here: http://spam.com",
        "is_spam": true,
        "meta": {"lang": "en"}
      }
    ],
    "run_mining": true,
    "run_evaluation": true,
    "export_backend": "sql"
  }'
```

```python
import requests

# Prepare messages
messages = [
    {
        "id": "msg_001",
        "text": "Buy now! Click here: http://spam.com",
        "is_spam": True,
        "meta": {"lang": "en"}
    }
]

# Send request
response = requests.post(
    'http://localhost:8000/api/v1/analyze',
    json={
        'messages': messages,
        'run_mining': True,
        'run_evaluation': True,
        'export_backend': 'sql'
    }
)

# Process results
result = response.json()
print(f"Discovered {len(result['patterns'])} patterns")
print(f"Generated {len(result['rules'])} rules")

# Export SQL rules
if result.get('export'):
    print("\nSQL Rules:")
    print(result['export'])
```

```javascript
const fetch = require('node-fetch');

// Prepare messages
const messages = [
  {
    id: 'msg_001',
    text: 'Buy now! Click here: http://spam.com',
    is_spam: true,
    meta: { lang: 'en' }
  }
];

// Send request
fetch('http://localhost:8000/api/v1/analyze', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages,
    run_mining: true,
    run_evaluation: true,
    export_backend: 'sql'
  })
})
  .then(res => res.json())
  .then(result => {
    console.log(`Discovered ${result.patterns.length} patterns`);
    console.log(`Generated ${result.rules.length} rules`);
    if (result.export) {
      console.log('\nSQL Rules:');
      console.log(result.export);
    }
  });
```

For large-scale or continuous pipelines, use the lower-level endpoints:
```http
POST /api/v1/messages/ingest
Content-Type: application/json

{
  "messages": [...]
}
```

```http
POST /api/v1/patterns/mine
Content-Type: application/json

{
  "from_timestamp": "2025-01-01T00:00:00Z",
  "to_timestamp": "2025-01-07T00:00:00Z"
}
```

```http
POST /api/v1/rules/eval-shadow
Content-Type: application/json

{
  "rule_ids": [1, 2, 3],
  "days": 7
}
```

`rule_ids` is optional: pass specific rule IDs, or leave it empty to evaluate all shadow rules.

```http
POST /api/v1/rules/promote
```

```http
GET /api/v1/rules/export?backend=sql
GET /api/v1/rules/export?backend=rol
```

```http
GET /api/v1/patterns?limit=100&offset=0
GET /api/v1/rules?status=active&include_evaluation=true
```

Recommended limits:
- Batch size: Up to 10,000 messages per `/api/v1/analyze` request
- Message length: Up to 8,192 characters per message
- Rate: Adjust based on your server capacity
For larger datasets:
- Use lower-level endpoints with pagination
- Process in batches
- Use async processing if available
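The batching advice above can be sketched as follows. The HTTP call is abstracted behind a `post` callable (e.g. a thin wrapper around `requests.post` against `/api/v1/analyze`); the helper itself is hypothetical:

```python
def analyze_in_batches(messages, post, batch_size=10_000):
    """Split a large message list into /api/v1/analyze-sized requests.

    `post` takes the JSON payload and returns the parsed response,
    e.g. lambda p: requests.post(f"{BASE}/analyze", json=p).json()
    """
    results = []
    for i in range(0, len(messages), batch_size):
        payload = {
            "messages": messages[i:i + batch_size],
            "run_mining": True,
            "run_evaluation": True,
            "export_backend": "sql",
        }
        results.append(post(payload))
    return results

# Stub client: report how many messages each request carried
fake_post = lambda payload: {"meta": {"ingested_count": len(payload["messages"])}}
out = analyze_in_batches([{"id": str(i)} for i in range(25)], fake_post, batch_size=10)
print([r["meta"]["ingested_count"] for r in out])  # [10, 10, 5]
```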
SQL (`export_backend: "sql"`):

```sql
SELECT id FROM messages WHERE text LIKE '%spam-pattern%';
SELECT id FROM messages WHERE text REGEXP 'spam-regex';
```

ROL (`export_backend: "rol"`):

```json
{
  "rules": [
    {
      "id": 1,
      "pattern": "...",
      "action": "block"
    }
  ]
}
```

Example error responses:

```json
{
  "detail": "Empty messages list"
}
```

```json
{
  "detail": "Pattern mining failed: ..."
}
```

Best Practice: Always check the response status and handle errors gracefully.
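One way to follow that practice is to centralize status checking and surface the API's `detail` field. A minimal sketch; `check_response` is a hypothetical helper, not part of PATAS:

```python
import json

def check_response(status_code, body_text):
    """Parse a PATAS API response; raise with the server's `detail` on failure."""
    body = json.loads(body_text)
    if status_code != 200:
        raise RuntimeError(f"PATAS API error {status_code}: {body.get('detail', 'unknown')}")
    return body

try:
    check_response(400, '{"detail": "Empty messages list"}')
except RuntimeError as err:
    print(err)  # PATAS API error 400: Empty messages list
```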
For quick integration:

- Use `/api/v1/analyze` with all options enabled
- Get patterns, rules, and export in one response
- Deploy the exported rules to your filtering system

For a full pipeline:

- Ingest messages in batches via `/api/v1/messages/ingest`
- Mine patterns via `/api/v1/patterns/mine`
- Evaluate rules via `/api/v1/rules/eval-shadow`
- Promote good rules via `/api/v1/rules/promote`
- Export active rules via `/api/v1/rules/export`
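The five pipeline steps can be wired into one routine. A sketch with the HTTP client injected as `call(method, path)` so the ordering stays explicit (the helper is hypothetical; the paths are the endpoints listed above):

```python
# Lower-level endpoints in pipeline order
PIPELINE = [
    ("POST", "/api/v1/messages/ingest"),
    ("POST", "/api/v1/patterns/mine"),
    ("POST", "/api/v1/rules/eval-shadow"),
    ("POST", "/api/v1/rules/promote"),
    ("GET", "/api/v1/rules/export?backend=sql"),
]

def run_pipeline(call):
    """Run each step in order; `call(method, path)` is your HTTP client."""
    result = None
    for method, path in PIPELINE:
        result = call(method, path)
    return result  # response from the final (export) step

# Stub client that just echoes the path it was asked to hit
print(run_pipeline(lambda method, path: path))  # /api/v1/rules/export?backend=sql
```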
For continuous operation:

- Set up scheduled ingestion (cron, task queue, etc.)
- Run pattern mining periodically (daily, weekly)
- Evaluate shadow rules regularly
- Auto-promote based on metrics (or manual review)
- Export and deploy active rules
- Demo Guide - Run a complete demo
- Use Cases - See real-world integration examples
- Overview - Understand PATAS architecture
Full OpenAPI documentation is available at:
http://localhost:8000/docs
Interactive Swagger UI for exploring all endpoints.
Tip: Start with `/api/v1/analyze` for quick integration, then move to the lower-level endpoints as your needs grow.