SQL_LLM_VALIDATION
PATAS includes an optional LLM-based quality validator for SQL rules that detects false positive risks before rules are activated. After a SQL rule passes the deterministic safety checks (`validate_sql_rule`, `check_rule_coverage`), this optional LLM step assesses the rule for false positive risks and quality issues.
Deterministic validation catches:
- ✅ SQL syntax errors
- ✅ Dangerous operations (UPDATE, DELETE, etc.)
- ✅ Overly broad rules (matching >80% of messages)
- ✅ Whitelist violations
LLM validation adds:
- ✅ Semantic false positive risk assessment
- ✅ Context-aware quality checks
- ✅ Suggestions for improvement
LLM validation runs after deterministic checks but before rule creation:

```
LLM generates SQL rule
        ↓
Deterministic validation (validate_sql_rule)
        ↓
Coverage check (check_rule_coverage)
        ↓
LLM quality validation (optional, if LLM available)
        ↓
Rule creation (if all checks pass)
```
The validator is designed to be token-efficient:
- Reuses existing LLM client - Uses the same OpenAI client instance as pattern mining (no new API key needed)
- Compact prompt - Minimal prompt focusing only on false positive risk
- Limited examples - Only 3 spam examples, 2 ham examples (truncated to 60 chars)
- Short response - Max 800 tokens for response
- Low temperature - 0.1 for consistent validation
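The truncation and capping described above can be sketched as follows. The function name, signature, and exact string format are assumptions for illustration, not the real module's API:

```python
def build_validation_prompt(sql, description, spam_examples, ham_examples=None,
                            max_examples=(3, 2), max_chars=60):
    """Hypothetical sketch of the compact prompt builder.

    Caps examples at 3 spam / 2 ham and truncates each to 60 characters
    to keep the validation call token-efficient.
    """
    def trunc(msgs, limit):
        return "; ".join(m[:max_chars] for m in msgs[:limit])

    lines = [
        "SQL rule quality check:",
        f"SQL: {sql}",
        f"Pattern: {description}",
        f"SPAM examples: {trunc(spam_examples, max_examples[0])}",
    ]
    if ham_examples:
        lines.append(f"HAM examples: {trunc(ham_examples, max_examples[1])}")
    lines.append('Check false positive risk (low/medium/high). JSON: '
                 '{"is_safe": bool, "risk_level": "low|medium|high", ...}')
    return "\n".join(lines)
```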
The validation prompt is compact and focused:
```
SQL rule quality check:

SQL: <sql_expression>
Pattern: <description>
SPAM examples: <3 truncated examples>
HAM examples: <2 truncated examples> (optional)

Check false positive risk (low/medium/high). JSON:
{"is_safe": bool, "risk_level": "low|medium|high",
 "false_positive_risks": ["scenario"],
 "suggestions": ["suggestion"],
 "reasoning": "brief"}
```
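Since the model is asked to reply in JSON, the caller needs to parse that reply defensively. A minimal sketch (the fail-closed default of treating an unparseable reply as high risk is an assumption, not documented behavior):

```python
import json

def parse_validation_reply(reply: str) -> dict:
    """Parse the validator's JSON reply.

    Hypothetical sketch: on a malformed reply, fail closed by
    returning a high-risk result so the rule is not created.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return {"is_safe": False, "risk_level": "high",
                "reasoning": "unparseable validator reply"}
    # Fill in defaults so downstream code can rely on the keys.
    data.setdefault("risk_level", "high")
    data.setdefault("is_safe", data["risk_level"] == "low")
    return data
```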
Token estimate: ~200-400 tokens per validation (vs 1000-2000 for pattern generation)
LLM validation is automatically enabled when:
- `use_llm=True` in pattern mining
- An LLM engine is available
- The SQL rule passes deterministic checks
```python
from app.v2_sql_llm_validator import create_sql_validator

# Reuse existing LLM engine
validator = create_sql_validator(
    llm_engine=mining_engine,  # Reuse from pattern mining
    model="gpt-4o-mini",
)

if validator:
    result = await validator.validate_rule_quality(
        sql_expression="SELECT id FROM messages WHERE text LIKE '%spam%'",
        pattern_description="Spam keyword pattern",
        example_spam_messages=["Buy now!", "Click here!"],
        example_ham_messages=["Hello, how are you?"],
    )
    if result['risk_level'] == 'high':
        # Reject rule
        pass
    elif result['risk_level'] == 'medium':
        # Log warning, proceed with caution
        pass
```

Example validation result:

```json
{
  "is_safe": true,
  "risk_level": "low|medium|high",
  "false_positive_risks": [
    "Legitimate messages discussing prices might be blocked",
    "Messages with 'earn' in educational context might match"
  ],
  "suggestions": [
    "Add context check: exclude messages from trusted senders",
    "Require multiple indicators (not just single keyword)"
  ],
  "reasoning": "Rule is generally safe but may match some legitimate price discussions"
}
```

The `risk_level` determines how the rule is handled:
- `risk_level == "high"`: Rule is rejected, not created
- `risk_level == "medium"`: Rule is created but a warning is logged
- `risk_level == "low"`: Rule is created normally
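The risk-level policy above can be condensed into a single decision function. This is a sketch (the function name and logger name are hypothetical), mapping high to rejection, medium to a logged warning, and low to normal creation:

```python
import logging

logger = logging.getLogger("patas.sql_validation")

def apply_validation_policy(result: dict) -> bool:
    """Return True if the rule should be created.

    Sketch of the documented policy: missing or unknown risk levels
    are treated as high (fail closed).
    """
    risk = result.get("risk_level", "high")
    if risk == "high":
        logger.warning("Rejecting rule: %s", result.get("reasoning", ""))
        return False
    if risk == "medium":
        logger.warning("Creating medium-risk rule: %s", result.get("reasoning", ""))
    return True
```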
- Per validation: ~200-400 tokens (input + output)
- Compared to pattern generation: ~10-20% of pattern generation cost
- Total cost: Minimal when reusing existing LLM client
LLM validation can be skipped if:
- LLM is not available (graceful degradation)
- Cost is a concern (validation is optional)
- Deterministic checks are sufficient
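The graceful-degradation contract can be illustrated with a stand-in factory. `create_validator_or_none` is a hypothetical stand-in for `create_sql_validator`, written on the assumption that the real factory returns `None` when no LLM engine is supplied:

```python
def create_validator_or_none(llm_engine, model="gpt-4o-mini"):
    """Hypothetical stand-in for create_sql_validator.

    Illustrates the assumed contract: no engine -> None, and callers
    treat None as 'skip LLM validation, rely on deterministic checks'.
    """
    if llm_engine is None:
        return None
    return {"engine": llm_engine, "model": model}

# Caller side: with no LLM available, the pipeline still works.
validator = create_validator_or_none(None)
llm_step_enabled = validator is not None  # False -> deterministic-only path
```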
LLM validation uses the same configuration as pattern mining:
- Same API key (from `PATAS_OPENAI_API_KEY` or `OPENAI_API_KEY`)
- Same model (default: `gpt-4o-mini`)
- Same client instance (reused for efficiency)
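The key lookup described above can be sketched as follows; that `PATAS_OPENAI_API_KEY` takes precedence over `OPENAI_API_KEY` is an assumption based on the order they are listed:

```python
import os

def resolve_api_key(env=None):
    """Resolve the API key as described in the configuration notes.

    Assumption: the PATAS-specific variable wins over the generic one.
    """
    env = os.environ if env is None else env
    return env.get("PATAS_OPENAI_API_KEY") or env.get("OPENAI_API_KEY")
```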
- Use for high-stakes rules: Enable validation for rules that will be used in Conservative profile (auto-actions)
- Review medium-risk rules: Manually review rules flagged as medium risk
- Monitor costs: Track validation token usage if cost-sensitive
- Combine with deterministic checks: LLM validation complements, doesn't replace deterministic validation
```python
# In pattern mining pipeline
if sql_expr:
    # Deterministic checks
    is_valid, error = validate_sql_rule(sql_expr)
    if is_valid:
        coverage_valid, _, _ = await check_rule_coverage(...)
        if coverage_valid:
            # LLM validation (optional)
            if mining_engine and use_llm:
                validator = create_sql_validator(mining_engine)
                if validator:
                    result = await validator.validate_rule_quality(...)
                    if result['risk_level'] == 'high':
                        # Reject
                        continue
            # Create rule
            rule = await lifecycle.create_candidate_rule(...)
```

- Not deterministic: LLM validation can vary between runs
- Token cost: Adds ~200-400 tokens per rule (minimal but not zero)
- Requires LLM: Only works if LLM is available
- False positives possible: LLM may flag safe rules or miss risky ones
- Cache validation results for identical SQL rules
- Batch validation for multiple rules
- Fine-tuned model for SQL rule validation
- Integration with rule evaluation metrics (precision/recall)
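The first item in the list above, caching validation results for identical SQL rules, could look like this. The class is a hypothetical sketch of the proposed improvement, not existing PATAS code; keys are normalized so trivially different spellings of the same SQL hit the cache:

```python
import hashlib

class ValidationCache:
    """Sketch of the proposed cache: identical SQL expressions reuse
    the stored validation result instead of triggering another LLM call."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(sql: str) -> str:
        # Normalize whitespace/case so equivalent spellings share a key.
        return hashlib.sha256(sql.strip().lower().encode()).hexdigest()

    def get(self, sql: str):
        return self._store.get(self.key(sql))

    def put(self, sql: str, result: dict):
        self._store[self.key(sql)] = result
```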