Alerting

Nick edited this page Nov 21, 2025 · 1 revision

PATAS Alerting Guide

This guide describes the alerting rules configured for PATAS monitoring.

Alert Rules

High False Positive Rate

Alert: HighFalsePositiveRate
Severity: Warning
Condition: False positive rate > 10% over 5 minutes
Description: Indicates that rules are incorrectly flagging legitimate messages as spam.

Action:

  • Review recent rule evaluations
  • Check for rules with high ham_hits
  • Consider deprecating problematic rules
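
As a rough illustration of the condition and of the ham_hits check, the sketch below computes a false positive rate for one evaluation window and ranks rules by ham_hits. The record layout (rule_id, spam_hits, ham_hits), the function names, and the definition of the rate as flagged ham divided by all ham seen are assumptions made for this example, not PATAS's actual schema or formula.

```python
from dataclasses import dataclass

FP_RATE_THRESHOLD = 0.10  # default from this guide: 10%


@dataclass
class RuleStats:
    # Hypothetical per-rule counters for one evaluation window.
    rule_id: str
    spam_hits: int
    ham_hits: int


def false_positive_rate(ham_flagged: int, ham_total: int) -> float:
    """Share of legitimate (ham) messages flagged as spam in the window."""
    if ham_total == 0:
        return 0.0  # no ham seen, so nothing could be misflagged
    return ham_flagged / ham_total


def rules_by_ham_hits(stats: list[RuleStats]) -> list[RuleStats]:
    """Rules with at least one ham hit, worst offenders first."""
    offenders = [s for s in stats if s.ham_hits > 0]
    return sorted(offenders, key=lambda s: s.ham_hits, reverse=True)


window = [
    RuleStats("rule-a", spam_hits=120, ham_hits=2),
    RuleStats("rule-b", spam_hits=40, ham_hits=25),
    RuleStats("rule-c", spam_hits=75, ham_hits=0),
]
# Simplification: a message hit by two rules is counted twice here.
ham_flagged = sum(s.ham_hits for s in window)
rate = false_positive_rate(ham_flagged, ham_total=200)
should_alert = rate > FP_RATE_THRESHOLD
worst = rules_by_ham_hits(window)[0].rule_id
```

In this made-up window, rule-b accounts for most of the flagged ham, so it would be the first candidate to review.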

Rule Precision Degradation

Alert: RulePrecisionDegradation
Severity: Critical
Condition: A rule's precision drops by more than 10% compared to its previous evaluation
Description: A rule's precision has significantly degraded, indicating it may need to be deprecated.

Action:

  • Review the specific rule's evaluation metrics
  • Check for changes in spam patterns
  • Consider deprecating the rule if degradation persists
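
The condition does not state whether the 10% drop is absolute (percentage points) or relative to the previous value. The sketch below computes both so the difference is visible. It assumes precision is spam_hits divided by all hits; the function names are illustrative and not part of PATAS.

```python
PRECISION_DROP_THRESHOLD = 0.10  # default from this guide: 10%


def precision(spam_hits: int, ham_hits: int) -> float:
    """Fraction of a rule's matches that were actually spam."""
    total = spam_hits + ham_hits
    return spam_hits / total if total else 0.0


def precision_drop(previous: float, current: float) -> tuple[float, float]:
    """Return (absolute drop, relative drop) between two evaluations."""
    absolute = previous - current
    relative = absolute / previous if previous else 0.0
    return absolute, relative


prev = precision(spam_hits=95, ham_hits=5)
curr = precision(spam_hits=80, ham_hits=20)
absolute, relative = precision_drop(prev, curr)
# This example applies the threshold to the absolute drop (an assumption).
degraded = absolute > PRECISION_DROP_THRESHOLD
```

Here precision falls from 0.95 to 0.80, which exceeds the threshold under either reading. For rules with low baseline precision the two readings can disagree, so check which one alerts.yml actually uses.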

Service Unavailable

Alert: PATASServiceDown
Severity: Critical
Condition: PATAS API endpoint is not responding
Description: The PATAS service is down or unreachable.

Action:

  • Check service health and logs
  • Verify database connectivity
  • Check for resource constraints (CPU, memory, disk)

High API Latency

Alert: HighAPILatency
Severity: Warning
Condition: 95th percentile API latency > 2 seconds over 5 minutes
Description: API responses are slower than expected.

Action:

  • Check database query performance
  • Review LLM/embedding API response times
  • Check for resource constraints
  • Consider scaling horizontally
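
To show why the condition uses the 95th percentile rather than the average, here is a minimal sketch using the nearest-rank method on raw samples. The sample values are invented, and a monitoring system that estimates quantiles from histogram buckets may report slightly different numbers.

```python
import math

LATENCY_THRESHOLD_SECONDS = 2.0  # default from this guide


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples in window")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# 18 fast requests and 2 slow ones in a 5-minute window (made-up data).
latencies = [0.2] * 18 + [2.5, 3.1]
p95 = percentile(latencies, 95)
mean = sum(latencies) / len(latencies)
slow = p95 > LATENCY_THRESHOLD_SECONDS
```

The mean of this window stays well under 2 seconds while the 95th percentile does not, which is the kind of tail latency this alert is meant to catch.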

Pattern Mining Failure

Alert: PatternMiningFailure
Severity: Warning
Condition: Pattern mining errors detected in last 5 minutes
Description: Pattern mining operations are encountering errors.

Action:

  • Check pattern mining logs
  • Verify database connectivity
  • Check LLM/embedding service availability
  • Review checkpoint status

Low Rule Coverage

Alert: LowRuleCoverage
Severity: Info
Condition: Average rule coverage < 1% for 30 minutes
Description: Rules are not matching a significant portion of spam traffic.

Action:

  • Review pattern mining results
  • Check if new spam patterns have emerged
  • Consider running pattern mining more frequently
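
The "for 30 minutes" part of the condition means a brief dip should not fire the alert. A minimal sketch of that hold-period check follows, assuming coverage is sampled periodically as (timestamp, average coverage) pairs; the sample format and function name are hypothetical.

```python
COVERAGE_THRESHOLD = 0.01  # default from this guide: 1%
HOLD_SECONDS = 30 * 60     # condition must hold for 30 minutes


def low_coverage_sustained(samples: list[tuple[float, float]], now: float) -> bool:
    """samples are (unix_timestamp, average_coverage) pairs, oldest first.

    True only if history reaches back to the start of the hold period
    and every sample inside the period is below the threshold.
    """
    window_start = now - HOLD_SECONDS
    if not samples or samples[0][0] > window_start:
        return False  # not enough history to cover the hold period
    recent = [c for t, c in samples if window_start <= t <= now]
    return bool(recent) and all(c < COVERAGE_THRESHOLD for c in recent)


now = 3600.0
# One sample every 5 minutes over the last hour (made-up data).
steady_low = [(float(t), 0.005) for t in range(0, 3601, 300)]
recent_dip = [(float(t), 0.05 if t < 3000 else 0.005) for t in range(0, 3601, 300)]

fires_on_steady_low = low_coverage_sustained(steady_low, now)
fires_on_recent_dip = low_coverage_sustained(recent_dip, now)
```

Coverage that has been low for the whole period triggers the check; coverage that only dropped in the last ten minutes does not.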

Configuration

Alert rules are defined in alerts.yml and can be customized based on your requirements.

Thresholds

Adjust thresholds in alerts.yml based on your use case:

  • False Positive Rate: Default 10% (0.1) - adjust based on acceptable FP rate
  • Precision Drop: Default 10% (0.1) - adjust based on acceptable degradation
  • API Latency: Default 2 seconds - adjust based on SLA requirements
  • Rule Coverage: Default 1% (0.01) - adjust based on expected coverage
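
Because three of the four thresholds are fractions rather than percentages, writing 10 where 0.1 is meant is an easy mistake. The sketch below is one way to sanity-check values before copying them into alerts.yml. The class and field names are hypothetical and do not correspond to actual keys in alerts.yml.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Defaults are the values listed in this guide.
    false_positive_rate: float = 0.10
    precision_drop: float = 0.10
    api_latency_seconds: float = 2.0
    rule_coverage: float = 0.01

    def __post_init__(self) -> None:
        # Rates are fractions, so anything outside [0, 1] is a likely typo.
        for name in ("false_positive_rate", "precision_drop", "rule_coverage"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be a fraction in [0, 1], got {value}")
        if self.api_latency_seconds <= 0:
            raise ValueError("api_latency_seconds must be positive")


defaults = Thresholds()
strict = Thresholds(false_positive_rate=0.05, api_latency_seconds=1.0)

try:
    Thresholds(false_positive_rate=10)  # percent written instead of fraction
    rejected = False
except ValueError:
    rejected = True
```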

AlertManager Integration

Configure AlertManager to route alerts to your notification channels:

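  1. Define your notification receivers in the AlertManager configuration (receivers normally live in AlertManager's own config file, while alerts.yml holds the alert rules)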
  2. Configure email, Slack, PagerDuty, or other integrations
  3. Test alert routing

Best Practices

  1. Start Conservative: Begin with looser thresholds that fire less often, then tighten them once you know which alerts are noisy
  2. Monitor Trends: Watch for gradual degradation, not just threshold breaches
  3. Document Actions: Keep a runbook for common alert scenarios
  4. Regular Review: Periodically review and adjust alert thresholds based on historical data
