Load Testing Guide

PATAS Bot edited this page Nov 19, 2025 · 1 revision

This guide describes how to load test the PATAS API.

Quick Start

Basic Load Test

python scripts/load_test.py --url http://localhost:8000 --endpoint /api/v1/health --rps 100 --duration 60

Analyze Endpoint Test

python scripts/load_test.py \
  --url http://localhost:8000 \
  --endpoint /api/v1/analyze \
  --rps 500 \
  --duration 120 \
  --output load_test_500rps.json

Parameters

  • --url - API base URL (default: http://localhost:8000)
  • --endpoint - API endpoint to test (default: /api/v1/health)
  • --rps - Requests per second (default: 100)
  • --duration - Test duration in seconds (default: 60)
  • --output - Output JSON report file
  • --api-key - Optional API key for authentication
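
For reference, the flags and defaults listed above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the actual contents of `scripts/load_test.py`: the integer types for `--rps` and `--duration` and the `None` defaults for `--output` and `--api-key` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; types and None defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Load test sketch")
    parser.add_argument("--url", default="http://localhost:8000",
                        help="API base URL")
    parser.add_argument("--endpoint", default="/api/v1/health",
                        help="API endpoint to test")
    parser.add_argument("--rps", type=int, default=100,
                        help="Requests per second")
    parser.add_argument("--duration", type=int, default=60,
                        help="Test duration in seconds")
    parser.add_argument("--output", default=None,
                        help="Output JSON report file")
    parser.add_argument("--api-key", default=None,
                        help="Optional API key for authentication")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--endpoint", "/api/v1/analyze", "--rps", "500"])
print(args.url + args.endpoint, args.rps, args.duration)
```

Flags that are omitted fall back to the defaults listed above, so only the values that differ from a basic health-check run need to be passed.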

Output

JSON report with:

  • Total requests, success/error rates
  • Latency metrics (P50, P95, P99, avg, min, max)
  • Status code distribution
  • Error messages (first 10)
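
To make the latency metrics concrete, the sketch below computes them from a list of per-request latencies using the nearest-rank percentile method. The method and the names `percentile` and `summarize` are assumptions chosen for illustration; the real script may compute percentiles differently (for example, with interpolation).

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (pct is an integer 1..100)."""
    if not sorted_values:
        raise ValueError("no samples")
    # Rank is the smallest 1-based position covering pct percent of the samples.
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_ms):
    """Return the latency metrics listed above for a list of samples in ms."""
    ordered = sorted(latencies_ms)
    return {
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "avg": sum(ordered) / len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }


# Hypothetical sample: 100 requests with latencies of 1..100 ms.
summary = summarize(list(range(1, 101)))
print(summary)
```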

Performance Targets

  • P95 latency: ≤ 200ms (rules-only), ≤ 700ms (with LLM)
  • Error rate: < 1%
  • Stable throughput: Maintain target RPS without degradation
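
The latency and error-rate targets above can be checked automatically after a run. The following is a minimal sketch; the function name, the mode labels, and the way the metrics are passed in are hypothetical and do not reflect the report's actual JSON field names.

```python
# Targets from this guide; the dictionary keys are hypothetical mode labels.
P95_TARGET_MS = {"rules-only": 200.0, "with-llm": 700.0}
MAX_ERROR_RATE = 0.01  # error rate must stay below 1%


def check_targets(p95_ms, total_requests, error_count, mode="rules-only"):
    """Return a list of human-readable target violations (empty if all pass)."""
    failures = []
    target = P95_TARGET_MS[mode]
    if p95_ms > target:
        failures.append(f"P95 {p95_ms:.0f} ms exceeds {target:.0f} ms ({mode})")
    error_rate = error_count / total_requests if total_requests else 0.0
    if error_rate >= MAX_ERROR_RATE:
        failures.append(f"error rate {error_rate:.2%} is not below 1%")
    return failures


# Hypothetical numbers, for illustration only.
passing = check_targets(p95_ms=180, total_requests=10_000, error_count=50)
failing = check_targets(p95_ms=750, total_requests=10_000, error_count=150,
                        mode="with-llm")
print(passing, failing)
```

A check like this covers only the first two targets; stable throughput has to be judged from how the metrics evolve over the run.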

Interpreting Results

Good Performance:

  • P95 latency within targets
  • Error rate < 1%
  • Consistent latency across test duration

Issues to Watch:

  • P95 latency exceeding targets
  • Error rate ≥ 1%
  • Increasing latency over time
  • P99 latency far above P95 (a long tail of slow requests)
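
One way to spot latency that increases over time is to split the samples, in the order they were recorded, into equal windows and compare the last window's mean with the first. This is a sketch under stated assumptions: the function name and the choice of four windows are illustrative, and it presumes access to per-request latencies in chronological order, which the JSON report may not include.

```python
from statistics import mean


def latency_drift(latencies_ms, windows=4):
    """Return (last/first window mean ratio, per-window means).

    Samples must be in chronological order. Any remainder samples that do
    not fill a whole window at the end are ignored.
    """
    if len(latencies_ms) < windows:
        raise ValueError("need at least one sample per window")
    size = len(latencies_ms) // windows
    means = [mean(latencies_ms[i * size:(i + 1) * size]) for i in range(windows)]
    return means[-1] / means[0], means


# Hypothetical runs: one steady, one that slows down halfway through.
steady_ratio, _ = latency_drift([100] * 40)
degrading_ratio, window_means = latency_drift([100] * 20 + [150] * 20)
print(steady_ratio, degrading_ratio, window_means)
```

A ratio close to 1 suggests consistent latency; a ratio well above 1 suggests degradation under sustained load. The cutoff that counts as a problem is a judgment call that this guide does not specify.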

Troubleshooting

High Latency: Check the database connection pool, CPU and memory usage, slow queries, and LLM API response times.

High Error Rate: Check the API logs, database connectivity, rate limiting, and the status of external APIs.

Connection Errors: Verify that the API server is running, and check network connectivity and firewall rules.
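
Before starting a long run, a quick TCP connection attempt shows whether the server is reachable at all. The helper below is a minimal sketch using only the standard library. `can_connect` is a hypothetical name, and the check confirms only that the port accepts connections, not that the API itself is healthy.

```python
import socket
from urllib.parse import urlsplit


def can_connect(base_url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(base_url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print(can_connect("http://localhost:8000"))
```

If this returns `False` for the base URL passed to `--url`, the problem is in the server process, the network path, or a firewall rule rather than in the load test itself.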
