Load Testing Guide
PATAS Bot edited this page Nov 19, 2025 · 1 revision
A guide to load testing the PATAS API.
Basic run against the health endpoint:

    python scripts/load_test.py --url http://localhost:8000 --endpoint /api/v1/health --rps 100 --duration 60

Higher-load run against the analyze endpoint, saving a JSON report:

    python scripts/load_test.py \
      --url http://localhost:8000 \
      --endpoint /api/v1/analyze \
      --rps 500 \
      --duration 120 \
      --output load_test_500rps.json

Options:
- --url: API base URL (default: http://localhost:8000)
- --endpoint: API endpoint to test (default: /api/v1/health)
- --rps: Requests per second (default: 100)
- --duration: Test duration in seconds (default: 60)
- --output: Output JSON report file
- --api-key: Optional API key for authentication
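
To make the interaction between --rps and --duration concrete, here is a minimal sketch of one common way to hold a fixed request rate. It is not taken from scripts/load_test.py, whose internals are not shown in this guide; the function name `run_at_fixed_rate` and the injected `send` coroutine are hypothetical.

```python
import asyncio
import time


async def run_at_fixed_rate(send, rps, duration_s):
    """Start `send()` at evenly spaced times: `rps` calls per second for `duration_s` seconds.

    `send` is a caller-supplied coroutine function (hypothetical here) that
    performs one request and returns its result, e.g. a status code.
    """
    interval = 1.0 / rps
    total = round(rps * duration_s)  # e.g. 100 rps * 60 s = 6000 requests
    start = time.monotonic()
    tasks = []
    for i in range(total):
        # Wait for this request's scheduled start time, so that slow
        # responses do not reduce the rate at which new requests are sent.
        delay = start + i * interval - time.monotonic()
        if delay > 0:
            await asyncio.sleep(delay)
        tasks.append(asyncio.create_task(send()))
    # Collect every result; exceptions are returned instead of raised.
    return await asyncio.gather(*tasks, return_exceptions=True)
```

With the defaults listed above (100 requests per second for 60 seconds), a loop like this would issue 6000 requests in total.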
The JSON report contains:
- Total requests, success/error rates
- Latency metrics (P50, P95, P99, avg, min, max)
- Status code distribution
- Error messages (first 10)
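
For readers unfamiliar with the latency metrics listed above, the following sketch shows how P50, P95, and P99 can be derived from raw samples using the nearest-rank method. This is an illustration only: the guide does not say which percentile method the script uses, and the function name `summarize_latencies` is hypothetical.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Return avg/min/max and nearest-rank P50/P95/P99 for latencies in milliseconds."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    n = len(ordered)

    def percentile(p):
        # Nearest rank: the smallest sample such that at least p% of all
        # samples are less than or equal to it. Integer ceiling of p*n/100.
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    return {
        "p50": percentile(50),
        "p95": percentile(95),
        "p99": percentile(99),
        "avg": statistics.fmean(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }
```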
Performance targets:
- P95 latency: ≤ 200 ms (rules-only), ≤ 700 ms (with LLM)
- Error rate: < 1%
- Stable throughput: Maintain target RPS without degradation
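
The latency and error-rate targets above can be checked automatically once a report has been loaded. The sketch below assumes a report shape that this guide does not specify: the keys `latency_ms`, `p95`, and `error_rate` (a fraction between 0 and 1) are hypothetical and should be adapted to the actual JSON output.

```python
# Targets taken from this guide.
P95_TARGET_MS = {"rules_only": 200, "with_llm": 700}
MAX_ERROR_RATE = 0.01  # the error rate must stay below 1%


def check_targets(report, mode="rules_only"):
    """Return a list of target violations; an empty list means all targets are met.

    The keys "latency_ms", "p95", and "error_rate" are assumed names,
    not confirmed fields of the real report.
    """
    violations = []
    p95 = report["latency_ms"]["p95"]
    limit = P95_TARGET_MS[mode]
    if p95 > limit:
        violations.append(f"P95 latency {p95} ms exceeds {limit} ms ({mode})")
    error_rate = report["error_rate"]
    if error_rate >= MAX_ERROR_RATE:
        violations.append(f"error rate {error_rate:.2%} is not below 1%")
    return violations
```

A report file could be loaded with `json.load` and passed to `check_targets`, using `mode="with_llm"` for runs that call the LLM.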
Good Performance:
- P95 latency within targets
- Error rate < 1%
- Consistent latency across test duration
Issues to Watch:
- P95 latency exceeding targets
- Error rate > 1%
- Increasing latency over time
- P99 latency far above P95 (a long tail of slow requests)
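
One simple way to spot latency that increases over time is to compare the median of the first third of the run with the median of the last third. The sketch below assumes per-request latencies are available in the order the requests were sent, which the report described here may not provide; the function name `latency_drift` and the 1.5x threshold are illustrative choices, not values from this guide.

```python
import statistics


def latency_drift(latencies_ms, threshold=1.5):
    """Compare early and late median latency for samples given in send order.

    Returns (ratio, drifting), where ratio = late median / early median and
    drifting is True when the ratio exceeds `threshold` (an assumed cutoff).
    """
    n = len(latencies_ms)
    if n < 6:
        raise ValueError("need at least 6 samples to compare early and late latency")
    third = n // 3
    early = statistics.median(latencies_ms[:third])
    late = statistics.median(latencies_ms[-third:])
    ratio = late / early if early > 0 else float("inf")
    return ratio, ratio > threshold
```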
Troubleshooting:
- High Latency: Check the database connection pool, CPU/memory usage, slow queries, and LLM API response times
- High Error Rate: Check API logs, database connectivity, rate limiting, and external API status
- Connection Errors: Verify that the API server is running, and check network connectivity and firewall rules
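
When diagnosing connection errors, a quick TCP check can confirm whether anything is listening at the API's host and port before a full test is started. This is a minimal sketch using only the standard library; the helper name `server_reachable` is hypothetical, and a successful connection shows only that the port is open, not that the API is healthy.

```python
import socket
from urllib.parse import urlsplit


def server_reachable(base_url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(base_url)
    host = parts.hostname
    if host is None:
        raise ValueError(f"no host in URL: {base_url!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `server_reachable("http://localhost:8000")` checks the default base URL used in this guide.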