Small reproducible benchmark for OpenAI-compatible AI API gateways.
The default target is API NODE:
Base URL: https://apinode.pro
Model: gpt-5.5
API: OpenAI Responses API
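As a rough illustration of what one timed sample against this target might look like, here is a minimal sketch. It is not the code in benchmark.mjs. The function name `timedRequest`, the `/v1/responses` path appended to the base URL, the prompt text, and the 30 second default timeout are all assumptions made for the example.

```javascript
// Hypothetical helper: sends one non-streaming Responses API request and times it.
// Assumes Node 18+ (global fetch, AbortSignal.timeout, performance).
async function timedRequest({ baseUrl, apiKey, model, timeoutMs = 30000, fetchImpl = fetch }) {
  const started = performance.now();
  try {
    // Assumption: the gateway exposes the Responses API at <baseUrl>/v1/responses.
    const res = await fetchImpl(`${baseUrl}/v1/responses`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, input: "Reply with the single word: pong" }),
      signal: AbortSignal.timeout(timeoutMs), // abort the request if it runs too long
    });
    // A non-JSON body is recorded as null instead of failing the whole sample.
    const body = await res.json().catch(() => null);
    return { ok: res.ok, status: res.status, latencyMs: performance.now() - started, body };
  } catch (err) {
    // Network errors and timeouts land here; status 0 marks "no HTTP response".
    return { ok: false, status: 0, latencyMs: performance.now() - started, error: String(err) };
  }
}
```

The `fetchImpl` parameter exists only so the sketch can be exercised with a stub instead of a live endpoint.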
Latest generated summary:
Latest detailed results:
Latest 7-day trend:
The latest benchmark result is generated by GitHub Actions and committed back to this repository.
Use the summary for a quick health check:
- Success rate: did the endpoint return usable responses?
- Average latency: broad request timing across the sample.
- P50 latency: typical request timing.
- P95 latency: slower-end request timing in the sample.
- Output present: did the response include text that an app can consume?
This benchmark records:
- Success rate.
- Average latency.
- P50 latency.
- P95 latency.
- HTTP status.
- Whether output text is present.
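The recorded metrics can be derived from a list of per-request samples. The sketch below shows one way to do that. The sample shape, the function names, the nearest-rank percentile method, and counting a request as successful only when it is both HTTP-ok and has output text are assumptions for illustration, not a description of benchmark.mjs.

```javascript
// Hypothetical sample shape:
// { ok: boolean, status: number, latencyMs: number, hasOutput: boolean }

// Nearest-rank percentile over an ascending-sorted array (assumed method).
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return null;
  const rank = Math.ceil((p / 100) * sortedValues.length);
  const index = Math.min(sortedValues.length - 1, Math.max(0, rank - 1));
  return sortedValues[index];
}

function summarize(samples) {
  const latencies = samples.map((s) => s.latencyMs).sort((a, b) => a - b);
  // Assumption: a "usable" response is HTTP-ok and contains output text.
  const successes = samples.filter((s) => s.ok && s.hasOutput).length;
  const total = latencies.reduce((sum, v) => sum + v, 0);
  return {
    samples: samples.length,
    successRate: samples.length ? successes / samples.length : 0,
    averageMs: samples.length ? total / samples.length : null,
    p50Ms: percentile(latencies, 50),
    p95Ms: percentile(latencies, 95),
    outputPresent: samples.length > 0 && samples.every((s) => s.hasOutput),
  };
}
```

With only five samples, nearest-rank P95 is simply the slowest request, which is one reason to read it as "slower-end timing in the sample" and not as a stable tail estimate.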
Treat a failed run as a signal to investigate, not as a marketing claim.
Common causes:
- Missing or expired APINODE_API_KEY secret.
- Temporary provider or network error.
- Rate limit or timeout during the sample window.
- Model name mismatch.
- Response shape changed and the benchmark could not find output text.
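The last cause is easier to diagnose with the extraction step in view. Below is a minimal sketch of pulling output text from a Responses API JSON body, assuming the documented shape where `output` is an array of items and `message` items carry `output_text` content parts. The function name `extractOutputText` is hypothetical, and the real benchmark may look for text differently.

```javascript
// Hypothetical helper: returns the concatenated output text, or "" if none is found.
function extractOutputText(body) {
  if (!body || !Array.isArray(body.output)) return "";
  const parts = [];
  for (const item of body.output) {
    // Skip non-message items such as reasoning or tool calls.
    if (!item || item.type !== "message" || !Array.isArray(item.content)) continue;
    for (const part of item.content) {
      if (part && part.type === "output_text" && typeof part.text === "string") {
        parts.push(part.text);
      }
    }
  }
  return parts.join("");
}
```

If a gateway answers with a different shape, for example a Chat Completions style `choices` array, a function like this returns an empty string and the run would be recorded as having no output even though the HTTP status was 200.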
If a run fails, check results/last-run.md first, then inspect the GitHub Actions log.
export APINODE_API_KEY="your_api_key"
export APINODE_BASE_URL="https://apinode.pro"
export APINODE_MODEL="gpt-5.5"
export BENCHMARK_SAMPLES="5"
npm run benchmark
Results are written to:
SUMMARY.md
TREND.md
results/last-run.json
results/last-run.md
history/YYYY-MM-DD.json
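For orientation, the environment variables and the dated history file name above could be handled along these lines. This is a sketch under assumptions: `loadConfig` and `historyPath` are hypothetical names, the fallback of 5 samples is taken from the example above and not from a documented default, and using the UTC date for the file name is a guess.

```javascript
// Hypothetical helper: reads benchmark settings from an env-like object.
function loadConfig(env) {
  if (!env.APINODE_API_KEY) {
    throw new Error("APINODE_API_KEY is not set");
  }
  const samples = Number.parseInt(env.BENCHMARK_SAMPLES ?? "5", 10);
  if (!Number.isInteger(samples) || samples < 1) {
    throw new Error("BENCHMARK_SAMPLES must be a positive integer");
  }
  return {
    apiKey: env.APINODE_API_KEY,
    // Strip trailing slashes so paths can be appended predictably.
    baseUrl: (env.APINODE_BASE_URL ?? "https://apinode.pro").replace(/\/+$/, ""),
    model: env.APINODE_MODEL ?? "gpt-5.5",
    samples,
  };
}

// Hypothetical helper: builds history/YYYY-MM-DD.json from a Date (UTC assumed).
function historyPath(date) {
  return `history/${date.toISOString().slice(0, 10)}.json`;
}
```

In a Node script this would be called as `loadConfig(process.env)`.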
Add APINODE_API_KEY as a repository secret, then run the benchmark workflow manually.
The scheduled workflow is intentionally conservative. Increase sample size only after confirming cost and rate limits.
When sharing results externally, link to the committed files instead of copying isolated numbers:
- SUMMARY.md
- TREND.md
- results/last-run.md
- benchmark.mjs
That gives readers the sample size, endpoint, model, and benchmark code in one place.
Related documents:
- How to interpret benchmark results
- How to cite benchmark results
- Latency watch notes
- Latency watch note: 2026-05-09
- P95 watch procedure
- Watch status: 2026-05-10
- Benchmark incident response runbook
- Embedding benchmark plan
- Weekly benchmark review template
- Daily health digest template
- Benchmark annotation policy
- No new run status: 2026-05-11
- Traffic to content decisions: 2026-05-11
- Daily benchmark status: 2026-05-11
- Traffic snapshot: 2026-05-11 05:47 UTC
When publishing benchmark results, include:
- Date and time.
- Runtime region.
- Sample size.
- Timeout.
- Model name.
- Endpoint.
- Whether requests were streaming or non-streaming.
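One way to make this checklist hard to skip is to validate a metadata record before a result is published. The sketch below is illustrative only: the field names and the `missingMetadataFields` helper are hypothetical and do not reflect the schema of results/last-run.json.

```javascript
// Hypothetical field names, one per item in the checklist above.
const REQUIRED_FIELDS = [
  "timestamp", // date and time of the run
  "region",    // runtime region
  "samples",   // sample size
  "timeoutMs", // per-request timeout
  "model",     // model name
  "endpoint",  // endpoint that was called
  "streaming", // true for streaming, false for non-streaming
];

// Returns the names of required fields that are absent or empty.
// Note that `false` and `0` count as present values.
function missingMetadataFields(meta) {
  return REQUIRED_FIELDS.filter(
    (key) => meta[key] === undefined || meta[key] === null || meta[key] === ""
  );
}
```

A publishing step could refuse to proceed whenever the returned array is non-empty.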
Do not publish inflated claims. Developers trust benchmarks that are boring, transparent, and repeatable.