docs: revamp README and docs with production-grade styling #188

Open
devin-ai-integration[bot] wants to merge 3 commits into master from docs/production-grade-readme-and-docs

devin-ai-integration Bot commented May 19, 2026

Summary

Revamps README.md, CONTRIBUTING.md, docs/ARCHITECTURE.md, docs/telemetry.md, and docs/dev-environment.md with polished, production-grade formatting. Rebrands all user-facing documentation from "EigenInference" to "Darkbloom". Adds centered headers, CI/release badges, table of contents, structured tables, cURL examples, and comprehensive API/CLI reference sections.

Linked issue

N/A — docs improvement

Test plan

  • Verified all internal markdown links resolve correctly
  • Confirmed no EigenInference references remain in user-facing docs (API key prefixes and config paths like eigeninference- and ~/.config/eigeninference/ are intentionally preserved)
  • Reviewed rendered markdown for table formatting, code blocks, and badge rendering
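
The link check in the test plan could be automated along the following lines. This is a minimal sketch, not the procedure actually used for this PR: the function name `broken_links` is hypothetical, and it only checks relative file targets, skipping external URLs and in-page anchors.

```python
import re
from pathlib import Path

# Matches inline markdown links and images: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches a URI scheme prefix such as "https:" or "mailto:"
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_links(md_file: Path) -> list[str]:
    """Return relative link targets in md_file that do not exist on disk."""
    missing = []
    for target in LINK_RE.findall(md_file.read_text(encoding="utf-8")):
        # External URLs and pure in-page anchors are not validated by this sketch.
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue
        # Drop any "#section" suffix before checking the file itself.
        path_part = target.split("#", 1)[0]
        if not (md_file.parent / path_part).exists():
            missing.append(target)
    return missing
```

Running it over `README.md`, `CONTRIBUTING.md`, and the files under `docs/` and failing CI on a non-empty result would turn the manual check into a repeatable one.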

Components touched

  • coordinator (Go)
  • provider (Rust, legacy)
  • provider-swift (Swift CLI)
  • console-ui (Next.js)
  • enclave (Swift)
  • infra / CI / release
  • docs

Protocol / interface changes

  • No protocol/interface changes
  • Yes — described above and matching side updated

Notes for reviewers

  • All five docs files were rewritten for consistency: table of contents, horizontal rule separators, table-based layouts for reference data, and consistent heading hierarchy.
  • The "EigenInference" → "Darkbloom" rename only applies to user-facing prose. Internal identifiers (eigeninference- API key prefix, ~/.config/eigeninference/ config path, EIGENINFERENCE_* env vars) are intentionally untouched since they're part of the running system.
  • README now includes a cURL example alongside the Python SDK example, a full endpoint table, and a hardware compatibility matrix.
  • Architecture doc now has a detailed box-diagram with coordinator subsystems and provider internals.
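
The rename rule described above (old name removed from prose, internal identifiers preserved) can be checked mechanically. The sketch below is illustrative only: `stale_brand_lines` is a hypothetical helper, and the allowlist patterns are assumptions derived from the three identifier families named in this PR description.

```python
import re

# Identifier families the PR description says are intentionally preserved.
ALLOWED = re.compile(
    r"eigeninference-|\.config/eigeninference/|EIGENINFERENCE_[A-Z0-9_]+"
)
# Any remaining mention of the old name, in any capitalization.
OLD_NAME = re.compile(r"eigeninference", re.IGNORECASE)


def stale_brand_lines(text: str) -> list[int]:
    """Return 1-based line numbers that mention the old name outside the allowlist."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Remove allowed identifiers first, then look for any leftover mention.
        if OLD_NAME.search(ALLOWED.sub("", line)):
            hits.append(number)
    return hits
```

An empty result for each user-facing doc would correspond to the second item of the test plan.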

Link to Devin session: https://app.devin.ai/sessions/10a1fa11eeac4d6aa78f5e92f937fc4b


- Rewrite README.md with centered header, badges, table of contents,
  structured sections, cURL examples, and full API/CLI reference
- Rebrand all user-facing docs from EigenInference to Darkbloom
- Restructure docs/ARCHITECTURE.md with detailed system diagram,
  component tables, and organized security/privacy sections
- Polish CONTRIBUTING.md with table-based layout, prerequisites matrix,
  and clear protocol-change sync-point reference
- Improve docs/telemetry.md with table of contents, structured
  endpoint/emission-site tables, and explicit allowlist documentation
- Reformat docs/dev-environment.md with infrastructure overview table,
  numbered setup steps, secrets mapping table, and cost breakdown

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring


vercel Bot commented May 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| d-inference | Ready | Preview | May 19, 2026 10:12pm |
| d-inference-console-ui-dev | Ready | Preview | May 19, 2026 10:12pm |
| d-inference-landing | Ready | Preview | May 19, 2026 10:12pm |


github-actions Bot commented May 19, 2026

Benchmark Results

Runner: macos-15 (M1 Virtual) | Date: 2026-05-19 17:54 UTC

1-provider-streaming

1 providers, 1 users, 30 requests, concurrency=5, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 1 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 30 |
| Success | 4 |
| Errors | 26 |
| Total Duration | 4.301s |
| Throughput | 0.9 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 4 | 2.107s | 2.107s | 2.107s | 2.107s |
| parse | 4 | 73µs | 81µs | 119µs | 119µs |
| reserve | 4 | 3ms | 4ms | 5ms | 5ms |
| route | 4 | 420µs | 489µs | 495µs | 495µs |
| coordinator_to_provider | 4 | 2.099s | 2.1s | 2.101s | 2.101s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=72.75µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=119µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=3.47825ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=5.043ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |
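
The Throughput figures in this benchmark comment are consistent with successful requests divided by total duration, not total requests divided by duration. That reading is inferred from the numbers, not from the harness source, so treat the sketch below as an assumption; `throughput` is a hypothetical helper.

```python
def throughput(successes: int, duration_s: float) -> float:
    """Successful requests per second, rounded to one decimal place."""
    return round(successes / duration_s, 1)


# (scenario, successes, total duration in seconds) copied from this benchmark comment
rows = [
    ("1-provider-streaming", 4, 4.301),
    ("7-provider-multi-model", 50, 9.826),
    ("3-provider-20-users", 60, 5.317),
]

for name, ok, seconds in rows:
    print(f"{name}: {throughput(ok, seconds)} req/s")
```

Under this reading, a scenario with 26 errors out of 30 requests reports a low throughput because only the 4 successes count toward the numerator.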

1-provider-non-streaming

1 providers, 1 users, 20 requests, concurrency=5, streaming=false

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 1 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 20 |
| Success | 4 |
| Errors | 16 |
| Total Duration | 2.852s |
| Throughput | 1.4 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 4 | 2.849s | 2.85s | 2.852s | 2.852s |
| parse | 4 | 22µs | 24µs | 34µs | 34µs |
| reserve | 4 | 3ms | 3ms | 4ms | 4ms |
| route | 4 | 391µs | 402µs | 416µs | 416µs |
| coordinator_to_provider | 4 | 2.02s | 2.021s | 2.022s | 2.022s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=21.5µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=34µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=2.947ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=4.375ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |
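
The assertion names in these reports take two shapes: `segment:stat<=threshold` and `segment:present`. The following is a minimal sketch of how such a spec could be evaluated. It is not the benchmark harness's code: `parse_duration`, `evaluate`, and the layout of the `stats` dictionary are all hypothetical, and treating a missing segment as a failure for every check is a simplifying assumption.

```python
import re

# Seconds per unit for the duration suffixes that appear in the report.
UNITS = {"ns": 1e-9, "µs": 1e-6, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def parse_duration(text: str) -> float:
    """Convert a simple duration such as '5ms' or '119µs' to seconds."""
    match = re.fullmatch(r"([0-9.]+)(ns|µs|us|ms|s)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def evaluate(spec: str, stats: dict[str, dict[str, float]]) -> bool:
    """Evaluate 'segment:present' or 'segment:stat<=limit'; stats values are in seconds."""
    segment, _, check = spec.partition(":")
    if segment not in stats:
        # Assumption: a segment with no data fails every check, including 'present'.
        return False
    if check == "present":
        return True
    stat, _, limit = check.partition("<=")
    return stats[segment][stat] <= parse_duration(limit)
```

With stats containing only `parse` and `reserve`, the four threshold checks pass while `encrypt:present` and `dispatch:present` fail, which matches the pattern that makes each report above end in FAIL.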

7-provider-multi-model

7 providers, 5 users, 50 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 4 | 0.5 GB |
| mlx-community/gemma-3-270m-4bit | 3 | 0.2 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 50 |
| Success | 50 |
| Errors | 0 |
| Total Duration | 9.826s |
| Throughput | 5.1 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 50 | 603ms | 3ms | 3.529s | 3.579s |
| parse | 48 | 27µs | 18µs | 44µs | 341µs |
| reserve | 48 | 2ms | 1ms | 4ms | 5ms |
| route | 48 | 0s | 0s | 1ms | 1ms |
| coordinator_to_provider | 50 | 498ms | 1ms | 3.52s | 3.567s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=26.52µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=44µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.511083ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=4.051ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |
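
The Latency Decomposition tables report P50 and P95 per segment, but the report does not say which percentile definition the harness uses. One common choice is the nearest-rank method, sketched below as an assumption (`nearest_rank` is a hypothetical helper). Under it, P95 over 4 samples is the largest sample, which agrees with the 4-sample rows in these reports where P95 equals Max.

```python
import math


def nearest_rank(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the sample at 1-based rank ceil(pct/100 * n) in sorted order."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Clamp to rank 1 so that very small percentiles still return the minimum.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

With so few successful requests in the single-provider scenarios, tail percentiles collapse onto the maximum, so the P95 columns there carry little information beyond Max.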

3-provider-high-concurrency

3 providers, 10 users, 60 requests, concurrency=20, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 3 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 60 |
| Success | 12 |
| Errors | 48 |
| Total Duration | 3.263s |
| Throughput | 3.7 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 12 | 2.163s | 2.16s | 2.175s | 2.175s |
| parse | 12 | 15µs | 14µs | 23µs | 23µs |
| reserve | 12 | 3ms | 3ms | 5ms | 5ms |
| route | 12 | 1ms | 1ms | 1ms | 1ms |
| coordinator_to_provider | 12 | 2.154s | 2.151s | 2.168s | 2.168s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=14.833µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=23µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=3.28475ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=5.253ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

1-provider-queue-saturation

1 providers, 10 users, 40 requests, concurrency=15, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 1 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 40 |
| Success | 4 |
| Errors | 36 |
| Total Duration | 3.073s |
| Throughput | 1.3 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 4 | 2.141s | 2.141s | 2.141s | 2.141s |
| parse | 4 | 18µs | 18µs | 25µs | 25µs |
| reserve | 4 | 2ms | 2ms | 2ms | 2ms |
| route | 4 | 411µs | 446µs | 510µs | 510µs |
| coordinator_to_provider | 4 | 2.135s | 2.135s | 2.135s | 2.135s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=18µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=25µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.95175ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=2.106ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

3-provider-20-users

3 providers, 20 users, 60 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 3 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 60 |
| Success | 60 |
| Errors | 0 |
| Total Duration | 5.317s |
| Throughput | 11.3 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 60 | 374ms | 4ms | 2.245s | 2.245s |
| parse | 60 | 16µs | 14µs | 28µs | 41µs |
| reserve | 60 | 1ms | 1ms | 2ms | 3ms |
| route | 60 | 404µs | 387µs | 663µs | 796µs |
| coordinator_to_provider | 60 | 371ms | 2ms | 2.238s | 2.24s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=15.516µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=28µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.194366ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=2.36ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

1-provider-scaling

1 providers, 5 users, 30 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 1 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 30 |
| Success | 4 |
| Errors | 26 |
| Total Duration | 2.782s |
| Throughput | 1.4 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 4 | 2.056s | 2.056s | 2.056s | 2.056s |
| parse | 4 | 59µs | 60µs | 89µs | 89µs |
| reserve | 4 | 5ms | 5ms | 6ms | 6ms |
| route | 4 | 1ms | 1ms | 1ms | 1ms |
| coordinator_to_provider | 4 | 2.04s | 2.04s | 2.041s | 2.041s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=59.25µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=89µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=5.0075ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=5.745ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

3-provider-scaling

3 providers, 5 users, 30 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 3 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 30 |
| Success | 22 |
| Errors | 8 |
| Total Duration | 3.924s |
| Throughput | 5.6 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 22 | 986ms | 9ms | 2.178s | 2.178s |
| parse | 22 | 17µs | 14µs | 27µs | 62µs |
| reserve | 22 | 2ms | 2ms | 3ms | 5ms |
| route | 22 | 1ms | 0s | 1ms | 1ms |
| coordinator_to_provider | 22 | 981ms | 6ms | 2.171s | 2.171s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=16.909µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=27µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.91059ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=3.424ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

5-provider-scaling

5 providers, 5 users, 30 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 5 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 30 |
| Success | 30 |
| Errors | 0 |
| Total Duration | 3.604s |
| Throughput | 8.3 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 30 | 660ms | 3ms | 1.988s | 1.988s |
| parse | 30 | 18µs | 17µs | 35µs | 48µs |
| reserve | 30 | 2ms | 1ms | 4ms | 4ms |
| route | 30 | 542µs | 479µs | 887µs | 954µs |
| coordinator_to_provider | 30 | 654ms | 1ms | 1.974s | 1.976s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=18.366µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=35µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.8584ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=4.118ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

3-provider-heavy-100conc-10kb

3 providers, 20 users, 100 requests, concurrency=100, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 3 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 100 |
| Success | 13 |
| Errors | 87 |
| Total Duration | 3.395s |
| Throughput | 3.8 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 13 | 2.307s | 2.23s | 3.242s | 3.242s |
| parse | 13 | 75µs | 76µs | 106µs | 106µs |
| reserve | 13 | 7ms | 7ms | 9ms | 9ms |
| route | 13 | 263ms | 18ms | 3.213s | 3.213s |
| queue_wait | 1 | 3.213s | 3.213s | 3.213s | 3.213s |
| dispatch | 1 | 46µs | 46µs | 46µs | 46µs |
| coordinator_to_provider | 13 | 2.025s | 2.194s | 2.201s | 2.201s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=74.769µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=106µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=7.308384ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=9.243ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:mean<=5ms | PASS | mean=46µs (threshold=5ms) |
| dispatch:p95<=50ms | PASS | p95=46µs (threshold=50ms) |

Comment thread README.md Outdated
Comment thread docs/ARCHITECTURE.md Outdated
Comment thread docs/ARCHITECTURE.md Outdated
Comment thread docs/ARCHITECTURE.md Outdated
Comment thread docs/ARCHITECTURE.md Outdated
Comment thread CONTRIBUTING.md
## Project Layout

See [CLAUDE.md](CLAUDE.md) for the full layout and architectural decisions. The short version:
For the full layout and architectural decisions, see [`CLAUDE.md`](CLAUDE.md).
Contributor

CLAUDE.md is for agents; we should decouple the routing for humans and agents here.

Comment thread CONTRIBUTING.md
| Rust (stable) | Latest | Legacy provider |
| Swift | 5.9+ (Xcode 15+) | Swift provider, enclave, macOS app |
| Node.js | 20+ | Console UI |
| Python | 3.11+ | Image bridge, crypto interop tests |
Contributor

The image bridge is gone.

Comment thread README.md
Comment on lines 126 to 136
@@ -38,66 +135,84 @@ Models are selected from a curated catalog. The coordinator only routes requests
| Qwen3.5 122B MoE 8-bit | 122B MoE, 10B active | 122 GB | 128 GB | Best quality reasoning |
| MiniMax M2.5 8-bit | 239B MoE, 11B active | 243 GB | 256 GB | SOTA coding, ~100 tok/s |
Contributor

This is going to be dynamic and change a bunch; it's easiest to host it via a dashboard and include a link in the future. @hankbobtheresearchoor, please file an issue on Linear about this.

Comment thread README.md Outdated
| MiniMax M2.5 | $0.06 / 1M tokens | $0.50 / 1M tokens |

0% platform fee. Providers keep 100%.
**0% platform fee. Providers keep 100% of revenue.**
Contributor

I don't think that's correct. I get that this is a structural change, but you fail to identify the semantic discrepancies.

Comment thread README.md Outdated
**0% platform fee. Providers keep 100% of revenue.**

## Architecture
Payments settled via **Solana USDC** on-chain.
Contributor

Delete this; Solana support is no longer relevant for this project.

Contributor

Did you delete it?

Author

Done — removed all Solana/USDC/BIP39 references across README.md, docs/ARCHITECTURE.md, and docs/dev-environment.md in b2c5380.


github-actions Bot commented May 19, 2026

Benchmark Results

Runner: macos-15 (M1 Virtual) | Date: 2026-05-19 22:13 UTC

1-provider-streaming

1 providers, 1 users, 30 requests, concurrency=5, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 1 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 30 |
| Success | 4 |
| Errors | 26 |
| Total Duration | 3.831s |
| Throughput | 1.0 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 4 | 1.736s | 1.736s | 1.736s | 1.736s |
| parse | 4 | 31µs | 25µs | 58µs | 58µs |
| reserve | 4 | 4ms | 5ms | 7ms | 7ms |
| route | 4 | 694µs | 753µs | 844µs | 844µs |
| coordinator_to_provider | 4 | 1.725s | 1.726s | 1.727s | 1.727s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=31.25µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=58µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=4.39725ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=6.617ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

1-provider-non-streaming

1 providers, 1 users, 20 requests, concurrency=5, streaming=false

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 1 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 20 |
| Success | 4 |
| Errors | 16 |
| Total Duration | 2.363s |
| Throughput | 1.7 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 4 | 2.36s | 2.361s | 2.363s | 2.363s |
| parse | 4 | 23µs | 24µs | 25µs | 25µs |
| reserve | 4 | 3ms | 4ms | 5ms | 5ms |
| route | 4 | 488µs | 508µs | 511µs | 511µs |
| coordinator_to_provider | 4 | 1.777s | 1.777s | 1.778s | 1.778s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=22.5µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=25µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=3.23325ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=4.813ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

7-provider-multi-model

7 providers, 5 users, 50 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 4 | 0.5 GB |
| mlx-community/gemma-3-270m-4bit | 3 | 0.2 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 50 |
| Success | 50 |
| Errors | 0 |
| Total Duration | 9.471s |
| Throughput | 5.3 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 50 | 540ms | 3ms | 3.254s | 3.269s |
| parse | 50 | 21µs | 17µs | 52µs | 81µs |
| reserve | 50 | 2ms | 1ms | 5ms | 5ms |
| route | 50 | 1ms | 0s | 1ms | 4ms |
| coordinator_to_provider | 50 | 535ms | 1ms | 3.238s | 3.253s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=21.06µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=52µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.53142ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=4.995ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

3-provider-high-concurrency

3 providers, 10 users, 60 requests, concurrency=20, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 3 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 60 |
| Success | 12 |
| Errors | 48 |
| Total Duration | 3.396s |
| Throughput | 3.5 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 12 | 2.198s | 2.204s | 2.222s | 2.222s |
| parse | 12 | 19µs | 13µs | 72µs | 72µs |
| reserve | 12 | 3ms | 2ms | 4ms | 4ms |
| route | 12 | 566µs | 537µs | 854µs | 854µs |
| coordinator_to_provider | 12 | 2.189s | 2.196s | 2.214s | 2.214s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=19µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=72µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=2.579666ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=3.552ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

1-provider-queue-saturation

1 providers, 10 users, 40 requests, concurrency=15, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 1 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 40 |
| Success | 4 |
| Errors | 36 |
| Total Duration | 3.366s |
| Throughput | 1.2 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 4 | 2.235s | 2.235s | 2.235s | 2.235s |
| parse | 4 | 14µs | 17µs | 22µs | 22µs |
| reserve | 4 | 2ms | 3ms | 3ms | 3ms |
| route | 4 | 715µs | 794µs | 902µs | 902µs |
| coordinator_to_provider | 4 | 2.229s | 2.229s | 2.229s | 2.229s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=14.25µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=22µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=2.3285ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=2.543ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

3-provider-20-users

3 providers, 20 users, 60 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 3 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 60 |
| Success | 60 |
| Errors | 0 |
| Total Duration | 5.187s |
| Throughput | 11.6 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 60 | 361ms | 5ms | 2.15s | 2.151s |
| parse | 60 | 17µs | 16µs | 28µs | 39µs |
| reserve | 60 | 1ms | 1ms | 2ms | 3ms |
| route | 60 | 429µs | 399µs | 709µs | 846µs |
| coordinator_to_provider | 60 | 358ms | 2ms | 2.143s | 2.145s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=16.866µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=28µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.266933ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=2.32ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

1-provider-scaling

1 providers, 5 users, 30 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 1 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 30 |
| Success | 4 |
| Errors | 26 |
| Total Duration | 3.54s |
| Throughput | 1.1 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 4 | 2.423s | 2.423s | 2.423s | 2.423s |
| parse | 4 | 49µs | 65µs | 85µs | 85µs |
| reserve | 4 | 5ms | 5ms | 6ms | 6ms |
| route | 4 | 1ms | 1ms | 1ms | 1ms |
| coordinator_to_provider | 4 | 2.411s | 2.411s | 2.412s | 2.412s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=48.5µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=85µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=5.433ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=6.143ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

3-provider-scaling

3 providers, 5 users, 30 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 3 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 30 |
| Success | 30 |
| Errors | 0 |
| Total Duration | 4.002s |
| Throughput | 7.5 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 30 | 727ms | 6ms | 2.196s | 2.196s |
| parse | 30 | 16µs | 14µs | 38µs | 57µs |
| reserve | 30 | 2ms | 1ms | 5ms | 5ms |
| route | 30 | 0s | 0s | 1ms | 1ms |
| coordinator_to_provider | 30 | 723ms | 4ms | 2.189s | 2.191s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=16.4µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=38µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.806133ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=4.76ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

5-provider-scaling

5 providers, 5 users, 30 requests, concurrency=10, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 5 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 30 |
| Success | 30 |
| Errors | 0 |
| Total Duration | 4.088s |
| Throughput | 7.3 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 30 | 758ms | 4ms | 2.301s | 2.302s |
| parse | 30 | 18µs | 15µs | 32µs | 57µs |
| reserve | 30 | 2ms | 1ms | 5ms | 5ms |
| route | 30 | 439µs | 445µs | 585µs | 675µs |
| coordinator_to_provider | 30 | 753ms | 1ms | 2.292s | 2.293s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=18.233µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=32µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=1.989433ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=5.294ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:present | FAIL | no data for segment dispatch |

3-provider-heavy-100conc-10kb

3 providers, 20 users, 100 requests, concurrency=100, streaming=true

| Model | Providers | RAM |
| --- | --- | --- |
| mlx-community/Qwen3.5-0.8B-MLX-4bit | 3 | 0.5 GB |

| Metric | Value |
| --- | --- |
| Total Requests | 100 |
| Success | 13 |
| Errors | 87 |
| Total Duration | 3.54s |
| Throughput | 3.7 req/s |

Latency Decomposition

| Segment | Count | Mean | P50 | P95 | Max |
| --- | --- | --- | --- | --- | --- |
| total_e2e | 13 | 2.375s | 2.296s | 3.352s | 3.352s |
| parse | 13 | 93µs | 90µs | 163µs | 163µs |
| reserve | 13 | 8ms | 8ms | 9ms | 9ms |
| route | 13 | 272ms | 18ms | 3.314s | 3.314s |
| queue_wait | 1 | 3.314s | 3.314s | 3.314s | 3.314s |
| dispatch | 1 | 70µs | 70µs | 70µs | 70µs |
| coordinator_to_provider | 13 | 2.085s | 2.258s | 2.27s | 2.27s |

Assertion Report: FAIL

| Assertion | Result | Detail |
| --- | --- | --- |
| parse:mean<=1ms | PASS | mean=93.23µs (threshold=1ms) |
| parse:p95<=5ms | PASS | p95=163µs (threshold=5ms) |
| reserve:mean<=50ms | PASS | mean=8.092923ms (threshold=50ms) |
| reserve:p95<=200ms | PASS | p95=8.787ms (threshold=200ms) |
| encrypt:present | FAIL | no data for segment encrypt |
| dispatch:mean<=5ms | PASS | mean=70µs (threshold=5ms) |
| dispatch:p95<=50ms | PASS | p95=70µs (threshold=50ms) |

Author

devin-ai-integration Bot left a comment

Devin Review found 1 potential issue.

View 4 additional findings in Devin Review.

Comment thread docs/telemetry.md
(configurable via `EIGENINFERENCE_TELEMETRY_PRUNE_INTERVAL`). Events older
than 14 days (configurable via `EIGENINFERENCE_TELEMETRY_MAX_AGE`) are
deleted.
`coordinator/telemetry/retention.go` runs a prune loop hourly (configurable via `EIGENINFERENCE_TELEMETRY_PRUNE_INTERVAL`). Events older than 14 days (configurable via `EIGENINFERENCE_TELEMETRY_MAX_AGE`) are deleted.
Author

🟡 Telemetry retention docs reference a nonexistent file and describe removed functionality

The updated docs/telemetry.md:99 states that coordinator/telemetry/retention.go runs a prune loop hourly. This file does not exist — the only file in coordinator/telemetry/ is emitter.go. Furthermore, the described prune-loop functionality has been removed entirely; coordinator/cmd/coordinator/main.go:497 explicitly notes "Telemetry retention is handled by Datadog; no local retention loop needed." The env vars EIGENINFERENCE_TELEMETRY_PRUNE_INTERVAL and EIGENINFERENCE_TELEMETRY_MAX_AGE referenced in this line also do not appear anywhere in the codebase. This will mislead contributors looking for retention configuration.

Suggested change
`coordinator/telemetry/retention.go` runs a prune loop hourly (configurable via `EIGENINFERENCE_TELEMETRY_PRUNE_INTERVAL`). Events older than 14 days (configurable via `EIGENINFERENCE_TELEMETRY_MAX_AGE`) are deleted.
Telemetry retention is handled by **Datadog** — no local prune loop runs in the coordinator. Datadog handles durable persistence, querying, and retention policies.
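
The kind of drift flagged in this finding (docs citing a source file and environment variables that are not in the tree) can be caught with a simple scan. The sketch below is an illustration under assumptions, not part of the repository: `stale_references` is a hypothetical helper, it only looks at backticked spans, it only searches `.go` sources, and the patterns for "looks like a Go path" and "looks like an env var" are heuristics.

```python
import re
from pathlib import Path

# Inline code spans in markdown: `something`
CODE_SPAN = re.compile(r"`([^`]+)`")
# Heuristic: a relative path ending in .go
GO_PATH = re.compile(r"[\w./-]+\.go")
# Heuristic: upper-case words joined by underscores, e.g. FOO_BAR_BAZ
ENV_VAR = re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+")


def stale_references(doc_text: str, repo_root: Path) -> list[str]:
    """Return backticked Go paths and env var names in doc_text not found under repo_root."""
    sources = [
        p.read_text(encoding="utf-8", errors="ignore")
        for p in repo_root.rglob("*.go")
    ]
    stale = []
    for span in CODE_SPAN.findall(doc_text):
        if GO_PATH.fullmatch(span):
            # A cited file must exist at that path relative to the repo root.
            if not (repo_root / span).is_file():
                stale.append(span)
        elif ENV_VAR.fullmatch(span):
            # A cited env var must appear somewhere in the Go sources.
            if not any(span in src for src in sources):
                stale.append(span)
    return stale
```

Applied to the retention paragraph quoted above, a check like this would be expected to report the `retention.go` path and both `EIGENINFERENCE_TELEMETRY_*` variables, given the finding's statement that none of them exist in the codebase.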