docs: revamp README and docs with production-grade styling by devin-ai-integration[bot] · Pull Request #188 · Layr-Labs/d-inference

devin-ai-integration · 2026-05-19T16:38:30Z

Summary

Revamps README.md, CONTRIBUTING.md, docs/ARCHITECTURE.md, docs/telemetry.md, and docs/dev-environment.md with polished, production-grade formatting. Rebrands all user-facing documentation from "EigenInference" to "Darkbloom". Adds centered headers, CI/release badges, table of contents, structured tables, cURL examples, and comprehensive API/CLI reference sections.

Linked issue

N/A — docs improvement

Test plan

Verified all internal markdown links resolve correctly
Confirmed no EigenInference references remain in user-facing docs (API key prefixes and config paths like eigeninference- and ~/.config/eigeninference/ are intentionally preserved)
Reviewed rendered markdown for table formatting, code blocks, and badge rendering
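
The link check in the test plan could be automated along these lines. This is a minimal sketch, not the check that was actually run for this PR: the function name `broken_internal_links` and the regex are illustrative assumptions, and only inline `[text](target)` links are handled.

```python
import re
from pathlib import Path

# Inline markdown links of the form [text](target); image links (![...]) are skipped.
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)\)")
# Anything with a URL scheme (https:, mailto:, ...) counts as external.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_internal_links(doc_path: Path) -> list[str]:
    """Return relative link targets in doc_path that do not exist on disk."""
    broken = []
    text = doc_path.read_text(encoding="utf-8")
    for target in LINK_RE.findall(text):
        # External URLs and pure in-page anchors are out of scope here.
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue
        # Drop a trailing "#section" before testing the file itself.
        file_part = target.split("#", 1)[0]
        if not (doc_path.parent / file_part).exists():
            broken.append(target)
    return broken
```

Running it over each of the five rewritten files and failing on a non-empty result would turn the manual check into a repeatable one. Anchor targets inside a file are not validated by this sketch.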

Components touched

Protocol / interface changes

No protocol/interface changes
Yes — described above and matching side updated

Notes for reviewers

All five docs files were rewritten for consistency: table of contents, horizontal rule separators, table-based layouts for reference data, and consistent heading hierarchy.
The "EigenInference" → "Darkbloom" rename only applies to user-facing prose. Internal identifiers (eigeninference- API key prefix, ~/.config/eigeninference/ config path, EIGENINFERENCE_* env vars) are intentionally untouched since they're part of the running system.
README now includes a cURL example alongside the Python SDK example, a full endpoint table, and a hardware compatibility matrix.
Architecture doc now has a detailed box-diagram with coordinator subsystems and provider internals.
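
The rename rule described in these notes (old brand removed from prose, internal identifiers kept) can be expressed as a small audit. The sketch below is an assumption about how one might check it, not the procedure used for this PR; `leftover_brand_mentions` is a hypothetical helper, and the allowlist covers only the three identifier shapes named above.

```python
import re

# Identifier shapes the notes say are intentionally preserved.
ALLOWED = re.compile(
    r"eigeninference-"              # API key prefix
    r"|\.config/eigeninference/"    # config path
    r"|EIGENINFERENCE_[A-Z0-9_]*"   # environment variables
)
# Any remaining mention of the old brand, in any letter case.
BRAND = re.compile(r"eigeninference", re.IGNORECASE)


def leftover_brand_mentions(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still mention the old brand in prose."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Remove the allowed identifiers first, then look for what is left.
        remainder = ALLOWED.sub("", line)
        if BRAND.search(remainder):
            hits.append((number, line))
    return hits
```

Because the allowlist patterns are case-sensitive, a prose mention such as "EigenInference" is still reported even when it is followed by a hyphen.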

Link to Devin session: https://app.devin.ai/sessions/10a1fa11eeac4d6aa78f5e92f937fc4b


- Rewrite README.md with centered header, badges, table of contents, structured sections, cURL examples, and full API/CLI reference
- Rebrand all user-facing docs from EigenInference to Darkbloom
- Restructure docs/ARCHITECTURE.md with detailed system diagram, component tables, and organized security/privacy sections
- Polish CONTRIBUTING.md with table-based layout, prerequisites matrix, and clear protocol-change sync-point reference
- Improve docs/telemetry.md with table of contents, structured endpoint/emission-site tables, and explicit allowlist documentation
- Reformat docs/dev-environment.md with infrastructure overview table, numbered setup steps, secrets mapping table, and cost breakdown

devin-ai-integration · 2026-05-19T16:38:33Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


vercel · 2026-05-19T16:38:35Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
d-inference	Ready	Preview	May 19, 2026 10:12pm
d-inference-console-ui-dev	Ready	Preview	May 19, 2026 10:12pm
d-inference-landing	Ready	Preview	May 19, 2026 10:12pm

github-actions · 2026-05-19T16:49:11Z

Benchmark Results

Runner: macos-15 (M1 Virtual) | Date: 2026-05-19 17:54 UTC

1-provider-streaming

1 providers, 1 users, 30 requests, concurrency=5, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	1	0.5 GB

Metric	Value
Total Requests	30
Success	4
Errors	26
Total Duration	4.301s
Throughput	0.9 req/s
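
A note on reading the throughput row: the reported figure appears to be successful requests divided by total duration, not total requests divided by duration. That is inferred from the numbers in these reports (4 / 4.301 s ≈ 0.9, whereas 30 / 4.301 s ≈ 7.0), not from the benchmark harness source, so treat the helper below as a hypothetical reconstruction.

```python
def throughput(successes: int, duration_s: float) -> float:
    """Successful requests per second, rounded to one decimal as in the report."""
    return round(successes / duration_s, 1)


# Values taken from the reports in this thread.
print(throughput(4, 4.301))   # 1-provider-streaming
print(throughput(50, 9.826))  # 7-provider-multi-model
```

Under this reading, the low throughput in the single-provider scenarios reflects the error count rather than slow responses.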

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	4	2.107s	2.107s	2.107s	2.107s
parse	4	73µs	81µs	119µs	119µs
reserve	4	3ms	4ms	5ms	5ms
route	4	420µs	489µs	495µs	495µs
coordinator_to_provider	4	2.099s	2.1s	2.101s	2.101s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=72.75µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=119µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=3.47825ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=5.043ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch
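
The assertion rows above follow two shapes: `segment:metric<=threshold` and `segment:present`. The sketch below shows one way such specs could be evaluated against per-segment statistics. It is an illustration under assumptions, not the benchmark harness's code: `evaluate`, the stats layout, and the detail strings are hypothetical, and all durations are held in seconds.

```python
import re

UNIT_SECONDS = {"µs": 1e-6, "ms": 1e-3, "s": 1.0}
SPEC_RE = re.compile(r"^(\w+):(mean|p95)<=([\d.]+)(µs|ms|s)$")


def evaluate(spec: str, stats: dict[str, dict[str, float]]) -> tuple[bool, str]:
    """Evaluate one assertion spec; stats maps segment -> {"mean": s, "p95": s}."""
    segment, _, rule = spec.partition(":")
    # A segment with no samples fails every assertion that names it.
    if segment not in stats:
        return False, f"no data for segment {segment}"
    if rule == "present":
        return True, "segment present"
    match = SPEC_RE.match(spec)
    if match is None:
        raise ValueError(f"unrecognised assertion: {spec}")
    _, metric, number, unit = match.groups()
    threshold = float(number) * UNIT_SECONDS[unit]
    value = stats[segment][metric]
    return value <= threshold, f"{metric}={value}s (threshold={number}{unit})"
```

With the `parse` and `reserve` figures from this run, the latency assertions pass while `encrypt:present` and `dispatch:present` fail because those segments recorded no samples, which is what drives the overall FAIL.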

1-provider-non-streaming

1 providers, 1 users, 20 requests, concurrency=5, streaming=false

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	1	0.5 GB

Metric	Value
Total Requests	20
Success	4
Errors	16
Total Duration	2.852s
Throughput	1.4 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	4	2.849s	2.85s	2.852s	2.852s
parse	4	22µs	24µs	34µs	34µs
reserve	4	3ms	3ms	4ms	4ms
route	4	391µs	402µs	416µs	416µs
coordinator_to_provider	4	2.02s	2.021s	2.022s	2.022s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=21.5µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=34µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=2.947ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=4.375ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

7-provider-multi-model

7 providers, 5 users, 50 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	4	0.5 GB
mlx-community/gemma-3-270m-4bit	3	0.2 GB

Metric	Value
Total Requests	50
Success	50
Errors	0
Total Duration	9.826s
Throughput	5.1 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	50	603ms	3ms	3.529s	3.579s
parse	48	27µs	18µs	44µs	341µs
reserve	48	2ms	1ms	4ms	5ms
route	48	0s	0s	1ms	1ms
coordinator_to_provider	50	498ms	1ms	3.52s	3.567s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=26.52µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=44µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.511083ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=4.051ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

3-provider-high-concurrency

3 providers, 10 users, 60 requests, concurrency=20, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	3	0.5 GB

Metric	Value
Total Requests	60
Success	12
Errors	48
Total Duration	3.263s
Throughput	3.7 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	12	2.163s	2.16s	2.175s	2.175s
parse	12	15µs	14µs	23µs	23µs
reserve	12	3ms	3ms	5ms	5ms
route	12	1ms	1ms	1ms	1ms
coordinator_to_provider	12	2.154s	2.151s	2.168s	2.168s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=14.833µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=23µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=3.28475ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=5.253ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

1-provider-queue-saturation

1 providers, 10 users, 40 requests, concurrency=15, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	1	0.5 GB

Metric	Value
Total Requests	40
Success	4
Errors	36
Total Duration	3.073s
Throughput	1.3 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	4	2.141s	2.141s	2.141s	2.141s
parse	4	18µs	18µs	25µs	25µs
reserve	4	2ms	2ms	2ms	2ms
route	4	411µs	446µs	510µs	510µs
coordinator_to_provider	4	2.135s	2.135s	2.135s	2.135s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=18µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=25µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.95175ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=2.106ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

3-provider-20-users

3 providers, 20 users, 60 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	3	0.5 GB

Metric	Value
Total Requests	60
Success	60
Errors	0
Total Duration	5.317s
Throughput	11.3 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	60	374ms	4ms	2.245s	2.245s
parse	60	16µs	14µs	28µs	41µs
reserve	60	1ms	1ms	2ms	3ms
route	60	404µs	387µs	663µs	796µs
coordinator_to_provider	60	371ms	2ms	2.238s	2.24s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=15.516µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=28µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.194366ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=2.36ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

1-provider-scaling

1 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	1	0.5 GB

Metric	Value
Total Requests	30
Success	4
Errors	26
Total Duration	2.782s
Throughput	1.4 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	4	2.056s	2.056s	2.056s	2.056s
parse	4	59µs	60µs	89µs	89µs
reserve	4	5ms	5ms	6ms	6ms
route	4	1ms	1ms	1ms	1ms
coordinator_to_provider	4	2.04s	2.04s	2.041s	2.041s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=59.25µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=89µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=5.0075ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=5.745ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

3-provider-scaling

3 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	3	0.5 GB

Metric	Value
Total Requests	30
Success	22
Errors	8
Total Duration	3.924s
Throughput	5.6 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	22	986ms	9ms	2.178s	2.178s
parse	22	17µs	14µs	27µs	62µs
reserve	22	2ms	2ms	3ms	5ms
route	22	1ms	0s	1ms	1ms
coordinator_to_provider	22	981ms	6ms	2.171s	2.171s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=16.909µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=27µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.91059ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=3.424ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

5-provider-scaling

5 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	5	0.5 GB

Metric	Value
Total Requests	30
Success	30
Errors	0
Total Duration	3.604s
Throughput	8.3 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	30	660ms	3ms	1.988s	1.988s
parse	30	18µs	17µs	35µs	48µs
reserve	30	2ms	1ms	4ms	4ms
route	30	542µs	479µs	887µs	954µs
coordinator_to_provider	30	654ms	1ms	1.974s	1.976s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=18.366µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=35µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.8584ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=4.118ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

3-provider-heavy-100conc-10kb

3 providers, 20 users, 100 requests, concurrency=100, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	3	0.5 GB

Metric	Value
Total Requests	100
Success	13
Errors	87
Total Duration	3.395s
Throughput	3.8 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	13	2.307s	2.23s	3.242s	3.242s
parse	13	75µs	76µs	106µs	106µs
reserve	13	7ms	7ms	9ms	9ms
route	13	263ms	18ms	3.213s	3.213s
queue_wait	1	3.213s	3.213s	3.213s	3.213s
dispatch	1	46µs	46µs	46µs	46µs
coordinator_to_provider	13	2.025s	2.194s	2.201s	2.201s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=74.769µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=106µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=7.308384ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=9.243ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:mean<=5ms	PASS	mean=46µs (threshold=5ms)
dispatch:p95<=50ms	PASS	p95=46µs (threshold=50ms)

ethenotethan · 2026-05-19T17:34:42Z

+## Project Layout

-See [CLAUDE.md](CLAUDE.md) for the full layout and architectural decisions. The short version:
+For the full layout and architectural decisions, see [`CLAUDE.md`](CLAUDE.md).


CLAUDE.md is for agents; we should decouple routing between humans and agents here.

ethenotethan · 2026-05-19T17:34:57Z

+| Rust (stable) | Latest | Legacy provider |
+| Swift | 5.9+ (Xcode 15+) | Swift provider, enclave, macOS app |
+| Node.js | 20+ | Console UI |
+| Python | 3.11+ | Image bridge, crypto interop tests |


image bridge gone

ethenotethan · 2026-05-19T17:36:45Z

@@ -38,66 +135,84 @@ Models are selected from a curated catalog. The coordinator only routes requests
 | Qwen3.5 122B MoE 8-bit | 122B MoE, 10B active | 122 GB | 128 GB | Best quality reasoning |
 | MiniMax M2.5 8-bit | 239B MoE, 11B active | 243 GB | 256 GB | SOTA coding, ~100 tok/s |


This is going to be dynamic and change a lot; it's easiest to host it via a dashboard and link to it in the future. @hankbobtheresearchoor, please file an issue on Linear about this.

ethenotethan · 2026-05-19T17:39:01Z

+| MiniMax M2.5 | $0.06 / 1M tokens | $0.50 / 1M tokens |

-0% platform fee. Providers keep 100%.
+**0% platform fee. Providers keep 100% of revenue.**


I don't think that's correct. I get that this is a structural change, but you fail to identify the semantic discrepancies.

ethenotethan · 2026-05-19T17:39:19Z

+**0% platform fee. Providers keep 100% of revenue.**

-## Architecture
+Payments settled via **Solana USDC** on-chain.


delete, Solana support is no longer relevant for this project

did you delete?

Done — removed all Solana/USDC/BIP39 references across README.md, docs/ARCHITECTURE.md, and docs/dev-environment.md in b2c5380.


github-actions · 2026-05-19T17:57:38Z

Benchmark Results

Runner: macos-15 (M1 Virtual) | Date: 2026-05-19 22:13 UTC

1-provider-streaming

1 providers, 1 users, 30 requests, concurrency=5, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	1	0.5 GB

Metric	Value
Total Requests	30
Success	4
Errors	26
Total Duration	3.831s
Throughput	1.0 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	4	1.736s	1.736s	1.736s	1.736s
parse	4	31µs	25µs	58µs	58µs
reserve	4	4ms	5ms	7ms	7ms
route	4	694µs	753µs	844µs	844µs
coordinator_to_provider	4	1.725s	1.726s	1.727s	1.727s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=31.25µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=58µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=4.39725ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=6.617ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

1-provider-non-streaming

1 providers, 1 users, 20 requests, concurrency=5, streaming=false

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	1	0.5 GB

Metric	Value
Total Requests	20
Success	4
Errors	16
Total Duration	2.363s
Throughput	1.7 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	4	2.36s	2.361s	2.363s	2.363s
parse	4	23µs	24µs	25µs	25µs
reserve	4	3ms	4ms	5ms	5ms
route	4	488µs	508µs	511µs	511µs
coordinator_to_provider	4	1.777s	1.777s	1.778s	1.778s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=22.5µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=25µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=3.23325ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=4.813ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

7-provider-multi-model

7 providers, 5 users, 50 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	4	0.5 GB
mlx-community/gemma-3-270m-4bit	3	0.2 GB

Metric	Value
Total Requests	50
Success	50
Errors	0
Total Duration	9.471s
Throughput	5.3 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	50	540ms	3ms	3.254s	3.269s
parse	50	21µs	17µs	52µs	81µs
reserve	50	2ms	1ms	5ms	5ms
route	50	1ms	0s	1ms	4ms
coordinator_to_provider	50	535ms	1ms	3.238s	3.253s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=21.06µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=52µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.53142ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=4.995ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

3-provider-high-concurrency

3 providers, 10 users, 60 requests, concurrency=20, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	3	0.5 GB

Metric	Value
Total Requests	60
Success	12
Errors	48
Total Duration	3.396s
Throughput	3.5 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	12	2.198s	2.204s	2.222s	2.222s
parse	12	19µs	13µs	72µs	72µs
reserve	12	3ms	2ms	4ms	4ms
route	12	566µs	537µs	854µs	854µs
coordinator_to_provider	12	2.189s	2.196s	2.214s	2.214s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=19µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=72µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=2.579666ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=3.552ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

1-provider-queue-saturation

1 providers, 10 users, 40 requests, concurrency=15, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	1	0.5 GB

Metric	Value
Total Requests	40
Success	4
Errors	36
Total Duration	3.366s
Throughput	1.2 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	4	2.235s	2.235s	2.235s	2.235s
parse	4	14µs	17µs	22µs	22µs
reserve	4	2ms	3ms	3ms	3ms
route	4	715µs	794µs	902µs	902µs
coordinator_to_provider	4	2.229s	2.229s	2.229s	2.229s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=14.25µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=22µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=2.3285ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=2.543ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

3-provider-20-users

3 providers, 20 users, 60 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	3	0.5 GB

Metric	Value
Total Requests	60
Success	60
Errors	0
Total Duration	5.187s
Throughput	11.6 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	60	361ms	5ms	2.15s	2.151s
parse	60	17µs	16µs	28µs	39µs
reserve	60	1ms	1ms	2ms	3ms
route	60	429µs	399µs	709µs	846µs
coordinator_to_provider	60	358ms	2ms	2.143s	2.145s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=16.866µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=28µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.266933ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=2.32ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

1-provider-scaling

1 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	1	0.5 GB

Metric	Value
Total Requests	30
Success	4
Errors	26
Total Duration	3.54s
Throughput	1.1 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	4	2.423s	2.423s	2.423s	2.423s
parse	4	49µs	65µs	85µs	85µs
reserve	4	5ms	5ms	6ms	6ms
route	4	1ms	1ms	1ms	1ms
coordinator_to_provider	4	2.411s	2.411s	2.412s	2.412s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=48.5µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=85µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=5.433ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=6.143ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

3-provider-scaling

3 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	3	0.5 GB

Metric	Value
Total Requests	30
Success	30
Errors	0
Total Duration	4.002s
Throughput	7.5 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	30	727ms	6ms	2.196s	2.196s
parse	30	16µs	14µs	38µs	57µs
reserve	30	2ms	1ms	5ms	5ms
route	30	0s	0s	1ms	1ms
coordinator_to_provider	30	723ms	4ms	2.189s	2.191s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=16.4µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=38µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.806133ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=4.76ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

5-provider-scaling

5 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	5	0.5 GB

Metric	Value
Total Requests	30
Success	30
Errors	0
Total Duration	4.088s
Throughput	7.3 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	30	758ms	4ms	2.301s	2.302s
parse	30	18µs	15µs	32µs	57µs
reserve	30	2ms	1ms	5ms	5ms
route	30	439µs	445µs	585µs	675µs
coordinator_to_provider	30	753ms	1ms	2.292s	2.293s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=18.233µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=32µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=1.989433ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=5.294ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:present	FAIL	no data for segment dispatch

3-provider-heavy-100conc-10kb

3 providers, 20 users, 100 requests, concurrency=100, streaming=true

Model	Providers	RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit	3	0.5 GB

Metric	Value
Total Requests	100
Success	13
Errors	87
Total Duration	3.54s
Throughput	3.7 req/s

Latency Decomposition

Segment	Count	Mean	P50	P95	Max
total_e2e	13	2.375s	2.296s	3.352s	3.352s
parse	13	93µs	90µs	163µs	163µs
reserve	13	8ms	8ms	9ms	9ms
route	13	272ms	18ms	3.314s	3.314s
queue_wait	1	3.314s	3.314s	3.314s	3.314s
dispatch	1	70µs	70µs	70µs	70µs
coordinator_to_provider	13	2.085s	2.258s	2.27s	2.27s

Assertion Report: FAIL

Assertion	Result	Detail
parse:mean<=1ms	PASS	mean=93.23µs (threshold=1ms)
parse:p95<=5ms	PASS	p95=163µs (threshold=5ms)
reserve:mean<=50ms	PASS	mean=8.092923ms (threshold=50ms)
reserve:p95<=200ms	PASS	p95=8.787ms (threshold=200ms)
encrypt:present	FAIL	no data for segment encrypt
dispatch:mean<=5ms	PASS	mean=70µs (threshold=5ms)
dispatch:p95<=50ms	PASS	p95=70µs (threshold=50ms)

devin-ai-integration

Devin Review found 1 potential issue.

View 4 additional findings in Devin Review.

devin-ai-integration · 2026-05-19T17:57:55Z

-(configurable via `EIGENINFERENCE_TELEMETRY_PRUNE_INTERVAL`). Events older
-than 14 days (configurable via `EIGENINFERENCE_TELEMETRY_MAX_AGE`) are
-deleted.
+`coordinator/telemetry/retention.go` runs a prune loop hourly (configurable via `EIGENINFERENCE_TELEMETRY_PRUNE_INTERVAL`). Events older than 14 days (configurable via `EIGENINFERENCE_TELEMETRY_MAX_AGE`) are deleted.


🟡 Telemetry retention docs reference a nonexistent file and describe removed functionality

The updated docs/telemetry.md:99 states that coordinator/telemetry/retention.go runs a prune loop hourly. This file does not exist — the only file in coordinator/telemetry/ is emitter.go. Furthermore, the described prune-loop functionality has been removed entirely; coordinator/cmd/coordinator/main.go:497 explicitly notes "Telemetry retention is handled by Datadog; no local retention loop needed." The env vars EIGENINFERENCE_TELEMETRY_PRUNE_INTERVAL and EIGENINFERENCE_TELEMETRY_MAX_AGE referenced in this line also do not appear anywhere in the codebase. This will mislead contributors looking for retention configuration.

Suggested change

-`coordinator/telemetry/retention.go` runs a prune loop hourly (configurable via `EIGENINFERENCE_TELEMETRY_PRUNE_INTERVAL`). Events older than 14 days (configurable via `EIGENINFERENCE_TELEMETRY_MAX_AGE`) are deleted.
+Telemetry retention is handled by **Datadog** — no local prune loop runs in the coordinator. Datadog handles durable persistence, querying, and retention policies.
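
A finding like this one (docs naming a file that is not in the repository) can be caught mechanically. The following is a rough sketch under assumptions: `missing_code_paths` is a hypothetical helper, and the regex only recognises backticked, slash-separated paths with a file extension, optionally followed by `:line`.

```python
import re
from pathlib import Path

# Backticked tokens that look like repo-relative file paths, e.g. `dir/sub/file.go`.
PATH_RE = re.compile(r"`([\w.-]+(?:/[\w.-]+)+\.\w+)(?::\d+)?`")


def missing_code_paths(markdown: str, repo_root: Path) -> list[str]:
    """Return backticked file paths in the markdown that are absent under repo_root."""
    seen = set()
    missing = []
    for path in PATH_RE.findall(markdown):
        if path in seen:
            continue  # report each path once
        seen.add(path)
        if not (repo_root / path).is_file():
            missing.append(path)
    return missing
```

Environment variable names would need a separate check, for example searching the source tree for each backticked `EIGENINFERENCE_*` token.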


vercel Bot deployed to Preview – d-inference May 19, 2026 16:39 View deployment

vercel Bot deployed to Preview – d-inference-console-ui-dev May 19, 2026 16:39 View deployment

ethenotethan reviewed May 19, 2026

View reviewed changes

docs: address review feedback — fix endpoints, platform fee, tables, …

3189aa5

…paths

vercel Bot deployed to Preview – d-inference-landing May 19, 2026 17:52 View deployment

vercel Bot deployed to Preview – d-inference May 19, 2026 17:52 View deployment

vercel Bot deployed to Preview – d-inference-console-ui-dev May 19, 2026 17:52 View deployment

devin-ai-integration Bot commented May 19, 2026

View reviewed changes

docs: remove Solana/USDC references — no longer relevant

b2c5380

vercel Bot deployed to Preview – d-inference-landing May 19, 2026 22:11 View deployment

vercel Bot deployed to Preview – d-inference May 19, 2026 22:12 View deployment

vercel Bot deployed to Preview – d-inference-console-ui-dev May 19, 2026 22:12 View deployment

