Add admin console dashboard #183

Draft
Gajesh2007 wants to merge 3 commits into master from admin-console-ui

Gajesh2007 (Member) commented May 18, 2026

Summary

  • Add a standalone Next.js admin dashboard under admin-console-ui for coordinator admin operations.
  • Add a server-side admin proxy with coordinator origin allowlisting and session-scoped token storage.
  • Make coordinator admin auth consistent for admin key and Privy admin access, with focused API tests.
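The origin allowlisting mentioned in the summary can be illustrated with a minimal sketch. The PR text does not show the proxy code, and the proxy itself lives in the Next.js app, so this is only a language-neutral illustration written in Go to match the coordinator. The allowlist entries, the URL paths, and the names `allowedOrigins`, `originOf`, and `isAllowedCoordinator` are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// allowedOrigins is a hypothetical allowlist; the PR does not list the
// real coordinator origins.
var allowedOrigins = map[string]bool{
	"https://coordinator.example.com": true,
	"http://localhost:8080":           true,
}

// originOf reduces an absolute URL to scheme://host[:port], lowercased.
func originOf(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("not an absolute URL: %q", raw)
	}
	return strings.ToLower(u.Scheme) + "://" + strings.ToLower(u.Host), nil
}

// isAllowedCoordinator reports whether the target's origin is allowlisted.
// Comparing the parsed origin, rather than a string prefix, rejects
// look-alike hosts and userinfo tricks.
func isAllowedCoordinator(target string) bool {
	origin, err := originOf(target)
	if err != nil {
		return false
	}
	return allowedOrigins[origin]
}

func main() {
	targets := []string{
		"https://coordinator.example.com/v1/admin/users", // hypothetical path
		"https://coordinator.example.com.evil.test/",     // look-alike host
		"https://coordinator.example.com@evil.test/",     // userinfo trick
		"/relative/path",                                 // not absolute
	}
	for _, t := range targets {
		fmt.Printf("%-48s allowed=%v\n", t, isAllowedCoordinator(t))
	}
}
```

Only the first target is accepted. A prefix check such as `strings.HasPrefix` would have let the second and third through, which is the main reason to compare parsed origins.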

Verification

  • npm run lint (admin-console-ui)
  • npm run build (admin-console-ui)
  • go test ./api (coordinator)

Notes

  • Draft PR per request.
  • npm install reported 2 moderate vulnerabilities in the new admin-console-ui dependency tree.

vercel Bot commented May 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project                     Deployment  Actions  Updated (UTC)
d-inference                 Ready       Preview  May 18, 2026 11:23pm
d-inference-console-ui-dev  Ready       Preview  May 18, 2026 11:23pm
d-inference-landing         Ready       Preview  May 18, 2026 11:23pm

github-actions Bot commented May 18, 2026

Benchmark Results

Runner: macos-15 (M1 Virtual) | Date: 2026-05-18 21:06 UTC

1-provider-streaming

1 providers, 1 users, 30 requests, concurrency=5, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 30
Success 1
Errors 29
Total Duration 11.672s
Throughput 0.1 req/s
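The Throughput figures in these reports appear to be successful requests divided by total duration, not total requests divided by duration. That reading is inferred from the numbers and is not stated by the benchmark tool. A minimal sketch that reproduces three rows of the 21:06 UTC run under that assumption (the function name `throughput` is hypothetical):

```go
package main

import (
	"fmt"
	"time"
)

// throughput returns successful requests per second of wall-clock time.
// Assumption: the report divides Success (not Total Requests) by
// Total Duration.
func throughput(successes int, totalDuration string) (float64, error) {
	d, err := time.ParseDuration(totalDuration)
	if err != nil {
		return 0, err
	}
	if d <= 0 {
		return 0, fmt.Errorf("non-positive duration %q", totalDuration)
	}
	return float64(successes) / d.Seconds(), nil
}

func main() {
	// Success and Total Duration copied from the 21:06 UTC run.
	rows := []struct {
		name      string
		successes int
		duration  string
	}{
		{"1-provider-streaming", 1, "11.672s"},
		{"1-provider-non-streaming", 4, "5.178s"},
		{"3-provider-20-users", 59, "16.039s"},
	}
	for _, r := range rows {
		rps, err := throughput(r.successes, r.duration)
		if err != nil {
			fmt.Println(r.name, "error:", err)
			continue
		}
		// Prints 0.1, 0.8 and 3.7 req/s, matching the reported values.
		fmt.Printf("%s: %.1f req/s\n", r.name, rps)
	}
}
```

Read this way, a low throughput value mostly reflects the error count: the first scenario completed only 1 of 30 requests.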

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 1 5.608s 5.608s 5.608s 5.608s
parse 1 22µs 22µs 22µs 22µs
reserve 1 4ms 4ms 4ms 4ms
route 1 826µs 826µs 826µs 826µs
coordinator_to_provider 1 5.557s 5.557s 5.557s 5.557s
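Every row above has Count 1, which is why Mean, P50, P95 and Max are identical: with a single sample each statistic collapses to that sample. The sketch below shows one way such a summary could be computed. The nearest-rank percentile rule and the names `summary`, `millis`, `percentile`, and `summarize` are assumptions; the benchmark tool's actual method is not shown in this PR.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// summary mirrors the columns of the Latency Decomposition table.
type summary struct {
	Count               int
	Mean, P50, P95, Max time.Duration
}

// millis is a small helper that builds durations from millisecond values.
func millis(ms ...int64) []time.Duration {
	out := make([]time.Duration, len(ms))
	for i, v := range ms {
		out[i] = time.Duration(v) * time.Millisecond
	}
	return out
}

// percentile uses the nearest-rank rule on an ascending, non-empty slice.
// Assumption: the real tool may interpolate instead.
func percentile(sorted []time.Duration, p float64) time.Duration {
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// summarize computes count, mean, P50, P95 and max for one segment.
func summarize(samples []time.Duration) summary {
	if len(samples) == 0 {
		return summary{}
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	var sum time.Duration
	for _, d := range sorted {
		sum += d
	}
	return summary{
		Count: len(sorted),
		Mean:  sum / time.Duration(len(sorted)),
		P50:   percentile(sorted, 50),
		P95:   percentile(sorted, 95),
		Max:   sorted[len(sorted)-1],
	}
}

func main() {
	// The single total_e2e sample from the table above.
	one := summarize(millis(5608))
	// Prints: 1 5.608s 5.608s 5.608s 5.608s
	fmt.Println(one.Count, one.Mean, one.P50, one.P95, one.Max)
}
```

With so few successful requests per scenario, the P95 column carries little information. Under nearest-rank it equals Max for any sample count below 20.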

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=22µs (threshold=1ms)
parse:p95<=5ms PASS p95=22µs (threshold=5ms)
reserve:mean<=50ms PASS mean=4.139ms (threshold=50ms)
reserve:p95<=200ms PASS p95=4.139ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch
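The assertion names follow a `segment:statistic<=threshold` or `segment:present` pattern, and the overall report fails if any single assertion fails. Here both failures come from segments with no recorded data, not from a breached latency threshold. A minimal sketch of how such a report could be evaluated follows. The `stats` type and `check` function are hypothetical and are not taken from the benchmark tool.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// stats holds the statistics an assertion can refer to.
type stats struct {
	Mean, P95 time.Duration
}

// check evaluates one assertion such as "parse:mean<=1ms" or
// "encrypt:present" and returns the result plus a detail string.
func check(assertion string, segments map[string]stats) (bool, string) {
	segment, rule, ok := strings.Cut(assertion, ":")
	if !ok {
		return false, "malformed assertion"
	}
	s, found := segments[segment]
	if !found {
		return false, "no data for segment " + segment
	}
	if rule == "present" {
		return true, "data present"
	}
	stat, limit, ok := strings.Cut(rule, "<=")
	if !ok {
		return false, "unsupported rule " + rule
	}
	threshold, err := time.ParseDuration(limit)
	if err != nil {
		return false, "bad threshold " + limit
	}
	var got time.Duration
	switch stat {
	case "mean":
		got = s.Mean
	case "p95":
		got = s.P95
	default:
		return false, "unknown statistic " + stat
	}
	return got <= threshold, fmt.Sprintf("%s=%v (threshold=%v)", stat, got, threshold)
}

func main() {
	// Values from the report above; encrypt and dispatch have no data.
	segments := map[string]stats{
		"parse":   {Mean: 22 * time.Microsecond, P95: 22 * time.Microsecond},
		"reserve": {Mean: 4139 * time.Microsecond, P95: 4139 * time.Microsecond},
	}
	assertions := []string{
		"parse:mean<=1ms", "parse:p95<=5ms",
		"reserve:mean<=50ms", "reserve:p95<=200ms",
		"encrypt:present", "dispatch:present",
	}
	overall := "PASS"
	for _, a := range assertions {
		ok, detail := check(a, segments)
		result := "PASS"
		if !ok {
			result, overall = "FAIL", "FAIL"
		}
		fmt.Println(a, result, detail)
	}
	fmt.Println("Assertion Report:", overall)
}
```

In the 23:24 UTC run, the one scenario that recorded dispatch data lists `dispatch:mean<=5ms` and `dispatch:p95<=50ms` in place of `dispatch:present`. The sketch does not model that substitution.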

1-provider-non-streaming

1 providers, 1 users, 20 requests, concurrency=5, streaming=false

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 20
Success 4
Errors 16
Total Duration 5.178s
Throughput 0.8 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 5.174s 5.175s 5.177s 5.177s
parse 4 59µs 58µs 108µs 108µs
reserve 4 10ms 10ms 13ms 13ms
route 4 1ms 1ms 1ms 1ms
coordinator_to_provider 4 4.104s 4.105s 4.108s 4.108s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=59.25µs (threshold=1ms)
parse:p95<=5ms PASS p95=108µs (threshold=5ms)
reserve:mean<=50ms PASS mean=10.0325ms (threshold=50ms)
reserve:p95<=200ms PASS p95=13.329ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

7-provider-multi-model

7 providers, 5 users, 50 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 4 0.5 GB
mlx-community/gemma-3-270m-4bit 3 0.2 GB

Metric Value
Total Requests 50
Success 42
Errors 8
Total Duration 55.593s
Throughput 0.8 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 42 4.521s 20ms 31.326s 31.58s
parse 41 48µs 31µs 134µs 338µs
reserve 41 8ms 2ms 29ms 41ms
route 41 2ms 1ms 8ms 14ms
coordinator_to_provider 42 4.445s 8ms 31.289s 31.545s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=48.146µs (threshold=1ms)
parse:p95<=5ms PASS p95=134µs (threshold=5ms)
reserve:mean<=50ms PASS mean=7.542902ms (threshold=50ms)
reserve:p95<=200ms PASS p95=28.711ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-high-concurrency

3 providers, 10 users, 60 requests, concurrency=20, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 60
Success 11
Errors 49
Total Duration 10.423s
Throughput 1.1 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 11 6.765s 6.592s 7.166s 7.166s
parse 11 29µs 30µs 54µs 54µs
reserve 11 29ms 32ms 45ms 45ms
route 11 3.665s 5.027s 5.053s 5.053s
coordinator_to_provider 11 3.04s 2.059s 6.954s 6.954s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=29.181µs (threshold=1ms)
parse:p95<=5ms PASS p95=54µs (threshold=5ms)
reserve:mean<=50ms PASS mean=28.897454ms (threshold=50ms)
reserve:p95<=200ms PASS p95=45.088ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

1-provider-queue-saturation

1 providers, 10 users, 40 requests, concurrency=15, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 40
Success 4
Errors 36
Total Duration 6.034s
Throughput 0.7 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 4.787s 4.788s 4.788s 4.788s
parse 4 24µs 33µs 41µs 41µs
reserve 4 28ms 28ms 29ms 29ms
route 4 2ms 2ms 3ms 3ms
coordinator_to_provider 4 4.73s 4.73s 4.73s 4.73s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=24.25µs (threshold=1ms)
parse:p95<=5ms PASS p95=41µs (threshold=5ms)
reserve:mean<=50ms PASS mean=28.06175ms (threshold=50ms)
reserve:p95<=200ms PASS p95=28.601ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-20-users

3 providers, 20 users, 60 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 60
Success 59
Errors 1
Total Duration 16.039s
Throughput 3.7 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 59 901ms 9ms 5.895s 5.895s
parse 59 54µs 31µs 235µs 378µs
reserve 59 4ms 2ms 17ms 18ms
route 59 517ms 1ms 5.08s 5.098s
coordinator_to_provider 59 375ms 4ms 5.597s 5.822s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=53.694µs (threshold=1ms)
parse:p95<=5ms PASS p95=235µs (threshold=5ms)
reserve:mean<=50ms PASS mean=4.160881ms (threshold=50ms)
reserve:p95<=200ms PASS p95=16.828ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

1-provider-scaling

1 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 30
Success 4
Errors 26
Total Duration 5.889s
Throughput 0.7 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 4.609s 4.609s 4.609s 4.609s
parse 4 15µs 14µs 35µs 35µs
reserve 4 10ms 12ms 13ms 13ms
route 4 4ms 5ms 7ms 7ms
coordinator_to_provider 4 4.582s 4.58s 4.588s 4.588s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=15.25µs (threshold=1ms)
parse:p95<=5ms PASS p95=35µs (threshold=5ms)
reserve:mean<=50ms PASS mean=10.1535ms (threshold=50ms)
reserve:p95<=200ms PASS p95=12.848ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-scaling

3 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 30
Success 29
Errors 1
Total Duration 9.704s
Throughput 3.0 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 29 1.697s 13ms 5.485s 5.485s
parse 29 34µs 27µs 64µs 151µs
reserve 29 7ms 2ms 27ms 30ms
route 29 1.044s 1ms 5.041s 5.059s
coordinator_to_provider 29 639ms 9ms 5.38s 5.424s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=33.827µs (threshold=1ms)
parse:p95<=5ms PASS p95=64µs (threshold=5ms)
reserve:mean<=50ms PASS mean=7.192068ms (threshold=50ms)
reserve:p95<=200ms PASS p95=27.115ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

5-provider-scaling

5 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 5 0.5 GB

Metric Value
Total Requests 30
Success 25
Errors 5
Total Duration 31.533s
Throughput 0.8 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 25 5.365s 15ms 21.175s 21.52s
parse 24 67µs 55µs 158µs 182µs
reserve 24 7ms 4ms 27ms 27ms
route 24 840ms 1ms 5.032s 5.034s
coordinator_to_provider 25 4.442s 4ms 21.112s 21.466s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=67.166µs (threshold=1ms)
parse:p95<=5ms PASS p95=158µs (threshold=5ms)
reserve:mean<=50ms PASS mean=7.346ms (threshold=50ms)
reserve:p95<=200ms PASS p95=26.812ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-heavy-100conc-10kb

3 providers, 20 users, 100 requests, concurrency=100, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 100
Success 12
Errors 88
Total Duration 8.908s
Throughput 1.3 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 12 6.29s 6.266s 6.357s 6.357s
parse 12 179µs 211µs 319µs 319µs
reserve 12 31ms 31ms 35ms 35ms
route 12 51ms 51ms 55ms 55ms
coordinator_to_provider 12 6.14s 6.117s 6.209s 6.209s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=178.75µs (threshold=1ms)
parse:p95<=5ms PASS p95=319µs (threshold=5ms)
reserve:mean<=50ms PASS mean=31.110583ms (threshold=50ms)
reserve:p95<=200ms PASS p95=34.862ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

github-actions Bot commented May 18, 2026

Benchmark Results

Runner: macos-15 (M1 Virtual) | Date: 2026-05-18 22:24 UTC

1-provider-streaming

1 providers, 1 users, 30 requests, concurrency=5, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 30
Success 4
Errors 26
Total Duration 3.91s
Throughput 1.0 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 1.796s 1.796s 1.796s 1.796s
parse 4 25µs 22µs 53µs 53µs
reserve 4 3ms 3ms 4ms 4ms
route 4 434µs 456µs 472µs 472µs
coordinator_to_provider 4 1.789s 1.79s 1.791s 1.791s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=25µs (threshold=1ms)
parse:p95<=5ms PASS p95=53µs (threshold=5ms)
reserve:mean<=50ms PASS mean=2.894ms (threshold=50ms)
reserve:p95<=200ms PASS p95=4.364ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

1-provider-non-streaming

1 providers, 1 users, 20 requests, concurrency=5, streaming=false

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 20
Success 4
Errors 16
Total Duration 2.745s
Throughput 1.5 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 2.719s 2.743s 2.745s 2.745s
parse 4 15µs 17µs 18µs 18µs
reserve 4 3ms 3ms 4ms 4ms
route 4 355µs 366µs 388µs 388µs
coordinator_to_provider 4 1.791s 1.792s 1.792s 1.792s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=15µs (threshold=1ms)
parse:p95<=5ms PASS p95=18µs (threshold=5ms)
reserve:mean<=50ms PASS mean=2.56575ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.699ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

7-provider-multi-model

7 providers, 5 users, 50 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 4 0.5 GB
mlx-community/gemma-3-270m-4bit 3 0.2 GB

Metric Value
Total Requests 50
Success 50
Errors 0
Total Duration 11.052s
Throughput 4.5 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 50 607ms 4ms 3.604s 3.647s
parse 48 18µs 17µs 33µs 38µs
reserve 48 1ms 1ms 3ms 3ms
route 48 487µs 428µs 795µs 911µs
coordinator_to_provider 50 503ms 1ms 3.573s 3.64s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=17.604µs (threshold=1ms)
parse:p95<=5ms PASS p95=33µs (threshold=5ms)
reserve:mean<=50ms PASS mean=1.400729ms (threshold=50ms)
reserve:p95<=200ms PASS p95=2.782ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-high-concurrency

3 providers, 10 users, 60 requests, concurrency=20, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 60
Success 12
Errors 48
Total Duration 3.723s
Throughput 3.2 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 12 2.345s 2.333s 2.374s 2.374s
parse 12 15µs 15µs 24µs 24µs
reserve 12 3ms 3ms 3ms 3ms
route 12 1ms 1ms 2ms 2ms
coordinator_to_provider 12 2.336s 2.324s 2.366s 2.366s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=15.416µs (threshold=1ms)
parse:p95<=5ms PASS p95=24µs (threshold=5ms)
reserve:mean<=50ms PASS mean=2.591166ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.432ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

1-provider-queue-saturation

1 providers, 10 users, 40 requests, concurrency=15, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 40
Success 4
Errors 36
Total Duration 3.618s
Throughput 1.1 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 2.289s 2.289s 2.29s 2.29s
parse 4 17µs 17µs 24µs 24µs
reserve 4 3ms 3ms 3ms 3ms
route 4 574µs 560µs 760µs 760µs
coordinator_to_provider 4 2.283s 2.283s 2.283s 2.283s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=16.75µs (threshold=1ms)
parse:p95<=5ms PASS p95=24µs (threshold=5ms)
reserve:mean<=50ms PASS mean=2.863ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.033ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-20-users

3 providers, 20 users, 60 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 60
Success 60
Errors 0
Total Duration 5.725s
Throughput 10.5 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 60 387ms 4ms 2.302s 2.302s
parse 60 16µs 16µs 29µs 38µs
reserve 60 1ms 1ms 3ms 4ms
route 60 483µs 422µs 949µs 999µs
coordinator_to_provider 60 384ms 2ms 2.295s 2.296s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=16.416µs (threshold=1ms)
parse:p95<=5ms PASS p95=29µs (threshold=5ms)
reserve:mean<=50ms PASS mean=1.430483ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.286ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

1-provider-scaling

1 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 30
Success 4
Errors 26
Total Duration 2.8s
Throughput 1.4 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 2.104s 2.104s 2.104s 2.104s
parse 4 19µs 26µs 26µs 26µs
reserve 4 3ms 3ms 3ms 3ms
route 4 675µs 640µs 968µs 968µs
coordinator_to_provider 4 2.097s 2.097s 2.098s 2.098s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=19.25µs (threshold=1ms)
parse:p95<=5ms PASS p95=26µs (threshold=5ms)
reserve:mean<=50ms PASS mean=3.10625ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.465ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-scaling

3 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 30
Success 30
Errors 0
Total Duration 3.793s
Throughput 7.9 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 30 729ms 7ms 2.19s 2.19s
parse 30 16µs 14µs 26µs 50µs
reserve 30 2ms 1ms 4ms 4ms
route 30 491µs 467µs 886µs 970µs
coordinator_to_provider 30 725ms 4ms 2.183s 2.183s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=16µs (threshold=1ms)
parse:p95<=5ms PASS p95=26µs (threshold=5ms)
reserve:mean<=50ms PASS mean=1.708933ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.535ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

5-provider-scaling

5 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 5 0.5 GB

Metric Value
Total Requests 30
Success 30
Errors 0
Total Duration 4.193s
Throughput 7.2 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 30 748ms 4ms 2.272s 2.272s
parse 30 16µs 15µs 30µs 30µs
reserve 30 2ms 1ms 3ms 3ms
route 30 435µs 412µs 696µs 766µs
coordinator_to_provider 30 744ms 1ms 2.263s 2.265s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=16.133µs (threshold=1ms)
parse:p95<=5ms PASS p95=30µs (threshold=5ms)
reserve:mean<=50ms PASS mean=1.521466ms (threshold=50ms)
reserve:p95<=200ms PASS p95=2.708ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-heavy-100conc-10kb

3 providers, 20 users, 100 requests, concurrency=100, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 100
Success 12
Errors 88
Total Duration 3.305s
Throughput 3.6 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 12 2.217s 2.217s 2.226s 2.226s
parse 12 94µs 82µs 207µs 207µs
reserve 12 7ms 8ms 8ms 8ms
route 12 18ms 18ms 18ms 18ms
coordinator_to_provider 12 2.182s 2.183s 2.19s 2.19s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=94.25µs (threshold=1ms)
parse:p95<=5ms PASS p95=207µs (threshold=5ms)
reserve:mean<=50ms PASS mean=7.327ms (threshold=50ms)
reserve:p95<=200ms PASS p95=7.76ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

github-actions Bot commented May 18, 2026

Benchmark Results

Runner: macos-15 (M1 Virtual) | Date: 2026-05-18 23:24 UTC

1-provider-streaming

1 providers, 1 users, 30 requests, concurrency=5, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 30
Success 4
Errors 26
Total Duration 3.708s
Throughput 1.1 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 1.679s 1.679s 1.679s 1.679s
parse 4 34µs 22µs 73µs 73µs
reserve 4 4ms 5ms 6ms 6ms
route 4 605µs 624µs 638µs 638µs
coordinator_to_provider 4 1.669s 1.669s 1.671s 1.671s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=34.25µs (threshold=1ms)
parse:p95<=5ms PASS p95=73µs (threshold=5ms)
reserve:mean<=50ms PASS mean=4.19525ms (threshold=50ms)
reserve:p95<=200ms PASS p95=6.082ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

1-provider-non-streaming

1 providers, 1 users, 20 requests, concurrency=5, streaming=false

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 20
Success 4
Errors 16
Total Duration 2.182s
Throughput 1.8 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 2.179s 2.18s 2.182s 2.182s
parse 4 16µs 19µs 22µs 22µs
reserve 4 3ms 4ms 5ms 5ms
route 4 441µs 441µs 444µs 444µs
coordinator_to_provider 4 1.619s 1.619s 1.62s 1.62s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=16µs (threshold=1ms)
parse:p95<=5ms PASS p95=22µs (threshold=5ms)
reserve:mean<=50ms PASS mean=3.40825ms (threshold=50ms)
reserve:p95<=200ms PASS p95=4.902ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

7-provider-multi-model

7 providers, 5 users, 50 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 4 0.5 GB
mlx-community/gemma-3-270m-4bit 3 0.2 GB

Metric Value
Total Requests 50
Success 50
Errors 0
Total Duration 10.626s
Throughput 4.7 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 50 600ms 4ms 3.536s 3.565s
parse 48 18µs 15µs 36µs 87µs
reserve 48 1ms 1ms 3ms 4ms
route 48 452µs 441µs 719µs 796µs
coordinator_to_provider 50 496ms 1ms 3.529s 3.558s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=18.145µs (threshold=1ms)
parse:p95<=5ms PASS p95=36µs (threshold=5ms)
reserve:mean<=50ms PASS mean=1.311479ms (threshold=50ms)
reserve:p95<=200ms PASS p95=2.538ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-high-concurrency

3 providers, 10 users, 60 requests, concurrency=20, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 60
Success 12
Errors 48
Total Duration 3.405s
Throughput 3.5 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 12 2.149s 2.135s 2.205s 2.205s
parse 12 17µs 15µs 31µs 31µs
reserve 12 3ms 3ms 5ms 5ms
route 12 1ms 1ms 2ms 2ms
coordinator_to_provider 12 2.14s 2.128s 2.199s 2.199s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=16.5µs (threshold=1ms)
parse:p95<=5ms PASS p95=31µs (threshold=5ms)
reserve:mean<=50ms PASS mean=2.94675ms (threshold=50ms)
reserve:p95<=200ms PASS p95=4.775ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

1-provider-queue-saturation

1 providers, 10 users, 40 requests, concurrency=15, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 40
Success 4
Errors 36
Total Duration 3.021s
Throughput 1.3 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 4 2.087s 2.087s 2.087s 2.087s
parse 4 20µs 22µs 28µs 28µs
reserve 4 3ms 3ms 3ms 3ms
route 4 703µs 723µs 750µs 750µs
coordinator_to_provider 4 2.08s 2.08s 2.08s 2.08s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=20.25µs (threshold=1ms)
parse:p95<=5ms PASS p95=28µs (threshold=5ms)
reserve:mean<=50ms PASS mean=2.504ms (threshold=50ms)
reserve:p95<=200ms PASS p95=2.703ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-20-users

3 providers, 20 users, 60 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 60
Success 60
Errors 0
Total Duration 5.361s
Throughput 11.2 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 60 378ms 5ms 2.274s 2.274s
parse 60 16µs 15µs 31µs 42µs
reserve 60 1ms 1ms 3ms 3ms
route 60 432µs 402µs 652µs 911µs
coordinator_to_provider 60 375ms 3ms 2.267s 2.269s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=16.25µs (threshold=1ms)
parse:p95<=5ms PASS p95=31µs (threshold=5ms)
reserve:mean<=50ms PASS mean=1.250833ms (threshold=50ms)
reserve:p95<=200ms PASS p95=2.534ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

1-provider-scaling

1 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 1 0.5 GB

Metric Value
Total Requests 30
Success 5
Errors 25
Total Duration 3.261s
Throughput 1.5 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 5 2.371s 2.183s 3.125s 3.125s
parse 5 17µs 17µs 22µs 22µs
reserve 5 3ms 3ms 3ms 3ms
route 5 622ms 1ms 3.11s 3.11s
queue_wait 1 3.11s 3.11s 3.11s 3.11s
dispatch 1 25µs 25µs 25µs 25µs
coordinator_to_provider 5 1.743s 2.176s 2.177s 2.177s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=17.4µs (threshold=1ms)
parse:p95<=5ms PASS p95=22µs (threshold=5ms)
reserve:mean<=50ms PASS mean=2.9798ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.473ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:mean<=5ms PASS mean=25µs (threshold=5ms)
dispatch:p95<=50ms PASS p95=25µs (threshold=50ms)

3-provider-scaling

3 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 30
Success 30
Errors 0
Total Duration 3.855s
Throughput 7.8 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 30 716ms 7ms 2.172s 2.172s
parse 30 16µs 12µs 32µs 81µs
reserve 30 2ms 1ms 3ms 4ms
route 30 436µs 413µs 708µs 747µs
coordinator_to_provider 30 712ms 4ms 2.165s 2.166s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=15.766µs (threshold=1ms)
parse:p95<=5ms PASS p95=32µs (threshold=5ms)
reserve:mean<=50ms PASS mean=1.577066ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.282ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

5-provider-scaling

5 providers, 5 users, 30 requests, concurrency=10, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 5 0.5 GB

Metric Value
Total Requests 30
Success 30
Errors 0
Total Duration 3.968s
Throughput 7.6 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 30 700ms 4ms 2.138s 2.138s
parse 30 16µs 16µs 29µs 31µs
reserve 30 2ms 1ms 4ms 4ms
route 30 483µs 468µs 730µs 836µs
coordinator_to_provider 30 696ms 1ms 2.131s 2.132s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=15.933µs (threshold=1ms)
parse:p95<=5ms PASS p95=29µs (threshold=5ms)
reserve:mean<=50ms PASS mean=1.745766ms (threshold=50ms)
reserve:p95<=200ms PASS p95=3.502ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch

3-provider-heavy-100conc-10kb

3 providers, 20 users, 100 requests, concurrency=100, streaming=true

Model Providers RAM
mlx-community/Qwen3.5-0.8B-MLX-4bit 3 0.5 GB

Metric Value
Total Requests 100
Success 12
Errors 88
Total Duration 3.324s
Throughput 3.6 req/s

Latency Decomposition

Segment Count Mean P50 P95 Max
total_e2e 12 2.201s 2.202s 2.216s 2.216s
parse 12 136µs 129µs 301µs 301µs
reserve 12 12ms 13ms 13ms 13ms
route 12 19ms 20ms 20ms 20ms
coordinator_to_provider 12 2.157s 2.156s 2.191s 2.191s

Assertion Report: FAIL

Assertion Result Detail
parse:mean<=1ms PASS mean=135.916µs (threshold=1ms)
parse:p95<=5ms PASS p95=301µs (threshold=5ms)
reserve:mean<=50ms PASS mean=12.183166ms (threshold=50ms)
reserve:p95<=200ms PASS p95=13.348ms (threshold=200ms)
encrypt:present FAIL no data for segment encrypt
dispatch:present FAIL no data for segment dispatch
