newcore-network · Flussen · Apr 1, 2026
diff --git a/README.md b/README.md
@@ -227,34 +227,37 @@ Note: `pnpm test` does not run benchmarks.
 
 ## Benchmarks
 
-There are two benchmark suites:
-
-- Core benchmarks (Tinybench)
-- Load benchmarks (Vitest project `benchmark`)
+Benchmarks are split by value, so the default run focuses on framework features that matter for real servers.
 
 ```bash
-pnpm bench:core
+pnpm bench
+pnpm bench:value
+pnpm bench:gold
+pnpm bench:startup
+pnpm bench:diagnostic
+pnpm bench:soak
 pnpm bench:load
 pnpm bench:all
 ```
 
+- `bench` / `bench:value`: value-focused suite covering commands, net events, RPC, lifecycle, ticks, binary path, and bootstrap.
+- `bench:gold`: hot-path load scenarios only.
+- `bench:startup`: startup and registration cost.
+- `bench:diagnostic`: internal and low-level synthetic benchmarks.
+- `bench:soak`: long-running stress scenario.
+
 ### Snapshot (latest local run)
 
-These values are a small extract from the latest local run (`1.0.0-beta.1`, Feb 26, 2026). Results vary by machine.
-
-- **Core**
-  - BinaryService - classify response type: `~18.25M ops/sec` (mean `~0.055μs`, p95 `~0.076μs`)
-  - EventInterceptor - getStatistics (1000 events): `~17.78M ops/sec` (mean `~0.056μs`)
-  - RuntimeConfig - resolve CORE mode: `~10.49M ops/sec` (mean `~0.095μs`)
-  - Decorators - define metadata (Command): `~6.92M ops/sec` (mean `~0.145μs`)
-  - EventBus - multiple event types: `~2.57M ops/sec` (mean `~0.390μs`)
-  - DI - resolve simple service: `~1.78M ops/sec` (mean `~0.560μs`)
-- **Load**
-  - Commands - 500 players (validated): `~4.78M ops/sec` (p95 `~0.008ms`)
-  - Pipeline - validated (500 players): `~4.79M ops/sec` (p95 `~0.024ms`)
-  - Pipeline - full (500 players): `~2.34M ops/sec` (p95 `~0.011ms`)
-  - RPC - schema generation complex (500 methods): `~705K ops/sec` (p95 `~0.335ms`)
-  - Commands - 500 players (concurrent): `~6.31K ops/sec` (p95 `~76.00ms`)
+Use `benchmark/reports/` as the source of truth. Results vary by machine and should be compared relatively, not treated as product guarantees.
+
+- Primary benchmark targets:
+  - full command execution
+  - full net event handling
+  - RPC processing
+  - player lifecycle churn
+  - tick budget impact
+  - bootstrap cost
+  - binary transport cost
 
 Full reports and methodology are available in benchmark/README.md.
 

diff --git a/RELEASE.md b/RELEASE.md
@@ -1,20 +1,16 @@
-## OpenCore Framework v1.0.6
+## OpenCore Framework v1.0.7
 
 ### Added
 - Added `RpcPublicError` and `serializeRpcError()` for safe RPC error exposure.
-- Added `PUBLIC_RPC_ERROR_MESSAGE` as the default public message for unexpected RPC failures.
-- Added transport exports for RPC error helpers through `src/adapters/contracts/transport`.
-- Added unit and integration coverage for RPC error serialization and server RPC flow logging.
+- Added structured benchmark suites and reporting.
 
 ### Changed
-- Updated server RPC processing to log handler failures with event, handler, player, and account context.
-- Updated RPC handling to preserve explicit public errors while masking unexpected internal errors.
-- Refined the RPC path so invalid payloads and session issues are logged with clearer warnings.
+- Updated server RPC logging and error handling for clearer failures.
+- Updated benchmark metrics to include duration tracking and line-delimited JSON output.
 
 ### Fixed
 - Fixed RPC error leakage by sanitizing unexpected exceptions before they are returned to the client.
-- Fixed RPC logger behavior so exposed errors can pass through with their original message and name.
-- Fixed contract alignment across transport, server RPC processing, and test coverage.
+- Fixed `PlayerPersistenceService` bootstrap so `PlayerPersistenceContract` implementations run on session load.
 
 ### Notes
-- This release tracks the `fix/rpc-logger` merge request (#51) and keeps the release note focused on the RPC error-handling changes.
+- This release tracks the current branch changes for RPC logging, benchmarks, and session persistence.
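
The "line-delimited JSON output" item in the release notes above can be pictured with a minimal sketch. The `BenchRecord` shape and the helper names `toNdjson` / `fromNdjson` are assumptions made for illustration; they are not taken from the framework's reporter, whose fields may differ.

```typescript
// Hypothetical record shape (an assumption); the real reporter's fields may differ.
interface BenchRecord {
  suite: "gold" | "startup" | "diagnostic" | "soak";
  name: string;
  opsPerSec: number;
  durationMs: number;
}

// Serialize one record per line: every line is a complete JSON document,
// so a consumer can stream the file without loading it as one large array.
function toNdjson(records: BenchRecord[]): string {
  return records.map((r) => JSON.stringify(r)).join("\n") + "\n";
}

// Parse line-delimited JSON back, skipping blank lines so the trailing newline is harmless.
function fromNdjson(text: string): BenchRecord[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as BenchRecord);
}

// Illustrative values only; these are not results from any benchmark run.
const sample: BenchRecord[] = [
  { suite: "gold", name: "example scenario A", opsPerSec: 1000, durationMs: 250 },
  { suite: "soak", name: "example scenario B", opsPerSec: 10, durationMs: 60000 },
];
const encoded = toNdjson(sample);
const decoded = fromNdjson(encoded);
console.log(decoded.length);
```

One practical benefit of this layout is that appending a result never requires rewriting earlier lines, which suits long-running suites such as `soak`.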
diff --git a/benchmark/LATEST_REPORT.md b/benchmark/LATEST_REPORT.md
@@ -0,0 +1,306 @@
+# OpenCore Framework Benchmark Report
+
+Generated from:
+
+- `benchmark/reports/benchmark-2026-04-01T19-12-11-034Z.json`
+- `benchmark/reports/benchmark-2026-04-01T19-12-11-030Z.txt`
+- `benchmark/reports/.load-metrics.json`
+
+Run metadata:
+
+- Timestamp: `2026-04-01T19:03:36.782Z`
+- Version: `1.0.6`
+
+## Executive Summary
+
+This run reflects the new benchmark strategy for OpenCore.
+
+The benchmark suite is now split by value:
+
+- `gold`: real framework feature paths
+- `startup`: boot and registration cost
+- `diagnostic`: low-level internals for tuning
+- `soak`: longer-running stress checks
+
+This matters because the previous suite mixed product-facing signals with synthetic internals. The new report is much easier to interpret for real servers.
+
+Suite distribution in this run:
+
+- `gold`: 227 results
+- `startup`: 30 results
+- `diagnostic`: 283 results
+- `soak`: 11 results
+
+## Diagnostic Summary
+
+## What is working well
+
+### 1. Gold benchmarks now measure actual framework value
+
+The most useful benchmarks in this run are the ones that exercise real framework features:
+
+- full command execution
+- full net event handling
+- RPC schema and dispatch paths
+- player lifecycle churn
+- tick handler cost
+- binary transport paths
+- bootstrap / startup registration
+
+This is a large improvement over microbenchmarks that only measure metadata reads or helper internals.
+
+### 2. Startup costs are visible and actionable
+
+The startup suite gives useful numbers for:
+
+- metadata scanning
+- dependency injection setup
+- schema generation
+- bootstrap controller registration
+
+This is useful for release quality and for understanding how fast a server resource graph can initialize.
+
+### 3. Concurrency bottlenecks are now exposed honestly
+
+The most important runtime signal in the new report is not the best-case path, but the degradation under contention.
+
+That shows up clearly in:
+
+- command concurrent execution
+- tick parallel execution
+- large payload binary serialization
+
+These are meaningful server-facing signals.
+
+## What still needs attention
+
+### 1. Some diagnostic benchmarks still report zero iterations
+
+Examples in this run:
+
+- `DI - Resolve with 1 dependency`
+- `DI - Resolve with 2 dependencies`
+- `DI - Resolve with 3 dependencies`
+- `DI - Resolve 100 times (complex)`
+- several `AccessControl` success-path scenarios
+
+These should either be fixed or removed from the primary diagnostic output. Right now they create noise and reduce trust in that part of the suite.
+
+### 2. Some low-sample scenarios still have weak statistical value
+
+Examples:
+
+- `BinaryService - Buffer split + parse` scenarios with only `1` operation
+- `BinaryService - Pending requests lifecycle` scenarios with only `2` operations
+- connect/disconnect cycle scenarios with only `3` operations
+
+These can still be useful as sanity checks, but their `p95` and `p99` values are less meaningful than those from the larger-sample runs.
+
+### 3. Diagnostic still contains more data than decision-makers need
+
+This is acceptable because `diagnostic` is now demoted, but it confirms the design decision:
+
+- keep `gold` for product decisions
+- keep `diagnostic` for tuning
+
+## Final Diagnosis
+
+OpenCore now has a benchmark system that is directionally correct for a framework runtime:
+
+- it measures feature paths instead of mostly internal trivia
+- it separates startup from hot paths
+- it surfaces concurrency pain points instead of hiding them in averages
+- it produces a report that can support engineering decisions
+
+The main remaining cleanup is in the diagnostic tier, not in the gold suite.
+
+## Key Results
+
+## Gold Suite
+
+### Commands
+
+| Scenario | Throughput | p95 | Notes |
+| --- | --- | --- | --- |
+| `Command Full - Validated (100 players)` | `115.68K ops/sec` | `0.012ms` | Strong validated happy-path throughput |
+| `Command Full - End-to-End (100 players)` | `863.50K ops/sec` | `0.0027ms` | Extremely cheap synthetic end-to-end path |
+| `Command Full - Concurrent (100 players)` | `121.71 ops/sec` | `14.42ms` | Main contention signal |
+
+Takeaway:
+
+- happy-path command handling is strong
+- concurrent saturation is where the runtime should be watched most closely
+
+### Net Events
+
+| Scenario | Throughput | p95 | Notes |
+| --- | --- | --- | --- |
+| `Net Events - Simple (10 players)` | `16.81K ops/sec` | `0.223ms` | base handler cost |
+| `Net Events - Validated (10 players)` | `9.61K ops/sec` | `0.488ms` | validation overhead is visible |
+| `Net Events - Full Event (small, 10 players)` | `74.42K ops/sec` | `0.029ms` | small payload path remains cheap |
+| `Net Events - Full Event (medium, 10 players)` | `44.73K ops/sec` | `0.079ms` | moderate serialization cost |
+| `Net Events - Full Event (large, 10 players)` | `27.68K ops/sec` | `0.113ms` | payload size starts to dominate |
+
+Takeaway:
+
+- payload size is the dominant cost: full-event throughput drops ~2.7x from small to large payloads
+- validated net events remain comfortably sub-millisecond in this run
+
+### RPC
+
+| Scenario | Throughput | p95 | Notes |
+| --- | --- | --- | --- |
+| `RPC - Schema generation simple (200 methods)` | `7.95K ops/sec` | `0.176ms` | strong simple-schema throughput |
+| `RPC - Schema generation complex (200 methods)` | `3.06K ops/sec` | `0.400ms` | ~2.6x lower throughput than simple schemas |
+
+Takeaway:
+
+- RPC schema generation keeps a sub-millisecond p95 (`0.400ms`) even when schemas become more complex
+- schema complexity is a real cost center in startup/registration paths
+
+### Player Lifecycle
+
+| Scenario | Throughput | p95 | Notes |
+| --- | --- | --- | --- |
+| `Player Lifecycle - Full Cycle (500 players)` | `200.55K ops/sec` | `0.0096ms` | strong lifecycle throughput |
+| `Player Lifecycle - Concurrent Connections (500 players)` | `108.68K ops/sec` | `0.0046ms` | connection fan-out still healthy |
+| `Player Lifecycle - Concurrent Disconnections (500 players)` | `1.83M ops/sec` | `0.00075ms` | disconnect path is very cheap |
+
+Takeaway:
+
+- lifecycle churn performs well
+- concurrent connects run at ~17x lower throughput than concurrent disconnects (`108.68K` vs `1.83M ops/sec`), as expected
+
+### Tick Budget
+
+| Scenario | Throughput | p95 | Notes |
+| --- | --- | --- | --- |
+| `Tick - Real setTick (50 handlers)` | `93.12K ops/sec` | `0.021ms` | good light-handler budget |
+| `Tick - 5 Handlers (medium workload)` | `18.45K ops/sec` | `0.098ms` | still acceptable under moderate work |
+| `Tick - 5 Handlers (heavy workload)` | `2.26K ops/sec` | `0.559ms` | heavy work is the danger zone |
+| `Tick - Parallel Execution` | `243.24 ops/sec` | `8.00ms` | expensive and not a default win |
+
+Takeaway:
+
+- small tick handlers are cheap
+- heavy per-tick work remains one of the biggest practical risks for servers
+- parallel tick execution is far more expensive than sequential in this run
+
+### BinaryService
+
+| Scenario | Throughput | p95 | Notes |
+| --- | --- | --- | --- |
+| `BinaryService - Parse mixed responses (500 ops)` | `1.20M ops/sec` | `0.0011ms` | very strong parse path |
+| `BinaryService - Full round-trip (50 calls)` | `350.12K ops/sec` | `0.0092ms` | healthy round-trip path |
+| `BinaryService - Serialize large payload (500 ops)` | `2.88K ops/sec` | `0.428ms` | large payload serialization is expensive |
+
+Takeaway:
+
+- binary transport is excellent for parse and smaller payloads
+- large payload serialization is the main bottleneck here
+
+## Startup Suite
+
+### MetadataScanner
+
+| Scenario | Throughput | Median | p99 |
+| --- | --- | --- | --- |
+| `1 controller, 3 methods` | `743.86K ops/sec` | `1.27μs` | `4.49μs` |
+| `3 controllers, 6 methods` | `390.84K ops/sec` | `2.48μs` | `4.61μs` |
+| `10 controllers` | `112.61K ops/sec` | `8.73μs` | `21.15μs` |
+
+### Dependency Injection
+
+| Scenario | Throughput | Median | p99 |
+| --- | --- | --- | --- |
+| `Resolve simple service` | `1.92M ops/sec` | `0.48μs` | `1.36μs` |
+| `Resolve 100 times (simple)` | `69.44K ops/sec` | `14.06μs` | `32.24μs` |
+
+### SchemaGenerator
+
+| Scenario | Throughput | Median | p99 |
+| --- | --- | --- | --- |
+| `1 param` | `42.58K ops/sec` | `22.33μs` | `71.08μs` |
+| `3 params` | `28.79K ops/sec` | `33.19μs` | `98.64μs` |
+| `5 params` | `17.29K ops/sec` | `55.17μs` | `142.94μs` |
+| `batch 10 methods` | `3.18K ops/sec` | `0.298ms` | `11.48ms` |
+| `batch 50 methods` | `628 ops/sec` | `1.45ms` | `14.17ms` |
+
+### Bootstrap Load
+
+| Scenario | Throughput | p95 |
+| --- | --- | --- |
+| `Bootstrap - 1 controller` | `2.81K ops/sec` | `1.11ms` |
+| `Bootstrap - 10 controllers` | `1.16K ops/sec` | `1.46ms` |
+| `Bootstrap - 50 controllers` | `396.99 ops/sec` | `2.84ms` |
+| `Bootstrap - 100 controllers` | `205.87 ops/sec` | `6.37ms` |
+
+Takeaway:
+
+- startup remains healthy: bootstrap p95 stays under `7ms` even at 100 controllers
+- schema generation is the most expensive startup subsystem visible in this run
+
+## Diagnostic Suite
+
+The diagnostic suite still has value for framework maintainers, especially for:
+
+- Zod validation cost
+- rate limiter scaling
+- event bus fan-out cost
+- decorator and metadata overhead
+
+Notable diagnostics:
+
+- `Zod - Simple schema validation`: `2.72M ops/sec`
+- `RateLimiter - Single key check`: `3.50M ops/sec`
+- `EventBus - Emit to 1 handler`: `4.56M ops/sec`
+- `EventBus - Emit to 100 handlers`: `131.12K ops/sec`
+
+However, this suite still contains scenarios with zero iterations and should not be treated as the primary external benchmark story.
+
+## Engineering Conclusions
+
+## What these numbers say about the framework
+
+1. OpenCore hot paths are fast when kept on the intended model.
+2. Validation and typed dispatch are not the dominant cost in most happy paths.
+3. Concurrency pressure is more important than raw single-path throughput.
+4. Tick workload and large payload serialization are the practical danger areas.
+5. Startup cost is acceptable and mostly dominated by schema generation scale.
+
+## What matters most to server developers
+
+For real servers, the most useful numbers in this report are:
+
+- command concurrent throughput and tail latency
+- net event cost by payload size
+- tick budget under realistic handler counts
+- lifecycle churn under hundreds of players
+- bootstrap time as controller count grows
+
+## Recommended Follow-up
+
+1. Fix or remove zero-iteration diagnostic benchmarks.
+2. Increase sample counts for low-op scenarios like pending-request lifecycle and buffer split benchmarks.
+3. Add memory and event-loop lag metrics to `gold` and `soak`.
+4. Keep `gold` as the default benchmark story for docs and landing pages.
+
+## Final Verdict
+
+This benchmark run supports the new benchmark direction.
+
+OpenCore now has a benchmark system that is useful for:
+
+- framework engineering
+- release validation
+- communicating real runtime behavior
+
+The benchmark story is no longer “here are some fast internals”.
+
+It is now closer to:
+
+- here is what commands cost
+- here is what net events cost
+- here is what ticks cost
+- here is how lifecycle behaves at scale
+- here is what startup actually costs
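
Follow-up items 1 and 2 in the report (zero-iteration diagnostics and low-sample scenarios) could be checked mechanically before a report is published. The sketch below is one possible approach, not the project's tooling: the `ScenarioResult` shape, the `classify` / `groupByQuality` helpers, and the `minSamples` threshold of 100 are all assumptions.

```typescript
// Hypothetical result shape (an assumption); not the actual report schema.
interface ScenarioResult {
  name: string;
  iterations: number;
}

type Quality = "zero-iterations" | "low-sample" | "ok";

// Classify one result. minSamples = 100 is an assumed threshold: under the
// nearest-rank definition, with fewer than 100 samples the p99 is simply the
// largest observed value, so it carries little statistical weight.
function classify(result: ScenarioResult, minSamples = 100): Quality {
  if (result.iterations <= 0) return "zero-iterations";
  if (result.iterations < minSamples) return "low-sample";
  return "ok";
}

// Group scenario names by quality so noisy entries can be listed separately.
function groupByQuality(
  results: ScenarioResult[],
  minSamples = 100,
): Record<Quality, string[]> {
  const groups: Record<Quality, string[]> = {
    "zero-iterations": [],
    "low-sample": [],
    ok: [],
  };
  for (const r of results) {
    groups[classify(r, minSamples)].push(r.name);
  }
  return groups;
}

// Illustrative inputs only; these are not values from the report.
const groups = groupByQuality([
  { name: "scenario with no iterations", iterations: 0 },
  { name: "scenario with three operations", iterations: 3 },
  { name: "scenario with many operations", iterations: 5000 },
]);
console.log(groups);
```

A check like this could fail the `diagnostic` run when the `zero-iterations` group is non-empty, while only annotating `low-sample` entries so they remain available as sanity checks.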