feat(sandbox): workflow-sandbox-runner agent + remote-runner config (infra-admin P3b PR8/12) #848

Merged
intel352 merged 2 commits into main from feat/infra-p3b-agent
Jun 3, 2026

@intel352 intel352 commented Jun 3, 2026

PR8/12 — workflow-sandbox-runner agent + static-config wiring (infra-admin Phase 3b, part 2)

Implements locked plan Tasks 15–16 (ADR 0019): the remote executor agent plus the engine-side wiring, so that step.sandbox_exec with exec_env: <runner-name> dispatches to a configured remote agent. Builds on PR7 (proto + RemoteRunner client).

cmd/workflow-sandbox-runner (agent — a trust boundary)

Serves SandboxExecService over gRPC. For each request, the agent:

  • clamps the requested profile to {strict, standard}: permissive or unknown profiles are forced to standard and the clamp is logged, so a privileged request from the engine is never honored;
  • resolves secret:// env refs agent-side with its OWN secrets provider (values are never logged or echoed);
  • runs the command in a local Docker sandbox;
  • streams stdout/stderr followed by a terminal exit_code.

Auth: a bearer-token StreamServerInterceptor (constant-time compare) plus optional mTLS. The agent refuses to start without --token or --tls-ca unless --allow-unauthenticated is passed explicitly (which logs a loud warning), so there is no silent unauthenticated executor.
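The profile clamp can be illustrated with a minimal sketch. The helper name clampProfile and its signature are assumptions made for this example, not the agent's actual code, and treating an empty profile as "unknown" is also an assumption.

```go
package main

import (
	"fmt"
	"log"
)

// clampProfile maps a requested sandbox profile to one the agent will honor.
// Hypothetical helper: the name and signature are assumptions for illustration.
// It returns the effective profile and whether the request was clamped.
func clampProfile(requested string) (effective string, clamped bool) {
	switch requested {
	case "strict", "standard":
		// Profiles inside the allowed set pass through unchanged.
		return requested, false
	default:
		// permissive or anything unrecognized is forced to standard.
		return "standard", true
	}
}

func main() {
	for _, req := range []string{"strict", "standard", "permissive", "bogus"} {
		eff, clamped := clampProfile(req)
		if clamped {
			// The clamp is logged so operators can see refused escalations.
			log.Printf("clamped requested profile %q to %q", req, eff)
		}
		fmt.Printf("%s -> %s\n", req, eff)
	}
}
```

Returning the clamped flag separately keeps the logging decision with the caller, which is one way to keep the mapping itself trivially testable.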

Static-config wiring

sandbox.remote_runners module: parses [{name, address, token, secrets_provider, tls, allow_insecure}], validates each entry (address required; no reserved or duplicate names; a token, TLS, or an explicit allow_insecure), and exposes a lookup registry. resolveSandboxRunner resolves exec_env: <name> to a RemoteRunner for that agent, with token secret:// refs resolved via the engine's secrets provider. ephemeral stays deferred to PR9.
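As a rough sketch of the validation rules listed above (not the module's real Init), the checks could look like the following. The RunnerSpec struct, its Go field names, the TLSEnabled stand-in for the tls block, and validateSpecs are all assumptions for illustration; the reserved names are taken from the commit message in this PR.

```go
package main

import "fmt"

// RunnerSpec mirrors the config keys listed above. The Go field names and the
// TLSEnabled boolean (a stand-in for the tls block) are assumptions.
type RunnerSpec struct {
	Name          string
	Address       string
	Token         string
	TLSEnabled    bool
	AllowInsecure bool
}

// Names that refer to built-in exec_env values and so cannot name a remote runner.
var reservedNames = map[string]bool{"": true, "local-docker": true, "ephemeral": true}

// validateSpecs is a hypothetical validator that applies the rules in order:
// reserved name, duplicate name, required address, then auth-or-TLS-or-insecure.
func validateSpecs(specs []RunnerSpec) error {
	seen := make(map[string]bool, len(specs))
	for _, s := range specs {
		if reservedNames[s.Name] {
			return fmt.Errorf("runner name %q is reserved", s.Name)
		}
		if seen[s.Name] {
			return fmt.Errorf("duplicate runner name %q", s.Name)
		}
		seen[s.Name] = true
		if s.Address == "" {
			return fmt.Errorf("runner %q: address is required", s.Name)
		}
		if s.Token == "" && !s.TLSEnabled && !s.AllowInsecure {
			return fmt.Errorf("runner %q: needs a token, TLS, or allow_insecure: true", s.Name)
		}
	}
	return nil
}

func main() {
	good := []RunnerSpec{{Name: "build-farm", Address: "runner:9443", TLSEnabled: true}}
	bad := []RunnerSpec{{Name: "local-docker", Address: "runner:9443", Token: "t"}}
	fmt.Println("good:", validateSpecs(good))
	fmt.Println("bad: ", validateSpecs(bad))
}
```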

Review notes (resolved — security)

  • Critical: secret:// runner tokens now resolve through the secrets provider (were sent verbatim → broken auth).
  • Critical: the agent refuses an unauthenticated start unless --allow-unauthenticated.
  • Constant-time bearer compare; reserved/duplicate runner-name rejection; no-auth-no-TLS rejected unless allow_insecure; schema Inputs fixed; windows dropped from the agent release matrix (Linux Docker daemon).
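The first fix above (resolving secret:// runner tokens instead of sending them verbatim) can be sketched as follows. SecretGetter, mapSecrets, and resolveToken are hypothetical names for this example; the real secrets provider interface is not shown in this PR description, so its shape here is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SecretGetter is a hypothetical stand-in for the engine's secrets provider.
type SecretGetter interface {
	Get(key string) (string, error)
}

// mapSecrets is an in-memory provider used only for this example.
type mapSecrets map[string]string

func (m mapSecrets) Get(key string) (string, error) {
	v, ok := m[key]
	if !ok {
		return "", fmt.Errorf("secret %q not found", key)
	}
	return v, nil
}

const secretPrefix = "secret://"

// resolveToken returns literal tokens unchanged and resolves secret:// refs
// through the provider. A secret:// ref with no provider is a hard error, so a
// literal "secret://..." string is never used as the bearer value.
func resolveToken(token string, p SecretGetter) (string, error) {
	key, isRef := strings.CutPrefix(token, secretPrefix)
	if !isRef {
		return token, nil
	}
	if p == nil {
		return "", errors.New("token is a secret:// reference but no secrets provider is configured")
	}
	return p.Get(key)
}

func main() {
	secrets := mapSecrets{"runner/token": "example-value"}
	for _, tok := range []string{"literal-token", "secret://runner/token"} {
		resolved, err := resolveToken(tok, secrets)
		// Only report success; resolved values are deliberately not printed.
		fmt.Printf("%q resolved=%t err=%v\n", tok, resolved != "", err)
	}
	_, err := resolveToken("secret://runner/token", nil)
	fmt.Println("no provider:", err)
}
```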

Release: added cmd/workflow-sandbox-runner to the release.yml build-binaries matrix (v0.72.0 ships it). Verified: build + go test ./... exit 0 (151 ok); full golangci-lint v2.12.0 0 issues; --help + unauthenticated-refusal validated.

🤖 Generated with Claude Code

…ig wiring (PR8)

Task 15: cmd/workflow-sandbox-runner — gRPC server (SandboxExecService) with
mTLS + bearer-token auth, profile clamping (permissive→standard), agent-side
secret:// resolution via secrets.Provider, streaming stdout/stderr/exit_code
chunks. Version via ldflags.

Task 16: module/sandbox_remote_runners — new sandbox.remote_runners module type
(SandboxRemoteRunnersModule) that parses named RemoteRunnerSpec list from config
and exposes a RemoteRunnerRegistry service. resolveSandboxRunner treats any
unknown exec_env as a named remote runner — looks up the registry via
app.GetService, builds sandbox/remote.RemoteRunner from the spec + per-exec
SandboxConfig. pipeline_step_sandbox_exec.go: exec_env validation deferred to
Execute time. sandbox.SandboxConfig gains a Profile field.

Registration: pipelinesteps plugin, schema coreModuleTypes + registerBuiltins,
cmd/wfctl/type_registry.go, DOCUMENTATION.md, golden editor-schemas.json.
Release: .github/workflows/release.yml build-binaries matrix gains sandbox-runner.

Security review fixes (CHANGES_REQUESTED):
- CRITICAL: secret:// runner tokens are now resolved through the module's
  configured secrets_provider (new config key) before dialing — a literal
  "secret://..." string is no longer sent as the Bearer header. A secret:// token
  with no provider is a hard error; literal tokens pass through.
- CRITICAL: the agent refuses to start as an unauthenticated executor (no token
  AND no mTLS) unless --allow-unauthenticated is passed, which logs a loud warning.
- IMPORTANT: bearer-token compare uses crypto/subtle.ConstantTimeCompare.
- IMPORTANT: SandboxRemoteRunnersModule.Init rejects reserved names
  (""/local-docker/ephemeral), duplicate names, and no-auth-no-TLS runners
  (unless per-spec allow_insecure: true).
- MINOR: drainStream now fails the test on unexpected stream errors; added a
  non-zero exit-code (exit 7) streaming test.
- MINOR: removed the bogus Inputs port from the sandbox.remote_runners editor
  schema (provider-only module); regenerated golden.
- MINOR: dropped windows/amd64 from the sandbox-runner release matrix (Linux
  Docker daemon target).
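
The unauthenticated-start refusal described in the second CRITICAL item can be sketched as a small pure function. authGate and its signature are assumptions for illustration, not the agent's actual gate; only the flag names --token, --tls-ca, and --allow-unauthenticated come from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

var errUnauthenticated = errors.New(
	"refusing to start: set --token or --tls-ca, or pass --allow-unauthenticated")

// authGate is a hypothetical startup check. It returns warn=true when the
// agent may start only because the operator explicitly opted out of auth.
func authGate(token, tlsCA string, allowUnauthenticated bool) (warn bool, err error) {
	if token != "" || tlsCA != "" {
		return false, nil // some form of auth is configured
	}
	if allowUnauthenticated {
		return true, nil // explicit opt-out: start, but warn loudly
	}
	return false, errUnauthenticated
}

func main() {
	cases := []struct {
		token, ca string
		allow     bool
	}{
		{"tok", "", false},
		{"", "ca.pem", false},
		{"", "", true},
		{"", "", false},
	}
	for _, c := range cases {
		warn, err := authGate(c.token, c.ca, c.allow)
		fmt.Printf("token=%t ca=%t allow=%t -> warn=%t err=%v\n",
			c.token != "", c.ca != "", c.allow, warn, err)
	}
}
```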

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 3, 2026 03:26

Copilot AI left a comment

Pull request overview

Adds a remote sandbox execution agent (workflow-sandbox-runner) and engine-side static configuration wiring so step.sandbox_exec can dispatch to named remote runners via exec_env: <runner-name>, including schema/docs/release updates.

Changes:

  • Introduces sandbox.remote_runners module type that registers a RemoteRunnerRegistry for named remote sandbox agents, with token secret resolution and TLS options.
  • Updates step.sandbox_exec runtime resolution to route non-local-docker exec_env values to configured remote runners, and forwards an informational sandbox profile to remote agents.
  • Adds the cmd/workflow-sandbox-runner gRPC agent implementation (profile clamping, env secret resolution, bearer auth interceptor) and ships it in the release workflow.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| schema/testdata/editor-schemas.golden.json | Adds editor schema entry for sandbox.remote_runners module config. |
| schema/schema.go | Registers sandbox.remote_runners as a core module type. |
| schema/module_schema.go | Adds built-in schema for sandbox.remote_runners outputs/config fields. |
| sandbox/profile.go | Records the originating profile name into SandboxConfig for forwarding. |
| sandbox/docker.go | Adds SandboxConfig.Profile + GetProfile() helper (defaults to strict). |
| plugins/pipelinesteps/plugin.go | Exposes sandbox.remote_runners module type + factory via pipelinesteps plugin. |
| module/sandbox_remote_runners.go | Implements the remote runner registry module, spec parsing, TLS construction, and RemoteRunner building. |
| module/sandbox_remote_runners_test.go | Adds unit tests for registry parsing and validation rules. |
| module/pipeline_step_sandbox_exec.go | Defers exec_env validation to runtime and wires context into runner resolution. |
| module/execenv_factory.go | Resolves named remote runners via service registry + secrets provider token resolution. |
| module/execenv_factory_test.go | Updates tests for new runtime validation and token secret resolution behavior. |
| DOCUMENTATION.md | Documents the new sandbox.remote_runners module type. |
| cmd/workflow-sandbox-runner/server.go | Implements the SandboxExec gRPC server (profile clamp, env secret resolution, streaming output, bearer auth). |
| cmd/workflow-sandbox-runner/server_test.go | Adds bufconn-based gRPC tests for clamp/secret/auth/output streaming behaviors. |
| cmd/workflow-sandbox-runner/main.go | Adds the agent entrypoint, auth gate, secrets backend selection, and TLS/bearer server setup. |
| cmd/workflow-sandbox-runner/main_test.go | Tests the unauthenticated-start refusal gate logic. |
| cmd/wfctl/type_registry.go | Adds sandbox.remote_runners to wfctl’s known module type registry. |
| .github/workflows/release.yml | Builds/releases the new workflow-sandbox-runner binary (non-Windows). |

Comment thread cmd/workflow-sandbox-runner/main.go
Comment thread module/sandbox_remote_runners.go
Comment thread module/sandbox_remote_runners.go
Comment thread module/sandbox_remote_runners.go Outdated
Comment thread cmd/workflow-sandbox-runner/server.go Outdated
Comment thread cmd/workflow-sandbox-runner/server.go

github-actions Bot commented Jun 3, 2026

⏱ Benchmark Results

No significant performance regressions detected.

benchstat comparison (baseline → PR)
## benchstat: baseline → PR
baseline-bench.txt:302: parsing iteration count: invalid syntax
baseline-bench.txt:354441: parsing iteration count: invalid syntax
baseline-bench.txt:711733: parsing iteration count: invalid syntax
baseline-bench.txt:1046258: parsing iteration count: invalid syntax
baseline-bench.txt:1366263: parsing iteration count: invalid syntax
baseline-bench.txt:1709884: parsing iteration count: invalid syntax
benchmark-results.txt:304: parsing iteration count: invalid syntax
benchmark-results.txt:340061: parsing iteration count: invalid syntax
benchmark-results.txt:644415: parsing iteration count: invalid syntax
benchmark-results.txt:941629: parsing iteration count: invalid syntax
benchmark-results.txt:1233368: parsing iteration count: invalid syntax
benchmark-results.txt:1556235: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/GoCodeAlone/workflow/dynamic
cpu: AMD EPYC 7763 64-Core Processor                
                            │ benchmark-results.txt │
                            │        sec/op         │
InterpreterCreation-4                  6.927m ± 60%
ComponentLoad-4                        3.572m ± 10%
ComponentExecute-4                     1.944µ ±  1%
PoolContention/workers-1-4             1.084µ ±  5%
PoolContention/workers-2-4             1.083µ ±  1%
PoolContention/workers-4-4             1.080µ ±  1%
PoolContention/workers-8-4             1.082µ ±  4%
PoolContention/workers-16-4            1.081µ ±  1%
ComponentLifecycle-4                   3.599m ±  1%
SourceValidation-4                     2.313µ ±  0%
RegistryConcurrent-4                   781.5n ±  6%
LoaderLoadFromString-4                 3.589m ±  1%
geomean                                18.56µ

                            │ benchmark-results.txt │
                            │         B/op          │
InterpreterCreation-4                  2.027Mi ± 0%
ComponentLoad-4                        2.180Mi ± 0%
ComponentExecute-4                     1.203Ki ± 0%
PoolContention/workers-1-4             1.203Ki ± 0%
PoolContention/workers-2-4             1.203Ki ± 0%
PoolContention/workers-4-4             1.203Ki ± 0%
PoolContention/workers-8-4             1.203Ki ± 0%
PoolContention/workers-16-4            1.203Ki ± 0%
ComponentLifecycle-4                   2.183Mi ± 0%
SourceValidation-4                     1.984Ki ± 0%
RegistryConcurrent-4                   1.133Ki ± 0%
LoaderLoadFromString-4                 2.182Mi ± 0%
geomean                                15.25Ki

                            │ benchmark-results.txt │
                            │       allocs/op       │
InterpreterCreation-4                   15.68k ± 0%
ComponentLoad-4                         18.02k ± 0%
ComponentExecute-4                       25.00 ± 0%
PoolContention/workers-1-4               25.00 ± 0%
PoolContention/workers-2-4               25.00 ± 0%
PoolContention/workers-4-4               25.00 ± 0%
PoolContention/workers-8-4               25.00 ± 0%
PoolContention/workers-16-4              25.00 ± 0%
ComponentLifecycle-4                    18.07k ± 0%
SourceValidation-4                       32.00 ± 0%
RegistryConcurrent-4                     2.000 ± 0%
LoaderLoadFromString-4                  18.06k ± 0%
geomean                                  183.3

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                            │ baseline-bench.txt │
                            │       sec/op       │
InterpreterCreation-4               7.713m ± 61%
ComponentLoad-4                     3.386m ±  0%
ComponentExecute-4                  1.953µ ±  1%
PoolContention/workers-1-4          1.168µ ±  1%
PoolContention/workers-2-4          1.167µ ±  2%
PoolContention/workers-4-4          1.171µ ±  1%
PoolContention/workers-8-4          1.172µ ±  1%
PoolContention/workers-16-4         1.199µ ±  2%
ComponentLifecycle-4                3.445m ±  3%
SourceValidation-4                  2.244µ ±  1%
RegistryConcurrent-4                939.2n ±  6%
LoaderLoadFromString-4              3.540m ±  2%
geomean                             19.45µ

                            │ baseline-bench.txt │
                            │        B/op        │
InterpreterCreation-4               2.027Mi ± 0%
ComponentLoad-4                     2.180Mi ± 0%
ComponentExecute-4                  1.203Ki ± 0%
PoolContention/workers-1-4          1.203Ki ± 0%
PoolContention/workers-2-4          1.203Ki ± 0%
PoolContention/workers-4-4          1.203Ki ± 0%
PoolContention/workers-8-4          1.203Ki ± 0%
PoolContention/workers-16-4         1.203Ki ± 0%
ComponentLifecycle-4                2.183Mi ± 0%
SourceValidation-4                  1.984Ki ± 0%
RegistryConcurrent-4                1.133Ki ± 0%
LoaderLoadFromString-4              2.182Mi ± 0%
geomean                             15.25Ki

                            │ baseline-bench.txt │
                            │     allocs/op      │
InterpreterCreation-4                15.68k ± 0%
ComponentLoad-4                      18.02k ± 0%
ComponentExecute-4                    25.00 ± 0%
PoolContention/workers-1-4            25.00 ± 0%
PoolContention/workers-2-4            25.00 ± 0%
PoolContention/workers-4-4            25.00 ± 0%
PoolContention/workers-8-4            25.00 ± 0%
PoolContention/workers-16-4           25.00 ± 0%
ComponentLifecycle-4                 18.07k ± 0%
SourceValidation-4                    32.00 ± 0%
RegistryConcurrent-4                  2.000 ± 0%
LoaderLoadFromString-4               18.06k ± 0%
geomean                               183.3

pkg: github.com/GoCodeAlone/workflow/middleware
cpu: AMD EPYC 7763 64-Core Processor                
                                  │ benchmark-results.txt │
                                  │        sec/op         │
CircuitBreakerDetection-4                    291.5n ± 18%
CircuitBreakerExecution_Success-4            21.53n ±  0%
CircuitBreakerExecution_Failure-4            66.24n ±  0%
geomean                                      74.63n

                                  │ benchmark-results.txt │
                                  │         B/op          │
CircuitBreakerDetection-4                    144.0 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

                                  │ benchmark-results.txt │
                                  │       allocs/op       │
CircuitBreakerDetection-4                    1.000 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                  │ baseline-bench.txt │
                                  │       sec/op       │
CircuitBreakerDetection-4                  458.0n ± 2%
CircuitBreakerExecution_Success-4          59.72n ± 1%
CircuitBreakerExecution_Failure-4          66.25n ± 0%
geomean                                    121.9n

                                  │ baseline-bench.txt │
                                  │        B/op        │
CircuitBreakerDetection-4                 144.0 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

                                  │ baseline-bench.txt │
                                  │     allocs/op      │
CircuitBreakerDetection-4                 1.000 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/module
cpu: AMD EPYC 7763 64-Core Processor                
                                 │ benchmark-results.txt │
                                 │        sec/op         │
IaCStateBackend_InProcess-4                 315.2n ± 28%
IaCStateBackend_GRPC-4                      9.479m ±  3%
JQTransform_Simple-4                        659.1n ± 37%
JQTransform_ObjectConstruction-4            1.500µ ±  1%
JQTransform_ArraySelect-4                   3.421µ ±  1%
JQTransform_Complex-4                       39.00µ ±  1%
JQTransform_Throughput-4                    1.823µ ±  1%
SSEPublishDelivery-4                        68.41n ±  1%
geomean                                     3.859µ

                                 │ benchmark-results.txt │
                                 │         B/op          │
IaCStateBackend_InProcess-4                 416.0 ± 0%
IaCStateBackend_GRPC-4                    6.011Mi ± 6%
JQTransform_Simple-4                      1.273Ki ± 0%
JQTransform_ObjectConstruction-4          1.773Ki ± 0%
JQTransform_ArraySelect-4                 2.625Ki ± 0%
JQTransform_Complex-4                     16.31Ki ± 0%
JQTransform_Throughput-4                  1.984Ki ± 0%
SSEPublishDelivery-4                        0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                 │ benchmark-results.txt │
                                 │       allocs/op       │
IaCStateBackend_InProcess-4                 2.000 ± 0%
IaCStateBackend_GRPC-4                     6.846k ± 0%
JQTransform_Simple-4                        10.00 ± 0%
JQTransform_ObjectConstruction-4            15.00 ± 0%
JQTransform_ArraySelect-4                   30.00 ± 0%
JQTransform_Complex-4                       328.0 ± 0%
JQTransform_Throughput-4                    17.00 ± 0%
SSEPublishDelivery-4                        0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                 │ baseline-bench.txt │
                                 │       sec/op       │
IaCStateBackend_InProcess-4              339.8n ± 24%
IaCStateBackend_GRPC-4                   9.660m ±  1%
JQTransform_Simple-4                     697.4n ± 31%
JQTransform_ObjectConstruction-4         1.524µ ±  2%
JQTransform_ArraySelect-4                3.283µ ±  0%
JQTransform_Complex-4                    36.22µ ±  1%
JQTransform_Throughput-4                 1.841µ ±  0%
SSEPublishDelivery-4                     76.53n ±  1%
geomean                                  3.943µ

                                 │ baseline-bench.txt │
                                 │        B/op        │
IaCStateBackend_InProcess-4             416.0 ±  0%
IaCStateBackend_GRPC-4                5.603Mi ± 14%
JQTransform_Simple-4                  1.273Ki ±  0%
JQTransform_ObjectConstruction-4      1.773Ki ±  0%
JQTransform_ArraySelect-4             2.625Ki ±  0%
JQTransform_Complex-4                 16.31Ki ±  0%
JQTransform_Throughput-4              1.984Ki ±  0%
SSEPublishDelivery-4                    0.000 ±  0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

                                 │ baseline-bench.txt │
                                 │     allocs/op      │
IaCStateBackend_InProcess-4              2.000 ± 0%
IaCStateBackend_GRPC-4                  6.872k ± 0%
JQTransform_Simple-4                     10.00 ± 0%
JQTransform_ObjectConstruction-4         15.00 ± 0%
JQTransform_ArraySelect-4                30.00 ± 0%
JQTransform_Complex-4                    328.0 ± 0%
JQTransform_Throughput-4                 17.00 ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/schema
cpu: AMD EPYC 7763 64-Core Processor                
                                    │ benchmark-results.txt │
                                    │        sec/op         │
SchemaValidation_Simple-4                       1.092µ ± 3%
SchemaValidation_AllFields-4                    1.660µ ± 2%
SchemaValidation_FormatValidation-4             1.576µ ± 1%
SchemaValidation_ManySchemas-4                  1.809µ ± 4%
geomean                                         1.508µ

                                    │ benchmark-results.txt │
                                    │         B/op          │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

                                    │ benchmark-results.txt │
                                    │       allocs/op       │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                    │ baseline-bench.txt │
                                    │       sec/op       │
SchemaValidation_Simple-4                    1.044µ ± 4%
SchemaValidation_AllFields-4                 1.520µ ± 6%
SchemaValidation_FormatValidation-4          1.486µ ± 2%
SchemaValidation_ManySchemas-4               1.486µ ± 5%
geomean                                      1.368µ

                                    │ baseline-bench.txt │
                                    │        B/op        │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                    │ baseline-bench.txt │
                                    │     allocs/op      │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/store
cpu: AMD EPYC 7763 64-Core Processor                
                                   │ benchmark-results.txt │
                                   │        sec/op         │
EventStoreAppend_InMemory-4                   1.242µ ± 13%
EventStoreAppend_SQLite-4                     1.396m ±  6%
GetTimeline_InMemory/events-10-4              14.32µ ±  5%
GetTimeline_InMemory/events-50-4              80.58µ ±  5%
GetTimeline_InMemory/events-100-4             159.0µ ± 21%
GetTimeline_InMemory/events-500-4             638.0µ ±  0%
GetTimeline_InMemory/events-1000-4            1.304m ±  0%
GetTimeline_SQLite/events-10-4                71.71µ ±  2%
GetTimeline_SQLite/events-50-4                215.9µ ±  1%
GetTimeline_SQLite/events-100-4               394.3µ ±  1%
GetTimeline_SQLite/events-500-4               1.793m ±  1%
GetTimeline_SQLite/events-1000-4              3.539m ±  0%
geomean                                       215.9µ

                                   │ benchmark-results.txt │
                                   │         B/op          │
EventStoreAppend_InMemory-4                     823.0 ± 5%
EventStoreAppend_SQLite-4                     1.986Ki ± 1%
GetTimeline_InMemory/events-10-4              7.953Ki ± 0%
GetTimeline_InMemory/events-50-4              46.62Ki ± 0%
GetTimeline_InMemory/events-100-4             94.48Ki ± 0%
GetTimeline_InMemory/events-500-4             472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4            944.3Ki ± 0%
GetTimeline_SQLite/events-10-4                16.74Ki ± 0%
GetTimeline_SQLite/events-50-4                87.14Ki ± 0%
GetTimeline_SQLite/events-100-4               175.4Ki ± 0%
GetTimeline_SQLite/events-500-4               846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4              1.639Mi ± 0%
geomean                                       67.58Ki

                                   │ benchmark-results.txt │
                                   │       allocs/op       │
EventStoreAppend_InMemory-4                     7.000 ± 0%
EventStoreAppend_SQLite-4                       53.00 ± 0%
GetTimeline_InMemory/events-10-4                125.0 ± 0%
GetTimeline_InMemory/events-50-4                653.0 ± 0%
GetTimeline_InMemory/events-100-4              1.306k ± 0%
GetTimeline_InMemory/events-500-4              6.514k ± 0%
GetTimeline_InMemory/events-1000-4             13.02k ± 0%
GetTimeline_SQLite/events-10-4                  382.0 ± 0%
GetTimeline_SQLite/events-50-4                 1.852k ± 0%
GetTimeline_SQLite/events-100-4                3.681k ± 0%
GetTimeline_SQLite/events-500-4                18.54k ± 0%
GetTimeline_SQLite/events-1000-4               37.29k ± 0%
geomean                                        1.162k

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                   │ baseline-bench.txt │
                                   │       sec/op       │
EventStoreAppend_InMemory-4                1.117µ ± 16%
EventStoreAppend_SQLite-4                  880.1µ ±  1%
GetTimeline_InMemory/events-10-4           14.40µ ±  8%
GetTimeline_InMemory/events-50-4           79.36µ ±  2%
GetTimeline_InMemory/events-100-4          157.7µ ± 11%
GetTimeline_InMemory/events-500-4          646.1µ ±  1%
GetTimeline_InMemory/events-1000-4         1.310m ±  1%
GetTimeline_SQLite/events-10-4             60.38µ ±  1%
GetTimeline_SQLite/events-50-4             211.4µ ±  1%
GetTimeline_SQLite/events-100-4            398.9µ ±  2%
GetTimeline_SQLite/events-500-4            1.877m ±  1%
GetTimeline_SQLite/events-1000-4           3.741m ±  2%
geomean                                    204.5µ

                                   │ baseline-bench.txt │
                                   │        B/op        │
EventStoreAppend_InMemory-4                  762.5 ± 9%
EventStoreAppend_SQLite-4                  1.985Ki ± 1%
GetTimeline_InMemory/events-10-4           7.953Ki ± 0%
GetTimeline_InMemory/events-50-4           46.62Ki ± 0%
GetTimeline_InMemory/events-100-4          94.48Ki ± 0%
GetTimeline_InMemory/events-500-4          472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4         944.3Ki ± 0%
GetTimeline_SQLite/events-10-4             16.74Ki ± 0%
GetTimeline_SQLite/events-50-4             87.14Ki ± 0%
GetTimeline_SQLite/events-100-4            175.4Ki ± 0%
GetTimeline_SQLite/events-500-4            846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4           1.639Mi ± 0%
geomean                                    67.15Ki

                                   │ baseline-bench.txt │
                                   │     allocs/op      │
EventStoreAppend_InMemory-4                  7.000 ± 0%
EventStoreAppend_SQLite-4                    53.00 ± 0%
GetTimeline_InMemory/events-10-4             125.0 ± 0%
GetTimeline_InMemory/events-50-4             653.0 ± 0%
GetTimeline_InMemory/events-100-4           1.306k ± 0%
GetTimeline_InMemory/events-500-4           6.514k ± 0%
GetTimeline_InMemory/events-1000-4          13.02k ± 0%
GetTimeline_SQLite/events-10-4               382.0 ± 0%
GetTimeline_SQLite/events-50-4              1.852k ± 0%
GetTimeline_SQLite/events-100-4             3.681k ± 0%
GetTimeline_SQLite/events-500-4             18.54k ± 0%
GetTimeline_SQLite/events-1000-4            37.29k ± 0%
geomean                                     1.162k

Benchmarks run with go test -bench=. -benchmem -count=6.
Regressions ≥ 20% are flagged. Results compared via benchstat.


codecov Bot commented Jun 3, 2026

Forward fix for 6 hardening issues on the PR8 security surface:

1. main.go buildServerOptions: refuse --tls-ca without --tls-cert AND --tls-key.
   Previously a CA-only invocation started with INSECURE transport while
   checkAuthRequirement treated --tls-ca as "auth configured" → no TLS + no token.
   mTLS needs the server's own certificate; fail-fast otherwise.
2. module Init: a runner with a token but no TLS (and no allow_insecure) now
   fails at Init (cleartext-token leak) instead of late in NewRemoteRunner at
   first Execute. Also validates tls.cert/tls.key both-or-neither at Init.
3. buildTLSConfig: error when exactly one of cert/key is set (both-or-neither)
   rather than silently dropping client authentication.
4. doc: AllowInsecure comment now reflects that it also permits no-token+no-TLS
   runners (not just token-over-no-TLS).
5. server.go bearer compare: compare fixed-length SHA-256 digests via
   subtle.ConstantTimeCompare. Raw ConstantTimeCompare is only constant-time for
   equal-length inputs, leaking the expected token length on a mismatch.
6. server.go Exec: validate empty command / missing image up front and return
   codes.InvalidArgument (caller error) instead of letting the Docker runner
   surface them as codes.Internal.
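
Item 5 can be sketched as below: hashing both tokens first gives subtle.ConstantTimeCompare two equal-length inputs, so the comparison no longer returns early on a length mismatch. The function name tokensEqual is an assumption for this example; the crypto/sha256 and crypto/subtle calls are the standard library's.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// tokensEqual compares two bearer tokens via their SHA-256 digests.
// Hypothetical helper name. The digests are always 32 bytes, so the
// constant-time compare never takes its unequal-length fast path.
func tokensEqual(presented, expected string) bool {
	a := sha256.Sum256([]byte(presented))
	b := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(a[:], b[:]) == 1
}

func main() {
	const expected = "correct-horse-battery"
	for _, presented := range []string{"correct-horse-battery", "short", "correct-horse-batterx"} {
		fmt.Printf("len=%d match=%t\n", len(presented), tokensEqual(presented, expected))
	}
}
```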

Tests: buildServerOptions CA-without-cert/key + cert/key-mismatch + no-TLS;
module token-no-TLS (+allow_insecure OK), tls cert/key both-or-neither, CA-only
OK; buildTLSConfig mismatch + neither; wrong-length token Unauthenticated;
empty-command + missing-image InvalidArgument.
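
The flag checks from items 1 and 3 reduce to a both-or-neither rule plus a "CA needs a server certificate" rule. The sketch below shows that logic in isolation; validateTLSFlags is a hypothetical helper rather than the real buildServerOptions, and accepting a cert/key pair without a CA is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// validateTLSFlags is a hypothetical check over the --tls-ca, --tls-cert and
// --tls-key values. It fails fast rather than silently dropping TLS.
func validateTLSFlags(ca, cert, key string) error {
	// Exactly one of cert/key set is always a misconfiguration.
	if (cert == "") != (key == "") {
		return errors.New("--tls-cert and --tls-key must be set together")
	}
	// A client CA without the server's own certificate cannot do mTLS.
	if ca != "" && cert == "" {
		return errors.New("--tls-ca requires --tls-cert and --tls-key")
	}
	return nil
}

func main() {
	cases := [][3]string{
		{"", "", ""},
		{"ca.pem", "", ""},
		{"", "srv.pem", ""},
		{"ca.pem", "srv.pem", "srv.key"},
	}
	for _, c := range cases {
		fmt.Printf("ca=%q cert=%q key=%q -> %v\n",
			c[0], c[1], c[2], validateTLSFlags(c[0], c[1], c[2]))
	}
}
```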

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@intel352 intel352 merged commit 0a833ed into main Jun 3, 2026
22 checks passed
@intel352 intel352 deleted the feat/infra-p3b-agent branch June 3, 2026 04:00

2 participants