Problem
workflow-compute agent/provider updates need a Workflow-owned hot-reload path that can safely stage a plugin/config update, probe it, and roll back on failure without crashing or unregistering the currently active plugin surface.
Current observations from workflow:
- plugin/external/manager.go implements ReloadPlugin as unload-then-load. If the candidate binary/manifest fails to load, the old plugin process is already killed.
- cmd/server/main.go startup discovery loads external plugins through a local manager, while the management API creates a separate ExternalPluginManager. The API can manage subprocesses but does not obviously own the already-loaded engine plugin registrations.
- cmd/server/main.go full config reload stops the current engine before building/starting the replacement. Failure after stop can leave the running process degraded.
- Existing docs claim reload support, but they do not define try-activate, health probe, rollback, or crash-safe handoff semantics.
Required contract
Add a Workflow/wfctl-owned safe reload contract for plugin/config updates:
- Stage candidate plugin binary/config without replacing current active marker.
- Start candidate plugin process and perform handshake/strict contract validation.
- Build/probe candidate engine/config before stopping the current engine when possible.
- Swap active pointers only after probe success.
- On candidate load/probe failure, kill candidate and keep current engine/plugin active.
- Emit observable reload result/status for operators and agent update managers.
- Keep package/update artifact trust outside Workflow core; Workflow should consume already-staged local artifacts/config, not fetch arbitrary release URLs.
Acceptance
- Unit tests prove ReloadPlugin failure preserves the old plugin client/process registration.
- Unit/integration tests prove config reload failure keeps the prior engine active.
- HTTP/API or wfctl surface exposes a dry-run/try-activate result with enough status to drive workflow-compute update campaigns.
- Docs explicitly distinguish legacy unload/load reload from safe try-activate rollback.
Downstream reference
GoCodeAlone/workflow-compute SPEC task T199, invariant V387: Workflow plugin hot-reload upgrade path supports try-activate, health probe, rollback, and crash-safe config/plugin binary handoff.