Add safe try-activate rollback for plugin/config hot-reload #684

Merged
intel352 merged 5 commits into main from copilot/add-safe-plugin-hot-reload
May 15, 2026

Contributor

Copilot AI commented May 15, 2026

reloadEngine stopped the current engine before building the replacement — a build failure left the server with no running engine. ReloadPlugin similarly had unsafe stop-then-start semantics. Neither path defined try-activate, rollback, or probe contracts.

Core: safe reloadEngine sequence

Before: stop → build → start (build failure = degraded)

After: build candidate → stop old (only on build success) → start candidate → rollback to previous config on start failure

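// Stage 1: build candidate; the current engine stays live the entire time
newEngine, _, _, buildErr := buildEngine(newCfg, logger)
if buildErr != nil {
    return fmt.Errorf("failed to build candidate engine (current engine unchanged): %w", buildErr)
}
// Stage 2: stop old, only reached when the candidate built successfully
oldEngine.Stop(ctx)
// Stage 3: start candidate; roll back to oldConfig on failure
if startErr := newEngine.Start(ctx); startErr != nil {
    // Best-effort stop of the failed candidate before rolling back
    newEngine.Stop(ctx)
    rollbackEngine, _, _, rbErr := buildEngine(oldConfig, logger)
    if rbErr != nil {
        return fmt.Errorf("reload failed (%v) and rollback build failed: %w", startErr, rbErr)
    }
    if rbErr = rollbackEngine.Start(ctx); rbErr != nil {
        return fmt.Errorf("reload failed (%v) and rollback start failed: %w", startErr, rbErr)
    }
    return fmt.Errorf("reload failed (rolled back to previous config): %w", startErr)
}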
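
The sequence above can be exercised in isolation. The following is a minimal sketch, not the code from this PR: the `engine` interface, `fakeEngine`, `fakeBuild`, `safeReload`, and `demo` are hypothetical names, and configs are reduced to plain strings so the three outcomes (build failure, start failure with rollback, success) are easy to observe.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// engine is a hypothetical stand-in for the real engine type.
type engine interface {
	Start(ctx context.Context) error
	Stop(ctx context.Context) error
}

// fakeEngine tracks whether it is running and can be told to fail on Start.
type fakeEngine struct {
	name      string
	failStart bool
	running   bool
}

func (e *fakeEngine) Start(context.Context) error {
	if e.failStart {
		return errors.New("start failed: " + e.name)
	}
	e.running = true
	return nil
}

func (e *fakeEngine) Stop(context.Context) error {
	e.running = false
	return nil
}

// fakeBuild is a hypothetical builder keyed on a config name.
func fakeBuild(cfg string) (engine, error) {
	switch cfg {
	case "bad-build":
		return nil, errors.New("module type not found")
	case "bad-start":
		return &fakeEngine{name: cfg, failStart: true}, nil
	default:
		return &fakeEngine{name: cfg}, nil
	}
}

// safeReload builds the candidate first, stops the old engine only after a
// successful build, and rebuilds the previous config if the candidate fails
// to start. It returns whichever engine is active afterwards.
func safeReload(ctx context.Context, old engine, oldCfg, newCfg string,
	build func(string) (engine, error)) (engine, error) {
	candidate, err := build(newCfg)
	if err != nil {
		return old, fmt.Errorf("build failed (current engine unchanged): %w", err)
	}
	_ = old.Stop(ctx)
	if startErr := candidate.Start(ctx); startErr != nil {
		_ = candidate.Stop(ctx) // best-effort cleanup of the failed candidate
		rb, rbErr := build(oldCfg)
		if rbErr != nil {
			return nil, fmt.Errorf("start failed (%v); rollback build failed: %w", startErr, rbErr)
		}
		if rbErr = rb.Start(ctx); rbErr != nil {
			return nil, fmt.Errorf("start failed (%v); rollback start failed: %w", startErr, rbErr)
		}
		return rb, fmt.Errorf("reload failed (rolled back): %w", startErr)
	}
	return candidate, nil
}

// demo reloads from a live "v1" engine to newCfg and reports which config is
// active afterwards and whether the original engine instance is still running.
func demo(newCfg string) (activeName string, oldRunning bool, err error) {
	old := &fakeEngine{name: "v1", running: true}
	active, err := safeReload(context.Background(), old, "v1", newCfg, fakeBuild)
	if fe, ok := active.(*fakeEngine); ok {
		activeName = fe.name
	}
	return activeName, old.running, err
}

func main() {
	for _, cfg := range []string{"bad-build", "bad-start", "v2"} {
		name, oldRunning, err := demo(cfg)
		fmt.Printf("%s: active=%s oldRunning=%v err=%v\n", cfg, name, oldRunning, err)
	}
}
```

Returning the active engine alongside the error lets the caller keep a valid pointer in every case, including the rollback path where the reload is still reported as failed.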

Try-activate probe endpoint

New POST /api/workflow/try-activate (also registered at /api/v1/admin/engine/try-activate) builds a candidate engine without touching the active one. It returns a structured result that automated update campaigns can consume:

{ "status": "build_ok", "moduleTypes": ["http.server"], "stepTypes": ["step.set"], "triggerTypes": ["http"] }
{ "status": "build_failed", "error": "module type \"bad.type\" not found" }
  • StdEngine.RegisteredModuleTypes/StepTypes/TriggerTypes() expose registered type names for the probe result
  • WorkflowUIHandler.SetTryActivateFunc wires the probe into the management handler
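
A caller such as an update campaign might decode the probe response as sketched below. This is an illustration only: the struct mirrors the JSON keys shown above, but the Go field names and the `parseResult` and `safeToProceed` helpers are assumptions, not the `TryActivateResult` type defined in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TryActivateResult mirrors the JSON keys in the examples above.
// The Go field names are assumptions made for this sketch.
type TryActivateResult struct {
	Status       string   `json:"status"`
	ModuleTypes  []string `json:"moduleTypes,omitempty"`
	StepTypes    []string `json:"stepTypes,omitempty"`
	TriggerTypes []string `json:"triggerTypes,omitempty"`
	Error        string   `json:"error,omitempty"`
}

// parseResult decodes a probe response body.
func parseResult(body string) (TryActivateResult, error) {
	var r TryActivateResult
	err := json.Unmarshal([]byte(body), &r)
	return r, err
}

// safeToProceed reports whether the candidate config built successfully.
func safeToProceed(r TryActivateResult) bool {
	return r.Status == "build_ok"
}

func main() {
	bodies := []string{
		`{ "status": "build_ok", "moduleTypes": ["http.server"], "stepTypes": ["step.set"], "triggerTypes": ["http"] }`,
		`{ "status": "build_failed", "error": "module type \"bad.type\" not found" }`,
	}
	for _, body := range bodies {
		r, err := parseResult(body)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("status=%s proceed=%v error=%q\n", r.Status, safeToProceed(r), r.Error)
	}
}
```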

Tests

  • TestReloadEngine_BuildFailureKeepsPriorEngineActive — build failure leaves engine pointer unchanged
  • TestReloadEngine_SuccessReplacesEngine — successful reload replaces engine
  • TestTryActivateEngine_Valid/Invalid — probe returns build_ok/build_failed without touching active engine
  • TestStdEngine_RegisteredModule/Step/TriggerTypes — type enumeration and sort order
  • TestWorkflowUIHandler_TryActivate_* — HTTP 200/422/503 coverage
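
The sort-order check matters because Go map iteration order is randomized. A minimal sketch of a sorted enumeration follows, assuming the registry is a map keyed by type name; `typeRegistry` and `sortedTypes` are hypothetical names, not the `StdEngine` internals, and `http.router` and `messaging.broker` are illustrative entries.

```go
package main

import (
	"fmt"
	"sort"
)

// typeRegistry is a hypothetical stand-in for a map of registered factories.
type typeRegistry map[string]struct{}

// sortedTypes returns the registered names in ascending order so callers
// and tests see a stable result despite random map iteration order.
func sortedTypes(r typeRegistry) []string {
	names := make([]string, 0, len(r))
	for name := range r {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// "http.server" appears in the PR; the other two names are illustrative.
	modules := typeRegistry{"messaging.broker": {}, "http.server": {}, "http.router": {}}
	fmt.Println(sortedTypes(modules))
}
```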

Docs

APPLICATION_LIFECYCLE.md, DEPLOYMENT_GUIDE.md, and PLUGIN_DEVELOPMENT_GUIDE.md now explicitly distinguish the legacy unsafe unload→load pattern from the current safe try-activate contract.

Copilot AI self-assigned this May 15, 2026
Copilot AI review requested due to automatic review settings May 15, 2026 13:06
Copilot AI linked an issue May 15, 2026 that may be closed by this pull request

codecov Bot commented May 15, 2026

Codecov Report

❌ Patch coverage is 87.23404% with 12 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| cmd/server/main.go | 75.00% | 6 Missing and 4 partials ⚠️ |
| module/api_workflow_ui.go | 92.00% | 1 Missing and 1 partial ⚠️ |

github-actions Bot commented May 15, 2026

⏱ Benchmark Results

No significant performance regressions detected.

benchstat comparison (baseline → PR)
## benchstat: baseline → PR
baseline-bench.txt:276: parsing iteration count: invalid syntax
baseline-bench.txt:347521: parsing iteration count: invalid syntax
baseline-bench.txt:683936: parsing iteration count: invalid syntax
baseline-bench.txt:1035174: parsing iteration count: invalid syntax
baseline-bench.txt:1359625: parsing iteration count: invalid syntax
baseline-bench.txt:1630286: parsing iteration count: invalid syntax
benchmark-results.txt:276: parsing iteration count: invalid syntax
benchmark-results.txt:266290: parsing iteration count: invalid syntax
benchmark-results.txt:576866: parsing iteration count: invalid syntax
benchmark-results.txt:885597: parsing iteration count: invalid syntax
benchmark-results.txt:1159408: parsing iteration count: invalid syntax
benchmark-results.txt:1470611: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/GoCodeAlone/workflow/dynamic
cpu: AMD EPYC 9V74 80-Core Processor                
                            │ benchmark-results.txt │
                            │        sec/op         │
InterpreterCreation-4                  6.455m ± 71%
ComponentLoad-4                        3.531m ±  7%
ComponentExecute-4                     1.854µ ±  3%
PoolContention/workers-1-4             1.017µ ±  1%
PoolContention/workers-2-4             1.013µ ±  1%
PoolContention/workers-4-4             1.007µ ±  1%
PoolContention/workers-8-4             1.011µ ±  1%
PoolContention/workers-16-4            1.015µ ±  1%
ComponentLifecycle-4                   3.548m ±  0%
SourceValidation-4                     2.099µ ±  3%
RegistryConcurrent-4                   749.9n ±  5%
LoaderLoadFromString-4                 3.588m ±  3%
geomean                                17.63µ

                            │ benchmark-results.txt │
                            │         B/op          │
InterpreterCreation-4                  2.027Mi ± 0%
ComponentLoad-4                        2.180Mi ± 0%
ComponentExecute-4                     1.203Ki ± 0%
PoolContention/workers-1-4             1.203Ki ± 0%
PoolContention/workers-2-4             1.203Ki ± 0%
PoolContention/workers-4-4             1.203Ki ± 0%
PoolContention/workers-8-4             1.203Ki ± 0%
PoolContention/workers-16-4            1.203Ki ± 0%
ComponentLifecycle-4                   2.183Mi ± 0%
SourceValidation-4                     1.984Ki ± 0%
RegistryConcurrent-4                   1.133Ki ± 0%
LoaderLoadFromString-4                 2.182Mi ± 0%
geomean                                15.25Ki

                            │ benchmark-results.txt │
                            │       allocs/op       │
InterpreterCreation-4                   15.68k ± 0%
ComponentLoad-4                         18.02k ± 0%
ComponentExecute-4                       25.00 ± 0%
PoolContention/workers-1-4               25.00 ± 0%
PoolContention/workers-2-4               25.00 ± 0%
PoolContention/workers-4-4               25.00 ± 0%
PoolContention/workers-8-4               25.00 ± 0%
PoolContention/workers-16-4              25.00 ± 0%
ComponentLifecycle-4                    18.07k ± 0%
SourceValidation-4                       32.00 ± 0%
RegistryConcurrent-4                     2.000 ± 0%
LoaderLoadFromString-4                  18.06k ± 0%
geomean                                  183.3

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                            │ baseline-bench.txt │
                            │       sec/op       │
InterpreterCreation-4               7.661m ± 59%
ComponentLoad-4                     3.461m ±  2%
ComponentExecute-4                  1.839µ ±  1%
PoolContention/workers-1-4          1.175µ ±  1%
PoolContention/workers-2-4          1.178µ ±  1%
PoolContention/workers-4-4          1.171µ ±  1%
PoolContention/workers-8-4          1.181µ ±  1%
PoolContention/workers-16-4         1.174µ ±  1%
ComponentLifecycle-4                3.437m ±  1%
SourceValidation-4                  2.212µ ±  1%
RegistryConcurrent-4                867.9n ±  2%
LoaderLoadFromString-4              3.520m ±  2%
geomean                             19.22µ

                            │ baseline-bench.txt │
                            │        B/op        │
InterpreterCreation-4               2.027Mi ± 0%
ComponentLoad-4                     2.180Mi ± 0%
ComponentExecute-4                  1.203Ki ± 0%
PoolContention/workers-1-4          1.203Ki ± 0%
PoolContention/workers-2-4          1.203Ki ± 0%
PoolContention/workers-4-4          1.203Ki ± 0%
PoolContention/workers-8-4          1.203Ki ± 0%
PoolContention/workers-16-4         1.203Ki ± 0%
ComponentLifecycle-4                2.183Mi ± 0%
SourceValidation-4                  1.984Ki ± 0%
RegistryConcurrent-4                1.133Ki ± 0%
LoaderLoadFromString-4              2.182Mi ± 0%
geomean                             15.25Ki

                            │ baseline-bench.txt │
                            │     allocs/op      │
InterpreterCreation-4                15.68k ± 0%
ComponentLoad-4                      18.02k ± 0%
ComponentExecute-4                    25.00 ± 0%
PoolContention/workers-1-4            25.00 ± 0%
PoolContention/workers-2-4            25.00 ± 0%
PoolContention/workers-4-4            25.00 ± 0%
PoolContention/workers-8-4            25.00 ± 0%
PoolContention/workers-16-4           25.00 ± 0%
ComponentLifecycle-4                 18.07k ± 0%
SourceValidation-4                    32.00 ± 0%
RegistryConcurrent-4                  2.000 ± 0%
LoaderLoadFromString-4               18.06k ± 0%
geomean                               183.3

pkg: github.com/GoCodeAlone/workflow/middleware
cpu: AMD EPYC 9V74 80-Core Processor                
                                  │ benchmark-results.txt │
                                  │        sec/op         │
CircuitBreakerDetection-4                     297.2n ± 7%
CircuitBreakerExecution_Success-4             22.69n ± 1%
CircuitBreakerExecution_Failure-4             70.90n ± 0%
geomean                                       78.20n

                                  │ benchmark-results.txt │
                                  │         B/op          │
CircuitBreakerDetection-4                    144.0 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

                                  │ benchmark-results.txt │
                                  │       allocs/op       │
CircuitBreakerDetection-4                    1.000 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                  │ baseline-bench.txt │
                                  │       sec/op       │
CircuitBreakerDetection-4                  448.3n ± 6%
CircuitBreakerExecution_Success-4          59.64n ± 0%
CircuitBreakerExecution_Failure-4          64.71n ± 0%
geomean                                    120.1n

                                  │ baseline-bench.txt │
                                  │        B/op        │
CircuitBreakerDetection-4                 144.0 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

                                  │ baseline-bench.txt │
                                  │     allocs/op      │
CircuitBreakerDetection-4                 1.000 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/module
cpu: AMD EPYC 9V74 80-Core Processor                
                                 │ benchmark-results.txt │
                                 │        sec/op         │
IaCStateBackend_InProcess-4                 290.6n ± 24%
IaCStateBackend_GRPC-4                      9.739m ± 18%
JQTransform_Simple-4                        663.1n ± 25%
JQTransform_ObjectConstruction-4            1.407µ ±  1%
JQTransform_ArraySelect-4                   3.390µ ±  2%
JQTransform_Complex-4                       41.40µ ±  1%
JQTransform_Throughput-4                    1.736µ ±  1%
SSEPublishDelivery-4                        65.05n ±  2%
geomean                                     3.782µ

                                 │ benchmark-results.txt │
                                 │         B/op          │
IaCStateBackend_InProcess-4                416.0 ±  0%
IaCStateBackend_GRPC-4                   5.631Mi ± 19%
JQTransform_Simple-4                     1.273Ki ±  0%
JQTransform_ObjectConstruction-4         1.773Ki ±  0%
JQTransform_ArraySelect-4                2.625Ki ±  0%
JQTransform_Complex-4                    16.22Ki ±  0%
JQTransform_Throughput-4                 1.984Ki ±  0%
SSEPublishDelivery-4                       0.000 ±  0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                 │ benchmark-results.txt │
                                 │       allocs/op       │
IaCStateBackend_InProcess-4                 2.000 ± 0%
IaCStateBackend_GRPC-4                     6.857k ± 0%
JQTransform_Simple-4                        10.00 ± 0%
JQTransform_ObjectConstruction-4            15.00 ± 0%
JQTransform_ArraySelect-4                   30.00 ± 0%
JQTransform_Complex-4                       324.0 ± 0%
JQTransform_Throughput-4                    17.00 ± 0%
SSEPublishDelivery-4                        0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                 │ baseline-bench.txt │
                                 │       sec/op       │
IaCStateBackend_InProcess-4              333.2n ± 29%
IaCStateBackend_GRPC-4                   9.514m ±  3%
JQTransform_Simple-4                     710.5n ± 29%
JQTransform_ObjectConstruction-4         1.499µ ±  1%
JQTransform_ArraySelect-4                3.244µ ±  1%
JQTransform_Complex-4                    35.89µ ±  1%
JQTransform_Throughput-4                 1.821µ ±  1%
SSEPublishDelivery-4                     76.45n ±  1%
geomean                                  3.911µ

                                 │ baseline-bench.txt │
                                 │        B/op        │
IaCStateBackend_InProcess-4              416.0 ± 0%
IaCStateBackend_GRPC-4                 5.714Mi ± 6%
JQTransform_Simple-4                   1.273Ki ± 0%
JQTransform_ObjectConstruction-4       1.773Ki ± 0%
JQTransform_ArraySelect-4              2.625Ki ± 0%
JQTransform_Complex-4                  16.22Ki ± 0%
JQTransform_Throughput-4               1.984Ki ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

                                 │ baseline-bench.txt │
                                 │     allocs/op      │
IaCStateBackend_InProcess-4              2.000 ± 0%
IaCStateBackend_GRPC-4                  6.872k ± 0%
JQTransform_Simple-4                     10.00 ± 0%
JQTransform_ObjectConstruction-4         15.00 ± 0%
JQTransform_ArraySelect-4                30.00 ± 0%
JQTransform_Complex-4                    324.0 ± 0%
JQTransform_Throughput-4                 17.00 ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/schema
cpu: AMD EPYC 9V74 80-Core Processor                
                                    │ benchmark-results.txt │
                                    │        sec/op         │
SchemaValidation_Simple-4                       1.089µ ± 6%
SchemaValidation_AllFields-4                    1.630µ ± 5%
SchemaValidation_FormatValidation-4             1.569µ ± 1%
SchemaValidation_ManySchemas-4                  1.593µ ± 2%
geomean                                         1.451µ

                                    │ benchmark-results.txt │
                                    │         B/op          │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

                                    │ benchmark-results.txt │
                                    │       allocs/op       │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                    │ baseline-bench.txt │
                                    │       sec/op       │
SchemaValidation_Simple-4                    1.018µ ± 3%
SchemaValidation_AllFields-4                 1.546µ ± 2%
SchemaValidation_FormatValidation-4          1.499µ ± 0%
SchemaValidation_ManySchemas-4               1.504µ ± 4%
geomean                                      1.372µ

                                    │ baseline-bench.txt │
                                    │        B/op        │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                    │ baseline-bench.txt │
                                    │     allocs/op      │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/store
cpu: AMD EPYC 9V74 80-Core Processor                
                                   │ benchmark-results.txt │
                                   │        sec/op         │
EventStoreAppend_InMemory-4                   1.116µ ± 17%
EventStoreAppend_SQLite-4                     1.059m ±  2%
GetTimeline_InMemory/events-10-4              12.35µ ±  2%
GetTimeline_InMemory/events-50-4              68.75µ ± 20%
GetTimeline_InMemory/events-100-4             109.7µ ±  1%
GetTimeline_InMemory/events-500-4             560.3µ ±  0%
GetTimeline_InMemory/events-1000-4            1.137m ±  0%
GetTimeline_SQLite/events-10-4                83.53µ ±  1%
GetTimeline_SQLite/events-50-4                217.5µ ±  1%
GetTimeline_SQLite/events-100-4               380.9µ ±  1%
GetTimeline_SQLite/events-500-4               1.648m ±  1%
GetTimeline_SQLite/events-1000-4              3.234m ±  2%
geomean                                       192.5µ

                                   │ benchmark-results.txt │
                                   │         B/op          │
EventStoreAppend_InMemory-4                     764.0 ± 6%
EventStoreAppend_SQLite-4                     1.985Ki ± 1%
GetTimeline_InMemory/events-10-4              7.953Ki ± 0%
GetTimeline_InMemory/events-50-4              46.62Ki ± 0%
GetTimeline_InMemory/events-100-4             94.48Ki ± 0%
GetTimeline_InMemory/events-500-4             472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4            944.3Ki ± 0%
GetTimeline_SQLite/events-10-4                16.74Ki ± 0%
GetTimeline_SQLite/events-50-4                87.14Ki ± 0%
GetTimeline_SQLite/events-100-4               175.4Ki ± 0%
GetTimeline_SQLite/events-500-4               846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4              1.639Mi ± 0%
geomean                                       67.16Ki

                                   │ benchmark-results.txt │
                                   │       allocs/op       │
EventStoreAppend_InMemory-4                     7.000 ± 0%
EventStoreAppend_SQLite-4                       53.00 ± 0%
GetTimeline_InMemory/events-10-4                125.0 ± 0%
GetTimeline_InMemory/events-50-4                653.0 ± 0%
GetTimeline_InMemory/events-100-4              1.306k ± 0%
GetTimeline_InMemory/events-500-4              6.514k ± 0%
GetTimeline_InMemory/events-1000-4             13.02k ± 0%
GetTimeline_SQLite/events-10-4                  382.0 ± 0%
GetTimeline_SQLite/events-50-4                 1.852k ± 0%
GetTimeline_SQLite/events-100-4                3.681k ± 0%
GetTimeline_SQLite/events-500-4                18.54k ± 0%
GetTimeline_SQLite/events-1000-4               37.29k ± 0%
geomean                                        1.162k

cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                   │ baseline-bench.txt │
                                   │       sec/op       │
EventStoreAppend_InMemory-4                1.119µ ±  4%
EventStoreAppend_SQLite-4                  915.3µ ±  6%
GetTimeline_InMemory/events-10-4           13.58µ ±  4%
GetTimeline_InMemory/events-50-4           74.38µ ±  3%
GetTimeline_InMemory/events-100-4          149.5µ ±  2%
GetTimeline_InMemory/events-500-4          746.4µ ± 21%
GetTimeline_InMemory/events-1000-4         1.195m ±  1%
GetTimeline_SQLite/events-10-4             80.53µ ±  1%
GetTimeline_SQLite/events-50-4             229.6µ ±  2%
GetTimeline_SQLite/events-100-4            415.9µ ±  3%
GetTimeline_SQLite/events-500-4            1.864m ±  1%
GetTimeline_SQLite/events-1000-4           3.720m ±  4%
geomean                                    210.0µ

                                   │ baseline-bench.txt │
                                   │        B/op        │
EventStoreAppend_InMemory-4                  801.0 ± 5%
EventStoreAppend_SQLite-4                  1.982Ki ± 2%
GetTimeline_InMemory/events-10-4           7.953Ki ± 0%
GetTimeline_InMemory/events-50-4           46.62Ki ± 0%
GetTimeline_InMemory/events-100-4          94.48Ki ± 0%
GetTimeline_InMemory/events-500-4          472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4         944.3Ki ± 0%
GetTimeline_SQLite/events-10-4             16.74Ki ± 0%
GetTimeline_SQLite/events-50-4             87.14Ki ± 0%
GetTimeline_SQLite/events-100-4            175.4Ki ± 0%
GetTimeline_SQLite/events-500-4            846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4           1.639Mi ± 0%
geomean                                    67.42Ki

                                   │ baseline-bench.txt │
                                   │     allocs/op      │
EventStoreAppend_InMemory-4                  7.000 ± 0%
EventStoreAppend_SQLite-4                    53.00 ± 0%
GetTimeline_InMemory/events-10-4             125.0 ± 0%
GetTimeline_InMemory/events-50-4             653.0 ± 0%
GetTimeline_InMemory/events-100-4           1.306k ± 0%
GetTimeline_InMemory/events-500-4           6.514k ± 0%
GetTimeline_InMemory/events-1000-4          13.02k ± 0%
GetTimeline_SQLite/events-10-4               382.0 ± 0%
GetTimeline_SQLite/events-50-4              1.852k ± 0%
GetTimeline_SQLite/events-100-4             3.681k ± 0%
GetTimeline_SQLite/events-500-4             18.54k ± 0%
GetTimeline_SQLite/events-1000-4            37.29k ± 0%
geomean                                     1.162k

Benchmarks run with go test -bench=. -benchmem -count=6.
Regressions ≥ 20% are flagged. Results compared via benchstat.
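
The 20% cutoff can be checked by hand against the medians in the tables. The sketch below assumes the relative change is computed as (PR - baseline) / baseline; the workflow's actual formula is not shown in this comment, and `pctChange` and `isRegression` are hypothetical helpers. The baseline ran on an Intel Xeon and the PR on an AMD EPYC, so the differences may reflect hardware rather than code.

```go
package main

import "fmt"

// regressionThresholdPct mirrors the 20% cutoff quoted above.
const regressionThresholdPct = 20.0

// pctChange returns the relative change of pr versus baseline, in percent.
// For sec/op, a positive value means the PR run was slower.
func pctChange(baseline, pr float64) float64 {
	return (pr - baseline) * 100 / baseline
}

// isRegression applies the assumed threshold rule.
func isRegression(baseline, pr float64) bool {
	return pctChange(baseline, pr) >= regressionThresholdPct
}

func main() {
	// sec/op medians copied from the tables above, in microseconds.
	fmt.Printf("JQTransform_Complex: %+.1f%% flagged=%v\n",
		pctChange(35.89, 41.40), isRegression(35.89, 41.40))
	fmt.Printf("EventStoreAppend_SQLite: %+.1f%% flagged=%v\n",
		pctChange(915.3, 1059), isRegression(915.3, 1059))
}
```

Under this assumption, both benchmarks slow down by less than 20%, which is consistent with the "no significant performance regressions" summary.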

- Fix reloadEngine to build candidate before stopping old engine
  (safe try-activate: build failure leaves current engine untouched)
- Add rollback: if candidate start fails, rebuild old config and restart
- Add tryActivateEngine probe that builds without swapping any pointers
- Add RegisteredModuleTypes/StepTypes/TriggerTypes to StdEngine
- Add TryActivateResult type and SetTryActivateFunc to WorkflowUIHandler
- Add POST /api/workflow/try-activate endpoint
- Add POST /api/v1/admin/engine/try-activate to OpenAPI schema
- Wire tryActivateEngine into server management handlers
- Add unit tests: reload build failure, success, try-activate success/failure
- Add engine tests: RegisteredModuleTypes/StepTypes/TriggerTypes
- Add UI handler tests: try-activate endpoint coverage
- Update docs: APPLICATION_LIFECYCLE.md, DEPLOYMENT_GUIDE.md,
  PLUGIN_DEVELOPMENT_GUIDE.md distinguish legacy vs safe reload semantics

Agent-Logs-Url: https://github.com/GoCodeAlone/workflow/sessions/8b12843e-11ce-440e-b111-9201f556b02a

Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot May 15, 2026 13:33
Copilot AI changed the title [WIP] Add safe plugin/config hot-reload try-activate rollback Add safe try-activate rollback for plugin/config hot-reload May 15, 2026
Copilot AI requested a review from intel352 May 15, 2026 13:39
@intel352 intel352 marked this pull request as ready for review May 15, 2026 13:44
Copilot AI review requested due to automatic review settings May 15, 2026 13:44
Contributor

Copilot AI left a comment

Pull request overview

Introduces a safe try-activate/rollback contract for hot-reloading the engine and external plugins, plus a new probe endpoint that builds a candidate config without disturbing the live engine.

Changes:

  • reloadEngine reorders to build-candidate → stop-old → start-candidate, with automatic rollback to the previous config when the candidate fails to start.
  • Adds POST /api/workflow/try-activate (and /api/v1/admin/engine/try-activate) with a new TryActivateResult response and StdEngine.RegisteredModule/Step/TriggerTypes() accessors.
  • Documents the legacy-vs-safe semantics in APPLICATION_LIFECYCLE.md, DEPLOYMENT_GUIDE.md, and PLUGIN_DEVELOPMENT_GUIDE.md, and adds unit tests for the new code paths.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| cmd/server/main.go | Implements safe reload sequence and tryActivateEngine probe; wires probe into the management handler. |
| cmd/server/main_test.go | Tests build-failure-preserves-engine, success replaces engine, and try-activate valid/invalid paths. |
| engine.go | Adds RegisteredModuleTypes/StepTypes/TriggerTypes accessors. |
| engine_test.go | Tests for the new accessors verifying contents and sort order. |
| module/api_workflow_ui.go | New TryActivateResult type, SetTryActivateFunc, route registration, and handleTryActivate handler. |
| module/api_workflow_ui_test.go | Tests handler 503/200/422 paths and ServeHTTP dispatch. |
| module/openapi_admin_schemas.go | Adds OpenAPI operation schema for the new endpoint (component schema not yet registered). |
| docs/APPLICATION_LIFECYCLE.md | Documents the three-stage safe reload and try-activate probe. |
| docs/DEPLOYMENT_GUIDE.md | Documents safe reload sequence and adds dry-run probe section. |
| docs/PLUGIN_DEVELOPMENT_GUIDE.md | Clarifies legacy vs. safe plugin reload semantics. |

Comment thread module/openapi_admin_schemas.go
Comment thread cmd/server/main.go
Comment thread module/api_workflow_ui.go
Copilot AI review requested due to automatic review settings May 15, 2026 13:54
Contributor

Addressed the three Copilot review findings in 0a4d878:

  • registered TryActivateResult in the OpenAPI component schemas and added an integration assertion for the try-activate refs
  • stop the failed candidate engine best-effort before rollback
  • synthesize a build_failed try-activate response when the callback returns nil with an error
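
The third fix can be pictured as a small normalization step in front of the handler. The following is a sketch under assumptions: the callback signature, the two-field struct, and the names `normalizeProbe`, `brokenProbe`, and `healthyProbe` are hypothetical and simplified, not the code from 0a4d878.

```go
package main

import (
	"errors"
	"fmt"
)

// TryActivateResult is reduced to the two fields this sketch needs.
type TryActivateResult struct {
	Status string `json:"status"`
	Error  string `json:"error,omitempty"`
}

// tryActivateFunc is an assumed callback shape for the probe.
type tryActivateFunc func() (*TryActivateResult, error)

// normalizeProbe guarantees a structured build_failed result when the
// callback reports an error without producing a result of its own.
func normalizeProbe(fn tryActivateFunc) *TryActivateResult {
	res, err := fn()
	if res == nil && err != nil {
		return &TryActivateResult{Status: "build_failed", Error: err.Error()}
	}
	return res
}

// brokenProbe returns only an error, the case the review comment flagged.
func brokenProbe() (*TryActivateResult, error) {
	return nil, errors.New("config could not be parsed")
}

// healthyProbe returns a normal success result.
func healthyProbe() (*TryActivateResult, error) {
	return &TryActivateResult{Status: "build_ok"}, nil
}

func main() {
	fmt.Printf("%+v\n", *normalizeProbe(brokenProbe))
	fmt.Printf("%+v\n", *normalizeProbe(healthyProbe))
}
```

With this step in place, a client never has to handle an empty body on the failure path; what the handler does when both the result and the error are nil is left open in this sketch.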

Focused verification passed:

  • GOWORK=off go test ./module -run 'TestWorkflowUIHandler_TryActivate|TestAdminSchemasAppliedToOpenAPISpec' -count=1
  • GOWORK=off go test ./cmd/server -run 'TestReloadEngine_(BuildFailureKeepsPriorEngineActive|SuccessReplacesEngine|StartFailureRollsBackToPriorConfig|StartFailureReportsRollback)' -count=1

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

@intel352 intel352 merged commit 5d259b7 into main May 15, 2026
22 checks passed
@intel352 intel352 deleted the copilot/add-safe-plugin-hot-reload branch May 15, 2026 22:06

Development

Successfully merging this pull request may close these issues.

Add safe plugin/config hot-reload try-activate rollback

3 participants