fix: add failureReason to conformance evidence and strip +dirty from wfctlVersion#649

Merged
intel352 merged 3 commits into main from copilot/update-wfctl-plugin-conformance
May 13, 2026

Conversation

Contributor

Copilot AI commented May 13, 2026

Conformance evidence for failed typed-IaC checks omitted the actual error message, forcing maintainers to reproduce failures locally. Additionally, wfctlVersion in evidence could be stamped with +dirty pseudo-versions that are not valid Go module pseudo-versions, causing CanonicalEvidenceEngineVersion to silently fall back to v0.0.0.

Changes

  • PluginCompatibilityEvidence (plugin_compat_model.go): Add failureReason string (json:"failureReason,omitempty"). Included in evidenceDigest computation when non-empty; absent from passing evidence.

  • runPluginConformanceCheck (plugin_conformance.go): Set ev.FailureReason = err.Error() before normalizing failed evidence. The error message (e.g. "typed IaC service not registered", "legacy InvokeService dispatch removed") is now captured in the evidence JSON alongside stderrTail.

  • buildVersion (main.go): Strip +dirty suffix from build info versions. The marker reflects build-environment VCS state, not a meaningful version difference. Strings like v0.22.8-0.20260510180701-a851625d3bf0+dirty are not valid Go pseudo-versions and would silently downgrade wfctlVersion in evidence to v0.0.0; stripping the suffix lets the real commit-bound pseudo-version propagate.
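The first bullet says failureReason feeds the evidenceDigest only when it is non-empty. The actual digest code is not shown in this thread, so the following is only a minimal sketch of that idea: `digestFields`, its parameters, and the `key=len:value` encoding are all hypothetical and not taken from plugin_compat_model.go.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestFields is a hypothetical stand-in for the evidenceDigest inputs.
// The field set and encoding are assumptions made for illustration.
func digestFields(status, wfctlVersion, failureReason string) string {
	h := sha256.New()
	// Length-prefix each value so adjacent fields cannot run together.
	write := func(key, val string) {
		fmt.Fprintf(h, "%s=%d:%s\n", key, len(val), val)
	}
	write("status", status)
	write("wfctlVersion", wfctlVersion)
	// Only failed evidence carries a reason, so passing evidence
	// hashes exactly as it would without this field.
	if failureReason != "" {
		write("failureReason", failureReason)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	v := "v0.22.8-0.20260510180701-a851625d3bf0"
	fmt.Println("pass:", digestFields("pass", v, ""))
	fmt.Println("fail:", digestFields("fail", v, "typed IaC service not registered"))
}
```

Skipping the field when it is empty means a digest scheme of this shape would leave passing-evidence digests unchanged by the new field.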

Failed evidence now looks like:

{
  "status": "fail",
  "failureReason": "plugin uses legacy InvokeService dispatch removed in workflow v1.0.0",
  "stderrTail": "...",
  "wfctlVersion": "v0.22.8-0.20260510180701-a851625d3bf0",
  ...
}
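As a rough illustration of how evidence of this shape could be produced, here is a minimal sketch. The JSON keys come from the example above. The `evidence` type, the `fromCheck` and `mustJSON` helpers, the "pass" status string, and the `omitempty` on stderrTail are assumptions, not the real PluginCompatibilityEvidence definition.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// evidence is a hypothetical, trimmed-down stand-in for
// PluginCompatibilityEvidence; only failureReason's tag is from the PR.
type evidence struct {
	Status        string `json:"status"`
	FailureReason string `json:"failureReason,omitempty"`
	StderrTail    string `json:"stderrTail,omitempty"`
	WfctlVersion  string `json:"wfctlVersion"`
}

// fromCheck builds evidence from a check result; a nil error means pass.
func fromCheck(err error, stderrTail, wfctlVersion string) evidence {
	ev := evidence{Status: "pass", WfctlVersion: wfctlVersion}
	if err != nil {
		ev.Status = "fail"
		ev.FailureReason = err.Error() // keep the message next to stderrTail
		ev.StderrTail = stderrTail
	}
	return ev
}

// mustJSON marshals evidence and panics on the (unexpected) error.
func mustJSON(ev evidence) string {
	b, err := json.Marshal(ev)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	v := "v0.22.8-0.20260510180701-a851625d3bf0"
	fmt.Println(mustJSON(fromCheck(errors.New("typed IaC service not registered"), "...", v)))
	fmt.Println(mustJSON(fromCheck(nil, "", v)))
}
```

Because of `omitempty`, the second line printed has no failureReason key at all, which matches the "absent from passing evidence" behavior described above.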

Copilot AI requested review from Copilot and removed request for Copilot May 13, 2026 05:58
@github-actions

github-actions Bot commented May 13, 2026

⏱ Benchmark Results

No significant performance regressions detected.

benchstat comparison (baseline → PR)
## benchstat: baseline → PR
baseline-bench.txt:264: parsing iteration count: invalid syntax
baseline-bench.txt:291409: parsing iteration count: invalid syntax
baseline-bench.txt:558851: parsing iteration count: invalid syntax
baseline-bench.txt:1096044: parsing iteration count: invalid syntax
baseline-bench.txt:1386271: parsing iteration count: invalid syntax
baseline-bench.txt:1660442: parsing iteration count: invalid syntax
benchmark-results.txt:264: parsing iteration count: invalid syntax
benchmark-results.txt:340045: parsing iteration count: invalid syntax
benchmark-results.txt:628598: parsing iteration count: invalid syntax
benchmark-results.txt:969510: parsing iteration count: invalid syntax
benchmark-results.txt:1290256: parsing iteration count: invalid syntax
benchmark-results.txt:1627400: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/GoCodeAlone/workflow/dynamic
cpu: AMD EPYC 7763 64-Core Processor                
                            │ baseline-bench.txt │
                            │       sec/op       │
InterpreterCreation-4               7.657m ± 59%
ComponentLoad-4                     3.575m ±  0%
ComponentExecute-4                  1.924µ ±  1%
PoolContention/workers-1-4          1.089µ ±  1%
PoolContention/workers-2-4          1.074µ ±  4%
PoolContention/workers-4-4          1.081µ ±  1%
PoolContention/workers-8-4          1.083µ ±  2%
PoolContention/workers-16-4         1.092µ ±  1%
ComponentLifecycle-4                3.644m ±  1%
SourceValidation-4                  2.317µ ±  2%
RegistryConcurrent-4                807.9n ±  3%
LoaderLoadFromString-4              3.659m ±  1%
geomean                             18.81µ

                            │ baseline-bench.txt │
                            │        B/op        │
InterpreterCreation-4               2.027Mi ± 0%
ComponentLoad-4                     2.180Mi ± 0%
ComponentExecute-4                  1.203Ki ± 0%
PoolContention/workers-1-4          1.203Ki ± 0%
PoolContention/workers-2-4          1.203Ki ± 0%
PoolContention/workers-4-4          1.203Ki ± 0%
PoolContention/workers-8-4          1.203Ki ± 0%
PoolContention/workers-16-4         1.203Ki ± 0%
ComponentLifecycle-4                2.183Mi ± 0%
SourceValidation-4                  1.984Ki ± 0%
RegistryConcurrent-4                1.133Ki ± 0%
LoaderLoadFromString-4              2.182Mi ± 0%
geomean                             15.25Ki

                            │ baseline-bench.txt │
                            │     allocs/op      │
InterpreterCreation-4                15.68k ± 0%
ComponentLoad-4                      18.02k ± 0%
ComponentExecute-4                    25.00 ± 0%
PoolContention/workers-1-4            25.00 ± 0%
PoolContention/workers-2-4            25.00 ± 0%
PoolContention/workers-4-4            25.00 ± 0%
PoolContention/workers-8-4            25.00 ± 0%
PoolContention/workers-16-4           25.00 ± 0%
ComponentLifecycle-4                 18.07k ± 0%
SourceValidation-4                    32.00 ± 0%
RegistryConcurrent-4                  2.000 ± 0%
LoaderLoadFromString-4               18.06k ± 0%
geomean                               183.3

cpu: AMD EPYC 9V74 80-Core Processor                
                            │ benchmark-results.txt │
                            │        sec/op         │
InterpreterCreation-4                  6.399m ± 57%
ComponentLoad-4                        3.475m ±  1%
ComponentExecute-4                     1.824µ ±  1%
PoolContention/workers-1-4             1.006µ ±  1%
PoolContention/workers-2-4             1.013µ ±  1%
PoolContention/workers-4-4             1.012µ ±  2%
PoolContention/workers-8-4             1.013µ ±  1%
PoolContention/workers-16-4            1.010µ ±  1%
ComponentLifecycle-4                   3.490m ±  0%
SourceValidation-4                     2.076µ ±  1%
RegistryConcurrent-4                   778.5n ±  5%
LoaderLoadFromString-4                 3.538m ±  0%
geomean                                17.55µ

                            │ benchmark-results.txt │
                            │         B/op          │
InterpreterCreation-4                  2.027Mi ± 0%
ComponentLoad-4                        2.180Mi ± 0%
ComponentExecute-4                     1.203Ki ± 0%
PoolContention/workers-1-4             1.203Ki ± 0%
PoolContention/workers-2-4             1.203Ki ± 0%
PoolContention/workers-4-4             1.203Ki ± 0%
PoolContention/workers-8-4             1.203Ki ± 0%
PoolContention/workers-16-4            1.203Ki ± 0%
ComponentLifecycle-4                   2.183Mi ± 0%
SourceValidation-4                     1.984Ki ± 0%
RegistryConcurrent-4                   1.133Ki ± 0%
LoaderLoadFromString-4                 2.182Mi ± 0%
geomean                                15.25Ki

                            │ benchmark-results.txt │
                            │       allocs/op       │
InterpreterCreation-4                   15.68k ± 0%
ComponentLoad-4                         18.02k ± 0%
ComponentExecute-4                       25.00 ± 0%
PoolContention/workers-1-4               25.00 ± 0%
PoolContention/workers-2-4               25.00 ± 0%
PoolContention/workers-4-4               25.00 ± 0%
PoolContention/workers-8-4               25.00 ± 0%
PoolContention/workers-16-4              25.00 ± 0%
ComponentLifecycle-4                    18.07k ± 0%
SourceValidation-4                       32.00 ± 0%
RegistryConcurrent-4                     2.000 ± 0%
LoaderLoadFromString-4                  18.06k ± 0%
geomean                                  183.3

pkg: github.com/GoCodeAlone/workflow/middleware
cpu: AMD EPYC 7763 64-Core Processor                
                                  │ baseline-bench.txt │
                                  │       sec/op       │
CircuitBreakerDetection-4                  286.7n ± 5%
CircuitBreakerExecution_Success-4          21.38n ± 1%
CircuitBreakerExecution_Failure-4          66.81n ± 0%
geomean                                    74.26n

                                  │ baseline-bench.txt │
                                  │        B/op        │
CircuitBreakerDetection-4                 144.0 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

                                  │ baseline-bench.txt │
                                  │     allocs/op      │
CircuitBreakerDetection-4                 1.000 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                  │ benchmark-results.txt │
                                  │        sec/op         │
CircuitBreakerDetection-4                    302.7n ± 12%
CircuitBreakerExecution_Success-4            22.68n ±  0%
CircuitBreakerExecution_Failure-4            70.94n ±  0%
geomean                                      78.67n

                                  │ benchmark-results.txt │
                                  │         B/op          │
CircuitBreakerDetection-4                    144.0 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

                                  │ benchmark-results.txt │
                                  │       allocs/op       │
CircuitBreakerDetection-4                    1.000 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/module
cpu: AMD EPYC 7763 64-Core Processor                
                                 │ baseline-bench.txt │
                                 │       sec/op       │
JQTransform_Simple-4                     869.8n ± 27%
JQTransform_ObjectConstruction-4         1.434µ ±  2%
JQTransform_ArraySelect-4                3.325µ ±  2%
JQTransform_Complex-4                    38.52µ ±  1%
JQTransform_Throughput-4                 1.760µ ±  1%
SSEPublishDelivery-4                     62.74n ±  1%
geomean                                  1.613µ

                                 │ baseline-bench.txt │
                                 │        B/op        │
JQTransform_Simple-4                   1.273Ki ± 0%
JQTransform_ObjectConstruction-4       1.773Ki ± 0%
JQTransform_ArraySelect-4              2.625Ki ± 0%
JQTransform_Complex-4                  16.22Ki ± 0%
JQTransform_Throughput-4               1.984Ki ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

                                 │ baseline-bench.txt │
                                 │     allocs/op      │
JQTransform_Simple-4                     10.00 ± 0%
JQTransform_ObjectConstruction-4         15.00 ± 0%
JQTransform_ArraySelect-4                30.00 ± 0%
JQTransform_Complex-4                    324.0 ± 0%
JQTransform_Throughput-4                 17.00 ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                 │ benchmark-results.txt │
                                 │        sec/op         │
JQTransform_Simple-4                        819.1n ± 22%
JQTransform_ObjectConstruction-4            1.383µ ±  1%
JQTransform_ArraySelect-4                   3.343µ ±  1%
JQTransform_Complex-4                       40.80µ ±  1%
JQTransform_Throughput-4                    1.717µ ±  1%
SSEPublishDelivery-4                        63.25n ±  1%
geomean                                     1.600µ

                                 │ benchmark-results.txt │
                                 │         B/op          │
JQTransform_Simple-4                      1.273Ki ± 0%
JQTransform_ObjectConstruction-4          1.773Ki ± 0%
JQTransform_ArraySelect-4                 2.625Ki ± 0%
JQTransform_Complex-4                     16.22Ki ± 0%
JQTransform_Throughput-4                  1.984Ki ± 0%
SSEPublishDelivery-4                        0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                 │ benchmark-results.txt │
                                 │       allocs/op       │
JQTransform_Simple-4                        10.00 ± 0%
JQTransform_ObjectConstruction-4            15.00 ± 0%
JQTransform_ArraySelect-4                   30.00 ± 0%
JQTransform_Complex-4                       324.0 ± 0%
JQTransform_Throughput-4                    17.00 ± 0%
SSEPublishDelivery-4                        0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/schema
cpu: AMD EPYC 7763 64-Core Processor                
                                    │ baseline-bench.txt │
                                    │       sec/op       │
SchemaValidation_Simple-4                   1.105µ ±  2%
SchemaValidation_AllFields-4                1.666µ ± 25%
SchemaValidation_FormatValidation-4         1.587µ ±  1%
SchemaValidation_ManySchemas-4              1.836µ ±  1%
geomean                                     1.521µ

                                    │ baseline-bench.txt │
                                    │        B/op        │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                    │ baseline-bench.txt │
                                    │     allocs/op      │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                    │ benchmark-results.txt │
                                    │        sec/op         │
SchemaValidation_Simple-4                      1.081µ ± 15%
SchemaValidation_AllFields-4                   1.664µ ±  3%
SchemaValidation_FormatValidation-4            1.564µ ±  1%
SchemaValidation_ManySchemas-4                 1.580µ ±  3%
geomean                                        1.452µ

                                    │ benchmark-results.txt │
                                    │         B/op          │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

                                    │ benchmark-results.txt │
                                    │       allocs/op       │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/store
cpu: AMD EPYC 7763 64-Core Processor                
                                   │ baseline-bench.txt │
                                   │       sec/op       │
EventStoreAppend_InMemory-4                1.218µ ± 14%
EventStoreAppend_SQLite-4                  1.259m ±  5%
GetTimeline_InMemory/events-10-4           13.48µ ±  3%
GetTimeline_InMemory/events-50-4           76.33µ ±  1%
GetTimeline_InMemory/events-100-4          121.9µ ± 27%
GetTimeline_InMemory/events-500-4          624.7µ ±  0%
GetTimeline_InMemory/events-1000-4         1.271m ±  1%
GetTimeline_SQLite/events-10-4             106.9µ ±  1%
GetTimeline_SQLite/events-50-4             247.6µ ±  1%
GetTimeline_SQLite/events-100-4            416.4µ ±  1%
GetTimeline_SQLite/events-500-4            1.769m ±  1%
GetTimeline_SQLite/events-1000-4           3.460m ±  1%
geomean                                    216.0µ

                                   │ baseline-bench.txt │
                                   │        B/op        │
EventStoreAppend_InMemory-4                  812.0 ± 5%
EventStoreAppend_SQLite-4                  1.985Ki ± 2%
GetTimeline_InMemory/events-10-4           7.953Ki ± 0%
GetTimeline_InMemory/events-50-4           46.62Ki ± 0%
GetTimeline_InMemory/events-100-4          94.48Ki ± 0%
GetTimeline_InMemory/events-500-4          472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4         944.3Ki ± 0%
GetTimeline_SQLite/events-10-4             16.74Ki ± 0%
GetTimeline_SQLite/events-50-4             87.14Ki ± 0%
GetTimeline_SQLite/events-100-4            175.4Ki ± 0%
GetTimeline_SQLite/events-500-4            846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4           1.639Mi ± 0%
geomean                                    67.50Ki

                                   │ baseline-bench.txt │
                                   │     allocs/op      │
EventStoreAppend_InMemory-4                  7.000 ± 0%
EventStoreAppend_SQLite-4                    53.00 ± 0%
GetTimeline_InMemory/events-10-4             125.0 ± 0%
GetTimeline_InMemory/events-50-4             653.0 ± 0%
GetTimeline_InMemory/events-100-4           1.306k ± 0%
GetTimeline_InMemory/events-500-4           6.514k ± 0%
GetTimeline_InMemory/events-1000-4          13.02k ± 0%
GetTimeline_SQLite/events-10-4               382.0 ± 0%
GetTimeline_SQLite/events-50-4              1.852k ± 0%
GetTimeline_SQLite/events-100-4             3.681k ± 0%
GetTimeline_SQLite/events-500-4             18.54k ± 0%
GetTimeline_SQLite/events-1000-4            37.29k ± 0%
geomean                                     1.162k

cpu: AMD EPYC 9V74 80-Core Processor                
                                   │ benchmark-results.txt │
                                   │        sec/op         │
EventStoreAppend_InMemory-4                   1.110µ ± 15%
EventStoreAppend_SQLite-4                     999.4µ ±  1%
GetTimeline_InMemory/events-10-4              12.66µ ±  1%
GetTimeline_InMemory/events-50-4              70.83µ ± 14%
GetTimeline_InMemory/events-100-4             108.8µ ±  1%
GetTimeline_InMemory/events-500-4             555.1µ ±  0%
GetTimeline_InMemory/events-1000-4            1.133m ±  0%
GetTimeline_SQLite/events-10-4                83.22µ ±  2%
GetTimeline_SQLite/events-50-4                218.7µ ±  1%
GetTimeline_SQLite/events-100-4               379.5µ ±  1%
GetTimeline_SQLite/events-500-4               1.648m ±  2%
GetTimeline_SQLite/events-1000-4              3.203m ±  0%
geomean                                       191.8µ

                                   │ benchmark-results.txt │
                                   │         B/op          │
EventStoreAppend_InMemory-4                     779.0 ± 6%
EventStoreAppend_SQLite-4                     1.983Ki ± 2%
GetTimeline_InMemory/events-10-4              7.953Ki ± 0%
GetTimeline_InMemory/events-50-4              46.62Ki ± 0%
GetTimeline_InMemory/events-100-4             94.48Ki ± 0%
GetTimeline_InMemory/events-500-4             472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4            944.3Ki ± 0%
GetTimeline_SQLite/events-10-4                16.74Ki ± 0%
GetTimeline_SQLite/events-50-4                87.14Ki ± 0%
GetTimeline_SQLite/events-100-4               175.4Ki ± 0%
GetTimeline_SQLite/events-500-4               846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4              1.639Mi ± 0%
geomean                                       67.26Ki

                                   │ benchmark-results.txt │
                                   │       allocs/op       │
EventStoreAppend_InMemory-4                     7.000 ± 0%
EventStoreAppend_SQLite-4                       53.00 ± 0%
GetTimeline_InMemory/events-10-4                125.0 ± 0%
GetTimeline_InMemory/events-50-4                653.0 ± 0%
GetTimeline_InMemory/events-100-4              1.306k ± 0%
GetTimeline_InMemory/events-500-4              6.514k ± 0%
GetTimeline_InMemory/events-1000-4             13.02k ± 0%
GetTimeline_SQLite/events-10-4                  382.0 ± 0%
GetTimeline_SQLite/events-50-4                 1.852k ± 0%
GetTimeline_SQLite/events-100-4                3.681k ± 0%
GetTimeline_SQLite/events-500-4                18.54k ± 0%
GetTimeline_SQLite/events-1000-4               37.29k ± 0%
geomean                                        1.162k

Benchmarks run with go test -bench=. -benchmem -count=6.
Regressions ≥ 20% are flagged. Results compared via benchstat.
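
For readers who want to sanity-check the 20% threshold by hand, a minimal sketch of the arithmetic follows. How the CI job actually flags regressions is not shown here, so `pctChange` and `isRegression` are hypothetical helpers. The sample values are sec/op figures copied from the tables above and converted to nanoseconds. Note that the baseline and PR runs report different CPUs, so small deltas are likely noise.

```go
package main

import "fmt"

// regressionThresholdPct mirrors the "≥ 20%" rule stated in the report.
const regressionThresholdPct = 20.0

// pctChange returns the relative change from base to pr in percent.
// Positive values mean the PR is slower than the baseline.
func pctChange(base, pr float64) float64 {
	return (pr - base) * 100 / base
}

// isRegression reports whether the slowdown meets the threshold.
func isRegression(base, pr float64) bool {
	return pctChange(base, pr) >= regressionThresholdPct
}

func main() {
	rows := []struct {
		name     string
		base, pr float64 // sec/op in nanoseconds
	}{
		{"JQTransform_Complex-4", 38520, 40800},
		{"CircuitBreakerDetection-4", 286.7, 302.7},
		{"EventStoreAppend_SQLite-4", 1259000, 999400},
	}
	for _, r := range rows {
		fmt.Printf("%-28s %+6.1f%% flagged=%t\n",
			r.name, pctChange(r.base, r.pr), isRegression(r.base, r.pr))
	}
}
```

This compares point estimates only; benchstat additionally reports the spread (the ± column), which a real check would want to take into account.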

@codecov

codecov Bot commented May 13, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
cmd/wfctl/main.go 66.66% 1 Missing ⚠️

Copilot AI requested review from Copilot and removed request for Copilot May 13, 2026 06:31
Copilot AI changed the title [WIP] Update wfctl plugin for conformance and failure diagnostics fix: add failureReason to conformance evidence and strip +dirty from wfctlVersion May 13, 2026
Copilot AI requested a review from intel352 May 13, 2026 06:34
@intel352 intel352 marked this pull request as ready for review May 13, 2026 22:14
Copilot AI review requested due to automatic review settings May 13, 2026 22:14
Contributor

Copilot AI left a comment

Pull request overview

This PR improves wfctl plugin conformance evidence by making failed typed-IaC checks easier to diagnose and ensuring dirty build metadata does not invalidate evidence version normalization.

Changes:

  • Adds failureReason to compatibility evidence and populates it on conformance failures.
  • Ensures passing conformance evidence omits failureReason.
  • Strips +dirty from wfctl build versions before evidence stamping.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

File                                    Description
cmd/wfctl/plugin_conformance.go         Captures the conformance error message in failed evidence.
cmd/wfctl/plugin_conformance_test.go    Adds assertions for failure reasons and dirty-version suffix absence.
cmd/wfctl/plugin_compat_model.go        Extends compatibility evidence JSON with optional failureReason.
cmd/wfctl/main.go                       Normalizes build versions by removing a trailing +dirty marker.
cmd/wfctl/main_test.go                  Adds coverage for dirty-marker stripping behavior.

@intel352
Contributor

@copilot need improved coverage: #649 (comment)

Contributor Author

Copilot AI commented May 13, 2026

@copilot need improved coverage: #649 (comment)

Addressed in 5ea6ce8. Extracted a cleanBuildVersion(raw string) string helper so the dirty-stripping logic is directly testable without depending on debug.ReadBuildInfo(). The test now exercises four cases: pseudo-version with +dirty, release tag with +dirty, clean pseudo-version, and clean release tag — covering all 3 previously-missing lines in main.go.
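
The helper and its four cases might look roughly like the sketch below. Only the name and signature `cleanBuildVersion(raw string) string` come from the comment above. The body (a plain suffix trim) and the release-tag sample `v0.22.8` are assumptions, and the real test in main_test.go may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanBuildVersion strips a trailing "+dirty" marker.
// The body is an assumed implementation, not the merged code.
func cleanBuildVersion(raw string) string {
	return strings.TrimSuffix(raw, "+dirty")
}

func main() {
	cases := []struct{ name, in, want string }{
		{"pseudo-version with +dirty",
			"v0.22.8-0.20260510180701-a851625d3bf0+dirty",
			"v0.22.8-0.20260510180701-a851625d3bf0"},
		{"release tag with +dirty", "v0.22.8+dirty", "v0.22.8"},
		{"clean pseudo-version",
			"v0.22.8-0.20260510180701-a851625d3bf0",
			"v0.22.8-0.20260510180701-a851625d3bf0"},
		{"clean release tag", "v0.22.8", "v0.22.8"},
	}
	for _, c := range cases {
		got := cleanBuildVersion(c.in)
		fmt.Printf("%-28s ok=%t\n", c.name, got == c.want)
	}
}
```

Taking the raw string as a parameter is what makes the logic testable: the cases above need no build metadata from debug.ReadBuildInfo().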

@intel352 intel352 merged commit f6d4bb1 into main May 13, 2026
21 checks passed
@intel352 intel352 deleted the copilot/update-wfctl-plugin-conformance branch May 13, 2026 23:27

Development

Successfully merging this pull request may close these issues.

wfctl plugin conformance should preserve failure diagnostics and ship in release

3 participants