feat: spec-coverage-analyzer spike (#277) #300
Draft
esraagamal6 wants to merge 2 commits into
Reads an OpenAPI spec and emits a per-endpoint test plan, tagging each plan item as either:
- computable -- derivable from the spec alone
- needs-abox:X -- requires domain knowledge; X names the missing fact

Snapshot against the OCA spec:
- 190 operations, 1817 plan items
- 1027 computable from spec (56.5%)
- 790 needs-abox / domain knowledge (43.5%)

The needs-ABox load is concentrated in a handful of facts (top 5):
- RBAC permissions per endpoint: 190 items
- spec-gap (which endpoints require auth): 189 items
- creation chain per identifier semantic: 120 items
- filter-field-semantics + sort-allowlist: 106 items
- duplicatePolicy per endpoint: 59 items

Also surfaces a real spec/reality drift: the OCA spec declares securitySchemes but only applies them on getAuthentication. The analyzer flags this as a spec-gap so 401 coverage stays visible in the plan.

Outputs (next to the script, committed for diffability):
- plan.csv: machine-readable, one row per (op, plan-item)
- plan.md: per-endpoint readable summary
- needs-abox.md: aggregated needs-ABox gaps grouped by fact

Independent of coverage-analysis/ (which runs in the opposite direction, analysing what the generator already emits).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
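To make the `plan.csv` shape concrete, here is a minimal sketch of one-row-per-(op, plan-item) records and the grouping that would sit behind a "top missing facts" list. The `PlanItem` fields, the column names, the sample operation IDs, and the `top_missing_facts` helper are all assumptions for illustration; they are not taken from the actual `build_plan.py`.

```python
import csv
import io
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PlanItem:
    operation_id: str  # hypothetical column names; the real plan.csv may differ
    item: str          # e.g. "happy-path", "409-conflict"
    tag: str           # "computable" or "needs-abox:<fact>"


def top_missing_facts(items, n=5):
    """Count needs-abox items per missing fact, most loaded first."""
    facts = Counter(
        i.tag.split(":", 1)[1] for i in items if i.tag.startswith("needs-abox:")
    )
    return facts.most_common(n)


def to_csv(items):
    """Serialise plan items as CSV text, one row per (operation, plan item)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["operation_id", "item", "tag"])
    writer.writeheader()
    writer.writerows(asdict(i) for i in items)
    return buf.getvalue()


# Illustrative rows only; not taken from the real snapshot.
SAMPLE = [
    PlanItem("createTenant", "happy-path", "computable"),
    PlanItem("createTenant", "403-forbidden", "needs-abox:RBAC"),
    PlanItem("createTenant", "409-conflict", "needs-abox:duplicatePolicy"),
    PlanItem("searchTenants", "403-forbidden", "needs-abox:RBAC"),
]

if __name__ == "__main__":
    print(to_csv(SAMPLE))
    print(top_missing_facts(SAMPLE))
```

Keeping the missing fact inside the tag (`needs-abox:<fact>`) means the aggregated view can be derived from the per-row file alone, which is one plausible reason to commit both for diffing.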
This was referenced May 19, 2026
…es + deps

Spells out the 5-step path forward after the spike is signed off: duplicatePolicy slice → analyzer/ABox wiring → 404 fake-ID emitter → Camunda Hub generalisation → coverage-analysis/ as verification check.

Also lists what's deferred (RBAC, filter-semantics, Hub generalisation) so reviewers know what we're explicitly NOT picking up first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Status
Draft / spike for #277. Not for merge as-is — the goal of this PR is to give @jwulf and the team something concrete to react to before deciding on the next step (ABox slice priorities, rule-table refinement, integration design).
Summary
Adds `spec-coverage-analyzer/`, a Python analyzer that reads an OpenAPI spec and emits a per-endpoint test plan, tagging each plan item as either computable from the spec alone or needing an ABox fact (with the missing fact named in the output).

Runs by default against `spec/camunda-oca/bundled/rest-api.bundle.json`; designed to be re-run on Camunda Hub when that spec lands.

How is this different from the generator?
Both tools are designed to read the spec and the ABox. The generator already integrates the ontology under `configs/<config>/ontology/` (that's how it knows, e.g., that process-instance needs a deployed process first). The analyzer spike as shipped doesn't read the ABox yet; that's deferred to next step #2 below.

The real difference is what each tool produces:
- Generator: `.spec.ts` test files (executable code)

Concrete example for `POST /tenants`

What the generator currently produces (real `.spec.ts` files):

- … create → present → update → delete → absent) from the EntityLifecycle template

What the analyzer says should exist for that endpoint:

- `needs-abox: spec-gap` (spec under-declares auth)
- `needs-abox: RBAC` (no slice yet)
- `computable` ✓ (no emitter for it yet either, but the spec gives us enough)
- `needs-abox: duplicatePolicy` (Josh's 8.8 design, not landed)
- search: `needs-abox: filter/sort semantics`

The ✅ items: generator already produces them. The ⚠️ items: generator produces zero of them.
Why this matters: the analyzer is the checklist the generator gets graded against. Without it, "is the generator producing enough tests?" gets hand-counted against another suite. With it, every missing test is named, scoped, and tagged with exactly what's blocking it — usually a specific ABox slice or a specific emitter plugin.
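The "checklist the generator gets graded against" idea amounts to a set difference between planned items and emitted tests. Below is a minimal sketch, assuming both sides can be reduced to `(operationId, plan-item)` pairs; the `grade` function, the operation ID, and the sample data are hypothetical and not part of this PR.

```python
def grade(plan, emitted):
    """Compare planned items against what a generator emitted.

    plan:    dict mapping (operation_id, item) -> tag
             ("computable" or "needs-abox:<fact>")
    emitted: set of (operation_id, item) pairs that have a generated test
    Returns (covered, missing), where missing maps each uncovered pair
    to the tag that explains what is blocking it.
    """
    covered = sorted(key for key in plan if key in emitted)
    missing = {key: tag for key, tag in plan.items() if key not in emitted}
    return covered, missing


# Hypothetical data for a single endpoint; not real analyzer output.
plan = {
    ("createTenant", "happy-path"): "computable",
    ("createTenant", "bad-request:missing-required"): "computable",
    ("createTenant", "409-conflict"): "needs-abox:duplicatePolicy",
}
emitted = {("createTenant", "happy-path")}

covered, missing = grade(plan, emitted)
for (op, item), tag in sorted(missing.items()):
    print(f"{op}: {item} missing ({tag})")
```

In this framing, a `computable` entry in `missing` points at a missing emitter, while a `needs-abox:<fact>` entry points at a missing ABox slice.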
Snapshot against the OCA spec
Top missing ABox facts (by plan-item load):
- `duplicatePolicy` per endpoint (idempotent / conflict / replace)
- … `*Before`/`*After`)

Spec-gap finding
The OCA spec declares `securitySchemes` (BearerAuth, basicAuth) but only applies them on `getAuthentication`. The analyzer flags 189 ops with a `401-unauthorized` needs-ABox item (spec-gap: which endpoints actually require auth is encoded only in deployment, not the spec). That's a real spec/reality drift worth surfacing; it relates to camunda/camunda#52511.

Rule table
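As one concrete instance of such a rule, the `401-unauthorized` item and its spec-gap variant could be sketched as below. This is an illustration under assumptions, not the analyzer's actual code: the function names, the `/authentication/me` path, and the `createTenant` operation ID are hypothetical, and the sketch relies only on standard OpenAPI 3 structure (`components.securitySchemes`, a top-level `security` default, and per-operation `security` overrides).

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def effective_security(spec, operation):
    """Operation-level `security` overrides the top-level default (OpenAPI 3)."""
    if "security" in operation:
        return operation["security"]
    return spec.get("security", [])


def unauthorized_items(spec):
    """Yield (operationId, tag) for the 401-unauthorized plan item of each op."""
    schemes = spec.get("components", {}).get("securitySchemes", {})
    for path, path_item in spec.get("paths", {}).items():
        for method, op in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as `parameters`
            op_id = op.get("operationId", f"{method.upper()} {path}")
            if effective_security(spec, op):
                yield op_id, "computable"
            elif schemes:
                # Schemes are declared but not applied here: flag as spec-gap.
                yield op_id, "needs-abox:spec-gap"


# Tiny hypothetical spec mirroring the drift described in this PR.
spec = {
    "components": {
        "securitySchemes": {"BearerAuth": {"type": "http", "scheme": "bearer"}}
    },
    "paths": {
        "/authentication/me": {  # path is an assumption for illustration
            "get": {"operationId": "getAuthentication", "security": [{"BearerAuth": []}]}
        },
        "/tenants": {"post": {"operationId": "createTenant"}},
    },
}

print(dict(unauthorized_items(spec)))
```

The sketch treats any non-empty `security` list as "auth required"; a real rule would likely also need to handle the optional-auth form (`security: [{}]`).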
See `spec-coverage-analyzer/README.md` for the full rule table. Brief summary:

- Computable from the spec: `happy-path`, `bad-request:{missing-required, type-mismatch, format-invalid, enum-violation, range-violation, additional-property, oneof-violation}`, `404-not-found` (per path param), `401-unauthorized` (when security is declared on the op), `pagination-sort:request-shape`, `filter:request-shape`, `documented-XXX` (per documented non-2xx response).
- Needs ABox: `401-unauthorized:spec-gap`, `403-forbidden`, `409-conflict`, `business-entity-lifecycle`, `prerequisite-resource`, `eventual-consistency`, `scale-large-n`, `cross-field-range`, `pagination-sort/filter:behaviour-assertion`.

Outputs
- `plan.csv`: machine-readable, one row per `(operationId, plan-item)` tuple.
- `plan.md`: per-endpoint readable summary, formatted like upstream's `coverage_breakdown.md`.
- `needs-abox.md`: aggregated needs-ABox gaps grouped by missing fact.

Test plan
- Run `python3 spec-coverage-analyzer/build_plan.py` and confirm it emits the 3 artifacts.
- `plan.md`: do the computable / needs-ABox tags match what you'd expect?
- `needs-abox.md`: are the 9 ABox-fact buckets the right axes? Anything missing?

Open questions for review
- `duplicatePolicy` (smallest, already designed in 8.8) is the cheapest unlock: it would move 59 plan items from "needs ABox" to "computable" in a single shot.
- … `path-analyser/` stack (consistent with the rest of the generator codebase)?

Next steps (once this spike is signed off)
In priority order, with dependencies:
- `duplicatePolicy` as the first ABox slice. Cheapest unblock: Josh has already designed it (8.8); it needs to be expressed as a new file, `configs/camunda-oca/ontology/duplicatePolicy.json`, with `{ operationId → policy }` entries for the ~59 create-style endpoints flagged. Independent of this PR's review outcome, so it could start in parallel.
- … `build_plan.py` to consume `duplicatePolicy` and reclassify those 59 plan items from `needs-abox` to `computable`. Validates the analyzer ↔ ABox contract on the smallest slice before scaling to the bigger ones (RBAC, filter-semantics).
- … `ontology/semantics.json` already encodes the path-param identifier types. ~1 day's work, closes ~127 upstream-equivalent tests.
- … `coverage-analysis/` (PR #278, "chore: add coverage analysis for generated tests (#275)") as a verification check. Currently it analyses what the generator emits. Once the analyzer exists, the two can be diffed: "does the generator emit what the analyzer says it should?". Becomes a CI check rather than a static snapshot.

Out of scope for follow-up (defer)
- … `duplicatePolicy` validates the analyzer↔ABox pattern.

🤖 Generated with Claude Code
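The first two next steps (a `duplicatePolicy` slice, then reclassifying plan items against it) could look roughly like the sketch below. The `{ operationId → policy }` shape comes from the description above and the policy values follow its "idempotent / conflict / replace" wording; the function name, the tuple layout, the tag strings, and the sample operation IDs are assumptions, not the eventual implementation.

```python
import json

ALLOWED_POLICIES = {"idempotent", "conflict", "replace"}


def reclassify(plan_rows, duplicate_policy):
    """Return new rows with duplicatePolicy-blocked items resolved where possible.

    plan_rows:        list of (operation_id, item, tag) tuples
    duplicate_policy: dict operation_id -> policy, as loaded from the ABox slice
    """
    unknown = set(duplicate_policy.values()) - ALLOWED_POLICIES
    if unknown:
        raise ValueError(f"unknown duplicatePolicy values: {sorted(unknown)}")

    result = []
    for op_id, item, tag in plan_rows:
        if tag == "needs-abox:duplicatePolicy" and op_id in duplicate_policy:
            tag = "computable"  # the missing fact is now supplied by the ABox
        result.append((op_id, item, tag))
    return result


# Hypothetical slice content; the real file would list the flagged endpoints.
slice_json = '{"createTenant": "conflict"}'
rows = [
    ("createTenant", "409-conflict", "needs-abox:duplicatePolicy"),
    ("createRole", "409-conflict", "needs-abox:duplicatePolicy"),
    ("createTenant", "403-forbidden", "needs-abox:RBAC"),
]

for row in reclassify(rows, json.loads(slice_json)):
    print(row)
```

Only rows whose operation appears in the slice are reclassified, so a partially filled slice moves items over incrementally, and items blocked on other facts (RBAC here) are left untouched.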