Draft
Commits
15 commits
94b4fba
chore: add coverage analysis for generated tests (#275)
esraagamal6 May 18, 2026
5277d5c
docs: mark coverage-analysis as implementation-phase scaffolding
esraagamal6 May 18, 2026
c49f125
docs(coverage-analysis): refresh against upstream camunda/camunda#53387
esraagamal6 May 18, 2026
287f85b
docs(coverage-analysis): drop historical phrasing from upstream snaps…
esraagamal6 May 18, 2026
d943236
fix(coverage-analysis): scan request-validation + edge lifecycle sources
esraagamal6 May 19, 2026
381aca8
chore(coverage-analysis): regenerate against current main + map /forms
esraagamal6 May 19, 2026
9d4e305
fix(coverage-analysis): address review comments from PR #278
esraagamal6 May 19, 2026
04f2140
fix(coverage-analysis): address 2nd round of review comments (PR #278)
esraagamal6 May 19, 2026
88f25d2
feat(coverage-analysis): body-detect pagination/filter request shapes
esraagamal6 May 19, 2026
fdeae43
fix(coverage-analysis): address 3rd round of review comments (PR #278)
esraagamal6 May 19, 2026
48d9d4d
docs(coverage-analysis): list lifecycle_disjoint.md in README Files t…
esraagamal6 May 19, 2026
e9e1243
docs(coverage-analysis): drop redundant generate:request-validation step
esraagamal6 May 19, 2026
baa7c21
fix(coverage-analysis): variant blurb + (none) fallback in gaps.md
esraagamal6 May 19, 2026
c8078d7
fix(coverage-analysis): preserve 'unlabeled' alongside body-shape extras
esraagamal6 May 19, 2026
b9e973d
docs(coverage-analysis): clarify the not-found vs observe-absence sem…
esraagamal6 May 19, 2026
149 changes: 149 additions & 0 deletions coverage-analysis/README.md
@@ -0,0 +1,149 @@
# Generator coverage analysis

> **Status:** implementation-phase scaffolding. This directory exists to help
> assess what the generator currently produces while it's being built. Once the
> generator is delivered it can be deleted — the artifacts here are snapshots
> and are not maintained as part of the product.

Categorises the test files emitted under `generated/camunda-oca/playwright/` and
produces a coverage matrix in the same shape as
[`camunda/camunda/qa/c8-orchestration-cluster-e2e-test-suite/coverage-analysis/`](https://github.com/camunda/camunda/tree/main/qa/c8-orchestration-cluster-e2e-test-suite/coverage-analysis),
so the two suites can be diffed directly. Answers the questions in
[issue #275](https://github.com/camunda/api-test-generator/issues/275).

## Files

| file | what it is |
|---|---|
| `build_coverage.py` | Walks every generator test source, resolves each `operationId` against `spec/camunda-oca/bundled/rest-api.bundle.json`, classifies entity/operation/variant/category/form-step/prerequisite, and writes the artifacts below. |
| `tests.csv` | One row per `test()` declaration. Columns: `file, line, source, entity, category, operation, form_step, prerequisite, method, path, operationId, variants, test_name`. |
| `coverage_matrix.csv` | `entity × operation` grid with variant counts. Same columns as the upstream matrix. |
| `coverage_matrix.md` | Markdown view of the matrix: at-a-glance ✓ table + counts-per-cell. |
| `gaps.md` | Heuristic gap report: entities missing 401/403/400/404/409 coverage, missing observe-after-delete, search ops with no pagination/sort/filter. |
| `category_breakdown.md` | Per-category breakdown (Form, prerequisite, observation channel split, form-step counts, per-test table with `file:line`). Mirrors upstream `category_breakdown.md`. |
| `lifecycle_disjoint.md` | **Manually maintained** — a focused write-up answering #279's request for the disjoint between the 10 EntityLifecycle tests and the matching upstream tests. Not regenerated by `build_coverage.py`; refresh by hand when the lifecycle template or upstream coverage shifts materially. |
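
As a rough illustration of how `tests.csv` can be consumed, the sketch below tallies rows per `source` tag using only the stdlib. The column names come from the table above; the helper name `count_by_source`, the category value `A`, and the inline sample rows are made up for illustration and are not taken from the real artifact.

```python
import csv
import io
from collections import Counter

# Column order as documented for tests.csv in the Files table.
COLUMNS = [
    "file", "line", "source", "entity", "category", "operation", "form_step",
    "prerequisite", "method", "path", "operationId", "variants", "test_name",
]


def count_by_source(csv_text: str) -> Counter:
    """Count test rows per `source` tag in tests.csv-shaped text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["source"] for row in reader)


# Hypothetical sample rows; real data lives in coverage-analysis/tests.csv.
sample = ",".join(COLUMNS) + "\n" + "\n".join([
    "createTenant.feature.spec.ts,10,feature,tenant,A,create,create,none,"
    "POST,/tenants,createTenant,happy-path,createTenant - base",
    "Tenant.lifecycle.spec.ts,5,lifecycle,tenant,A,get,observe-absence,none,"
    "GET,/tenants/{tenantId},getTenant,observe-absence,Tenant lifecycle",
])

counts = count_by_source(sample)
print(dict(counts))
```

To run it against the real file, pass the contents of `coverage-analysis/tests.csv` to the same function.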

## Test sources scanned

The generator emits tests into five locations; `build_coverage.py` scans all of them:

| location | emitter | tag in `tests.csv` `source` column |
|---|---|---|
| `generated/camunda-oca/playwright/<operationId>.feature.spec.ts` | feature emitter (happy path + basic shape) | `feature` |
| `generated/camunda-oca/playwright/<operationId>.variant.spec.ts` | variant emitter (schema/input variations: `bpmn`, `oneOf …`, etc.) | `variant` |
| `generated/camunda-oca/playwright/edges/<EdgeName>.lifecycle.spec.ts` | edge lifecycle template (`establish → observe present → revoke → observe absent`) | `lifecycle` |
| `generated/camunda-oca/playwright/entities/<EntityName>.lifecycle.spec.ts` | entity lifecycle template (`create → present → update → present → delete → absent`) | `lifecycle` |
| `generated/camunda-oca/request-validation/<entity>-validation-api-tests.spec.ts` | request-validation emitter (negative schema cases, all bad-request) | `request-validation` |
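
The mapping in the table above can be sketched as a small path classifier. This is an assumption about how such a lookup could be written, not the code in `build_coverage.py`; the function name `source_tag` is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

ROOT = "generated/camunda-oca"


def source_tag(path: str) -> Optional[str]:
    """Return the tests.csv `source` tag for a generated spec path, or None."""
    p = PurePosixPath(path)
    name = p.name
    parent = p.parent.as_posix()

    # Request-validation specs live in their own directory.
    if parent == f"{ROOT}/request-validation":
        if name.endswith("-validation-api-tests.spec.ts"):
            return "request-validation"
        return None

    # Edge and entity lifecycle templates share one tag.
    if parent in (f"{ROOT}/playwright/edges", f"{ROOT}/playwright/entities"):
        return "lifecycle" if name.endswith(".lifecycle.spec.ts") else None

    # Feature and variant specs sit directly under playwright/.
    if parent == f"{ROOT}/playwright":
        if name.endswith(".feature.spec.ts"):
            return "feature"
        if name.endswith(".variant.spec.ts"):
            return "variant"
    return None


print(source_tag(f"{ROOT}/playwright/createTenant.feature.spec.ts"))
```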

## Regenerate

The analyser reads from `spec/camunda-oca/bundled/rest-api.bundle.json` and
`generated/camunda-oca/`, both of which are gitignored. On a fresh checkout
you need to populate them first:

```sh
npm install # one-time, brings in tooling deps
npm run pipeline # fetch spec + generate scenarios + emit feature/variant/lifecycle playwright tests + request-validation
python3 coverage-analysis/build_coverage.py
```

(`npm run pipeline` already chains `fetch-spec → testsuite:generate → generate:request-validation`, so the request-validation emitter does not need a separate invocation.)

If `spec/camunda-oca/bundled/` and `generated/camunda-oca/` are already
populated locally, only the last command is required to refresh the
analysis. The analyser itself has no dependencies beyond the Python stdlib.

## How tests are classified

- **Entity** — derived from the first path segment of the endpoint
(`/jobs/...` → `job`, `/process-instances/...` → `process-instance`).
Mapping is explicit (`SEGMENT_TO_ENTITY`) to preserve the few entities upstream
keeps plural (`cluster-variables`, `decision-requirements`,
`message-subscriptions`) and to fold `deployments` into `resource`.
- **Operation** — derived from the operationId prefix (`create*`, `delete*`,
`update*`, `search*`/`list*`, `get*`/`fetch*`) with HTTP-method fallback.
- **Category** (A–O upstream buckets, plus a v2-only `P. Agent-Instance`) —
derived from entity, with `assign*To*` / `unassign*From*` and
`search*For(Group|Role|Tenant)` operations classified as
`B. Membership/Association`.
- **Variant** — derived from the generator's test-name suffix:
- `base` → `happy-path`
- `bpmn` / `dmn` / `drd` / `form` / `path` / `cycle/*` / `oneOf *` → `data-driven`
- `negative empty` → `observe-absence`
- `variant-N - scenario` (dynamic name) → `unlabeled`
- **Form step** — derived from operation + variant
(`create`, `observe-present-get`, `observe-present-search`, `mutate`, `delete`,
`observe-absence`).
- **Prerequisite** — entity-based, copied from upstream's mapping; for
membership ops it's parent + member (e.g. `tenant + client`).
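
The entity, operation, and variant rules above can be sketched roughly as follows. This is a simplified illustration, not the analyser's code: the `SEGMENT_TO_ENTITY` entries are a small subset, the operation labels for `list*`/`fetch*`, the HTTP-method fallback table, the pass-through for unmapped segments, and the function names are all assumptions.

```python
# Illustrative subset; the real SEGMENT_TO_ENTITY in build_coverage.py is larger.
SEGMENT_TO_ENTITY = {
    "jobs": "job",
    "process-instances": "process-instance",
    "cluster-variables": "cluster-variables",  # kept plural, as upstream does
    "deployments": "resource",                 # folded into `resource`
}

# operationId prefix -> operation label (grouping of list/fetch is assumed).
OPERATION_PREFIXES = [
    ("create", "create"), ("delete", "delete"), ("update", "update"),
    ("search", "search"), ("list", "search"),
    ("get", "get"), ("fetch", "get"),
]

# Assumed HTTP-method fallback when no prefix matches.
METHOD_FALLBACK = {"POST": "create", "DELETE": "delete",
                   "PUT": "update", "PATCH": "update", "GET": "get"}

DATA_DRIVEN_SUFFIXES = {"bpmn", "dmn", "drd", "form", "path"}


def entity_of(path: str) -> str:
    """Entity from the first path segment; unmapped segments pass through."""
    segment = path.strip("/").split("/")[0]
    return SEGMENT_TO_ENTITY.get(segment, segment)


def operation_of(operation_id: str, method: str) -> str:
    """Operation from the operationId prefix, else from the HTTP method."""
    for prefix, label in OPERATION_PREFIXES:
        if operation_id.startswith(prefix):
            return label
    return METHOD_FALLBACK.get(method.upper(), "other")


def variant_of(suffix: str) -> str:
    """Variant label from the generator's test-name suffix."""
    if suffix == "base":
        return "happy-path"
    if (suffix in DATA_DRIVEN_SUFFIXES
            or suffix.startswith("cycle/")
            or suffix.startswith("oneOf ")):
        return "data-driven"
    if suffix == "negative empty":
        return "observe-absence"
    # Dynamic names such as `variant-N - scenario` end up here.
    return "unlabeled"


print(entity_of("/process-instances/{processInstanceKey}"),
      operation_of("searchJobs", "POST"),
      variant_of("base"))
```

Keeping the segment mapping explicit, rather than stripping a trailing `s`, is what lets the plural entities and the `deployments` fold survive.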

## Comparison with upstream

Upstream snapshot:
[camunda/camunda#53387](https://github.com/camunda/camunda/pull/53387)
(head `7cf8bc1`). In its `coverage_matrix.csv` the `total` column equals
unique-test count; variant columns are label-occurrences, so a test tagged
`happy-path|filter` shows up in both.
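
The difference between the `total` column and the variant columns can be shown with a few made-up rows. The sketch assumes variant labels are joined with `|`, as in the `happy-path|filter` example; the helper name `summarize` is hypothetical.

```python
from collections import Counter


def summarize(variant_cells):
    """Return (unique test count, label-occurrence counts)."""
    total = len(variant_cells)  # one cell per test
    occurrences = Counter(
        label
        for cell in variant_cells
        for label in cell.split("|")
        if label
    )
    return total, occurrences


# Three hypothetical tests, one of which carries two labels.
total, occurrences = summarize(["happy-path|filter", "bad-request", "happy-path"])
print(total, dict(occurrences))
```

Here the label occurrences sum to 4 while only 3 tests exist, which is why variant columns cannot be added up to reproduce `total`.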

| | upstream | generator |
|---|---:|---:|
| Unique tests | 1001 | **1617** |
| Entities | 33 | 37 |
| Happy-path (occurrences) | 173 | 211 |
| Bad-request (400, occurrences) | 195 | **1071** |
| Pagination-sort (occurrences) | 53 | **85** |
| Filter (occurrences) | 85 | **196** |
| Observe-absence | 2 | 48 |
| Data-driven / oneOf variants | 5 | 302 |
| Unauthorized (401) | 165 | **0** |
| Not-found (404) | 127 | **0** |
| Conflict (409) | 31 | **0** |
| Forbidden (403) | 29 | **0** |

**The generator emits 616 more tests than upstream.** It dominates upstream on 400 bad-request coverage (the `request-validation` emitter alone produces 1071 tests across 17 violation kinds: `additional-prop`, `constraint-violation`, `enum-violation`, `format-invalid`, `missing-body`, `missing-required`, `missing-required-combo`, `oneof-ambiguous`, `oneof-cross-bleed`, `oneof-none-match`, `param-constraint-violation`, `param-missing`, `param-type-mismatch`, `type-mismatch`, `union`, `unique-items-violation`, and `additional-prop-general`).

**Pagination/filter counts need a caveat.** The generator's variant emitter sends `page: { after: cursor }` and `filter: { ... }` in request bodies on many search and batch-operation specs (detected by the classifier from the test body shape), so the variant column counts are non-zero. But these tests only assert `status === 200`; they do **not** assert pagination *correctness* (e.g. "page 2 yields the next N items, no overlap with page 1") or filter *correctness* (e.g. "filtering by `status=active` returns only active rows"). Upstream's 53 pagination and 85 filter tests are behaviour assertions, not request-shape assertions — so although the generator's pagination/filter counts now exceed upstream, the *semantic depth* is still much lower. The numeric comparison is a request-shape comparison, not a behaviour-coverage comparison.
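
To make the request-shape point concrete, here is a minimal sketch of body-shape detection. It assumes the classifier looks for `page: {` and `filter: {` literals in the test source; the regexes, the function name `body_shape_labels`, and the sample test body are illustrative, not copied from `build_coverage.py`.

```python
import re
from typing import List

# Match unquoted object keys as they would appear in emitted TypeScript.
PAGE_RE = re.compile(r"\bpage\s*:\s*\{")
FILTER_RE = re.compile(r"\bfilter\s*:\s*\{")


def body_shape_labels(test_body: str) -> List[str]:
    """Label a test by the request shapes it sends, ignoring its assertions."""
    labels = []
    if PAGE_RE.search(test_body):
        labels.append("pagination-sort")
    if FILTER_RE.search(test_body):
        labels.append("filter")
    return labels


# Hypothetical generated test body: sends both shapes, asserts only the status.
body = """
const resp = await request.post(url, {
  data: { page: { after: cursor }, filter: { state: "ACTIVE" } },
});
expect(resp.status()).toBe(200);
"""

labels = body_shape_labels(body)
print(labels)  # ['pagination-sort', 'filter']
```

The detector fires on what the request contains and never inspects what the test asserts, so a test like the one above counts toward both columns without checking any pagination or filter results.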

The buckets where the matrix currently shows zero generator tests:

- **401 unauthorized** (165 in upstream) — verified zero: no test asserts `status === 401` anywhere in the suite. Needs deployment-mode-aware auth context, see `camunda/camunda#52511`.
- **403 forbidden** (29 in upstream) — verified zero. Needs RBAC ABox + restricted-token test infrastructure.
- **404 not-found** (127 in upstream) — see note below; the matrix says zero, but the generator *does* assert 404 in 10 entity-lifecycle tests (final "observe absent" phase). The real gap is the fake-ID variant. Needs `ontology/` semantic-type-based fake-ID generation on path params.
- **409 conflict** (31 in upstream) — verified zero. Needs `duplicatePolicy` ABox slice (designed in 8.8, not yet landed; see #277).

### Note on the `not-found` count

The matrix shows `not-found: 0` for the generator, but this is a semantic distinction inherited from upstream's taxonomy, not "the generator never asserts 404". Upstream splits 404 assertions into two buckets:

- **`observe-absence`** — `GET` after `DELETE`, expect 404. Entity *was* created, now gone. Generator currently has **48 of these** (10 entity-lifecycle + 12 edge-lifecycle + 26 from feature/variant emitters that include `negative empty` semantics).
- **`not-found`** — `GET` against a fake/never-existing ID, expect 404. Entity *was never* created. Generator currently has **0** of these.

Concretely, every entity-lifecycle test ends with:

```typescript
expect(resp4.status()).toBe(404); // observe absent (after the prior delete)
```

These are real 404 assertions — they just exercise the "after-delete" path, not the "fake-ID" path. The capability gap is specifically the fake-ID pattern (replace `{tenantId}` with a generated invalid ID, call `GET /tenants/{tenantId}`, expect 404). Upstream's 127 `not-found` tests are mostly that fake-ID pattern.
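
The fake-ID substitution step could look roughly like the sketch below. It is only an illustration of the pattern: the names `fake_id` and `with_fake_ids` are hypothetical, and the random string ignores semantic types, whereas a real implementation would presumably need IDs shaped like the parameter's type.

```python
import re
import uuid

# Matches OpenAPI-style path parameters such as {tenantId}.
PARAM_RE = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def fake_id(param_name: str) -> str:
    """Hypothetical strategy: a random value unlikely to match a real entity."""
    return f"nonexistent-{param_name}-{uuid.uuid4().hex[:8]}"


def with_fake_ids(path_template: str) -> str:
    """Replace every {param} in a path template with a fake ID."""
    return PARAM_RE.sub(lambda m: fake_id(m.group(1)), path_template)


url = with_fake_ids("/tenants/{tenantId}")
print(url)
```

A `not-found` test would then issue `GET` against the resulting URL and expect 404, without any prior create step.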

See `gaps.md` for the categorised per-entity list.

## Limitations

- Variant classification depends on the generator's emitter suffix vocabulary.
When emitter names change (or new emitters are added), update `variants_of()` in
`build_coverage.py`.
- The generator emits substantial 400/bad-request coverage via the
`request-validation` emitter (1071 tests across 17 violation kinds), and
the variant emitter exercises pagination (`page.after` cursor) and filter
request shapes on many search/batch-operation specs (detected by the
classifier from the test body, not the test name). The buckets where the
matrix shows **zero** generator tests are: 401, 403, 404 (`not-found`, i.e.
the fake-ID pattern; after-delete 404s are counted under `observe-absence`),
and 409. These are generator capability gaps; see `gaps.md` for the
per-entity breakdown.
- The pagination-sort / filter counts in `coverage_matrix.csv` reflect
request-shape coverage (the test sends `page: { ... }` or
`filter: { ... }`), not behaviour coverage (the test asserts pagination
or filter *results* are correct). Upstream's hand-written tests assert
behaviour; the generator's tests assert only status code + response schema.
- Dynamic test names (`variant-N - scenario`) are bucketed as `unlabeled` because
reading the test body would be required to refine them.