spike: evaluate RDF/SPARQL as a unifying query layer (#60) #63

Draft
jwulf wants to merge 3 commits into main from spike/rdf-issue-60



@jwulf jwulf commented Apr 29, 2026

Spike for #60. Recommendation: adopt the modelling, defer RDF.

See docs/spikes/rdf/RECOMMENDATION.md for the full reasoning. TL;DR:

  • The named entities in docs/spikes/rdf/ontology/core.ttl are the right abstractions whether or not the carrier is RDF.
  • Index parity passes — the modelling is faithful (bySemanticProducer, domainProducers, providerMap all re-derive cleanly from SPARQL queries).
  • The honest test passes: the planner can be written referring only to terms in core: (no Camunda-specific terms), so the per-API ↔ core abstraction line is structurally achievable.
  • Declarative re-expressions surfaced 5 latent silent-miss defects in domain-semantics.json and the loader (4 value-binding drift cases + 1 loader bug writing domainProducers["undefined"]).
  • The GitHub Issues + PRs second-API sketch fits the core ontology with zero new properties — confirms the boundary is in the right place.

Adopt RDF later if/when multi-API generalisation moves from aspirational to concrete. The spike artifacts (adapters, parity test, queries) are reusable as the migration starting point.

Deliverables (all under docs/spikes/rdf/)

| Phase | Artifact | Notes |
| --- | --- | --- |
| 1 | ontology/core.ttl, ontology/camunda.ttl, shapes/invariants.shapes.ttl | |
| 2 | adapters/build-store.ts | emits 3,497 quads from current pipeline state |
| 3 | parity/index-parity.ts | PASS for every well-formed key |
| 4a | queries/value-binding-drift.ts | surfaces 4 real defects |
| 4b | queries/minimal-scenario-chain.ts | dependsOn+ property path replaces gatherDomainPrerequisites |
| 5 | second-api-sketch.md | paper sketch |
| Recommendation | RECOMMENDATION.md | adopt modelling, defer RDF; includes 7-step follow-up plan |

Pre-push checks

  • npm run lint: clean
  • tsc --noEmit per workspace: clean
  • npm run testsuite:generate + generate:request-validation: clean
  • npm test: 89 passed (15 files)

Adds devDependencies oxigraph + rdf-validate-shacl + n3, scoped to the spike under docs/spikes/rdf/. No production-pipeline change.

Closes #60

jwulf added 3 commits April 29, 2026 15:17
…iants

Lays the modelling foundation for issue #60. Three artefacts:

- docs/spikes/rdf/ontology/core.ttl — API-agnostic vocabulary
  (Operation, SemanticType, RuntimeState, Capability, FieldPath,
  ValueBinding, ArtifactKind, Identifier, Disjunction, Scenario)
- docs/spikes/rdf/ontology/camunda.ttl — per-API instances and
  subclasses; demonstrates the per-API <-> core boundary
- docs/spikes/rdf/shapes/invariants.shapes.ttl — SHACL shapes pinning
  invariants the codebase enforces procedurally today (every required
  semantic type / runtime state has a producer; every value binding
  resolves to a known FieldPath and a parameter declared by its
  target state)

Adds devDependencies oxigraph (SPARQL 1.1) and rdf-validate-shacl
(SHACL) — scoped to the spike under docs/spikes/rdf/, no
production-pipeline change.

Refs #60
Phase 2 (adapters):
- docs/spikes/rdf/adapters/build-store.ts materialises an in-memory
  Oxigraph store from the normalised OperationGraph + DomainSemantics.
  Emits 3,497 quads from current pipeline state (183 operations,
  37 semantic types, 482 canonical field paths, plus runtime states /
  capabilities / identifiers / artifact kinds / value bindings /
  disjunctions).

Phase 3 (index parity — go/no-go checkpoint):
- docs/spikes/rdf/parity/index-parity.ts re-derives the loader's three
  reverse indexes (bySemanticProducer, domainProducers, providerMap)
  from SPARQL queries over the store and asserts parity against
  graphLoader.ts output.

Result: PARITY PASS for every well-formed key.
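
The parity idea can be illustrated without any RDF tooling. The sketch below is not the spike's index-parity.ts: it builds one reverse index procedurally, derives the same index by pattern-matching over plain triples (standing in for a SPARQL SELECT), and reports the keys on which the two disagree. The predicate name "produces" and the names opA, opB, TypeX, TypeY are hypothetical.

```typescript
// A triple is [subject, predicate, object]. The predicate "produces" is a
// hypothetical stand-in for whatever property the ontology actually uses.
type Triple = [string, string, string];
type ReverseIndex = Record<string, string[]>;

interface Op { id: string; produces: string[] }

// Procedural path: the shape of index a loader typically builds by looping.
function buildIndexProcedurally(ops: Op[]): ReverseIndex {
  const index: ReverseIndex = {};
  for (const op of ops) {
    for (const t of op.produces) {
      if (!index[t]) index[t] = [];
      index[t].push(op.id);
    }
  }
  return index;
}

// Declarative path: the same index derived by matching (?op produces ?type).
function deriveIndexFromTriples(triples: Triple[], predicate: string): ReverseIndex {
  const index: ReverseIndex = {};
  for (const [s, p, o] of triples) {
    if (p !== predicate) continue;
    if (!index[o]) index[o] = [];
    index[o].push(s);
  }
  return index;
}

// Returns the sorted list of keys whose producer sets differ (order-insensitive).
function parityMismatches(expected: ReverseIndex, actual: ReverseIndex): string[] {
  const keys: Record<string, true> = {};
  Object.keys(expected).forEach((k) => { keys[k] = true; });
  Object.keys(actual).forEach((k) => { keys[k] = true; });
  return Object.keys(keys).filter((k) => {
    const e = (expected[k] || []).slice().sort().join(",");
    const a = (actual[k] || []).slice().sort().join(",");
    return e !== a;
  }).sort();
}

// Hypothetical data: two operations and the equivalent triples.
const sampleOps: Op[] = [
  { id: "opA", produces: ["TypeX", "TypeY"] },
  { id: "opB", produces: ["TypeX"] },
];
const sampleTriples: Triple[] = [
  ["opA", "produces", "TypeX"],
  ["opA", "produces", "TypeY"],
  ["opB", "produces", "TypeX"],
];

const parityResult = parityMismatches(
  buildIndexProcedurally(sampleOps),
  deriveIndexFromTriples(sampleTriples, "produces"),
);
// parityResult is empty here, i.e. the two derivations agree on every key.
```

An empty mismatch list is the "PARITY PASS" condition in this simplified setting; any key present on only one side shows up in the list.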

Side finding (recorded for RECOMMENDATION.md):
- The JobTypeValue identifier in domain-semantics.json has no
  validityState, which causes the loader to write
  domainProducers["undefined"] = ["createDeployment"] — a silent
  data-quality issue. The SHACL IdentifierShape (validityState
  minCount 1) catches this at load time. Surfaced explicitly in the
  parity output as "LOADER-ONLY ARTIFACTS THE ONTOLOGY REJECTS".
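
How a missing field turns into an "undefined" key can be reproduced in a few lines. The sketch below is an illustration of the failure mode, not the loader's code: the Identifier shape, the function names, and the ExampleIdentifier entry are assumptions, while JobTypeValue and createDeployment come from the finding above. The second function is a plain-TypeScript analogue of a minCount 1 constraint.

```typescript
// Hypothetical, simplified shape of an identifier entry.
interface Identifier {
  name: string;
  validityState?: string; // optional in the data, which is the root of the problem
  producers: string[];
}

// Illustrates the bug: indexing by a possibly-missing property.
// JavaScript coerces the undefined key to the string "undefined".
function indexByValidityState(ids: Identifier[]): Record<string, string[]> {
  const out: Record<string, string[]> = {};
  for (const id of ids) {
    const key = id.validityState as string; // unchecked cast hides the gap
    out[key] = (out[key] || []).concat(id.producers);
  }
  return out;
}

// Analogue of a "minCount 1" check: list identifiers lacking validityState.
function missingValidityState(ids: Identifier[]): string[] {
  return ids.filter((id) => id.validityState === undefined).map((id) => id.name);
}

const identifiers: Identifier[] = [
  // From the side finding: no validityState, produced by createDeployment.
  { name: "JobTypeValue", producers: ["createDeployment"] },
  // Hypothetical well-formed entry for contrast.
  { name: "ExampleIdentifier", validityState: "ExampleState", producers: ["exampleOperation"] },
];

const silentIndex = indexByValidityState(identifiers);
// silentIndex now has an "undefined" key holding ["createDeployment"].

const rejected = missingValidityState(identifiers);
// rejected is ["JobTypeValue"]: the entry a load-time check would flag.
```

Running the check before building the index turns the silent no-op into an explicit load-time error.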

The brief's checkpoint is satisfied: structural equivalence at
planner-output level falls out for free without modifying any planner
algorithm code.

Refs #60
…sketch, RECOMMENDATION

Phase 4a (value-binding drift detector):
- docs/spikes/rdf/queries/value-binding-drift.ts expresses
  value-binding resolution as a SPARQL query. Surfaces FOUR latent
  domain-semantics defects that are silent runtime no-ops today:
    1. createDeployment -> FormDeployed.formKey: state does not exist.
    2. createDeployment -> ProcessDefinitionKey.processDefinitionKey:
       ProcessDefinitionKey is a semantic type, not a state.
    3. createProcessInstance -> ProcessInstanceExists.processInstanceKey:
       state declares parameter=processDefinitionId, not
       processInstanceKey. (Single-parameter modelling gap.)
    4. createProcessInstance -> ProcessDefinitionKey.processDefinitionKey:
       same type-confusion as #2.
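
The three failure classes in the list above can be expressed as a small resolution check. This is a sketch in plain TypeScript, not the spike's SPARQL query: the Binding and StateDecl shapes, the reason labels, and the single-parameter state model are assumptions; the four bindings and the ProcessInstanceExists parameter are taken from the list.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface StateDecl { parameter: string }
interface Binding { operation: string; target: string; parameter: string }

type DriftReason = "unknown-state" | "target-is-semantic-type" | "parameter-not-declared";
interface Drift { binding: Binding; reason: DriftReason }

// Flags every binding that does not resolve to a known state and a
// parameter declared by that state.
function findDrift(
  bindings: Binding[],
  states: Record<string, StateDecl>,
  semanticTypes: string[],
): Drift[] {
  const drift: Drift[] = [];
  for (const b of bindings) {
    if (semanticTypes.indexOf(b.target) !== -1) {
      drift.push({ binding: b, reason: "target-is-semantic-type" });
    } else if (!Object.prototype.hasOwnProperty.call(states, b.target)) {
      drift.push({ binding: b, reason: "unknown-state" });
    } else if (states[b.target].parameter !== b.parameter) {
      drift.push({ binding: b, reason: "parameter-not-declared" });
    }
  }
  return drift;
}

// Data as described in the list above.
const knownStates: Record<string, StateDecl> = {
  ProcessInstanceExists: { parameter: "processDefinitionId" },
};
const knownSemanticTypes = ["ProcessDefinitionKey"];
const reportedBindings: Binding[] = [
  { operation: "createDeployment", target: "FormDeployed", parameter: "formKey" },
  { operation: "createDeployment", target: "ProcessDefinitionKey", parameter: "processDefinitionKey" },
  { operation: "createProcessInstance", target: "ProcessInstanceExists", parameter: "processInstanceKey" },
  { operation: "createProcessInstance", target: "ProcessDefinitionKey", parameter: "processDefinitionKey" },
];

const driftReport = findDrift(reportedBindings, knownStates, knownSemanticTypes);
// driftReport has four entries, one per binding, in input order.
```

A binding that targets ProcessInstanceExists with parameter processDefinitionId would pass all three checks and not appear in the report.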

Phase 4b (minimal scenario-chain candidate query):
- docs/spikes/rdf/queries/minimal-scenario-chain.ts replaces
  gatherDomainPrerequisites() (the single hand-rolled multi-hop
  traversal in scenarioGenerator.ts ~L1254-L1273) with a
  core:dependsOn+ SPARQL property path. Also demonstrates required-
  producer candidate selection (replaces bySemanticProducer +
  providerMap reads in the planner) and a coverage-ranked picker.
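
In SPARQL 1.1, a property path ending in + matches one or more hops, so core:dependsOn+ yields every transitive prerequisite of a node. For comparison, a hand-rolled equivalent looks roughly like the sketch below. It is not gatherDomainPrerequisites() itself: the edge map, the function name, and the node names A to D, X, Y are hypothetical.

```typescript
// Adjacency map: node -> the nodes it directly depends on (hypothetical data).
type Edges = Record<string, string[]>;

// Returns every node reachable from `start` via one or more dependsOn hops,
// matching the semantics of a `dependsOn+` property path. The start node is
// included only if a cycle leads back to it.
function transitiveDependencies(edges: Edges, start: string): string[] {
  const seen: Record<string, true> = {};
  const stack = (edges[start] || []).slice();
  while (stack.length > 0) {
    const node = stack.pop() as string;
    if (seen[node]) continue; // already visited; also guards against cycles
    seen[node] = true;
    for (const next of edges[node] || []) stack.push(next);
  }
  return Object.keys(seen).sort();
}

const sampleEdges: Edges = {
  A: ["B", "D"],
  B: ["C"],
  C: [],
  D: ["C"],
};

const prerequisitesOfA = transitiveDependencies(sampleEdges, "A");
// prerequisitesOfA is ["B", "C", "D"]

const cyclicEdges: Edges = { X: ["Y"], Y: ["X"] };
const prerequisitesOfX = transitiveDependencies(cyclicEdges, "X");
// prerequisitesOfX is ["X", "Y"]: with a cycle, X is reachable from itself
```

The declarative version collapses the visited-set bookkeeping into the path operator, which is the trade the spike is evaluating.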

Phase 5 (second-API paper sketch):
- docs/spikes/rdf/second-api-sketch.md sketches a github: vocabulary
  for GitHub Issues + Pull Requests. The core ontology accommodates
  it without invasive changes — two SHACL relaxations and one
  optional new property (core:invalidates), all genuinely
  API-agnostic. Per-API vocabulary introduces zero new properties:
  the abstraction line (per-API adapters; API-agnostic core) holds.

RECOMMENDATION:
- docs/spikes/rdf/RECOMMENDATION.md: ADOPT THE MODELLING; DEFER RDF.
  The named entities are the right ones whether or not the carrier
  is RDF; reify them as TS types now. RDF adoption is a separable,
  lower-priority decision that becomes worthwhile when multi-API
  generalisation moves from aspirational to concrete. Spike artifacts
  (adapters, parity test, queries) are reusable as the migration
  starting point if/when that happens. Includes a 7-step concrete
  follow-up plan sized for normal PRs.

Pre-push checks:
- npm run lint: clean
- tsc --noEmit per workspace: clean
- npm run testsuite:generate + generate:request-validation: clean
- npm test: 89 passed (15 files)

Closes #60
Development

Successfully merging this pull request may close these issues.

Spike: evaluate RDF/SPARQL as a unifying query layer over the dependency, domain-semantics, and value-binding graphs
