
refactor: slug-based identity for Convention and Ontology #129

@rorybyrne

Background

In #76 we moved Schema from server-generated UUID ids to user-supplied slugs — pdb-structure@1.0.0 on the wire instead of 4d09002b-0572-4fb0-b@1.0.0. That pattern was a clear win for readability and for client-side use (e.g., the Pockets frontend pinning schema on discovery queries).

Two remaining identity asymmetries are worth closing for the same reasons:

1. Convention still uses UUIDs

Today ConventionService.create_convention generates LocalId(str(uuid4())[:20]). The resulting ConventionSRN is opaque — urn:osa:localhost:conv:4d09002b-0572-4fb0-b@1.0.0. Convention is a declarative, user-named resource (it has a title, it's bundled with a specific schema + hooks), so there's no reason its identifier shouldn't be slug-shaped like Schema's.

This also fixes a real diagnostic pain: if two conventions are registered against the same schema (e.g. rcsb-pdb-full and rcsb-pdb-cryoem-only), today you cannot tell them apart by ID — they're two truncated UUIDs. With slugs, they're distinguishable at a glance.
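For illustration, a minimal sketch of the two identifier shapes side by side. The `conv_urn` helper is hypothetical and only mirrors the URN layout quoted above; it is not the project's actual SRN code.

```python
from uuid import uuid4


def conv_urn(local_id: str, version: str, node: str = "localhost") -> str:
    # Hypothetical helper: mirrors the urn:osa:<node>:conv:<id>@<version>
    # layout from the example above, nothing more.
    return f"urn:osa:{node}:conv:{local_id}@{version}"


# Today: the local id is the first 20 characters of a random UUID string.
opaque_a = str(uuid4())[:20]
opaque_b = str(uuid4())[:20]

# Two conventions on the same schema are indistinguishable to a human reader.
print(conv_urn(opaque_a, "1.0.0"))
print(conv_urn(opaque_b, "1.0.0"))

# With slugs, the same two conventions are self-describing.
print(conv_urn("rcsb-pdb-full", "1.0.0"))
print(conv_urn("rcsb-pdb-cryoem-only", "1.0.0"))
```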

2. Ontology stores srn: OntologySRN internally (inconsistent with Schema)

Schema was refactored in #76 to use id: SchemaId (short (id, version) form) internally, with SchemaSRN reserved for federation edges. Ontology still stores srn: OntologySRN as its identity. Same kind of resource (declarative, user-named, low-cardinality), different internal representation.

No new user-facing change needed here — ontologies already have meaningful slug-shaped IDs (ncbi-taxonomy, uberon). This is purely bringing the code to the same shape.

Out of scope

Record and Deposition should keep UUIDs. They're high-volume instance resources without natural keys at creation time. Domain-level identifiers (PDB ID, accession, DOI) belong in metadata, not in identity — and are already queryable via the discovery DSL. Coupling record identity to a mutable metadata field is the wrong direction.

Proposed work

Convention slug

  • Add ConventionIdentifier type alongside SchemaIdentifier in osa/domain/shared/model/srn.py (same regex: ^[a-z][a-z0-9-]{2,63}$).
  • CreateConvention command + route accept a required id: ConventionIdentifier.
  • ConventionService.create_convention uses the supplied slug instead of str(uuid4())[:20].
  • SDK: add __convention_id__ class attribute requirement on convention() declarations, parallel to __schema_id__ on MetadataSchema. Update rcsb-pdb and pockets/backend definitions.
  • Tests: slug validation, duplicate (id, version) raises ConflictError(code="convention_already_exists").
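A rough sketch of the validation and duplicate check the tests would cover. Only the regex and the error code come from this issue; `validate_convention_id`, `InMemoryConventionRegistry`, and the local `ConflictError` class are stand-ins invented for the example, not the real service or repository.

```python
import re

# Same pattern as SchemaIdentifier: a lowercase letter followed by
# 2 to 63 lowercase letters, digits, or hyphens (3 to 64 characters total).
SLUG_RE = re.compile(r"^[a-z][a-z0-9-]{2,63}$")


class ConflictError(Exception):
    # Stand-in for the project's ConflictError; only the `code` field is modelled.
    def __init__(self, message: str, code: str) -> None:
        super().__init__(message)
        self.code = code


def validate_convention_id(value: str) -> str:
    # Hypothetical validator: returns the slug unchanged or raises ValueError.
    if not SLUG_RE.fullmatch(value):
        raise ValueError(f"invalid convention id: {value!r}")
    return value


class InMemoryConventionRegistry:
    # Hypothetical in-memory stand-in for the service + repository pair.
    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()

    def create(self, id: str, version: str) -> tuple[str, str]:
        key = (validate_convention_id(id), version)
        if key in self._seen:
            raise ConflictError(
                f"convention {id}@{version} already exists",
                code="convention_already_exists",
            )
        self._seen.add(key)
        return key
```

Note that a truncated UUID such as 4d09002b-0572-4fb0-b fails this pattern because it starts with a digit, so old-style ids cannot be passed off as slugs by accident.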

Ontology identity flip

  • Introduce OntologyId = (LocalId, Semver) analogous to SchemaId.
  • Ontology aggregate: replace srn: OntologySRN with id: OntologyId.
  • OntologyRepository port + Postgres impl updated.
  • Ontology services + CLI import flow threaded through.
  • OntologySRN retained for federation-edge boundaries only (import/export).
  • Command/query wire fields updated (short-form id instead of full URN where applicable).
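One possible shape for the short-form identity, sketched as a frozen dataclass. Field names, the `parse` helper, and the strict three-part version check are assumptions made for the example; the real OntologyId should follow whatever SchemaId does in #76. The 1.0.0 version in the usage note is also made up.

```python
import re
from dataclasses import dataclass

# Assumption: plain MAJOR.MINOR.PATCH versions, no pre-release or build suffix.
_SEMVER_RE = re.compile(r"^\d+\.\d+\.\d+$")


@dataclass(frozen=True)
class OntologyId:
    # Short (id, version) identity used inside the domain.
    local_id: str
    version: str

    @classmethod
    def parse(cls, text: str) -> "OntologyId":
        # Split "<local_id>@<version>" on the first "@".
        local_id, sep, version = text.partition("@")
        if not sep or not local_id or not _SEMVER_RE.fullmatch(version):
            raise ValueError(f"invalid ontology id: {text!r}")
        return cls(local_id, version)

    def __str__(self) -> str:
        # Wire form: short id instead of the full URN.
        return f"{self.local_id}@{self.version}"
```

With this shape, `OntologyId.parse("ncbi-taxonomy@1.0.0")` round-trips through `str()`, and because the dataclass is frozen it can be used directly as a dict or set key. Conversion to and from OntologySRN would live at the import/export boundary and is not shown here.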

Expected size: ~400–500 lines of code + tests across the two parts. Follows the pattern already established in #76 so the shape is known.

Notes

  • No migration is needed for greenfield deployments: they get the right identity from day one.
  • When/if there's data to migrate from existing UUID-based conventions, a separate CLI backfill is the right shape (same reasoning as #76's decision on JSONB→typed backfill).

Labels: refactor (Internal restructuring, no behavior change)
