Skip to content

Refactor deprecation scanning to pre-normalize; allow list-form populated_from on PV#250

Open
amc-corey-cox wants to merge 8 commits into
mainfrom
pv-populated-from-list
Open

Refactor deprecation scanning to pre-normalize; allow list-form populated_from on PV#250
amc-corey-cox wants to merge 8 commits into
mainfrom
pv-populated-from-list

Conversation

@amc-corey-cox
Copy link
Copy Markdown
Contributor

Closes the deprecation of sources on PermissibleValueDerivation started in #194 (issue #193). Users who needed multi-source-to-one-target PV mapping had to keep using deprecated sources: [...]; now they can write populated_from: [light_red, dark_red].

Scope: PermissibleValueDerivation only. Class/Slot/Enum populated_from list semantics are entangled with broader dispatch design and can follow on their own schedule.

Summary

  • populated_from on PV derivations is now multivalued. Scalar input (populated_from: 'x') is wrapped to a one-element list at load time.
  • sources (deprecated) is migrated into populated_from at runtime entry points when populated_from is empty. sources itself is left in place so the validator's existing deprecation warning still fires.
  • Setting both populated_from and sources on the same PV is now a validator error. validate-spec exits 1; map-data pre-flight surfaces it informationally per existing pre-flight behavior. populated_from wins at runtime when both are set.

Architecture note

Transformer._normalize_spec_dict does structural normalization only (shared with the validator). The new _migrate_pv_sources_to_populated_from is called separately from the three runtime entry points (create_transformer_specification and the two load_* methods). The validator therefore sees the un-migrated dict and can cleanly distinguish "user wrote both" from "we migrated sources".

Spillover fixes

The schema-mapper and inverter were treating populated_from as a scalar string and now handle the list. The markdown template joins the list with , so output reads SIBLING_OF rather than ['SIBLING_OF']. The inverter raises NonInvertibleSpecificationError when a PV has multiple source values (the compliance suite already marks list-form mappings as non-invertible).

Asked-from context

Surfaced from dm-bip: a user wrote populated_from: '1' (PV-level, value form) and needed to add 'M' as a second source value mapping to the same target PV. The old answer was "use deprecated sources"; now it's "use list populated_from".

Test plan

  • uv run pytest — 835 passed, 4 skipped
  • uv run ruff check . and uv run ruff format . — clean
  • New tests in tests/test_validator.py (list form, scalar form, sources-only deprecation, both-set conflict error)
  • New tests in tests/test_transformer/test_multi_enum.py (list-form maps multiple sources, scalar still works, sources migration preserved)
  • Existing deprecation test (test_deprecations.py::test_sources_on_pv_emits_deprecation_warning) still passes

Extends populated_from on PermissibleValueDerivation to accept a list of
source values, completing the deprecation of `sources` started in #194
(issue #193). Scope is limited to PV derivations; class/slot/enum list
support is separate work.

- Schema: populated_from is multivalued; scalar input is wrapped at load.
- Runtime: deprecated `sources` is migrated into `populated_from` so the
  enum-mapping loop has a single field to consult. `sources` is left in
  place after migration so the validator's deprecation warning still fires.
- Validator: setting both `populated_from` and `sources` on the same PV
  now produces severity=error (validate-spec gates on it).
- Architecture: the migration runs only at runtime entry points, so the
  validator sees un-migrated data and can distinguish "user wrote both"
  from "we migrated sources".
- Updated schema_mapper, inverter, and the markdown template to handle
  populated_from as a list. Inverter raises NonInvertibleSpecificationError
  when a PV deriv has multiple source values.
Copilot AI review requested due to automatic review settings May 18, 2026 12:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes the deprecation path for PermissibleValueDerivation.sources by making PermissibleValueDerivation.populated_from multivalued (with scalar-to-list coercion), adding validator errors for sources+populated_from conflicts, and updating runtime/inference/compiler code to handle list semantics.

Changes:

  • Update the transformer model so PV-level populated_from is multivalued, and coerce scalar PV populated_from to a one-element list during normalization.
  • Add validator logic to error when both sources and populated_from are set on the same PV derivation; runtime migrates sourcespopulated_from when populated_from is empty.
  • Update enum mapping, schema derivation, inversion, and markdown compilation to treat PV populated_from as a list; add tests for list/scalar/migration/conflict behavior.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/test_validator.py Adds structural-validation tests for PV populated_from list/scalar acceptance, sources-only deprecation behavior, and both-set conflict error.
tests/test_transformer/test_multi_enum.py Adds runtime tests for list-form PV mapping, scalar normalization, and sources→populated_from migration behavior.
src/linkml_map/validator.py Adds check_pv_field_conflicts() and wires it into validate_spec() before deprecation/semantic checks.
src/linkml_map/transformer/transformer.py Adds PV-derivation iteration, scalar-to-list coercion for PV populated_from, and runtime migration from PV sources.
src/linkml_map/transformer/object_transformer.py Updates enum value mapping to use list-form populated_from membership checks.
src/linkml_map/inference/schema_mapper.py Treats PV sources as populated_from (preferred) or sources for derived enum PV enumeration.
src/linkml_map/inference/inverter.py Rejects inversion when PV populated_from has multiple sources; updates inverted PV derivations to list-form populated_from.
src/linkml_map/datamodel/transformer_model.yaml Declares PV populated_from as multivalued and updates deprecation text for sources.
src/linkml_map/datamodel/transformer_model.py Regenerates the Pydantic model to reflect PV populated_from: list[str] and updated deprecation metadata.
src/linkml_map/compiler/templates/markdown.j2 Renders PV populated_from lists as comma-joined strings.

Comment thread src/linkml_map/transformer/object_transformer.py
Comment thread src/linkml_map/transformer/transformer.py Outdated
…tion

All deprecation scanning now runs in one pre-normalize pass inside
_normalize_spec_dict, so every spec-loading entry point gets it. PV
migration clears sources; runtime asserts the invariant. _warn_deprecated_fields
shim removed; pre-flight reads scan messages from the Transformer.
object_derivations deprecation gains validate-spec surface.

Also regenerated docs/schema/PermissibleValueDerivation.md (CI fix).
@amc-corey-cox amc-corey-cox changed the title Allow list-form populated_from on PermissibleValueDerivation Refactor deprecation scanning to pre-normalize; allow list-form populated_from on PV May 18, 2026
@amc-corey-cox
Copy link
Copy Markdown
Contributor Author

Scope expansion: architectural refactor of deprecation scanning

Both Copilot threads and offline discussion flagged that the original implementation put the PV sources → populated_from migration in a separate method called from three Transformer methods, missing three additional spec-loading paths (Session.set_transformer_specification, utils.loaders.load_specification, cli/cli.py:763).

Commit 6317029 reworks this:

  • All deprecation scanning runs in a single pre-normalize pass inside Transformer._normalize_spec_dict. The scan also folds in object_derivations detection (previously a warnings.warn at flatten-time, not surfaced through validate-spec).
  • Migration clears sources so the runtime can rely on populated_from alone. object_transformer.transform_enum asserts the invariant.
  • _normalize_spec_dict emits DeprecationWarning and raises SpecificationError (subclasses ValueError) by default; silent=True for the validator path.
  • _warn_deprecated_fields shim removed from inference.py. Pre-flight reads scan messages stored on the Transformer rather than re-scanning the post-migration spec.
  • All five spec-loading entry points now get scanning + migration by construction. Any future deprecation that wants auto-migration uses the same pattern: add to the scan, optionally add to the migration phase.

PR title updated. Original PR body retained for history.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

Comment thread src/linkml_map/transformer/transformer.py Outdated
Comment thread src/linkml_map/validator.py Outdated
Comment thread src/linkml_map/compiler/templates/markdown.j2 Outdated
Comment thread src/linkml_map/transformer/transformer.py
- _iter_derivation_dicts now recognizes compact-key list form so the
  deprecation scan catches `permissible_value_derivations: [{name: {...}}]`.
- _coerce_pv_populated_from_to_list strips explicit-null populated_from
  (also handles [None] post-ReferenceValidator) so pydantic uses the
  default-factory.
- markdown.j2 defends with `or []` so a programmatic None doesn't crash
  the template.
…dict

Three phases now: SHAPE (recursive compact-key + dict↔list everywhere),
SCAN (on canonical shape), MIGRATE (object_derivations flatten,
populated_from inheritance, PV scalar→list, PV None-strip, PV
sources→populated_from).

Field semantics no longer entangled with structural normalization, so
_iter_derivation_dicts can assume canonical shape.
Copilot AI review requested due to automatic review settings May 18, 2026 20:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

Comment thread src/linkml_map/validator.py Outdated
Comment thread src/linkml_map/transformer/transformer.py Outdated
Comment thread src/linkml_map/transformer/transformer.py Outdated
Comment thread src/linkml_map/transformer/object_transformer.py Outdated
xfail(strict=True) on the correct post-fix behavior. Fails loudly with
XPASS when upstream lands the ReferenceValidator fix, signaling that
the PV-level branch in _pre_shape_expand_compact_keys can be removed.
Removed tests/test_upstream_canaries.py — its xfail(strict=True) only
flips on the "accept and normalize" path; if upstream picks "reject
loudly" instead, the test errors out and never signals usefully.

Updated _pre_shape_expand_compact_keys docstring to explain compact-key
list is a linkml-map convention, not LinkML-canonical, and that the
pre-pass makes us independent of any upstream resolution.
Copilot AI review requested due to automatic review settings May 19, 2026 14:49
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

Comment thread src/linkml_map/transformer/transformer.py Outdated
Comment thread src/linkml_map/validator.py Outdated
Upstream's proposed fix hoists unambiguous compact-key list items as
the pk (errors on genuinely ambiguous cases). Our test case is
unambiguous — key is not a slot of PermissibleValueDerivation, value is
a dict — so when the fix lands, RV will produce {'red': {...}} and the
xfail(strict=True) flips to a clean signal to remove the workaround.
Copilot AI review requested due to automatic review settings May 19, 2026 15:06
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

Comment thread src/linkml_map/transformer/object_transformer.py Outdated
Comment thread src/linkml_map/validator.py Outdated
Comment thread src/linkml_map/transformer/transformer.py Outdated
- check_deprecated_fields docstring: reflect actual SHAPE→SCAN→MIGRATE
  ordering (scan now runs after RV, not before).
- _coerce_pv_populated_from_to_list: only strip explicit None / all-None
  lists; empty strings stay (legitimate PV value).
- _migrate_pv_sources_to_populated_from: wrap scalar string instead of
  exploding into characters.
- transform_enum: replace `assert not pv_deriv.sources` with explicit
  SpecificationError raise — invariant check stays enforced under `-O`.
- _iter_values_or_list docstring: relax to match actual use (pre- and
  post-SHAPE callers).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants