Summary
When a class_derivation has a top-level populated_from: TableA, all slot_derivation blocks within it can only reference columns from TableA. There is currently no way to pull a value from a different source table (e.g., a demographics table) into a slot derivation that lives under a class derivation targeting a different table.
This is a common need in tabular data harmonization where related variables live in separate tables — for example, linking an age_at_observation from a demographics table to measurement observations loaded from a different data table.
Current behavior
populated_from at the slot level resolves within the same source class/table as the parent class_derivation. The existing cross-class support (#101, resolve_fk_path in fk_utils.py) handles dot-notation FK traversal (e.g., org_id.name) but requires a foreign key relationship defined in the schema. It does not support arbitrary cross-table lookups where the join key must be specified in the transformation spec.
Desired behavior
Within a class_derivation block with populated_from: TableA, a slot_derivation should be able to reference columns from TableB — for example:
class_derivations:
MeasurementObservation:
populated_from: lab_results
slot_derivations:
analyte_value:
populated_from: result_value
age_at_observation:
populated_from: demographics.age_at_exam # cross-table lookup
# join key: shared participant_id
The exact syntax and join-key specification mechanism need design work. Options include:
- Extending dot-notation (
OtherClass.slot) with an implicit or explicit join key
- A new attribute on
slot_derivation specifying the foreign table and join column
- Making cross-table references available in
expr: blocks via bindings
Use cases
- Longitudinal study data: Age variables live in a demographics/visit table but need to be attached to measurement observations from separate data tables
- Reference data joins: Coded values in one table need to be enriched with labels from a reference/lookup table
- Multi-table harmonization: Any scenario where the target schema combines attributes that originate from different source tables
Related
Summary
When a
class_derivationhas a top-levelpopulated_from: TableA, allslot_derivationblocks within it can only reference columns fromTableA. There is currently no way to pull a value from a different source table (e.g., a demographics table) into a slot derivation that lives under a class derivation targeting a different table.This is a common need in tabular data harmonization where related variables live in separate tables — for example, linking an
age_at_observationfrom a demographics table to measurement observations loaded from a different data table.Current behavior
populated_fromat the slot level resolves within the same source class/table as the parentclass_derivation. The existing cross-class support (#101,resolve_fk_pathinfk_utils.py) handles dot-notation FK traversal (e.g.,org_id.name) but requires a foreign key relationship defined in the schema. It does not support arbitrary cross-table lookups where the join key must be specified in the transformation spec.Desired behavior
Within a
class_derivationblock withpopulated_from: TableA, aslot_derivationshould be able to reference columns fromTableB— for example:The exact syntax and join-key specification mechanism need design work. Options include:
OtherClass.slot) with an implicit or explicit join keyslot_derivationspecifying the foreign table and join columnexpr:blocks via bindingsUse cases
Related
populated_from: OtherClass.slotfor FK-based slot lookupsrc/linkml_map/utils/fk_utils.py— Existing FK resolution infrastructuresrc/linkml_map/transformer/object_transformer.py— Runtime slot derivation logic