
Remove lossy edge deduplication from integration pipeline #2106

Merged
eKathleenCarter merged 26 commits into main from ekcarter/xdata-278-remove-edge-de-duplication
Mar 20, 2026
Conversation

@eKathleenCarter
Collaborator

@eKathleenCarter eKathleenCarter commented Mar 5, 2026

Description of the changes

Removes the lossy edge deduplication from the integration pipeline (XDATA-278) and replaces it with a configurable opt-in filter in the filtering pipeline.

union_edges was collapsing all edges sharing the same (subject, predicate, object) triple into a single row using groupBy().agg(F.first()). This silently dropped object_direction_qualifier, knowledge_level, agent_type, and primary_knowledge_source from all but one source per triple, making it impossible to preserve conflicting evidence (e.g. three edges asserting "increased", "decreased", and null direction for the same TGF-β relationship).
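The loss described above can be reproduced in a few lines. This is a minimal sketch in plain Python rather than Spark: the identifiers `EX:chem`, `EX:gene`, `infores:a/b/c` and the helper `collapse_first` are made up for illustration, and the grouping only imitates what a `groupBy().agg(F.first())` does. In Spark, which row `first` returns is not guaranteed after a shuffle, so the surviving qualifier can additionally vary between runs.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical edges: one SPO triple asserted by three sources with
# conflicting direction qualifiers (all identifiers are made up).
edges = [
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:a", "object_direction_qualifier": "increased"},
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:b", "object_direction_qualifier": "decreased"},
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:c", "object_direction_qualifier": None},
]

spo_key = itemgetter("subject", "predicate", "object")


def collapse_first(rows):
    """Keep only the first row of each SPO group, imitating agg(first(...))."""
    ordered = sorted(rows, key=spo_key)  # stable sort keeps input order within a group
    return [next(group) for _, group in groupby(ordered, key=spo_key)]


collapsed = collapse_first(edges)
directions_before = {e["object_direction_qualifier"] for e in edges}
directions_after = {e["object_direction_qualifier"] for e in collapsed}
print(len(edges), "->", len(collapsed), directions_after)
```

Three conflicting assertions go in and a single direction comes out, which is the behaviour this PR removes from the integration pipeline.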

Changes

Integration pipeline

  • union_edges: Removed the groupBy(SPO).agg(F.first()...) block. Each source now gets its own row, preserving all qualifier and evidence fields. primary_knowledge_sources (all PKS values for a given SPO triple) is computed via a separate non-lossy join and attached to every row.
  • normalize_edges: Changed subset dedup on (subject, predicate, object, primary_knowledge_source) to dropDuplicates() (exact-row only), so edges differing only in qualifiers or publications are no longer silently dropped.
  • Updated the unioned edge schema unique constraint to (subject, predicate, object, primary_knowledge_source) to reflect the new cardinality.
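As a rough illustration of the integration changes listed above, the sketch below keeps one row per (subject, predicate, object, primary_knowledge_source) and attaches the full list of sources for the triple to every row. It is plain Python, not the pipeline's Spark code; `union_edges_sketch` and the sample rows are hypothetical, and the real node may differ in details such as column order or how the join is expressed.

```python
from collections import defaultdict


def union_edges_sketch(rows):
    """Keep one row per (S, P, O, PKS) and attach all PKS values of the SPO triple."""
    seen = set()
    kept = []
    sources_by_spo = defaultdict(set)
    for row in rows:
        spo = (row["subject"], row["predicate"], row["object"])
        key = spo + (row["primary_knowledge_source"],)
        sources_by_spo[spo].add(row["primary_knowledge_source"])
        if key not in seen:  # drop repeats of the same source for the same triple
            seen.add(key)
            kept.append(dict(row))
    # Separate, non-lossy step: every kept row learns about all sources for its triple.
    for row in kept:
        spo = (row["subject"], row["predicate"], row["object"])
        row["primary_knowledge_sources"] = sorted(sources_by_spo[spo])
    return kept


# Hypothetical input: two sources for one triple, one of them repeated.
sample = [
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:a", "object_direction_qualifier": "increased"},
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:b", "object_direction_qualifier": "decreased"},
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:a", "object_direction_qualifier": "increased"},
]
unioned = union_edges_sketch(sample)
```

Each source keeps its own qualifier, while `primary_knowledge_sources` carries the cross-source provenance that the old aggregation used to compute at the cost of everything else.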

Filtering pipeline

  • Added DeduplicateEdges filter: an opt-in collapse of duplicate SPO triples for modeling use cases. Qualifier and evidence fields are preserved per-source in source_edge_properties (a JSON string keyed by primary_knowledge_source).
  • Optimized BiolinkDeduplicateEdges: the self-join now operates on a slim 4-column DataFrame (subject, object, predicate, parents) rather than carrying the full edge schema through the shuffle. This avoids a significant memory regression introduced by the new source_edge_properties column.
  • Added ArgoNode config (96 GB) for filter_prm_knowledge_graph_edges.
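The opt-in collapse described above could look roughly like the following. This is a sketch under assumptions: the PR states only that source_edge_properties is a JSON string keyed by primary_knowledge_source, so the field subset in `QUALIFIER_FIELDS`, the helper name `deduplicate_edges_sketch`, and the choice to omit null values are all hypothetical.

```python
import json

# Assumed subset of per-source fields; the real filter may carry more.
QUALIFIER_FIELDS = ("knowledge_level", "agent_type", "object_direction_qualifier")


def deduplicate_edges_sketch(rows):
    """Collapse rows to one per SPO triple, keeping per-source fields as JSON."""
    grouped = {}
    for row in rows:
        spo = (row["subject"], row["predicate"], row["object"])
        props = {f: row[f] for f in QUALIFIER_FIELDS if row.get(f) is not None}
        # If one source contributes several rows for a triple, the last one wins here.
        grouped.setdefault(spo, {})[row["primary_knowledge_source"]] = props
    return [
        {"subject": s, "predicate": p, "object": o,
         "source_edge_properties": json.dumps(by_source, sort_keys=True)}
        for (s, p, o), by_source in grouped.items()
    ]


# Hypothetical input rows with conflicting directions from two sources.
rows = [
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:a", "knowledge_level": "knowledge_assertion",
     "agent_type": "manual_agent", "object_direction_qualifier": "increased"},
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:b", "knowledge_level": "prediction",
     "agent_type": "text_mining_agent", "object_direction_qualifier": "decreased"},
]
deduplicated = deduplicate_edges_sketch(rows)
properties = json.loads(deduplicated[0]["source_edge_properties"])
```

Modeling code gets one row per triple, and the conflicting evidence remains recoverable by parsing the JSON column.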

Fixes / Resolves the following issues:

Checklist:

  • Added label to PR (e.g. enhancement or bug)
  • Ensured the PR is named descriptively. FYI: This name is used as part of our changelog & release notes.
  • Looked at the diff on github to make sure no unwanted files have been committed.
  • Made corresponding changes to the documentation
  • Added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been merged and published in downstream modules
  • If breaking changes occur or you need everyone to run a command locally after
    pulling in latest main, uncomment the below "Merge Notification" section and
    describe steps necessary for people
  • Ran on sample data using kedro run -e sample -p test_sample (see sample environment guide)

 - Update get_unioned_edge_schema() unique constraint to include
   primary_knowledge_source, allowing multiple edges with the same
   (subject, predicate, object) from different knowledge sources
 - Change normalize_edges() to exact-row dedup only (dropDuplicates()
   without subset), preserving edges that share SPO+PKS but differ in
   qualifiers or publications
 - Rewrite union_edges() to replace the lossy groupBy().agg(F.first())
   with dropDuplicates() on (subject, predicate, object,
   primary_knowledge_source), keeping per-source edge attributes intact
 - Compute primary_knowledge_sources via a separate non-lossy groupBy
   join so each edge row carries cross-source provenance for its SPO
 - Add logging to union_edges() for per-source edge counts and dedup delta
 - Update test_unify_edges to assert edges from different PKS are
   preserved as separate rows with correct primary_knowledge_sources
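To make the cardinality change in the commit message concrete, the sketch below checks a set of rows against a unique key. It is a plain-Python illustration, not the schema validation the pipeline actually uses; `find_key_violations` and the sample rows are hypothetical.

```python
from collections import Counter

OLD_KEY = ("subject", "predicate", "object")
NEW_KEY = ("subject", "predicate", "object", "primary_knowledge_source")


def find_key_violations(rows, key):
    """Return the key tuples that occur more than once, with their counts."""
    counts = Counter(tuple(row[k] for k in key) for row in rows)
    return {k: n for k, n in counts.items() if n > 1}


# Hypothetical unioned edges: the same triple from two knowledge sources.
unioned_rows = [
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:a"},
    {"subject": "EX:chem", "predicate": "biolink:affects", "object": "EX:gene",
     "primary_knowledge_source": "infores:b"},
]
old_violations = find_key_violations(unioned_rows, OLD_KEY)
new_violations = find_key_violations(unioned_rows, NEW_KEY)
```

Under the old SPO-only key these rows count as a violation; under the new key that includes primary_knowledge_source they are valid, which is why the schema constraint had to change together with union_edges().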
@eKathleenCarter eKathleenCarter requested a review from a team as a code owner March 5, 2026 19:07
@eKathleenCarter eKathleenCarter requested a review from lvijnck March 5, 2026 19:07
@eKathleenCarter eKathleenCarter marked this pull request as draft March 5, 2026 19:08
@eKathleenCarter eKathleenCarter requested review from JacquesVergine and removed request for lvijnck March 5, 2026 19:08
@eKathleenCarter eKathleenCarter self-assigned this Mar 5, 2026
@eKathleenCarter
Collaborator Author

| Row | total_edges | distinct_spo_triples | additional_edges_from_dedup_removal |
| --- | --- | --- | --- |
| 1 | 80,515,662 | 77,170,212 | 3,345,450 |
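The last column is simply the difference of the first two, so the figures can be sanity-checked directly. The variable names below are illustrative; the numbers are the ones reported in the table.

```python
total_edges = 80_515_662
distinct_spo_triples = 77_170_212

# Rows that share an SPO triple with another row and were previously collapsed.
additional_edges = total_edges - distinct_spo_triples

# Share of the new edge table made up of those additional rows.
share_of_total = additional_edges / total_edges
print(f"{additional_edges:,} additional edges ({share_of_total:.1%} of all edges)")
```

Roughly 4% of the rows in the new unified edge table would have been dropped by the old deduplication.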

@eKathleenCarter
Collaborator Author

eKathleenCarter commented Mar 5, 2026

WITH pr2106 AS (
  SELECT
    primary_knowledge_source,
    COUNT(*) AS edge_count_pr2106
  FROM `mtrx-hub-dev-3of.release_PR2106_test.edges_unified`
  GROUP BY primary_knowledge_source
),

v0150 AS (
  SELECT
    primary_knowledge_source,
    COUNT(*) AS edge_count_v0150
  FROM `mtrx-hub-dev-3of.release_v0_15_0.edges_unified`
  GROUP BY primary_knowledge_source
)

SELECT
  COALESCE(pr2106.primary_knowledge_source, v0150.primary_knowledge_source) AS primary_knowledge_source,
  edge_count_pr2106,
  edge_count_v0150,
  edge_count_pr2106 - edge_count_v0150 AS edge_difference
FROM pr2106
FULL OUTER JOIN v0150
  ON pr2106.primary_knowledge_source = v0150.primary_knowledge_source
WHERE edge_count_pr2106 IS DISTINCT FROM edge_count_v0150
ORDER BY ABS(edge_difference) DESC;

Only the 25 most-changed sources are shown.

| primary_knowledge_source | edge_count_pr2106 | edge_count_v0150 | edge_difference |
| --- | --- | --- | --- |
| infores:ubergraph | 3,653,161 | 2,736,731 | 916,430 |
| infores:primekg | 9,372,210 | 8,956,446 | 415,764 |
| infores:pathwhiz | 8,232,041 | 7,968,074 | 263,967 |
| infores:hmdb | 2,699,189 | 2,439,023 | 260,166 |
| infores:text-mining-provider-targeted | 1,248,746 | 1,113,630 | 135,116 |
| infores:pharos | 389,056 | 265,955 | 123,101 |
| infores:bindingdb | 1,753,285 | 1,636,097 | 117,188 |
| infores:go-plus | 187,274 | 116,215 | 71,059 |
| infores:hetionet | 917,698 | 858,531 | 59,167 |
| infores:semmeddb | 20,304,934 | 20,246,772 | 58,162 |
| infores:fma-umls | 241,677 | 189,762 | 51,915 |
| infores:goa | 537,327 | 486,219 | 51,108 |
| infores:intact | 1,600,141 | 1,549,291 | 50,850 |
| infores:fma-obo | 119,530 | 69,712 | 49,818 |
| infores:mondo | 98,959 | 49,915 | 49,044 |
| infores:string | 10,896,709 | 10,852,506 | 44,203 |
| infores:ctd | 192,492 | 149,906 | 42,586 |
| infores:uberon | 73,977 | 34,994 | 38,983 |
| infores:umls-metathesaurus | 496,498 | 458,473 | 38,025 |
| infores:mesh | 831,618 | 798,937 | 32,681 |
| infores:chebi | 362,802 | 331,766 | 31,036 |
| infores:ncit | 1,016,881 | 986,211 | 30,670 |
| infores:cl | 45,111 | 14,622 | 30,489 |
| infores:hpo | 57,744 | 27,795 | 29,949 |
| infores:drugbank | 2,842,992 | 2,814,489 | 28,503 |

@eKathleenCarter
Collaborator Author

See below for examples of what source_edge_properties contains.


The table below was generated from the following BigQuery query; Claude was used to format the result into a readable Markdown table.

WITH base AS (
  SELECT
    subject,
    predicate,
    object,
    source_edge_properties,
    REGEXP_EXTRACT(subject, r'^([^:]+(?::[^:]+)?)') AS subject_prefix,
    REGEXP_EXTRACT(object, r'^([^:]+(?::[^:]+)?)') AS object_prefix,
    ARRAY_LENGTH(REGEXP_EXTRACT_ALL(source_edge_properties, r'"infores:[^"]+"')) AS infores_count,
    LENGTH(source_edge_properties) AS json_length
  FROM `mtrx-hub-dev-3of.run_test_pr2106_2_012ed03f.edges_filtered`
  WHERE source_edge_properties IS NOT NULL
    AND predicate NOT IN ('biolink:subclass_of', 'biolink:related_to', 'biolink:has_part')
),

combo_ranked AS (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY subject_prefix, predicate, object_prefix
      ORDER BY infores_count DESC, json_length DESC
    ) AS combo_rank
  FROM base
),

combo_dedup AS (
  SELECT *
  FROM combo_ranked
  WHERE combo_rank = 1
),

predicate_ranked AS (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY predicate
      ORDER BY infores_count DESC, json_length DESC
    ) AS predicate_rank
  FROM combo_dedup
),

predicate_limited AS (
  SELECT *
  FROM predicate_ranked
  WHERE predicate_rank <= 5
),

final AS (
  SELECT
    *,
    DENSE_RANK() OVER (ORDER BY predicate) AS predicate_group
  FROM predicate_limited
)

SELECT
  subject,
  predicate,
  object,
  source_edge_properties,
  subject_prefix,
  object_prefix,
  infores_count
FROM final
ORDER BY
  predicate_group,
  infores_count DESC
LIMIT 50;
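The query above ranks rows by `infores_count`, which counts every quoted string starting with `infores:` in the JSON. A small Python sketch shows how that differs from counting sources by parsing the JSON. The exact layout of source_edge_properties is an assumption here (an object keyed by primary_knowledge_source, with aggregator_knowledge_source stored as a plain string), and both helper names are hypothetical.

```python
import json
import re

# Same pattern as the REGEXP_EXTRACT_ALL call in the query.
INFORES_PATTERN = re.compile(r'"infores:[^"]+"')


def regex_infores_count(source_edge_properties):
    """Count quoted infores: strings, as the query does."""
    return len(INFORES_PATTERN.findall(source_edge_properties))


def source_count(source_edge_properties):
    """Count top-level keys, i.e. distinct primary knowledge sources."""
    return len(json.loads(source_edge_properties))


# Assumed JSON shape, modelled on the goa + hetionet rows in the result table.
example = json.dumps({
    "infores:goa": {"knowledge_level": "knowledge_assertion",
                    "aggregator_knowledge_source": "infores:robokop-kg"},
    "infores:hetionet": {"knowledge_level": "not_provided",
                         "aggregator_knowledge_source": "infores:robokop-kg"},
})
regex_count = regex_infores_count(example)
parsed_count = source_count(example)
```

Under that assumed layout the regex also matches aggregator values, so it overcounts relative to the number of sources. For picking illustrative rows this probably does not matter, since the ordering is only a heuristic, but the parsed count is the safer figure to report.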
subject predicate object source_edge_properties
NCBIGene:142 biolink:active_in GO:0090734 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:32028527, PMID:32241924, PMID:31796734, PMID:34795260, PMID:26626479, PMID:30675909, PMID:34874266, PMID:22683995, PMID:26626480, PMID:30104678, PMID:32358582, PMID:33186521
NCBIGene:9656 biolink:active_in GO:0035861 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:18678890, PMID:18411307, PMID:16377563, PMID:18583988, PMID:30898438, PMID:18582474
NCBIGene:282996 biolink:active_in GO:0005634 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:30262925, PMID:32840935, PMID:34732726, PMID:23886709, PMID:35427468, PMID:33188278
NCBIGene:6774 biolink:active_in GO:0005634 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:15653507, PMID:8756628, PMID:16285960, PMID:7543512, PMID:7568001, PMID:28781374, PMID:28811323, PMID:12743296, PMID:17344214, PMID:7528668
NCBIGene:4683 biolink:active_in GO:0035861 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:19338747, PMID:24534091, PMID:18411307, PMID:18678890, PMID:16622404, PMID:38961290, PMID:23115235, PMID:18583988, PMID:28867292, PMID:18582474
NCBIGene:6422 biolink:actively_involved_in GO:0090090 infores:hetionet
- knowledge_level: not_provided
- agent_type: not_provided
- aggregator_knowledge_source: infores:robokop-kg

infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:17462603, PMID:19254787, PMID:17035233, PMID:17471511, PMID:9391078, PMID:10347172, PMID:16149051, PMID:19095296, PMID:9724099, PMID:19277043, PMID:16532032, PMID:19072540, PMID:24080158, PMID:19569235, PMID:17994217, PMID:16288033, PMID:17443492
NCBIGene:4040 biolink:actively_involved_in GO:0060070 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:20093106, PMID:11742004, PMID:11029007, PMID:22988876, PMID:14739301, PMID:18215320, PMID:15908424, PMID:17239604, PMID:11448771, PMID:20059949, PMID:15271658, PMID:20093472, PMID:20137080, PMID:16890161, PMID:12121999, PMID:16805831, PMID:19107203, PMID:16365045

infores:hetionet
- knowledge_level: not_provided
- agent_type: not_provided
- aggregator_knowledge_source: infores:robokop-kg
NCBIGene:3297 biolink:actively_involved_in GO:0034605 infores:hetionet
- knowledge_level: not_provided
- agent_type: not_provided
- aggregator_knowledge_source: infores:robokop-kg

infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:9341107, PMID:17897941, PMID:16554823, PMID:14707147, PMID:10359787, PMID:10413683, PMID:7935471, PMID:10747973, PMID:12665592, PMID:12659875, PMID:9727490, PMID:21085490, PMID:26159920, PMID:9222587, PMID:12917326, PMID:11514557, PMID:7760831, PMID:11583998, PMID:9499401, PMID:8455624
NCBIGene:7040 biolink:actively_involved_in GO:0007179 infores:hetionet
- knowledge_level: not_provided
- agent_type: not_provided
- aggregator_knowledge_source: infores:robokop-kg

infores:goa
- knowledge_level: prediction
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:25196641, PMID:27693460, PMID:9389648, PMID:22714950, PMID:12574355, PMID:11157754, PMID:28373289, PMID:28467929, PMID:18625725, PMID:15334054, PMID:18593713, PMID:26908446, PMID:26572508, PMID:31023188, PMID:32644293, PMID:31368162, PMID:18453574, PMID:19736306
NCBIGene:650 biolink:actively_involved_in GO:0030509 infores:goa
- knowledge_level: prediction
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:20427544, PMID:16194878, PMID:18382765, PMID:24778011, PMID:22450430, PMID:18184661, PMID:27860183, PMID:19736317, PMID:19664780, PMID:20843790, PMID:17992660, PMID:18326817, PMID:18436533, PMID:16049014, PMID:16771708

infores:hetionet
- knowledge_level: not_provided
- agent_type: not_provided
- aggregator_knowledge_source: infores:robokop-kg
NCBIGene:5468 biolink:acts_upstream_of GO:0048662 infores:goa
- knowledge_level: prediction
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:20622039, PMID:18382765, PMID:28467929, PMID:29182484, PMID:31023188
NCBIGene:650 biolink:acts_upstream_of GO:0010628 infores:goa
- knowledge_level: prediction
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:18382765, PMID:23399447, PMID:27860183
NCBIGene:11315 biolink:acts_upstream_of GO:0045944 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:19703902, PMID:17015834, PMID:21097510, PMID:16731528
NCBIGene:3456 biolink:acts_upstream_of GO:0007259 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:8969169, PMID:8798579, PMID:8027027, PMID:7813427
NCBIGene:8671 biolink:acts_upstream_of GO:0042391 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:15273250, PMID:10069984, PMID:21976511
NCBIGene:23378 biolink:acts_upstream_of_negative_effect GO:1903450 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:21471221
NCBIGene:7161 biolink:acts_upstream_of_negative_effect GO:0051726 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:18421303
NCBIGene:5457 biolink:acts_upstream_of_negative_effect GO:0051726 infores:goa
- knowledge_level: prediction
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:18421303
NCBIGene:6794 biolink:acts_upstream_of_negative_effect GO:0070314 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:12805220, PMID:17216128
NCBIGene:91039 biolink:acts_upstream_of_negative_effect GO:0070269 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:34019797, PMID:33731929
NCBIGene:2627 biolink:acts_upstream_of_or_within_negative_effect GO:0070315 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:9593712
NCBIGene:9528 biolink:acts_upstream_of_or_within_negative_effect GO:0006486 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:20427278
NCBIGene:3688 biolink:acts_upstream_of_or_within_negative_effect GO:0150003 infores:goa
- knowledge_level: prediction
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:robokop-kg
NCBIGene:10019 biolink:acts_upstream_of_or_within_negative_effect GO:0038163 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:27430239, PMID:20404132
NCBIGene:1001 biolink:acts_upstream_of_or_within_negative_effect GO:0030512 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:22696062
NCBIGene:719 biolink:acts_upstream_of_or_within_positive_effect GO:0019722 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:10571060
NCBIGene:401024 biolink:acts_upstream_of_or_within_positive_effect GO:0007288 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:30137358, PMID:30745215
NCBIGene:58529 biolink:acts_upstream_of_or_within_positive_effect GO:0043503 infores:goa
- knowledge_level: prediction
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:18846255
NCBIGene:7114 biolink:acts_upstream_of_or_within_positive_effect GO:0043124 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:21343177
NCBIGene:23126 biolink:acts_upstream_of_or_within_positive_effect GO:0000724 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:26721387
NCBIGene:60496 biolink:acts_upstream_of_positive_effect GO:0009258 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:21238436, PMID:19933275
NCBIGene:54209 biolink:acts_upstream_of_positive_effect GO:0007613 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:31462511, PMID:29518357
NCBIGene:56052 biolink:acts_upstream_of_positive_effect GO:0006487 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:26931382, PMID:14973778
NCBIGene:4846 biolink:acts_upstream_of_positive_effect GO:0010628 infores:goa
- knowledge_level: prediction
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:23583836
NCBIGene:56478 biolink:acts_upstream_of_positive_effect GO:0035278 infores:goa
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:32354837, PMID:27342281, PMID:24335285, PMID:28487484
CHEBI:15940 biolink:affects NCBIGene:338442 infores:drugbank
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:12522134, PMID:12646212, PMID:15580557, PMID:16018973, PMID:16099840

infores:drugcentral
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:https://pubmed.ncbi.nlm.nih.gov/17705685

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:15929991, PMID:17430113, PMID:19223991
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:hmdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: abundance

infores:text-mining-provider-targeted
- knowledge_level: not_provided
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMC:6714165, PMID:31273921, PMC:6714165, PMID:31273921
- object_aspect_qualifier: activity_or_abundance
- object_direction_qualifier: increased

infores:gtopdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:12522134, PMID:12563315, PMID:12646212
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:chembl
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:dgidb
- knowledge_level: prediction
- agent_type: automated_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:25737085

infores:pharos
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:22209457, PMID:24900295, PMID:29939744, PMID:24900372, PMID:22435740, PMID:17804224, PMID:20363624, PMID:25737085, PMID:19524438, PMID:29709786
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:drugmechdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:semmeddb
- knowledge_level: prediction
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:19223991, PMID:26621144, PMID:24412617, PMID:35595517, PMID:17932499, PMID:33924461, PMID:17554232, PMID:17430113, PMID:29514953, PMID:32397071, PMID:19349687, PMID:33605454, PMID:16674924, PMID:25599616, PMID:36991440, PMID:22914621, PMID:31617441, PMID:20655299, PMID:21317532

infores:bindingdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:17994679, PMID:17804224, PMID:17358052, PMID:17588745, PMID:18029181, PMID:18752940, PMID:18760600, PMID:19309152, PMID:26784936, PMID:20184326, PMID:20363624, PMID:20444602, PMID:20452209, PMID:24900295, PMID:22209457, PMID:17452318, PMID:19592242, PMID:24900372, PMID:22435740, PMID:24900524, PMID:19524438, PMID:29709786, PMID:25737085, PMID:29939744, PMID:32199732
- object_aspect_qualifier: activity
- object_direction_qualifier: increased
CHEBI:16469 biolink:affects NCBIGene:2100 infores:chembl
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:9048584
- object_aspect_qualifier: molecular_interaction
- object_direction_qualifier: decreased

infores:bindingdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:16610787, PMID:11906280, PMID:15713407, PMID:25305688, PMID:25559213, PMID:16162002, PMID:19128016, PMID:11708925, PMID:12459017, PMID:27647367, PMID:19286283, PMID:21381753, PMID:21481497, PMID:21885279, PMID:22122563, PMID:15081034, PMID:16309907, PMID:17448656, PMID:17890084, PMID:19863083, PMID:20553023, PMID:21218783, PMID:22283328, PMID:22647217, PMID:23043242, PMID:20812681, PMID:29641206, PMID:30144697, PMID:23608764, PMID:29741891, PMID:30967304, PMID:31879182, PMID:27692995, PMID:32912438, PMID:35870729
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:hmdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: abundance

infores:text-mining-provider-targeted
- knowledge_level: not_provided
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMC:3610010, PMID:19350480, PMID:24352099, PMID:34455235, PMID:15087431, PMC:6773857, PMC:6697483, PMC:8448167, PMID:12573528
- object_aspect_qualifier: activity_or_abundance
- object_direction_qualifier: decreased

infores:semmeddb
- knowledge_level: prediction
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:10342855, PMID:11997324, PMID:12737316, PMID:14761887, PMID:15138633, PMID:15389627, PMID:28437005, PMID:8769313

infores:pharos
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:24315190, PMID:12954056, PMID:15225686, PMID:15658851, PMID:15084115, PMID:15876535, PMID:18722117, PMID:15993065, PMID:17448656, PMID:15456246, PMID:15203155, PMID:16309907, PMID:16098741, PMID:30144697, PMID:15664843, PMID:15203156, PMID:27407030, PMID:32169784, PMID:16219463, PMID:20408532, PMID:31879182, PMID:16777408, PMID:28735214, PMID:12749898, PMID:16942012, PMID:25369367, PMID:28426931, PMID:16730987, PMID:16632357, PMID:15582421, PMID:16412638, PMID:15745820, PMID:12825935, PMID:18760603, PMID:17149865, PMID:15943471, PMID:19705860, PMID:15006374, PMID:17289385, PMID:10673099, PMID:17696335, PMID:29869503, PMID:17890084, PMID:12824043, PMID:17188490, PMID:15225685, PMID:16520733, PMID:17049855, PMID:11459665, PMID:15341953, PMID:15771467, PMID:16610787, PMID:15081034, PMID:29741891, PMID:28882502, PMID:20812681
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:dgidb
- knowledge_level: prediction
- agent_type: automated_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:20659801, PMID:16520733, PMID:19561619

infores:gtopdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:9048584
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:drugcentral
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:drugbank
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:23882126, PMID:2011412, PMID:17464340, PMID:24971815

infores:drugmechdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: increased
CHEBI:16469 biolink:affects NCBIGene:2099 infores:drugcentral
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2

infores:gtopdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:9048584
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:pharos
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:20875743, PMID:24315190, PMID:21510635, PMID:12954056, PMID:15225686, PMID:15658851, PMID:15084115, PMID:19836949, PMID:15876535, PMID:18722117, PMID:15456246, PMID:17448656, PMID:15203155, PMID:16309907, PMID:16098741, PMID:30144697, PMID:15664843, PMID:15203156, PMID:27407030, PMID:32169784, PMID:16219463, PMID:20408532, PMID:31879182, PMID:16777408, PMID:12749898, PMID:16942012, PMID:25369367, PMID:28426931, PMID:11965371, PMID:16730987, PMID:16632357, PMID:10447957, PMID:15582421, PMID:16412638, PMID:15745820, PMID:12825935, PMID:18760603, PMID:17149865, PMID:15943471, PMID:19705860, PMID:15006374, PMID:29348811, PMID:17289385, PMID:10673099, PMID:17696335, PMID:29869503, PMID:17379515, PMID:20621492, PMID:17890084, PMID:12824043, PMID:26183544, PMID:17188490, PMID:15225685, PMID:16520733, PMID:21839641, PMID:27647375, PMID:15341953, PMID:11459665, PMID:17049855, PMID:15771467, PMID:28105283, PMID:16610787, PMID:15081034, PMID:29741891, PMID:28882502, PMID:20812681
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:drugbank
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:17125913, PMID:17138652, PMID:23884115, PMID:2011412, PMID:11752352, PMID:24971815

infores:drugmechdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:hmdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: abundance

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:12612060, PMID:15615701, PMID:17157789, PMID:23094148, PMID:24586459
- object_aspect_qualifier: molecular_modification
- object_direction_qualifier: increased

infores:bindingdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:17149865, PMID:16610787, PMID:12825935, PMID:15658851, PMID:11906280, PMID:10673099, PMID:12749898, PMID:25369367, PMID:15006374, PMID:15084115, PMID:15225685, PMID:15225686, PMID:15582421, PMID:15745820, PMID:16219463, PMID:16412638, PMID:16632357, PMID:16730987, PMID:16777408, PMID:16942012, PMID:17049855, PMID:17188490, PMID:17289385, PMID:26183544, PMID:18760603, PMID:11965371, PMID:12824043, PMID:15203155, PMID:15203156, PMID:10447957, PMID:15341953, PMID:15876535, PMID:15664843, PMID:15456246, PMID:15943471, PMID:16098741, PMID:27407030, PMID:27647375, PMID:28105283, PMID:19705860, PMID:10098655, PMID:11055361, PMID:10714509, PMID:11738564, PMID:2909731, PMID:21510635, PMID:21839641, PMID:1548683, PMID:15081034, PMID:16309907, PMID:28882502, PMID:17696335, PMID:17448656, PMID:17890084, PMID:19836949, PMID:20875743, PMID:29348811, PMID:20812681, PMID:29869503, PMID:28426931, PMID:30144697, PMID:8784443, PMID:14971899, PMID:24315190, PMID:29741891, PMID:31879182, PMID:32169784, PMID:34415160, PMID:33904307, PMID:34710747, PMID:35276362, PMID:35859859
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:semmeddb
- knowledge_level: prediction
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:8175711, PMID:25864222, PMID:24668680

infores:chembl
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: increased

infores:text-mining-provider-targeted
- knowledge_level: not_provided
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMC:3851293, PMC:6121186, PMC:3610010, PMID:32645509, PMC:8616513, PMID:15778002, PMID:15930180, PMC:6697483, PMC:8491851, PMC:8910684, PMID:16145313, PMID:14600092
- object_aspect_qualifier: activity_or_abundance
- object_direction_qualifier: decreased

infores:dgidb
- knowledge_level: prediction
- agent_type: automated_agent
- aggregator_knowledge_source: infores:rtx-kg2
CHEBI:49668 biolink:affects NCBIGene:1956 infores:gtopdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:17416531, PMID:31895569, PMID:22037378
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:drugbank
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:10815932, PMID:11522647, PMID:11566608, PMID:11585753, PMID:11673690, PMID:11752352

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:17473213, PMID:21787763
- object_aspect_qualifier: molecular_interaction
- object_direction_qualifier: increased

infores:ncit
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2

infores:hmdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: abundance

infores:drugcentral
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2

infores:pharos
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:26313252, PMID:26256032, PMID:33556869, PMID:14684309, PMID:23792318, PMID:32139324, PMID:22309911, PMID:27491023, PMID:25462282, PMID:26879314, PMID:16480284, PMID:29421573, PMID:27288180, PMID:22119130, PMID:30878832, PMID:27234887, PMID:32031378, PMID:27769671, PMID:28711703, PMID:31986406, PMID:27987485, PMID:23973168, PMID:24607591, PMID:15975507, PMID:26706113, PMID:29945794, PMID:30096580, PMID:19969465, PMID:20166671, PMID:23988354, PMID:21724404, PMID:27524310, PMID:31560541, PMID:27132165, PMID:26318056, PMID:30472599, PMID:30600149, PMID:29275232, PMID:28291344, PMID:26829280, PMID:33640672, PMID:20961149, PMID:21334203, PMID:28838691, PMID:29496411, PMID:24731281, PMID:23668441, PMID:22101132, PMID:29775935, PMID:31488358, PMID:34464874, PMID:31869655, PMID:17981366, PMID:19170633, PMID:30471829, PMID:29133033, PMID:24411123, PMID:30508379, PMID:26451770, PMID:28395219, PMID:20550212, PMID:17416531, PMID:29407971, PMID:16806916, PMID:24588073, PMID:11459659, PMID:29523467, PMID:28974338, PMID:28711702, PMID:25468044, PMID:28385595, PMID:28236592, PMID:30554954, PMID:28238614
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:bindingdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:14684309, PMID:24954438, PMID:20222760, PMID:23792318, PMID:24588073, PMID:24607591, PMID:24731281, PMID:16516473, PMID:25409491, PMID:25462282, PMID:25468044, PMID:25768701, PMID:16806916, PMID:26275028, PMID:17889528, PMID:17981366, PMID:26256032, PMID:26318056, PMID:26829280, PMID:26879314, PMID:27234887, PMID:27288180, PMID:27387355, PMID:19170633, PMID:27132165, PMID:27491023, PMID:27769671, PMID:19665377, PMID:19969465, PMID:20056425, PMID:20166671, PMID:20304535, PMID:20550212, PMID:20961149, PMID:21334203, PMID:21353546, PMID:21724404, PMID:27987485, PMID:28236592, PMID:22119130, PMID:22101132, PMID:22309911, PMID:28238614, PMID:11459659, PMID:28291344, PMID:28225269, PMID:12270171, PMID:28603991, PMID:28366268, PMID:28385595, PMID:28853575, PMID:16480284, PMID:28838691, PMID:17416531, PMID:22698782, PMID:22818848, PMID:28711702, PMID:29421573, PMID:29407971, PMID:23142066, PMID:23611691, PMID:24411123, PMID:29775935, PMID:28974338, PMID:29133033, PMID:28711703, PMID:29496411, PMID:29945794, PMID:29275232, PMID:28395219, PMID:30096580, PMID:23668441, PMID:23930994, PMID:23973168, PMID:23988354, PMID:24094432, PMID:26451770, PMID:26706113, PMID:31488358, PMID:30508379, PMID:30600149, PMID:30878832, PMID:30554954, PMID:30471829, PMID:30472599, PMID:30655941, PMID:31023512, PMID:30973735, PMID:31560541, PMID:31718182, PMID:27524310, PMID:32031378, PMID:32085964, PMID:32139324, PMID:31986406, PMID:32145644, PMID:31869655, PMID:33429247, PMID:29523467, PMID:33556869, PMID:32828424, PMID:33640672, PMID:33771586, PMID:34216747, PMID:34464874, PMID:33316752, PMID:35964425, PMID:33539089, PMID:33970631, PMID:33540357, PMID:36384036, PMID:35247755, PMID:33243532, PMID:35696863, PMID:32739648, PMID:34655985, PMID:36336316, PMID:36509365, PMID:27414067, PMID:27564586, PMID:36812395, PMID:37857324
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:chembl
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:dgidb
- knowledge_level: prediction
- agent_type: automated_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:18681783, PMID:17375033, PMID:22370314, PMID:14990633, PMID:16467544, PMID:16377102, PMID:17317677, PMID:14990632, PMID:18784101, PMID:17020982, PMID:19473722, PMID:21921847, PMID:16912157, PMID:19589612, PMID:20573926, PMID:16258541, PMID:19692680, PMID:22581822, PMID:17192902, PMID:16203769

infores:text-mining-provider-targeted
- knowledge_level: not_provided
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMC:7988006, PMC:5460536, PMC:8536918, PMC:7696666, PMID:27126828
- object_aspect_qualifier: activity_or_abundance
- object_direction_qualifier: decreased

infores:semmeddb
- knowledge_level: prediction
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:16487738, PMID:28482613, PMID:16648858, PMID:26418954, PMID:16843264, PMID:17616694, PMID:22922893, PMID:20088784, PMID:18317068, PMID:18682844, PMID:16741154, PMID:18676863, PMID:21464610

CHEBI:114785 biolink:affects NCBIGene:1956

infores:dgidb
- knowledge_level: prediction
- agent_type: automated_agent
- aggregator_knowledge_source: infores:rtx-kg2

infores:drugmechdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:ncit
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2

infores:text-mining-provider-targeted
- knowledge_level: not_provided
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMC:5119972, PMC:1936363, PMID:16318436, PMID:17932674, PMC:8491203, PMC:5214696, PMC:9249568, PMC:6540786, PMC:6801465, PMC:8867612
- object_aspect_qualifier: activity_or_abundance
- object_direction_qualifier: decreased

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:21787763
- object_aspect_qualifier: molecular_interaction
- object_direction_qualifier: increased

infores:hmdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- object_aspect_qualifier: abundance

infores:pharos
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:26256032, PMID:14684309, PMID:18077425, PMID:17889528, PMID:20005116, PMID:32139324, PMID:19815412, PMID:20222733, PMID:24900830, PMID:18316192, PMID:16480284, PMID:26756222, PMID:30878832, PMID:26487917, PMID:32031378, PMID:19969465, PMID:26639762, PMID:20166671, PMID:27524310, PMID:28287730, PMID:24565969, PMID:22169601, PMID:32828424, PMID:30600149, PMID:24890652, PMID:33647840, PMID:33429247, PMID:30098869, PMID:19914837, PMID:34046625, PMID:21334203, PMID:32910655, PMID:26819674, PMID:33062157, PMID:27614407, PMID:27235841, PMID:22101132, PMID:31869655, PMID:24183742, PMID:17981366, PMID:26455919, PMID:24411123, PMID:25383627, PMID:30633509, PMID:25882519, PMID:28974338, PMID:27894589, PMID:19028424
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:gtopdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:22037378, PMID:14684309
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:bindingdb
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:14684309, PMID:18316192, PMID:21454582, PMID:24565969, PMID:24607998, PMID:24630412, PMID:24900830, PMID:24890652, PMID:25461317, PMID:25882519, PMID:25172421, PMID:17889528, PMID:26188620, PMID:17981366, PMID:26256032, PMID:26652482, PMID:26756222, PMID:27235841, PMID:20143778, PMID:19914835, PMID:19914837, PMID:19969465, PMID:20005116, PMID:19815412, PMID:20166671, PMID:20222733, PMID:20346655, PMID:20403700, PMID:20817523, PMID:21334203, PMID:21763148, PMID:21802290, PMID:21816517, PMID:21920766, PMID:27894589, PMID:22112541, PMID:22169601, PMID:22204741, PMID:22101132, PMID:22277588, PMID:22361272, PMID:12270171, PMID:28431353, PMID:16480284, PMID:18077425, PMID:20627597, PMID:20594859, PMID:20558072, PMID:22739090, PMID:22980218, PMID:29028534, PMID:34313432, PMID:29407956, PMID:23245570, PMID:23391364, PMID:23375090, PMID:23434139, PMID:24144854, PMID:24183742, PMID:24411123, PMID:30047410, PMID:28287730, PMID:28974338, PMID:29549841, PMID:30098869, PMID:30055463, PMID:23930994, PMID:23962660, PMID:26455919, PMID:26487917, PMID:31539778, PMID:30600149, PMID:31078412, PMID:30878832, PMID:30633509, PMID:30744932, PMID:31023512, PMID:30973735, PMID:31108261, PMID:31416740, PMID:31493743, PMID:27524310, PMID:32031378, PMID:32139324, PMID:32949719, PMID:33065442, PMID:33479688, PMID:32910655, PMID:31869655, PMID:33429247, PMID:33647840, PMID:27614407, PMID:32828424, PMID:33062157, PMID:34046625, PMID:32361329, PMID:34363937, PMID:33771586, PMID:34119830, PMID:34015505, PMID:33992931, PMID:36178776, PMID:35305462, PMID:33970631, PMID:34995690, PMID:35007724, PMID:33540357, PMID:35247755, PMID:33077264, PMID:32739648, PMID:34655985, PMID:36336316, PMID:36512711, PMID:35685617, PMID:27564586, PMID:37122549, PMID:27010345, PMID:37197473, PMID:36812395, PMID:37191268, PMID:38107169, PMID:37857324, PMID:36395650
- object_aspect_qualifier: activity
- object_direction_qualifier: decreased

infores:drugcentral
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2

infores:semmeddb
- knowledge_level: prediction
- agent_type: text_mining_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:16488082, PMID:19075865, PMID:25439691, PMID:22701710, PMID:23377280, PMID:27287856, PMID:15314969, PMID:28027514, PMID:22617245, PMID:19512896, PMID:19182243, PMID:30322862, PMID:25479544

infores:drugbank
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:rtx-kg2
- publications: PMID:12498017, PMID:12517254, PMID:12814826, PMID:12820772, PMID:12840797, PMID:11752352

NCBIGene:2952 biolink:affects_response_to CHEBI:16716

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:29204680, PMID:15256146, PMID:18848868, PMID:15226677, PMID:15591089, PMID:12460800, PMID:15935803, PMID:15533900, PMID:18836923

NCBIGene:153 biolink:affects_response_to CHEBI:3441

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:21516734, PMID:20643254, PMID:21395649, PMID:22192668, PMID:14502278, PMID:21599570, PMID:18075464, PMID:17200720

NCBIGene:1636 biolink:affects_response_to CHEBI:3011

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:11007831, PMID:12652327, PMID:12848919, PMID:15498266, PMID:15555355, PMID:15788353, PMID:15793787

NCBIGene:2952 biolink:affects_response_to CHEBI:25944

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:14504370, PMID:15922524, PMID:18550589, PMID:23982010, PMID:27185341, PMID:30678827, PMID:31569996

NCBIGene:3704 biolink:affects_response_to CHEBI:2948

infores:ctd
- knowledge_level: knowledge_assertion
- agent_type: manual_agent
- aggregator_knowledge_source: infores:robokop-kg
- publications: PMID:19129747, PMID:21961091, PMID:21544018, PMID:16431304, PMID:19214663, PMID:15167706, PMID:15571265

CHEBI:59560 biolink:ameliorates_condition MONDO:0017389

infores:efo
- knowledge_level: knowledge_assertion
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:rtx-kg2

CHEBI:15956 biolink:ameliorates_condition MONDO:0011841

infores:efo
- knowledge_level: knowledge_assertion
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:rtx-kg2

CHEBI:17310 biolink:ameliorates_condition MONDO:0012407

infores:efo
- knowledge_level: knowledge_assertion
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:rtx-kg2

CHEBI:17015 biolink:ameliorates_condition MONDO:0014795

infores:efo
- knowledge_level: knowledge_assertion
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:rtx-kg2

CHEBI:27690 biolink:ameliorates_condition MONDO:0020483

infores:efo
- knowledge_level: knowledge_assertion
- agent_type: manual_validation_of_automated_agent
- aggregator_knowledge_source: infores:rtx-kg2
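
A minimal sketch of how per-source rows like those listed above could be scanned for conflicting direction qualifiers. It uses three of the rows shown for CHEBI:49668 biolink:affects NCBIGene:1956 (other fields omitted) and plain Python rather than the pipeline's Spark code. The function names `directions_by_triple` and `conflicting_triples` are hypothetical and do not come from this PR.

```python
from collections import defaultdict

# Three per-source rows for one triple, copied from the listing above.
# Only the fields needed for the check are kept.
edges = [
    {"subject": "CHEBI:49668", "predicate": "biolink:affects", "object": "NCBIGene:1956",
     "primary_knowledge_source": "infores:gtopdb", "object_direction_qualifier": "decreased"},
    {"subject": "CHEBI:49668", "predicate": "biolink:affects", "object": "NCBIGene:1956",
     "primary_knowledge_source": "infores:ctd", "object_direction_qualifier": "increased"},
    {"subject": "CHEBI:49668", "predicate": "biolink:affects", "object": "NCBIGene:1956",
     "primary_knowledge_source": "infores:drugbank", "object_direction_qualifier": None},
]


def directions_by_triple(rows):
    """Map each (subject, predicate, object) triple to {source: direction}."""
    grouped = defaultdict(dict)
    for row in rows:
        spo = (row["subject"], row["predicate"], row["object"])
        grouped[spo][row["primary_knowledge_source"]] = row["object_direction_qualifier"]
    return dict(grouped)


def conflicting_triples(rows):
    """Return triples whose sources assert more than one non-null direction."""
    return {
        spo: by_source
        for spo, by_source in directions_by_triple(rows).items()
        if len({d for d in by_source.values() if d is not None}) > 1
    }


conflicts = conflicting_triples(edges)
```

A check like this only works while each source keeps its own row. Once a triple is collapsed to a single row, only one of the three direction values remains to inspect.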

@eKathleenCarter eKathleenCarter added the enhancement (improving an existing system or feature to work better) and kg-schema labels Mar 10, 2026
@eKathleenCarter
Collaborator Author

Waiting to review this with EC tomorrow. After the discussion, this will be ready for review.

@eKathleenCarter eKathleenCarter marked this pull request as ready for review March 10, 2026 19:31
…pairs duplicate (source, target) error in CII

The previous pool of 20 drugs/diseases caused ec_indications_list and off_label (both 100 rows) to sample with replacement, allowing duplicate (ec_id, target) pairs with differing on_label/off_label values. Because this PR changes normalize_edges from SPO-key dedup to all-column dedup, these duplicates survived into generate_pairs and caused a pandera unique=["source", "target"] violation.

Confirmed via BigQuery: no duplicate (source, target) pairs exist in the real ec_clinical_trials_edges_normalized, off_label_edges_normalized, or ec_indications_list_edges_normalized tables. The 110-row pool more accurately reflects production data for these sources.
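
The failure mode described in this commit message can be sketched in plain Python. This is an illustration under assumptions, not the pipeline's Spark or pandera code: the rows, the `label` column, and the helper names `dedup_on_key`, `dedup_exact`, and `violates_unique` are all hypothetical.

```python
# Hypothetical fabricated rows: the first two share (source, target) but differ
# in label, and the third is an exact copy of the first.
rows = [
    {"source": "drug:1", "target": "disease:1", "label": "on_label"},
    {"source": "drug:1", "target": "disease:1", "label": "off_label"},
    {"source": "drug:1", "target": "disease:1", "label": "on_label"},
    {"source": "drug:2", "target": "disease:1", "label": "on_label"},
]


def dedup_on_key(rows, key_cols):
    """Keep the first row per key, analogous to a subset-based dedup."""
    seen = {}
    for row in rows:
        seen.setdefault(tuple(row[c] for c in key_cols), row)
    return list(seen.values())


def dedup_exact(rows):
    """Drop only rows identical in every column, analogous to all-column dedup."""
    seen = {}
    for row in rows:
        seen.setdefault(tuple(sorted(row.items())), row)
    return list(seen.values())


def violates_unique(rows, key_cols):
    """True if any key appears more than once, as a unique constraint would flag."""
    keys = [tuple(row[c] for c in key_cols) for row in rows]
    return len(keys) != len(set(keys))


key = ["source", "target"]
by_key = dedup_on_key(rows, key)   # hides the conflicting label
exact = dedup_exact(rows)          # keeps both labels for drug:1 / disease:1

print(violates_unique(by_key, key), violates_unique(exact, key))
```

Key-based dedup masks the duplicated pair by discarding one label, while exact-row dedup keeps both rows, so a downstream uniqueness check on (source, target) then fails.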
Comment thread libs/matrix-pandera/src/matrix_pandera/schemas.py Outdated
Comment thread pipelines/matrix/conf/base/filtering/parameters.yml
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py Outdated
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py Outdated
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py Outdated
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py Outdated
Comment thread pipelines/matrix/tests/pipelines/test_filtering.py
Collaborator

@JacquesVergine JacquesVergine left a comment

Looks better, a few small comments and questions

Comment thread libs/matrix-pandera/src/matrix_pandera/schemas.py Outdated
Comment thread pipelines/matrix/conf/base/fabricator/parameters.yml
Comment thread pipelines/matrix/src/matrix/pipelines/data_release/nodes.py Outdated
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py Outdated
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py
Comment thread pipelines/matrix/src/matrix/pipelines/filtering/filters.py
Comment thread pipelines/matrix/conf/base/filtering/parameters.yml Outdated
@eKathleenCarter eKathleenCarter merged commit 147ef32 into main Mar 20, 2026
63 of 69 checks passed
@eKathleenCarter eKathleenCarter deleted the ekcarter/xdata-278-remove-edge-de-duplication branch March 20, 2026 15:43

Labels

enhancement (improving an existing system or feature to work better), kg-schema
