
Add HRA small-intestine NTR batch outputs and review reports #3720

Open
zhengj2007 wants to merge 1 commit into master from add-hra-small_intestine-ntr


@zhengj2007
Collaborator

Generated via bulk_ntr_workflow from source_data/small-intestine.csv.

Pipeline completed through Stage 4:

  • 13 input rows processed
  • 0 new terms remained in final template after merge filtering
  • 10 confirmed existing UBERON matches excluded
  • 2 out-of-scope terms excluded
  • 2 manual-curation entries flagged
  • 4 name corrections reported

Adds the small-intestine source CSV, generated template stubs, and report TSVs (candidates, errors, input, manual_curation, name_corrections, out_of_scope).


Signed-off-by: dragon-ai-agent <dragon-ai-agent@users.noreply.github.com>
@ai4c-agent

ai4c-agent Bot commented May 26, 2026

Claude finished @zhengj2007's task in 4m 35s.


PR Review: Add HRA small-intestine NTR batch outputs and review reports

  • Gather context and read PR description
  • Review changed files (CSV, templates, reports)
  • Check hierarchy and parent consistency
  • Check definitions, xrefs, and references
  • Verify OBO/ODK conventions
  • Post final review summary

Review Checklist

  • Hierarchy and parents are consistent
  • Definitions and supporting references look adequate (N/A — no new terms added)
  • Obsolete or replacement handling is safe (N/A)
  • OBO and ODK conventions appear preserved
  • No obvious CI, auth, or workflow regression

Summary

This is a well-executed pipeline-output PR. All 13 HRA ASCTB small-intestine input terms have been processed; no new UBERON terms are introduced because every term either already exists, is out-of-scope, or requires manual curation before a new ID can be assigned. The report files are detailed and provide actionable curator guidance.

UBERON ID verification (spot-checked via obo-grep.pl):

| Source label | Candidates UBERON | Actual name in ontology | Status |
|---|---|---|---|
| Epithelium | UBERON:0013636 | epithelium of intestinal villus | ✅ valid |
| muscularis mucosa | UBERON:0006676 | muscularis mucosa | ✅ exact |
| muscularis externa | UBERON:0006660 | muscular coat | ✅ valid synonym |
| muscularis propria | UBERON:0006660 | muscular coat | ✅ valid synonym |
| submucosal plexus of Meissner | UBERON:0005304 | submucous nerve plexus | ✅ valid (eponym) |
| myenteric plexus of Auerbach | UBERON:0002439 | myenteric nerve plexus | ✅ valid (eponym) |
| mucosa | UBERON:0000344 | mucosa | ✅ exact |
| submucosa | UBERON:0000009 | submucosa | ✅ exact |
| serosa | UBERON:0000042 | serous membrane | ✅ valid synonym |
| connective tissue | UBERON:0002384 | connective tissue | ✅ exact |

Issues Found

🟡 IMPORTANT: Stage 5 (ODK registration) not completed

hra-small-intestine.template.tsv and hra-small-intestine-groups.template.tsv are not registered in src/ontology/uberon-odk.yaml. The existing hra-skeleton template is registered (line 177). Since both templates are currently empty (header-only, no new terms), omitting registration is reasonable for now — but when terms are eventually added via manual curation, registration will be required before the ODK build will include them. This should be done in the follow-up PR that adds the actual terms.
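
A follow-up PR could guard against a missed registration with a small check like the sketch below. It is only an illustration: the function name is hypothetical, the config excerpt is made up for the example, and the check is a plain substring search that does not assume anything about the real structure of src/ontology/uberon-odk.yaml.

```python
def unregistered_templates(config_text, template_names):
    """Return the template filenames that never appear in the config text.

    This is a plain substring check, so it makes no assumption about
    where in the YAML a template has to be listed.
    """
    return [name for name in template_names if name not in config_text]


# Hypothetical excerpt; the real uberon-odk.yaml layout is not shown in this PR.
config_text = """
templates:
  - hra-skeleton.template.tsv
"""

missing = unregistered_templates(
    config_text,
    [
        "hra-skeleton.template.tsv",
        "hra-small-intestine.template.tsv",
        "hra-small-intestine-groups.template.tsv",
    ],
)
for name in missing:
    print(f"not registered: {name}")
```

In practice `config_text` would be read from the config file on disk; the check only reports names and leaves the decision about when to register them to the curator.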

🟡 IMPORTANT: "muscularis propria" and "muscularis externa" both map to UBERON:0006660

Two distinct source terms map to the same UBERON ID (muscular coat). While anatomically valid — these are accepted synonyms — the candidates.tsv report doesn't explicitly flag the duplication. A curator reviewing the report could mistake this for a pipeline error. Consider adding a note such as "muscularis propria is a synonym for muscularis externa; both map to UBERON:0006660 (muscular coat)".
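
One way the report step could surface this automatically is to group source labels by their mapped ID and annotate any ID that more than one label resolves to. The sketch below is an assumption about how that might look: the column names `source_label` and `uberon_id` and the function name are hypothetical, not taken from the actual candidates.tsv schema.

```python
from collections import defaultdict


def find_shared_mappings(rows):
    """Return {uberon_id: [source labels]} for IDs mapped from 2+ labels."""
    by_id = defaultdict(list)
    for row in rows:
        by_id[row["uberon_id"]].append(row["source_label"])
    return {uid: labels for uid, labels in by_id.items() if len(labels) > 1}


# Rows mirror three entries from the verification table above;
# the column names are hypothetical.
rows = [
    {"source_label": "muscularis externa", "uberon_id": "UBERON:0006660"},
    {"source_label": "muscularis propria", "uberon_id": "UBERON:0006660"},
    {"source_label": "mucosa", "uberon_id": "UBERON:0000344"},
]

shared = find_shared_mappings(rows)
for uid, labels in shared.items():
    print(f"{uid}: shared by {', '.join(labels)}")
```

The output of such a check could feed a notes column, so that a shared ID reads as an intentional synonym mapping and not as a pipeline error.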

🔵 SUGGESTION: Empty table column in input.tsv

All 13 rows have an empty table column in input.tsv despite the source CSV having "small-intestine" in the tables column. This looks like a column-name mismatch (tables in CSV → table in report). Minor pipeline issue — does not affect the template output.
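
If the cause is indeed a `tables` versus `table` header mismatch, a tolerant lookup would avoid the empty column. The following is a minimal sketch under that assumption; the helper name, the `label` column, and the inline CSV are illustrative and not part of bulk_ntr_workflow.

```python
import csv
import io


def read_table_value(row, names=("table", "tables")):
    """Return the first non-empty value found under any accepted column name."""
    for name in names:
        value = (row.get(name) or "").strip()
        if value:
            return value
    return ""


# Illustrative source CSV that uses the plural header, as described above.
source_csv = "label,tables\nmucosa,small-intestine\n"
rows = list(csv.DictReader(io.StringIO(source_csv)))

# A strict lookup of "table" finds nothing; the tolerant lookup recovers it.
strict_values = [row.get("table", "") for row in rows]
table_values = [read_table_value(row) for row in rows]
print(strict_values, table_values)
```

Renaming the column once at load time would work equally well; the point is only that the report should not silently emit an empty field when the source carries the value.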

🔵 SUGGESTION: UBERON canonical labels not shown in candidates.tsv

For matches where the UBERON primary label differs from the source label (e.g., "serosa" → "serous membrane", "submucosal plexus of Meissner" → "submucous nerve plexus"), the candidates report shows only the UBERON ID without the canonical label. Including the UBERON label in a uberon_label column would help curators verify matches without additional lookups.
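
A possible shape for that enrichment is sketched below. The `uberon_label` column name comes from the suggestion above; the `uberon_id` and `source_label` column names, the function, and the in-memory label lookup are assumptions, and in a real run the labels would be loaded from the ontology rather than hard-coded.

```python
def add_uberon_label(rows, labels_by_id):
    """Return copies of the rows with a uberon_label column filled in.

    IDs missing from the lookup get an empty label and are never guessed.
    """
    enriched = []
    for row in rows:
        new_row = dict(row)
        new_row["uberon_label"] = labels_by_id.get(row["uberon_id"], "")
        enriched.append(new_row)
    return enriched


# Labels taken from the verification table in this review.
labels_by_id = {
    "UBERON:0000042": "serous membrane",
    "UBERON:0005304": "submucous nerve plexus",
}

candidates = [
    {"source_label": "serosa", "uberon_id": "UBERON:0000042"},
    {"source_label": "submucosal plexus of Meissner", "uberon_id": "UBERON:0005304"},
]

report = add_uberon_label(candidates, labels_by_id)
for row in report:
    print(f'{row["source_label"]} -> {row["uberon_id"]} ({row["uberon_label"]})')
```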


Manual Curation Quality

The two manual-curation entries are well-documented:

  • "circular/longitudinal muscle" — correctly identified as non-standard notation; recommends either splitting into separate NTRs for the circular/longitudinal layers (referencing UBERON:0012368, UBERON:0012369) or mapping to existing UBERON:0006660.
  • "Lamina propria/Gut associated lymphoid tissue (GALT)" — correctly identifies the slash notation as conflating two distinct structures (UBERON:0000030, UBERON:0001962) and proposes either splitting into two NTRs or creating "lamina propria of intestinal villus" with a proper intersection_of axiom.

Both suggestions follow UBERON design patterns appropriately.
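
Both flagged entries share the slash notation, so a simple detector is enough to route such labels to manual curation. The sketch below is hypothetical and not the pipeline's actual rule; it only flags labels and deliberately does not try to split them, since "circular/longitudinal muscle" shows that the parts are not always complete terms on their own.

```python
def needs_manual_split(label):
    """True when a label joins two or more non-empty parts with a slash."""
    parts = [part.strip() for part in label.split("/")]
    return len(parts) > 1 and all(parts)


labels = [
    "circular/longitudinal muscle",
    "Lamina propria/Gut associated lymphoid tissue (GALT)",
    "mucosa",
]

flagged = [label for label in labels if needs_manual_split(label)]
for label in flagged:
    print(f"manual curation: {label}")
```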


Recommendation

This PR is ready to merge. The report files accurately document the pipeline run, the UBERON ID mappings are correct, and the flagged items provide clear curator guidance. The 🟡 issues above are follow-up items, not blockers: ODK registration belongs in the term-addition PR, and the duplicate-mapping note is documentation polish.

@ai4c-agent ai4c-agent Bot left a comment

Well-executed pipeline output PR. All 13 HRA ASCTB small-intestine input terms correctly processed — 10 confirmed matches verified against uberon-edit.obo, 2 out-of-scope terms properly flagged with detailed anatomical reasoning, and 2 manual-curation entries well-documented with specific UBERON term references and actionable suggestions. No new terms added; the templates are empty header stubs pending manual curation follow-up. Two 🟡 notes for follow-up PRs: (1) ODK registration of the new templates should happen when terms are eventually added; (2) candidates.tsv could note that 'muscularis propria' and 'muscularis externa' are synonyms both mapping to UBERON:0006660 to avoid curator confusion.
