Gen3 input data from MIDRC BIH - May 2025 #35

Open
cgmeyer wants to merge 2 commits into main from Gen3_input_BIH

cgmeyer (Collaborator) commented May 5, 2025

This is the first version of the MIDRC BIH input data: Modality/StudyDescription combinations with frequency counts and platform (the data resource(s) where the imaging study metadata originated) added.

fedorov (Contributor) commented Oct 16, 2025

@cgmeyer I am guessing this one is superseded by #36 that I just merged - can you confirm?

cgmeyer (Collaborator, Author) commented Oct 16, 2025

This one contains the imaging hub (MIDRC BIH) data: "in/BIH_StudyDescriptions_Gen3.tsv". The one you just merged covers only the MIDRC data commons: "in/StudyDescriptions_Gen3.tsv". MIDRC central requested that we prioritize mapping the Duke CSpineSeg dataset, which has the description "MRI CERVICAL SPINE WITHOUT CONTRAST".

We should meet to further discuss the format of the BIH_... input table, because I think there are outstanding questions about whether the format makes sense.

fedorov (Contributor) commented Oct 16, 2025

Ah, right ... Initially I thought I missed/forgot to merge it, but now I do remember there were some open questions...

fedorov (Contributor) commented Nov 21, 2025

@cgmeyer it would be great if you could clarify what the frequency number counts: studies? series? something else?

Specifically, I am confused why NLST-LSS, which shows up 3 times, has frequency 19569 for SEG modality, but 6553 for the SR row, and only 761 for CT.

Those questions aside, I suggest we update this BIH list to have one row for each distinct value of StudyDescription, with that row containing the aggregated distinct values for all modalities and all source repositories. Frequency should, I think, ideally be the total number of distinct studies that contain that StudyDescription, but I am not sure if you have that information.
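
To make the proposed layout concrete, here is a minimal sketch of the aggregation in Python. The column names (`StudyDescription`, `Modality`, `Platform`, `Frequency`) and the `aggregate_by_description` helper are assumptions for illustration only, not the actual headers of the BIH table, and the sample rows are made up. Note that summing the per-row frequencies is not the same as counting distinct studies: if a study contributes to more than one modality row, the sum would overcount, so the sketch reports it under a separate name.

```python
import csv
import io
from collections import defaultdict


def aggregate_by_description(rows):
    """Collapse rows to one entry per distinct StudyDescription.

    Column names are hypothetical; adjust them to the real table headers.
    """
    grouped = defaultdict(
        lambda: {"modalities": set(), "platforms": set(), "summed": 0}
    )
    for row in rows:
        entry = grouped[row["StudyDescription"]]
        entry["modalities"].add(row["Modality"])
        entry["platforms"].add(row["Platform"])
        # Sum of per-row counts, NOT a distinct-study count.
        entry["summed"] += int(row["Frequency"])

    return [
        {
            "StudyDescription": description,
            "Modalities": ";".join(sorted(entry["modalities"])),
            "Platforms": ";".join(sorted(entry["platforms"])),
            "SummedFrequency": entry["summed"],
        }
        for description, entry in sorted(grouped.items())
    ]


# Made-up sample rows, only to show the shape of the input and output.
SAMPLE_TSV = (
    "StudyDescription\tModality\tPlatform\tFrequency\n"
    "DESC A\tCT\trepo1\t3\n"
    "DESC A\tSR\trepo2\t2\n"
    "DESC B\tMR\trepo1\t5\n"
)

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t")
    for aggregated_row in aggregate_by_description(reader):
        print(aggregated_row)
```

If the source data can provide study identifiers, replacing the summed value with a count of distinct identifiers per StudyDescription would give the number I described above.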
