Present conditions in "Response to treatment" studies without treatment in tables #10

@wbazant

Some large "response to treatment" studies contain a sizeable baseline component that could usefully be cut out and presented separately, e.g. ERP002113: Praziquantel mode of action and resistance, but also ERP104993: Characterising Schistosoma mansoni stem cell populations and SRP096638: RNA-seq for the study of the effect of histone modifying enzymes (HMEs) inhibitors on gene expression in Schistosoma mansoni.

Expression Atlas has the same need and handles it in a fairly poor way: it assigns its own accessions, so it can simply add a study several times under different accessions. E.g. PanCancer is split into "PCAWG by disease" and "PCAWG by individual".

Analysis

Extra "baseline" files would be produced, with just the non-treatment conditions.

Gene page

The gene page module would know to look for subslice data in a slightly different place, show the results from these files under a different category, and link to the study on the studies page in the same way.

Studies page

Subslices, just like categories, can be considered an internal detail needed to create the gene page. Do not change the studies page at all.

Curation

Each subslice should be an extra folder inside studies, named <study_id>:<category>, containing a single file <study_id>:<category>.tsv that lists the subslice's conditions.
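A minimal sketch of that layout, assuming the studies folder is passed in as a path and that the .tsv holds one condition per line (the issue does not fix the file format). The function names `subslice_paths` and `read_conditions` are hypothetical.

```python
from pathlib import Path


def subslice_paths(studies_dir, study_id, category):
    """Return (folder, tsv_file) for a subslice named <study_id>:<category>."""
    slice_id = f"{study_id}:{category}"
    folder = Path(studies_dir) / slice_id
    return folder, folder / f"{slice_id}.tsv"


def read_conditions(tsv_path):
    """Read the condition list, assuming one condition per non-empty line."""
    text = Path(tsv_path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


folder, tsv = subslice_paths("studies", "ERP002113", "life_stages")
print(folder.name)  # ERP002113:life_stages
print(tsv.name)     # ERP002113:life_stages.tsv
```

The category `life_stages` is only an example value. Note that a colon in a folder name is fine on Linux but not on every filesystem.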

Checks

A subslice should have a proper subset of the parent's conditions, a different category from the parent, and no contrasts.
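The three checks could be sketched roughly as below. The function name `check_subslice` and its argument shapes (plain lists of condition names, category strings, a list of contrasts) are assumptions for illustration, not the existing curation code.

```python
def check_subslice(parent_conditions, parent_category,
                   slice_conditions, slice_category, slice_contrasts):
    """Return a list of human-readable problems; an empty list means the subslice passes."""
    errors = []
    # "<" on sets is the proper-subset test: every subslice condition is in
    # the parent, and the parent has at least one condition the subslice lacks.
    if not set(slice_conditions) < set(parent_conditions):
        errors.append("conditions are not a proper subset of the parent's")
    if slice_category == parent_category:
        errors.append("category is the same as the parent's")
    if slice_contrasts:
        errors.append("subslice must not define contrasts")
    return errors


parent = ["untreated_adult", "untreated_juvenile", "treated_adult"]
print(check_subslice(parent, "response_to_treatment",
                     ["untreated_adult", "untreated_juvenile"], "life_stages", []))
```

Returning all problems at once, rather than raising on the first, makes it easier to report every issue for a curated folder in one pass.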

Incoming studies

Identifying a subslice could be done by a heuristic: if a study is a "response to treatment", has at least three "baseline" conditions, and they form a non-"other" category, create a subslice for them. No multiple subslices per study, yet.
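One way to write this heuristic down, as a sketch only. It assumes the caller already knows which conditions are "baseline" and supplies a `categorize` callable that maps a list of conditions to a category name; both, along with the literal category strings, are hypothetical stand-ins for whatever the pipeline actually uses.

```python
def propose_subslice(study_category, baseline_conditions, categorize,
                     treatment_category="response_to_treatment",
                     other_category="other", min_conditions=3):
    """Return (category, conditions) for a proposed subslice, or None."""
    if study_category != treatment_category:
        return None
    if len(baseline_conditions) < min_conditions:
        return None
    category = categorize(baseline_conditions)
    if category == other_category:
        return None
    # At most one subslice per study, as the heuristic above states.
    return category, list(baseline_conditions)


# Toy categorizer for the example: anything it sees counts as "life_stages".
result = propose_subslice("response_to_treatment",
                          ["cercariae", "schistosomula", "adult"],
                          lambda conditions: "life_stages")
print(result)  # ('life_stages', ['cercariae', 'schistosomula', 'adult'])
```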

Implementation

Reading a subslice folder would involve reading a parent folder, and changing a few bits. <study_id>:<category> can serve as a unique ID throughout the program.
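A sketch of "read the parent, then change a few bits", assuming a study can be represented as a small immutable record. The `Study` class and its fields are hypothetical; the real program's data structure may look quite different.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Study:
    study_id: str
    category: str
    conditions: tuple
    contrasts: tuple


def make_subslice(parent, category, conditions):
    """Copy the parent, overriding only what differs for the subslice."""
    return replace(
        parent,
        study_id=f"{parent.study_id}:{category}",  # unique ID throughout the program
        category=category,
        conditions=tuple(conditions),
        contrasts=(),  # a subslice carries no contrasts
    )


parent = Study("ERP002113", "response_to_treatment",
               ("untreated_adult", "untreated_juvenile", "treated_adult"),
               ("treated_vs_untreated",))
sub = make_subslice(parent, "life_stages", ["untreated_adult", "untreated_juvenile"])
print(sub.study_id)  # ERP002113:life_stages
```

Anything not overridden (for example a title or description, if those fields were added) would be inherited from the parent unchanged.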

A subslice has a very short list of analyses - maybe just the TPMs per condition. Given the implementation, it's easiest to just recompute them for the subslice.

The studies page code would need to change a bit if we were to offer the extra files, because the list of studies passed to it would no longer map to one piece to draw per entry. It's easier to just not draw them.

The gene page code has to change in how it makes URLs to the studies page (reduce <study_id>:<category> to the parent <study_id>).
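The ID handling for those links might look like this. It assumes that study accessions themselves never contain a colon, and the helper name `parent_study_id` is hypothetical. How the URL is then built from the ID is left to the existing code.

```python
def parent_study_id(study_or_slice_id):
    """Map '<study_id>:<category>' to '<study_id>'; plain study IDs pass through."""
    return study_or_slice_id.split(":", 1)[0]


print(parent_study_id("ERP002113:life_stages"))  # ERP002113
print(parent_study_id("SRP096638"))              # SRP096638
```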

I think I would want to change all the categories to snake_lowercase - <species>.studies.tsv can stay the same if it has to.
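For the renaming, a small normalising helper could be enough. This is a sketch that assumes category names are plain ASCII words separated by spaces or punctuation; `to_snake_lowercase` is a hypothetical name.

```python
import re


def to_snake_lowercase(name):
    """Lowercase the name and join its alphanumeric runs with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


print(to_snake_lowercase("Response to treatment"))  # response_to_treatment
```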
