Set up a schema for Dataset for MorPhiC #54

@gabsie

For which schema is a change/update being suggested? MorPhiC

What should the change/update be?
I would like to add a new schema for a dataset. In the hierarchy, the Dataset entity sits one level below Study, i.e. a Study may have multiple Datasets. In the DB, a Dataset may also exist independently: attaching a Dataset to a Study is optional.
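
To make the intended relationship concrete, here is a minimal sketch of the hierarchy described above. It is only an illustration, not a proposed implementation: `study_id` and the `attach` helper are hypothetical names that do not appear in this request, and only `dataset_name` is taken from the field list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    dataset_name: str
    # Hypothetical link field: None means the dataset exists on its own.
    study_id: Optional[str] = None


@dataclass
class Study:
    # Hypothetical identifier, used here only to express the link.
    study_id: str
    datasets: List[Dataset] = field(default_factory=list)

    def attach(self, dataset: Dataset) -> None:
        """Attach a dataset to this study (a study may hold several)."""
        dataset.study_id = self.study_id
        self.datasets.append(dataset)
```

A Dataset created without a `study_id` stays valid on its own, which matches the "optional" attachment described above.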

What new field(s) need to be changed/added?

Field name: dataset_name

  • Field description: A brief name for the dataset
  • Field type: text
  • Multiple values: no
  • Required: yes
  • Examples: Extra-embryonic lineage KO of gene STS
  • CV or enum: no

Field name: cell_line_name

  • Field description: Name of the cell line studied with this dataset
  • Field type: text
  • Multiple values: no
  • Required: yes
  • Examples: KOLF2.2J
  • CV or enum: no

Field name: assay_type

  • Field description: Name of the assay used in the generation of the dataset.
  • Field type: text
  • Multiple values: no
  • Required: yes
  • Examples: scRNAseq
    (Note: should we also capture Pooled vs. Arrayed at this level, for the genes altered? I'm not sure.)
  • CV or enum: yes, CV

Field name: perturbation_type

  • Field description: Type of perturbation introduced by the gene expression alteration assay.
  • Field type: text
  • Multiple values: no
  • Required: yes
  • Examples: CRISPR-Cas9 KO
  • CV or enum: don't know? please advise :)

Field name: duo_code

  • Field description: Data Use Ontology (DUO) code that describes the data sharing restrictions for this dataset.
  • Field type: text
  • Multiple values: no
  • Required: yes
  • Examples: NRES, or DUO:0000046
  • Enum: don't know? please advise :)

Field name: release_date

  • Field description: Estimated date of data release to the public.
  • Field type: date
  • Multiple values: no
  • Required: yes
  • Examples: 2025-03-03
  • CV or enum: no

Field name: other_notes

  • Field description: Any notes for the dataset
  • Field type: text
  • Multiple values: no
  • Required: no
  • Examples: This dataset is not for processing just yet.
  • CV or enum: no
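
For discussion, the fields above can be sketched as a small required/optional check. This is a rough sketch under stated assumptions, not the final schema: `DATASET_FIELDS` and `validate_dataset` are hypothetical names, `release_date` is assumed to be an ISO 8601 date string as in the example, and the CV/enum questions for `assay_type`, `perturbation_type` and `duo_code` are deliberately left open, so those fields are treated as free text here.

```python
from datetime import date

# Field name -> (expected type, required?), taken from the field list above.
DATASET_FIELDS = {
    "dataset_name": (str, True),
    "cell_line_name": (str, True),
    "assay_type": (str, True),         # CV not decided yet: free text here
    "perturbation_type": (str, True),  # CV/enum not decided yet
    "duo_code": (str, True),           # enum not decided yet
    "release_date": (str, True),       # assumed ISO 8601 date string
    "other_notes": (str, False),
}


def validate_dataset(record: dict) -> list:
    """Return a list of error messages; an empty list means the record passes."""
    errors = []
    for name, (expected_type, required) in DATASET_FIELDS.items():
        if name not in record:
            if required:
                errors.append(f"missing required field: {name}")
            continue
        value = record[name]
        if not isinstance(value, expected_type):
            errors.append(f"{name}: expected {expected_type.__name__}")
        elif required and not value.strip():
            errors.append(f"{name}: must not be empty")

    release = record.get("release_date")
    if isinstance(release, str) and release.strip():
        try:
            date.fromisoformat(release)
        except ValueError:
            errors.append("release_date: not a valid ISO date (YYYY-MM-DD)")
    return errors


# Example record built from the example values given in this request.
EXAMPLE = {
    "dataset_name": "Extra-embryonic lineage KO of gene STS",
    "cell_line_name": "KOLF2.2J",
    "assay_type": "scRNAseq",
    "perturbation_type": "CRISPR-Cas9 KO",
    "duo_code": "DUO:0000046",
    "release_date": "2025-03-03",
}
```

Calling `validate_dataset(EXAMPLE)` returns an empty list. Once the CV and enum questions are settled, the free-text checks could be replaced by membership checks against the agreed value lists.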

Why is the change requested?

At the moment every existing MorPhiC study is equivalent to a single dataset, but we would like the option to have multiple datasets under a Study in case multiple assay types or DUO codes need to be supported.
