Manuscript outline #73

# Intro
- Problem statement: 
  * NCBI records vary in quality
  * not available for download as a single data set
  * annotation not consistent or difficult to piece together
- Previous 16S data sets
  * RDP
  * Greengenes
  * NCBI BioProject?
  * SILVA
  * 16S-ITGDB - https://www.frontiersin.org/journals/bioinformatics/articles/10.3389/fbinf.2022.905489/full
  * GSR-DB - https://journals.asm.org/doi/10.1128/msystems.00950-23
- Summarize ya16sdb features
  * annotation
  * outlier detection (includes plotly website)
  * sequence subsets by confidence
# Methods
- ...
# Results/Discussion
- Record counts in each category (16S genes, whole genomes, taxcheck pass vs fail, RefSeq, reference sequences)
- Outlier detection and taxcheck outcomes for each subset
- Discrepancies between taxcheck and outlier detection
- Maybe: are there any predictors of outliers (e.g., by year, source, etc.)?
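
A possible starting point for the taxcheck vs outlier comparison, as a minimal sketch only. The record layout `(accession, taxcheck_passed, is_outlier)`, the accessions, and the helper names `crosstab` and `discrepancies` are all hypothetical and made up for illustration; they are not ya16sdb output or its actual file format.

```python
from collections import Counter

# Hypothetical records: (accession, taxcheck_passed, is_outlier).
# Values are invented for illustration; they are not ya16sdb results.
records = [
    ("SEQ1", True, False),
    ("SEQ2", True, True),
    ("SEQ3", False, True),
    ("SEQ4", False, False),
    ("SEQ5", True, False),
]


def crosstab(rows):
    """Count records for each (taxcheck_passed, is_outlier) combination."""
    return Counter((passed, outlier) for _, passed, outlier in rows)


def discrepancies(rows):
    """Return accessions where the two checks disagree.

    Disagreement means taxcheck passed but the record is an outlier,
    or taxcheck failed but the record is not an outlier. In both cases
    the two booleans are equal.
    """
    return [acc for acc, passed, outlier in rows if passed == outlier]


if __name__ == "__main__":
    for (passed, outlier), n in sorted(crosstab(records).items()):
        print(f"taxcheck_passed={passed} is_outlier={outlier}: {n}")
    print("discrepant accessions:", discrepancies(records))
```

The same 2x2 table could be computed per subset (16S genes, whole genomes, RefSeq, and so on) by filtering `records` before calling `crosstab`.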


# TODOs
- [x] start a group Zotero (YM)
- [ ] gather literature (group)
- [ ] Chris: begin methods in README or elsewhere in repo
- [ ] Create OneDrive doc for MS (NH)
- [ ] Start authoring problem statement (NH)
