torigiffin/Genome-Analysis


Genome-Analysis

Genome Analysis Project

Project Plan

Aim: "Dead zones", areas of ocean with seasonal to long-term low dissolved oxygen (DO) concentrations, are becoming increasingly frequent. This study focuses on one of the largest coastal dead zones, located in the northern Gulf of Mexico (nGOM). The low DO levels there are due to eutrophication-enhanced bacterioplankton respiration combined with strong seasonal stratification. I am going to assemble genomes of bacterioplankton collected from the nGOM dead zone by Thrash et al., as described in "Metabolic roles of uncultivated bacterioplankton lineages in the Northern Gulf of Mexico 'dead zone'". The analysis will include metagenome assembly, binning to recover individual genomes, expression analysis, and phylogenetic assignment. Knowing which species or genera of bacterioplankton can live in the nGOM, under both oxic and hypoxic conditions, is essential to understanding this environment. Learning more about these organisms, both previously described and undescribed, will help us better understand the biogeochemical cycling they mediate. Expression analysis combined with functional annotation gives further insight, because it shows which metabolic pathways are present and how they contribute to the environment.

Workflow

| Analysis | Program | Output | Expected Run Time | Expected Deadline | Completed |
|---|---|---|---|---|---|
| Reads Quality Control | FastQC | .html | ~15 min | April 7th | Yes |
| RNA Trimming | Trimmomatic | fastq | ~30 min | April 7th | Yes |
| DNA Assembly | MEGAHIT | ./megahit_out output directory; .contigs.fa with contigs | ~6 h | April 7th | Yes |
| Assembly Evaluation | QUAST | .txt; .pdf | ~45 min | April 14th | Yes |
| DNA Alignment | BWA | .bam | ~4-6 h | April 21st | Yes |
| Binning | MetaBAT | .fasta for each bin | <30 min | April 21st | Yes |
| Binning Evaluation | CheckM | output report | ~2 h | April 21st | Yes |
| RNA Alignment | BWA | .sam | ~4-6 h | April 21st | Yes |
| Annotation | Prokka | GFF file with annotations; GenBank file with sequences | ~1 h | April 28th | Yes |
| Phylogenetic Placement | PhyloPhlAn | top SGBs; closest SGB, GGB, FGB; reference genomes and "all vs. all" matrix of all pairwise Mash distances | ~6 h | April 28th | Yes |
| Expression Analysis | HTSeq | .txt with read count table | ~6 h | April 28th | Yes |
| Extra Analysis: Abundance of Organisms | BWA | .bam | ~4-6 h | May 12th | Yes |
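
The assembly evaluation step relies on contig-length statistics, and N50 is one of the metrics QUAST reports. As a rough illustration of what that number means, here is a minimal Python sketch that computes N50 from a FASTA file of contigs. The function names `fasta_lengths` and `n50` and the toy sequences are made up for this example; this is not the project's own script, and QUAST's report remains the authoritative source for the real values.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy records (hypothetical), standing in for a .contigs.fa file.
    toy_fasta = [">contig_1", "ACGTACGT", "ACGT", ">contig_2", "ACGTAC", ">contig_3", "AC"]
    print(n50(fasta_lengths(toy_fasta)))
```

For a real assembly, the same functions could be applied to an open file handle on the contigs file instead of the toy list.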

Data Management System

genome_analyses/
├── 01_reads_quality_control/
│   ├── fastqc_trimmed_RNA_script
│   ├── slurm-7528571.out
│   ├── fastqc_DNA_results/
│   ├── fastqc_RNA_post_trim_results/
│   └── fastqc_RNA_pre_trim_results/
├── 02_RNA_trimming/
│   ├── adapter_sequences
│   ├── slurm-7528563.out
│   ├── trimmomatic_script
│   └── trimmed_RNA/
├── 03_DNA_assembly/
│   ├── megahit_script
│   ├── slurm-7723214.out
│   └── DNA_assembly_results/
├── 04_assembly_evaluation/
│   ├── quast_script
│   ├── slurm-7731177.out
│   └── quast_results/
├── 05_alignment/
│   ├── bwa_results_SRR4342129.bam
│   ├── bwa_results_SRR4342133.bam
│   ├── bwa_script
│   └── slurm-7795240.out
├── 06_binning/
│   ├── change_bin_names_script
│   ├── depth.txt
│   ├── metabat_script
│   ├── slurm-7797909.out
│   ├── slurm-7799427.out
│   └── bins/
├── 07_binning_evaluation/
│   ├── checkm_qa
│   ├── checkm_qa_script
│   ├── checkm_script
│   ├── slurm-7799916.out
│   ├── slurm-7827270.out
│   ├── CheckM_data/
│   └── checkm_results/
├── 08_RNA_mapping/
│   ├── bwa_script_for_loop
│   ├── bwa_script_for_loop_sample_2
│   ├── slurm-7855951.out
│   └── slurm-7855966.out
├── 09_annotation/
│   ├── prokka_script
│   ├── rename_annotation
│   ├── slurm-7852786.out
│   ├── 1_prokka_results/
│   ├── ...
│   ├── 49_prokka_results/
│   └── renamed_annotations/
├── 10_phylo_placement/
│   ├── make_annotation_folder
│   ├── phylophlan_conda_script
│   ├── slurm-7856958.out
│   └── phylogeny_results/
├── 11_expression_analysis/
│   ├── 1_SRR4342137_expression_results.txt
│   ├── ...
│   ├── 49_SRR4342139_expression_results.txt
│   ├── htseq_script
│   └── slurm-7856869.out
├── 12_abundance_of_bins/
│   ├── bwa_script
│   └── mapping to bins/
├── DNA_trimmed/
└── RNA_untrimmed/
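
The files in `11_expression_analysis` hold the read count tables from the expression analysis step. As a hedged illustration of how such a table could be read back in, the Python sketch below assumes the default two-column, tab-separated `htseq-count` layout: one feature ID and one count per line, followed by summary rows whose names start with `__` (for example `__no_feature`). The function name `read_htseq_counts` and the gene IDs in the example are hypothetical, and whether the project's files follow exactly this layout is an assumption.

```python
import csv
import io


def read_htseq_counts(handle):
    """Parse two-column htseq-count output.

    Returns a pair of dicts: per-feature counts, and the special
    summary counters whose names start with "__".
    """
    counts, summary = {}, {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        name, value = row[0], int(row[1])
        if name.startswith("__"):
            summary[name] = value
        else:
            counts[name] = value
    return counts, summary


if __name__ == "__main__":
    # Made-up table standing in for one *_expression_results.txt file.
    example = "geneA\t12\ngeneB\t0\n__no_feature\t5\n__ambiguous\t1\n"
    counts, summary = read_htseq_counts(io.StringIO(example))
    expressed = {gene: n for gene, n in counts.items() if n > 0}
    print(expressed, summary)
```

Keeping the `__` rows separate from the per-feature counts avoids treating unassigned reads as if they were genes when comparing bins or samples.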
