5 changes: 5 additions & 0 deletions data/community/itps/orov/L/refseq/CHANGELOG.md
@@ -0,0 +1,5 @@
## Unreleased

Initial release of an Oropouche virus (OROV) dataset for segment L, based on the NCBI RefSeq reference genome.

Read more about Nextclade datasets in the documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html
20 changes: 20 additions & 0 deletions data/community/itps/orov/L/refseq/README.md
@@ -0,0 +1,20 @@
# Nextclade Dataset for "OROV" L segment based on RefSeq reference genome

## Dataset Attributes

| Attribute | Value |
| -------------------- | ---------------------------------------- |
| Name | orov/L/refseq |
| RefName | Oropouche virus segment L |
| RefAccession | NC_005776.1 |

## Scope of This Dataset

This dataset enables quality control of Oropouche virus segment L sequences, using the NCBI RefSeq genome (NC_005776.1) as the reference.

The source code is available at [InstitutoTodosPelaSaude/nextclade-datasets-workflows](https://github.com/InstitutoTodosPelaSaude/nextclade-datasets-workflows/tree/main/orov).

For bugs, please open an [issue](https://github.com/InstitutoTodosPelaSaude/nextclade-datasets-workflows/issues).

Read more about Nextclade datasets in the Nextclade documentation: [Nextclade Datasets](https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html).
7 changes: 7 additions & 0 deletions data/community/itps/orov/L/refseq/genome_annotation.gff3
@@ -0,0 +1,7 @@
##gff-version 3
#!gff-spec-version 1.21
#!processor NCBI annotwriter
##sequence-region NC_005776.1 1 6846
##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=118655
NC_005776.1 RefSeq region 1 6846 . + . ID=NC_005776.1:1..6846;Dbxref=taxon:118655;Name=L;gbkey=Src;genome=genomic;mol_type=genomic RNA;segment=L
NC_005776.1 GenBank gene 44 6796 . + . gene_name=L
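
The gene feature above uses GFF3's 1-based, inclusive coordinates (44 to 6796), while the `placementMaskRanges` in `pathogen.json` appear to be 0-based, half-open ranges. The following is a minimal sketch of how the two can be cross-checked, assuming that reading of the mask ranges; the helper names are hypothetical and not part of Nextclade.

```python
def gff_to_half_open(start, end):
    """Convert a 1-based inclusive GFF3 interval to a 0-based half-open one."""
    return start - 1, end


def flanking_ranges(ref_length, gene_start, gene_end):
    """Return the 0-based half-open ranges that lie outside the gene.

    Assumption: the regions to mask are exactly the stretches before and
    after the single annotated gene.
    """
    begin, end = gff_to_half_open(gene_start, gene_end)
    ranges = []
    if begin > 0:
        ranges.append({"begin": 0, "end": begin})
    if end < ref_length:
        ranges.append({"begin": end, "end": ref_length})
    return ranges


# Values taken from the ##sequence-region and gene lines of the GFF3 file.
masks = flanking_ranges(ref_length=6846, gene_start=44, gene_end=6796)
print(masks)
```

Under these assumptions the result matches the two ranges listed in `pathogen.json` (0 to 43 and 6796 to 6846).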
98 changes: 98 additions & 0 deletions data/community/itps/orov/L/refseq/pathogen.json
@@ -0,0 +1,98 @@
{
  "$schema": "https://raw.githubusercontent.com/nextstrain/nextclade/refs/heads/release/packages/nextclade-schemas/input-pathogen-json.schema.json",
  "alignmentParams": {
    "retryReverseComplement": true,
    "alignmentPreset": "high-diversity"
  },
  "attributes": {
    "name": "orov/L/refseq",
    "reference accession": "NC_005776.1",
    "reference name": "Oropouche virus, L segment"
  },
  "compatibility": {
    "cli": "3.0.0-alpha.0",
    "web": "3.0.0-alpha.0"
  },
  "placementMaskRanges": [
    {
      "begin": 0,
      "end": 43
    },
    {
      "begin": 6796,
      "end": 6846
    }
  ],
  "deprecated": false,
  "enabled": true,
  "experimental": true,
  "files": {
    "changelog": "CHANGELOG.md",
    "examples": "sequences.fasta",
    "genomeAnnotation": "genome_annotation.gff3",
    "pathogenJson": "pathogen.json",
    "readme": "README.md",
    "reference": "reference.fasta",
    "treeJson": "tree.json"
  },
  "meta": {
    "bugs": "https://github.com/InstitutoTodosPelaSaude/nextclade-datasets-workflows/issues",
    "source code": "https://github.com/InstitutoTodosPelaSaude/nextclade-datasets-workflows/tree/main/orov"
  },
  "qc": {
    "frameShifts": {
      "enabled": true,
      "ignoredFrameShifts": [
        {
          "codonRange": {
            "begin": 788,
            "end": 792
          },
          "cdsName": "L"
        },
        {
          "codonRange": {
            "begin": 797,
            "end": 800
          },
          "cdsName": "L"
        },
        {
          "codonRange": {
            "begin": 846,
            "end": 855
          },
          "cdsName": "L"
        }
      ]
    },
    "missingData": {
      "enabled": true,
      "missingDataThreshold": 1369,
      "scoreBias": 95
    },
    "mixedSites": {
      "enabled": true,
      "mixedSitesThreshold": 7
    },
    "privateMutations": {
      "cutoff": 30,
      "enabled": true,
      "typical": 10,
      "weightLabeledSubstitutions": 2,
      "weightReversionSubstitutions": 1,
      "weightUnlabeledSubstitutions": 1
    },
    "snpClusters": {
      "enabled": false
    },
    "stopCodons": {
      "enabled": true
    }
  },
  "schemaVersion": "3.0.0",
  "version": {
    "tag": "unreleased"
  },
  "defaultCds": "L"
}
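
Two of the QC values above can be sanity-checked against the annotation. The `missingDataThreshold` of 1369 coincides with 20% of the 6846-nt reference, rounded down; whether the authors derived it that way is an assumption, not something the diff states. The sketch below also checks that each ignored frame-shift codon range falls inside the L CDS. The function names are hypothetical.

```python
REF_LENGTH = 6846               # from the ##sequence-region line of the GFF3 file
CDS_START, CDS_END = 44, 6796   # 1-based inclusive gene coordinates from the GFF3 file
ASSUMED_PERCENT = 20            # assumption: threshold chosen as 20% of the reference


def threshold_from_percent(ref_length, percent):
    """Integer number of missing bases corresponding to a percentage of the reference."""
    return ref_length * percent // 100


def codon_count(start, end):
    """Number of codons in a 1-based inclusive interval; fails if not a multiple of 3."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


# Codon ranges copied from qc.frameShifts.ignoredFrameShifts.
ignored_frame_shifts = [(788, 792), (797, 800), (846, 855)]

threshold = threshold_from_percent(REF_LENGTH, ASSUMED_PERCENT)
total_codons = codon_count(CDS_START, CDS_END)
all_in_bounds = all(0 <= b < e <= total_codons for b, e in ignored_frame_shifts)

print(threshold, total_codons, all_in_bounds)
```

A check like this could run before a dataset release to catch thresholds or ranges that drift out of sync with the reference annotation.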
116 changes: 116 additions & 0 deletions data/community/itps/orov/L/refseq/reference.fasta
@@ -0,0 +1,116 @@
>NC_005776.1 Oropouche virus segment L, complete genome
AGTAGTGTGCTCCTATTCCGAAACAAACAAAAACAATCTCAAAATGTCACAACTGTTGCT
CAACCAATATCGGAATAGGATATTGCACTGCCGTGAACCTGAGATAGCAAAGGATATATG
GCGAGATCTATTAAATGATCGACACAATTACTTTTCTCGGGAATTTTGCAGAGCTGCAAA
TCTTGAGTACAGAAATGATGTTCCTGCTGAGGATATTTGTGCTGAAGTTCTTGATGGTTA
TAAAGCAAGGAAAGTTCGCTTTTGTACACCTGATAATTACTTACTACATGATGGAAAGAT
GTATATAATAGACTTCAAAGTGTCTGTAGACGACCGATCTTCTAGAATCACAAGGGAGAA
ATATAATGAGATTTTTGGAGAGGTATTCAATCCAGAAGGTGTAGATTTTGAAATTGTTAT
TATTAGATTAGATCCTTCAAATATGACGATACATGTGGACTCTCGAGATTTCGTGAATAC
AATTGGGCCGATTACATTAAACATTAGTATGCAATGGTTTTTTGATATGAAAGACTTCTT
GTTCGGGAAATTTCGGGATGATGATAAATTCCATGCTATAATAAGTCAAGGAGAATTCAC
AATGACATTGCCATGGATTGAAGAAGACACCCCAGAATTGCTTACTCATCCTATATACAA
TGAATTCATGAGTTCAATGCCAGAGGCAGAACAGGCCCTATTCAAGGAAGCATTGGAATT
CAAATCATTTGGGGCAGAAAAATGGAATATCTTTTTGAAGGGGGTGATGTCAAAGTATGG
TGAATATTATAAAGAATTTACTAAAGGACATGCTCATTCTATATTTCTGACAACAGGGGA
CTACCCCAAGCCAGACAAAGACCAAATTTCAGCAGGTTGGAGAGAAATGGTAAACAGAGT
AAGCTCTGAACGTGACATGTCAAATGACATAAATCAGGAAAAACCAAGCATGCATTTTAT
ATGGGCAAAGAATGATTCAAATAGCAACAATAATATACAAAAGCTAATCAAACTATCTAA
ATCACTGCAAGCTATGAGCGGGACAGGGAGCTATGTAAATGCTTTCAAGTCATTAGGGAG
ATTAATGGATATATCATCAGATGTTAAAAAATATGAATCATTTTGTGGGAAATTGAAATC
TCTGGCAAGGTCTAGTATAAAAAAACTTGACAGGAAAATAGAGCCAATACAAATTGGGAC
TGCAACTGTCTTATGGGAACAGCAATTTAAACTAGATACAGATGTTATAAAAAGAGAAGA
CAGAATACATTTAATGAAAGATTATCTTGGGATCGGTAAGCACAAATCATTTTCAAAGAA
ATTAAACAACGACATAAATACTGATAAGCCTAAAATATTAAATTTCAACAATGATGATAT
AGTCAGGAAATGCAAAGATAAATATAATCAAGTCATACATAACCTATCCCAAATCAATGA
ATTAGATAAGATTGGAAACTACCTAGAGCACTTTTCAGCTAAAATTAGTGCCTGCAGTGT
AGAAATGTGGGATTTTATATATAATACAACCAAAACTAAATACTGGCAATGCATCAATGA
CTATTCCACCCTAATGAAAAACATGTTAGCTGTCTCTCAATATAATAGACACAATACGTT
TAGAATTGTCTCATGTGCAAACAATAATGTATTTGGTCTAGTAATGCCAAGCTCAGATAT
AAAGACAAAAAAAGCAACTTTAGTCTATGCAATAATGGCTCTCCATAATGAGGAGGCAGA
AATAGCAGAACTTGGCTCACTCTACTCAACTTTTAAGACAGCAACAGGATATATTTCAAT
ATCAAAGGCTTTTAGGCTGGATAAAGAAAGATGCCAACGCATAGTATCCTCTCCAGGCTT
GTTCCTCATGACAAGCTGCCTATTATTCAACGGTAACAAGAGTTTAGAATTTGATAAATT
ACTAGGATTTTCATTTTTTACGTCAATATCAATTACGAAAGCTATGCTCTCCCTTACTGA
GCCTTCACGTTATATGATCATGAACTCGTTAGCAGTTTCCAGCCATGTAAGAGAGTATAT
ATCTGAAAAATTCTCCCCTTATACAAAAACATCATTTTCTGTGGTAATGACAGACTTAAT
CAAGAAGGGTTGCTATTCAGCATATGAACAGAGAAAAAAAGTACAAATAAGAGACATAAA
ATTAACAGATTATGATATAACACAAAAGGGAGTGGATTCCAAAAGAGATCTTAAATCTAT
TTGGTTCCCAGGAAAGGTAAACCTGAAAGAATATTTAAACCAAATTTATCTACCATTTTA
TTTTAACTCTAAAGGATTACATGAAAAACATCATGTCTTGATAGATTTGGCTAAAACAGT
ACTAGAAATCGAAAAAGAGCAAAGGGAGTCATTACCTGAGCCATGGTCAGAGATACCTGC
TAAGCGACTGTCACTTAATGTTTTAATTTACTCATTGCAGGAACTGAATTTAGATACTTC
AAGACATAATTTTGTAAGAAGCCGGGTGGAAAACGCAAATAATTTCAACAGATCTATAAC
GACAATATCTACTTTTACCAGCTCAAAATCATGCATTAAGATTGGTGATTTTGAAGAAGA
AAAAAGAGAAAAACTAAGAATGATACAAAAGAAACTTGCAAAGGATATTTCTAAATTAAC
CATAGCCAACCCAGCATTCTTAGATGAGATCACAAACGAACATGAGATAAGGCATTCAAC
TTATGAGGACTTAAAACAATCTATCCCAGATTACACAGATTATATGTCTGTGAAAGTTTT
TGACAGATTGTACGAGAAGATTACTACCAATGAAATAAATGATAAGGAAACAGTCAAGCT
GATTCTAGAGACCATGAAAAAACATAAAATATTTCATTTTGGATTCTTCAATAAAGGACA
AAAAACAGCCAAAGATAGAGAAATATTTTTAGGTGAATTTGAAGCAAAAATGTGTCTGTA
CCTTGTCGAAAGAATAGCTAAAGAGAGGTGCAAATTAAACCCTGAAGAAATGATAAGTGA
ACCAGGCGACTCGAAACTAAGGGTATTAGAGAAGCAATCAGAAGACGAAATCAGGTATAT
TAGCAATACAATAAAGACATTAGGGAATGCCATAGAGAACTTGCAATCTGGATCTTTAAA
TTGGGCAGATATATGCGAAAACAAAGCAAGAGGACTTAAGATAGAAATAAATGCTGATAT
GTCCAAATGGAGTGCCCAAGATGTACTTTTTAAATATTTTTGGTTGATAGTGCTTGATCC
CATCTTATATCCTGCTGAGAGGAAAAGGATAATTTATTTCCTCTGTAATTATATGCAGAA
AAGGCTTATAATGCCCGATGAATTGCTCACTACTATATTGGATCAAAGAGTTCCTTATTC
AAATGACATAATTGGATTAATGACAAACAATTATAGGTCTAATACAGTAGAAATAAAGCG
TAACTGGCTTCAAGGCAACTTAAATTATACAAGCAGTTACTTACACAGCTGTAGTATGTC
TGTGTACAAAGATATAATAAGAGAAGCAGCAATATTATTAGAAGGAGAAGCCCTTGTGAA
CTCAATGGTACATTCTGATGATAATCAAACATCTATATGTATGGTGCAGAATAAATTACC
AGATGACAATATAATTGAATTTTGCATTAAGATATTCGAGAAGATATGCTTAACTTTTGG
CAATCAGGCAAATATGAAGAAGACATATCTAACTAACTTCATCAAAGAGTTTGTTTCTTT
ATTTAATATACATGGAGAACCATTTTCTATATATGGGAGATTTCTACTCACAGCAGTAGG
AGACTGTGCCTATCTAGGGCCTTATGAAGATTTAGCAAGTAGGCTATCTGCAACACAAAC
TGCTATAAAGCATGGTTGCCCACCATCACTTGCATGGGTATCTATCGCTCTAAATCACTG
GATAACCCACACTACATATAATATGTTGCCTGGCCAAAATAATGACCCGTTACCATTCTT
CCCTACTAACAATAGAAGTGAAATACCAGTAGAGATGTGCGGAATACTAGAAAGTGATTT
ATCAACAATTGCACTAACTGGTTTAGAAGCAGGGAATGTCACGTTTCTAACAAATATAGC
AAGGAAGTTATCATCCCCAATCTTACAAAGAGAAAGTATTCAAGATCAATACAATTCTAT
AGAAAAGTGGGATCTGAGCAAATTATCACAGATCGACATTCTAAGGCTTAAAATGCTCAG
GTATATATCTCTTGATAGTTCAGTCACATCTGATGATGGTATGGGGGAGACTAGTGAAAT
GAGATCTCGATCACTTTTAACACCTCGTAAATTCACAACAAGTGGGTCACTTAATAGGTT
GAAATCATATAAAGACTTTCAAGATATAATAGCAGATGAGGACAAGACAAACGAACTATT
TGAGAATTTCATTAGACACCCAGAGTTACTGGTTACAAAAGGCGAAACATTTGAAGAATT
TGTTAATACGATATTATTTAGGTACAATTCAAAGAAATTCAAAGAATCTTTGTCAATACA
AAACCCAGCACAGCTTTTTATTGAGCAAATATTATTTTCCAATAAACCAGTAATTGACTA
CACTAGCATACATGACAAGATTTTTGGATTACAAGACATGCCAGGAATTGAAGAACTAGA
TACAATTATAGGTCGCAAAACATTTGTTGAGAGTTATGTTCAAATCGTAGATGACTTAAG
CAATTTAACATTGGATATAAACGATGTCAAGACTATATTTGCCTTTTGTCTTATGAATGA
CCCACTACTGATCACATCTGCTAACAATATAATAATGTCTGTTAAGGGACATAGTCAAGA
AAGAATAGGTCAATCAGCATGCAAAATGCCAGAGGTCCGAAGTCTAAAACTCATACATTA
TTCACCAGCAGTTGTTTTGAGAGCCTATGTGAGAGGGCCAACAAATGTACCGAATGTAGA
TATAGATGAACTTGCAAGGGATCTATCTCATTTAGAAGACTTCATACAAAGTACAAAACT
CAGAGAAAATATGAGAGAGAGAATAGAAATAAATGAGAAGCGGCACTTAGGAAGGGATTT
CAAATTTGAAATCAAAGAACTAACTAGATTTTACCAAGTGTGTTATGATTACATAAAGTC
TACAGAACATAAAGTCAAGGTATTCATATTGCCATACAAAGTTTTCACATCAATAGAATT
CTGCGGGGCACTGACAGGTAACTTGATAAATGACAAATTATGGTACATAACGCATTATCT
GAAAAATATAGTGTCTACTACACATAAGGCACAAATTTCTTCTTCACCTGAATTGGAATT
GCAAATTGCTGATGAGGCACTAAGACTAGTAGCACATTTTGCTGATACTTTCTTGGCATC
AGAATCAAGAATACAATTTCTGAAGAAAATTATTGAAGAATTCACATACAAAGGGATACC
TGTAAAACATTTATACTCAAAAATAAAGAACTCCAAGTTGAGGGTTAAATTTCTAGGGAT
TCTTTTATGGTTAGATGATCTAACACAGAATGATCTGGATAAATTTGATGCAGATAAATC
AGATGAAAAGATTATATGGAATAACTGGCAAGTGTCAAGAGATATGAATACTGGACCAAT
AGACTTAATGATAAGCGGTTACTCTAGACAGCTGCGGATCACTGGGGAAGATGACAAATT
GATTGCTGCTGAATTGCAGGTTACTAGATTGTCAGAAGATTTAATTTATAGACACGGTCA
GGCAATGTTGAATAAGCCACACGGCTTAAAGCTTGAAAAAATGCAACCTGTGACTGAGAT
GTCTAAACGATTACATTATATCGTTTTCCAGCAAAGATCACGGAAACGATACTTCTATTC
TATATTACCCACCCAAGTAATTGAGGACCATAATTCTAGAGTTGAATCATCTAGGCTAAG
CAGAGATTCAAAATGGGTTCCTGTATGCCCTGTTGCAATATCAAAACTCTACCAACAAGG
ACGGCCTATACTTTCCAAAGTTAGAAATCTGAATATGCAGACTCATTCGCTTTCCAGAAT
ACAAGTTAATGTAGATGAATATGCCATCACGAGAAGAGCACATTTTCAGAAAATGCCTTT
CTTCGAAGGACCATCAATCCCTTCTGGTGGTATGGATTTGTCTGAGTTGATGAAATCTAC
ATCCCTATTAAGCTTGAATTATGATAACATAAAAAATGCATCCTTATTGGACATGTCTAG
GGTATTTAAGTGCAATGGCAGTGGAGATGACCAAATGGCTTTCGAATTTCTATCGGACGA
AATTTTGGAGCAAGATGTAGTTGAAGAAATAGAATGCAACCCTATATTTTCTATTAGTTA
TACAAAAAGAGGAGAATCCAATATGACTTATAAAAATGCTTTCCACAAAGCCTTAATCTC
AGAATGTGACAAATTTGAAGAAGCATTTGACTTCCTCGACATGGGATTTTGCTCGAATGA
AAATCTTAGTATTCTGGAGGAAATACATTGGATAATCAGTTATTTAAAAACAAATCAATG
GTCTACGGAACTAGACAATTGTATTCACATGTGCATGTACAGGAATGGATATGATGCAGA
ATATCATAAATTTGATATACCCTCTAAATTCCTCAAAGACCCAATAAACCGAACAATAAA
TTGGACTGAAGTCATTGAATTTATATTATTAATTGAAGATTTCCAAACAAAAATTGAGCC
ATGGTCTAGTATGAAGTCACACTTCTGTTCAAAAGCACACAGTGTAGCACTAGAGTGTAT
GAAAAATGAGAAAAGATCATTGGCAGAATTTGTAGACAAAAGTAAGAAAACTGGCAAATC
CAAATTTGACTTCTAAGGTATACACATGTAAAAGTAGTGTTTGTTTCTAAATAGGAGCAC
ACTACT
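
The reference length has to agree across files: the GFF3 `##sequence-region` line and the final `placementMaskRanges` entry both assume 6846 nt. The sketch below shows one way to measure record lengths in a FASTA file. The parser and the toy record are illustrative only; running it on `reference.fasta` would be expected to report 6846 for `NC_005776.1` if the files are consistent.

```python
def fasta_lengths(text):
    """Map each FASTA record ID (first word after '>') to its sequence length."""
    lengths = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# Toy record (hypothetical) so the example is self-contained.
example = ">demo_record toy sequence\nACGTAC\nGT\n"
print(fasta_lengths(example))

# For the real file, one could do:
# with open("reference.fasta") as handle:
#     print(fasta_lengths(handle.read()))
```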