Ideally, this could be done from the CLI:
refget cache ABCXYZ // stores it into local disk cache
refget extract ABCXYZ --regions query.bed // look up sequences for a set of intervals
I need to retrieve sequences from a remote server.
Two options:
- I can create a new DRS endpoint for just the fasta file.
- I can use refgenie to get a fasta asset.
Option 1: Add DRS endpoint to the refget seqcol server
Where is the pointer from digest to fasta file location stored? In the database... a new table? Files?
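To make the "new table" idea concrete, here is a minimal sketch of a digest-to-location lookup using SQLite from the Python standard library. The table name `fasta_location`, its columns, and the example URL are all hypothetical; nothing here reflects an existing seqcol server schema.

```python
import sqlite3


def init_db(conn):
    # Hypothetical schema: one row per seqcol digest, pointing at a fasta file.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fasta_location ("
        "  seqcol_digest TEXT PRIMARY KEY,"
        "  access_url TEXT NOT NULL)"
    )


def register_fasta(conn, digest, url):
    # Insert or overwrite the pointer for this digest.
    conn.execute(
        "INSERT OR REPLACE INTO fasta_location (seqcol_digest, access_url) "
        "VALUES (?, ?)",
        (digest, url),
    )


def lookup_fasta(conn, digest):
    # Return the stored location, or None if the digest is unknown.
    row = conn.execute(
        "SELECT access_url FROM fasta_location WHERE seqcol_digest = ?",
        (digest,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
init_db(conn)
# Placeholder digest and URL, for illustration only.
register_fasta(conn, "ABCXYZ", "https://example.org/fasta/ABCXYZ.fa.gz")
```

A DRS endpoint could then answer a request for a digest by calling something like `lookup_fasta` and returning the stored location as the access URL.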
Option 2: Just use refgenie
Maybe let refgenie handle the distribution of the sequence data?
Differences
- refgenie asset is indexed by the asset digest, not the seqcol digest
- refgenie asset includes not just the fasta file but other stuff.
Decision
I should use refgenie.
- they'll be storing the fasta files anyway.
- refget benefits from the refgenie content delivery networks, automatically
- a good use case of refgenie
Implementation
How would I use this then? How do I pass information about the file from refgenie to the local refget extractor?
- refget extract is happening in Python;
- add an optional dependency of refgenie? Can it operate like a plugin?
- refget cache looks up the refgenie asset, pulls it out, and creates the refget-rs on-disk representation? So there's a $REFGETCACHE location where the seqcols are stored.
- refget extract then is a lightweight Python wrapper around the Rust extraction command?
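A rough sketch of what the Python side might look like, under several assumptions: the fallback cache path, the one-directory-per-digest layout, and every function name below are hypothetical, and `extract` is a pure-Python stand-in for the Rust extraction step, not the refget-rs API. The only things taken from the notes are the $REFGETCACHE variable and the BED input, which uses 0-based, half-open intervals.

```python
import os
from pathlib import Path


def cache_dir():
    # $REFGETCACHE comes from the notes; the fallback location is an assumption.
    return Path(os.environ.get("REFGETCACHE", str(Path.home() / ".refget_cache")))


def seqcol_path(digest):
    # Hypothetical layout: one directory per seqcol digest inside the cache.
    return cache_dir() / digest


def parse_bed(lines):
    # Read (chrom, start, end) from BED lines, skipping blanks and header lines.
    regions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def extract(sequences, regions):
    # Stand-in for the Rust extraction command: BED intervals are 0-based and
    # half-open, so they map directly onto Python slices.
    return [sequences[chrom][start:end] for chrom, start, end in regions]


# Toy in-memory sequence collection, for illustration only.
sequences = {"chr1": "ACGTACGTAC"}
regions = parse_bed(["# example", "chr1\t0\t4", "chr1\t6\t10"])
print(extract(sequences, regions))  # ['ACGT', 'GTAC']
```

In the real tool, `extract` would presumably hand the path from `seqcol_path` and the parsed regions to the Rust code instead of slicing in-memory strings.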