Merged

39 commits
8c635b3
Add Swagger documentation for API query parameters; improve fhr metad…
nsheff Feb 18, 2026
acd2df8
Merge branch 'dev' of github.com:refgenie/refget into dev
nsheff Feb 18, 2026
afc7a2d
add local store lookup capability to the seqcol CLI commands
nsheff Feb 18, 2026
09646a2
improve fasta digest ui
nsheff Feb 18, 2026
58f5799
fix cancel
nsheff Feb 18, 2026
eb2ce70
fix comparison links
nsheff Feb 18, 2026
f2c74b4
improve error msg
nsheff Feb 18, 2026
9c62b3b
update CLI store commands
nsheff Feb 19, 2026
38f290d
first draft of refget-r
nsheff Feb 19, 2026
16e30d6
Merge branch 'dev' of github.com:refgenie/refget into dev
nsheff Feb 19, 2026
4da6f2b
Add compliance testing, improve frontend error handling, and bump to …
nsheff Feb 21, 2026
286fbab
Bump rollup from 4.35.0 to 4.59.0 in /frontend
dependabot[bot] Feb 27, 2026
599cc5b
Bump minimatch from 3.1.2 to 3.1.5 in /frontend
dependabot[bot] Feb 28, 2026
3fc7487
Merge pull request #61 from refgenie/dependabot/npm_and_yarn/frontend…
nsheff Feb 28, 2026
19ff484
Merge pull request #62 from refgenie/dependabot/npm_and_yarn/frontend…
nsheff Feb 28, 2026
3e83b3a
clean up actions
nsheff Feb 28, 2026
c153335
first pass at an r pkg
nsheff Feb 28, 2026
7592176
Merge branch 'r' into dev
nsheff Feb 28, 2026
190fd65
add py alias docstring and tests
nsheff Mar 2, 2026
142ccc1
clean up for new gtars updates
nsheff Mar 2, 2026
4952974
Add inventory_genomes.py for brickyard FASTA inventory
nsheff Mar 3, 2026
c49e55f
parallel encoding for loading fasta
nsheff Mar 3, 2026
8089d52
Merge origin/dev into dev
nsheff Mar 3, 2026
ba6f35f
add explorer, store builders
nsheff Mar 4, 2026
6e8f978
Modernize build system, expand store CLI, and add store-backed backend
nsheff Mar 13, 2026
82e4155
Reorganize data_loaders into task-specific subdirectories
nsheff Mar 13, 2026
e09ccdd
build up store-backed seqcolapi
nsheff Mar 17, 2026
be71c4e
restructure data loading
nsheff Mar 17, 2026
d91f24a
Merge branch 'dev' of github.com:refgenie/refget into dev
nsheff Mar 17, 2026
8324650
major work on building stores, frontend
nsheff Mar 18, 2026
ab77d9f
add scom config
nsheff Mar 18, 2026
0dd0747
prep for deploy
nsheff Mar 18, 2026
01a48d3
Fix store deploy: checkout dev branch for seqcolapi code
nsheff Mar 19, 2026
228b67c
fix store-backed frontend, add tests
nsheff Mar 19, 2026
0243d22
Merge branch 'master' into dev
nsheff Mar 19, 2026
2d2c51e
lint
nsheff Mar 19, 2026
9cc02d4
fix import
nsheff Mar 19, 2026
fca18fa
cleanup
nsheff Mar 19, 2026
aea3a4d
fix tests
nsheff Mar 19, 2026
19 changes: 19 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,19 @@
version: 2
updates:
- package-ecosystem: "pip"
directory: "/"
schedule:
interval: "weekly"
target-branch: "dev"

- package-ecosystem: "npm"
directory: "/frontend"
schedule:
interval: "weekly"
target-branch: "dev"

- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
target-branch: "dev"
10 changes: 7 additions & 3 deletions .github/workflows/black.yml
@@ -6,6 +6,10 @@ jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
- uses: psf/black@stable
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.12"
- run: pip install ruff
- run: ruff check .
- run: ruff format --check .
57 changes: 9 additions & 48 deletions .github/workflows/claude-code-review.yml
@@ -1,30 +1,22 @@
name: Claude Code Review

on:
pull_request:
types: [opened, ready_for_review]
# Optional: Only run on specific file changes
# paths:
# - "src/**/*.ts"
# - "src/**/*.tsx"
# - "src/**/*.js"
# - "src/**/*.jsx"
workflow_dispatch:
inputs:
pr_number:
description: 'PR number to review'
required: true
type: number

jobs:
claude-review:
# Optional: Filter by PR author
# if: |
# github.event.pull_request.user.login == 'external-contributor' ||
# github.event.pull_request.user.login == 'new-developer' ||
# github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR'

runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
issues: read
id-token: write

steps:
- name: Checkout repository
uses: actions/checkout@v4
@@ -36,43 +28,12 @@ jobs:
uses: anthropics/claude-code-action@beta
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}

# Optional: Specify model (defaults to Claude Sonnet 4, uncomment for Claude Opus 4)
# model: "claude-opus-4-20250514"

# Direct prompt for automated review (no @claude mention needed)
direct_prompt: |
Please review this pull request and provide feedback on:
Please review pull request #${{ inputs.pr_number }} and provide feedback on:
- Code quality and best practices
- Potential bugs or issues
- Performance considerations
- Security concerns
- Test coverage

Be constructive and helpful in your feedback.

# Optional: Use sticky comments to make Claude reuse the same comment on subsequent pushes to the same PR
# use_sticky_comment: true

# Optional: Customize review based on file types
# direct_prompt: |
# Review this PR focusing on:
# - For TypeScript files: Type safety and proper interface usage
# - For API endpoints: Security, input validation, and error handling
# - For React components: Performance, accessibility, and best practices
# - For tests: Coverage, edge cases, and test quality

# Optional: Different prompts for different authors
# direct_prompt: |
# ${{ github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR' &&
# 'Welcome! Please review this PR from a first-time contributor. Be encouraging and provide detailed explanations for any suggestions.' ||
# 'Please provide a thorough code review focusing on our coding standards and best practices.' }}

# Optional: Add specific tools for running tests or linting
# allowed_tools: "Bash(npm run test),Bash(npm run lint),Bash(npm run typecheck)"

# Optional: Skip review for certain conditions
# if: |
# !contains(github.event.pull_request.title, '[skip-review]') &&
# !contains(github.event.pull_request.title, '[WIP]')

Be constructive and helpful in your feedback.
21 changes: 9 additions & 12 deletions .github/workflows/deploy_store.yml
@@ -15,44 +15,41 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v5
with:
ref: dev

- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v1
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1

- name: Login to Amazon ECR
id: login-ecr
uses: aws-actions/amazon-ecr-login@v1
uses: aws-actions/amazon-ecr-login@v2

- name: Build, tag, and push image to Amazon ECR
id: build-image
env:
ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
ECR_REPOSITORY: seqcolapi-store
ECR_REPOSITORY: seqcolapi
IMAGE_TAG: ${{ github.sha }}
run: |
cd deployment/seqcolapi-store/
docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG -f Dockerfile .
docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG -f deployment/seqcolapi-store/Dockerfile .
docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
echo "::set-output name=image::$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG"
echo "image=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" >> $GITHUB_OUTPUT

- name: Fill in the new image ID in the Amazon ECS task definition
id: task-def
uses: aws-actions/amazon-ecs-render-task-definition@v1
uses: aws-actions/amazon-ecs-render-task-definition@v1.6.2
with:
task-definition: deployment/seqcolapi-store/task_def.json
container-name: seqcolapi-store
container-name: seqcolapi
image: ${{ steps.build-image.outputs.image }}

- name: Deploy Amazon ECS task definition
uses: aws-actions/amazon-ecs-deploy-task-definition@v1
uses: aws-actions/amazon-ecs-deploy-task-definition@v2
with:
task-definition: ${{ steps.task-def.outputs.task-definition }}
service: seqcolapi-store-service
service: seqcolapi-service
cluster: yeti
wait-for-service-stability: true
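Among the changes above, the build step swaps the deprecated `::set-output` workflow command for appending to the `$GITHUB_OUTPUT` file. A minimal local sketch of the new mechanism (the image path is made up; on a real runner, `GITHUB_OUTPUT` is provided for you):

```shell
# The Actions runner sets GITHUB_OUTPUT to a file path; simulate that locally.
export GITHUB_OUTPUT="$(mktemp)"

# New style: append key=value lines to the file instead of echoing ::set-output.
echo "image=registry.example.com/seqcolapi:abc123" >> "$GITHUB_OUTPUT"

# Later steps consume this as ${{ steps.build-image.outputs.image }};
# locally we can just inspect the file.
cat "$GITHUB_OUTPUT"
```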
9 changes: 4 additions & 5 deletions .github/workflows/python-publish.yml
@@ -1,4 +1,4 @@
# This workflows will upload a Python Package using Twine when a release is created
# This workflow uploads a Python Package using trusted publishing when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

name: Upload Python Package
@@ -23,10 +23,9 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine
- name: Build and publish
pip install build
- name: Build package
run: |
python setup.py sdist bdist_wheel
python -m build
- name: Publish package distributions to PyPI
uses: pypa/gh-action-pypi-publish@release/v1

21 changes: 0 additions & 21 deletions .github/workflows/run-codecov.yml

This file was deleted.

11 changes: 4 additions & 7 deletions .github/workflows/run-pytest.yml
@@ -9,7 +9,7 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
python-version: ["3.10", "3.13"]
python-version: ["3.10", "3.14"]
os: [ubuntu-latest]

steps:
@@ -20,13 +20,10 @@ jobs:
with:
python-version: ${{ matrix.python-version }}

- name: Install test dependencies
run: if [ -f requirements/requirements-test.txt ]; then pip install -r requirements/requirements-test.txt; fi

- name: Install package
- name: Install package with test extras
env:
PYO3_USE_ABI3_FORWARD_COMPATIBILITY: 1
run: python -m pip install .
run: python -m pip install ".[test]"

- name: Run pytest tests
run: pytest -x -vv --cov=./ --cov-report=xml
run: pytest -x -vv --cov=./ --cov-report=xml
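The switch to `pip install ".[test]"` assumes the project declares a `test` extra in its packaging metadata, e.g. in `pyproject.toml` (the dependency list here is illustrative, not taken from this PR):

```toml
[project]
name = "refget"

[project.optional-dependencies]
test = ["pytest", "pytest-cov"]
```

This removes the need for a separate `requirements/requirements-test.txt`, which is why that install step was dropped.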
3 changes: 0 additions & 3 deletions MANIFEST.in

This file was deleted.

45 changes: 42 additions & 3 deletions README.md
@@ -43,9 +43,48 @@ This starts the test database, runs tests, and cleans up automatically.

## Development and deployment: Backend

### Easy-peasy way
### Store-backed (no database)

In a moment I'll show you how to do these steps individually, but if you're in a hurry, the easy way get a development API running for testing is to just use my very simple shell script like this (no data persistence, just loads demo data):
The store-backed seqcolapi uses a RefgetStore (local files) instead of PostgreSQL. This is the simplest way to run the API:

#### Quick start

```console
bash deployment/store_demo_up.sh
```

This will:
- Build a local RefgetStore from test FASTA files
- Run the store-backed seqcolapi with uvicorn
- Block the terminal until you press Ctrl+C, which cleans up

No Docker or database required.

#### Step-by-step

1. Build a store from FASTA files:

```console
python data_loaders/demo_build_store.py test_fasta /tmp/refget_demo_store
```

2. Start the store-backed API:

```console
REFGET_STORE_PATH=/tmp/refget_demo_store uvicorn seqcolapi.main:store_app --reload --port 8100
```

#### Remote store

To run against a remote (S3) store:

```console
REFGET_STORE_URL=https://example.com/store uvicorn seqcolapi.main:store_app --port 8100
```

### DB-backed (PostgreSQL)

If you need a database-backed instance (e.g., for mutable data, advanced queries), use the DB-backed workflow. In a moment I'll show you how to do these steps individually, but if you're in a hurry, the easy way to get a development API running for testing is to just use my very simple shell script like this (no data persistence, just loads demo data):

```console
bash deployment/demo_up.sh
@@ -58,7 +97,7 @@ This will:
- load up the demo data
- block the terminal until you press Ctrl+C, which will shut down all services.

### Step-by-step process
### Step-by-step process (DB-backed)

Alternatively, if you want to run each step separately to see what's really going on, start here.

9 changes: 8 additions & 1 deletion data_loaders/demo_build_store.py
@@ -38,7 +38,14 @@ def main():
store = RefgetStore.on_disk(store_path)

for fasta in fasta_files:
store.add_sequence_collection_from_fasta(fasta)
result = store.add_sequence_collection_from_fasta(fasta)
# Register the filename (without extension) as a collection alias
basename = os.path.basename(fasta)
name = basename.split(".")[0] # strip .fa, .fasta, .fa.gz, etc.
meta = result[0] if isinstance(result, tuple) else result
if meta:
store.add_collection_alias("fasta_filename", name, meta.digest)
print(f" {name} → {meta.digest}")

print(f"Done. Store at: {store_path}")
print(f"Stats: {store.stats()}")
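The alias name in this change is derived by taking the file's basename and splitting at the first dot. A standalone sketch of that derivation (the function name is hypothetical):

```python
import os

def alias_from_fasta_path(path: str) -> str:
    # Basename with everything after the first dot removed,
    # mirroring the alias derivation in the diff above.
    return os.path.basename(path).split(".")[0]

print(alias_from_fasta_path("test_fasta/demo0.fa.gz"))
```

Note that splitting at the first dot means a filename like `hg38.p14.fa` collapses to `hg38`, which may or may not be the intended alias for dotted genome names.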
17 changes: 8 additions & 9 deletions data_loaders/demo_remote_store.py
@@ -39,7 +39,7 @@ def main():
print(f"\n1. Loading remote store from:\n {REMOTE_URL}")
print(f" Cache directory: {CACHE_DIR}\n")

store = RefgetStore.load_remote(cache_path=str(CACHE_DIR), remote_url=REMOTE_URL)
store = RefgetStore.open_remote(cache_path=str(CACHE_DIR), remote_url=REMOTE_URL)

print(f" Loaded! {len(store)} sequences available (metadata only)")

@@ -51,17 +51,16 @@

# 3. List sequences (first 5)
print(f"\n3. Listing sequences (first 5 of {len(store)}):")
records = store.sequence_records()
for i, rec in enumerate(records[:5]):
m = rec.metadata
records = store.list_sequences()
for i, m in enumerate(records[:5]):
print(f" {i+1}. {m.name[:50]}...")
print(f" sha512t24u: {m.sha512t24u}")
print(f" length: {m.length:,} bp")

# 4. Fetch a sequence by ID (downloads sequence data on first access)
seq_digest = "du4GiRD_OcmdmCn_RmImyb71YZ4XoCdk"
print(f"\n4. Get sequence record by ID (fetches from remote):")
record = store.get_sequence_by_id(seq_digest)
record = store.get_sequence(seq_digest)
if record:
print(f" Name: {record.metadata.name}")
print(f" Length: {record.metadata.length:,} bp")
@@ -107,7 +106,7 @@ def main():
print(f" Collection: {EXAMPLE_COLLECTION}")
print(f" Sequence: {EXAMPLE_SEQ_NAME[:50]}...")

record = store.get_sequence_by_collection_and_name(EXAMPLE_COLLECTION, EXAMPLE_SEQ_NAME)
record = store.get_sequence_by_name(EXAMPLE_COLLECTION, EXAMPLE_SEQ_NAME)
if record:
print(f" Found! Length: {record.metadata.length:,} bp")
print(f" Digest: {record.metadata.sha512t24u}")
@@ -149,9 +148,9 @@ def main():
print(f"\nCache directory: {CACHE_DIR}")
print(f"Temp files: {temp_dir}")
print("\nKey features demonstrated:")
print(" - load_remote(): Load store from URL, fetch sequences on-demand")
print(" - get_sequence_by_id(): Lookup by SHA-512/24u or MD5 digest")
print(" - get_sequence_by_collection_and_name(): Lookup by sequence name")
print(" - open_remote(): Load store from URL, fetch sequences on-demand")
print(" - get_sequence(): Lookup by SHA-512/24u or MD5 digest")
print(" - get_sequence_by_name(): Lookup by collection digest + sequence name")
print(" - substrings_from_regions(): Batch retrieval from BED file")
print(" - export_fasta_by_digests(): Export sequences by digest")
print(" - export_fasta_from_regions(): Export BED regions to FASTA")
2 changes: 1 addition & 1 deletion data_loaders/load_demo_seqcols.py
@@ -19,7 +19,7 @@
DEMO_FASTA = json.load(open("test_fasta/test_fasta_digests.json"))

# Storage locations from environment (if set, will upload; otherwise use demo defaults with skip_upload)
ENV_STORAGE = json.loads(os.environ.get("FASTA_STORAGE_LOCATIONS", "[]"))
ENV_STORAGE = json.loads(os.environ.get("FASTA_STORAGE_LOCATIONS") or "[]")
if ENV_STORAGE:
DEMO_STORAGE = ENV_STORAGE
SKIP_UPLOAD = False
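The one-line change above matters when the environment variable is set but empty: `dict.get`'s default only applies when the key is absent, so the old form would pass `""` to `json.loads`, which raises. A small sketch of the difference:

```python
import json
import os

# An env var that is *set but empty* (common in CI) defeats dict.get's default.
os.environ["FASTA_STORAGE_LOCATIONS"] = ""

# Old form: .get(..., "[]") returns "" here, and json.loads("") raises.
# New form: `or "[]"` also covers the empty-string (and unset) cases.
value = os.environ.get("FASTA_STORAGE_LOCATIONS") or "[]"
locations = json.loads(value)
print(locations)
```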
3 changes: 3 additions & 0 deletions data_loaders/ref-genome-analysis/.gitignore
@@ -0,0 +1,3 @@
__pycache__/
*.pyc
*.log