CLI Reference
--help and --version work on every command.
Audit a single accession.
pxaudit check [OPTIONS] ACCESSION

Options
- `--refresh`: force a re-fetch from PRIDE and update the local cache
- `--no-cache`: skip the cache entirely and always fetch fresh
- `--db PATH`: SQLite output path (default: `pxaudit_results.db`)

Examples
# Quick check
pxaudit check PXD004683
# Re-download even if cached
pxaudit check PXD004683 --refresh
# Store results somewhere else
pxaudit check PXD004683 --db ~/projects/lab-audits.db

On success, prints a summary to the terminal and upserts the study, files, and audit rows into the SQLite database. If the PRIDE project endpoint is unreachable and no cached data exists, exits with code 1.
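
Because a failed check is reported through the exit code, a script can branch on it. The following is a minimal sketch, not part of PXAudit: the helper name `check_accession` is hypothetical, and it treats any non-zero exit code as a failure, since only code 1 is documented here.

```shell
# Hypothetical helper: run `pxaudit check` for one accession and report the result.
# The body runs in a subshell ( ... ) so the status variable does not leak.
check_accession() (
  acc="$1"
  status=0
  # Capture the exit code without aborting scripts that use `set -e`.
  pxaudit check "$acc" || status=$?
  if [ "$status" -eq 0 ]; then
    echo "ok: $acc"
  else
    echo "failed: $acc (exit code $status)" >&2
  fi
  return "$status"
)

# Usage (not executed here):
#   check_accession PXD004683 || echo "fix the network or warm the cache first"
```

The helper returns the original exit code, so callers can still chain it with `&&` and `||`.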
Audit many accessions in one run.
pxaudit bulk-audit [OPTIONS]

Options
| Flag | Default | Description |
|---|---|---|
| `--input PATH` | required | Path to accession list (one per line) or `-` for stdin |
| `--db PATH` | `pxaudit_results.db` | SQLite output path |
| `--format FMT` | none | Export results: `tsv`, `json`, or `csv` |
| `--output PATH` | auto-generated | Export file path |
| `--delay SEC` | `1.0` | Seconds to wait between API calls |
| `--continue-on-error` | off | Skip accessions that fail and keep going |
| `--overwrite` | off | Overwrite an existing export file |

Examples
# Basic batch
pxaudit bulk-audit --input accessions.txt
# Export to a TSV file for Excel
pxaudit bulk-audit --input ids.txt --format tsv --output results.tsv
# Pipe from another command, keep going on errors
cat accessions.txt | pxaudit bulk-audit --input - --continue-on-error
# Rate-limit yourself to one request every 2 seconds
pxaudit bulk-audit --input ids.txt --delay 2

Shows a progress bar via tqdm while processing. When finished, prints a summary with total, completed, and failed counts, plus a tier distribution breakdown.
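
Since `--input` expects one accession per line, it can help to normalize a messy list before a long run. The sketch below is an illustration only: `clean_accessions` is a hypothetical helper, and the pattern assumes accessions are `PXD` followed by exactly six digits, as in the examples on this page.

```shell
# Hypothetical helper: extract unique, well-formed accessions from text on stdin.
# Assumption: an accession is "PXD" plus exactly six digits (e.g. PXD004683).
clean_accessions() {
  # -o prints only the matches, -w rejects matches embedded in longer words,
  # -E enables the {6} repetition syntax; sort -u removes duplicates.
  grep -owE 'PXD[0-9]{6}' | sort -u
}

# Usage (not executed here):
#   clean_accessions < search_results.txt | pxaudit bulk-audit --input - --continue-on-error
```

Filtering first means malformed entries never reach the API, so they do not count toward the failed total in the summary.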
Handy one-liner for exploring a cohort:
# Audit every PXD from a search result, then query the DB for Silver+ datasets
grep -o 'PXD[0-9]*' search_results.txt | sort -u > cohort.txt
pxaudit bulk-audit --input cohort.txt
sqlite3 pxaudit_results.db "SELECT accession, tier FROM audit WHERE tier IN ('Silver','Gold','Platinum','Diamond') ORDER BY tier;"

List files for an accession from the database.
pxaudit manifest [OPTIONS] ACCESSION

Options
- `--db PATH`: SQLite database path (default: `pxaudit_results.db`)
- `--format FMT`: output format: `tsv` (default) or `json`

Examples
# Default TSV output
pxaudit manifest PXD004683
# JSON for programmatic use
pxaudit manifest PXD004683 --format json

The accession must have been audited first (via `check` or `bulk-audit`). Output includes: file_name, file_category, file_extension, ftp_location, file_size, checksum, checksum_type. PRIDE provides checksums for most recent submissions; older datasets may have none.
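
The TSV output lends itself to quick summaries with standard tools. As a rough sketch, the hypothetical helper below totals `file_size` per `file_category`. It assumes the TSV starts with a header row that names the columns listed above; that layout is an assumption, not something this page confirms.

```shell
# Hypothetical helper: sum file_size per file_category from manifest TSV on stdin.
# Assumption: the first line is a header row containing the column names.
manifest_size_by_category() {
  awk -F'\t' '
    NR == 1 {
      # Map each column name to its position so column order does not matter.
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    { total[$(col["file_category"])] += $(col["file_size"]) }
    END { for (c in total) printf "%s\t%.0f\n", c, total[c] }
  ' | sort
}

# Usage (not executed here):
#   pxaudit manifest PXD004683 | manifest_size_by_category
```

Looking columns up by name keeps the helper working if the column order changes. The totals are in whatever unit the manifest reports for file_size.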
# 1. Collect accessions from a keyword search (or wherever)
echo -e "PXD000001\nPXD004683\nPXD073444" > targets.txt
# 2. Audit them all
pxaudit bulk-audit --input targets.txt --format tsv --output audit.tsv
# 3. Check what you got
sqlite3 pxaudit_results.db "
SELECT tier, COUNT(*) AS n
FROM audit
GROUP BY tier
ORDER BY n DESC;
"
# 4. Inspect a specific dataset's files
pxaudit manifest PXD004683 | head -20
# 5. Re-audit a single accession after a logic update
pxaudit check PXD004683 --refresh

Documentation for PXAudit v0.3.0. Pages can be synced to the GitHub Wiki.