find a place to save extract-study-ids-with-biosamples stats #148

@turbomam

make -f make-gold-cache.Makefile local/gold-study-ids-with-biosamples.txt

curl -o downloads/goldData.xlsx "https://gold.jgi.doe.gov/download?mode=site_excel"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  216M  100  216M    0     0  34.0M      0  0:00:06  0:00:06 --:--:-- 43.4M
date && time poetry run extract-study-ids-with-biosamples \
        --excel-file downloads/goldData.xlsx \
        --sheet-name 'Sequencing Project' \
        --output-file local/gold-study-ids-with-biosamples.txt.tmp && date # 8 minutes
Sat Apr 19 08:31:12 AM EDT 2025

# Rows in sheet: 589825
# Unique Project IDs: 589825
# Unique Study IDs: 62129
# Unique Biosample IDs: 217543
# Unique Study IDs linked to Biosamples: 4739

Extracted 4739 Study GOLD IDs to 'local/gold-study-ids-with-biosamples.txt.tmp'

363.08user 1.72system 6:01.31elapsed 100%CPU (0avgtext+0avgdata 1954644maxresident)k
0inputs+104outputs (0major+480826minor)pagefaults 0swaps

Sat Apr 19 08:37:13 AM EDT 2025
sort local/gold-study-ids-with-biosamples.txt.tmp | uniq > local/gold-study-ids-with-biosamples.txt
rm -rf local/gold-study-ids-with-biosamples.txt.tmp

wc -l local/gold-study-ids-with-biosamples.txt

4739 local/gold-study-ids-with-biosamples.txt
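
One possible way to keep these stats is to capture the command's output and write the `# name: count` lines to a small JSON file next to the ID list. The sketch below is only an illustration, not part of the repository: `parse_stats`, `save_stats`, and any output path passed to `save_stats` are hypothetical names, and it assumes the summary lines keep the `# name: count` shape shown in the log above.

```python
import json
import re

# Matches summary lines such as "# Unique Study IDs: 62129".
# Assumption: the script keeps printing stats in this "# name: count" shape.
STAT_LINE = re.compile(r"^#\s*(?P<name>[^:]+):\s*(?P<value>\d+)\s*$")


def parse_stats(log_text):
    """Return {stat name: integer count} for every '# name: count' line."""
    stats = {}
    for line in log_text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match.group("name").strip()] = int(match.group("value"))
    return stats


def save_stats(log_text, path):
    """Write the parsed stats to `path` (hypothetical location) as JSON."""
    stats = parse_stats(log_text)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(stats, handle, indent=2)
        handle.write("\n")
    return stats


# Summary lines copied from the run reported in this issue.
EXAMPLE_LOG = """\
# Rows in sheet: 589825
# Unique Project IDs: 589825
# Unique Study IDs: 62129
# Unique Biosample IDs: 217543
# Unique Study IDs linked to Biosamples: 4739

Extracted 4739 Study GOLD IDs to 'local/gold-study-ids-with-biosamples.txt.tmp'
"""

if __name__ == "__main__":
    # Print the parsed stats as JSON; call save_stats() to persist them instead.
    print(json.dumps(parse_stats(EXAMPLE_LOG), indent=2))
```

A JSON file written this way could be committed or archived alongside `local/gold-study-ids-with-biosamples.txt` so the counts from different GOLD downloads can be compared over time. Where the file should live is the open question of this issue.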
