reference panel chr1 is not fully imported  #32

@roman-tremmel

I downloaded the hg19 reference panel with the parameters All and vcf, like this:

get1KGGRCh37.sh All 20 vcf

Then I used the command line to score the GWAS data. Before scoring starts, the reference panel is imported.

REF=~/PascalX/resource/All.1KG.GRCh37
GENE=~/PascalX/resource/gene_GRCh37.tsv
pascalx  -g False -w 10000 -m 0.05 -n True -p 20 ${GENE} ${REF} ${OUT} genescoring -sh False -cr 0 -cp 1 ${IN}

This command produced All.1KG.GRCh37.chr*.db files for all chromosomes, which can then be used for scoring. For chr1, however, the following error interrupts the import after 2 hours. Of note, the same error occurs when using a Python script instead of the command-line tool.

Reference panel data not imported. Trying to import...
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "~/anaconda3/envs/pascal/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "~/anaconda3/envs/pascal/lib/python3.9/site-packages/PascalX-0.0.4-py3.9-linux-x86_64.egg/PascalX/refpanel.py", line 270, in _import_reference_thread_vcf
    counter[int(geno[2])] += 1
ValueError: invalid literal for int() with base 10: '|'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "run_pascal.py", line 4, in <module>
    Scorer.load_refpanel("~/PascalX/resource/All.1KG.GRCh37",parallel=10, chrlist=[1])
  File "~/anaconda3/envs/pascal/lib/python3.9/site-packages/PascalX-0.0.4-py3.9-linux-x86_64.egg/PascalX/genescorer.py", line 95, in load_refpanel
    self._ref.set_refpanel(filename=filename,parallel=parallel,keepfile=keepfile,qualityT=qualityT,SNPonly=SNPonly,chrlist=chrlist)
  File "~/anaconda3/envs/pascal/lib/python3.9/site-packages/PascalX-0.0.4-py3.9-linux-x86_64.egg/PascalX/refpanel.py", line 120, in set_refpanel
    self._import_reference(chrs=NF,parallel=parallel,keepfile=keepfile,qualityT=qualityT,SNPonly=SNPonly,regEx=regEx,nobar=nobar)
  File "~/anaconda3/envs/pascal/lib/python3.9/site-packages/PascalX-0.0.4-py3.9-linux-x86_64.egg/PascalX/refpanel.py", line 365, in _import_reference
    r.get()
  File "~/anaconda3/envs/pascal/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
ValueError: invalid literal for int() with base 10: '|'
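
A possible way to narrow this down, as a sketch only: the failing line `counter[int(geno[2])] += 1` indexes the genotype string by character, which works for `0|1` (third character is `1`) but not for a call with a two-digit allele index such as `10|0` (third character is `|`). Whether the chr1 VCF actually contains such calls is an assumption, not something verified here. The helper names below (`find_unusual_genotypes`, `parse_gt`) are hypothetical and not part of PascalX; the code also assumes GT is the first sample sub-field, as the VCF specification requires when GT is present.

```python
import re

# A "simple" call is exactly one character, a separator, one character,
# e.g. "0|1", "1/0" or ".|.". Anything else would break geno[2] indexing.
_SIMPLE_GT = re.compile(r"^[0-9.][|/][0-9.]$")


def find_unusual_genotypes(lines, limit=10):
    """Return up to `limit` tuples (chrom, pos, sample_index, gt) for
    genotype fields that are not single-character diploid calls."""
    found = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta lines and the column header
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample columns on this line
        for i, sample in enumerate(fields[9:]):
            gt = sample.split(":", 1)[0]  # GT is the first sub-field
            if not _SIMPLE_GT.match(gt):
                found.append((fields[0], fields[1], i, gt))
                if len(found) >= limit:
                    return found
    return found


def parse_gt(gt):
    """Split a genotype on its separator instead of indexing characters;
    missing alleles (".") are dropped."""
    return [int(a) for a in re.split(r"[|/]", gt) if a != "."]
```

To run it against the downloaded panel, the file could be opened with `gzip.open(path, "rt")` and the handle passed as `lines`; the exact file name of the chr1 VCF depends on what the download script produced. If the scan reports calls like `10|0`, that would be consistent with the `ValueError` above, and splitting on the separator (as in `parse_gt`) would avoid it.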
