Segmentation fault in make_examples when running DeepVariant 1.10.0 with PacBio HiFi BAM #1060

@ChuanzhengWei

Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.10/docs/FAQ.md:
Yes
Describe the issue:
I encountered a segmentation fault during the make_examples stage when running DeepVariant v1.10.0 with PacBio HiFi data.

The pipeline starts normally and several shards complete successfully, but one shard crashes with a segmentation fault in make_examples_core.py during writes_examples_in_region.

The crash happens mid-execution, not at startup.

Setup

  • Operating system: Linux (HPC cluster)
  • DeepVariant version: 1.10.0
  • Installation method (Docker, built from source, etc.): Singularity container (deepvariant-1.10.0.sif)
  • Type of data: (sequencing instrument, reference genome, anything special that is unlike the case studies?)
    PacBio HiFi reads
    Reads aligned with minimap2 (map-hifi preset), ~50× coverage
    Plant genome reference (~700 Mb genome)
    Input BAM is sorted and indexed, and contig names match the reference genome.
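
For anyone trying to reproduce this, here is a minimal sketch of one way the contig-name check above could be repeated. It compares the name and length columns of the FASTA index against the output of samtools idxstats. The function name compare_contigs and the file names in the usage comments are hypothetical, and this is not necessarily how the check was originally done.

```shell
# Sketch: list contigs whose name or length differs between a FASTA index
# and a BAM header. Both inputs are plain tab-separated text files:
#   $1 = FASTA index (.fai), columns: name, length, offset, linebases, linewidth
#   $2 = saved `samtools idxstats` output, columns: name, length, mapped, unmapped
# compare_contigs is a hypothetical helper name, not part of DeepVariant or samtools.
compare_contigs() {
  fai="$1"
  idxstats="$2"
  tmpdir=$(mktemp -d) || return 1

  # Keep "name<TAB>length" from the FASTA index.
  cut -f1,2 "$fai" | LC_ALL=C sort > "$tmpdir/ref.txt"

  # Same two columns from idxstats; drop the "*" row that counts unplaced reads.
  awk -F'\t' -v OFS='\t' '$1 != "*" {print $1, $2}' "$idxstats" \
    | LC_ALL=C sort > "$tmpdir/bam.txt"

  # comm -3 prints only lines unique to one side:
  # unindented lines exist only in the reference, tab-indented lines only in the BAM.
  LC_ALL=C comm -3 "$tmpdir/ref.txt" "$tmpdir/bam.txt"

  rm -rf "$tmpdir"
}

# Usage (assumed file names):
#   samtools faidx reference.fasta            # writes reference.fasta.fai
#   samtools idxstats test.bam > idxstats.tsv
#   compare_contigs reference.fasta.fai idxstats.tsv
```

Empty output means every contig name and length agrees between the two files; any printed line points at a contig worth inspecting before re-running.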

Steps to reproduce:

  • Command:
prefix=test
threads=24

singularity exec \
  -B ${PWD}/${prefix}_input:/input \
  -B ${PWD}/${prefix}_output:/output \
  deepvariant-1.10.0.sif \
  /opt/deepvariant/bin/run_deepvariant \
  --sample_name=${prefix} \
  --model_type=PACBIO \
  --ref=/input/reference.fasta \
  --reads=/input/${prefix}.bam \
  --output_vcf=/output/${prefix}.dp.vcf.gz \
  --output_gvcf=/output/${prefix}.dp.gvcf.gz \
  --intermediate_results_dir=/output/deepvariant_tmp \
  --num_shards=${threads} \
  --logging_dir=/output/logs \
  --vcf_stats_report=true
  • Error trace: (if applicable)
Fatal Python error: Segmentation fault

File ".../make_examples_core.py", line 1991 in writes_examples_in_region
File ".../make_examples_core.py", line 3651 in make_examples_runner
File ".../make_examples.py", line 229 in main

Any additional context:
I have tried modifying the num_shards parameter and also allocating more memory, but the run still fails.
I have attached the full runtime log: make_examples.log.

make_examples.log
