Skip to content

remove extraenous OOB write#2163

Open
eternal-flame-AD wants to merge 1 commit intoalexdobin:masterfrom
eternal-flame-AD:master
Open

remove extraenous OOB write#2163
eternal-flame-AD wants to merge 1 commit intoalexdobin:masterfrom
eternal-flame-AD:master

Conversation

@eternal-flame-AD
Copy link
Copy Markdown

@eternal-flame-AD eternal-flame-AD commented Jun 28, 2024

I verified this fixes #2158

Not sure why this was added in the first place, with the line removed I could not reproduce a complaining valgrind with the nf-core/rnaseq test dataset or the dataset referenced in #2158 or my own dataset. However this line seem to produce a guaranteed OOB write:

yume@tsubasa ~/O/A/2/l/i/tmp (main) [102] > set NUM_BASES (gawk '{sum = sum + $2}END{if ((log(sum)/log(2))/2 - 1 > 14) {printf "%.0f", 14} else {printf "%.0f", (l
og(sum)/log(2))/2 - 1}}' genome.fa.fai)
yume@tsubasa ~/O/A/2/l/i/tmp (main) > valgrind --track-origins=yes --keep-stacktraces=alloc-and-free ~/build/STAR/source/STAR \              (rnaseq-smk) 23:31:04
                                              --runMode genomeGenerate \
                                              --genomeDir star/ \
                                              --genomeFastaFiles genome.fa \
                                              --sjdbGTFfile genome.filtered.gtf \
                                              --runThreadN 2 \
                                              --genomeSAindexNbases $NUM_BASES \
                                              --limitGenomeGenerateRAM 53587091200
==500278== Memcheck, a memory error detector
==500278== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==500278== Using Valgrind-3.23.0 and LibVEX; rerun with -h for copyright info
==500278== Command: /home/yume/build/STAR/source/STAR --runMode genomeGenerate --genomeDir star/ --genomeFastaFiles genome.fa --sjdbGTFfile genome.filtered.gtf --runThreadN 2 --genomeSAindexNbases 7 --limitGenomeGenerateRAM 53587091200
==500278== 
        /home/yume/build/STAR/source/STAR --runMode genomeGenerate --genomeDir star/ --genomeFastaFiles genome.fa --sjdbGTFfile genome.filtered.gtf --runThreadN 2 --genomeSAindexNbases 7 --limitGenomeGenerateRAM 53587091200
        STAR version: 2.7.11b   compiled: 2024-06-27T23:30:18-05:00 :/home/yume/build/STAR/source
Jun 27 23:31:06 ..... started STAR run
Jun 27 23:31:06 ... starting to generate Genome files
Jun 27 23:31:06 ..... processing annotations GTF
Jun 27 23:31:06 ... starting to sort Suffix Array. This may take a long time...
Jun 27 23:31:06 ... sorting Suffix Array chunks and saving them to disk...
Jun 27 23:31:11 ... loading chunks from disk, packing SA...
Jun 27 23:31:12 ... finished generating suffix array
Jun 27 23:31:12 ... generating Suffix Array index
Jun 27 23:31:12 ... completed Suffix Array index
==500278== Invalid read of size 8
==500278==    at 0x21530E: PackedArray::writePacked(unsigned long long, unsigned long long) (PackedArray.cpp:24)
==500278==    by 0x2894EA: sjdbInsertJunctions(Parameters&, Genome&, Genome&, SjdbClass&) (sjdbInsertJunctions.cpp:68)
==500278==    by 0x274E72: Genome::genomeGenerate() (Genome_genomeGenerate.cpp:365)
==500278==    by 0x212722: main (STAR.cpp:91)
==500278==  Address 0x5e6fc44 is 1,062,228 bytes inside a block of size 1,062,232 alloc'd
==500278==    at 0x48445F3: operator new[](unsigned long) (vg_replace_malloc.c:729)
==500278==    by 0x21536F: PackedArray::allocateArray() (PackedArray.cpp:32)
==500278==    by 0x27457C: Genome::genomeGenerate() (Genome_genomeGenerate.cpp:297)
==500278==    by 0x212722: main (STAR.cpp:91)
==500278== 
==500278== Invalid write of size 8
==500278==    at 0x215332: PackedArray::writePacked(unsigned long long, unsigned long long) (PackedArray.cpp:24)
==500278==    by 0x2894EA: sjdbInsertJunctions(Parameters&, Genome&, Genome&, SjdbClass&) (sjdbInsertJunctions.cpp:68)
==500278==    by 0x274E72: Genome::genomeGenerate() (Genome_genomeGenerate.cpp:365)
==500278==    by 0x212722: main (STAR.cpp:91)
==500278==  Address 0x5e6fc44 is 1,062,228 bytes inside a block of size 1,062,232 alloc'd
==500278==    at 0x48445F3: operator new[](unsigned long) (vg_replace_malloc.c:729)
==500278==    by 0x21536F: PackedArray::allocateArray() (PackedArray.cpp:32)
==500278==    by 0x27457C: Genome::genomeGenerate() (Genome_genomeGenerate.cpp:297)
==500278==    by 0x212722: main (STAR.cpp:91)
==500278== 
Jun 27 23:31:12 ... writing Genome to disk ...
Jun 27 23:31:12 ... writing Suffix Array to disk ...
Jun 27 23:31:12 ... writing SAindex to disk
Jun 27 23:31:12 ..... finished successfully
==500278== 
==500278== HEAP SUMMARY:
==500278==     in use at exit: 775,568 bytes in 436 blocks
==500278==   total heap usage: 7,795 allocs, 7,359 frees, 100,485,363 bytes allocated
==500278== 
==500278== LEAK SUMMARY:
==500278==    definitely lost: 112,243 bytes in 22 blocks
==500278==    indirectly lost: 0 bytes in 0 blocks
==500278==      possibly lost: 320 bytes in 1 blocks
==500278==    still reachable: 663,005 bytes in 413 blocks
==500278==         suppressed: 0 bytes in 0 blocks
==500278== Rerun with --leak-check=full to see details of leaked memory
==500278== 
==500278== For lists of detected and suppressed errors, rerun with: -s
==500278== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

SA.pointArray(SApass1.charArray + SApass1.lengthByte-SA.lengthByte); //SA is shifted to have space for junction insertion

From this line it seems the buffer for SA is only is nSA long thus accessing the nSA-th element will be a guaranteed OOB write. In certain configuration combinations this will overwrite allocation header for something else and cause the crash.

Signed-off-by: eternal-flame-AD <yume@yumechi.jp>
birdingman0626 added a commit to birdingman0626/STAR-Win that referenced this pull request Apr 13, 2026
Bug fixes from upstream STAR PRs:
- PR alexdobin#2163: Remove OOB write in sjdbInsertJunctions.cpp (SA.writePacked
  at index nSA is one past the end; memory corruption confirmed by Valgrind)
- PR alexdobin#2676: Fix memory leaks in outputSJ.cpp (sjA, sjFilter, sjChunks
  arrays allocated but never freed)
- PR alexdobin#535: Fix segfault in SA lookup shortcut in
  ReadAlign_maxMappableLength2strands.cpp (unreliable shortcut caused
  unsigned underflow and SIGSEGV on certain genomes)

Performance optimizations:
- PR alexdobin#791: PackedArray bitmask optimization (replace expensive double-shift
  with single AND in hot operator[]; ~1-2% improvement)
- PR alexdobin#791: Add FastResetVector.h (O(modified) reset instead of O(N) memset;
  available for winBin array optimization)
- PR alexdobin#773 (partial): Early rejection in stitchWindowAligns.cpp to skip
  unnecessary Transcript copies when alignment will obviously fail

HTSlib upgrade evaluated and deferred: current 1.3 has minimal security
surface (no CRAM, no network I/O, trusted input only).

Note: uniquely mapped count changes slightly due to PR alexdobin#535 fix — the
buggy shortcut was incorrectly skipping valid alignments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

index error: double free or corruption (!prev)

1 participant