Summary
The SLICOT routine NF01BS (QR factorization of Jacobian in compressed form) contains a bug where DLACPY is called with overlapping source and destination memory regions. This is undefined behavior per the LAPACK specification.
Bug Location
File: SLICOT-Reference/src/NF01BS.f
Line: 519
CALL DLACPY( 'Full', N, ST, J(LDJ*BSN+1), LDJ, J(IBSN), N )
Where IBSN = N*BSN + 1 (set at line 518).
Root Cause
The routine reshapes matrix J from leading dimension LDJ (= M = BNBSM) to leading dimension N (= BNBSN + ST). When LDJ > N, the source and destination regions overlap in memory.
Memory Layout
Source: J(LDJ*BSN+1) with leading dimension LDJ
Destination: J(N*BSN+1) with leading dimension N
When LDJ > N:
Source offset = LDJ * BSN = larger
Dest offset = N * BSN = smaller
These regions overlap!
Trigger Conditions
BN > 1 (multiple blocks - general case, not special case)
BSN > 0 (nonlinear part exists)
BSM > BSN (more rows than columns per block, MMN > 0)
ST > 0 (linear part exists)
LAPACK DLACPY Specification
From LAPACK documentation:
DLACPY copies all or part of a two-dimensional matrix A to another matrix B.
DLACPY does not handle overlapping memory regions. It uses simple nested loops (effectively memcpy semantics), not memmove.
Reproduction
Prerequisites
- gfortran
- LAPACK/BLAS libraries
- Valgrind
Build
cd tests/fortran
gfortran -g -O0 -fcheck=all -Wall -c \
../../SLICOT-Reference/src/NF01BS.f \
../../SLICOT-Reference/src/MD03BX.f
gfortran -g -O0 -fcheck=all -Wall -o test_nf01bs_overlap \
test_nf01bs_overlap.f NF01BS.o MD03BX.o -llapack -lblas
Run with Valgrind
valgrind --track-origins=yes ./test_nf01bs_overlap
Expected Output
==19687== Source and destination overlap in memcpy(0x..., 0x..., 48)
==19687== at 0x...: __GI_memcpy (...)
==19687== by 0x...: dlacpy_ (liblapack.so)
==19687== by 0x...: nf01bs_ (NF01BS.f:479)
==19687== by 0x...: MAIN__ (test_nf01bs_overlap.f:96)
Test Parameters
The test uses parameters that reliably trigger the overlap:
| Parameter |
Value |
Description |
| BN |
2 |
Number of blocks |
| BSM |
4 |
Block rows |
| BSN |
2 |
Block columns |
| ST |
2 |
Linear part size |
| N |
6 |
Total columns (BN*BSN + ST) |
| M |
8 |
Total rows (BN*BSM) |
| LDJ |
8 |
Leading dimension (= M) |
Memory offsets:
- Source:
LDJ*BSN + 1 = 17 (1-based)
- Destination:
N*BSN + 1 = 13 (1-based)
- Overlap because source starts at 17, destination at 13, with
LDJ=8 > N=6
Fix in C Translation
The C translation (src/NF/nf01bs.c) uses a local helper function with memmove:
static void local_dlacpy_safe(i32 m, i32 n, const f64 *a, i32 lda, f64 *b, i32 ldb) {
for (i32 j = 0; j < n; j++) {
memmove(&b[j * ldb], &a[j * lda], m * sizeof(f64));
}
}
This correctly handles overlapping memory regions.
Impact
- Severity: Medium
- Behavior: Undefined (may produce incorrect results silently)
- Detection: Only visible with memory debuggers (Valgrind, ASAN)
- Affected versions: All SLICOT versions containing NF01BS
Why It May Not Manifest
- LAPACK's column-by-column copy order may accidentally work for some overlap patterns
- Specific parameter combinations may avoid the overlap entirely
- Resulting numerical errors may be small enough to go unnoticed
Recommendation
The original SLICOT Fortran code should be patched to use a custom copy routine that handles overlapping memory, similar to the C translation fix.
Summary
The SLICOT routine
NF01BS(QR factorization of Jacobian in compressed form) contains a bug whereDLACPYis called with overlapping source and destination memory regions. This is undefined behavior per the LAPACK specification.Bug Location
File:
SLICOT-Reference/src/NF01BS.fLine: 519
Where
IBSN = N*BSN + 1(set at line 518).Root Cause
The routine reshapes matrix
Jfrom leading dimensionLDJ(= M = BNBSM) to leading dimensionN(= BNBSN + ST). WhenLDJ > N, the source and destination regions overlap in memory.Memory Layout
Trigger Conditions
BN > 1(multiple blocks - general case, not special case)BSN > 0(nonlinear part exists)BSM > BSN(more rows than columns per block,MMN > 0)ST > 0(linear part exists)LAPACK DLACPY Specification
From LAPACK documentation:
DLACPY does not handle overlapping memory regions. It uses simple nested loops (effectively
memcpysemantics), notmemmove.Reproduction
Prerequisites
Build
cd tests/fortran gfortran -g -O0 -fcheck=all -Wall -c \ ../../SLICOT-Reference/src/NF01BS.f \ ../../SLICOT-Reference/src/MD03BX.f gfortran -g -O0 -fcheck=all -Wall -o test_nf01bs_overlap \ test_nf01bs_overlap.f NF01BS.o MD03BX.o -llapack -lblasRun with Valgrind
Expected Output
Test Parameters
The test uses parameters that reliably trigger the overlap:
Memory offsets:
LDJ*BSN + 1 = 17(1-based)N*BSN + 1 = 13(1-based)LDJ=8 > N=6Fix in C Translation
The C translation (
src/NF/nf01bs.c) uses a local helper function withmemmove:This correctly handles overlapping memory regions.
Impact
Why It May Not Manifest
Recommendation
The original SLICOT Fortran code should be patched to use a custom copy routine that handles overlapping memory, similar to the C translation fix.