Zero trimmed ranges under verifier flags to make Trim() observable#3
Closed
llersch wants to merge 1 commit into
Closed
Zero trimmed ranges under verifier flags to make Trim() observable#3llersch wants to merge 1 commit into
llersch wants to merge 1 commit into
Conversation
On Linux with ext4/xfs, fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE) causes reads from the punched range to return zeros, so Trim() is already observable there. On macOS, Windows, tmpfs, NFS, and filesystems that do not support PUNCH_HOLE, Trim() returns false and does nothing — meaning the trim path is effectively untested in most CI environments. Under DUCKDB_RUN_SLOW_VERIFIERS or DUCKDB_ALTERNATIVE_VERIFY, explicitly zero the trimmed range with Write() before attempting fallocate. This makes Trim() observable everywhere: any code that reads a trimmed block will see zeros regardless of the underlying filesystem or OS. The two flags are triggered independently and serve different purposes: - DUCKDB_RUN_SLOW_VERIFIERS enables additive extra verification on top of the production code path. Zeroing fits here because it turns a silent no-op (on non-Linux or unsupporting filesystems) into a detectable operation without changing what the production path does. - DUCKDB_ALTERNATIVE_VERIFY selects alternative implementations of the same operation. Zeroing is precisely an alternative implementation of trim: it achieves the same observable effect as PUNCH_HOLE (reads from trimmed blocks return zeros) through explicit writes rather than a kernel hole-punching syscall. The two blocks are independent: on Linux with glibc >= 2.18, both the zeroing and the fallocate execute, exercising the production code path while still making the result observable.
fa147d5 to
69ac9d9
Compare
llersch
pushed a commit
that referenced
this pull request
Apr 17, 2026
Closes duckdb/ducklake#977 When I was using DuckLake I met certain issue ```sh memory D ATTACH ':memory:' AS dl (TYPE ducklake, DATA_INLINING_ROW_LIMIT 1000); memory D CREATE TABLE dl.test(id INTEGER, val TEXT); memory D INSERT INTO dl.test VALUES (1, 'a'), (2, 'b'), (3, 'c'); memory D BEGIN; memory D ALTER TABLE dl.test ALTER COLUMN id TYPE BIGINT; memory D UPDATE dl.test SET val = 'X' WHERE id = 1; ================================================================= ==83726==ERROR: AddressSanitizer: heap-use-after-free on address 0x6160001cffb8 at pc 0x00010bf6c580 bp 0x00016f1a3cf0 sp 0x00016f1a3ce8 READ of size 1 at 0x6160001cffb8 thread T5 #0 0x00010bf6c57c in duckdb::LogicalType::operator==(duckdb::LogicalType const&) const types.cpp:2071 #1 0x000106c89ab8 in duckdb::StatisticsPropagator::CanPropagateCast(duckdb::LogicalType const&, duckdb::LogicalType const&) propagate_cast.cpp:31 #2 0x00010a635dcc in duckdb::TryCastTableFilter(duckdb::TableFilter const&, duckdb::MultiFileIndexMapping&, duckdb::LogicalType const&) multi_file_column_mapper.cpp:1014 #3 0x00010a632458 in duckdb::MultiFileColumnMapper::CreateFilters(std::__1::map<unsigned long long, std::__1::reference_wrapper<duckdb::TableFilter>, std::__1::less<unsigned long long>, std::__1::allocator<std::__1::pair<unsigned long long const, std::__1::reference_wrapper<duckdb::TableFilter>>>>&, duckdb::ResultColumnMapping&) multi_file_column_mapper.cpp:1104 ``` I don't have too much knowledge on `MultiFileColumnMapper`, but considering (1) the invalid memory access is 1 byte; (2) stacktrace code location more or less indicates its logical type reference stored within `MultiFileColumnMapper` gets invalid. Turning it into value fixes the ASAN complaint.
llersch
pushed a commit
that referenced
this pull request
Apr 30, 2026
### Remove copied extensions from release/relassert artifact The tarball for `relassert` contains duplicate extensions: ``` relassert artifact includes: duckdb extension/core_functions/core_functions.duckdb_extension extension/parquet/parquet.duckdb_extension repository/36a968b1bf/linux_amd64/core_functions.duckdb_extension repository/36a968b1bf/linux_amd64/parquet.duckdb_extension src/libduckdb.so test/extension/loadable_extension_demo.duckdb_extension test/extension/loadable_extension_optimizer_demo.duckdb_extension test/unittest relassert artifact size: 470M build/relassert-artifact/relassert/extension/parquet 530M build/relassert-artifact/relassert/extension/core_functions 592M build/relassert-artifact/relassert/src 870M build/relassert-artifact/relassert/test/extension 1000M build/relassert-artifact/relassert/extension 1000M build/relassert-artifact/relassert/repository 1000M build/relassert-artifact/relassert/repository/36a968b1bf 1000M build/relassert-artifact/relassert/repository/36a968b1bf/linux_amd64 1.5G build/relassert-artifact/relassert/test 4.6G build/relassert-artifact/relassert ``` because these two paths: ``` extension/*/*.duckdb_extension repository/36a968b1bf/linux_amd64/*.duckdb_extension ``` appears to contain binary file copies. ### Show symbols for stacktrace in `linux-debug` Before, on `linux-debug`, we would [see](https://github.com/duckdb/duckdb/actions/runs/24786615493/job/72534047134) a traceback with no symbols: ``` build/relassert/test/unittest(+0x4447240) [0x560387001240] build/relassert/test/unittest(+0x7b63367) [0x56038a71d367] build/relassert/test/unittest(+0x7b66aec) [0x56038a720aec] build/relassert/test/unittest(+0x7b7447c) [0x56038a72e47c] build/relassert/test/unittest(+0x7b79e61) [0x56038a733e61] build/relassert/test/unittest(+0x7b722c3) [0x56038a72c2c3] build/relassert/test/unittest(+0x7b75a79) [0x56038a72fa79] build/relassert/test/unittest(+0x7b7c248) [0x56038a736248] build/relassert/test/unittest(+0x7b9b018) [0x56038a755018] build/relassert/test/unittest(+0x301e5d4) [0x560385bd85d4] build/relassert/test/unittest(+0x3ba5b9f) [0x56038675fb9f] build/relassert/test/unittest(+0x3b9e56e) [0x56038675856e] build/relassert/test/unittest(+0x3b9b716) [0x560386755716] build/relassert/test/unittest(+0x3bafee7) [0x560386769ee7] build/relassert/test/unittest(+0x3bad196) [0x560386767196] build/relassert/test/unittest(+0x3bf2c57) [0x5603867acc57] /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x7fc4a01b61ca] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b) [0x7fc4a01b628b] build/relassert/test/unittest(+0x2ee2065) [0x560385a9c065] ``` After, we get a stacktrace with symbols. Use `-rdynamic` as linker flag to fix unused CLI flag warning. ### `-rdynamic` caused ODR violations With `EXPORT_DYNAMIC_SYMBOLS: 1` for `linux-debug`, address sanitizer [finds](https://github.com/duckdb/duckdb/actions/runs/24837199395/job/72703372246?pr=22246#step:9:22) One Definition Rule violations: ``` ==437==ERROR: AddressSanitizer: odr-violation (0x7f85b6a36b60): [1] size=256 'LOOKUP_TABLE' /home/runner/work/duckdb/duckdb/src/function/cast/nested_to_varchar_cast.cpp:5:12 in /home/runner/work/duckdb/duckdb/build/relassert/src/libduckdb.so [2] size=256 'LOOKUP_TABLE' /home/runner/work/duckdb/duckdb/src/function/cast/nested_to_varchar_cast.cpp:5:12 in /home/runner/work/duckdb/duckdb/build/relassert/test/unittest These globals were registered at these points: [1]: #0 0x7f85d158906f in __asan_register_globals ../../../../src/libsanitizer/asan/asan_globals.cpp:350 #1 0x7f85d1c2071e in call_init elf/dl-init.c:74 #2 0x7f85d1c20823 in call_init elf/dl-init.c:120 #3 0x7f85d1c20823 in _dl_init elf/dl-init.c:121 duckdb#4 0x7f85d1c1c5b1 in __GI__dl_catch_exception elf/dl-catch.c:211 duckdb#5 0x7f85d1c27d7b in dl_open_worker elf/dl-open.c:829 duckdb#6 0x7f85d1c27d7b in dl_open_worker elf/dl-open.c:792 duckdb#7 0x7f85d1c1c51b in __GI__dl_catch_exception elf/dl-catch.c:237 duckdb#8 0x7f85d1c28163 in _dl_open elf/dl-open.c:905 duckdb#9 0x7f85d0a0f1a3 in dlopen_doit dlfcn/dlopen.c:56 duckdb#10 0x7f85d1c1c51b in __GI__dl_catch_exception elf/dl-catch.c:237 duckdb#11 0x7f85d1c1c668 in _dl_catch_error elf/dl-catch.c:256 duckdb#12 0x7f85d0a0ec82 in _dlerror_run dlfcn/dlerror.c:138 duckdb#13 0x7f85d0a0f25e in dlopen_implementation dlfcn/dlopen.c:71 duckdb#14 0x7f85d0a0f25e in ___dlopen dlfcn/dlopen.c:81 duckdb#15 0x7f85d15b2f08 in dlopen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6341 duckdb#16 0x55ad3185a30f in AdbcLoadDriver (/home/runner/work/duckdb/duckdb/build/relassert/test/unittest+0xcf3630f) (BuildId: fbeec3ab975633c9b9b6c6233c1545043737a86a) [2]: #0 0x7f85d158906f in __asan_register_globals ../../../../src/libsanitizer/asan/asan_globals.cpp:350 #1 0x7f85d09a1303 in call_init ../csu/libc-start.c:145 #2 0x7f85d09a1303 in __libc_start_main_impl ../csu/libc-start.c:347 #3 0x55ad31808ea4 in _start (/home/runner/work/duckdb/duckdb/build/relassert/test/unittest+0xcee4ea4) (BuildId: fbeec3ab975633c9b9b6c6233c1545043737a86a) ``` The errors are caused by duplicated symbols. The symbols are present in both the unittest binary and the loaded shared object `libduckdb.so`. We can solve the errors by separating unittest object files and linking the `unittest` binary to `libduckdb.so`. ### Differences in compressed (gzip -4) artifacts build type | before | after | reduction -- | -- | -- | -- linux-relassert-build | 1250 MB | 873 MB | 377 MB (30.2%) linux-release-build | 258 MB | 170 MB | 88 MB (34.1%)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
On Linux with ext4/xfs, fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE) causes reads from the punched range to return zeros, so Trim() is already observable there. On macOS, Windows, tmpfs, NFS, and filesystems that do not support PUNCH_HOLE, Trim() returns false and does nothing — meaning the trim path is effectively untested in most CI environments.
Under DUCKDB_RUN_SLOW_VERIFIERS or DUCKDB_ALTERNATIVE_VERIFY, explicitly zero the trimmed range with Write() before attempting fallocate. This makes Trim() observable everywhere: any code that reads a trimmed block will see zeros regardless of the underlying filesystem or OS.
The two flags are triggered independently and serve different purposes:
DUCKDB_RUN_SLOW_VERIFIERS enables additive extra verification on top of the production code path. Zeroing fits here because it turns a silent no-op (on non-Linux or unsupporting filesystems) into a detectable operation without changing what the production path does.
DUCKDB_ALTERNATIVE_VERIFY selects alternative implementations of the same operation. Zeroing is precisely an alternative implementation of trim: it achieves the same observable effect as PUNCH_HOLE (reads from trimmed blocks return zeros) through explicit writes rather than a kernel hole-punching syscall.
The two blocks are independent: on Linux with glibc >= 2.18, both the zeroing and the fallocate execute, exercising the production code path while still making the result observable.