Commit be5da3d

fix: resolve LIKE/GLOB cache collisions, REGEXP error propagation, alias shadowing, and MVCC visibility bugs
Major fixes:
- LIKE/GLOB dynamic pattern cache collision: separate caches for LIKE vs GLOB, include escape char in LIKE cache key, skip fast paths for backslash-escaped patterns, fix trailing backslash handling in like_to_regex
- REGEXP error propagation: add matches_checked()/eval_bool_checked() that return Result instead of swallowing errors. Propagate through FilteredResult, all streaming wrappers, materialization helpers, subquery paths, DML write paths, FFI, and the Rows iterator/advance API
- SELECT alias shadowing WHERE clause: build_alias_map_excluding() skips aliases that shadow real table column names (including unqualified names from joins), preventing incorrect WHERE clause rewriting
- MVCC visibility cache sentinel: change last_txn_id init from -1 to 0 in sum_column/min_column/max_column, fixing incorrect aggregation results after snapshot recovery (RECOVERY_TRANSACTION_ID = -1 collided with the sentinel)
- RowVec/RowIdVec pool re-entrancy: use try_borrow_mut() instead of borrow_mut() to prevent RefCell panics on nested Drop
- View WHERE missing execution context: use RowFilter with ctx instead of FilteredResult::with_defaults for parameterized predicates on views
- EXISTS index-probe missing context: apply execution context to cached predicate filter before evaluation
- INSERT...SELECT atomicity: materialize source query before writes in explicit transactions to prevent partial inserts on late runtime errors
- ScannerResult missing last_error(): bridge scanner.err() to QueryResult trait
- Parallel filter order preservation: use mark-and-extract (bool mask) instead of try_reduce to maintain deterministic row order
- Rows close() on error: properly forward result.close() before setting closed

New test files for regression coverage of all fixed bugs. Updated docs for LIKE ESCAPE, NOT variants, parameterized patterns, and Node.js driver clone() API.
1 parent 77bcab7 commit be5da3d
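
The MVCC sentinel item in the commit message can be illustrated with a small sketch. Only `last_txn_id`, the values `-1` and `0`, and `RECOVERY_TRANSACTION_ID = -1` come from the commit message; the `AggCache` type and its `sum` method are hypothetical stand-ins, and the sketch assumes that no real transaction ever uses id `0`.

```rust
/// Transaction id used during snapshot recovery (value from the commit message).
const RECOVERY_TRANSACTION_ID: i64 = -1;

/// Hypothetical aggregate cache: remembers which transaction filled it.
struct AggCache {
    last_txn_id: i64,
    cached_sum: i64,
}

impl AggCache {
    /// `sentinel` is the "nothing cached yet" marker stored in `last_txn_id`.
    fn new(sentinel: i64) -> Self {
        AggCache { last_txn_id: sentinel, cached_sum: 0 }
    }

    /// Return the cached sum when `txn_id` matches, otherwise recompute and store it.
    fn sum(&mut self, txn_id: i64, values: &[i64]) -> i64 {
        if self.last_txn_id == txn_id {
            // Cache hit, or a false hit when the sentinel equals a real txn id.
            return self.cached_sum;
        }
        self.cached_sum = values.iter().sum();
        self.last_txn_id = txn_id;
        self.cached_sum
    }
}
```

With a sentinel of `-1`, the first lookup made under the recovery transaction looks like a cache hit and returns the empty cache's value. With a sentinel of `0`, the same lookup misses and recomputes.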

35 files changed: +2763 -232 lines changed

docs/_docs/drivers/nodejs.md

Lines changed: 26 additions & 6 deletions
@@ -92,6 +92,7 @@ const db = await Database.open('file:///absolute/path/to/db');
 | Method | Returns | Description |
 |--------|---------|-------------|
 | `Database.open(path)` | `Promise<Database>` | Open a database |
+| `clone()` | `Database` | Clone handle (shared engine, own state) |
 | `execute(sql, params?)` | `Promise<RunResult>` | Execute DML statement |
 | `exec(sql)` | `Promise<void>` | Execute a DDL statement |
 | `query(sql, params?)` | `Promise<Object[]>` | Query rows as objects |
@@ -107,6 +108,7 @@ Sync methods run on the main thread. Faster for simple operations but blocks the
 | Method | Returns | Description |
 |--------|---------|-------------|
 | `Database.openSync(path)` | `Database` | Open a database |
+| `clone()` | `Database` | Clone handle (shared engine, own state) |
 | `executeSync(sql, params?)` | `RunResult` | Execute DML statement |
 | `execSync(sql)` | `void` | Execute a DDL statement |
 | `querySync(sql, params?)` | `Object[]` | Query rows as objects |
@@ -117,7 +119,7 @@ Sync methods run on the main thread. Faster for simple operations but blocks the
 | `prepare(sql)` | `PreparedStatement` | Create a prepared statement |
 | `closeSync()` | `void` | Close the database |

-`RunResult` is `{ changes: number }`:
+`RunResult` is `{ changes: number }`. It can be imported as a type:

 ```ts
 import { Database, RunResult } from '@stoolap/node';
@@ -146,10 +148,10 @@ await db2.close();
 Pass configuration as query parameters in the path:

 ```js
-// Maximum durability (fsync on every WAL write)
+// Maximum durability: fsync on every WAL write
 const db = await Database.open('./mydata?sync=full');

-// High throughput (no fsync, larger buffers)
+// High throughput: no fsync, larger buffers
 const db = await Database.open('./mydata?sync=none&wal_buffer_size=131072');

 // Custom snapshot interval with compression
@@ -167,7 +169,7 @@ Controls the durability vs. performance trade-off:

 | Mode | Value | Description |
 |------|-------|-------------|
-| `none` | `sync=none` | No fsync. Fastest, but data may be lost on crash |
+| `none` | `sync=none` | No fsync. Fastest, data may be lost on crash |
 | `normal` | `sync=normal` | Fsync on commit batches. Good balance (default) |
 | `full` | `sync=full` | Fsync on every WAL write. Slowest, maximum durability |

@@ -185,9 +187,27 @@ Controls the durability vs. performance trade-off:
 | `sync_interval_ms` | `10` | Minimum ms between syncs (normal mode) |
 | `wal_compression` | `on` | LZ4 compression for WAL entries |
 | `snapshot_compression` | `on` | LZ4 compression for snapshots |
-| `compression` | -- | Set both `wal_compression` and `snapshot_compression` |
+| `compression` | | Set both `wal_compression` and `snapshot_compression` |
 | `compression_threshold` | `64` | Minimum bytes before compressing an entry |

+## Cloning
+
+`clone()` creates a new `Database` handle that shares the same underlying engine (data, indexes, transactions) but has its own executor and error state. Useful for concurrent access patterns such as worker threads.
+
+```js
+const db = await Database.open('./mydata');
+const db2 = db.clone();
+
+// Both see the same data
+await db.execute('INSERT INTO users VALUES ($1, $2)', [1, 'Alice']);
+const row = db2.queryOneSync('SELECT * FROM users WHERE id = $1', [1]);
+// { id: 1, name: 'Alice' }
+
+// Each clone must be closed independently
+await db2.close();
+await db.close();
+```
+
 ## Raw Query Format

 `queryRaw` / `queryRawSync` return `{ columns: string[], rows: any[][] }` instead of an array of objects. Faster when you don't need named keys.
@@ -413,7 +433,7 @@ await db.exec('CREATE TABLE embeddings (id INTEGER PRIMARY KEY, vec VECTOR(3))')
 // Insert vectors via SQL string literals
 await db.execute("INSERT INTO embeddings VALUES (1, '[0.1, 0.2, 0.3]')");

-// Query vectors are returned as Float32Array
+// Query: vectors are returned as Float32Array
 const row = await db.queryOne('SELECT vec FROM embeddings WHERE id = 1');
 console.log(row.vec); // Float32Array(3) [0.1, 0.2, 0.3]
 console.log(row.vec instanceof Float32Array); // true

docs/_docs/sql-features/operators-expressions.md

Lines changed: 44 additions & 1 deletion
@@ -82,6 +82,9 @@ SELECT * FROM products WHERE name LIKE '%Pro%'; -- Contains 'Pro'

 -- _ matches any single character
 SELECT * FROM products WHERE code LIKE 'A_C'; -- Matches 'ABC', 'A1C', etc.
+
+-- NOT LIKE
+SELECT * FROM products WHERE name NOT LIKE '%test%';
 ```

 #### ILIKE (Case-Insensitive)
@@ -90,8 +93,25 @@ SELECT * FROM products WHERE code LIKE 'A_C'; -- Matches 'ABC', 'A1C', e
 -- Same as LIKE but ignores case
 SELECT * FROM products WHERE name ILIKE 'apple%'; -- Matches 'Apple', 'APPLE', 'apple'
 SELECT * FROM users WHERE email ILIKE '%@gmail.com';
+
+-- NOT ILIKE
+SELECT * FROM users WHERE name NOT ILIKE 'admin%';
+```
+
+#### LIKE with ESCAPE
+
+Use the ESCAPE clause when your pattern needs to match literal `%` or `_` characters:
+
+```sql
+-- Match values containing a literal '%' character
+SELECT * FROM metrics WHERE label LIKE '%!%%' ESCAPE '!';
+
+-- Match values containing a literal '_' character
+SELECT * FROM codes WHERE code LIKE 'A!_B' ESCAPE '!';
 ```

+The escape character can be any single character. The character immediately after the escape is treated as a literal instead of a wildcard.
+
 #### GLOB (Shell-Style Patterns)

 ```sql
@@ -103,15 +123,38 @@ SELECT * FROM files WHERE name GLOB 'file?.dat';

 -- [...] matches any character in the set
 SELECT * FROM files WHERE name GLOB '[abc]*';
+
+-- NOT GLOB
+SELECT * FROM files WHERE name NOT GLOB '*.tmp';
 ```

 #### REGEXP (Regular Expressions)

 ```sql
--- Full regex pattern matching
+-- Full regex pattern matching (Rust regex syntax)
 SELECT * FROM logs WHERE message REGEXP 'error|warning';
 SELECT * FROM users WHERE email REGEXP '^[a-z]+@[a-z]+\.[a-z]+$';
 SELECT * FROM data WHERE value REGEXP '[0-9]{3}-[0-9]{4}';
+
+-- NOT REGEXP
+SELECT * FROM logs WHERE message NOT REGEXP 'debug|trace';
+```
+
+Invalid regex patterns return an error instead of silently matching nothing.
+
+#### Parameterized Patterns
+
+All pattern matching operators support parameterized patterns. The pattern is compiled once and reused for every row:
+
+```sql
+-- LIKE with parameter
+SELECT * FROM products WHERE name LIKE $1;
+
+-- REGEXP with parameter
+SELECT * FROM logs WHERE message REGEXP $1;
+
+-- GLOB with parameter
+SELECT * FROM files WHERE name GLOB $1;
 ```

 ## Range Operators
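
The ESCAPE rule documented in this file (the character after the escape is a literal, `%` and `_` are wildcards otherwise) can be sketched as a LIKE-to-regex translation. This is a minimal illustration, not the engine's `like_to_regex`: `like_to_regex_sketch` and `push_literal` are hypothetical names, and treating a trailing escape character as a literal is an assumption made here.

```rust
/// Translate a SQL LIKE pattern into an anchored regex string.
/// Hypothetical helper for illustration only.
fn like_to_regex_sketch(pattern: &str, escape: Option<char>) -> String {
    // (?s) lets `.` match newlines so `%` and `_` can span any character.
    let mut out = String::from("(?s)^");
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        if Some(c) == escape {
            // The character after the escape is taken literally. A trailing
            // escape with nothing after it is kept as a literal (assumption).
            let lit = chars.next().unwrap_or(c);
            push_literal(&mut out, lit);
        } else if c == '%' {
            out.push_str(".*"); // any run of characters
        } else if c == '_' {
            out.push('.'); // exactly one character
        } else {
            push_literal(&mut out, c);
        }
    }
    out.push('$');
    out
}

/// Append `c`, backslash-escaping it when it is a regex metacharacter.
fn push_literal(out: &mut String, c: char) {
    if "\\.+*?()|[]{}^$#&-~".contains(c) {
        out.push('\\');
    }
    out.push(c);
}
```

Because the same pattern text yields different regexes under different escape characters, a cache keyed on the pattern alone could return the wrong compiled matcher. That is consistent with the commit's decision to include the escape character in the LIKE cache key.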

src/api/rows.rs

Lines changed: 28 additions & 1 deletion
@@ -226,6 +226,8 @@ pub struct Rows {
     /// Shared column names (Arc to avoid cloning per row)
     columns: CompactArc<Vec<String>>,
     closed: bool,
+    /// Pending error from a filter runtime failure (e.g., invalid REGEXP)
+    pending_error: Option<crate::core::Error>,
 }

 impl Rows {
@@ -239,6 +241,7 @@ impl Rows {
             result,
             columns,
             closed: false,
+            pending_error: None,
         }
     }

@@ -269,7 +272,26 @@ impl Rows {
         if self.closed {
             return false;
         }
-        self.result.next()
+        if self.result.next() {
+            return true;
+        }
+        // Check for runtime filter errors (e.g., invalid REGEXP)
+        if let Some(err) = self.result.last_error() {
+            self.pending_error = Some(err);
+        }
+        // Clean up resources (scanner close, etc.) now that iteration is done
+        self.close();
+        false
+    }
+
+    /// Return any runtime error that caused `advance()` to return false.
+    ///
+    /// After `advance()` returns `false`, call this to distinguish between
+    /// normal end-of-stream (returns `None`) and a runtime filter error
+    /// like an invalid parameterized REGEXP (returns `Some(error)`).
+    #[inline]
+    pub fn error(&mut self) -> Option<crate::core::Error> {
+        self.pending_error.take()
     }

     /// Get a reference to the current row (after a successful `advance()`).
@@ -318,6 +340,11 @@ impl Iterator for Rows {
             let row = self.result.take_row();
             // Arc clone is O(1) - just increments reference count
             Some(Ok(ResultRow::new(row, CompactArc::clone(&self.columns))))
+        } else if let Some(err) = self.result.last_error() {
+            // Surface runtime errors (e.g. invalid REGEXP pattern).
+            // close() forwards to result.close() for proper scanner cleanup
+            self.close();
+            Some(Err(err))
         } else {
             None
         }
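
The `advance()`/`error()` contract added here implies a caller-side pattern: drain the cursor, then ask whether iteration stopped because of an error. The sketch below models that with stand-in types. `MockResult`, `Cursor`, and `count_rows` are hypothetical, errors are plain `String`s rather than `crate::core::Error`, and closing is reduced to setting a flag.

```rust
/// Hypothetical stand-in for the engine's result type: yields rows until an
/// optional failure position, then reports the failure via `last_error()`.
struct MockResult {
    rows: Vec<i64>,
    pos: usize,
    fail_at: Option<usize>,
    error: Option<String>,
}

impl MockResult {
    fn next(&mut self) -> bool {
        if Some(self.pos) == self.fail_at {
            self.error = Some("invalid REGEXP pattern".to_string());
            return false;
        }
        if self.pos < self.rows.len() {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    fn last_error(&self) -> Option<String> {
        self.error.clone()
    }
}

/// Cursor mirroring the advance()/error() split shown in the diff.
struct Cursor {
    result: MockResult,
    closed: bool,
    pending_error: Option<String>,
}

impl Cursor {
    fn new(rows: Vec<i64>, fail_at: Option<usize>) -> Self {
        let result = MockResult { rows, pos: 0, fail_at, error: None };
        Cursor { result, closed: false, pending_error: None }
    }

    fn advance(&mut self) -> bool {
        if self.closed {
            return false;
        }
        if self.result.next() {
            return true;
        }
        // Remember why iteration stopped, then mark the cursor closed.
        if let Some(err) = self.result.last_error() {
            self.pending_error = Some(err);
        }
        self.closed = true;
        false
    }

    /// Take the pending error, if any; a second call returns `None`.
    fn error(&mut self) -> Option<String> {
        self.pending_error.take()
    }
}

/// Caller-side pattern: drain, then separate end-of-stream from failure.
fn count_rows(cursor: &mut Cursor) -> Result<usize, String> {
    let mut n = 0;
    while cursor.advance() {
        n += 1;
    }
    match cursor.error() {
        Some(err) => Err(err),
        None => Ok(n),
    }
}
```

Since `error()` uses `take()`, the error is reported at most once, so a caller that needs it later has to store the returned value.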

src/api/transaction.rs

Lines changed: 1 addition & 1 deletion
@@ -336,7 +336,7 @@ impl Transaction {
         let mut setter = |row: Row| -> Result<(Row, bool)> {
             // Check WHERE clause if present (uses thread-local VM internally)
             if let Some(ref filter) = where_filter {
-                if !filter.matches(&row) {
+                if !filter.matches_checked(&row)? {
                     return Ok((row, false));
                 }
             }
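
The switch from `matches()` to `matches_checked()?` changes what happens when a predicate fails at runtime. The sketch below assumes a toy `Filter` whose `broken` flag stands in for a REGEXP pattern that fails to compile. The type, its fields, and `filter_rows` are hypothetical; only the two method names come from the commit.

```rust
/// Hypothetical predicate that can fail at runtime.
struct Filter {
    min: i64,
    broken: bool,
}

impl Filter {
    /// Checked evaluation: runtime failures are returned to the caller.
    fn matches_checked(&self, row: i64) -> Result<bool, String> {
        if self.broken {
            return Err("invalid REGEXP pattern".to_string());
        }
        Ok(row >= self.min)
    }

    /// Unchecked evaluation: a failure is indistinguishable from "no match".
    fn matches(&self, row: i64) -> bool {
        self.matches_checked(row).unwrap_or(false)
    }
}

/// Filtering loop in the style of the patched call sites: `?` aborts on the
/// first error instead of silently dropping the row.
fn filter_rows(rows: &[i64], filter: &Filter) -> Result<Vec<i64>, String> {
    let mut kept = Vec::new();
    for &row in rows {
        if filter.matches_checked(row)? {
            kept.push(row);
        }
    }
    Ok(kept)
}
```

On a write path the difference matters more than on a read path: with the unchecked form, an UPDATE whose WHERE clause cannot be evaluated would report zero affected rows rather than an error.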

src/core/row_vec.rs

Lines changed: 28 additions & 8 deletions
@@ -46,7 +46,9 @@ thread_local! {
 #[inline]
 pub fn clear_row_vec_pool() {
     ROW_VEC_POOL.with(|pool| {
-        pool.borrow_mut().clear();
+        if let Ok(mut p) = pool.try_borrow_mut() {
+            p.clear();
+        }
     });
 }

@@ -284,7 +286,9 @@ impl RowVec {
     /// Takes the largest available buffer (end of sorted list).
     #[inline]
     pub fn new() -> Self {
-        let v = ROW_VEC_POOL.with(|pool| pool.borrow_mut().pop());
+        // Use try_borrow_mut to avoid panic on re-entrant access
+        // (can happen if Drop triggers nested RowVec creation on the same thread)
+        let v = ROW_VEC_POOL.with(|pool| pool.try_borrow_mut().ok().and_then(|mut p| p.pop()));
         match v {
             Some(buf) => {
                 track_hit!(buf.capacity());
@@ -303,8 +307,12 @@ impl RowVec {
     /// Uses binary search to find smallest buffer >= requested capacity.
     #[inline]
     pub fn with_capacity(capacity: usize) -> Self {
+        // Use try_borrow_mut to avoid panic on re-entrant access
         let v = ROW_VEC_POOL.with(|pool| {
-            let mut pool = pool.borrow_mut();
+            let mut pool = match pool.try_borrow_mut() {
+                Ok(p) => p,
+                Err(_) => return None, // Pool already borrowed — allocate fresh
+            };
             if pool.is_empty() {
                 return None;
             }
@@ -462,7 +470,11 @@ impl Drop for RowVec {
             }
             v.clear();
             ROW_VEC_POOL.with(|pool| {
-                let mut pool = pool.borrow_mut();
+                // Use try_borrow_mut to avoid panic on re-entrant access
+                let mut pool = match pool.try_borrow_mut() {
+                    Ok(p) => p,
+                    Err(_) => return, // Pool already borrowed — let buffer deallocate
+                };
                 if pool.len() < POOL_SIZE {
                     // Pool has room - insert in sorted position (by capacity, ascending)
                     let insert_idx = pool.partition_point(|b| b.capacity() < cap);
@@ -645,7 +657,9 @@ thread_local! {
 #[inline]
 pub fn clear_row_id_vec_pool() {
     ROW_ID_VEC_POOL.with(|pool| {
-        pool.borrow_mut().clear();
+        if let Ok(mut p) = pool.try_borrow_mut() {
+            p.clear();
+        }
     });
 }

@@ -663,7 +677,7 @@ impl RowIdVec {
     /// Takes the largest available buffer (end of sorted list).
     #[inline]
     pub fn new() -> Self {
-        let v = ROW_ID_VEC_POOL.with(|pool| pool.borrow_mut().pop());
+        let v = ROW_ID_VEC_POOL.with(|pool| pool.try_borrow_mut().ok().and_then(|mut p| p.pop()));
         match v {
             Some(buf) => Self { inner: Some(buf) },
             None => Self {
@@ -677,7 +691,10 @@ impl RowIdVec {
     #[inline]
     pub fn with_capacity(capacity: usize) -> Self {
         let v = ROW_ID_VEC_POOL.with(|pool| {
-            let mut pool = pool.borrow_mut();
+            let mut pool = match pool.try_borrow_mut() {
+                Ok(p) => p,
+                Err(_) => return None,
+            };
             if pool.is_empty() {
                 return None;
             }
@@ -832,7 +849,10 @@ impl Drop for RowIdVec {
             }
             v.clear();
             ROW_ID_VEC_POOL.with(|pool| {
-                let mut pool = pool.borrow_mut();
+                let mut pool = match pool.try_borrow_mut() {
+                    Ok(p) => p,
+                    Err(_) => return,
+                };
                 if pool.len() < ROW_ID_POOL_SIZE {
                     // Pool has room - insert in sorted position (by capacity, ascending)
                     let insert_idx = pool.partition_point(|b| b.capacity() < cap);
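
The re-entrancy fix relies on `RefCell::try_borrow_mut` returning an `Err` where `borrow_mut` would panic. A reduced sketch of the same fallback follows, using a hypothetical byte-buffer pool in place of `ROW_VEC_POOL`. The names `POOL`, `take_buffer`, `return_buffer`, and `return_while_borrowed` are illustrative, and the capacity-sorted insertion from the real pool is omitted.

```rust
use std::cell::RefCell;

thread_local! {
    /// Hypothetical buffer pool, standing in for ROW_VEC_POOL.
    static POOL: RefCell<Vec<Vec<u8>>> = RefCell::new(Vec::new());
}

/// Take a buffer from the pool, or allocate a fresh one when the pool is
/// empty or already mutably borrowed further up the call stack.
fn take_buffer() -> Vec<u8> {
    POOL.with(|pool| pool.try_borrow_mut().ok().and_then(|mut p| p.pop()))
        .unwrap_or_default()
}

/// Return a buffer to the pool. If the pool is busy, the buffer is simply
/// dropped. The result reports whether the buffer was pooled.
fn return_buffer(mut buf: Vec<u8>) -> bool {
    buf.clear();
    POOL.with(|pool| match pool.try_borrow_mut() {
        Ok(mut p) => {
            p.push(buf);
            true
        }
        Err(_) => false,
    })
}

/// Simulate re-entrancy: return a buffer while the pool is still borrowed.
/// With `borrow_mut()` inside `return_buffer` this call would panic.
fn return_while_borrowed() -> bool {
    POOL.with(|pool| {
        let _guard = pool.borrow_mut();
        return_buffer(Vec::with_capacity(16))
    })
}
```

The fallback trades a missed pooling opportunity for safety: in the contended case a buffer is allocated or freed normally, which costs an allocation but keeps the thread from panicking.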

src/executor/aggregation.rs

Lines changed: 8 additions & 2 deletions
@@ -1199,7 +1199,7 @@ impl Executor {

         // Filter rows using the pre-compiled filter
         for (id, row) in result_rows {
-            if having_filter.matches(&row) {
+            if having_filter.matches_checked(&row)? {
                 result_rows_with_ids.push((id, row));
             }
         }
@@ -4736,7 +4736,7 @@ impl Executor {
         let mut filtered_rows = RowVec::new();
         let mut new_id = 0i64;
         for (_, row) in rows {
-            if having_filter.matches(&row) {
+            if having_filter.matches_checked(&row)? {
                 filtered_rows.push((new_id, row));
                 new_id += 1;
             }
@@ -5741,6 +5741,9 @@ impl Executor {

         // Sample first row to detect key type
         if !result.next() {
+            if let Some(err) = result.last_error() {
+                return Err(err);
+            }
             // Empty result - return empty aggregation
             let mut result_columns = Vec::with_capacity(1 + aggregations.len());
             result_columns.push(group_col_name.clone());
@@ -5875,6 +5878,9 @@ impl Executor {
             let row = result.row();
             process_row(row, &mut groups, &mut null_group, &state_template);
         }
+        if let Some(err) = result.last_error() {
+            return Err(err);
+        }

         // Build result columns
         let mut result_columns = Vec::with_capacity(1 + aggregations.len());

src/executor/cte.rs

Lines changed: 3 additions & 3 deletions
@@ -798,7 +798,7 @@ impl Executor {

         let mut rows = RowVec::with_capacity(result_rows.len());
         for (row_id, row) in result_rows {
-            if where_eval.eval_bool(&row) {
+            if where_eval.eval_bool_checked(&row)? {
                 rows.push((row_id, row));
             }
         }
@@ -1123,7 +1123,7 @@ impl Executor {
         let mut result = RowVec::new();
         let mut row_id = 0i64;
         for (_, row) in cte_rows {
-            if eval.eval_bool(&row) {
+            if eval.eval_bool_checked(&row)? {
                 result.push((row_id, row));
                 row_id += 1;
             }
@@ -2296,7 +2296,7 @@ impl Executor {

         let mut rows = RowVec::with_capacity(result_rows.len());
         for (row_id, row) in result_rows {
-            if where_eval.eval_bool(&row) {
+            if where_eval.eval_bool_checked(&row)? {
                 rows.push((row_id, row));
             }
         }

src/executor/ddl.rs

Lines changed: 3 additions & 0 deletions
@@ -498,6 +498,9 @@ impl Executor {
         while result.next() {
             rows.push(result.take_row());
         }
+        if let Some(err) = result.last_error() {
+            return Err(err);
+        }

         // Infer schema from the result columns and first row (if available)
         let mut schema_builder = SchemaBuilder::new(table_name);
