Skip to content

SQL planner correctness & procedural execution safety (8 sub-items) #47

@hollanf

Description

@hollanf

Eight distinct bugs across the SQL planner, plan-conversion, and procedural executor. Two are security-adjacent (RLS predicate corruption in #7 from the companion parser epic, trigger fuel in #3 here, procedure cache collision in #7). The rest are correctness bugs that return wrong rows or silently drop user-written constraints.


1. const_fold::fold_binary uses unchecked integer arithmetic → debug panic, release silent wrap

File: nodedb-sql/src/planner/const_fold.rs:67-72

(SqlValue::Int(a), BinaryOp::Add, SqlValue::Int(b)) => SqlValue::Int(a + b),
(SqlValue::Int(a), BinaryOp::Sub, SqlValue::Int(b)) => SqlValue::Int(a - b),
(SqlValue::Int(a), BinaryOp::Mul, SqlValue::Int(b)) => SqlValue::Int(a * b),

No checked_*. Debug build panics (attempt to add with overflow) and drops the connection; release build silently wraps, embedding the wrong constant in the plan. Postgres raises integer out of range.

Repro:

SELECT 9223372036854775807 + 1;
INSERT INTO t(id, v) VALUES ('k', 9223372036854775807 * 2);

2. Procedural executor eval_binary_op panics on i64 overflow; float divide-by-zero returns silent ±Inf

File: nodedb/src/control/planner/procedural/executor/eval.rs:176-202

(Value::Integer(a), Value::Integer(b)) => match op {
    Plus => Some(Value::Integer(a + b)),
    Minus => Some(Value::Integer(a - b)),
    Multiply => Some(Value::Integer(a * b)),
    Divide if *b != 0 => Some(Value::Integer(a / b)),    // still panics on i64::MIN / -1
    ...
(Value::Float(a), Value::Float(b)) => match op {
    ...
    Divide if *b != 0.0 => Some(Value::Float(a / b)),    // -0.0 passes the guard

Trigger / stored-procedure bodies that go through ASSIGN/IF/FOR bounds reach this path. At i64::MAX, a trivial IF x + 1 > 0 … panics debug workers, wraps to negative in release (flipping the branch). Integer i64::MIN / -1 also panics. Float / -0.0 returns -inf, and evaluate_condition treats Infinity as truthy.

Repro:

DO $$
DECLARE x BIGINT := 9223372036854775807;
BEGIN
  IF x + 1 > 0 THEN RAISE EXCEPTION 'bad'; END IF;
END$$;

3. Triggers run with effectively unlimited fuel (u64::MAX) + 1-hour deadline

Files:

pub fn unlimited() -> Self {
    Self {
        fuel_remaining: u64::MAX,
        deadline: Instant::now() + std::time::Duration::from_secs(3600),
        max_iterations: u64::MAX,
        timeout_secs: 3600,
    }
}

execute_block (triggers) calls ExecutionBudget::unlimited() unconditionally. consume_iteration only decrements once per loop iteration, and with u64::MAX fuel the loop runs until the 3600-second wall clock — per triggered row. A trivial LOOP ; END LOOP; trigger body freezes a Control Plane worker for an hour and compounds under concurrency. No operator-facing knob caps the fuel.

Repro:

CREATE TABLE t (id TEXT PRIMARY KEY, v INT);
CREATE TRIGGER tr AFTER INSERT ON t FOR EACH ROW EXECUTE PROCEDURE
  $$BEGIN LOOP END LOOP; END$$;
INSERT INTO t(id, v) VALUES ('a', 1);
-- worker pinned for 1 hour, connection hangs, another insert pins another worker

4. convert_join RIGHT-JOIN swap forgets to swap inline_left / inline_right

File: nodedb/src/control/planner/sql_plan_convert/scan.rs:172-189

let inline_left  = if matches!(left, SqlPlan::Join { .. }) { ... } else { None };
let inline_right = super::aggregate::inline_join_side(right, tenant_id, ctx)?;

let mut on_keys = on.to_vec();
let effective_join_type = if join_type.as_str() == "right" {
    std::mem::swap(&mut left_collection, &mut right_collection);
    std::mem::swap(&mut left_alias, &mut right_alias);
    on_keys = on_keys.into_iter().map(|(l, r)| (r, l)).collect();
    "left".to_string()
} ...

For a RIGHT JOIN the collections/aliases/keys are swapped to "left join" form, but inline_left/inline_right were computed from the original plans and are not swapped — so after the rewrite, left_collection holds what was originally the right side while inline_left points to the original left plan. Any RIGHT JOIN with a nested join on the left produces wrong output because the hash/merge join iterates the wrong side as the build/probe input. Additionally, inner_tasks.into_iter().next() at line 174 silently drops all but the first task if the nested join returned more than one.

Repro:

CREATE TABLE t1 (id TEXT PRIMARY KEY, c TEXT);
CREATE TABLE t2 (id TEXT PRIMARY KEY, t1_id TEXT);
CREATE TABLE t3 (id TEXT PRIMARY KEY, t2_id TEXT);
SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.t1_id
  RIGHT JOIN t3 ON t2.id = t3.t2_id;

Compare against a known-correct engine; row set differs.


5. plan_recursive_cte collapses the UNION into a flat RecursiveScan and hardcodes max_iterations: 100

File: nodedb-sql/src/planner/cte.rs:34-95

let base = plan_cte_branch(left, catalog, functions)?;
let recursive = plan_cte_branch(right, catalog, functions)?;
let collection = extract_collection(&base)
    .or_else(|| extract_collection(&recursive))
    .unwrap_or_default();
Ok(SqlPlan::RecursiveScan {
    collection,
    base_filters: extract_filters(&base),
    recursive_filters: extract_filters(&recursive),
    max_iterations: 100,      // hardcoded
    distinct: true,           // hardcoded — breaks UNION ALL semantics
    limit: 10000,
})

The recursive branch of a WITH RECURSIVE CTE always references the CTE (typically via a join on the CTE name). This planner extracts only the underlying collection and a flat filters list from each branch, so:

  1. Projections/aliases in the CTE column list are dropped.
  2. Joins inside the recursive branch are thrown away — the engine just re-filters the base table N times, not the recursive result.
  3. max_iterations: 100 is a literal — no OPTION (MAXRECURSION …) / SET override; deep recursion returns partial results silently.
  4. distinct: true overrides any UNION ALL the user wrote — duplicates that should be preserved are removed.

Repro:

WITH RECURSIVE c(n) AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM c WHERE n < 5
)
SELECT * FROM c;
-- Expected: 1,2,3,4,5 (5 rows). Actual: varies — often just base row, or wrong collection.

6. extract_join_constraint keeps only the first non-equi predicate; NATURAL JOIN silently becomes cartesian

File: nodedb-sql/src/planner/join.rs:198-224

let cond = if non_equi.is_empty() {
    None
} else {
    Some(convert_expr(non_equi.first().unwrap())?)
};
...
ast::JoinConstraint::Natural => Ok((Vec::new(), None)),
ast::JoinConstraint::None    => Ok((Vec::new(), None)),

extract_equi_keys walks top-level ANDs, collecting col = col pairs in keys and everything else in non_equi. Only non_equi.first() is attached as the join condition; every subsequent non-equi predicate is silently dropped. Rows the user filtered out survive the join.

The Natural / None arms return empty keys + None condition — i.e. cartesian product with no error raised.

Repros:

-- Second non-equi condition silently dropped:
SELECT * FROM t1 JOIN t2
  ON t1.id = t2.id AND t1.x > t2.x AND t1.y < t2.y;

-- Cartesian product, no error:
SELECT * FROM t1 NATURAL JOIN t2;

7. ProcedureBlockCache keys on 64-bit hash with no fallback equality check → collisions return the wrong body

File: nodedb/src/control/planner/procedural/executor/plan_cache.rs:29-113

struct CacheEntry { block: Arc<ProceduralBlock> }    // ← no body_sql stored
...
if let Some(entry) = inner.entries.get(&key) {
    return Ok(Arc::clone(&entry.block));
}
...
fn hash_body(body_sql: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    body_sql.hash(&mut hasher);
    hasher.finish()
}

Two trigger/procedure bodies whose 64-bit SipHash output collides cause the cache to hand back the previously-cached block for a completely different body. Entries store only the parsed block, never the source string, so there is no equality check to catch the collision. Symptom is catastrophic: e.g. a BEFORE UPDATE trigger runs the compiled body of a different AFTER INSERT trigger.


8. evaluate_default_expr silently drops the column if DEFAULT isn't in a fixed keyword list

File: nodedb/src/control/planner/sql_plan_convert/value/defaults.rs:7-71 + caller at nodedb/src/control/planner/sql_plan_convert/dml.rs:37-43

match upper.as_str() {
    "UUID_V7" | ... => Some(...),
    "NOW()" => Some(...),
    _ => parse_parametric_or_literal(expr, &upper),
}

Only UUID_V7, NOW(), NANOID(N), CUID2(N), integer, float, and bare-quoted string are recognised. Any other DDL default (DEFAULT upper('x'), DEFAULT current_timestamp, DEFAULT date_add(now(),'1d'), DEFAULT gen_random_uuid()) returns None, and the caller in convert_insert treats None as "no default" → the column is omitted from the row:

if !expanded.iter().any(|(k, _)| k == col_name)
    && let Some(val) = super::value::evaluate_default_expr(default_expr)
{
    expanded.push((col_name.clone(), ...));
}

INSERT succeeds silently with the column missing.

Repro:

CREATE TABLE t (id TEXT PRIMARY KEY, a TEXT DEFAULT upper('x'));
INSERT INTO t(id) VALUES ('k');
-- Expected: a = 'X'. Actual: a is absent / null.

Checklist

  • 1. const_fold::fold_binary use checked_*; return error or keep expression un-folded on overflow
  • 2. Procedural eval_binary_op use checked_*; explicit divide-by-zero error including -0.0; guard i64::MIN / -1
  • 3. Cap trigger fuel by tenant config; remove ExecutionBudget::unlimited() for user-supplied bodies
  • 4. Swap inline_left/inline_right alongside the collections/aliases in RIGHT-JOIN rewrite; handle multi-task nested-join output
  • 5. Preserve CTE structure (projections, joins) and expose max_iterations / distinct as plan parameters
  • 6. Attach every non-equi predicate as AND-of-conditions; raise an error (or compute NATURAL join column list) for Natural/None
  • 7. Store body_sql in CacheEntry and compare on hit; or switch to HashMap<String, _>
  • 8. Route unknown DEFAULT expressions through the full planner/const-folder instead of a keyword whitelist

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions