feat(core,py): bulk-insert primitives — 2.4.0 / 0.3.0 by oldnordic · Pull Request #5 · oldnordic/sqlitegraph

oldnordic · 2026-05-15T23:41:34Z

Summary

Adds SqliteGraph::insert_entities_bulk / insert_edges_bulk: each runs in a single transaction, reuses a prepare_cached statement, and rolls back on error. Empty input is a no-op. Returns rowids in input order.
Adds GraphBackend::insert_nodes_bulk / insert_edges_bulk trait methods with default implementations that loop the single-insert path, so every existing GraphBackend consumer keeps working at 2.3 → 2.4 with no source changes. The &B blanket forwarders are wired through.
SqliteGraphBackend overrides both, dispatching to the new SqliteGraph bulk paths. Publisher events fire per row after commit (preserves single-insert observer semantics).
Python: Graph.add_nodes_bulk(items: list[dict]) and Graph.add_edges_bulk(items: list[dict]) — same field set as kwargs-style add_node/add_edge, one FFI call per batch.
Versions bumped: sqlitegraph-core 2.3.0 → 2.4.0, sqlitegraph-py 0.2.0 → 0.3.0 (SemVer minor — additive, no breakage).
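
The default-implementation pattern described above can be sketched in Python. This is an illustration only, not the crate's code: the class names, the dict-based node shape, and the on_insert callback are hypothetical stand-ins for the GraphBackend trait, its implementors, and the publisher.

```python
from typing import Callable


class GraphBackend:
    """Hypothetical stand-in for the GraphBackend trait."""

    def insert_node(self, node: dict) -> int:
        raise NotImplementedError

    def insert_nodes_bulk(self, nodes: list[dict]) -> list[int]:
        # Default implementation: loop the single-insert path, so a backend
        # that only defines insert_node keeps working unchanged.
        return [self.insert_node(node) for node in nodes]


class LoopingBackend(GraphBackend):
    """Defines only insert_node and inherits the default bulk method."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    def insert_node(self, node: dict) -> int:
        self.rows.append(node)
        return len(self.rows)  # 1-based id, in insertion order


class BatchingBackend(GraphBackend):
    """Overrides the bulk method and notifies observers once per row."""

    def __init__(self, on_insert: Callable[[int], None]) -> None:
        self.rows: list[dict] = []
        self.on_insert = on_insert

    def insert_node(self, node: dict) -> int:
        self.rows.append(node)
        node_id = len(self.rows)
        self.on_insert(node_id)
        return node_id

    def insert_nodes_bulk(self, nodes: list[dict]) -> list[int]:
        start = len(self.rows)
        self.rows.extend(nodes)  # stands in for the single-transaction write
        ids = list(range(start + 1, start + len(nodes) + 1))
        # Events fire per row, only after the whole batch has been stored.
        for node_id in ids:
            self.on_insert(node_id)
        return ids
```

Callers can use insert_nodes_bulk on either backend. Only the override changes how the rows are written, and observers still see one event per row.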

Closes the 8x build-time gap downstream consumers see when loading large graphs from grounded-index DBs (the Python->Rust FFI call per add_node was the bottleneck). Benchmarks will be re-run in the grounded-graph repo after the wheels land on PyPI.

Implementation notes

All-or-nothing semantics on the bulk paths: validation first (rejects empty names / unknown endpoints / etc.), then BEGIN, then loop, then COMMIT. On any error: ROLLBACK and return the original error. Default trait impl inherits whatever atomicity single-insert provides.
prepare_cached is reused across rows, so the SQL statement is parsed once per batch (not once per row).
V3Backend inherits the default loop impl — a follow-up patch can route V3's bulk path through the existing WriteBatchGuard for native batched writes.
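
The all-or-nothing flow in these notes (validate, BEGIN, loop, COMMIT, ROLLBACK on error) can be sketched with Python's standard sqlite3 module. This is a minimal sketch, not the Rust implementation: the entities table, its columns, and the UNIQUE constraint are assumptions made for the example. Python's sqlite3 keeps an internal statement cache, which plays a role roughly analogous to prepare_cached here.

```python
import sqlite3


def insert_entities_bulk(conn: sqlite3.Connection, items: list[dict]) -> list[int]:
    """Insert every item or none; return rowids in input order.

    Expects a connection opened with isolation_level=None so that
    BEGIN / COMMIT / ROLLBACK are issued explicitly.
    """
    if not items:
        return []  # empty input: no transaction is opened

    # Validate before touching the database.
    for item in items:
        if not item.get("name"):
            raise ValueError("entity name must not be empty")

    ids: list[int] = []
    conn.execute("BEGIN")
    try:
        for item in items:
            # Same SQL text every iteration, so the cached statement is reused.
            cur = conn.execute(
                "INSERT INTO entities (kind, name) VALUES (?, ?)",
                (item["kind"], item["name"]),
            )
            ids.append(cur.lastrowid)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # leave the database untouched
        raise  # surface the original error
    return ids


# Hypothetical schema, used only for this example.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE entities ("
    "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, name TEXT NOT NULL UNIQUE)"
)
ids = insert_entities_bulk(
    conn, [{"kind": "fn", "name": "a"}, {"kind": "fn", "name": "b"}]
)
```

If a later row in a batch violates the UNIQUE constraint, the rows already inserted in that batch are rolled back with it, so a failed batch adds nothing to the table.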

Test plan

cargo fmt --all --check — clean
RUSTFLAGS="-D warnings" cargo check --lib --bins — clean
cargo test -p sqlitegraph --lib -- --test-threads=1 — 1161 passed, 0 failed
cargo test -p sqlitegraph --test bulk_insert_tests — 8 passed (input-order IDs, empty input, rollback on validation error, edge bulk parity, observable-state parity)
maturin develop --release && pytest tests/ — 61 passed (10 new bulk-insert tests)

🤖 Generated with Claude Code

Adds insert_*_bulk methods that batch multiple inserts inside a single transaction with a reused prepare_cached statement. Closes the 8x build-time gap downstream consumers see when loading large graphs from grounded-index DBs (Python->Rust FFI per add_node was the bottleneck).

Core (sqlitegraph-core):
- SqliteGraph::insert_entities_bulk and insert_edges_bulk: BEGIN -> prepare_cached(INSERT) -> loop execute + last_insert_rowid -> COMMIT. Empty input returns Ok(vec![]) without opening a transaction. On any error mid-batch: ROLLBACK and return the error; the database is left untouched. Returns rowids in input order.
- GraphBackend::insert_nodes_bulk and insert_edges_bulk: trait methods with default implementations that loop the single-insert path, so any existing GraphBackend consumer keeps working at 2.3 -> 2.4 with no source changes. The &B blanket forwarders are wired through.
- SqliteGraphBackend overrides both, dispatching to the new SqliteGraph bulk paths. Publisher events fire per row after commit to preserve single-insert observer semantics; no new batched event type.

Python (sqlitegraph-py):
- Graph.add_nodes_bulk(items: list[dict]) and add_edges_bulk(items): each dict carries the same fields as the kwargs-style add_node/add_edge. Missing required fields raise; valid items go through in one FFI call.

Tests:
- 8 Rust integration cases in tests/bulk_insert_tests.rs: input-order IDs, empty input, validation rollback, edge bulk parity, observable state matches a per-item loop.
- 10 Python cases in tests/test_bulk_insert.py: both bulk paths, missing-field validation, data/file_path round-trip, parity with the per-item loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- sqlitegraph-core: 2.3.0 -> 2.4.0 (new GraphBackend::insert_*_bulk trait methods with default impls; SqliteGraph::insert_*_bulk transactional bulk paths; SqliteGraphBackend overrides). SemVer minor.
- sqlitegraph-py: 0.2.0 -> 0.3.0 (Graph.add_nodes_bulk and add_edges_bulk Python methods). SemVer minor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Self-heals the python CI step on PR #5:
- Replace bare PyException::new_err with InvalidArgumentError::new_err for the missing-field validators on add_nodes_bulk/add_edges_bulk so callers see a sqlitegraph-typed exception instead of a generic one.
- Update test_bulk_insert.py to assert InvalidArgumentError specifically (silences ruff B017) and pass strict=True to zip (silences ruff B905).
- Apply ruff format to the new test file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

oldnordic and others added 3 commits May 16, 2026 01:40

oldnordic merged commit 2f9b9d1 into main May 15, 2026
10 checks passed

oldnordic deleted the feat/bulk-insert branch May 15, 2026 23:55

oldnordic merged 3 commits into main from feat/bulk-insert
