[READY FOR REVIEW] feature/ladybug-provider — add LadybugDB provider and optimize graph ingestion #63
Open
bharatsachya wants to merge 6 commits into
Against #60
1. Branch Naming
`feat/ladbug` (Note: the branch name contains a slight typo, but it conforms to the prefix guidelines and lowercase structure.)
2. PR Title
[DONE] feature/ladybug-provider — add LadybugDB provider and optimize graph ingestion
3. PR Description
What changed
- `@bb/ladybug` package implementing the `IGraphDatabaseProvider` contract for LadybugDB (`@ladybugdb/core`).
- `snapshotFilesToVersion` in `fileVersions.ts` now uses `CREATE` statements instead of Neo4j-style `MERGE` clauses, preventing unnecessary full-column scans in LadybugDB's columnar layout.
- `files.ts` now replaces single-file loop ingestion with `bulkUpsertFiles`. It accepts an `AsyncIterable<UpsertFileNodeInput>` stream and prevents OOM crashes by writing items to disk using `ParquetWriter` before executing a single bulk `COPY FROM` command in a transaction.
- Relationship copies (`CONTAINS` and `HAS_KEYWORD`) pass explicit query routing options:
  - `COPY CONTAINS FROM '...' (FROM='Folder', TO='File')`
  - `COPY HAS_KEYWORD FROM '...' (FROM='File', TO='Keyword')`
- … `finally` block.
- `README.md` files at `packages/ladybug/README.md` and `packages/ladybug/src/README.md`, to satisfy the monorepo's folder context contract rules.
Why
We are migrating our graph database from Neo4j (OLTP) to LadybugDB (OLAP). The previous implementation performed individual record upserts (resulting in thousands of individual `COPY FROM` commands) and relied on costly Neo4j-style queries (`MERGE` statements on append-only logs). These patterns caused high memory pressure, OOM risk for large repositories (50,000+ files), and database-level lock contention. Shifting to disk-backed Parquet streams and single-transaction bulk copy operations addresses these bottlenecks.
How to test
… `ladybug.lbug` are terminated.
"sqlite_path": "/.bytebell/data.sqlite",
"ladybug_path": "/.bytebell/ladybug.lbug"
Make sure your config contains these values.
Just In Case:
-v /Users/zeta/.bytebell:/database
-e LBUG_FILE=ladybug.lbug
--rm ghcr.io/ladybugdb/explorer:latest
Run this to restart Docker in case you do not see changes.
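For reviewers, the streaming ingestion described under "What changed" can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this PR: `FileRow`, `Execute`, `buildRelCopy`, and `bulkCopyFromStream` are hypothetical names, a CSV temp file stands in for the Parquet output, the statement runner is injected instead of calling `@ladybugdb/core`, and the transaction wrapper is not modeled.

```typescript
import { createWriteStream, promises as fs } from "node:fs";
import { once } from "node:events";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { finished } from "node:stream/promises";

// Hypothetical row shape; the real UpsertFileNodeInput lives in the PR.
interface FileRow {
  path: string;
  size: number;
}

// Hypothetical stand-in for whatever runs a statement on the provider.
type Execute = (statement: string) => Promise<void>;

// Builds a relationship COPY statement in the form quoted in the PR.
// Inputs are assumed to be trusted; no quoting or escaping is done here.
export function buildRelCopy(table: string, file: string, from: string, to: string): string {
  return `COPY ${table} FROM '${file}' (FROM='${from}', TO='${to}')`;
}

// Drains the stream to a temp file one row at a time, then issues one COPY.
export async function bulkCopyFromStream(
  rows: AsyncIterable<FileRow>,
  execute: Execute,
): Promise<number> {
  const dir = await fs.mkdtemp(join(tmpdir(), "bulk-"));
  const file = join(dir, "files.csv");
  const out = createWriteStream(file);
  let count = 0;
  try {
    for await (const row of rows) {
      const line = `"${row.path.replace(/"/g, '""')}",${row.size}\n`;
      // Wait for the buffer to drain so rows never pile up in memory.
      if (!out.write(line)) {
        await once(out, "drain");
      }
      count++;
    }
    out.end();
    await finished(out);
    // One bulk statement for the whole stream instead of one per file.
    await execute(`COPY File FROM '${file}'`);
    return count;
  } finally {
    // Close the stream and remove the temp directory on success or failure.
    out.destroy();
    await fs.rm(dir, { recursive: true, force: true });
  }
}
```

The point of the shape is that memory use stays bounded by the write buffer rather than by repository size, and the database sees a single statement per table. Cleanup sits in `finally` so a failed copy does not leave temp files behind.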