feat: Implement persistent storage #108

Draft
SoarinSkySagar wants to merge 11 commits into grandinetech:devnet-5 from SoarinSkySagar:feature/database

Conversation

@SoarinSkySagar SoarinSkySagar commented Jun 27, 2026

This PR adds a persistent libmdbx-backed database for client data.

@ArtiomTr ArtiomTr left a comment

Collaborator

looks like this is not finished? there is nothing implemented yet?

@bomanaps

Contributor

> looks like this is not finished? there is nothing implemented yet?

This will take some time, as it won't be a one-off PR.

@ArtiomTr

Collaborator

> This will take some time, as it won't be a one-off PR.

Ok, but there is absolutely zero functionality yet? We can do this in iterations, i.e. first we can persist only blocks, but there is no point in merging dead code.

Comment thread lean_client/Cargo.toml
bls = { git = "https://github.com/grandinetech/grandine", package = "bls", features = ["blst"], rev = "64afdee3c6be79fceffb66933dcb69a943f3f1ae" }
bytesize = { version = '2', features = ['serde'] }
clap = { version = "4", features = ["derive"] }
database = { git = "https://github.com/grandinetech/grandine", package = "database", rev = "64afdee3c6be79fceffb66933dcb69a943f3f1ae" }

Collaborator

The database crate from grandine is a bit flawed; I would suggest implementing your own. Specifically, the current database implementation forces snappy compression for all values. This is inefficient in some cases (e.g., if we save slot -> state_root indexes, there is no point in compressing state_root, as it is pure entropy) and suboptimal in others (e.g., for blobs you probably want something with a higher compression ratio, like zstd). So forcing a single compression algorithm for all values wasn't a good idea. Also, for cases where performance matters, it looks like there are now compression algorithms that are both faster and give better compression ratios than snappy, such as lz4.
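A minimal sketch of the per-table idea, under stated assumptions: `Compression`, `Table`, and the two table constants are hypothetical names, not part of grandine's `database` crate. To keep the snippet dependency-free, the only codec is a toy run-length encoder standing in for a real one (lz4, zstd, snappy); a real implementation would call into those crates instead.

```rust
/// Hypothetical per-table compression setting (not grandine's API).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Compression {
    /// Store bytes as-is; suits high-entropy values such as roots.
    Raw,
    /// Toy run-length encoding, a stand-in for a real codec (lz4, zstd, ...).
    Rle,
}

impl Compression {
    fn encode(self, input: &[u8]) -> Vec<u8> {
        match self {
            Compression::Raw => input.to_vec(),
            Compression::Rle => {
                // Emit (run_length, byte) pairs, with runs capped at 255.
                let mut out = Vec::new();
                let mut i = 0;
                while i < input.len() {
                    let byte = input[i];
                    let mut run = 1usize;
                    while i + run < input.len() && input[i + run] == byte && run < 255 {
                        run += 1;
                    }
                    out.push(run as u8);
                    out.push(byte);
                    i += run;
                }
                out
            }
        }
    }

    fn decode(self, input: &[u8]) -> Vec<u8> {
        match self {
            Compression::Raw => input.to_vec(),
            Compression::Rle => input
                .chunks_exact(2)
                .flat_map(|pair| std::iter::repeat(pair[1]).take(pair[0] as usize))
                .collect(),
        }
    }
}

/// Hypothetical table descriptor: each table picks its own codec.
struct Table {
    name: &'static str,
    compression: Compression,
}

/// Roots are effectively random bytes, so they are stored uncompressed.
const STATE_ROOT_BY_SLOT: Table = Table { name: "state_root_by_slot", compression: Compression::Raw };
/// Large values get a codec; `Rle` here only marks where zstd would go.
const BLOBS: Table = Table { name: "blobs", compression: Compression::Rle };
```

The point of the design is only that the codec is chosen per table rather than fixed once for the whole database.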

Author

okay, I will be implementing my own database crate on top of this next

@SoarinSkySagar

Author

> Ok, but there is absolutely zero functionality yet? We can do this in iterations, i.e. first we can persist only blocks, but there is no point in merging dead code.

yeah, this is still a WIP. I was asking for a review of the initial architecture (KV pairs and the grandine database), to check whether this is the correct direction.

@bomanaps

Contributor

One more thing: can we remove the OOM framing, since that issue has been resolved?

@SoarinSkySagar

SoarinSkySagar commented Jul 1, 2026

Author

@bomanaps but since all state data is being managed through memory right now, don't you think it is bound to OOM sometime when the node is running for a long time?

@SoarinSkySagar

Author

On a separate note, after implementing the persistent DB I'm planning to look into an LRU cache inside the crate itself, managed automatically, so that the DB functions can be used without worrying about caching. What do you think about this?

@bomanaps

bomanaps commented Jul 1, 2026

Contributor

> On a separate note, after implementing the persistent DB I'm planning to look into an LRU cache inside the crate itself, managed automatically, so that the DB functions can be used without worrying about caching. What do you think about this?

The best person to answer this is @ArtiomTr. Also, on the side, have you tried running a node, maybe a 3-node setup or more depending on your laptop's capacity? This should give you a better feel for how lean Ethereum runs.

@ArtiomTr

ArtiomTr commented Jul 1, 2026

Collaborator

> yeah, this is still a WIP. I was asking for a review of the initial architecture (KV pairs and the grandine database), to check whether this is the correct direction.

It is hard to tell what is going on without seeing an actual implementation :). Better to implement something first, see if it works and is performant enough, then proceed with review. To avoid wasting much time, start with a smaller scope, like just saving the blocks first. The database must keep blocks, because if you have blocks, you can reconstruct any historical state, at the cost of CPU time. This way you can scaffold the database structure, get early feedback on it, and then proceed to implement everything else.

> @bomanaps but since all state data is being managed through memory right now, don't you think it is bound to OOM sometime when the node is running for a long time?

This is true only in some cases, e.g. during long non-finality periods: roughly speaking, a validator has to track every "branch" to be able to properly converge on whichever branch eventually wins. However, even in those cases, I think there are clever algorithms that avoid keeping all unfinalized history in memory.

During normal operation, memory consumption usually won't grow indefinitely: the node has to keep only the last finalized state, maybe some older ones, but no more. The node also has to keep some historical blocks (I believe all blocks up to the weak subjectivity period, although I don't remember exactly and may be wrong on this), but blocks usually take only a fraction of the space compared to states, so this should be a non-issue. Grandine, the beacon chain client, existed for quite some time without a database at all, and worked really well.

> On a separate note, after implementing the persistent DB I'm planning to look into an LRU cache inside the crate itself, managed automatically, so that the DB functions can be used without worrying about caching. What do you think about this?

It is kind of a complicated topic. Ideally, the node should operate without depending on the database at all. In that sense, caching just wastes CPU and memory. However, sometimes you actually do want caches: for instance, there may be cases when you need to load some state that is a bit older than the finalized point on a hot path, so loading it quickly may be desirable. Caches won't magically make loading quicker, though; instead, you pay a small performance cost once in order to query the same thing instantly next time. If you take the straightforward approach and cache every database query, such caches are pointless, because it is very rare that the same object is queried from the database twice. But if you make them smart, by somehow caching intermediate values that may be needed for querying both object A and object B, then such caches will be very useful. This is the approach I take when implementing the new database layout for the grandine beacon chain (https://github.com/ArtiomTr/grandine/blob/4ec3964cf42b04b8d1ac93791a6a14ff788b2d18/fork_choice_control/src/storage.rs#L907). This requires careful benchmarking and profiling first, though, so it is probably better to think about caches after you have a working database.
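To make "caching intermediate values" concrete, here is a toy sketch of one possible reading of it. It is not the approach used in the linked `storage.rs`, and every name in it (`Replayer`, `state_at`, `checkpoints`) is hypothetical. A "state" is just a running sum and a "block" adds a delta; a state computed for one query is kept as a starting point, so a later query for a different slot replays fewer blocks.

```rust
use std::collections::BTreeMap;

/// Toy stand-ins: a "state" is a running sum, a "block" adds a delta to it.
type Slot = u64;
type State = u64;

struct Replayer {
    /// Block deltas; `blocks[i]` is the block that moves slot i to slot i + 1.
    blocks: Vec<u64>,
    /// Cached intermediate states, keyed by the slot they correspond to.
    checkpoints: BTreeMap<Slot, State>,
    /// Number of block applications so far, to make the saved work visible.
    applied: usize,
}

impl Replayer {
    fn new(blocks: Vec<u64>) -> Self {
        let mut checkpoints = BTreeMap::new();
        checkpoints.insert(0, 0); // genesis state, before any block
        Replayer { blocks, checkpoints, applied: 0 }
    }

    /// State after applying the first `slot` blocks.
    /// Panics if `slot` is larger than the number of known blocks.
    fn state_at(&mut self, slot: Slot) -> State {
        // Start from the closest cached state at or before `slot`.
        let (&start, &base) = self
            .checkpoints
            .range(..=slot)
            .next_back()
            .expect("genesis is always cached");
        let mut state = base;
        for s in start..slot {
            state += self.blocks[s as usize];
            self.applied += 1;
        }
        // Keep the result as an intermediate value for later queries.
        self.checkpoints.insert(slot, state);
        state
    }
}
```

With six blocks, asking for slot 4 replays four blocks, and a following query for slot 6 replays only the remaining two. A real cache would also need an eviction policy and, as noted above, profiling to show it pays off.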

Also, let me give you some advice on using libmdbx:

  1. Try to do sequential writes; they will be much quicker, as libmdbx is a B+ tree internally (there is a good blog post explaining why sequential writes are more performant for B+ trees: https://planetscale.com/blog/btrees-and-database-indexes).
  2. Libmdbx supports range queries, so you can fetch values by key prefix alone. The database also permits large keys, up to 2022 bytes for the default 4 KB page size. This means you can put more information into the key, which lets you drop some indexes, for example. In the case of blocks, you can, for example, save slot + block_root + state_root in the key. This way you achieve three goals at the same time:
    • slot + block_root gives you a unique key per block, even for an unfinalized chain, where proposers may differ (this is irrelevant for the lean chain currently, as validators cannot enter/exit and the proposer is chosen via round-robin, though this will likely change when going to mainnet)
    • slot being at the beginning makes all writes to the database sequential, except for backsync, although this shouldn't be a problem, as those writes will probably still land in the same B-tree branch.
    • state_root in the key lets you find which state corresponds to this block, without even reading/decompressing/decoding the block itself.
  3. Keep in mind that keys in libmdbx are not necessarily UTF-8 encoded strings, so you can use any byte sequence you want.
  4. Use separate libmdbx databases for different types. This allows parallelizing writes into different databases, but remember that by keeping values in different databases you lose atomicity guarantees. So it is probably better to write the "value" first and only then write its indices, so you don't have dangling references. For deleting values, the process should be reversed: delete indices first, values next. Or you can write/delete in any order you want, as long as you carefully handle all the edge cases where a write/delete fails in one database and succeeds in another, and where an index may point to a non-existing value.
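The key layout and write ordering from the list above can be sketched as follows. This is an illustration under stated assumptions, not libmdbx code: two `BTreeMap`s stand in for two separate databases (both keep keys in byte-wise sorted order, which is what makes the prefix query work), and `block_key`, `state_root_of`, and `Store` are hypothetical names.

```rust
use std::collections::BTreeMap;

type Root = [u8; 32];

/// Builds `slot || block_root || state_root` (8 + 32 + 32 = 72 bytes).
/// The slot is big-endian so that byte-wise key order equals slot order.
fn block_key(slot: u64, block_root: &Root, state_root: &Root) -> Vec<u8> {
    let mut key = Vec::with_capacity(72);
    key.extend_from_slice(&slot.to_be_bytes());
    key.extend_from_slice(block_root);
    key.extend_from_slice(state_root);
    key
}

/// Reads the state root back out of a key, without touching the block value.
fn state_root_of(key: &[u8]) -> Root {
    let mut root = [0u8; 32];
    root.copy_from_slice(&key[40..72]);
    root
}

/// In-memory stand-in for two separate databases.
#[derive(Default)]
struct Store {
    /// composite key -> encoded block
    blocks: BTreeMap<Vec<u8>, Vec<u8>>,
    /// block_root -> composite key (the index)
    by_root: BTreeMap<Root, Vec<u8>>,
}

impl Store {
    /// Value first, index second: a failure in between leaves an
    /// unindexed block, never an index pointing at a missing value.
    fn put(&mut self, slot: u64, block_root: Root, state_root: Root, block: Vec<u8>) {
        let key = block_key(slot, &block_root, &state_root);
        self.blocks.insert(key.clone(), block);
        self.by_root.insert(block_root, key);
    }

    /// Reverse order on delete: index first, value second.
    fn delete(&mut self, block_root: &Root) {
        if let Some(key) = self.by_root.remove(block_root) {
            self.blocks.remove(&key);
        }
    }

    /// Prefix query: all block keys stored for one slot, in key order.
    fn keys_at_slot(&self, slot: u64) -> Vec<&Vec<u8>> {
        let lo = slot.to_be_bytes().to_vec();
        match slot.checked_add(1) {
            Some(next) => {
                let hi = next.to_be_bytes().to_vec();
                self.blocks.range(lo..hi).map(|(k, _)| k).collect()
            }
            // No larger slot exists, so scan to the end of the table.
            None => self.blocks.range(lo..).map(|(k, _)| k).collect(),
        }
    }
}
```

A little-endian slot would break the ordering property (slot 256 would sort before slot 1), which is why the sketch uses `to_be_bytes`.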

@ArtiomTr

ArtiomTr commented Jul 1, 2026

Collaborator

also, don't forget to change the target branch to devnet-5 instead of main; the latest changes are on the devnet-5 branch.

@SoarinSkySagar changed the title from "feat: Implement libmdbx in lean client for persistent DB storage" to "feat: Implement persistent storage" Jul 1, 2026
@SoarinSkySagar changed the base branch from main to devnet-5 July 1, 2026 22:52
@SoarinSkySagar

Author

rebased and changed target branch to devnet-5. continuing work now, setting up lean's own database.

3 participants