
ScheduledMerges: migrate union level late (when used in a new merge)#822

Draft
mheinzel wants to merge 9 commits into main from mheinzel/prototype-migrate-union-level-late

Conversation

@mheinzel
Collaborator

@mheinzel mheinzel commented Mar 3, 2026

Builds on top of #814.

When creating the union of multiple tables, we create a special union level containing a merging tree. Once the tree has been completed, i.e. merged into a single run, we want to get rid of the union level by migrating that run to the regular levels. The main motivation is that the run should become part of a last level merge: last level merges are especially useful for compaction, since they can drop all delete entries.
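
To illustrate why a last level merge matters here, the following is a minimal sketch of a merge that drops delete entries only on the last level. The types `Op`, `Run` and `MergeKind` and the function `mergeRuns` are hypothetical simplifications for this example, not the definitions used in `ScheduledMerges`.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified types; not the prototype's actual definitions.
data Op v = Insert v | Delete
  deriving (Eq, Show)

type Run k v = Map.Map k (Op v)

data MergeKind = MidLevel | LastLevel
  deriving (Eq, Show)

-- | Merge runs ordered from newest to oldest. For each key the newest
-- entry wins ('Map.unions' is left-biased). Only a last level merge may
-- drop 'Delete' entries, because no older run remains below it that the
-- delete would still need to shadow.
mergeRuns :: Ord k => MergeKind -> [Run k v] -> Run k v
mergeRuns kind runs = dropDeletes (Map.unions runs)
  where
    dropDeletes = case kind of
      LastLevel -> Map.filter isInsert
      MidLevel  -> id

    isInsert (Insert _) = True
    isInsert Delete     = False
```

In this model, a completed union run that stays outside the regular levels never takes part in such a merge, so its delete entries would be kept indefinitely.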

This PR demonstrates one of two approaches for migration. Since the main motivation is to allow for creating last level merges, we delay migration until the exact point where the completed union can become part of a merge. This avoids weakening the invariants of the regular levels, which would often be necessary to migrate the union earlier.
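
As a rough sketch of the idea (not the code in this PR), late migration can be modelled as a decision made at the moment the inputs of a new last level merge are chosen. `UnionLevel` and `lastLevelMergeInputs` are hypothetical names, and placing the union run last assumes it is the oldest input, in line with the `runs ++ [r]` ordering in the lookup code of the diff.

```haskell
-- Hypothetical, simplified model; the names are not taken from the prototype.
data UnionLevel run
  = NoUnion             -- no union level present
  | UnionInProgress     -- merging tree not yet completed (details elided)
  | UnionCompleted run  -- tree has been merged into a single run
  deriving (Eq, Show)

-- | Choose the inputs for a new merge on the last regular level.
-- The regular runs are ordered newest first. A completed union run is
-- appended as the oldest input and the union level is removed; in all
-- other cases the union level is returned unchanged.
lastLevelMergeInputs :: [run] -> UnionLevel run -> ([run], UnionLevel run)
lastLevelMergeInputs regular (UnionCompleted r) = (regular ++ [r], NoUnion)
lastLevelMergeInputs regular union              = (regular, union)
```

Because the union run only moves when a merge consumes it, the regular levels never have to hold a run that does not fit their invariants.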

I still plan to add some tracing once we have decided which approach for migrating the union we want to take.

Comment on lines +1232 to +1235
-- This case is an optimisation over the one below. We only do a
-- single batch of lookups, but there's an extra allocation to
-- construct the list of runs.
pure (lookupBatch k (Just wb) (runs ++ [r]))
Collaborator Author

This optimisation is probably worth having in the real implementation, but the prototype likely doesn't need it. I'll leave it in while we discuss the possible approaches and remove it before merging this PR. A short comment hinting at the possible optimisation should be sufficient.
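
For reference, a simplified sketch of the trade-off, under the assumption that the newest hit for a key wins. It leaves out the write buffer and delete entries, and `lookupRuns`, `lookupSingleBatch` and `lookupTwoSteps` are hypothetical names rather than the prototype's `lookupBatch`. The two-step variant is only a guess at what a version without the optimisation could look like.

```haskell
import           Control.Applicative ((<|>))
import qualified Data.Map.Strict as Map
import           Data.Maybe (listToMaybe, mapMaybe)

type Run k v = Map.Map k v

-- | Look a key up in runs ordered newest first; the newest hit wins.
lookupRuns :: Ord k => k -> [Run k v] -> Maybe v
lookupRuns k = listToMaybe . mapMaybe (Map.lookup k)

-- | Single batch: append the completed union run as the oldest run.
-- This costs one extra list allocation for @runs ++ [r]@.
lookupSingleBatch :: Ord k => k -> [Run k v] -> Run k v -> Maybe v
lookupSingleBatch k runs r = lookupRuns k (runs ++ [r])

-- | Two steps: consult the regular runs, then fall back to the union run.
lookupTwoSteps :: Ord k => k -> [Run k v] -> Run k v -> Maybe v
lookupTwoSteps k runs r = lookupRuns k runs <|> Map.lookup k r
```

In this model both variants return the same result for every key, so dropping the optimisation from the prototype should not change observable behaviour.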

@mheinzel mheinzel changed the title from "ScheduledMerges: Migrate union level late (when used in a new merge)" to "ScheduledMerges: migrate union level late (when used in a new merge)" on Mar 3, 2026
@mheinzel mheinzel force-pushed the mheinzel/prototype-tests-nested-union branch from 83f94db to 9e43f0e on March 3, 2026 17:36
@mheinzel mheinzel force-pushed the mheinzel/prototype-migrate-union-level-late branch from 45c4485 to eaa28b6 on March 3, 2026 17:38
@mheinzel mheinzel force-pushed the mheinzel/prototype-tests-nested-union branch from 9e43f0e to b7bc5c2 on March 17, 2026 16:59
Base automatically changed from mheinzel/prototype-tests-nested-union to main on March 24, 2026 09:17