feat(metadata-db): eager file_metadata deletion during compaction #1407
base: main
Changes from all commits
3ad7aa2
b3a77ec
e4b729c
b72bd93
d6368ee
aa913c4
74da03c
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -104,17 +104,20 @@ impl Collector { |
| | | ); |
| | | } |
| | | |
| | | - let paths_to_remove = metadata_db |
| | | - .delete_file_ids(found_file_ids_to_paths.keys()) |
| | | + // Delete from footer_cache (file_metadata was already deleted during compaction) |
| | | + metadata_db |
| | | + .delete_footer_cache(found_file_ids_to_paths.keys()) |
Review comment: Consider adding error handling or logging if the number of deleted entries doesn't match expectations. For example, if
| | | .await |
| | | - .map_err(CollectorError::file_metadata_delete( |
| | | + .map_err(CollectorError::footer_cache_delete( |
| | | found_file_ids_to_paths.keys(), |
| | | - ))? |
| | | - .into_iter() |
| | | - .filter_map(\|file_id\| found_file_ids_to_paths.get(&file_id).cloned()) |
| | | - .collect::<BTreeSet<_>>(); |
| | | + ))?; |
| | | |
| | | - tracing::debug!("Metadata entries deleted: {}", paths_to_remove.len()); |
| | | + tracing::debug!( |
| | | + "Footer cache entries deleted: {}", |
| | | + found_file_ids_to_paths.len() |
| | | + ); |
| | | |
| | | + let paths_to_remove: BTreeSet<_> = found_file_ids_to_paths.values().cloned().collect(); |
| | | |
| | | if let Some(metrics) = &self.metrics { |
| | | metrics.inc_expired_entries_deleted( |
| | | @@ -161,6 +164,19 @@ impl Collector { |
| | | tracing::debug!("Expired files deleted: {}", files_deleted); |
| | | tracing::debug!("Expired files not found: {}", files_not_found); |
| | | |
| | | + // Delete from gc_manifest after physical files have been deleted |
| | | + metadata_db |
| | | + .delete_gc_manifest(found_file_ids_to_paths.keys()) |
| | | + .await |
| | | + .map_err(CollectorError::gc_manifest_delete( |
| | | + found_file_ids_to_paths.keys(), |
| | | + ))?; |
| | | |
| | | + tracing::debug!( |
| | | + "GC manifest entries deleted: {}", |
| | | + found_file_ids_to_paths.len() |
| | | + ); |
| | | |
| | | if let Some(metrics) = self.metrics.as_ref() { |
| | | metrics.inc_successful_collections(table_name.to_string()); |
| | | } |
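
The review comment on `delete_footer_cache` asks for a check when the number of deleted entries does not match expectations. The new debug line logs `found_file_ids_to_paths.len()`, which is the number of IDs requested rather than the number actually removed. A minimal sketch of such a check follows, assuming the delete call could be changed to return the IDs it removed (the diff does not show this). `DeleteCheck` and `check_deleted` are hypothetical names, and `i64` stands in for the real file ID type.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome of comparing requested and actually deleted IDs.
#[derive(Debug, PartialEq, Eq)]
pub enum DeleteCheck {
    /// Every requested ID was reported as deleted.
    Complete,
    /// These requested IDs were not reported as deleted.
    Missing(BTreeSet<i64>),
}

/// Compare the IDs we asked to delete with the IDs the store reported deleting.
/// Assumption: the store can return the set of deleted IDs.
pub fn check_deleted(requested: &BTreeSet<i64>, deleted: &BTreeSet<i64>) -> DeleteCheck {
    // Set difference: requested IDs that are absent from the deleted set.
    let missing: BTreeSet<i64> = requested.difference(deleted).copied().collect();
    if missing.is_empty() {
        DeleteCheck::Complete
    } else {
        DeleteCheck::Missing(missing)
    }
}
```

A caller could log a warning on `DeleteCheck::Missing` and continue, since a missing `footer_cache` row may simply mean the entry was never cached. Whether that should be a warning or an error is a design choice the PR does not settle.
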
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,16 @@ |
| | | + -- Drop the footer column from file_metadata |
| | | + ALTER TABLE file_metadata DROP COLUMN footer; |
| | | |
| | | + -- Remove the foreign key constraint from footer_cache entirely |
| | | + -- This allows us to delete from file_metadata eagerly during compaction |
| | | + -- while keeping footer_cache entries until the Collector runs |
| | | + ALTER TABLE footer_cache DROP CONSTRAINT footer_cache_file_id_fkey; |
| | | |
| | | + -- Remove the foreign key constraint from gc_manifest to file_metadata |
| | | + -- This allows us to delete from file_metadata while keeping gc_manifest entries |
| | | + -- for garbage collection tracking |
| | | + ALTER TABLE gc_manifest DROP CONSTRAINT gc_manifest_file_id_fkey; |
| | | |
| | | + -- Remove the CHECK constraint on gc_manifest.expiration |
| | | + -- This allows inserting entries with any expiration time |
| | | + ALTER TABLE gc_manifest DROP CONSTRAINT gc_manifest_expiration_check; |
Review comment: Important: This migration makes breaking schema changes by dropping constraints and columns. Ensure you have:
The removal of FK constraints enables the new eager deletion pattern, but make sure to coordinate the deployment carefully.
Author reply: #1403 prepares the metadata db for the schema changes made in this PR's migrations.
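
Taken together, the diff and the migration imply an ordering: compaction removes the `file_metadata` row right away, while the Collector later deletes `footer_cache` entries, then the physical files, then `gc_manifest` entries. The in-memory model below is a rough sketch of that ordering, not the project's implementation. `Store` and all of its fields and methods are hypothetical, it assumes the `gc_manifest` entry is written at compaction time, and it ignores expiration by treating every manifest entry as collectable.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory stand-in for the three tables and the object store.
#[derive(Default)]
pub struct Store {
    pub file_metadata: BTreeMap<i64, String>, // file_id -> path
    pub footer_cache: BTreeSet<i64>,          // file_ids with a cached footer
    pub gc_manifest: BTreeMap<i64, String>,   // file_id -> path awaiting GC
    pub physical_files: BTreeSet<String>,     // paths present in storage
}

impl Store {
    /// Register a freshly written file.
    pub fn add_file(&mut self, id: i64, path: &str) {
        self.file_metadata.insert(id, path.to_string());
        self.footer_cache.insert(id);
        self.physical_files.insert(path.to_string());
    }

    /// Compaction: eagerly drop the file_metadata row and schedule the file for GC.
    /// Without foreign keys, footer_cache and gc_manifest rows may outlive it.
    pub fn compact(&mut self, id: i64) {
        if let Some(path) = self.file_metadata.remove(&id) {
            self.gc_manifest.insert(id, path);
        }
    }

    /// Collector pass; returns the number of physical files deleted.
    pub fn collect(&mut self) -> usize {
        let entries: Vec<(i64, String)> = self
            .gc_manifest
            .iter()
            .map(|(id, path)| (*id, path.clone()))
            .collect();

        // 1. Delete footer_cache entries.
        for (id, _) in &entries {
            self.footer_cache.remove(id);
        }
        // 2. Delete the physical files.
        let mut files_deleted = 0;
        for (_, path) in &entries {
            if self.physical_files.remove(path) {
                files_deleted += 1;
            }
        }
        // 3. Delete gc_manifest entries only after the files are gone.
        for (id, _) in &entries {
            self.gc_manifest.remove(id);
        }
        files_deleted
    }
}
```

Deleting `gc_manifest` last means that a crash between steps 2 and 3 leaves the manifest entry in place, so in this model a later pass can retry the same IDs.
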