Bulk partition UUID #3798
Draft
aasthabharill wants to merge 6 commits into
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3798      +/-   ##
============================================
+ Coverage     53.41%   59.50%   +6.08%
+ Complexity     6629     2184    -4445
============================================
  Files          1082      506     -576
  Lines         65795    29521   -36274
  Branches       7328     3240    -4088
============================================
- Hits          35147    17565   -17582
+ Misses        28288    10970   -17318
+ Partials       2360      986    -1374
052d28a to e6bd9dc
e6bd9dc to 4fd97fa
VardhanThigle (Contributor) requested changes on May 14, 2026 and left a comment:
Please consider moving to a binary index (similar to what we use for MySQL varbinary) instead of a string for UUID. PG uses binary collation to compare UUIDs, so that is more natural.
Do we need to take care of strict UUID versions, etc.?
For PG, mostly not - https://www.db-fiddle.com/f/pVFVr6krWjQ2wHqstc44Hm/0 (please add this fiddle as a comment somewhere in your implementation)
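To illustrate the binary comparison the comment above refers to, here is a minimal sketch that orders UUIDs by their 16 bytes, treated as unsigned values. UuidBinaryOrderSketch, toBytes, and compareBinary are hypothetical names for illustration, not part of the template, and the sketch assumes Java 9+ for Arrays.compareUnsigned.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.UUID;

/** Hypothetical illustration only; not part of the sourcedb-to-spanner template. */
final class UuidBinaryOrderSketch {

  /** Serializes a UUID to its 16-byte big-endian form. */
  static byte[] toBytes(UUID uuid) {
    return ByteBuffer.allocate(16)
        .putLong(uuid.getMostSignificantBits())
        .putLong(uuid.getLeastSignificantBits())
        .array();
  }

  /** Compares two UUIDs byte by byte, treating each byte as unsigned. */
  static int compareBinary(UUID a, UUID b) {
    return Arrays.compareUnsigned(toBytes(a), toBytes(b));
  }

  public static void main(String[] args) {
    UUID low = UUID.fromString("7fffffff-ffff-ffff-ffff-ffffffffffff");
    UUID high = UUID.fromString("80000000-0000-0000-0000-000000000000");
    // Byte-wise unsigned order: low sorts before high.
    System.out.println(Integer.signum(compareBinary(low, high)));
    // java.util.UUID.compareTo compares the two halves as signed longs,
    // so it reports the opposite order for this pair.
    System.out.println(Integer.signum(low.compareTo(high)));
  }
}
```

One consequence for the Java side: java.util.UUID.compareTo is not a drop-in substitute for a byte-wise ordering, because it compares signed longs. A splitter that needs to agree with a binary ordering would have to compare unsigned bytes or unsigned 128-bit values.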
This pull request modifies the uniform partitioning (uniformization) logic in the sourcedb-to-spanner template to support tables partitioned on PostgreSQL UUID primary keys.

Changes Made & Rationale

1. Map UUID Columns to a Virtual "UUID" Collation

In PostgreSQLDialectAdapter.discoverTableIndexes, if the typeName of a column is "uuid", we assign "UUID" as its collation reference.

CollationMapper.fromDB expects a virtual "UUID" collation tag to trigger the static hexadecimal (base-16) mapper (buildStaticUuidMapper). By assigning "UUID" during discovery, the splitter skips the database query that fetches collation rankings, which would fail or be extremely slow for a native UUID type that has no physical collation.

2. Configure Virtual Type Length to 32 for UUID Columns

In PostgreSQLDialectAdapter.discoverTableIndexes, if typeLength is null and typeName is "uuid", we set typeLength = 32.

CollationMapper strips the four hyphens from the 36-character canonical form during mapping, leaving exactly 32 hexadecimal characters. Overriding the discovered length to 32 ensures that no additional padding (virtual zero-rank characters) is appended during range partitioning calculations, which keeps mapping and unmapping a clean 1-to-1 round trip.

3. Register State-Based Query and Parameter Cast Wrappers for UUID

In PostgreSQLDialectAdapter.discoverTableIndexes, if typeName is "uuid", we register explicit SQL cast statements in the columnCastWrappers and columnParameterCastWrappers maps.

columnCastWrappers (CAST(%s AS TEXT)): used in getBoundaryQuery to query MIN(CAST(col AS TEXT)) and MAX(CAST(col AS TEXT)). This is necessary to retrieve the UUID boundaries safely as standard text strings compatible with JDBC, because the uuid type has no MIN or MAX of its own.

columnParameterCastWrappers (CAST(? AS uuid)): used in getReadQuery and getCountQuery to bind boundary parameter placeholders as col >= CAST(? AS uuid). This is necessary because PostgreSQL does not implicitly compare standard JDBC string parameter bindings against native uuid columns.

4. Verify Changes with Unit & Integration Tests

testUuidCollationMapper in CollationMapperTest.java verifies that canonical UUID strings are mapped to 128-bit BigIntegers and unmapped back with correct formatting and hyphen insertion.

testDiscoverTableIndexesWithUuid in PostgreSQLDialectAdapterTest.java verifies index discovery mappings, boundary query wrapping, and read/count query parameter bindings.

getExpectedData in PostgreSQLWithUniformizationIT.java now supports assertions for tables with non-integer primary keys (uuid_pk).
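To make the 32-character base-16 mapping described in sections 1, 2, and 4 concrete, here is a minimal sketch of a hyphen-stripping round trip between a canonical UUID string and a 128-bit BigInteger. UuidRangeSketch and its methods map, unmap, and midpoint are hypothetical names chosen for illustration. This is not the template's CollationMapper or buildStaticUuidMapper, whose internals are not shown in this description.

```java
import java.math.BigInteger;

/** Hypothetical illustration only; not the template's CollationMapper. */
final class UuidRangeSketch {
  // A canonical UUID has 36 characters; 32 hex digits remain once hyphens are stripped.
  static final int HEX_LENGTH = 32;

  /** Maps a canonical UUID string to its unsigned 128-bit value. */
  static BigInteger map(String uuid) {
    String hex = uuid.replace("-", "");
    if (hex.length() != HEX_LENGTH) {
      throw new IllegalArgumentException("Expected 32 hex digits, got " + hex.length());
    }
    return new BigInteger(hex, 16);
  }

  /** Maps a 128-bit value back to the canonical 8-4-4-4-12 form. */
  static String unmap(BigInteger value) {
    if (value.signum() < 0 || value.bitLength() > 128) {
      throw new IllegalArgumentException("Value does not fit in 128 unsigned bits");
    }
    // BigInteger drops leading zeros, so left-pad back to 32 digits.
    StringBuilder hex = new StringBuilder(value.toString(16));
    while (hex.length() < HEX_LENGTH) {
      hex.insert(0, '0');
    }
    return hex.substring(0, 8) + "-" + hex.substring(8, 12) + "-"
        + hex.substring(12, 16) + "-" + hex.substring(16, 20) + "-" + hex.substring(20);
  }

  /** Returns the UUID halfway between two boundaries, as a range splitter might. */
  static String midpoint(String low, String high) {
    return unmap(map(low).add(map(high)).shiftRight(1));
  }

  public static void main(String[] args) {
    String low = "00000000-0000-0000-0000-000000000000";
    String high = "ffffffff-ffff-ffff-ffff-ffffffffffff";
    System.out.println(midpoint(low, high));
  }
}
```

Because every mapped value is exactly 32 hex digits wide, the only adjustment needed on the way back is restoring the leading zeros that BigInteger drops. No other padding is involved, which is the property section 2 relies on.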
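The two cast wrappers from section 3 can be sketched as plain string templates. The wrapper strings CAST(%s AS TEXT) and CAST(? AS uuid) come from the description above. Everything else here is an assumption made for illustration: the class UuidCastQuerySketch, its method names, the exact shape of the SELECT statements, and the exclusive upper bound. The real getBoundaryQuery, getReadQuery, and getCountQuery may build their SQL differently.

```java
/** Hypothetical illustration only; the template's query builders may differ. */
final class UuidCastQuerySketch {
  // Wrapper applied to the column when selecting boundaries.
  static final String COLUMN_CAST = "CAST(%s AS TEXT)";
  // Wrapper applied to each bound parameter when filtering a range.
  static final String PARAM_CAST = "CAST(? AS uuid)";

  /** Selects the minimum and maximum key as text (assumed query shape). */
  static String boundaryQuery(String table, String column) {
    String wrapped = String.format(COLUMN_CAST, column);
    return String.format("SELECT MIN(%s), MAX(%s) FROM %s", wrapped, wrapped, table);
  }

  /** Range predicate with both bounds cast to uuid (upper bound is an assumption). */
  static String rangePredicate(String column) {
    return String.format("%s >= %s AND %s < %s", column, PARAM_CAST, column, PARAM_CAST);
  }

  /** Reads the rows of one partition (assumed query shape). */
  static String readQuery(String table, String column) {
    return String.format("SELECT * FROM %s WHERE %s", table, rangePredicate(column));
  }

  /** Counts the rows of one partition (assumed query shape). */
  static String countQuery(String table, String column) {
    return String.format("SELECT COUNT(*) FROM %s WHERE %s", table, rangePredicate(column));
  }

  public static void main(String[] args) {
    System.out.println(boundaryQuery("orders", "uuid_pk"));
    System.out.println(readQuery("orders", "uuid_pk"));
    System.out.println(countQuery("orders", "uuid_pk"));
  }
}
```

With this shape, the boundary values come back as text that can be fed to the mapper, and the string parameters bound through JDBC are cast to uuid on the server before being compared with the column.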