API: Use unsigned byte-wise comparison to fix UUID comparison bug #14500
Conversation
Thanks for handling this!
dimas-b left a comment:
The proposed change looks reasonable to me given the explanation in #14216... however, I cannot assess the impact on existing Iceberg clients and tables. I hope other reviewers can help with that.
Resolved (outdated) review threads on api/src/test/java/org/apache/iceberg/expressions/TestInclusiveManifestEvaluator.java and api/src/test/java/org/apache/iceberg/expressions/TestInclusiveMetricsEvaluator.java.
Overall the change LGTM; I just left a few minor comments, but we need to decide in the community how we want to proceed with this change.
@bodduv: Could you please help me understand the effect of this change on existing tables?
Am I correct that, after the upgrade, the metadata filtering will skip the new file (UUID_MIDDLE < UUID_MAX), i.e. it will be filtered out by the wrong min value?
Thank you for the comment @pvary.
It matters how a query engine prepares the min and max values for UUID columns before handing them over for writing manifest files and manifest lists. Some engines could use the min and max values as prepared by Parquet Java (which is RFC compliant) during writes.
Yes, if the min and max metrics persisted in the manifest file and manifest list were constructed using the faulty, non-RFC-compliant UUID comparison, then we would not be able to read the new file back with such a filter (on the UUID column) after upgrading. What is even more problematic: even an equality filter on such a UUID value would skip the file. A remedy would be to migrate the table (doing a full table scan) to rewrite the metrics accurately. Note: this issue exists only in the Java implementation of the spec. The Go, Rust, and C++ implementations are RFC compliant, which makes the bug more severe: if the same table is read with a filter using the Go implementation, it produces correct, but different, records than what the Java implementation produces.
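For illustration only (the values below are hypothetical and not taken from this PR's tests), the following sketch shows where Java's signed UUID ordering and an RFC-style unsigned ordering disagree: a UUID whose most significant bit is set has a negative mostSigBits, so the signed ordering places it first.

```java
import java.util.Comparator;
import java.util.UUID;

// Illustration only: hypothetical values, not from the PR's tests.
public class UuidOrderDisagreement {
  // Unsigned comparison of the 128-bit value; equivalent to unsigned
  // byte-wise comparison of the big-endian 16-byte encoding.
  static final Comparator<UUID> UNSIGNED = (a, b) -> {
    int cmp = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
    return cmp != 0
        ? cmp
        : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
  };

  public static void main(String[] args) {
    UUID allZeros = UUID.fromString("00000000-0000-0000-0000-000000000000");
    UUID allOnes = UUID.fromString("ffffffff-ffff-ffff-ffff-ffffffffffff");

    // Signed comparison treats 0xffff... as a negative long, so "all ones" sorts first.
    System.out.println(allOnes.compareTo(allZeros));         // negative
    // Unsigned comparison sorts "all ones" last, matching the RFC byte order.
    System.out.println(UNSIGNED.compare(allOnes, allZeros)); // positive
  }
}
```

A writer that derives column bounds with the signed ordering can therefore persist, as the lower bound, a value that is actually the largest in RFC order, which is how a filter on UUID_MIDDLE can end up pruning the file.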
This is a serious behavioral change which could affect correctness. I would try to resurrect the thread with a summary (a short, easily understandable problem statement) and with a focused, more detailed description. I would also add this to the next community sync topics.
It is. But I also think this is a serious data correctness bug in the Java implementation of the Iceberg spec. If we would like to preserve the old, non-RFC-compliant way of comparing UUIDs, then there is a disparity with other implementations of the spec. So the actual question is: what does the Iceberg spec say about UUID comparisons, and how should UUID values be compared?
I should clarify regarding ^this: if we stick to RFC compliance, then we do NOT need a solution for implementations other than Java, as the other implementations are not affected by this UUID comparison bug. Let me clarify: suppose one uses the Go implementation of the spec to create an Iceberg table with a UUID column just like above. In this case, min=UUID_MIN and max=UUID_MAX, compliant with the RFC. No surprises while using a filter on UUID_MIDDLE; the new file should be read correctly. Query engines using the Java implementation of the spec might need to revisit UUID comparisons. There is another approach of disabling any manifest entry filtering (data file filtering) and manifest file filtering (partition pruning) so as not to trigger any UUID comparisons (via Iceberg Java APIs). Trino [1] has taken this approach.
Thank you @pvary for the effort and for taking a closer look into this. [1] Trino fixed trinodb/trino#12834 via trinodb/trino#12911.
I echo @pvary's concern that this is a pretty significant behavior change, as the ordering won't be stable for UUIDs between versions. I'm digging into the RFC's exact language, but I also came across https://stackoverflow.com/questions/79489893/does-the-uuid-comparison-in-java-violate-the-uuid-standard; it sounds like this sorting behavior is specifically defined for UUIDv6/v7, and sorting is not prescribed for v4? At the same time, the OpenJDK folks acknowledge that this is a bug, so I'm not sure yet (like I said, still digging into it): https://bugs.openjdk.org/browse/JDK-7025832
In general, I think we should close this ambiguity on the sorting of UUIDs in the spec definition; as you pointed out, implementations are inconsistent in how this is performed. Whether this definition should be based on the RFC, or whether there's a good argument to retroactively work from the Java reference implementation's behavior, is something I'm still thinking through, and I think we should discuss it on the mailing list.
I have addressed how RFCs define ordering among UUIDs in #14216. RFC 9562 section "6.1.1. Sorting" mentions "UUID formats created by this specification are intended to be lexicographically sortable while in the textual representation."
Maintaining backward compatibility is an argument in this case. But on the other hand, looking forward, is there a possibility to communicate breaking changes and provide table migration strategy(ies) in the release docs? I suppose it's a topic for the mailing list.
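To make the RFC 9562 "lexicographically sortable while in the textual representation" point above concrete, here is a minimal sketch (my own illustration, not code from this PR): lexicographic ordering of the canonical lowercase textual form agrees with unsigned comparison of the 128-bit value, because '0'-'9' sort before 'a'-'f' in ASCII and the hyphens sit at fixed positions in both strings.

```java
import java.util.Comparator;
import java.util.UUID;

// Illustration only: textual order vs. unsigned 128-bit order.
public class UuidTextualSortDemo {
  static final Comparator<UUID> UNSIGNED = (a, b) -> {
    int cmp = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
    return cmp != 0
        ? cmp
        : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
  };

  public static void main(String[] args) {
    for (int i = 0; i < 1_000; i++) {
      UUID a = UUID.randomUUID();
      UUID b = UUID.randomUUID();
      int textual = Integer.signum(a.toString().compareTo(b.toString()));
      int unsigned = Integer.signum(UNSIGNED.compare(a, b));
      if (textual != unsigned) {
        throw new AssertionError("orders disagree for " + a + " / " + b);
      }
    }
    System.out.println("textual and unsigned orderings agree");
  }
}
```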
Anything which could be a breaking change needs to be discussed on the dev list and, if needed, discussed at the sync.
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that's incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.
@pvary Thank you for promptly jumping in (during the previous community sync) while I could not recollect the details of why we wanted to pause on the behavior change. It was discussed that we should evaluate expressions once with the RFC-compliant comparator and again with the signed comparator, and then allow the filter to pass if either of the expression evaluations is true. I attempt this in fe9faf5. The following is a short description of the changes.
The implementation around expression evaluation was not conducive to such two-fold evaluations, so I had to force it. I hope the increased code complexity is acceptable.
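A minimal sketch of that two-fold evaluation idea, assuming hypothetical names (this is not the actual code in fe9faf5, which changes Iceberg's expression evaluators): evaluate the bounds check with the RFC-compliant comparator and again with the legacy signed comparator, and keep the file if either says rows might match.

```java
import java.util.Comparator;
import java.util.UUID;

// Sketch only; class and method names are hypothetical, not Iceberg's.
class TwoFoldUuidBoundsCheck {
  // RFC-style unsigned ordering of the 128-bit value.
  private static final Comparator<UUID> RFC_UNSIGNED = (a, b) -> {
    int cmp = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
    return cmp != 0
        ? cmp
        : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
  };

  // Legacy ordering: Java's signed UUID.compareTo().
  private static final Comparator<UUID> LEGACY_SIGNED = Comparator.naturalOrder();

  // "value == literal" might match a file whose persisted bounds are [lower, upper]
  // when those bounds are interpreted under the given comparator.
  private static boolean mightMatchEq(Comparator<UUID> cmp, UUID lower, UUID upper, UUID literal) {
    return cmp.compare(lower, literal) <= 0 && cmp.compare(literal, upper) <= 0;
  }

  // Keep the file if either interpretation of the persisted bounds could match,
  // so files whose metrics were written with either comparator are never wrongly skipped.
  static boolean shouldRead(UUID lower, UUID upper, UUID literal) {
    return mightMatchEq(RFC_UNSIGNED, lower, upper, literal)
        || mightMatchEq(LEGACY_SIGNED, lower, upper, literal);
  }
}
```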
Java's UUID.compareTo() uses signed comparison of the most significant bits and the least significant bits, which is not compliant with RFC 4122 and RFC 9562. This causes incorrect ManifestEntry and ManifestFile filtering/pruning in the presence of UUID filters.
Force-pushed from 1f74764 to fe9faf5 ("…sses either RFC or non-RFC compliant UUID comparator").
Fixes #14216
The problem is described in the issue as well as in the dev mailing list post.
The changes proposed in the PR take a simple approach to fix the comparison bug. This directly addresses correctness issues in both ManifestEntry and ManifestFile filtering. Note that the changes keep UUID value comparison forward-compatible while making it compliant with the UUID RFCs.
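For reference, a minimal sketch of what "unsigned byte-wise comparison" means for UUID values, assuming the big-endian 16-byte encoding (illustration only; the comparator actually added by the PR may be structured differently):

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.UUID;

// Illustration of an unsigned byte-wise UUID comparator (not necessarily the PR's code).
final class UnsignedByteWiseUuidComparator implements Comparator<UUID> {
  private static byte[] toBigEndianBytes(UUID u) {
    // 16-byte big-endian encoding, matching the byte order of the textual representation.
    return ByteBuffer.allocate(16)
        .putLong(u.getMostSignificantBits())
        .putLong(u.getLeastSignificantBits())
        .array();
  }

  @Override
  public int compare(UUID a, UUID b) {
    byte[] left = toBigEndianBytes(a);
    byte[] right = toBigEndianBytes(b);
    for (int i = 0; i < 16; i++) {
      // Mask to treat each byte as unsigned (0..255) before comparing.
      int cmp = Integer.compare(left[i] & 0xFF, right[i] & 0xFF);
      if (cmp != 0) {
        return cmp;
      }
    }
    return 0;
  }
}
```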