Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitmodules
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[submodule "iceberg-testing"]
path = iceberg-testing
url = https://github.com/sungwy/iceberg-testing.git
26 changes: 26 additions & 0 deletions conformance/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# Conformance fixtures (POC — submodule wired, test deferred)

This pins [`sungwy/iceberg-testing`](https://github.com/sungwy/iceberg-testing) as a
git submodule at `iceberg-testing/`: language-neutral conformance fixtures for
Apache Iceberg, modeled on `apache/parquet-testing`. Three surfaces:

- `table-spec/types/` — type-string parse + exact re-serialize (canonical)
- `table-spec/transforms/bucket/` — the Appendix B 32-bit bucket hash, as a
Known-Answer-Test, including byte-boundary decimals and a non-BMP string
- `table-spec/delete-formats/` — positional and equality delete-file decode by
field-id

A C++ (`ctest`) harness that walks these fixtures is **not included in this POC**.
It was prepared in an environment without a C++ toolchain, so rather than commit an
unverified, possibly non-compiling test, the wiring is left as a TODO. The PyIceberg
fork has the worked example, including how a consumer keeps a local staged-adoption
list for cases it does not yet satisfy:
https://github.com/sungwy/iceberg-python/pull/1

To consume: `git submodule update --init`, then walk `iceberg-testing/table-spec/**`
and apply each surface README's assertion (parse + re-serialize; compute the bucket
hash and compare; decode delete-file columns by field-id).

The bucket Known-Answer-Test is the highest-signal surface to wire first: the
expected hashes are taken from the spec, and the byte-boundary decimal cases
(`-1.28`, `-327.68`) are where minimal two's-complement encoding has diverged.
1 change: 1 addition & 0 deletions iceberg-testing
Submodule iceberg-testing added at 0f4dce
Loading