Full Unicode case-mapping parity for case-insensitive field matching #808

### Context

Issue #613 asks for case-insensitive field matching consistent with iceberg-java and iceberg-python (both Unicode-aware), with `İ` (U+0130) as the example. PR #760 (Part of #613) made `StringUtils::ToLower` Unicode-aware using utf8proc **simple (1:1)** case mapping and added allocation-free ASCII fast paths.

This issue captures the **design and remaining plan** to reach full parity and tracks the follow-up PRs.

### Remaining gap

utf8proc's simple mapping still diverges from Java and Python for the few code points where the simple and full case mappings differ, chiefly `İ`:

| input | iceberg-cpp (simple) | iceberg-java `toLowerCase(Locale.ROOT)` / Python `str.lower()` |
|---|---|---|
| `İ` (U+0130) | `i` (U+0069) | `i̇` (U+0069 U+0307) |

So `EqualsIgnoreCase("İD", "id")` is **true** in iceberg-cpp but **false** in Java and Python, which is the inconsistency #613 is about.
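
The divergence can be reproduced in Python, whose `str.lower()` applies the full mapping. The sketch below is illustrative only: `simple_lower_emulated` is a hypothetical helper that approximates a 1:1 mapping by keeping the first code point of each character's full lowercase form. It is not utf8proc's implementation.

```python
def full_lower(s: str) -> str:
    # Python's str.lower() applies full Unicode lowercase mapping,
    # so one input code point may expand to several.
    return s.lower()


def simple_lower_emulated(s: str) -> str:
    # Hypothetical stand-in for a simple (1:1) mapping: lower each code
    # point on its own and keep only the first resulting code point.
    return "".join(ch.lower()[:1] for ch in s)


name = "\u0130D"  # "İD"

# The simple mapping collapses U+0130 to a plain "i", so the name matches "id".
print(simple_lower_emulated(name) == "id")

# The full mapping keeps the combining dot above (U+0307), so it does not.
print(full_lower(name) == "id")
print([hex(ord(c)) for c in full_lower(name)])
```
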

### Design questions

- Match iceberg-java `toLowerCase(Locale.ROOT)` exactly; confirm the operation PyIceberg uses for matching and that it agrees.
- Full lowercase mapping vs. Unicode case folding: utf8proc offers full case folding (`utf8proc_map` + `UTF8PROC_CASEFOLD`); verify that it reproduces the Java/Python result, or add a small explicit mapping.
- Keep the ASCII fast path; stream the non-ASCII path rather than materializing lowered copies.
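
Why the lowercase-vs-case-folding question matters can be seen in Python, where `str.lower()` and `str.casefold()` agree on `İ` but not on every input. This is a minimal sketch that uses Python's `str.casefold()` as a stand-in for Unicode full case folding. Whether utf8proc's `UTF8PROC_CASEFOLD` behaves the same way is what the item above asks to verify, and `compare_mappings` is a hypothetical helper.

```python
def compare_mappings(s: str) -> tuple[str, str]:
    # Returns (full lowercase, full case folding) for the same input.
    return s.lower(), s.casefold()


# U+0130: both operations yield "i" followed by U+0307.
lower_i, fold_i = compare_mappings("\u0130")
print(lower_i == fold_i)

# U+00DF (sharp s): lowercase keeps it, case folding expands it to "ss".
lower_ss, fold_ss = compare_mappings("\u00df")
print(lower_ss, fold_ss)

# U+017F (long s): already lowercase, but case folding maps it to "s".
lower_ls, fold_ls = compare_mappings("\u017f")
print(lower_ls == fold_ls)
```

If matching has to agree with `toLowerCase(Locale.ROOT)` exactly, inputs such as `ß` are the cases where substituting case folding for lowercase would change the result.
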

### Work Items

- [ ] Full case mapping to close the `İ` / Java-parity gap
- [ ] Streaming / allocation-free non-ASCII comparison in `EqualsIgnoreCase` / `StartsWithIgnoreCase` (deferred from #760)
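
The shape of the second item can be sketched in Python: an ASCII fast path that compares in place, and a non-ASCII path that lowers code points lazily instead of building lowered copies. This is an assumption-laden illustration, not the iceberg-cpp implementation. `equals_ignore_case`, `_ascii_lower`, and `_lowered_code_points` are hypothetical names, and lowering one code point at a time ignores context-sensitive rules such as Greek final sigma.

```python
from itertools import zip_longest


def _ascii_lower(ch: str) -> str:
    # Lower a single ASCII letter without touching anything else.
    return chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch


def _lowered_code_points(s: str):
    # Lazily yield the full lowercase mapping, one code point at a time.
    for ch in s:
        yield from ch.lower()


def equals_ignore_case(a: str, b: str) -> bool:
    if a.isascii() and b.isascii():
        # Fast path: lengths must match, then compare pairwise.
        return len(a) == len(b) and all(
            _ascii_lower(x) == _ascii_lower(y) for x, y in zip(a, b)
        )
    # Streaming path: stop at the first mismatch. zip_longest pads the
    # shorter stream with None, which never equals a code point.
    return all(
        x == y
        for x, y in zip_longest(_lowered_code_points(a), _lowered_code_points(b))
    )


print(equals_ignore_case("ID", "id"))
print(equals_ignore_case("\u0130D", "id"))
print(equals_ignore_case("\u0130D", "i\u0307d"))
```

The lengths of the two inputs cannot be compared up front on the non-ASCII path, because a full mapping may expand one code point into several.
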

### References

- Issue: #613 (origin)
