
[lumina] Use VECTOR length for index dimension #8235

Open
QuakeWang wants to merge 1 commit into apache:master from QuakeWang:lumina-vector-dim

@QuakeWang

Contributor

Purpose

Lumina vector writers used lumina.index.dimension for both ARRAY and VECTOR fields. For VECTOR fields, the schema already carries the fixed length, so falling back to the default dimension could make write validation and index metadata disagree with the actual field type.

This PR resolves the effective dimension from VectorType#getLength(), rejects explicit dimension conflicts, and builds native Lumina options from the resolved dimension so PQ options are capped against the real dimension. ArrayType<FLOAT> continues to use lumina.index.dimension.
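
The resolution rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `DimensionResolver`, `resolveDimension`, and the `DEFAULT_DIMENSION` value are hypothetical stand-ins, and the field type is modelled with a plain `OptionalInt` (present for a VECTOR field's length, empty for `ArrayType<FLOAT>`) rather than Paimon's own type classes.

```java
import java.util.OptionalInt;

// Hypothetical sketch of the dimension resolution described in this PR.
// None of these names come from the paimon-lumina module.
final class DimensionResolver {

    // Assumed placeholder for the default of lumina.index.dimension;
    // the actual default value is not stated in the PR description.
    static final int DEFAULT_DIMENSION = 128;

    /**
     * @param vectorLength        fixed length of a VECTOR field, or empty for ARRAY<FLOAT>
     * @param configuredDimension explicitly set lumina.index.dimension, or empty if unset
     * @return the dimension the writer and the index metadata should both use
     */
    static int resolveDimension(OptionalInt vectorLength, OptionalInt configuredDimension) {
        if (vectorLength.isPresent()) {
            int length = vectorLength.getAsInt();
            // An explicit option that disagrees with the schema is rejected
            // instead of silently overriding the schema.
            if (configuredDimension.isPresent() && configuredDimension.getAsInt() != length) {
                throw new IllegalArgumentException(
                        "lumina.index.dimension=" + configuredDimension.getAsInt()
                                + " conflicts with VECTOR length " + length);
            }
            return length;
        }
        // ARRAY<FLOAT> carries no length, so the option (or its default) decides.
        return configuredDimension.orElse(DEFAULT_DIMENSION);
    }

    public static void main(String[] args) {
        // VECTOR field of length 768 with no explicit option: the schema length is used.
        System.out.println(resolveDimension(OptionalInt.of(768), OptionalInt.empty()));
        // ARRAY<FLOAT> field with an explicit option: the option is used.
        System.out.println(resolveDimension(OptionalInt.empty(), OptionalInt.of(256)));
    }
}
```

In this sketch an explicit option that matches the VECTOR length is accepted, so existing table options keep working as long as they agree with the schema.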

Tests

  • mvn -pl paimon-lumina -am -DfailIfNoTests=false -Dtest=LuminaVectorGlobalIndexWriterTest,LuminaVectorOptionsTest test

Lumina vector writers previously used lumina.index.dimension for both ARRAY and VECTOR fields. For VECTOR fields this could fall back to the default dimension even though the schema already carries the fixed length, causing write validation and index metadata to disagree with the actual field type.

Resolve the effective dimension from VECTOR length, reject explicit dimension conflicts, and build native Lumina options from that resolved dimension so PQ options are capped against the real dimension.

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
