KAFKA-20400: Enable VersionedKeyValueStoreWithHeaders in DSL#21960

Open
Shekharrajak wants to merge 7 commits into apache:trunk from Shekharrajak:feature/versioned-kv-store-with-headers

@Shekharrajak Shekharrajak commented Apr 4, 2026

Ref https://issues.apache.org/jira/browse/KAFKA-20400

This PR extends the KIP-1285 headers-aware state store infrastructure to versioned key-value stores. Prior to this change, KIP-1285 was implemented for timestamped, windowed, and session stores, but versioned stores silently dropped record headers on both the write and read paths.


Record headers carry metadata (tracing IDs, schema references, auth tokens) that downstream processors and interactive query clients may need. Versioned stores (RocksDBVersionedStore) were the only remaining store type that discarded headers. This PR closes that gap.


API changes:

VersionedRecord -- Added Headers headers field with new constructors and a headers() accessor. Existing constructors default to empty headers (backward compatible).

Stores.persistentVersionedKeyValueStoreWithHeaders(String, Duration) -- Factory method returning a dual-interface supplier.

Stores.versionedKeyValueStoreBuilderWithHeaders(VersionedBytesStoreSupplier, Serde, Serde) -- Builder factory method.

RocksDBVersionedStoreWithHeaders -- Extends RocksDBVersionedStore, encodes headers into value bytes using [headersSize(varint)][headersBytes][rawValue] format before delegating to the parent store. Decodes on read.

ChangeLoggingVersionedKeyValueBytesStore -- Added put(key, value, timestamp, headers) override to forward real headers to the changelog instead of empty headers.

KeyValueStoreWrapper -- get() now returns VersionedRecord.headers() instead of empty headers; put() forwards headers to VersionedKeyValueStoreWithHeaders when available.
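
The [headersSize(varint)][headersBytes][rawValue] layout described for RocksDBVersionedStoreWithHeaders can be illustrated with a minimal, dependency-free sketch. This is not the PR's code: the class HeadersValueCodec and its helpers are hypothetical, the headers are assumed to be already serialized to bytes, the varint is assumed to be an unsigned LEB128-style encoding (the PR may use a different flavor), and null values (tombstones) are not handled.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of [headersSize(varint)][headersBytes][rawValue].
final class HeadersValueCodec {

    // Assumption: unsigned LEB128 varint, 7 payload bits per byte, low bits first.
    static void writeUnsignedVarint(int value, ByteArrayOutputStream out) {
        while ((value & 0xFFFFFF80) != 0) {
            out.write((value & 0x7F) | 0x80); // set the continuation bit
            value >>>= 7;
        }
        out.write(value);
    }

    static int readUnsignedVarint(ByteBuffer in) {
        int value = 0;
        int shift = 0;
        byte b;
        while (((b = in.get()) & 0x80) != 0) {
            value |= (b & 0x7F) << shift;
            shift += 7;
            if (shift > 28) {
                throw new IllegalArgumentException("varint is longer than 5 bytes");
            }
        }
        return value | (b << shift);
    }

    // Prefix the raw value with the size-delimited, pre-serialized headers.
    static byte[] encode(byte[] headersBytes, byte[] rawValue) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeUnsignedVarint(headersBytes.length, out);
        out.write(headersBytes, 0, headersBytes.length);
        out.write(rawValue, 0, rawValue.length);
        return out.toByteArray();
    }

    // Returns {headersBytes, rawValue} split out of the stored bytes.
    static byte[][] decode(byte[] stored) {
        ByteBuffer in = ByteBuffer.wrap(stored);
        int headersSize = readUnsignedVarint(in);
        byte[] headersBytes = new byte[headersSize];
        in.get(headersBytes);
        byte[] rawValue = new byte[in.remaining()];
        in.get(rawValue);
        return new byte[][] {headersBytes, rawValue};
    }

    public static void main(String[] args) {
        byte[] headers = "trace-id=abc".getBytes(StandardCharsets.UTF_8);
        byte[] value = "payload".getBytes(StandardCharsets.UTF_8);
        byte[][] parts = decode(encode(headers, value));
        System.out.println(new String(parts[0], StandardCharsets.UTF_8));
        System.out.println(new String(parts[1], StandardCharsets.UTF_8));
    }
}
```

Because the headers are length-prefixed and the raw value takes the remainder of the buffer, a wrapper of this shape can hand the combined bytes to an unmodified parent store and split them again on read, which matches the "encode before delegating, decode on read" description above.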

@github-actions github-actions bot added triage PRs from the community streams labels Apr 4, 2026
@Shekharrajak Shekharrajak changed the title Enable VersionedKeyValueStoreWithHeaders in DSL KAFKA-20400 Enable VersionedKeyValueStoreWithHeaders in DSL Apr 4, 2026
@Shekharrajak Shekharrajak changed the title KAFKA-20400 Enable VersionedKeyValueStoreWithHeaders in DSL KAFKA-20400: Enable VersionedKeyValueStoreWithHeaders in DSL Apr 4, 2026
private final V value;
private final long timestamp;
private final Optional<Long> validTo;
private final Headers headers;
Contributor

Shouldn't the equals() method also check headers equality?

Author

@Shekharrajak Shekharrajak Apr 4, 2026

Thanks for the review.

This was an intentional design choice to preserve backward compatibility. VersionedRecord.equals() is a pre-existing contract -- adding headers to it would change comparison semantics for existing code that relies on the current behavior. Headers are metadata, not part of the record version identity (two records with the same value/timestamp but different tracing headers are the same logical version). That said, I can see the argument for consistency with ValueTimestampHeaders.equals(). If the consensus is to include headers in equality, I can make that change -- though it would need a note in the doc about the behavioral change. What do you think?

Contributor

IMO headers should be part of equals() and hashCode() for consistency.
For backwards compatibility, new RecordHeaders() will be empty I guess, so the behaviour of equals() and hashCode() is still OK for existing callers.

Author

You are right. For consistency with ValueTimestampHeaders.equals() and the general Java convention that all significant fields participate in equality, I will include headers in both equals() and hashCode(). The backward compatibility risk is low since existing code constructs VersionedRecord with the no-headers constructors (which default to empty RecordHeaders), and non-headers stores also return records with empty headers. I will update this in the next push.
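
The equality semantics agreed on in this thread can be sketched as follows. This is a simplified stand-in, not the actual VersionedRecord: the class VersionedRecordSketch is hypothetical, and a plain Map<String, String> is assumed in place of Kafka's Headers type to keep the example dependency-free.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical stand-in for VersionedRecord with headers included in equality.
final class VersionedRecordSketch<V> {
    private final V value;
    private final long timestamp;
    private final Optional<Long> validTo;
    // Assumption: a map stands in for org.apache.kafka.common.header.Headers.
    private final Map<String, String> headers;

    // Pre-existing style of constructor: defaults to empty headers.
    VersionedRecordSketch(V value, long timestamp) {
        this(value, timestamp, Collections.emptyMap());
    }

    VersionedRecordSketch(V value, long timestamp, Map<String, String> headers) {
        this.value = Objects.requireNonNull(value, "value");
        this.timestamp = timestamp;
        this.validTo = Optional.empty();
        this.headers = new HashMap<>(headers); // defensive copy
    }

    Map<String, String> headers() {
        return Collections.unmodifiableMap(headers);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof VersionedRecordSketch)) {
            return false;
        }
        VersionedRecordSketch<?> that = (VersionedRecordSketch<?>) o;
        return timestamp == that.timestamp
            && Objects.equals(value, that.value)
            && Objects.equals(validTo, that.validTo)
            && Objects.equals(headers, that.headers); // headers now participate
    }

    @Override
    public int hashCode() {
        return Objects.hash(value, timestamp, validTo, headers);
    }
}
```

Under these assumptions, two records built through the no-headers constructor still compare equal, because both carry empty headers, while records that differ only in headers no longer do.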

Contributor

> For consistency with ValueTimestampHeaders.equals() and the general Java convention that all significant fields participate in equality

Was this really written by you?

@github-actions github-actions bot removed the triage PRs from the community label Apr 5, 2026
@Shekharrajak Shekharrajak force-pushed the feature/versioned-kv-store-with-headers branch from 0ecf489 to 6260c3c Compare April 5, 2026 17:13
Contributor

@muralibasani muralibasani left a comment

LGTM, thank you for addressing the comments.

@Shekharrajak
Author

@mjsax @aliehsaeedii Please have a look.

@UladzislauBlok
Contributor

@Shekharrajak why are we bringing in the same code twice? See #21995.
This doesn't look correct.

@Shekharrajak
Author

> @Shekharrajak why do we bring the same code twice? #21995 This doesn't look correct

I already mentioned this here: https://issues.apache.org/jira/browse/KAFKA-20400?focusedCommentId=18071886&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-18071886

Please let me know if both tickets should be addressed in a single PR/branch. My understanding was that KAFKA-20399 covers the store-level implementation (storing and retrieving versioned key-value data along with record headers in RocksDB).


3 participants