[server][dvc] Fail-fast on blob-transfer PartitionState/StoreVersionState schema-version mismatch #2811

Open
jingy-li wants to merge 5 commits into linkedin:main from jingy-li:partition-version-check

@jingy-li
Contributor

Problem Statement

P2P blob transfer ships RocksDB snapshot files plus a BlobTransferPartitionMetadata payload that embeds Avro-serialized
PartitionState (offset record) and StoreVersionState. Today the two peers do not negotiate the protocol versions they used to serialize that metadata, so the receiver only discovers a mismatch when it tries to deserialize the body. That discovery is:

  • Late: the client has already paid for every file byte by the time metadata is parsed, so a doomed transfer holds the bootstrap slot until the per-host receive timeout.
  • Indistinguishable from real corruption: the failure surfaces as a generic Avro deserialization error rather than a typed signal the orchestrator can use to fail over to the next peer / Kafka bootstrap.

Blob transfer is the fast path for bootstrap, and rolling deploys are exactly when it misfires: that is when the peer and local binaries can differ in their PartitionState / StoreVersionState versions.

Solution

Both sides (client/server) advertise their compiled-in AvroProtocolDefinition.{PARTITION_STATE,STORE_VERSION_STATE}.getCurrentProtocolVersion() and check at the earliest possible point:

  • Client → server sets X-Blob-Transfer-Partition-State-Schema-Version / …-Store-Version-State-Schema-Version on the GET in NettyFileTransferClient.
  • Server-side gate (P2PFileTransferServerHandler) compares against local versions next to the existing table-format check, before any file work begins. On mismatch it returns 400 BAD_REQUEST with marker header X-Blob-Transfer-Schema-Mismatch: true plus the server's own versions echoed back.
  • Server → client stamps the same two version headers on the metadata response.
  • Client-side gate (P2PFileTransferClientHandler) validates the metadata response headers at header-parse time (before consuming the body) and on the 400 error path, throwing the new typed VeniceBlobTransferIncompatibleSchemaException (in venice-client-common). The exception carries peer host + peer/local versions for both protocols so the orchestrator can log/route on it.
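
The request-side half of this handshake can be sketched roughly as below. This is a minimal illustration, not the PR's code: `SchemaGateSketch`, `Response`, `versionHeaders`, `isMismatch` and `gate` are hypothetical names, headers are modelled as a plain `Map` instead of Netty types, and only the three header names and the 400 status come from the description above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; class and method names are hypothetical, not Venice APIs. */
final class SchemaGateSketch {
  // Header names as given in this PR.
  static final String PARTITION_STATE_HEADER = "X-Blob-Transfer-Partition-State-Schema-Version";
  static final String STORE_VERSION_STATE_HEADER = "X-Blob-Transfer-Store-Version-State-Schema-Version";
  static final String MISMATCH_MARKER_HEADER = "X-Blob-Transfer-Schema-Mismatch";

  /** Minimal stand-in for an HTTP response: a status code plus headers. */
  static final class Response {
    final int status;
    final Map<String, String> headers;

    Response(int status, Map<String, String> headers) {
      this.status = status;
      this.headers = headers;
    }
  }

  /** Builds the two version headers; used by the client on the GET and by the server when echoing. */
  static Map<String, String> versionHeaders(int partitionStateVersion, int storeVersionStateVersion) {
    Map<String, String> headers = new HashMap<>();
    headers.put(PARTITION_STATE_HEADER, Integer.toString(partitionStateVersion));
    headers.put(STORE_VERSION_STATE_HEADER, Integer.toString(storeVersionStateVersion));
    return headers;
  }

  /** True only when the header is present, numeric, and different from the local version. */
  static boolean isMismatch(String advertised, int localVersion) {
    if (advertised == null) {
      return false; // older peer that does not send the header
    }
    try {
      return Integer.parseInt(advertised.trim()) != localVersion;
    } catch (NumberFormatException e) {
      return false; // malformed values pass through in this sketch
    }
  }

  /** Server side: a 400 rejection on mismatch, or empty to continue with the transfer. */
  static Optional<Response> gate(Map<String, String> requestHeaders, int localPartitionState, int localStoreVersionState) {
    boolean mismatch = isMismatch(requestHeaders.get(PARTITION_STATE_HEADER), localPartitionState)
        || isMismatch(requestHeaders.get(STORE_VERSION_STATE_HEADER), localStoreVersionState);
    if (!mismatch) {
      return Optional.empty();
    }
    // Echo the server's own versions so the client can report peer vs. local.
    Map<String, String> headers = versionHeaders(localPartitionState, localStoreVersionState);
    headers.put(MISMATCH_MARKER_HEADER, "true");
    return Optional.of(new Response(400, headers));
  }
}
```

In this sketch an empty `Optional` means "continue with the existing flow", which keeps the gate cheap to place ahead of any file work.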

Policy is exact equality (not "peer ≤ local"): blob transfer is the fast path; on skew we'd rather step aside to Kafka bootstrap than rely on cross-version Avro promotion of partition metadata.

Rolling-deploy compatibility is intentional:

  • Both headers absent → pass through (peer is on an older binary that doesn't emit them).
  • One header present, one absent → pass through; check only what's advertised.
  • Non-numeric / out-of-byte-range value → log a warning and pass through. The existing deserialization-time exception remains the safety net for the truly incompatible case, so there is no regression for old peers.
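
The pass-through rules above could be expressed along these lines. This is a sketch under stated assumptions: `LenientVersionCheck`, `Outcome`, `check` and `isCompatible` are hypothetical names, and "byte range" is assumed to mean Java's signed byte range (-128 to 127).

```java
import java.util.logging.Logger;

/** Illustrative sketch only; names are hypothetical, not Venice APIs. */
final class LenientVersionCheck {
  private static final Logger LOGGER = Logger.getLogger(LenientVersionCheck.class.getName());

  enum Outcome { MATCH, MISMATCH, PASS_THROUGH }

  /** Classifies one advertised header value against the local protocol version. */
  static Outcome check(String headerValue, int localVersion) {
    if (headerValue == null) {
      return Outcome.PASS_THROUGH; // peer is on an older binary that does not emit the header
    }
    int peerVersion;
    try {
      peerVersion = Integer.parseInt(headerValue.trim());
    } catch (NumberFormatException e) {
      LOGGER.warning("Ignoring non-numeric schema version header: " + headerValue);
      return Outcome.PASS_THROUGH;
    }
    // Assumption: protocol versions fit in a signed byte.
    if (peerVersion < Byte.MIN_VALUE || peerVersion > Byte.MAX_VALUE) {
      LOGGER.warning("Ignoring out-of-range schema version header: " + headerValue);
      return Outcome.PASS_THROUGH;
    }
    // Exact equality, not "peer <= local".
    return peerVersion == localVersion ? Outcome.MATCH : Outcome.MISMATCH;
  }

  /** Each header is judged on its own, so one absent header never blocks the other check. */
  static boolean isCompatible(String partitionStateHeader, int localPartitionState,
      String storeVersionStateHeader, int localStoreVersionState) {
    return check(partitionStateHeader, localPartitionState) != Outcome.MISMATCH
        && check(storeVersionStateHeader, localStoreVersionState) != Outcome.MISMATCH;
  }
}
```

Only a well-formed, in-range value that differs from the local version counts as a mismatch; every other case defers to the existing deserialization-time exception.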

Code changes

  • Added new code behind a config. If so list the config names and their default values in the PR description.
  • Introduced new log lines.
    • Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination.

How was this PR tested?

  • New unit tests added.
  • New integration tests added.
  • Modified or extended existing tests.
  • Verified backward compatibility (if applicable).

Does this PR introduce any user-facing or breaking changes?

  • No. You can skip the rest of this section.
  • Yes. Clearly explain the behavior change and its impact.

Jingyan Li and others added 5 commits April 30, 2026 16:23
The previous commit added a fast-fail check on the metadata response, but
the server's response order is files first, metadata last — so a client
that catches the mismatch at the metadata stage has already paid for the
entire file transfer. This commit moves the primary check to the request
side, modeled on the existing snapshot-table-format check next to it.

Client `prepareRequest` now stamps the two schema-version headers on the
GET. Server `channelRead0` calls a new
`BlobTransferUtils.compareRequestedSchemaVersionsAgainstLocal(...)` right
after the table-format validation. On mismatch it returns 400 BAD_REQUEST
with an `X-Blob-Transfer-Schema-Mismatch: true` marker header and echoes
its own protocol versions in the response, BEFORE any file work begins.

Client `P2PFileTransferClientHandler.channelRead0` recognizes the marker
on a non-OK response and throws the typed
`VeniceBlobTransferIncompatibleSchemaException` with full peer-vs-local
context (peer = server's local versions echoed in the rejection).
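
A rough sketch of how a client might turn such a rejection into a typed exception is shown below. All names here (`RejectionSketch`, `IncompatibleSchemaException`, `fromRejection`) are hypothetical stand-ins, the exception extends `RuntimeException` rather than `VeniceException` to stay self-contained, and the `-1` value for `VERSION_UNKNOWN` is an assumption.

```java
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; names are hypothetical, not Venice APIs. */
final class RejectionSketch {
  static final int VERSION_UNKNOWN = -1; // assumed sentinel value
  static final String PARTITION_STATE_HEADER = "X-Blob-Transfer-Partition-State-Schema-Version";
  static final String STORE_VERSION_STATE_HEADER = "X-Blob-Transfer-Store-Version-State-Schema-Version";
  static final String MISMATCH_MARKER_HEADER = "X-Blob-Transfer-Schema-Mismatch";

  /** Stand-in for the typed exception: carries peer host plus peer/local versions. */
  static final class IncompatibleSchemaException extends RuntimeException {
    final String peerHost;
    final int peerPartitionState;
    final int localPartitionState;
    final int peerStoreVersionState;
    final int localStoreVersionState;

    IncompatibleSchemaException(String peerHost, int peerPs, int localPs, int peerSvs, int localSvs) {
      super(String.format(
          "Incompatible schema with peer %s: PartitionState peer=%s local=%s, StoreVersionState peer=%s local=%s",
          peerHost, render(peerPs), render(localPs), render(peerSvs), render(localSvs)));
      this.peerHost = peerHost;
      this.peerPartitionState = peerPs;
      this.localPartitionState = localPs;
      this.peerStoreVersionState = peerSvs;
      this.localStoreVersionState = localSvs;
    }
  }

  /** Renders a version for the message, hiding the sentinel. */
  static String render(int version) {
    return version == VERSION_UNKNOWN ? "unknown" : Integer.toString(version);
  }

  /** Parses an echoed version, falling back to the sentinel when absent or malformed. */
  static int parseOrUnknown(String value) {
    if (value == null) {
      return VERSION_UNKNOWN;
    }
    try {
      return Integer.parseInt(value.trim());
    } catch (NumberFormatException e) {
      return VERSION_UNKNOWN;
    }
  }

  /** Returns the typed exception only for a non-OK response that carries the mismatch marker. */
  static Optional<IncompatibleSchemaException> fromRejection(
      String peerHost, int status, Map<String, String> headers, int localPs, int localSvs) {
    boolean marked = "true".equalsIgnoreCase(headers.get(MISMATCH_MARKER_HEADER));
    if (status == 200 || !marked) {
      return Optional.empty(); // some other failure; leave it to the existing error path
    }
    return Optional.of(new IncompatibleSchemaException(
        peerHost,
        parseOrUnknown(headers.get(PARTITION_STATE_HEADER)), localPs,
        parseOrUnknown(headers.get(STORE_VERSION_STATE_HEADER)), localSvs));
  }
}
```
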

Backward compat preserved: a request without the new headers (older
client) is not rejected — the server falls through to the existing flow.
The response-side check from the prior commit stays as the safety net for
the inverse case (older server that does not yet validate requests will
still send full metadata; new client catches at metadata-parse time
instead of waiting for the receive timeout).

Tests cover: server rejects mismatched request with 400 + marker + no
file responses; server doesn't reject requests without the headers
(backward compat); client builds the typed exception from the rejection
response with correct peer/local versions populated. Existing integration
tests in TestNettyP2PBlobTransferManager (real client↔server flow) all
pass unchanged because the new client always advertises versions that
match the new server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

[blob-transfer] add fast-fail schema version check on metadata response

The P2P blob-transfer client deserialized the peer's PartitionState bytes
inside P2PMetadataTransferHandler without a SchemaReader, so any peer that
serialized PartitionState with a protocol version higher than this binary
knows triggered VeniceMessageException at the Netty pipeline tail after
the body had been fully transferred. The transfer future was never failed
explicitly, so the replica had to wait for blobReceiveTimeoutInMin before
falling back to Kafka bootstrap. We saw this on ltx1-app12860.stg with
PartitionState v21 during the rolling deploy that introduced PR linkedin#2707.

Server now stamps two new headers on the metadata response:
  X-Blob-Transfer-Partition-State-Schema-Version
  X-Blob-Transfer-Store-Version-State-Schema-Version
each carrying the local AvroProtocolDefinition.X.getCurrentProtocolVersion().

Client compares peer's value to its own current version at HTTP header-parse
time, before any body is consumed, and throws the new
VeniceBlobTransferIncompatibleSchemaException on mismatch. Throwing from
channelRead0 flows through the existing exceptionCaught ->
completeExceptionally -> ctx.close() path, so the per-host transfer future
fails immediately and
the orchestrator can pick the next peer (or fall back to Kafka) without
waiting on the receive timeout.
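
The effect of failing the per-host future immediately can be illustrated with plain `CompletableFuture`s. This is a simplified sketch, not the Netty handler: `PeerFailoverSketch`, its `exceptionCaught` helper, `bootstrap`, and the `"KAFKA"` return value are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Function;

/** Illustrative sketch only; names are hypothetical, not Venice APIs. */
final class PeerFailoverSketch {
  /** Mirrors exceptionCaught -> completeExceptionally: fail the per-host future at once. */
  static void exceptionCaught(CompletableFuture<String> perHostFuture, Throwable cause) {
    perHostFuture.completeExceptionally(cause);
    // A real Netty handler would also close the channel here (ctx.close()).
  }

  /** Tries each peer in order; falls back to Kafka when every peer fails. */
  static String bootstrap(List<String> peers, Function<String, CompletableFuture<String>> transfer) {
    for (String peer : peers) {
      try {
        return transfer.apply(peer).join();
      } catch (CompletionException e) {
        // The future failed right away, so move to the next peer without waiting on a timeout.
      }
    }
    return "KAFKA";
  }
}
```

Because the future is completed exceptionally as soon as the mismatch is seen, `join()` returns control to the loop immediately instead of blocking until a receive timeout.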

Policy is exact equality. Blob transfer is the fast path; if binaries are
not in lock-step we'd rather skip P2P and let Kafka handle bootstrap than
rely on cross-version metadata promotion. Skew is bounded to rolling-deploy
windows.

Backward-compat is preserved during rollout: a missing header passes through
(peer not yet upgraded), and a malformed/out-of-range header logs a warning
and passes through (parse bug must not crash the channel). The existing
deserialize-time exception in InternalAvroSpecificSerializer remains as the
safety net for the truly incompatible case, so no regression vs. today.

Tests cover: header stamped on metadata response only (not on file responses),
known-version pass-through, mismatched PartitionState fails fast, mismatched
StoreVersionState fails fast, single-header-missing pass-through, and
malformed/out-of-range pass-through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

VeniceBlobTransferIncompatibleSchemaException

Covers the constructor, all getters, message formatting, and both
branches of render() (known version and VERSION_UNKNOWN sentinel).
Bumps venice-client-common diff branch coverage from 91.6% to 92.74%.

testIsVeniceException compared a VeniceBlobTransferIncompatibleSchemaException
against VeniceException via instanceof. Since the class extends
VeniceException, the relationship is enforced at compile time and SpotBugs
flags the runtime check as vacuous (BC_VACUOUS_INSTANCEOF). The remaining
three tests still cover the constructor, all getters, both branches of
render(), and the VERSION_UNKNOWN sentinel.

jingy-li requested a review from sixpluszero on May 20, 2026 22:24