[vpj] Support running VTConsistencyCheckerJob via VenicePushJob by Mohith22 · Pull Request #2805 · linkedin/venice

Mohith22 · 2026-05-20T02:28:16Z

Problem Statement

VTConsistencyCheckerJob (recently introduced) is a Spark job that scans Version Topics across two DCs to detect cross-region inconsistencies. Currently, we integrate it via VBnP using HadoopJavaOperator, which complicates the wiring. We want to integrate it with VenicePushOperator so it can be added seamlessly to Airflow DAGs. Also, the Spark job today doesn't fail the operator when inconsistencies are found; it ends silently after writing the errors to Parquet. It needs to throw an error so that the DAG step fails.

Solution

VPJ as the entry point. Added a vt.consistency.check.only flag. When set, VenicePushJob.run() short-circuits the push and delegates to VTConsistencyCheckerJob.run(props) via runVTConsistencyCheck().
Fail-fast on findings. Added a LongAccumulator inconsistenciesFound in the checker, incremented per detected mismatch. After the Parquet write completes (so forensics are preserved on disk), the job throws VeniceException if the count is non-zero.
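
The two points above can be illustrated with a minimal, dependency-free sketch. It is not the code in this PR: `Checker` and `InconsistencyException` are hypothetical stand-ins for VTConsistencyCheckerJob and VeniceException, a `LongAdder` plays the role of Spark's LongAccumulator, the Parquet write is reduced to a boolean, and the `false` default for the flag is an assumption.

```java
import java.util.Properties;
import java.util.concurrent.atomic.LongAdder;

class VtConsistencyCheckSketch {
  // Config key taken from the PR description; the "false" default used below is an assumption.
  static final String VT_CONSISTENCY_CHECK_ONLY = "vt.consistency.check.only";

  /** Hypothetical stand-in for VeniceException. */
  static class InconsistencyException extends RuntimeException {
    InconsistencyException(String message) {
      super(message);
    }
  }

  /** Hypothetical stand-in for the checker; LongAdder replaces Spark's LongAccumulator. */
  static class Checker {
    final LongAdder inconsistenciesFound = new LongAdder();
    boolean reportWritten = false;

    void run(long[] dc0Values, long[] dc1Values) {
      int n = Math.min(dc0Values.length, dc1Values.length);
      for (int i = 0; i < n; i++) {
        if (dc0Values[i] != dc1Values[i]) {
          inconsistenciesFound.increment(); // one increment per detected mismatch
        }
      }
      reportWritten = true; // stands in for the Parquet write; it happens before any throw
      long count = inconsistenciesFound.sum();
      if (count > 0) {
        throw new InconsistencyException("Found " + count + " VT inconsistencies");
      }
    }
  }

  /** Returns which path ran: "check" when the flag is set, otherwise "push". */
  static String run(Properties props, Checker checker, long[] dc0Values, long[] dc1Values) {
    boolean checkOnly = Boolean.parseBoolean(props.getProperty(VT_CONSISTENCY_CHECK_ONLY, "false"));
    if (checkOnly) {
      checker.run(dc0Values, dc1Values); // short-circuit: the push is skipped entirely
      return "check";
    }
    return "push"; // the regular push path would run here
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(VT_CONSISTENCY_CHECK_ONLY, "true");
    Checker checker = new Checker();
    try {
      run(props, checker, new long[] { 1, 2, 3 }, new long[] { 1, 9, 3 });
    } catch (InconsistencyException e) {
      System.out.println(e.getMessage() + ", report written: " + checker.reportWritten);
    }
  }
}
```

Throwing only after the write means a failed DAG step still leaves the full report on disk for investigation.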

Code changes

Added new code behind a config. If so list the config names and their default values in the PR description.
Introduced new log lines.
Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

Code has no race conditions or thread safety issues.
Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
Validated proper exception handling in multi-threaded code to avoid silent thread termination.

How was this PR tested?

New unit tests added.
New integration tests added.
Modified or extended existing tests.
Verified backward compatibility (if applicable).

Does this PR introduce any user-facing or breaking changes?

No. You can skip the rest of this section.
Yes. Clearly explain the behavior change and its impact.

sushantmane · 2026-05-20T11:17:34Z

+  /**
+   * When set to {@code true}, {@link com.linkedin.venice.hadoop.VenicePushJob#run()} skips the
+   * push and instead invokes {@link com.linkedin.venice.spark.consistency.VTConsistencyCheckerJob}
+   * against the store's current version topic.
+   */
+  public static final String VT_CONSISTENCY_CHECK_ONLY = "vt.consistency.check.only";


For heartbeat push jobs that want produce + verify in one run, the natural UX is: flip a single flag and the job does the push and then the consistency check. Today VT_CONSISTENCY_CHECK_ONLY=true is mutually exclusive with the push, so the user has to schedule a second VPJ run with the flag flipped. The second run also targets whatever the current version is at that later time, not necessarily the version just pushed.

A simpler shape: keep this flag for the "skip push, check only" path, and add a sibling vt.consistency.check.after.push that runs the checker as a follow-on phase inside the same run() after a successful push. Heartbeat jobs then flip one config and get produce + post-push consistency check in one invocation.
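
A rough sketch of how the two flags could resolve into a run mode, to make the proposal concrete. `vt.consistency.check.after.push` is only the name suggested in this comment and does not exist in the PR; the `Mode` enum, the `resolve` helper, the `false` defaults, and rejecting both flags together are all assumptions for illustration.

```java
import java.util.Properties;

class VtCheckModeSketch {
  static final String CHECK_ONLY = "vt.consistency.check.only"; // added by this PR
  static final String CHECK_AFTER_PUSH = "vt.consistency.check.after.push"; // proposed in review only

  /** Hypothetical run modes; not part of the PR. */
  enum Mode {
    PUSH_ONLY, CHECK_ONLY, PUSH_THEN_CHECK
  }

  static Mode resolve(Properties props) {
    boolean checkOnly = Boolean.parseBoolean(props.getProperty(CHECK_ONLY, "false"));
    boolean afterPush = Boolean.parseBoolean(props.getProperty(CHECK_AFTER_PUSH, "false"));
    if (checkOnly && afterPush) {
      // Assumption: the flags contradict each other, so fail early instead of picking one.
      throw new IllegalArgumentException(CHECK_ONLY + " and " + CHECK_AFTER_PUSH + " cannot both be true");
    }
    if (checkOnly) {
      return Mode.CHECK_ONLY; // skip the push, run the checker
    }
    if (afterPush) {
      return Mode.PUSH_THEN_CHECK; // push first, then check the version just pushed
    }
    return Mode.PUSH_ONLY; // default behavior, unchanged
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(CHECK_AFTER_PUSH, "true");
    System.out.println(resolve(props));
  }
}
```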

I intentionally kept this separate for two reasons:

Running this right after VPJ doesn't really make sense, because the validation is only meaningful once there are nearline writes. This is more relevant for AA stores.

I want it to be lightweight so we can trigger it and validate quickly, instead of waiting for the VPJ to finish.

sushantmane reviewed May 20, 2026


sushantmane added the requested-changes Reviewer requested changes label May 20, 2026

[vpj] Support running VTConsistencyCheckerJob via VenicePushJob

8f9e7be

Mohith22 force-pushed the mdamarap/add-vt-consistency-checker-venice-operator branch from 7b6450b to 8f9e7be on May 20, 2026 18:06

Mohith22 wants to merge 1 commit into linkedin:main from Mohith22:mdamarap/add-vt-consistency-checker-venice-operator
