feat(spanner-to-sourcedb): add Spanner-to-Spanner reverse replication support #3773

Open

srozsnyai wants to merge 6 commits into GoogleCloudPlatform:main from srozsnyai:spanner-to-spanner

Conversation

@srozsnyai

Extends the spanner-to-sourcedb reverse replication template to support Cloud Spanner as a target database.

Change stream events from a source Spanner instance are converted to mutations and written to a target Spanner database, coordinated through the existing shadow-table mechanism that prevents duplicate and out-of-order writes.

Because calling any Spanner write API inside an active readWriteTransaction causes a nested-transaction error, the target write is deferred: the mutation is generated inside the shadow-table transaction (where ordering and filtering decisions are made), then committed via writeAtLeastOnce() after the shadow-table transaction completes.
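The deferred-write flow described above can be sketched with a plain-Java model. Everything here is a stand-in invented for illustration (the map for the shadow table, the method for the transaction body, the string for the mutation); only the control flow mirrors the PR's approach of building the write inside the shadow-table transaction and committing it afterwards.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Plain-Java model of the deferred-write flow: the ordering decision and
// shadow-table update happen "inside" the transaction, while the target
// write is only captured in a holder and committed after the transaction
// returns, so no Spanner write API runs inside the read-write transaction.
public class DeferredWriteSketch {
  // key -> last processed commit timestamp (stand-in for the shadow table)
  static final Map<String, Long> shadowTable = new HashMap<>();

  // Stand-in for the body passed to the read-write transaction runner.
  static AtomicReference<String> runShadowTxn(String key, long ts, String mutation) {
    AtomicReference<String> pending = new AtomicReference<>();
    Long seen = shadowTable.get(key);
    if (seen == null || seen < ts) { // ordering / duplicate-filtering decision
      shadowTable.put(key, ts);      // shadow-table update, committed with the txn
      pending.set(mutation);         // defer the target write; no write call here
    }
    return pending;                  // the real transaction commits at this point
  }

  public static void main(String[] args) {
    AtomicReference<String> pending = runShadowTxn("row1", 10L, "UPDATE row1");
    if (pending.get() != null) {
      // In the real template this step is writeAtLeastOnce(mutation),
      // executed only after the shadow-table transaction has completed.
      System.out.println("commit: " + pending.get());
    }
    // A stale event (older timestamp) is filtered and nothing is deferred.
    System.out.println(runShadowTxn("row1", 5L, "stale").get()); // prints null
  }
}
```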

@srozsnyai srozsnyai requested a review from a team as a code owner May 8, 2026 11:57
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request enables reverse replication from Spanner to Spanner by extending the existing template infrastructure. It introduces a new set of DAO and DML generation classes specifically for Spanner, ensuring that mutation data is correctly formatted and committed. A key aspect of this change is the modification of the write flow to defer target database operations until after the shadow-table transaction has committed, preventing illegal nested transaction errors.

Highlights

  • Feature Addition: Added support for Cloud Spanner as a target database in the spanner-to-sourcedb reverse replication template.
  • Transaction Handling: Implemented a deferred write mechanism in SourceWriterFn to avoid nested transaction errors when writing to Spanner.
  • Infrastructure: Introduced Spanner-specific components including SpannerDMLGenerator, SpannerTargetDao, and SpannerConnectionHelper.


@gemini-code-assist (Bot) left a comment


Code Review

This pull request implements support for Cloud Spanner as a target database in the Spanner-to-SourceDb template. It introduces new components for connection management, DML generation, and data access specifically for Spanner targets. Feedback identifies critical issues including a potential data loss scenario in SourceWriterFn due to non-atomic writes, type mismatch errors in SpannerDMLGenerator when handling custom transformations or null array values, and an initialization bug in SpannerConnectionHelper that affects multi-project deployments.

if (request.getCustomTransformationResponse() != null
    && request.getCustomTransformationResponse().containsKey(targetColName)) {
  Object customVal = request.getCustomTransformationResponse().get(targetColName);
  builder.set(targetColName).to(customVal == null ? (String) null : customVal.toString());

high

This line assumes that all custom transformation results should be treated as strings. If the target Spanner column is of a different type (e.g., INT64, BOOL, TIMESTAMP), calling .to(String) on the Mutation.WriteBuilder will cause a runtime IllegalArgumentException or a type mismatch error during the commit. The value should be set using a type-aware method similar to the logic in setColumnValue, based on the target column's type.

Author


Fixed in 7a5274c. Custom-transformation values are now routed through setCustomColumnValue / appendCustomKeyComponent, which dispatch on Column.type().getCode() and call the matching typed .to(...) / Key.Builder.append(...) overload. null custom values emit a properly typed NULL via setNullValue. Added tests.
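The dispatch described in the reply can be modeled in plain Java for illustration. The enum below mirrors only a subset of Spanner's Type.Code, and the returned Object stands in for the typed .to(...) overload on the mutation builder; none of these names are taken from the PR itself.

```java
// Illustrative model of dispatching a custom-transformation value on the
// target column's type code instead of always stringifying it.
public class CustomValueDispatch {
  enum Code { INT64, FLOAT64, BOOL, STRING }

  // A null input models a typed NULL (the real fix emits a typed NULL Value).
  static Object toTyped(Code code, Object customVal) {
    if (customVal == null) return null;
    String s = customVal.toString();
    switch (code) {
      case INT64:   return Long.valueOf(s);   // would throw on non-numeric input
      case FLOAT64: return Double.valueOf(s);
      case BOOL:    return Boolean.valueOf(s);
      default:      return s;                 // STRING and anything unmodeled
    }
  }

  public static void main(String[] args) {
    System.out.println(toTyped(Code.INT64, "42"));  // a Long, not the string "42"
    System.out.println(toTyped(Code.BOOL, "true")); // a Boolean
  }
}
```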

Comment on lines +359 to +362
if (pendingSpannerWrite.get() != null) {
  IDao sourceDao = sourceProcessor.getSourceDao(shardId);
  sourceDao.write(pendingSpannerWrite.get(), null);
}

high

Deferring the Spanner target write until after the shadow-table transaction commits introduces a critical consistency risk. If the shadow-table transaction succeeds but the subsequent sourceDao.write call fails (e.g., due to a transient network issue or target database constraint), the record will be marked as processed in the shadow table. Upon retry from the Dead Letter Queue (DLQ), the isSourceAhead check (line 270) will see that the record's timestamp matches the shadow table and skip the write, resulting in permanent data loss on the target. To ensure atomicity, the target write should be performed within the same transaction as the shadow table update. If the target is the same database, use shadowTransaction.buffer(mutation). If it is a different database, calling writeAtLeastOnce inside the readWriteTransaction block will ensure that a failure in the target write rolls back the shadow table update, allowing for a safe retry.

Author


shadowTransaction.buffer(mutation) only works when the mutation targets the same database as the transaction. The shadow table currently lives in a separate metadata database, so buffer() cannot span them.

Calling writeAtLeastOnce inside a readWriteTransaction throws "INTERNAL: Nested transactions are not supported" for every record, regardless of which DatabaseClient instance is used. The same error reproduces with the Spanner JDBC driver, so JDBC is not an escape hatch. MySQL/Postgres can use this pattern only because their JDBC writes target a non-Spanner engine.

The residual risk is narrow in practice: the client retries ABORTED / UNAVAILABLE / DEADLINE_EXCEEDED internally, errors that escape are typically permanent and operator-visible, and Spanner mutations are idempotent.

IMHO the only real fix is colocating the reverse-replication metadata in the target Spanner database so tx.buffer() becomes available. That is an architectural change and will be addressed as a follow-up once there is team consensus, not as a side effect of this PR.
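The loss scenario the two sides are debating can be made concrete with a toy model. Plain Java maps stand in for the shadow table and the target database, and all names are invented for illustration; the boolean flag simulates a target-write failure after the shadow-table commit.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the failure mode: once the shadow table records a
// timestamp, a retry with the same timestamp is skipped as "already
// processed", so a target write that failed after the shadow commit is
// never reattempted.
public class RetrySkipModel {
  static final Map<String, Long> shadow = new HashMap<>();   // key -> commit ts
  static final Map<String, String> target = new HashMap<>(); // target database

  // Returns true if the record was applied to the target.
  static boolean process(String key, long ts, String value, boolean targetWriteFails) {
    Long seen = shadow.get(key);
    if (seen != null && seen >= ts) return false; // retry path: skipped as processed
    shadow.put(key, ts);                          // shadow-table txn commits first
    if (targetWriteFails) return false;           // deferred target write fails after
    target.put(key, value);
    return true;
  }

  public static void main(String[] args) {
    // First attempt: shadow commits, then the target write fails.
    process("row1", 10L, "v1", true);
    // The DLQ retry carries the same timestamp and is skipped, so the
    // target never receives the row.
    boolean applied = process("row1", 10L, "v1", false);
    System.out.println(applied + " " + target.containsKey("row1")); // false false
  }
}
```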

Comment on lines +40 to +41
private static Map<String, DatabaseClient> clientMap = new ConcurrentHashMap<>();
private static Spanner spannerService;

medium

The use of static variables for clientMap and spannerService combined with the early return in init (lines 45-48) can lead to issues if shards belong to different Google Cloud projects. If init is called first for a shard in Project A, the spannerService is initialized for Project A. A subsequent call for a shard in Project B will return early because clientMap is not empty, and getConnection will later attempt to use the Project A service to access Project B, which may fail due to permission or configuration mismatches. Consider mapping spannerService by project ID or removing the static singleton pattern in favor of a more robust initialization that ensures all requested shards are processed.
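One shape the suggested per-project keying could take is sketched below. This is the reviewer's proposal, not the PR's code; SpannerService is a placeholder for the real Spanner handle, and the names are illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: cache one service per project ID so a shard in Project B never
// reuses a service initialized for Project A.
public class PerProjectClients {
  // Placeholder for the real Spanner client handle.
  static final class SpannerService {
    final String projectId;
    SpannerService(String projectId) { this.projectId = projectId; }
  }

  private static final Map<String, SpannerService> services = new ConcurrentHashMap<>();

  static SpannerService serviceFor(String projectId) {
    // computeIfAbsent is atomic per key: concurrent init calls for the same
    // project share one service, while distinct projects each get their own.
    return services.computeIfAbsent(projectId, SpannerService::new);
  }
}
```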

Author


Same idempotent-init pattern as JdbcConnectionHelper and CassandraConnectionHelper — all shards are processed in a single init() call, so the early return doesn't skip later shards.

  builder.set(targetColName).to(Value.json(null));
  break;
case ARRAY:
  builder.set(targetColName).to(Value.stringArray(null));

medium

When setting a null value for an array column, Value.stringArray(null) is always used. This will cause a type mismatch error if the target column is an array of a different type (e.g., ARRAY<INT64>). The null value should be created using the appropriate element type, such as Value.int64Array((Iterable<Long>) null).

Author


Fixed in a00dd66b5. setNullValue now dispatches on type.getArrayElementType().getCode() and emits the matching Value.*Array((Iterable) null) so NULL ARRAY values carry the correct element type. Added tests.
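For illustration, that dispatch can be sketched as follows. The enum models the array's element-type code, and each branch merely names the typed-null Value factory the real code would call; this is not the PR's actual setNullValue.

```java
// Model of choosing a typed NULL array from the element type instead of
// unconditionally using Value.stringArray(null). Each branch names the
// com.google.cloud.spanner.Value factory the fix would invoke.
public class NullArrayDispatch {
  enum ElementCode { INT64, FLOAT64, BOOL, STRING }

  static String nullArrayFactory(ElementCode elementCode) {
    switch (elementCode) {
      case INT64:   return "Value.int64Array((Iterable<Long>) null)";
      case FLOAT64: return "Value.float64Array((Iterable<Double>) null)";
      case BOOL:    return "Value.boolArray((Iterable<Boolean>) null)";
      default:      return "Value.stringArray((Iterable<String>) null)";
    }
  }
}
```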

@codecov

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 46.65493% with 303 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.41%. Comparing base (f8472fe) to head (b793209).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
.../v2/templates/dbutils/dml/SpannerDMLGenerator.java 51.46% 120 Missing and 29 partials ⚠️
...ner/sourceddl/SpannerInformationSchemaScanner.java 0.00% 72 Missing ⚠️
...anner/migrations/utils/SpannerShardFileReader.java 0.00% 27 Missing ⚠️
...es/dbutils/connection/SpannerConnectionHelper.java 37.14% 22 Missing ⚠️
...leport/v2/templates/transforms/SourceWriterFn.java 34.61% 15 Missing and 2 partials ⚠️
...cloud/teleport/v2/templates/SpannerToSourceDb.java 0.00% 11 Missing ⚠️
...plates/dbutils/processor/InputRecordProcessor.java 92.68% 1 Missing and 2 partials ⚠️
...templates/dbutils/dao/source/SpannerTargetDao.java 88.23% 1 Missing and 1 partial ⚠️

❌ Your patch check has failed because the patch coverage (46.65%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3773      +/-   ##
============================================
- Coverage     53.41%   53.41%   -0.01%     
+ Complexity     6629     6329     -300     
============================================
  Files          1082     1091       +9     
  Lines         65795    66964    +1169     
  Branches       7328     7483     +155     
============================================
+ Hits          35147    35767     +620     
- Misses        28288    28772     +484     
- Partials       2360     2425      +65     
Components Coverage Δ
spanner-templates 72.25% <46.65%> (-0.56%) ⬇️
spanner-import-export 68.64% <ø> (+0.02%) ⬆️
spanner-live-forward-migration 79.74% <17.50%> (-1.19%) ⬇️
spanner-live-reverse-replication 74.87% <46.65%> (-2.18%) ⬇️
spanner-bulk-migration 90.30% <17.50%> (-0.80%) ⬇️
gcs-spanner-dv 84.25% <17.50%> (-1.50%) ⬇️
Files with missing lines Coverage Δ
...ort/v2/spanner/migrations/constants/Constants.java 0.00% <ø> (ø)
...port/v2/spanner/migrations/shard/SpannerShard.java 100.00% <100.00%> (ø)
...eport/v2/spanner/sourceddl/SourceDatabaseType.java 100.00% <100.00%> (ø)
...ates/dbutils/processor/SourceProcessorFactory.java 84.14% <100.00%> (+2.45%) ⬆️
...templates/dbutils/dao/source/SpannerTargetDao.java 88.23% <88.23%> (ø)
...plates/dbutils/processor/InputRecordProcessor.java 86.53% <92.68%> (+1.43%) ⬆️
...cloud/teleport/v2/templates/SpannerToSourceDb.java 0.00% <0.00%> (ø)
...leport/v2/templates/transforms/SourceWriterFn.java 71.42% <34.61%> (-5.35%) ⬇️
...es/dbutils/connection/SpannerConnectionHelper.java 37.14% <37.14%> (ø)
...anner/migrations/utils/SpannerShardFileReader.java 0.00% <0.00%> (ø)
... and 2 more

... and 13 files with indirect coverage changes

