Skip to content

Conversation

@adiom-mark
Copy link
Collaborator

@adiom-mark adiom-mark commented Jan 10, 2026

Summary by CodeRabbit

  • Refactor

    • Streamlined update-only sync with gated flow, early exit for non-initial syncs, improved state emission, dynamic updates partitioning and randomized partition order.
  • New Features

    • Runtime options and CLI flags to set connector ID, plan parallelism, and docs-per-segment; connector info now reports the configured ID.
    • Localstack endpoint resolution via environment fallback.
  • Bug Fixes

    • Graceful handling of missing table/details, consistent NULL handling, fixed-width binary lengths, and throttling detection returning resource-exhausted responses.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 10, 2026

📝 Walkthrough

Walkthrough

GeneratePlan now tolerates missing/incompatible DynamoDB streams (creates/restarts them on localstack), encodes/emits UpdatesPartitions when updates are requested, short-circuits when InitialSync is false, exposes Options.ID via GetInfo and new configurators wired to CLI flags, improves NULL conversions, and respects AWS endpoint overrides.

Changes

Cohort / File(s) Summary
DynamoDB connector core
connectors/dynamodb/conn.go
GeneratePlan: treat ErrNotFound as a warning, create/restart streams on localstack (or return errors when non-local), fetch stream state, encode and emit UpdatesPartition(s), and short-circuit when InitialSync is false. Added throttling detection and ResourceExhausted handling; AWS endpoint resolution updated to respect AWS_ENDPOINT_URL.
Connector options & public API
connectors/dynamodb/conn.go
Added exported ID string to Options; added WithID, WithPlanParallelism, WithDocsPerSegment configurators; GetInfo now returns Id populated from options.ID. New configurators wired into NewConn.
CLI flags & connector registration
internal/app/options/connectorflags.go
Added CLI flags doc-partition (int), plan-parallelism (int), id (string); action signature now accepts *cli.Context and passes flag values into dynamodb.NewConn via the new configurators for both localstack and non-local paths.
Conversion & wire encoding
connectors/dynamodb/conv.go
fromBson/toBson now handle NULL attr values; stream NULL mapping added; error wrapping added for key/id and to-bson conversions; wire encoding writes B/N/S lengths as int32.
Dependencies & imports
go.mod, connectors/dynamodb/conn.go
AWS SDK/service versions bumped in go.mod; added os import and endpoint override logic in connector.

Sequence Diagram(s)

sequenceDiagram
  participant Requestor
  participant Conn as DynamoDB Connector
  participant Dynamo as DynamoDB/TableManager
  participant StateCh as Updates Channel

  Requestor->>Conn: GeneratePlan(r.Msg)
  alt InitialSync disabled
    Conn-->>Requestor: return (early exit)
  else InitialSync enabled
    alt Updates requested
      Conn->>Dynamo: GetTableDetails
      Dynamo-->>Conn: TableDetails or ErrNotFound
      alt No stream or IncompatibleStream
        Conn->>Dynamo: Start/Restart Stream (localstack) / error (non-local)
        Dynamo-->>Conn: StreamARN/status
      end
      Conn->>Dynamo: GetStreamState
      Dynamo-->>Conn: StreamState
      Conn->>StateCh: Encode stateMap → UpdatesPartition(s)
    else Updates not requested
      Conn-->>Requestor: skip state emission
    end
    Conn-->>Requestor: return plan (may include UpdatesPartitions)
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐇 I nudged a stream with whiskers bright,

Flags tucked in pockets, ready for flight.
IDs and partitions neatly spun,
Localstack wakes and joins the fun.
Updates hop onward—job well done!

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The PR title 'Allow dynamodb to run initial sync only' is vague and doesn't clearly reflect the main changes in the changeset, which include stream handling, throttling detection, CLI flag configuration, and AWS endpoint resolution. Consider a more specific title that reflects the primary changes, such as 'DynamoDB: Add stream handling, throttling detection, and configurable options' or 'Add DynamoDB CLI flags and improve stream/throttling handling'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
connectors/dynamodb/conn.go (1)

70-105: Fix loop-variable capture + use the post-start StreamARN when fetching state (localstack path).

Right now:

  • Line 71 captures name from the loop (can make all goroutines use the last table name).
  • Lines 79-85 can start a stream but Lines 100-105 still call GetStreamState(..., tableDetails.StreamARN) which is still "" unless you refresh it.
Proposed fix
 for _, name := range tableNames {
+    name := name // avoid loop-variable capture
     eg.Go(func() error {
         tableDetails, err := c.client.TableDetails(egCtx, name)
         if err != nil {
             return err
         }

         if r.Msg.GetUpdates() {
-            if tableDetails.StreamARN == "" {
+            streamARN := tableDetails.StreamARN
+            if streamARN == "" {
                 if c.spec == "localstack" {
                     slog.Debug("No stream found, starting stream", "table", name)
                     _, err := c.client.StartStream(egCtx, name, false)
                     if err != nil {
                         return err
                     }
+                    // Refresh StreamARN after enabling streams.
+                    tableDetails, err = c.client.TableDetails(egCtx, name)
+                    if err != nil {
+                        return err
+                    }
+                    streamARN = tableDetails.StreamARN
+                    if streamARN == "" {
+                        return fmt.Errorf("no stream found after starting stream for table %s", name)
+                    }
                 } else {
-                    return fmt.Errorf("no stream found")
+                    return fmt.Errorf("no stream found for table %s", name)
                 }
             } else if tableDetails.IncompatibleStream {
                 if c.spec == "localstack" {
                     slog.Debug("Incompatible stream found, restarting stream", "table", name)
                     _, err := c.client.StartStream(egCtx, name, true)
                     if err != nil {
                         return err
                     }
+                    tableDetails, err = c.client.TableDetails(egCtx, name)
+                    if err != nil {
+                        return err
+                    }
+                    streamARN = tableDetails.StreamARN
                 } else {
-                    return fmt.Errorf("incompatible stream found")
+                    return fmt.Errorf("incompatible stream found for table %s", name)
                 }
             }

-            state, err := c.client.GetStreamState(ctx, tableDetails.StreamARN)
+            state, err := c.client.GetStreamState(egCtx, streamARN)
             if err != nil {
                 return err
             }
             statesCh <- state
         }
🧹 Nitpick comments (1)
connectors/dynamodb/conn.go (1)

107-109: Optional: short-circuit GeneratePlan earlier when neither InitialSync nor Updates are requested.

This early return works, but it still does TableDetails(...) per table before returning when Updates == false and InitialSync == false.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4858a8e and 74afcb6.

📒 Files selected for processing (1)
  • connectors/dynamodb/conn.go
🧰 Additional context used
🧬 Code graph analysis (1)
connectors/dynamodb/conn.go (1)
gen/adiom/v1/messages.pb.go (6)
  • UpdatesPartition (188-198)
  • UpdatesPartition (211-211)
  • UpdatesPartition (226-228)
  • GeneratePlanResponse (649-655)
  • GeneratePlanResponse (668-668)
  • GeneratePlanResponse (683-685)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
connectors/dynamodb/conn.go (1)

148-162: Confirm UpdatesPartition.Namespaces semantics when the request omits namespaces.

When r.Msg.GetNamespaces() is empty, you currently emit an updates partition with Namespaces: namespaces (empty). That’s fine if downstream interprets empty as “all namespaces” (similar to StreamUpdates behavior), but it’s worth confirming this contract since the plan response is what drives the next call.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/app/options/connectorflags.go (1)

358-371: CLI options not applied for regular DynamoDB connections.

The doc-partition and plan-parallelism flags are only passed to NewConn in the localstack branch. For regular AWS DynamoDB (the else branch at line 369), these user-specified options are silently ignored.

🐛 Proposed fix to apply options consistently
 		}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) {
+			opts := []func(*dynamodb.Options){
+				dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
+				dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
+			}
 			if strings.EqualFold(args[0], "dynamodb://localstack") {
 				_, connString, ok := strings.Cut(args[0], "://")
 				if !ok {
 					return nil, fmt.Errorf("invalid connection string %v", args[0])
 				}
-				return dynamodb.NewConn(connString,
-					dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
-					dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
-				), nil
+				return dynamodb.NewConn(connString, opts...), nil
 			} else {
-				return dynamodb.NewConn(""), nil
+				return dynamodb.NewConn("", opts...), nil
 			}
 		}),
🤖 Fix all issues with AI agents
In @connectors/dynamodb/conn.go:
- Around line 77-105: When starting or restarting a stream in the goroutine,
capture and update tableDetails.StreamARN with the ARN returned by
c.client.StartStream(...) instead of discarding it so subsequent calls use the
new ARN; then call c.client.GetStreamState using egCtx (not ctx) to respect the
errgroup cancellation; specifically, in the branches that call
c.client.StartStream(egCtx, name, ...) assign the returned ARN to
tableDetails.StreamARN (or a local var that you then pass to GetStreamState) and
replace the GetStreamState(ctx, tableDetails.StreamARN) call with
GetStreamState(egCtx, tableDetails.StreamARN) so statesCh receives the correct
stream state.
🧹 Nitpick comments (1)
connectors/dynamodb/conn.go (1)

115-118: Redundant check for GetInitialSync().

Since the code returns early on line 108 when !r.Msg.GetInitialSync(), this check is always true when reached. Consider simplifying.

Suggested simplification
-			if r.Msg.GetInitialSync() && c.options.DocsPerSegment > 0 {
+			if c.options.DocsPerSegment > 0 {
 				totalSegments = int(tableDetails.Count / uint64(c.options.DocsPerSegment))
 				totalSegments = max(1, min(1000000, totalSegments))
 			}
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74afcb6 and fc19035.

📒 Files selected for processing (2)
  • connectors/dynamodb/conn.go
  • internal/app/options/connectorflags.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (3)
  • NewConn (404-426)
  • WithDocsPerSegment (398-402)
  • WithPlanParallelism (392-396)
connectors/dynamodb/conn.go (2)
gen/adiom/v1/messages.pb.go (6)
  • UpdatesPartition (188-198)
  • UpdatesPartition (211-211)
  • UpdatesPartition (226-228)
  • GeneratePlanResponse (649-655)
  • GeneratePlanResponse (668-668)
  • GeneratePlanResponse (683-685)
java/src/main/java/adiom/v1/Messages.java (2)
  • UpdatesPartition (1041-1957)
  • GeneratePlanResponse (8653-9710)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (4)
internal/app/options/connectorflags.go (1)

347-357: LGTM on CLI flag definitions.

The new doc-partition and plan-parallelism flags are properly defined with sensible defaults that match the connector's internal defaults.

connectors/dynamodb/conn.go (3)

107-109: LGTM!

The early return when InitialSync is disabled correctly allows the connector to run in updates-only mode.


148-157: LGTM!

The conditional building of the UpdatesPartition slice correctly handles the updates-only flow, encoding the stream state map into the cursor.


392-402: LGTM!

The option configurators follow idiomatic Go functional options pattern and integrate cleanly with NewConn.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/app/options/connectorflags.go (1)

358-371: CLI flags are ignored for non-localstack DynamoDB connections.

When using dynamodb (not dynamodb://localstack), line 369 calls dynamodb.NewConn("") without passing the WithDocsPerSegment and WithPlanParallelism options. Users would expect --doc-partition and --plan-parallelism flags to work regardless of the connection string.

Proposed fix
 			}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) {
+				opts := []func(*dynamodb.Options){
+					dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
+					dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
+				}
 				if strings.EqualFold(args[0], "dynamodb://localstack") {
 					_, connString, ok := strings.Cut(args[0], "://")
 					if !ok {
 						return nil, fmt.Errorf("invalid connection string %v", args[0])
 					}
-					return dynamodb.NewConn(connString,
-						dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
-						dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
-					), nil
+					return dynamodb.NewConn(connString, opts...), nil
 				} else {
-					return dynamodb.NewConn(""), nil
+					return dynamodb.NewConn("", opts...), nil
 				}
 			}),
🤖 Fix all issues with AI agents
In @connectors/dynamodb/conn.go:
- Around line 100-104: The call to c.client.GetStreamState uses the parent ctx
variable instead of the errgroup context egCtx, so if another goroutine errors
out GetStreamState won't be cancelled; update the invocation to use egCtx (i.e.,
replace ctx with egCtx in the GetStreamState call) ensuring the call runs with
the errgroup's context and preserves existing error handling that returns err
and sends state to statesCh.
- Around line 148-157: The UpdatesPartition being appended uses the wrong
variable for Namespaces; when auto-discovery populates tableNames (via
GetAllTableNames()) namespaces can be empty. In the block that builds updates
(the append to updates in the r.Msg.GetUpdates() branch), change the Namespaces
field from using namespaces to using tableNames so UpdatesPartition.Namespaces
contains the actual resolved table names (i.e., set Namespaces to tableNames
when creating the adiomv1.UpdatesPartition with the encoded stateMap cursor).
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fc19035 and f754f61.

📒 Files selected for processing (3)
  • connectors/dynamodb/conn.go
  • connectors/dynamodb/conv.go
  • internal/app/options/connectorflags.go
🧰 Additional context used
🧬 Code graph analysis (2)
connectors/dynamodb/conn.go (1)
gen/adiom/v1/messages.pb.go (6)
  • UpdatesPartition (188-198)
  • UpdatesPartition (211-211)
  • UpdatesPartition (226-228)
  • GeneratePlanResponse (649-655)
  • GeneratePlanResponse (668-668)
  • GeneratePlanResponse (683-685)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (3)
  • NewConn (404-426)
  • WithDocsPerSegment (398-402)
  • WithPlanParallelism (392-396)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (4)
connectors/dynamodb/conv.go (2)

204-211: LGTM!

Good improvement wrapping errors with contextual messages. This will help with debugging when conversion failures occur in itemsToBson.


270-297: LGTM!

The int32 cast is correct for binary.Write which requires fixed-size types. This ensures consistent 4-byte length prefixes across platforms. Given DynamoDB's 400KB attribute value limit, overflow is not a concern.

connectors/dynamodb/conn.go (2)

392-402: LGTM!

Clean functional options pattern implementation. The configurators correctly modify the Options struct fields.


107-109: LGTM!

The early return correctly allows "updates only" mode by skipping partition generation when GetInitialSync() is false.

Comment on lines 148 to 166
var updates []*adiomv1.UpdatesPartition
if r.Msg.GetUpdates() {
var buf bytes.Buffer
enc := gob.NewEncoder(&buf)
err = enc.Encode(stateMap)
if err != nil {
return nil, connect.NewError(connect.CodeInternal, err)
}
updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: namespaces, Cursor: buf.Bytes()})
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the full context around lines 148-157 and the earlier part (lines 34-35)
cd connectors/dynamodb && cat -n conn.go | sed -n '30,165p'

Repository: adiom-data/dsync

Length of output: 4683


🏁 Script executed:

# Search for UpdatesPartition usage to see how Namespaces is consumed
rg -n -B2 -A5 'UpdatesPartition' --type=go -g '!*_test.go'

Repository: adiom-data/dsync

Length of output: 20733


🏁 Script executed:

# Look for the proto definition of UpdatesPartition to understand the Namespaces field
find . -name '*.proto' -type f | head -20

Repository: adiom-data/dsync

Length of output: 150


🏁 Script executed:

# Verify the fix: check what tableNames contains and how it should be used
cat -n connectors/dynamodb/conn.go | sed -n '148,157p'

Repository: adiom-data/dsync

Length of output: 461


🏁 Script executed:

# Check the proto definition for UpdatesPartition to understand the expected semantics
cat proto/adiom/v1/messages.proto | grep -A 10 "message UpdatesPartition"

Repository: adiom-data/dsync

Length of output: 397


Use tableNames instead of namespaces for UpdatesPartition.Namespaces at line 156.

When r.Msg.GetNamespaces() is empty (lines 34-40), tableNames is populated via GetAllTableNames(), but the namespaces variable remains empty. Line 156 incorrectly uses the empty namespaces variable instead of the resolved tableNames, causing UpdatesPartition.Namespaces to be empty in auto-discovery scenarios. This is inconsistent with regular Partition objects (line 112) which correctly use actual table names, and breaks downstream code in pkg/test/connector.go and internal/app/verify.go that expects Namespaces to contain the tables being streamed.

Change line 156 to: updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: tableNames, Cursor: buf.Bytes()})

🤖 Prompt for AI Agents
In @connectors/dynamodb/conn.go around lines 148 - 157, The UpdatesPartition
being appended uses the wrong variable for Namespaces; when auto-discovery
populates tableNames (via GetAllTableNames()) namespaces can be empty. In the
block that builds updates (the append to updates in the r.Msg.GetUpdates()
branch), change the Namespaces field from using namespaces to using tableNames
so UpdatesPartition.Namespaces contains the actual resolved table names (i.e.,
set Namespaces to tableNames when creating the adiomv1.UpdatesPartition with the
encoded stateMap cursor).

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
connectors/dynamodb/conn.go (2)

111-113: Inconsistent condition: GetInitialSync() is checked twice.

Line 111 checks !r.Msg.GetInitialSync() to return early, but line 119 also checks r.Msg.GetInitialSync() as a condition. After line 111, the code can only continue if GetInitialSync() is true, making the condition on line 119 always true and redundant.

♻️ Suggested simplification
 			if !r.Msg.GetInitialSync() {
 				return nil
 			}

 			// TODO: reconsider how to map namespaces properly
 			ns := name
 			totalSegments := 1

-			if r.Msg.GetInitialSync() && c.options.DocsPerSegment > 0 {
+			if c.options.DocsPerSegment > 0 {
 				totalSegments = int(tableDetails.Count / uint64(c.options.DocsPerSegment))
 				totalSegments = max(1, min(1000000, totalSegments))
 			}

144-150: Dead code path: ErrNotFound is already handled in the loop.

Since ErrNotFound is handled at line 74-77 by logging and returning nil, the errors.Is(err, ErrNotFound) check at line 146 will never be true. The error group will not receive an ErrNotFound from the table processing loop.

This check could be removed or kept for defensive purposes if ErrNotFound handling may change in the future.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f754f61 and 95b9f9c.

📒 Files selected for processing (3)
  • connectors/dynamodb/conn.go
  • connectors/dynamodb/conv.go
  • internal/app/options/connectorflags.go
🧰 Additional context used
🧬 Code graph analysis (2)
connectors/dynamodb/conn.go (2)
connectors/dynamodb/client.go (1)
  • ErrNotFound (95-95)
gen/adiom/v1/messages.pb.go (6)
  • UpdatesPartition (188-198)
  • UpdatesPartition (211-211)
  • UpdatesPartition (226-228)
  • GeneratePlanResponse (649-655)
  • GeneratePlanResponse (668-668)
  • GeneratePlanResponse (683-685)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (3)
  • NewConn (408-430)
  • WithDocsPerSegment (402-406)
  • WithPlanParallelism (396-400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (6)
connectors/dynamodb/conv.go (2)

206-210: LGTM! Good error context addition.

Wrapping errors with context ("err in key to bson" and "err in to bson") improves debuggability when BSON encoding fails during item conversion.


273-292: LGTM! Explicit int32 ensures consistent cross-platform encoding.

Switching to explicit int32 for length encoding guarantees 4-byte lengths regardless of platform architecture, which is important for binary protocol consistency with state encoding used in conn.go.

connectors/dynamodb/conn.go (3)

74-78: LGTM! Graceful handling of missing tables.

When a table is not found during plan generation, logging a warning and continuing allows partial plans to be generated, which is a reasonable approach for dynamic environments.


152-161: LGTM! Updates partition construction is correct.

The state map is properly gob-encoded and used to construct an UpdatesPartition only when updates are requested.


396-406: LGTM! Option configurators follow idiomatic functional options pattern.

The WithPlanParallelism and WithDocsPerSegment configurators are well-structured and align with Go's functional options pattern.

internal/app/options/connectorflags.go (1)

347-371: CLI flags are only applied for localstack, not for standard DynamoDB.

The doc-partition and plan-parallelism flags are defined but only passed to dynamodb.NewConn when using dynamodb://localstack. For the standard dynamodb case (line 369), the flags are ignored and the connector is created with default options only.

If this is intentional (e.g., production DynamoDB should always use defaults), this is fine. Otherwise, consider applying the flags consistently.

♻️ Suggested fix if flags should apply to both cases
 			}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) {
+				opts := []func(*dynamodb.Options){
+					dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
+					dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
+				}
 				if strings.EqualFold(args[0], "dynamodb://localstack") {
 					_, connString, ok := strings.Cut(args[0], "://")
 					if !ok {
 						return nil, fmt.Errorf("invalid connection string %v", args[0])
 					}
-					return dynamodb.NewConn(connString,
-						dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
-						dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
-					), nil
+					return dynamodb.NewConn(connString, opts...), nil
 				} else {
-					return dynamodb.NewConn(""), nil
+					return dynamodb.NewConn("", opts...), nil
 				}
 			}),

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @connectors/dynamodb/conn.go:
- Around line 370-377: The localstack endpoint logic in AWSClientHelper is
shadowing the outer variable: change the inner declaration so you assign to the
existing outer variable (use = instead of :=) when setting endpoint based on
connStr == "localstack", ensure AWSClientHelper uses that endpoint variable
later when creating the dynamodb.Client and dynamodbstreams.Client so the
localstack URL is actually passed through.
🧹 Nitpick comments (1)
internal/app/options/connectorflags.go (1)

358-371: CLI flags are ignored for non-localstack DynamoDB connections.

The doc-partition and plan-parallelism flags are only applied when using dynamodb://localstack. Users specifying dynamodb will have their flag values silently ignored (line 369 passes no options).

Consider applying the flags consistently:

♻️ Proposed fix
 		}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) {
+			opts := []func(*dynamodb.Options){
+				dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
+				dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
+			}
 			if strings.EqualFold(args[0], "dynamodb://localstack") {
 				_, connString, ok := strings.Cut(args[0], "://")
 				if !ok {
 					return nil, fmt.Errorf("invalid connection string %v", args[0])
 				}
-				return dynamodb.NewConn(connString,
-					dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
-					dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
-				), nil
+				return dynamodb.NewConn(connString, opts...), nil
 			} else {
-				return dynamodb.NewConn(""), nil
+				return dynamodb.NewConn("", opts...), nil
 			}
 		}),
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95b9f9c and 3ed6f01.

📒 Files selected for processing (3)
  • connectors/dynamodb/conn.go
  • connectors/dynamodb/conv.go
  • internal/app/options/connectorflags.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • connectors/dynamodb/conv.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (3)
  • NewConn (412-434)
  • WithDocsPerSegment (406-410)
  • WithPlanParallelism (400-404)
connectors/dynamodb/conn.go (2)
connectors/dynamodb/client.go (1)
  • ErrNotFound (95-95)
gen/adiom/v1/messages.pb.go (6)
  • UpdatesPartition (188-198)
  • UpdatesPartition (211-211)
  • UpdatesPartition (226-228)
  • GeneratePlanResponse (649-655)
  • GeneratePlanResponse (668-668)
  • GeneratePlanResponse (683-685)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (4)
connectors/dynamodb/conn.go (4)

75-80: LGTM! Good defensive handling.

Gracefully skipping tables that are not found instead of failing the entire plan is a reasonable approach for robustness.


82-114: Logic flow concern: early exit placement.

The early exit at lines 112-114 (if !r.Msg.GetInitialSync()) is placed after the updates handling block (lines 82-110). This means if GetUpdates() is true but GetInitialSync() is false, the code will still perform stream setup and state retrieval before returning early without generating partitions.

Verify this is the intended behavior—when only streaming updates without initial sync, you still want to prepare the stream state but skip partition generation.


400-410: LGTM! Clean functional options pattern.

The option configurators follow Go's idiomatic functional options pattern and integrate well with NewConn.


153-162: LGTM!

The dynamic building of UpdatesPartition entries based on the GetUpdates() flag is clean. The gob encoding of the state map is consistent with how it's decoded in StreamUpdates.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/app/options/connectorflags.go (1)

363-376: CLI flags are ignored for non-localstack DynamoDB connections.

The flags --doc-partition, --plan-parallelism, and --id are only passed to NewConn when using dynamodb://localstack. For plain dynamodb connections (line 374), these flags are silently ignored because NewConn("") is called without the option functions.

Consider applying the same options to both paths:

🔧 Proposed fix
-			}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) {
-				if strings.EqualFold(args[0], "dynamodb://localstack") {
-					_, connString, ok := strings.Cut(args[0], "://")
-					if !ok {
-						return nil, fmt.Errorf("invalid connection string %v", args[0])
-					}
-					return dynamodb.NewConn(connString,
-						dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
-						dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
-						dynamodb.WithID(c.String("id")),
-					), nil
-				} else {
-					return dynamodb.NewConn(""), nil
-				}
+			}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) {
+				connString := ""
+				if strings.EqualFold(args[0], "dynamodb://localstack") {
+					_, connString, _ = strings.Cut(args[0], "://")
+				}
+				return dynamodb.NewConn(connString,
+					dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
+					dynamodb.WithPlanParallelism(c.Int("plan-parallelism")),
+					dynamodb.WithID(c.String("id")),
+				), nil
 			}),
🤖 Fix all issues with AI agents
In @connectors/dynamodb/conn.go:
- Around line 371-378: In AWSClientHelper the inner if uses := which shadows the
outer endpoint, leaving the outer endpoint empty and preventing BaseEndpoint
from being set; change the inner declaration to assign to the existing variable
(use = instead of :=) so the outer endpoint gets the localstack URL (from
AWS_ENDPOINT_URL or default "http://localhost:4566"), then ensure any subsequent
use (e.g., setting BaseEndpoint) reads that populated endpoint variable.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ed6f01 and 9d40246.

📒 Files selected for processing (3)
  • connectors/dynamodb/conn.go
  • connectors/dynamodb/conv.go
  • internal/app/options/connectorflags.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (4)
  • NewConn (420-442)
  • WithDocsPerSegment (414-418)
  • WithPlanParallelism (408-412)
  • WithID (402-406)
connectors/dynamodb/conn.go (2)
connectors/dynamodb/client.go (1)
  • ErrNotFound (95-95)
gen/adiom/v1/messages.pb.go (3)
  • GetInfoResponse (424-433)
  • GetInfoResponse (446-446)
  • GetInfoResponse (461-463)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (8)
connectors/dynamodb/conv.go (2)

206-210: LGTM! Good addition of error context.

Wrapping errors with context helps debugging by indicating which step failed during BSON conversion.


270-297: Good fix for cross-platform consistency, but verify backward compatibility.

Using int32 ensures fixed-width (4-byte) length encoding regardless of platform architecture. This is important for binary protocol consistency.

However, if there's existing encoded data from a 64-bit system where int was 8 bytes, this change could cause decoding issues. Verify that no persistent state relies on the previous encoding format.

internal/app/options/connectorflags.go (1)

347-362: LGTM! CLI flags are well-defined.

The flag names and default values align with the option configurators in conn.go.

connectors/dynamodb/conn.go (5)

75-78: LGTM! Good error handling for missing tables.

Logging a warning and continuing instead of failing allows the sync to proceed when tables are removed during planning.


82-114: LGTM! Flow logic correctly supports initial-sync-only mode.

The conditional structure properly gates stream setup behind GetUpdates() and allows early return when only updates are requested without initial sync.


153-162: LGTM! State encoding is properly gated.

The UpdatesPartition is only created when GetUpdates() is true, correctly aligning with when stream state is collected.


396-418: LGTM! Clean functional options implementation.

The option configurators follow a standard pattern and integrate well with NewConn.


171-188: LGTM! GetInfo now returns the configured ID.

This allows users to set a custom identifier for the connector via the --id flag.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@connectors/dynamodb/conv.go`:
- Around line 74-75: The nil branch currently returns an
AttributeValueMemberNULL with the zero-value (false); update the nil case to
return &types.AttributeValueMemberNULL{Value: true} so DynamoDB NULL attributes
serialize correctly—locate the nil case in connectors/dynamodb/conv.go (the
switch/case that returns types.AttributeValueMemberNULL) and set Value: true on
the returned struct.

In `@internal/app/options/connectorflags.go`:
- Around line 347-375: The non-localstack branch currently returns
dynamodb.NewConn("") and ignores CLI flags; change it to parse the connection
string (reusing strings.Cut if a scheme is present) and call dynamodb.NewConn
with the same option builders used for localstack:
dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), and
dynamodb.WithID(c.String("id")). Update the else branch in the Create helper
(the CreateHelper block that calls dynamodb.NewConn) so both the localstack and
non-localstack paths construct the connection string and pass the options to
dynamodb.NewConn instead of returning an empty connection.
♻️ Duplicate comments (2)
connectors/dynamodb/conn.go (2)

82-109: Stream ARN is still discarded after StartStream.

The ARN returned by StartStream is ignored, so GetStreamState can use a stale/empty ARN. This was flagged earlier. Please capture and use the returned ARN.

🔧 Proposed fix
 if r.Msg.GetUpdates() {
 	if tableDetails.StreamARN == "" {
 		if c.spec == "localstack" {
 			slog.Debug("No stream found, starting stream", "table", name)
-			_, err := c.client.StartStream(egCtx, name, false)
+			tableDetails.StreamARN, err = c.client.StartStream(egCtx, name, false)
 			if err != nil {
 				return err
 			}
 		} else {
 			return fmt.Errorf("no stream found")
 		}
 	} else if tableDetails.IncompatibleStream {
 		if c.spec == "localstack" {
 			slog.Debug("Incompatible stream found, restarting stream", "table", name)
-			_, err := c.client.StartStream(egCtx, name, true)
+			tableDetails.StreamARN, err = c.client.StartStream(egCtx, name, true)
 			if err != nil {
 				return err
 			}
 		} else {
 			return fmt.Errorf("incompatible stream found")
 		}
 	}

 	state, err := c.client.GetStreamState(egCtx, tableDetails.StreamARN)

153-162: UpdatesPartition should use resolved table names.

When namespaces are auto-discovered, namespaces is empty but tableNames is populated. Using namespaces produces empty UpdatesPartition.Namespaces. This was previously flagged.

🔧 Proposed fix
-		updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: namespaces, Cursor: buf.Bytes()})
+		updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: tableNames, Cursor: buf.Bytes()})

@adiom-mark adiom-mark force-pushed the dynamodb_separate branch 4 times, most recently from 7dc6a97 to 10b8acc Compare January 22, 2026 18:39
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@connectors/dynamodb/conn.go`:
- Around line 420-441: The WithPlanParallelism setter currently assigns the raw
integer to Options.PlanParallelism which can be zero or negative and lead to
deadlocks or disabled limits when later passed to errgroup.SetLimit; update the
WithPlanParallelism function to clamp the input to a minimum of 1 before
assigning (e.g., if n < 1 then set n = 1) so Options.PlanParallelism is always
>= 1 and downstream calls like errgroup.SetLimit(Options.PlanParallelism) are
safe.
♻️ Duplicate comments (1)
connectors/dynamodb/conn.go (1)

156-165: Use resolved table names for UpdatesPartition.Namespaces.

When r.Msg.GetNamespaces() is empty, namespaces stays empty while tableNames holds the resolved tables. Using namespaces here yields empty UpdatesPartition namespaces in auto-discovery scenarios.

✅ Proposed fix
-        updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: namespaces, Cursor: buf.Bytes()})
+        updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: tableNames, Cursor: buf.Bytes()})

Comment on lines +420 to +449
ID string
DocsPerSegment int
PlanParallelism int
}

func WithID(s string) func(*Options) {
return func(o *Options) {
o.ID = s
}
}

func WithPlanParallelism(n int) func(*Options) {
return func(o *Options) {
o.PlanParallelism = n
}
}

func WithDocsPerSegment(n int) func(*Options) {
return func(o *Options) {
o.DocsPerSegment = n
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, find the file and examine the context around the WithPlanParallelism function
fd -t f "conn.go" | grep dynamodb

Repository: adiom-data/dsync

Length of output: 87


🏁 Script executed:

# Search for where PlanParallelism is used in the codebase
rg "PlanParallelism" --type go -B 3 -A 3

Repository: adiom-data/dsync

Length of output: 2523


🏁 Script executed:

# Search for errgroup.SetLimit calls to understand how PlanParallelism is used
rg "SetLimit" --type go -B 5 -A 5

Repository: adiom-data/dsync

Length of output: 4530


🌐 Web query:

golang.org/x/sync/errgroup SetLimit panic behavior zero negative values

💡 Result:

  • SetLimit(n) does not panic just because n is 0 or negative.

    • n < 0: disables the limit (“no limit”) by setting the internal semaphore to nil. [1], [2]
    • n == 0: sets a limit of zero (“prevent any new goroutines from being added”). Practically, Group.Go(...) will block forever trying to acquire the limit, and TryGo(...) will always return false (it can never acquire a slot). [1], [2]
  • When it does panic: SetLimit panics if you call it while the group still has active goroutines (i.e., the limit is being modified while tokens are currently held). The implementation checks len(g.sem) != 0 and panics in that case. [2]

Sources:
[1] golang.org/x/sync/errgroup docs on pkg.go.dev (SetLimit behavior for negative/zero, and “must not be modified while any goroutines … are active”). (pkg.go.dev)
[2] Go codereview showing the SetLimit implementation and its panic condition. (groups.google.com)

Citations:


Guard PlanParallelism to prevent deadlock or unintended unlimited concurrency.

While errgroup.SetLimit does not panic on zero or negative values, passing these values causes problematic behavior: n == 0 causes Go() calls to block indefinitely, and n < 0 disables the limit entirely. Clamp the value to a minimum of 1.

♻️ Suggested guard
 func WithPlanParallelism(n int) func(*Options) {
 	return func(o *Options) {
-		o.PlanParallelism = n
+		if n < 1 {
+			n = 1
+		}
+		o.PlanParallelism = n
 	}
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ID string
DocsPerSegment int
PlanParallelism int
}
func WithID(s string) func(*Options) {
return func(o *Options) {
o.ID = s
}
}
func WithPlanParallelism(n int) func(*Options) {
return func(o *Options) {
o.PlanParallelism = n
}
}
func WithDocsPerSegment(n int) func(*Options) {
return func(o *Options) {
o.DocsPerSegment = n
}
}
ID string
DocsPerSegment int
PlanParallelism int
}
func WithID(s string) func(*Options) {
return func(o *Options) {
o.ID = s
}
}
func WithPlanParallelism(n int) func(*Options) {
return func(o *Options) {
if n < 1 {
n = 1
}
o.PlanParallelism = n
}
}
func WithDocsPerSegment(n int) func(*Options) {
return func(o *Options) {
o.DocsPerSegment = n
}
}
🤖 Prompt for AI Agents
In `@connectors/dynamodb/conn.go` around lines 420 - 441, The WithPlanParallelism
setter currently assigns the raw integer to Options.PlanParallelism which can be
zero or negative and lead to deadlocks or disabled limits when later passed to
errgroup.SetLimit; update the WithPlanParallelism function to clamp the input to
a minimum of 1 before assigning (e.g., if n < 1 then set n = 1) so
Options.PlanParallelism is always >= 1 and downstream calls like
errgroup.SetLimit(Options.PlanParallelism) are safe.

Mark C added 2 commits January 28, 2026 09:56
Expose doc-partition and plan-parallelism params.
shuffle tasks
@adiom-mark adiom-mark enabled auto-merge (squash) January 28, 2026 17:58
@adiom-mark adiom-mark disabled auto-merge January 28, 2026 17:58
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@connectors/dynamodb/conn.go`:
- Around line 77-80: When handling ErrNotFound for a table (errors.Is(err,
ErrNotFound) with the "name" variable), do not silently return nil when the user
provided explicit namespaces; only ignore missing tables when namespaces were
auto-discovered. Modify the branch to check the namespace source (e.g., a
boolean like namespacesExplicit or a config field that indicates user-specified
namespaces) and: if the table is missing and namespaces were auto-discovered,
log the warning and return nil; if namespaces were explicitly provided, return
the ErrNotFound (or propagate err) instead of nil so the caller can handle the
missing table.

Comment on lines +77 to +80
if errors.Is(err, ErrNotFound) {
slog.Warn("Table not found. Ignoring.", "name", name)
return nil
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid silently skipping missing tables when namespaces are explicit.

If the user specified namespaces, ignoring ErrNotFound can lead to silent data loss. Consider only skipping when namespaces were auto-discovered.

🔧 Suggested fix
-			if err != nil {
-				if errors.Is(err, ErrNotFound) {
-					slog.Warn("Table not found. Ignoring.", "name", name)
-					return nil
-				}
-				return err
-			}
+			if err != nil {
+				if errors.Is(err, ErrNotFound) {
+					if len(namespaces) == 0 {
+						slog.Warn("Table not found. Ignoring.", "name", name)
+						return nil
+					}
+				}
+				return err
+			}
🤖 Prompt for AI Agents
In `@connectors/dynamodb/conn.go` around lines 77 - 80, When handling ErrNotFound
for a table (errors.Is(err, ErrNotFound) with the "name" variable), do not
silently return nil when the user provided explicit namespaces; only ignore
missing tables when namespaces were auto-discovered. Modify the branch to check
the namespace source (e.g., a boolean like namespacesExplicit or a config field
that indicates user-specified namespaces) and: if the table is missing and
namespaces were auto-discovered, log the warning and return nil; if namespaces
were explicitly provided, return the ErrNotFound (or propagate err) instead of
nil so the caller can handle the missing table.

@adiom-mark adiom-mark merged commit c652536 into main Jan 28, 2026
2 checks passed
@adiom-mark adiom-mark deleted the dynamodb_separate branch January 28, 2026 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants