-
Notifications
You must be signed in to change notification settings - Fork 11
Allow dynamodb to run initial sync only #361
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughGeneratePlan now tolerates missing/incompatible DynamoDB streams (creates/restarts them on localstack), encodes/emits UpdatesPartitions when updates are requested, short-circuits when InitialSync is false, exposes Options.ID via GetInfo and new configurators wired to CLI flags, improves NULL conversions, and respects AWS endpoint overrides. Changes
Sequence Diagram(s)sequenceDiagram
participant Requestor
participant Conn as DynamoDB Connector
participant Dynamo as DynamoDB/TableManager
participant StateCh as Updates Channel
Requestor->>Conn: GeneratePlan(r.Msg)
alt InitialSync disabled
Conn-->>Requestor: return (early exit)
else InitialSync enabled
alt Updates requested
Conn->>Dynamo: GetTableDetails
Dynamo-->>Conn: TableDetails or ErrNotFound
alt No stream or IncompatibleStream
Conn->>Dynamo: Start/Restart Stream (localstack) / error (non-local)
Dynamo-->>Conn: StreamARN/status
end
Conn->>Dynamo: GetStreamState
Dynamo-->>Conn: StreamState
Conn->>StateCh: Encode stateMap → UpdatesPartition(s)
else Updates not requested
Conn-->>Requestor: skip state emission
end
Conn-->>Requestor: return plan (may include UpdatesPartitions)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
connectors/dynamodb/conn.go (1)
70-105: Fix loop-variable capture + use the post-start StreamARN when fetching state (localstack path).Right now:
- Line 71 captures
namefrom the loop (can make all goroutines use the last table name).- Lines 79-85 can start a stream but Lines 100-105 still call
GetStreamState(..., tableDetails.StreamARN)which is still""unless you refresh it.Proposed fix
for _, name := range tableNames { + name := name // avoid loop-variable capture eg.Go(func() error { tableDetails, err := c.client.TableDetails(egCtx, name) if err != nil { return err } if r.Msg.GetUpdates() { - if tableDetails.StreamARN == "" { + streamARN := tableDetails.StreamARN + if streamARN == "" { if c.spec == "localstack" { slog.Debug("No stream found, starting stream", "table", name) _, err := c.client.StartStream(egCtx, name, false) if err != nil { return err } + // Refresh StreamARN after enabling streams. + tableDetails, err = c.client.TableDetails(egCtx, name) + if err != nil { + return err + } + streamARN = tableDetails.StreamARN + if streamARN == "" { + return fmt.Errorf("no stream found after starting stream for table %s", name) + } } else { - return fmt.Errorf("no stream found") + return fmt.Errorf("no stream found for table %s", name) } } else if tableDetails.IncompatibleStream { if c.spec == "localstack" { slog.Debug("Incompatible stream found, restarting stream", "table", name) _, err := c.client.StartStream(egCtx, name, true) if err != nil { return err } + tableDetails, err = c.client.TableDetails(egCtx, name) + if err != nil { + return err + } + streamARN = tableDetails.StreamARN } else { - return fmt.Errorf("incompatible stream found") + return fmt.Errorf("incompatible stream found for table %s", name) } } - state, err := c.client.GetStreamState(ctx, tableDetails.StreamARN) + state, err := c.client.GetStreamState(egCtx, streamARN) if err != nil { return err } statesCh <- state }
🧹 Nitpick comments (1)
connectors/dynamodb/conn.go (1)
107-109: Optional: short-circuitGeneratePlanearlier when neither InitialSync nor Updates are requested.This early return works, but it still does
TableDetails(...)per table before returning whenUpdates == falseandInitialSync == false.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
connectors/dynamodb/conn.go
🧰 Additional context used
🧬 Code graph analysis (1)
connectors/dynamodb/conn.go (1)
gen/adiom/v1/messages.pb.go (6)
UpdatesPartition(188-198)UpdatesPartition(211-211)UpdatesPartition(226-228)GeneratePlanResponse(649-655)GeneratePlanResponse(668-668)GeneratePlanResponse(683-685)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (1)
connectors/dynamodb/conn.go (1)
148-162: ConfirmUpdatesPartition.Namespacessemantics when the request omits namespaces.When
r.Msg.GetNamespaces()is empty, you currently emit an updates partition withNamespaces: namespaces(empty). That’s fine if downstream interprets empty as “all namespaces” (similar toStreamUpdatesbehavior), but it’s worth confirming this contract since the plan response is what drives the next call.
74afcb6 to
fc19035
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/app/options/connectorflags.go (1)
358-371: CLI options not applied for regular DynamoDB connections.The
doc-partitionandplan-parallelismflags are only passed toNewConnin the localstack branch. For regular AWS DynamoDB (theelsebranch at line 369), these user-specified options are silently ignored.🐛 Proposed fix to apply options consistently
}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) { + opts := []func(*dynamodb.Options){ + dynamodb.WithDocsPerSegment(c.Int("doc-partition")), + dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), + } if strings.EqualFold(args[0], "dynamodb://localstack") { _, connString, ok := strings.Cut(args[0], "://") if !ok { return nil, fmt.Errorf("invalid connection string %v", args[0]) } - return dynamodb.NewConn(connString, - dynamodb.WithDocsPerSegment(c.Int("doc-partition")), - dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), - ), nil + return dynamodb.NewConn(connString, opts...), nil } else { - return dynamodb.NewConn(""), nil + return dynamodb.NewConn("", opts...), nil } }),
🤖 Fix all issues with AI agents
In @connectors/dynamodb/conn.go:
- Around line 77-105: When starting or restarting a stream in the goroutine,
capture and update tableDetails.StreamARN with the ARN returned by
c.client.StartStream(...) instead of discarding it so subsequent calls use the
new ARN; then call c.client.GetStreamState using egCtx (not ctx) to respect the
errgroup cancellation; specifically, in the branches that call
c.client.StartStream(egCtx, name, ...) assign the returned ARN to
tableDetails.StreamARN (or a local var that you then pass to GetStreamState) and
replace the GetStreamState(ctx, tableDetails.StreamARN) call with
GetStreamState(egCtx, tableDetails.StreamARN) so statesCh receives the correct
stream state.
🧹 Nitpick comments (1)
connectors/dynamodb/conn.go (1)
115-118: Redundant check forGetInitialSync().Since the code returns early on line 108 when
!r.Msg.GetInitialSync(), this check is always true when reached. Consider simplifying.Suggested simplification
- if r.Msg.GetInitialSync() && c.options.DocsPerSegment > 0 { + if c.options.DocsPerSegment > 0 { totalSegments = int(tableDetails.Count / uint64(c.options.DocsPerSegment)) totalSegments = max(1, min(1000000, totalSegments)) }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
connectors/dynamodb/conn.gointernal/app/options/connectorflags.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (3)
NewConn(404-426)WithDocsPerSegment(398-402)WithPlanParallelism(392-396)
connectors/dynamodb/conn.go (2)
gen/adiom/v1/messages.pb.go (6)
UpdatesPartition(188-198)UpdatesPartition(211-211)UpdatesPartition(226-228)GeneratePlanResponse(649-655)GeneratePlanResponse(668-668)GeneratePlanResponse(683-685)java/src/main/java/adiom/v1/Messages.java (2)
UpdatesPartition(1041-1957)GeneratePlanResponse(8653-9710)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
internal/app/options/connectorflags.go (1)
347-357: LGTM on CLI flag definitions.The new
doc-partitionandplan-parallelismflags are properly defined with sensible defaults that match the connector's internal defaults.connectors/dynamodb/conn.go (3)
107-109: LGTM!The early return when
InitialSyncis disabled correctly allows the connector to run in updates-only mode.
148-157: LGTM!The conditional building of the
UpdatesPartitionslice correctly handles the updates-only flow, encoding the stream state map into the cursor.
392-402: LGTM!The option configurators follow idiomatic Go functional options pattern and integrate cleanly with
NewConn.
fc19035 to
f754f61
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/app/options/connectorflags.go (1)
358-371: CLI flags are ignored for non-localstack DynamoDB connections.When using
dynamodb(notdynamodb://localstack), line 369 callsdynamodb.NewConn("")without passing theWithDocsPerSegmentandWithPlanParallelismoptions. Users would expect--doc-partitionand--plan-parallelismflags to work regardless of the connection string.Proposed fix
}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) { + opts := []func(*dynamodb.Options){ + dynamodb.WithDocsPerSegment(c.Int("doc-partition")), + dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), + } if strings.EqualFold(args[0], "dynamodb://localstack") { _, connString, ok := strings.Cut(args[0], "://") if !ok { return nil, fmt.Errorf("invalid connection string %v", args[0]) } - return dynamodb.NewConn(connString, - dynamodb.WithDocsPerSegment(c.Int("doc-partition")), - dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), - ), nil + return dynamodb.NewConn(connString, opts...), nil } else { - return dynamodb.NewConn(""), nil + return dynamodb.NewConn("", opts...), nil } }),
🤖 Fix all issues with AI agents
In @connectors/dynamodb/conn.go:
- Around line 100-104: The call to c.client.GetStreamState uses the parent ctx
variable instead of the errgroup context egCtx, so if another goroutine errors
out GetStreamState won't be cancelled; update the invocation to use egCtx (i.e.,
replace ctx with egCtx in the GetStreamState call) ensuring the call runs with
the errgroup's context and preserves existing error handling that returns err
and sends state to statesCh.
- Around line 148-157: The UpdatesPartition being appended uses the wrong
variable for Namespaces; when auto-discovery populates tableNames (via
GetAllTableNames()) namespaces can be empty. In the block that builds updates
(the append to updates in the r.Msg.GetUpdates() branch), change the Namespaces
field from using namespaces to using tableNames so UpdatesPartition.Namespaces
contains the actual resolved table names (i.e., set Namespaces to tableNames
when creating the adiomv1.UpdatesPartition with the encoded stateMap cursor).
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
connectors/dynamodb/conn.goconnectors/dynamodb/conv.gointernal/app/options/connectorflags.go
🧰 Additional context used
🧬 Code graph analysis (2)
connectors/dynamodb/conn.go (1)
gen/adiom/v1/messages.pb.go (6)
UpdatesPartition(188-198)UpdatesPartition(211-211)UpdatesPartition(226-228)GeneratePlanResponse(649-655)GeneratePlanResponse(668-668)GeneratePlanResponse(683-685)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (3)
NewConn(404-426)WithDocsPerSegment(398-402)WithPlanParallelism(392-396)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
connectors/dynamodb/conv.go (2)
204-211: LGTM!Good improvement wrapping errors with contextual messages. This will help with debugging when conversion failures occur in
itemsToBson.
270-297: LGTM!The
int32cast is correct forbinary.Writewhich requires fixed-size types. This ensures consistent 4-byte length prefixes across platforms. Given DynamoDB's 400KB attribute value limit, overflow is not a concern.connectors/dynamodb/conn.go (2)
392-402: LGTM!Clean functional options pattern implementation. The configurators correctly modify the
Optionsstruct fields.
107-109: LGTM!The early return correctly allows "updates only" mode by skipping partition generation when
GetInitialSync()is false.
| var updates []*adiomv1.UpdatesPartition | ||
| if r.Msg.GetUpdates() { | ||
| var buf bytes.Buffer | ||
| enc := gob.NewEncoder(&buf) | ||
| err = enc.Encode(stateMap) | ||
| if err != nil { | ||
| return nil, connect.NewError(connect.CodeInternal, err) | ||
| } | ||
| updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: namespaces, Cursor: buf.Bytes()}) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the full context around lines 148-157 and the earlier part (lines 34-35)
cd connectors/dynamodb && cat -n conn.go | sed -n '30,165p'Repository: adiom-data/dsync
Length of output: 4683
🏁 Script executed:
# Search for UpdatesPartition usage to see how Namespaces is consumed
rg -n -B2 -A5 'UpdatesPartition' --type=go -g '!*_test.go'Repository: adiom-data/dsync
Length of output: 20733
🏁 Script executed:
# Look for the proto definition of UpdatesPartition to understand the Namespaces field
find . -name '*.proto' -type f | head -20Repository: adiom-data/dsync
Length of output: 150
🏁 Script executed:
# Verify the fix: check what tableNames contains and how it should be used
cat -n connectors/dynamodb/conn.go | sed -n '148,157p'Repository: adiom-data/dsync
Length of output: 461
🏁 Script executed:
# Check the proto definition for UpdatesPartition to understand the expected semantics
cat proto/adiom/v1/messages.proto | grep -A 10 "message UpdatesPartition"Repository: adiom-data/dsync
Length of output: 397
Use tableNames instead of namespaces for UpdatesPartition.Namespaces at line 156.
When r.Msg.GetNamespaces() is empty (lines 34-40), tableNames is populated via GetAllTableNames(), but the namespaces variable remains empty. Line 156 incorrectly uses the empty namespaces variable instead of the resolved tableNames, causing UpdatesPartition.Namespaces to be empty in auto-discovery scenarios. This is inconsistent with regular Partition objects (line 112) which correctly use actual table names, and breaks downstream code in pkg/test/connector.go and internal/app/verify.go that expects Namespaces to contain the tables being streamed.
Change line 156 to: updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: tableNames, Cursor: buf.Bytes()})
🤖 Prompt for AI Agents
In @connectors/dynamodb/conn.go around lines 148 - 157, The UpdatesPartition
being appended uses the wrong variable for Namespaces; when auto-discovery
populates tableNames (via GetAllTableNames()) namespaces can be empty. In the
block that builds updates (the append to updates in the r.Msg.GetUpdates()
branch), change the Namespaces field from using namespaces to using tableNames
so UpdatesPartition.Namespaces contains the actual resolved table names (i.e.,
set Namespaces to tableNames when creating the adiomv1.UpdatesPartition with the
encoded stateMap cursor).
f754f61 to
95b9f9c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
connectors/dynamodb/conn.go (2)
111-113: Inconsistent condition:GetInitialSync()is checked twice.Line 111 checks
!r.Msg.GetInitialSync()to return early, but line 119 also checksr.Msg.GetInitialSync()as a condition. After line 111, the code can only continue ifGetInitialSync()is true, making the condition on line 119 always true and redundant.♻️ Suggested simplification
if !r.Msg.GetInitialSync() { return nil } // TODO: reconsider how to map namespaces properly ns := name totalSegments := 1 - if r.Msg.GetInitialSync() && c.options.DocsPerSegment > 0 { + if c.options.DocsPerSegment > 0 { totalSegments = int(tableDetails.Count / uint64(c.options.DocsPerSegment)) totalSegments = max(1, min(1000000, totalSegments)) }
144-150: Dead code path:ErrNotFoundis already handled in the loop.Since
ErrNotFoundis handled at line 74-77 by logging and returningnil, theerrors.Is(err, ErrNotFound)check at line 146 will never be true. The error group will not receive anErrNotFoundfrom the table processing loop.This check could be removed or kept for defensive purposes if
ErrNotFoundhandling may change in the future.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
connectors/dynamodb/conn.goconnectors/dynamodb/conv.gointernal/app/options/connectorflags.go
🧰 Additional context used
🧬 Code graph analysis (2)
connectors/dynamodb/conn.go (2)
connectors/dynamodb/client.go (1)
ErrNotFound(95-95)gen/adiom/v1/messages.pb.go (6)
UpdatesPartition(188-198)UpdatesPartition(211-211)UpdatesPartition(226-228)GeneratePlanResponse(649-655)GeneratePlanResponse(668-668)GeneratePlanResponse(683-685)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (3)
NewConn(408-430)WithDocsPerSegment(402-406)WithPlanParallelism(396-400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (6)
connectors/dynamodb/conv.go (2)
206-210: LGTM! Good error context addition.Wrapping errors with context (
"err in key to bson"and"err in to bson") improves debuggability when BSON encoding fails during item conversion.
273-292: LGTM! Explicit int32 ensures consistent cross-platform encoding.Switching to explicit
int32for length encoding guarantees 4-byte lengths regardless of platform architecture, which is important for binary protocol consistency with state encoding used inconn.go.connectors/dynamodb/conn.go (3)
74-78: LGTM! Graceful handling of missing tables.When a table is not found during plan generation, logging a warning and continuing allows partial plans to be generated, which is a reasonable approach for dynamic environments.
152-161: LGTM! Updates partition construction is correct.The state map is properly gob-encoded and used to construct an
UpdatesPartitiononly when updates are requested.
396-406: LGTM! Option configurators follow idiomatic functional options pattern.The
WithPlanParallelismandWithDocsPerSegmentconfigurators are well-structured and align with Go's functional options pattern.internal/app/options/connectorflags.go (1)
347-371: CLI flags are only applied forlocalstack, not for standard DynamoDB.The
doc-partitionandplan-parallelismflags are defined but only passed todynamodb.NewConnwhen usingdynamodb://localstack. For the standarddynamodbcase (line 369), the flags are ignored and the connector is created with default options only.If this is intentional (e.g., production DynamoDB should always use defaults), this is fine. Otherwise, consider applying the flags consistently.
♻️ Suggested fix if flags should apply to both cases
}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) { + opts := []func(*dynamodb.Options){ + dynamodb.WithDocsPerSegment(c.Int("doc-partition")), + dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), + } if strings.EqualFold(args[0], "dynamodb://localstack") { _, connString, ok := strings.Cut(args[0], "://") if !ok { return nil, fmt.Errorf("invalid connection string %v", args[0]) } - return dynamodb.NewConn(connString, - dynamodb.WithDocsPerSegment(c.Int("doc-partition")), - dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), - ), nil + return dynamodb.NewConn(connString, opts...), nil } else { - return dynamodb.NewConn(""), nil + return dynamodb.NewConn("", opts...), nil } }),
95b9f9c to
3ed6f01
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @connectors/dynamodb/conn.go:
- Around line 370-377: The localstack endpoint logic in AWSClientHelper is
shadowing the outer variable: change the inner declaration so you assign to the
existing outer variable (use = instead of :=) when setting endpoint based on
connStr == "localstack", ensure AWSClientHelper uses that endpoint variable
later when creating the dynamodb.Client and dynamodbstreams.Client so the
localstack URL is actually passed through.
🧹 Nitpick comments (1)
internal/app/options/connectorflags.go (1)
358-371: CLI flags are ignored for non-localstack DynamoDB connections.The
doc-partitionandplan-parallelismflags are only applied when usingdynamodb://localstack. Users specifyingdynamodbwill have their flag values silently ignored (line 369 passes no options).Consider applying the flags consistently:
♻️ Proposed fix
}, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) { + opts := []func(*dynamodb.Options){ + dynamodb.WithDocsPerSegment(c.Int("doc-partition")), + dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), + } if strings.EqualFold(args[0], "dynamodb://localstack") { _, connString, ok := strings.Cut(args[0], "://") if !ok { return nil, fmt.Errorf("invalid connection string %v", args[0]) } - return dynamodb.NewConn(connString, - dynamodb.WithDocsPerSegment(c.Int("doc-partition")), - dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), - ), nil + return dynamodb.NewConn(connString, opts...), nil } else { - return dynamodb.NewConn(""), nil + return dynamodb.NewConn("", opts...), nil } }),
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
connectors/dynamodb/conn.goconnectors/dynamodb/conv.gointernal/app/options/connectorflags.go
🚧 Files skipped from review as they are similar to previous changes (1)
- connectors/dynamodb/conv.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (3)
NewConn(412-434)WithDocsPerSegment(406-410)WithPlanParallelism(400-404)
connectors/dynamodb/conn.go (2)
connectors/dynamodb/client.go (1)
ErrNotFound(95-95)gen/adiom/v1/messages.pb.go (6)
UpdatesPartition(188-198)UpdatesPartition(211-211)UpdatesPartition(226-228)GeneratePlanResponse(649-655)GeneratePlanResponse(668-668)GeneratePlanResponse(683-685)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
connectors/dynamodb/conn.go (4)
75-80: LGTM! Good defensive handling.Gracefully skipping tables that are not found instead of failing the entire plan is a reasonable approach for robustness.
82-114: Logic flow concern: early exit placement.The early exit at lines 112-114 (
if !r.Msg.GetInitialSync()) is placed after the updates handling block (lines 82-110). This means ifGetUpdates()is true butGetInitialSync()is false, the code will still perform stream setup and state retrieval before returning early without generating partitions.Verify this is the intended behavior—when only streaming updates without initial sync, you still want to prepare the stream state but skip partition generation.
400-410: LGTM! Clean functional options pattern.The option configurators follow Go's idiomatic functional options pattern and integrate well with
NewConn.
153-162: LGTM!The dynamic building of
UpdatesPartitionentries based on theGetUpdates()flag is clean. The gob encoding of the state map is consistent with how it's decoded inStreamUpdates.
3ed6f01 to
9d40246
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/app/options/connectorflags.go (1)
363-376: CLI flags are ignored for non-localstack DynamoDB connections.The flags
--doc-partition,--plan-parallelism, and--idare only passed toNewConnwhen usingdynamodb://localstack. For plaindynamodbconnections (line 374), these flags are silently ignored becauseNewConn("")is called without the option functions.Consider applying the same options to both paths:
🔧 Proposed fix
- }, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) { - if strings.EqualFold(args[0], "dynamodb://localstack") { - _, connString, ok := strings.Cut(args[0], "://") - if !ok { - return nil, fmt.Errorf("invalid connection string %v", args[0]) - } - return dynamodb.NewConn(connString, - dynamodb.WithDocsPerSegment(c.Int("doc-partition")), - dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), - dynamodb.WithID(c.String("id")), - ), nil - } else { - return dynamodb.NewConn(""), nil - } + }, func(c *cli.Context, args []string, _ AdditionalSettings) (adiomv1connect.ConnectorServiceHandler, error) { + connString := "" + if strings.EqualFold(args[0], "dynamodb://localstack") { + _, connString, _ = strings.Cut(args[0], "://") + } + return dynamodb.NewConn(connString, + dynamodb.WithDocsPerSegment(c.Int("doc-partition")), + dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), + dynamodb.WithID(c.String("id")), + ), nil }),
🤖 Fix all issues with AI agents
In @connectors/dynamodb/conn.go:
- Around line 371-378: In AWSClientHelper the inner if uses := which shadows the
outer endpoint, leaving the outer endpoint empty and preventing BaseEndpoint
from being set; change the inner declaration to assign to the existing variable
(use = instead of :=) so the outer endpoint gets the localstack URL (from
AWS_ENDPOINT_URL or default "http://localhost:4566"), then ensure any subsequent
use (e.g., setting BaseEndpoint) reads that populated endpoint variable.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
connectors/dynamodb/conn.goconnectors/dynamodb/conv.gointernal/app/options/connectorflags.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/app/options/connectorflags.go (1)
connectors/dynamodb/conn.go (4)
NewConn(420-442)WithDocsPerSegment(414-418)WithPlanParallelism(408-412)WithID(402-406)
connectors/dynamodb/conn.go (2)
connectors/dynamodb/client.go (1)
ErrNotFound(95-95)gen/adiom/v1/messages.pb.go (3)
GetInfoResponse(424-433)GetInfoResponse(446-446)GetInfoResponse(461-463)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (8)
connectors/dynamodb/conv.go (2)
206-210: LGTM! Good addition of error context.Wrapping errors with context helps debugging by indicating which step failed during BSON conversion.
270-297: Good fix for cross-platform consistency, but verify backward compatibility.Using
int32ensures fixed-width (4-byte) length encoding regardless of platform architecture. This is important for binary protocol consistency.However, if there's existing encoded data from a 64-bit system where
intwas 8 bytes, this change could cause decoding issues. Verify that no persistent state relies on the previous encoding format.internal/app/options/connectorflags.go (1)
347-362: LGTM! CLI flags are well-defined.The flag names and default values align with the option configurators in
conn.go.connectors/dynamodb/conn.go (5)
75-78: LGTM! Good error handling for missing tables.Logging a warning and continuing instead of failing allows the sync to proceed when tables are removed during planning.
82-114: LGTM! Flow logic correctly supports initial-sync-only mode.The conditional structure properly gates stream setup behind
GetUpdates()and allows early return when only updates are requested without initial sync.
153-162: LGTM! State encoding is properly gated.The
UpdatesPartitionis only created whenGetUpdates()is true, correctly aligning with when stream state is collected.
396-418: LGTM! Clean functional options implementation.The option configurators follow a standard pattern and integrate well with
NewConn.
171-188: LGTM! GetInfo now returns the configured ID.This allows users to set a custom identifier for the connector via the
--idflag.
9d40246 to
6a2f956
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@connectors/dynamodb/conv.go`:
- Around line 74-75: The nil branch currently returns an
AttributeValueMemberNULL with the zero-value (false); update the nil case to
return &types.AttributeValueMemberNULL{Value: true} so DynamoDB NULL attributes
serialize correctly—locate the nil case in connectors/dynamodb/conv.go (the
switch/case that returns types.AttributeValueMemberNULL) and set Value: true on
the returned struct.
In `@internal/app/options/connectorflags.go`:
- Around line 347-375: The non-localstack branch currently returns
dynamodb.NewConn("") and ignores CLI flags; change it to parse the connection
string (reusing strings.Cut if a scheme is present) and call dynamodb.NewConn
with the same option builders used for localstack:
dynamodb.WithDocsPerSegment(c.Int("doc-partition")),
dynamodb.WithPlanParallelism(c.Int("plan-parallelism")), and
dynamodb.WithID(c.String("id")). Update the else branch in the Create helper
(the CreateHelper block that calls dynamodb.NewConn) so both the localstack and
non-localstack paths construct the connection string and pass the options to
dynamodb.NewConn instead of returning an empty connection.
♻️ Duplicate comments (2)
connectors/dynamodb/conn.go (2)
82-109: Stream ARN is still discarded after StartStream.The ARN returned by
StartStreamis ignored, soGetStreamStatecan use a stale/empty ARN. This was flagged earlier. Please capture and use the returned ARN.🔧 Proposed fix
if r.Msg.GetUpdates() { if tableDetails.StreamARN == "" { if c.spec == "localstack" { slog.Debug("No stream found, starting stream", "table", name) - _, err := c.client.StartStream(egCtx, name, false) + tableDetails.StreamARN, err = c.client.StartStream(egCtx, name, false) if err != nil { return err } } else { return fmt.Errorf("no stream found") } } else if tableDetails.IncompatibleStream { if c.spec == "localstack" { slog.Debug("Incompatible stream found, restarting stream", "table", name) - _, err := c.client.StartStream(egCtx, name, true) + tableDetails.StreamARN, err = c.client.StartStream(egCtx, name, true) if err != nil { return err } } else { return fmt.Errorf("incompatible stream found") } } state, err := c.client.GetStreamState(egCtx, tableDetails.StreamARN)
153-162: UpdatesPartition should use resolved table names.When namespaces are auto-discovered,
namespacesis empty buttableNamesis populated. Usingnamespacesproduces emptyUpdatesPartition.Namespaces. This was previously flagged.🔧 Proposed fix
- updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: namespaces, Cursor: buf.Bytes()}) + updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: tableNames, Cursor: buf.Bytes()})
7dc6a97 to
10b8acc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@connectors/dynamodb/conn.go`:
- Around line 420-441: The WithPlanParallelism setter currently assigns the raw
integer to Options.PlanParallelism which can be zero or negative and lead to
deadlocks or disabled limits when later passed to errgroup.SetLimit; update the
WithPlanParallelism function to clamp the input to a minimum of 1 before
assigning (e.g., if n < 1 then set n = 1) so Options.PlanParallelism is always
>= 1 and downstream calls like errgroup.SetLimit(Options.PlanParallelism) are
safe.
♻️ Duplicate comments (1)
connectors/dynamodb/conn.go (1)
156-165: Use resolved table names for UpdatesPartition.Namespaces.When
r.Msg.GetNamespaces()is empty,namespacesstays empty whiletableNamesholds the resolved tables. Usingnamespaceshere yields empty UpdatesPartition namespaces in auto-discovery scenarios.✅ Proposed fix
- updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: namespaces, Cursor: buf.Bytes()}) + updates = append(updates, &adiomv1.UpdatesPartition{Namespaces: tableNames, Cursor: buf.Bytes()})
| ID string | ||
| DocsPerSegment int | ||
| PlanParallelism int | ||
| } | ||
|
|
||
| func WithID(s string) func(*Options) { | ||
| return func(o *Options) { | ||
| o.ID = s | ||
| } | ||
| } | ||
|
|
||
| func WithPlanParallelism(n int) func(*Options) { | ||
| return func(o *Options) { | ||
| o.PlanParallelism = n | ||
| } | ||
| } | ||
|
|
||
| func WithDocsPerSegment(n int) func(*Options) { | ||
| return func(o *Options) { | ||
| o.DocsPerSegment = n | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, find the file and examine the context around the WithPlanParallelism function
fd -t f "conn.go" | grep dynamodbRepository: adiom-data/dsync
Length of output: 87
🏁 Script executed:
# Search for where PlanParallelism is used in the codebase
rg "PlanParallelism" --type go -B 3 -A 3Repository: adiom-data/dsync
Length of output: 2523
🏁 Script executed:
# Search for errgroup.SetLimit calls to understand how PlanParallelism is used
rg "SetLimit" --type go -B 5 -A 5Repository: adiom-data/dsync
Length of output: 4530
🌐 Web query:
golang.org/x/sync/errgroup SetLimit panic behavior zero negative values
💡 Result:
-
SetLimit(n)does not panic just becausenis0or negative.n < 0: disables the limit (“no limit”) by setting the internal semaphore tonil. [1], [2]n == 0: sets a limit of zero (“prevent any new goroutines from being added”). Practically,Group.Go(...)will block forever trying to acquire the limit, andTryGo(...)will always returnfalse(it can never acquire a slot). [1], [2]
-
When it does panic:
SetLimitpanics if you call it while the group still has active goroutines (i.e., the limit is being modified while tokens are currently held). The implementation checkslen(g.sem) != 0and panics in that case. [2]
Sources:
[1] golang.org/x/sync/errgroup docs on pkg.go.dev (SetLimit behavior for negative/zero, and “must not be modified while any goroutines … are active”). (pkg.go.dev)
[2] Go codereview showing the SetLimit implementation and its panic condition. (groups.google.com)
Citations:
- 1: https://pkg.go.dev/golang.org/x/sync/errgroup?utm_source=openai
- 2: https://groups.google.com/g/golang-codereviews/c/7R9CSJ8ZC24?utm_source=openai
Guard PlanParallelism to prevent deadlock or unintended unlimited concurrency.
While errgroup.SetLimit does not panic on zero or negative values, passing these values causes problematic behavior: n == 0 causes Go() calls to block indefinitely, and n < 0 disables the limit entirely. Clamp the value to a minimum of 1.
♻️ Suggested guard
func WithPlanParallelism(n int) func(*Options) {
return func(o *Options) {
- o.PlanParallelism = n
+ if n < 1 {
+ n = 1
+ }
+ o.PlanParallelism = n
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ID string | |
| DocsPerSegment int | |
| PlanParallelism int | |
| } | |
| func WithID(s string) func(*Options) { | |
| return func(o *Options) { | |
| o.ID = s | |
| } | |
| } | |
| func WithPlanParallelism(n int) func(*Options) { | |
| return func(o *Options) { | |
| o.PlanParallelism = n | |
| } | |
| } | |
| func WithDocsPerSegment(n int) func(*Options) { | |
| return func(o *Options) { | |
| o.DocsPerSegment = n | |
| } | |
| } | |
| ID string | |
| DocsPerSegment int | |
| PlanParallelism int | |
| } | |
| func WithID(s string) func(*Options) { | |
| return func(o *Options) { | |
| o.ID = s | |
| } | |
| } | |
| func WithPlanParallelism(n int) func(*Options) { | |
| return func(o *Options) { | |
| if n < 1 { | |
| n = 1 | |
| } | |
| o.PlanParallelism = n | |
| } | |
| } | |
| func WithDocsPerSegment(n int) func(*Options) { | |
| return func(o *Options) { | |
| o.DocsPerSegment = n | |
| } | |
| } |
🤖 Prompt for AI Agents
In `@connectors/dynamodb/conn.go` around lines 420 - 441, The WithPlanParallelism
setter currently assigns the raw integer to Options.PlanParallelism which can be
zero or negative and lead to deadlocks or disabled limits when later passed to
errgroup.SetLimit; update the WithPlanParallelism function to clamp the input to
a minimum of 1 before assigning (e.g., if n < 1 then set n = 1) so
Options.PlanParallelism is always >= 1 and downstream calls like
errgroup.SetLimit(Options.PlanParallelism) are safe.
ba679b6 to
b6bafb5
Compare
Expose doc-partition and plan-parallelism params.
shuffle tasks
b6bafb5 to
d0546b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@connectors/dynamodb/conn.go`:
- Around line 77-80: When handling ErrNotFound for a table (errors.Is(err,
ErrNotFound) with the "name" variable), do not silently return nil when the user
provided explicit namespaces; only ignore missing tables when namespaces were
auto-discovered. Modify the branch to check the namespace source (e.g., a
boolean like namespacesExplicit or a config field that indicates user-specified
namespaces) and: if the table is missing and namespaces were auto-discovered,
log the warning and return nil; if namespaces were explicitly provided, return
the ErrNotFound (or propagate err) instead of nil so the caller can handle the
missing table.
| if errors.Is(err, ErrNotFound) { | ||
| slog.Warn("Table not found. Ignoring.", "name", name) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid silently skipping missing tables when namespaces are explicit.
If the user specified namespaces, ignoring ErrNotFound can lead to silent data loss. Consider only skipping when namespaces were auto-discovered.
🔧 Suggested fix
- if err != nil {
- if errors.Is(err, ErrNotFound) {
- slog.Warn("Table not found. Ignoring.", "name", name)
- return nil
- }
- return err
- }
+ if err != nil {
+ if errors.Is(err, ErrNotFound) {
+ if len(namespaces) == 0 {
+ slog.Warn("Table not found. Ignoring.", "name", name)
+ return nil
+ }
+ }
+ return err
+ }🤖 Prompt for AI Agents
In `@connectors/dynamodb/conn.go` around lines 77 - 80, When handling ErrNotFound
for a table (errors.Is(err, ErrNotFound) with the "name" variable), do not
silently return nil when the user provided explicit namespaces; only ignore
missing tables when namespaces were auto-discovered. Modify the branch to check
the namespace source (e.g., a boolean like namespacesExplicit or a config field
that indicates user-specified namespaces) and: if the table is missing and
namespaces were auto-discovered, log the warning and return nil; if namespaces
were explicitly provided, return the ErrNotFound (or propagate err) instead of
nil so the caller can handle the missing table.
Summary by CodeRabbit
Refactor
New Features
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.