Improve signal-to-noise ratio during deployment (TUI)#1750
jordanstephens wants to merge 43 commits into main
📝 Walkthrough
Adds a Bubble Tea TUI for compose up, replaces polling with a streaming WatchServiceState plus callback, introduces Monitor to coordinate monitoring and tailing, refactors service listing to ServiceEndpoint with healthcheck logic, updates agent CLI interfaces/mocks, and bumps a Nix vendor hash and Go deps.
Sequence Diagram(s)
sequenceDiagram
participant User as User/CLI
participant Compose as Compose Command
participant UI as Bubble Tea UI
participant Monitor as Monitor / WatchServiceState
participant Provider as Backend Provider
User->>Compose: defang compose up
Compose->>Compose: choose mode (interactive vs verbose/tail)
alt Interactive
Compose->>UI: start MonitorWithUI
Compose->>Monitor: Monitor(ctx, project, provider, timeout, deploymentID, callback)
Monitor->>Provider: subscribe stream
Provider-->>Monitor: SubscribeResponse messages
Monitor->>UI: invoke callback / send updates
UI->>User: render live service statuses
Monitor->>Compose: return final states
Compose->>UI: stop UI and optionally tail logs
else Verbose / Non-interactive
Compose->>Monitor: TailAndMonitor (monitor + tail)
Monitor->>Provider: stream states and logs
Provider-->>User: logs streamed to terminal
Monitor->>Compose: return final states
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
d17d7ad to 2fa03a8
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @src/pkg/cli/subscribe.go:
- Around line 71-74: The goroutine that reads from serverStream can exit without
sending anything to errChan when serverStream.Receive() returns false and
serverStream.Err() is nil, which causes the main select loop to block; modify
the reader goroutine (the one calling serverStream.Receive() and
serverStream.Err()) to always signal completion by sending a sentinel error
(e.g., io.EOF or a package-level errStreamClosed) to errChan or by closing
errChan before returning, and update the main select loop that waits on errChan
to treat that sentinel/closed channel as stream completion and exit or continue
appropriately (handle errStreamClosed/io.EOF in the select case that currently
expects errors from errChan).
🧹 Nitpick comments (4)
src/cmd/cli/command/compose.go (2)
174-179: Silently swallowing UI errors may hide issues.
The error from p.Run() is discarded. While UI errors may not be critical, at minimum consider logging them for debugging purposes.
🔎 Proposed fix
 go func() {
 	defer wg.Done()
 	if _, err := p.Run(); err != nil {
-		// Handle UI errors if needed
+		term.Debug("UI error:", err)
 	}
 }()
203-215: Duplicate function: splitManagedAndUnmanagedServices exists in tailAndMonitor.go.
This function is duplicated from src/pkg/cli/tailAndMonitor.go (lines 134-145). Consider importing it from the cli package instead of duplicating it here.
🔎 Proposed fix
-func splitManagedAndUnmanagedServices(serviceInfos compose.Services) ([]string, []string) {
-	var managedServices []string
-	var unmanagedServices []string
-	for _, service := range serviceInfos {
-		if cli.CanMonitorService(&service) {
-			unmanagedServices = append(unmanagedServices, service.Name)
-		} else {
-			managedServices = append(managedServices, service.Name)
-		}
-	}
-
-	return managedServices, unmanagedServices
-}
Then use the function from the cli package directly where needed, or export splitManagedAndUnmanagedServices from the cli package if it isn't already.
src/pkg/cli/tailAndMonitor.go (2)
19-21: Panic for invalid input may be too aggressive for a library function.
Panicking on empty deploymentID will crash the program. If this is called with user-provided or dynamically-determined input, consider returning an error instead. If this is strictly an internal function where the caller guarantees validity, the panic is acceptable as a programming error detector.
🔎 Proposed fix to return an error instead
 func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
 	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
+		return nil, errors.New("deploymentID must be a valid deployment ID")
 	}
53-56: Clarify the purpose of the 2-second sleep before returning.
The SleepWithContext call at line 54 delays the return by 2 seconds. There's also another 2-second sleep at line 79 in TailAndMonitor. If both are executed in the TailAndMonitor path, that's 4 seconds of delay. Consider adding a comment explaining why this delay is needed, or consolidate if both serve the same purpose.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
pkgs/defang/cli.nix
src/cmd/cli/command/compose.go
src/go.mod
src/pkg/cli/client/errors.go
src/pkg/cli/subscribe.go
src/pkg/cli/subscribe_test.go
src/pkg/cli/tailAndMonitor.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.
Applied to files:
src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In the Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.
Applied to files:
src/pkg/cli/subscribe_test.go
src/pkg/cli/client/errors.go
src/pkg/cli/tailAndMonitor.go
src/pkg/cli/subscribe.go
src/cmd/cli/command/compose.go
🧬 Code graph analysis (2)
src/pkg/cli/tailAndMonitor.go (6)
src/pkg/cli/client/provider.go (1)
Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (10)
Provider (28-28), Provider (66-68), Provider (70-72), Provider (79-81), SubscribeResponse (4202-4211), SubscribeResponse (4224-4224), SubscribeResponse (4239-4241), Deployment (2822-2838), Deployment (2851-2851), Deployment (2866-2868)
src/pkg/cli/subscribe.go (2)
ServiceStates (15-15), WatchServiceState (17-114)
src/pkg/cli/waitForCdTaskExit.go (1)
WaitForCdTaskExit (14-41)
src/pkg/utils.go (1)
SleepWithContext (116-125)
src/pkg/cli/tail.go (1)
TailOptions (45-58)
src/pkg/cli/subscribe.go (3)
src/protos/io/defang/v1/fabric.pb.go (15)
SubscribeResponse (4202-4211), SubscribeResponse (4224-4224), SubscribeResponse (4239-4241), Status (3248-3253), Status (3266-3266), Status (3281-3283), ServiceState (135-135), ServiceState (199-201), ServiceState (203-205), ServiceState (212-214), ServiceState_BUILD_FAILED (151-151), ServiceState_DEPLOYMENT_FAILED (150-150), Service (3835-3840), Service (3853-3853), Service (3868-3870)
src/pkg/types/etag.go (1)
ETag (9-9)
src/pkg/cli/client/errors.go (2)
ErrDeploymentSucceeded (8-8), ErrDeploymentFailed (10-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: nix-shell-test
- GitHub Check: go-playground-test
- GitHub Check: Analyze (go)
🔇 Additional comments (9)
src/pkg/cli/client/errors.go (1)
3-8: LGTM! Sentinel error for deployment success signaling.
Using ErrDeploymentSucceeded as a sentinel to break out of the streaming loop is a clean pattern. The callback in WatchServiceState correctly converts this to a nil return, so it won't propagate as an actual error to callers.
pkgs/defang/cli.nix (1)
10-10: LGTM! Vendor hash updated for new dependencies.
The hash change is expected given the new Bubble Tea ecosystem dependencies added to go.mod.
src/pkg/cli/subscribe_test.go (1)
225-227: LGTM! Corrected test assertion message.
The previous message "Unexpected error" was misleading since this branch handles the case where err == nil but an error was expected. "Expected error but got nil" accurately describes the failure condition.
src/cmd/cli/command/compose.go (2)
374-374: Redundant error check after monitorWithUI.
Line 374 checks if err != nil. The tail branch returns early (return err at line 368, inside the if tail block), so the check at line 374 only applies to the monitorWithUI path, where line 372 assigns err via serviceStates, err = monitorWithUI(...). The logic is correct, but the flow is a bit hard to follow.
362-373: Verify behavior when monitorWithUI returns both states and error.
When monitorWithUI returns an error, the code continues to lines 374+, which handle deployment errors. However, if serviceStates is partially populated on error, the subsequent logic at lines 379-388 (checking for unbuilt services) depends on those states being correct. Ensure this is the intended behavior.
src/pkg/cli/tailAndMonitor.go (1)
65-66: Using context.Background() for tailCtx disconnects it from the parent context.
The tail context is derived from context.Background() rather than ctx, meaning parent cancellation (e.g., user pressing Ctrl+C) won't directly cancel tailing. Cancellation only happens via cancelTail(errMonitoringDone) when monitoring completes. This appears intentional to ensure logs continue until monitoring finishes, but verify this matches the expected UX when a user cancels mid-deployment.
src/pkg/cli/subscribe.go (2)
82-86: Good pattern: non-blocking send with context check.
The select with ctx.Done() prevents the goroutine from blocking forever if the main loop has exited due to context cancellation. This is the correct approach for channel sends in goroutines.
116-139: LGTM! Clean wrapper preserving the original API.
WaitServiceState provides a clean wrapper around WatchServiceState with a callback that implements the original target-state-based completion logic. The use of ErrDeploymentSucceeded as a sentinel to signal success is handled correctly by WatchServiceState (converted to nil return at line 108).
src/go.mod (1)
37-39: LGTM! Bubble Tea ecosystem dependencies for TUI.
The Charm Bracelet libraries (bubbles v0.21.0, bubbletea v1.3.10, lipgloss v1.1.0) are well-established for building terminal UIs in Go. These versions are current stable releases: lipgloss v1.1.0 is the latest available, while bubbles and bubbletea have stable v0.21.0 and v1.3.10 releases respectively (newer pre-release versions exist but are not yet stable).
40fd11a to e492b30
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/pkg/cli/subscribe.go (1)
39-42: Critical: Data race on serverStream variable between goroutines.
The cleanup goroutine (lines 39-42) reads and calls serverStream.Close(), while the receiver goroutine (line 64) reassigns serverStream during reconnection. This creates a data race that could result in:
- Closing the wrong stream (old vs. new)
- Double-close panics
- Accessing a closed stream
The receiver goroutine should manage its own stream lifecycle, or use synchronization (e.g., a mutex) to protect concurrent access.
🔒 Proposed fix using atomic value or local variable
Option 1: Keep stream local to receiver goroutine
-	serverStream, err := provider.Subscribe(ctx, &subscribeRequest)
-	if err != nil {
-		return nil, err
-	}
-
 	ctx, cancel := context.WithCancel(ctx)
 	defer cancel() // to ensure we close the stream and clean-up this context
-	go func() {
-		<-ctx.Done()
-		serverStream.Close()
-	}()
-
 	// ...
 	// Run stream receiving in a separate goroutine
 	go func() {
+		serverStream, err := provider.Subscribe(ctx, &subscribeRequest)
+		if err != nil {
+			errChan <- err
+			return
+		}
+		defer serverStream.Close()
+
 		for {
+			// Check context before blocking on Receive
+			select {
+			case <-ctx.Done():
+				return
+			default:
+			}
+
 			if !serverStream.Receive() {
 				// ... reconnection logic
 			}
 			// ...
 		}
 	}()
This eliminates the race by making the receiver goroutine own the stream lifecycle entirely.
Option 2: Use sync.Mutex to protect access
Add a mutex to protect concurrent access to serverStream if it must be shared.
Also applies to: 64-68
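Option 2 is only described in words, so here is one possible shape for it. This is a sketch under assumptions: `streamHolder`, `closer`, and `fakeStream` are hypothetical names, and the real stream type in subscribe.go may expose a different Close signature. The holder serializes "replace the stream on reconnect" and "close the stream on cancellation" so that each stream is closed exactly once.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is the only part of the stream this sketch needs (an assumption).
type closer interface{ Close() error }

// streamHolder guards the current stream so that the reconnecting receiver
// and the cleanup path never touch it concurrently.
type streamHolder struct {
	mu     sync.Mutex
	cur    closer
	closed bool
}

// swap installs next as the current stream and closes the previous one.
// After close() has run, it closes next immediately and returns false.
func (h *streamHolder) swap(next closer) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.closed {
		next.Close()
		return false
	}
	if h.cur != nil {
		h.cur.Close()
	}
	h.cur = next
	return true
}

// close shuts down the current stream once; further calls are no-ops.
func (h *streamHolder) close() {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.closed {
		return
	}
	h.closed = true
	if h.cur != nil {
		h.cur.Close()
		h.cur = nil
	}
}

// fakeStream counts Close calls so the behavior can be observed.
type fakeStream struct{ closes int }

func (f *fakeStream) Close() error { f.closes++; return nil }

func main() {
	h := &streamHolder{}
	first, second, late := &fakeStream{}, &fakeStream{}, &fakeStream{}
	h.swap(first)  // initial subscribe
	h.swap(second) // reconnect: first is closed
	h.close()      // context cancelled: second is closed
	fmt.Println(h.swap(late), first.closes, second.closes, late.closes)
}
```

Option 1 avoids shared state altogether and is usually simpler; a holder like this is mainly useful when another goroutine really must be able to close the stream.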
🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 42-46: The services map on deploymentModel is accessed
concurrently; add a sync.RWMutex field to deploymentModel and use it to guard
all accesses to services (use Lock/Unlock when adding/removing/modifying entries
in Update and the monitoring goroutine that calls p.Send, and RLock/RUnlock when
reading/iterating the map in View). Ensure you acquire the mutex before any map
read/write, and when iterating in View prefer to either hold RLock for the
duration of the read or copy the needed keys/values under lock into a local
slice then release the lock before doing heavier UI work.
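The locking scheme in this prompt can be sketched without Bubble Tea. The struct below reuses the name deploymentModel from the review, but its fields and the methods `setStatus`, `snapshot`, and `view` are hypothetical; the actual model in compose.go may look different. The sketch shows the "copy under RLock, render after unlocking" variant.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// deploymentModel is a simplified stand-in for the TUI model (assumed fields).
type deploymentModel struct {
	mu       sync.RWMutex
	services map[string]string // service name -> status
}

func newDeploymentModel() *deploymentModel {
	return &deploymentModel{services: make(map[string]string)}
}

// setStatus takes the write lock for the duration of the map update.
func (m *deploymentModel) setStatus(name, status string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.services[name] = status
}

// snapshot copies the map contents under the read lock and returns them,
// so callers can do slower work without holding the lock.
func (m *deploymentModel) snapshot() []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	lines := make([]string, 0, len(m.services))
	for name, status := range m.services {
		lines = append(lines, name+": "+status)
	}
	return lines
}

// view renders from the snapshot; sorting happens outside the lock.
func (m *deploymentModel) view() string {
	lines := m.snapshot()
	sort.Strings(lines)
	return strings.Join(lines, "\n")
}

func main() {
	m := newDeploymentModel()
	var wg sync.WaitGroup
	for _, name := range []string{"api", "db", "web"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			m.setStatus(name, "BUILD_RUNNING") // concurrent writers
		}(name)
	}
	wg.Wait()
	fmt.Println(m.view())
}
```

Whether the lock is needed at all depends on which goroutines actually touch the map in the real model, which is worth checking before adding it.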
- Around line 203-215: The function splitManagedAndUnmanagedServices is
duplicated; extract a single implementation into the cli package and have both
call sites use it: move the logic into the existing cli package as an exported
function (e.g., SplitManagedAndUnmanagedServices or keep the same name but
exported if needed), preserving the signature (accepting compose.Services and
returning ([]string, []string)), update the places that currently define their
own copy (the command package and the tail/monitor code) to call
cli.SplitManagedAndUnmanagedServices, and remove the duplicate implementations;
ensure imports and visibility are adjusted so compose.Services is referenced
correctly from the new shared function.
In @src/pkg/cli/subscribe.go:
- Around line 71-74: The receiver goroutine can exit without signaling when
serverStream.Receive() returns false and serverStream.Err() is nil; modify the
goroutine that calls serverStream.Receive() to push a sentinel value (e.g., a
predefined error variable like errStreamClosed) onto errChan before returning
whenever serverStream.Err() == nil, and ensure the main select/loop that reads
from errChan (the loop handling responses and errors) treats errStreamClosed as
a clean/finished condition (stop waiting and exit gracefully) instead of
blocking; reference the receiver that calls serverStream.Receive(), the check
serverStream.Err(), and the errChan/errStreamClosed symbols so the change is
applied in the right places.
In @src/pkg/cli/tailAndMonitor.go:
- Around line 19-21: Replace the panic on invalid deploymentID with a proper
error return: change the check that currently does if deploymentID == "" {
panic(...) } to return an error (e.g., return fmt.Errorf("invalid deploymentID:
empty string")) from the enclosing function, update the function signature to
return (error) if needed, and propagate that error to callers so they can handle
it instead of crashing the process; reference the deploymentID validation block
in tailAndMonitor.go and update any callers to handle the returned error.
- Around line 61-63: The TailAndMonitor function currently calls
panic("tailOptions.Deployment must be a valid deployment ID") when
tailOptions.Deployment is empty; change this to return a proper error instead:
validate tailOptions.Deployment at the start of TailAndMonitor, and if empty
return a descriptive error (e.g., fmt.Errorf or errors.New) rather than
panicking so callers can handle it; update the function signature if necessary
to return an error and propagate that error from callers of TailAndMonitor
(search for TailAndMonitor and tailOptions.Deployment to locate sites to
update).
🧹 Nitpick comments (5)
src/pkg/cli/client/errors.go (1)
8-8: Document the unconventional use of an error for success.
Using an error to signal successful deployment completion is unconventional and may confuse maintainers. Consider adding a comment explaining that this sentinel error is used to signal completion in callback-based monitoring flows, distinguishing it from actual failure errors.
📝 Suggested documentation
+// ErrDeploymentSucceeded is a sentinel error returned by watch callbacks
+// to signal successful deployment completion (not an actual error condition).
 var ErrDeploymentSucceeded = errors.New("deployment succeeded")
src/cmd/cli/command/compose.go (3)
80-80: Close the update channel to prevent goroutine leaks.
The updateCh is never closed, which could leave goroutines waiting on this channel forever. While the Bubble Tea program's Quit will stop the UI, it's good practice to close channels when done sending.
Consider closing the channel in the monitoring goroutine after all updates are sent:
go func() {
	defer wg.Done()
	serviceStates, monitorErr = cli.Monitor(ctx, project, provider, waitTimeout, deploymentID, func(msg *defangv1.SubscribeResponse, states *cli.ServiceStates) error {
		// Send service status updates to the bubbletea model
		for name, state := range *states {
			p.Send(serviceUpdate{
				name:   name,
				status: state.String(),
			})
		}
		return nil
	})
	close(model.updateCh) // Close channel when done
	// Quit the UI when monitoring is done
	p.Quit()
}()
Note: Verify that no other goroutines are sending to this channel before closing it.
174-179: Handle UI errors instead of silently ignoring them.
The error returned by p.Run() is currently ignored. While UI errors might be rare, they should at least be logged for debugging purposes.
 go func() {
 	defer wg.Done()
 	if _, err := p.Run(); err != nil {
-		// Handle UI errors if needed
+		term.Debug("Bubble Tea UI error:", err)
 	}
 }()
474-474: Minor: Grammar improvement.
The message changed from "was previously deployed" to "has already deployed". While both are understandable, the grammar could be improved.
✏️ Suggested grammar fix
-	term.Info("This project has already deployed to the following locations:")
+	term.Info("This project has already been deployed to the following locations:")
src/pkg/cli/subscribe.go (1)
54-88: Consider explicit context check in receiver loop for clarity.
The receiver goroutine doesn't explicitly check ctx.Done() before calling the blocking serverStream.Receive(). While the cleanup goroutine (lines 39-42) closes the stream on context cancellation, adding an explicit check would make the cancellation path more evident:
for {
	select {
	case <-ctx.Done():
		return
	default:
	}
	if !serverStream.Receive() {
		// ...
	}
	// ...
}
This is especially helpful given that the stream reassignment race (identified separately) means the cleanup goroutine might not reliably close the current stream.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
pkgs/defang/cli.nix
src/cmd/cli/command/compose.go
src/go.mod
src/pkg/cli/client/errors.go
src/pkg/cli/subscribe.go
src/pkg/cli/subscribe_test.go
src/pkg/cli/tailAndMonitor.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkgs/defang/cli.nix
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-07T03:07:48.228Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:448-450
Timestamp: 2026-01-07T03:07:48.228Z
Learning: In src/pkg/cli/client/byoc/gcp/byoc.go, the GetDeploymentStatus method intentionally does not pre-validate b.cdExecution before calling b.driver.GetBuildStatus. If b.cdExecution is empty, it represents an error state that will be surfaced by the GCP API as an "invalid operation name" error, which is the intended behavior.
Applied to files:
src/pkg/cli/client/errors.go
src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In the Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.
Applied to files:
src/pkg/cli/client/errors.go
src/pkg/cli/subscribe_test.go
src/pkg/cli/subscribe.go
src/cmd/cli/command/compose.go
src/pkg/cli/tailAndMonitor.go
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.
Applied to files:
src/pkg/cli/subscribe_test.go
🧬 Code graph analysis (2)
src/pkg/cli/subscribe.go (4)
src/pkg/cli/client/provider.go (1)
Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (19)
Provider (28-28), Provider (66-68), Provider (70-72), Provider (79-81), SubscribeResponse (4286-4295), SubscribeResponse (4308-4308), SubscribeResponse (4323-4325), Status (3332-3337), Status (3350-3350), Status (3365-3367), ServiceState (135-135), ServiceState (199-201), ServiceState (203-205), ServiceState (212-214), ServiceState_BUILD_FAILED (151-151), ServiceState_DEPLOYMENT_FAILED (150-150), Service (3919-3924), Service (3937-3937), Service (3952-3954)
src/pkg/types/etag.go (1)
ETag (9-9)
src/pkg/cli/client/errors.go (2)
ErrDeploymentSucceeded (8-8), ErrDeploymentFailed (10-13)
src/pkg/cli/tailAndMonitor.go (5)
src/pkg/cli/client/provider.go (1)
Provider (54-80)
src/pkg/cli/subscribe.go (2)
ServiceStates (15-15), WatchServiceState (17-114)
src/pkg/cli/waitForCdTaskExit.go (1)
WaitForCdTaskExit (14-41)
src/pkg/utils.go (1)
SleepWithContext (116-125)
src/pkg/cli/tail.go (1)
TailOptions (45-58)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Analyze (go)
- GitHub Check: go-test
🔇 Additional comments (7)
src/pkg/cli/subscribe_test.go (1)
226-226: LGTM! Clearer error message.
The updated message accurately describes the failure condition when the test expects an error but receives nil, improving test readability.
src/cmd/cli/command/compose.go (2)
80-80: Consider whether channel buffer size is sufficient.
The updateCh has a buffer of 100 messages. For deployments with many services or rapid state transitions, this buffer could fill up and cause the monitoring goroutine to block when sending updates via p.Send.
Can you estimate the maximum number of state updates expected during a typical deployment? Each service goes through multiple states (BUILD_QUEUED, BUILD_PROVISIONING, BUILD_RUNNING, etc.), so for N services, you might see 5-10 updates per service. With 100 services, that could exceed the buffer.
Consider either:
- Increasing the buffer size significantly (e.g., 1000)
- Using an unbuffered channel (relying on Bubble Tea's internal buffering)
- Making the channel buffering configurable based on the number of services
364-373: LGTM! Clean separation of tail vs. UI monitoring paths.
The conditional logic appropriately preserves the tail behavior when --tail is specified while introducing the new UI-based monitoring as the default. The error handling for both paths is consistent.
src/pkg/cli/tailAndMonitor.go (2)
18-57: LGTM! Well-structured monitoring function.
The Monitor function properly:
- Validates inputs (though should return error instead of panic)
- Applies timeout when specified
- Runs monitoring and CD task checking concurrently
- Aggregates errors from both sources
- Provides callback mechanism for state updates
The use of sync.WaitGroup and proper context cancellation ensures clean goroutine lifecycle management.
59-118: Verify the 2-second sleep is necessary.
Lines 54 and 79 include a 2-second sleep before cancellation. While the comment at line 79 explains it's to "make sure we get last status messages," this seems like a brittle approach that relies on timing rather than proper synchronization.
Consider whether there's a more reliable way to ensure all status messages are received before cancellation, such as:
- Waiting for a completion signal from the subscription
- Using a done channel to signal message processing is complete
- Flushing any buffered messages
The hardcoded 2-second delay may be insufficient under heavy load or excessive when messages arrive quickly.
src/go.mod (1)
37-39: Bubble Tea library versions are valid and secure.
The added dependencies (bubbles v0.21.0, bubbletea v1.3.10, lipgloss v1.1.0, termenv v0.16.0) are the latest stable releases as of January 2026 with no known published vulnerabilities or CVEs.
src/pkg/cli/subscribe.go (1)
77-80: The nil check after Receive() is necessary and justified.
The serverStream.Msg() call can legitimately return nil even after Receive() returns true, as evidenced by the test implementation in subscribe_test.go where MockSubscribeServerStreamForReconnectTest.Msg() explicitly returns nil. This defensive check is appropriate for robust stream handling. Consider adding a comment explaining this edge case for future maintainers.
src/cmd/cli/command/compose.go
Outdated
func splitManagedAndUnmanagedServices(serviceInfos compose.Services) ([]string, []string) {
	var managedServices []string
	var unmanagedServices []string
	for _, service := range serviceInfos {
		if cli.CanMonitorService(&service) {
			unmanagedServices = append(unmanagedServices, service.Name)
		} else {
			managedServices = append(managedServices, service.Name)
		}
	}

	return managedServices, unmanagedServices
}
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
rg -n "func splitManagedAndUnmanagedServices" --type=go
Repository: DefangLabs/defang
Length of output: 313
🏁 Script executed:
# Read the function from tailAndMonitor.go
echo "=== tailAndMonitor.go ===" && sed -n '134,146p' src/pkg/cli/tailAndMonitor.go
# Read the function from compose.go
echo "" && echo "=== compose.go ===" && sed -n '203,215p' src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 940
🏁 Script executed:
# Check package declarations
echo "=== tailAndMonitor.go package ===" && head -20 src/pkg/cli/tailAndMonitor.go | grep -E "^package"
echo "=== compose.go package ===" && head -20 src/cmd/cli/command/compose.go | grep -E "^package"
# Check imports to understand CanMonitorService references
echo "" && echo "=== tailAndMonitor.go imports ===" && sed -n '1,30p' src/pkg/cli/tailAndMonitor.go | grep -A 20 "^import"
echo "" && echo "=== compose.go imports ===" && sed -n '1,30p' src/cmd/cli/command/compose.go | grep -A 20 "^import"
Repository: DefangLabs/defang
Length of output: 1448
Extract duplicate function to shared location to avoid code duplication.
The splitManagedAndUnmanagedServices function is duplicated in two locations with identical logic:
- src/pkg/cli/tailAndMonitor.go (line 134, package cli)
- src/cmd/cli/command/compose.go (line 203, package command)
Since compose.go already imports the cli package, this function should be moved to src/pkg/cli to eliminate the duplication and follow the DRY principle.
🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 203 - 215, The function
splitManagedAndUnmanagedServices is duplicated; extract a single implementation
into the cli package and have both call sites use it: move the logic into the
existing cli package as an exported function (e.g.,
SplitManagedAndUnmanagedServices or keep the same name but exported if needed),
preserving the signature (accepting compose.Services and returning ([]string,
[]string)), update the places that currently define their own copy (the command
package and the tail/monitor code) to call cli.SplitManagedAndUnmanagedServices,
and remove the duplicate implementations; ensure imports and visibility are
adjusted so compose.Services is referenced correctly from the new shared
function.
	if deploymentID == "" {
		panic("deploymentID must be a valid deployment ID")
	}
Replace panic with error return for production code.
Using panic for input validation in production code can crash the entire application. This should return an error instead, allowing callers to handle the validation failure gracefully.
🛡️ Proposed fix
-func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
+func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
if deploymentID == "" {
- panic("deploymentID must be a valid deployment ID")
+ return nil, errors.New("deploymentID must be a valid deployment ID")
 }
📝 Committable suggestion
-	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
-	}
+	if deploymentID == "" {
+		return nil, errors.New("deploymentID must be a valid deployment ID")
+	}
🤖 Prompt for AI Agents
In @src/pkg/cli/tailAndMonitor.go around lines 19 - 21, Replace the panic on
invalid deploymentID with a proper error return: change the check that currently
does if deploymentID == "" { panic(...) } to return an error (e.g., return
fmt.Errorf("invalid deploymentID: empty string")) from the enclosing function,
update the function signature to return (error) if needed, and propagate that
error to callers so they can handle it instead of crashing the process;
reference the deploymentID validation block in tailAndMonitor.go and update any
callers to handle the returned error.
	if tailOptions.Deployment == "" {
		panic("tailOptions.Deployment must be a valid deployment ID")
	}
Replace panic with error return for production code.
Similar to the issue in the Monitor function, using panic for input validation in TailAndMonitor can crash the application. Return an error instead.
🛡️ Proposed fix
tailOptions.Follow = true
if tailOptions.Deployment == "" {
- panic("tailOptions.Deployment must be a valid deployment ID")
+ return nil, errors.New("tailOptions.Deployment must be a valid deployment ID")
}
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @src/pkg/cli/tailAndMonitor.go around lines 61 - 63, The TailAndMonitor
function currently calls panic("tailOptions.Deployment must be a valid
deployment ID") when tailOptions.Deployment is empty; change this to return a
proper error instead: validate tailOptions.Deployment at the start of
TailAndMonitor, and if empty return a descriptive error (e.g., fmt.Errorf or
errors.New) rather than panicking so callers can handle it; update the function
signature if necessary to return an error and propagate that error from callers
of TailAndMonitor (search for TailAndMonitor and tailOptions.Deployment to
locate sites to update).
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 42-46: Add a sync.RWMutex field to the deploymentModel struct to
protect concurrent access to the services map, then wrap all writes to the map
(in the Update method) with mutex.Lock() / mutex.Unlock() and wrap all reads of
the map (in the View method) with mutex.RLock() / mutex.RUnlock(); ensure you
reference the deploymentModel.services accesses in Update and View and hold the
lock only for the minimal time needed to modify or read the map to avoid
blocking other goroutines.
- Around line 378-389: The loop that collects unbuiltServices currently checks
serviceStates against defangv1.ServiceState_BUILD_STOPPING and thus misses
services in defangv1.ServiceState_BUILD_FAILED; update the condition in the for
loop that populates unbuiltServices (the variable serviceStates and the check
using defangv1.ServiceState_BUILD_STOPPING) to also include
defangv1.ServiceState_BUILD_FAILED (e.g., change the if to include state ==
defangv1.ServiceState_BUILD_FAILED or an equivalent check) so
tailOptions.Services and tailOptions.LogType = logs.LogTypeBuild correctly
include failed builds.
🧹 Nitpick comments (1)
src/cmd/cli/command/compose.go (1)
157-202: Consider logging UI errors for diagnostics.
The UI goroutine silently discards errors from p.Run() (line 177). While these errors may not be critical to deployment success, logging them could aid debugging.
🔍 Optional improvement to log UI errors
 // Start the bubbletea UI in a goroutine
 go func() {
 	defer wg.Done()
 	if _, err := p.Run(); err != nil {
-		// Handle UI errors if needed
+		term.Debugf("UI error: %v", err)
 	}
 }()
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/cmd/cli/command/compose.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In Defang's Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.
Applied to files:
src/cmd/cli/command/compose.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Analyze (go)
🔇 Additional comments (8)
src/cmd/cli/command/compose.go (8)
10-12: LGTM: Imports align with new UI requirements.
The added imports (sort, sync, Bubble Tea libraries) are appropriate for the terminal UI and concurrent monitoring features introduced in this PR.
Also applies to: 33-35
64-82: LGTM: Constructor properly initializes UI model.
The initialization logic correctly sets up the services map, spinners, and update channel. The buffer size of 100 for updateCh provides reasonable headroom for status updates.
84-90: LGTM: Standard Bubble Tea initialization pattern.
The Init method correctly batches spinner tick commands for all services.
92-114: LGTM: Correct Bubble Tea update logic.
The Update method properly handles keyboard input, service status updates, and spinner ticks. The existence check on line 100 prevents panics from unknown service names.
116-155: LGTM: Clean UI rendering with stable ordering.
The View method correctly sorts services for consistent display and uses appropriate visual indicators (✓, ✗, spinner) based on service state.
356-377: LGTM: Clean separation between tail and UI monitoring paths.
The branching logic correctly preserves the full log tailing behavior behind the --tail flag while defaulting to the new UI-based monitoring. The info message on line 372 helpfully guides users to access logs when needed.
475-475: LGTM: Text improvement enhances clarity.
The updated message is more concise and natural.
204-216: Extract duplicate function to avoid code duplication.
This function is duplicated in src/pkg/cli/tailAndMonitor.go (line 134). Move it to a shared location in the cli package to follow the DRY principle, as the command package already imports cli.
Likely an incorrect or invalid review comment.
src/cmd/cli/command/compose.go
Outdated
type deploymentModel struct {
	services map[string]*serviceState
	quitting bool
	updateCh chan serviceUpdate
}
Add mutex to protect concurrent map access.
The services map is accessed concurrently by the UI goroutine (in View at lines 124-151) and the monitoring goroutine (via p.Send at line 188). Without synchronization, this causes race conditions and potential panics.
Add a sync.RWMutex field to deploymentModel and protect all map accesses:
- Use Lock/Unlock when modifying the map in Update (lines 100-101)
- Use RLock/RUnlock when reading the map in View (lines 124-151)
🔒 Proposed fix to add mutex protection
type deploymentModel struct {
services map[string]*serviceState
+ mu sync.RWMutex
quitting bool
updateCh chan serviceUpdate
}
Then protect access in the Update method around line 100:
case serviceUpdate:
+ m.mu.Lock()
if svc, exists := m.services[msg.name]; exists {
svc.status = msg.status
}
+ m.mu.Unlock()
return m, nil
And in the View method around line 122:
func (m *deploymentModel) View() string {
if m.quitting {
return ""
}
+ m.mu.RLock()
+ defer m.mu.RUnlock()
+
var lines []string
// Sort services by name for consistent ordering
var serviceNames []string
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 42 - 46, Add a sync.RWMutex
field to the deploymentModel struct to protect concurrent access to the services
map, then wrap all writes to the map (in the Update method) with mutex.Lock() /
mutex.Unlock() and wrap all reads of the map (in the View method) with
mutex.RLock() / mutex.RUnlock(); ensure you reference the
deploymentModel.services accesses in Update and View and hold the lock only for
the minimal time needed to modify or read the map to avoid blocking other
goroutines.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 368-372: The loop that collects unbuiltServices currently checks
if state <= defangv1.ServiceState_BUILD_STOPPING which misses
defangv1.ServiceState_BUILD_FAILED; update the condition in the serviceStates
iteration (the block that appends to unbuiltServices) to also include
BUILD_FAILED (e.g., state <= defangv1.ServiceState_BUILD_STOPPING || state ==
defangv1.ServiceState_BUILD_FAILED) so services that failed during build are
captured for build log output.
🧹 Nitpick comments (1)
src/cmd/cli/command/compose.go (1)
177-182: Consider logging UI errors for debugging.
UI errors are silently ignored. While this may be intentional to not disrupt the user experience, consider logging them at debug level for troubleshooting.
 go func() {
 	defer wg.Done()
 	if _, err := p.Run(); err != nil {
-		// Handle UI errors if needed
+		term.Debugf("UI error: %v", err)
 	}
 }()
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- src/cmd/cli/command/compose.go
- src/pkg/cli/deploymentinfo.go
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/getServices_test.go
- src/pkg/cli/subscribe.go
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In Defang's Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.
Applied to files:
- src/pkg/cli/deploymentinfo.go
- src/pkg/cli/getServices_test.go
- src/pkg/cli/deploymentinfo_test.go
- src/cmd/cli/command/compose.go
- src/pkg/cli/subscribe.go
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.
Applied to files:
- src/pkg/cli/getServices_test.go
- src/pkg/cli/deploymentinfo_test.go
📚 Learning: 2026-01-07T03:07:48.228Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:448-450
Timestamp: 2026-01-07T03:07:48.228Z
Learning: In src/pkg/cli/client/byoc/gcp/byoc.go, the GetDeploymentStatus method intentionally does not pre-validate b.cdExecution before calling b.driver.GetBuildStatus. If b.cdExecution is empty, it represents an error state that will be surfaced by the GCP API as an "invalid operation name" error, which is the intended behavior.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/cmd/cli/command/compose.go
📚 Learning: 2026-01-07T17:31:18.174Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/clouds/gcp/cloudbuild.go:185-204
Timestamp: 2026-01-07T17:31:18.174Z
Learning: In src/pkg/clouds/gcp/cloudbuild.go, the GetBuildStatus method is intentionally designed as a non-blocking status check. It uses op.Poll(ctx) rather than op.Wait(ctx), and returns nil when the build is nil (operation still in progress), allowing the caller to control the polling loop. The method should not be changed to use Wait() as the waiting logic is handled by the caller.
Applied to files:
src/pkg/cli/subscribe.go
🧬 Code graph analysis (1)
src/pkg/cli/subscribe.go (3)
src/pkg/cli/client/provider.go (1)
- Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (20)
- Provider (28-28)
- Provider (66-68)
- Provider (70-72)
- Provider (79-81)
- SubscribeResponse (4286-4295)
- SubscribeResponse (4308-4308)
- SubscribeResponse (4323-4325)
- Status (3332-3337)
- Status (3350-3350)
- Status (3365-3367)
- ServiceState_NOT_SPECIFIED (138-138)
- ServiceState (135-135)
- ServiceState (199-201)
- ServiceState (203-205)
- ServiceState (212-214)
- ServiceState_BUILD_FAILED (151-151)
- ServiceState_DEPLOYMENT_FAILED (150-150)
- Service (3919-3924)
- Service (3937-3937)
- Service (3952-3954)
src/pkg/cli/client/errors.go (2)
- ErrDeploymentSucceeded (8-8)
- ErrDeploymentFailed (10-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: nix-shell-test
- GitHub Check: Analyze (go)
🔇 Additional comments (7)
src/pkg/cli/deploymentinfo.go (1)
61-61: LGTM - STATUS column removal aligns with PR objectives.
This change correctly removes the STATUS column from the service table output, keeping only the canonical "State" as specified in the PR objectives to improve signal-to-noise ratio.
src/cmd/cli/command/compose.go (3)
42-46: Race condition concern addressed by Bubble Tea's design.
The services map is accessed by Update and View methods which are called sequentially on Bubble Tea's main goroutine. The p.Send call from the monitoring goroutine is thread-safe as it uses an internal channel. No mutex is needed here.
64-155: Well-structured Bubble Tea UI implementation.
The model correctly handles:
- Initialization with spinner ticks for all services
- State updates via messages
- Consistent rendering with sorted service names
- Terminal state indicators (✓/✗) replacing spinners
404-406: Reasonable error handling for PrintServices.
Downgrading the error to a warning here is appropriate since the deployment itself succeeded and this is just a display issue.
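To illustrate the reasoning in the 42-46 note, here is a stdlib-only sketch of the general pattern: producers send messages over a channel, and a single loop goroutine is the only one that touches the map, so no mutex is required. This is not Bubble Tea's implementation; `runLoop`, `applyConcurrently`, and this local `serviceUpdate` type are hypothetical and exist only for the illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// serviceUpdate loosely mirrors the message shape discussed in the review;
// it is a local type for this sketch only.
type serviceUpdate struct {
	name   string
	status string
}

// runLoop is a hypothetical stand-in for an event loop. It is the only
// goroutine that reads or writes the services map, applying updates in the
// order they arrive until the channel is closed.
func runLoop(services map[string]string, updates <-chan serviceUpdate) {
	for msg := range updates {
		if _, exists := services[msg.name]; exists {
			services[msg.name] = msg.status
		}
	}
}

// applyConcurrently sends each update from its own goroutine and returns
// once the loop has drained them all.
func applyConcurrently(services map[string]string, msgs []serviceUpdate) {
	updates := make(chan serviceUpdate)
	done := make(chan struct{})
	go func() {
		defer close(done)
		runLoop(services, updates)
	}()

	var wg sync.WaitGroup
	for _, m := range msgs {
		wg.Add(1)
		go func(m serviceUpdate) {
			defer wg.Done()
			updates <- m // safe from any goroutine: only the channel is shared
		}(m)
	}
	wg.Wait()
	close(updates)
	<-done
}

func main() {
	services := map[string]string{"web": "BUILD_QUEUED", "db": "BUILD_QUEUED"}
	applyConcurrently(services, []serviceUpdate{
		{name: "web", status: "DEPLOYMENT_COMPLETED"},
		{name: "db", status: "DEPLOYMENT_FAILED"},
		{name: "unknown", status: "DEPLOYMENT_COMPLETED"}, // ignored: not in the map
	})
	fmt.Println(services["web"], services["db"], len(services))
}
```

Running a sketch like this under `go run -race` is one way to check that the map is never accessed from two goroutines at once.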
src/pkg/cli/subscribe.go (1)
118-141: Clean refactoring to callback-based approach.
The WaitServiceState wrapper correctly delegates to WatchServiceState with a callback that handles terminal states. Using ErrDeploymentSucceeded as a sentinel for successful completion is an idiomatic pattern.
src/pkg/cli/getServices_test.go (1)
89-91: Test expectations correctly updated.
The expected output aligns with the production change that removes the STATUS column from the short form table output while preserving status information in the long form YAML output.
src/pkg/cli/deploymentinfo_test.go (1)
40-41: Test expectations correctly aligned with STATUS column removal.
All three test cases have been updated to reflect the removal of the STATUS column from the table output, matching the production change in deploymentinfo.go.
Also applies to: 61-62, 80-81
for service, state := range serviceStates {
	if state <= defangv1.ServiceState_BUILD_STOPPING {
		unbuiltServices = append(unbuiltServices, service)
	}
}
Include BUILD_FAILED state when identifying services for build log output.
The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). Services that failed during the build phase should also be included to properly surface build logs on deployment failure.
🐛 Proposed fix
var unbuiltServices = make([]string, 0, len(project.Services))
for service, state := range serviceStates {
- if state <= defangv1.ServiceState_BUILD_STOPPING {
+ if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
unbuiltServices = append(unbuiltServices, service)
}
}
🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 368 - 372, The loop that
collects unbuiltServices currently checks if state <=
defangv1.ServiceState_BUILD_STOPPING which misses
defangv1.ServiceState_BUILD_FAILED; update the condition in the serviceStates
iteration (the block that appends to unbuiltServices) to also include
BUILD_FAILED (e.g., state <= defangv1.ServiceState_BUILD_STOPPING || state ==
defangv1.ServiceState_BUILD_FAILED) so services that failed during build are
captured for build log output.
Shouldn't we show any service with state != DEPLOYMENT_COMPLETED?
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find the ServiceState enum definition to understand all states and their values
rg -n -A 30 'enum ServiceState' --type=proto
Length of output: 29
🏁 Script executed:
#!/bin/bash
# Find the ServiceState enum definition
rg -n -A 30 'ServiceState.*=.*iota|const.*ServiceState_' --type=go
# Also search in proto files
fd -e proto -x cat {} | head -200
Length of output: 7507
🏁 Script executed:
#!/bin/bash
# Search for ServiceState enum in proto files
rg -n 'enum ServiceState' src/protos/ -A 30
# Also check the generated Go file for ServiceState constants
rg -n 'ServiceState_BUILD|ServiceState_DEPLOYMENT' src/protos/ | head -40
Length of output: 3191
@lionello Thanks for the question! I think the narrower condition is actually more appropriate here. Let me explain based on the ServiceState enum values:
State progression:
- Build phase: BUILD_QUEUED (1) → BUILD_PROVISIONING (2) → ... → BUILD_STOPPING (6)
- Build outcome: Either BUILD_FAILED (11) or moves to deployment phase
- Deployment phase: DEPLOYMENT_PENDING (8) → DEPLOYMENT_COMPLETED (9) or DEPLOYMENT_FAILED (10)
Why state != DEPLOYMENT_COMPLETED would be too broad:
The comment says "if any services failed to build, only show build logs for those services". The intent is to show BUILD logs specifically for services that failed during the BUILD phase.
Using state != DEPLOYMENT_COMPLETED would include:
- DEPLOYMENT_FAILED (10): These services successfully built but failed during deployment. Showing their build logs wouldn't help debug the deployment failure - we'd need Pulumi/deployment logs instead.
- DEPLOYMENT_PENDING (8): These might not have failed at all, just pending.
- UPDATE_QUEUED (7): Using existing image, no build occurred.
The correct condition should be:
Services that are in the build phase (BUILD_QUEUED through BUILD_STOPPING) OR explicitly failed during build (BUILD_FAILED):
if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
unbuiltServices = append(unbuiltServices, service)
}
This way, if a service reaches DEPLOYMENT_FAILED, we'll show all logs (including Pulumi logs), not just build logs, which is what you'd need to debug deployment issues.
Does that make sense?
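As a rough, self-contained check of the condition discussed above, the sketch below uses a local stand-in for the enum with only the numeric values quoted in this thread. `ServiceState` here is not the generated `defangv1.ServiceState` type, and `needsBuildLogs` is a hypothetical helper name.

```go
package main

import "fmt"

// ServiceState is a local stand-in for defangv1.ServiceState. Only the values
// quoted in this thread are listed; the real enum has more members.
type ServiceState int32

const (
	BuildQueued         ServiceState = 1
	BuildProvisioning   ServiceState = 2
	BuildStopping       ServiceState = 6
	UpdateQueued        ServiceState = 7
	DeploymentPending   ServiceState = 8
	DeploymentCompleted ServiceState = 9
	DeploymentFailed    ServiceState = 10
	BuildFailed         ServiceState = 11
)

// needsBuildLogs is a hypothetical helper expressing the proposed condition:
// still in the build phase, or explicitly failed during the build.
func needsBuildLogs(state ServiceState) bool {
	return state <= BuildStopping || state == BuildFailed
}

func main() {
	states := []ServiceState{BuildQueued, BuildStopping, UpdateQueued, DeploymentFailed, BuildFailed}
	for _, s := range states {
		fmt.Printf("state %d -> build logs: %v\n", s, needsBuildLogs(s))
	}
}
```

Because BUILD_FAILED (11) sorts after the deployment states, a plain `<=` comparison cannot capture it, which is why the extra equality check is needed.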
🧬 Code Graph Analysis Results
[src/cmd/cli/command/compose.go] (summary)
- Summary: The file prints the current service states after deployment via cli.PrintServices(project.Name, provider). The user question about showing services with state != DEPLOYMENT_COMPLETED relates to how states are presented and possibly filtered in the code path that monitors/tails deployment progress and displays statuses. The file also uses a UI path (MonitorWithUI) that can adjust service states (eg, marking DEPLOYMENT_COMPLETED) during monitoring. No separate filtering logic is present in this file snippet; any filtering would be in underlying PrintServices/monitoring logic.
[src/protos/io/defang/v1/fabric.pb.go] 2571-2594
- Snippet: ServiceInfo struct
- Content:
ServiceInfo struct {
	state protoimpl.MessageState `protogen:"open.v1"`
	Service *Service `protobuf:"bytes,1,opt,name=service,proto3" json:"service,omitempty"`
	Endpoints []string `protobuf:"bytes,2,rep,name=endpoints,proto3" json:"endpoints,omitempty"` // list of endpoints, one for each port
	Project string `protobuf:"bytes,3,opt,name=project,proto3" json:"project,omitempty"` // was: tenant; defaults to tenant ID
	Etag string `protobuf:"bytes,4,opt,name=etag,proto3" json:"etag,omitempty"`
	Status string `protobuf:"bytes,5,opt,name=status,proto3" json:"status,omitempty"`
	NatIps []string `protobuf:"bytes,6,rep,name=nat_ips,json=natIps,proto3" json:"nat_ips,omitempty"` // comma-separated list of NAT IPs
	LbIps []string `protobuf:"bytes,7,rep,name=lb_ips,json=lbIps,proto3" json:"lb_ips,omitempty"` // comma-separated list of internal CIDRs for the load-balancer
	PrivateFqdn string `protobuf:"bytes,8,opt,name=private_fqdn,json=privateFqdn,proto3" json:"private_fqdn,omitempty"` // fully qualified domain name (host)
	PublicFqdn string `protobuf:"bytes,9,opt,name=public_fqdn,json=publicFqdn,proto3" json:"public_fqdn,omitempty"` // fully qualified domain name (ingress)
	CreatedAt *timestamppb.Timestamp `protobuf:"bytes,10,opt,name=created_at,json=createdAt,proto3" json:"created_at,omitempty"`
	UpdatedAt *timestamppb.Timestamp `protobuf:"bytes,11,opt,name=updated_at,json=updatedAt,proto3" json:"updated_at,omitempty"`
	ZoneId string `protobuf:"bytes,12,opt,name=zone_id,json=zoneId,proto3" json:"zone_id,omitempty"` // zone ID for byod domain
	UseAcmeCert bool `protobuf:"varint,13,opt,name=use_acme_cert,json=useAcmeCert,proto3" json:"use_acme_cert,omitempty"` // If we should setup the facilities to use ACME(let's encrypt) certs
	State ServiceState `protobuf:"varint,15,opt,name=state,proto3,enum=io.defang.v1.ServiceState" json:"state,omitempty"` // enumerated status of the service
	Domainname string `protobuf:"bytes,16,opt,name=domainname,proto3" json:"domainname,omitempty"` // domain name for the service
	LbDnsName string `protobuf:"bytes,17,opt,name=lb_dns_name,json=lbDnsName,proto3" json:"lb_dns_name,omitempty"` // fully qualified domain name for the load-balancer
	AllowScaling bool `protobuf:"varint,18,opt,name=allow_scaling,json=allowScaling,proto3" json:"allow_scaling,omitempty"` // true if service is allowed to autoscale
	HealthcheckPath string `protobuf:"bytes,19,opt,name=healthcheck_path,json=healthcheckPath,proto3" json:"healthcheck_path,omitempty"`
	Type ResourceType `protobuf:"varint,21,opt,name=type,proto3,enum=io.defang.v1.ResourceType" json:"type,omitempty"`
	unknownFields protoimpl.UnknownFields
	sizeCache protoimpl.SizeCache
}
- Relevance: This shows that a ServiceInfo has a State field of type ServiceState, which is the source of the status values (including DEPLOYMENT_COMPLETED) that the user comment concerns.
[src/cmd/cli/command/composeUpTui.go] 142-201
- Snippet: MonitorWithUI usage of DEPLOYMENT_COMPLETED
- Content (abridged to key parts):
func MonitorWithUI(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string) (map[string]defangv1.ServiceState, error) {
	// ... setup ...
	go func() {
		// UI runs in a goroutine
		if _, err := p.Run(); err != nil { /* ... */ }
	}()

	// Start monitoring in a goroutine
	go func() {
		// Monitor and update UI with service state changes
		serviceStates, monitorErr = Monitor(ctx, project, provider, waitTimeout, deploymentID, func(msg *defangv1.SubscribeResponse, states *ServiceStates) error {
			for name, state := range *states {
				p.Send(serviceUpdate{name: name, status: state})
			}
			return nil
		})

		if monitorErr == nil {
			// When monitoring completes, mark any still-DEPLOYMENT_PENDING as DEPLOYMENT_COMPLETED
			for _, serviceName := range servicesNames {
				if serviceStates[serviceName] == defangv1.ServiceState_DEPLOYMENT_PENDING {
					p.Send(serviceUpdate{name: serviceName, status: defangv1.ServiceState_DEPLOYMENT_COMPLETED})
				}
			}
		}
		p.Quit()
	}()

	// wait for UI/monitor goroutines
	wg.Wait()

	return serviceStates, monitorErr
}
- Relevance: This shows explicit handling of the DEPLOYMENT_COMPLETED state in the UI; it demonstrates how DEPLOYMENT_PENDING can be transformed into DEPLOYMENT_COMPLETED, which is directly relevant to the idea of filtering or deciding which services to show based on completion state.
Notes and how they relate to user comment
- The user asks: "Shouldn't we show any service with state != DEPLOYMENT_COMPLETED?" The relevant code paths indicate:
- Service state is represented by ServiceState, with DEPLOYMENT_COMPLETED being a possible state (as used in MonitorWithUI).
- There is UI logic that explicitly updates any remaining DEPLOYMENT_PENDING to DEPLOYMENT_COMPLETED upon successful completion of monitoring.
- Print/monitoring functions (PrintServices, Tail/Monitor) operate on a collection of service states but the provided snippets do not show an explicit filter to exclude non-COMPLETED services from being displayed; rather, the UI path actively updates statuses and shows them, including potentially non-COMPLETED states until completion.
- If you want to enforce showing only services with state DEPLOYMENT_COMPLETED, you would likely implement a filter around the display paths (e.g., PrintObject/PrintServices) or in the monitor UI to skip services whose final state is not DEPLOYMENT_COMPLETED. The MonitorWithUI snippet demonstrates where such a logic point could be inserted (e.g., filter or hide entries with non-COMPLETED states, or add a flag to show only completed services).
🧠 Learnings used
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/clouds/gcp/cloudbuild.go:62-86
Timestamp: 2026-01-09T20:19:04.424Z
Learning: In src/pkg/clouds/gcp/cloudbuild.go, BuildTag.Parse should fail on unexpected tag formats (tags that don't have 3-4 underscore-separated parts or aren't DefangCDBuildTag) because build tags are strictly controlled and only created in two places: (1) running CD in cloudbuild by CLI, and (2) building images by CD. Unexpected tags indicate an error case.
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/stream.go:497-512
Timestamp: 2026-01-09T20:31:23.614Z
Learning: In src/pkg/cli/client/byoc/gcp/stream.go, the getReadyServicesCompletedResps helper function intentionally uses variable shadowing. The loop variable `status` from `readyServices` map represents individual service statuses, while the function parameter (to be renamed `cdStatus`) represents only the CD service (defangCD) completion status. Each ready service should retain its original status from the map.
c8b40fd to
454c06f
Compare
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/pkg/cli/subscribe.go (1)
39-42: Data race on serverStream during reconnection.
The cancellation goroutine (lines 39-42) captures serverStream by reference. When the receive goroutine reassigns serverStream at line 64 during reconnection, there's an unsynchronized concurrent read/write:
- Goroutine 1 may read serverStream to call Close()
- Goroutine 2 may be writing to serverStream during reconnection
Additionally, on reconnection the old stream is never explicitly closed, potentially leaking resources.
🔒 Proposed fix using a mutex or local variable
Option 1: Use a mutex to synchronize access:
+var streamMu sync.Mutex
 go func() {
 	<-ctx.Done()
+	streamMu.Lock()
 	serverStream.Close()
+	streamMu.Unlock()
 }()
 // In receive goroutine, when reconnecting:
+streamMu.Lock()
+serverStream.Close() // close old stream
 serverStream, err = provider.Subscribe(ctx, &subscribeRequest)
+streamMu.Unlock()
Option 2: Avoid sharing the variable by closing from within the receive goroutine only, and using context cancellation to signal exit.
Also applies to: 64-64
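One way Option 1 could look in isolation, as a stdlib-only sketch rather than the code in subscribe.go: the current stream lives behind a mutex, and swapping it also closes the old one. The `stream` and `guardedStream` types are hypothetical stand-ins for the real server stream.

```go
package main

import (
	"fmt"
	"sync"
)

// stream is a hypothetical stand-in for the server stream; only Close matters here.
type stream struct {
	id     int
	closed bool
}

func (s *stream) Close() { s.closed = true }

// guardedStream keeps the current stream behind a mutex so the cancellation
// path and the reconnect path never touch it at the same time.
type guardedStream struct {
	mu      sync.Mutex
	current *stream
}

// Swap closes the old stream (avoiding the leak) and installs the new one.
func (g *guardedStream) Swap(next *stream) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.current != nil {
		g.current.Close()
	}
	g.current = next
}

// Close closes whichever stream is current at the time of the call.
func (g *guardedStream) Close() {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.current != nil {
		g.current.Close()
	}
}

func main() {
	first, second := &stream{id: 1}, &stream{id: 2}
	g := &guardedStream{current: first}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); g.Swap(second) }() // reconnect path
	go func() { defer wg.Done(); g.Close() }()      // cancellation path
	wg.Wait()

	fmt.Println("first closed:", first.closed, "second closed:", second.closed)
}
```

In real code the reconnect path would presumably also check `ctx.Err()` before installing a new stream, so that a stream created after cancellation is not left open.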
♻️ Duplicate comments (5)
src/pkg/cli/tailAndMonitor.go (2)
18-21: Replace panic with error return for production code.
This issue was previously flagged. Using panic for input validation can crash the application. Return an error instead.
🛡️ Proposed fix
 func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
 	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
+		return nil, errors.New("deploymentID must be a valid deployment ID")
 	}
59-63: Replace panic with error return for production code.
This issue was previously flagged. Similar to the Monitor function, using panic for input validation should be replaced with returning an error.
🛡️ Proposed fix
 func TailAndMonitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, tailOptions TailOptions) (ServiceStates, error) {
 	tailOptions.Follow = true
 	if tailOptions.Deployment == "" {
-		panic("tailOptions.Deployment must be a valid deployment ID")
+		return nil, errors.New("tailOptions.Deployment must be a valid deployment ID")
 	}
src/cmd/cli/command/compose.go (2)
41-45: Add mutex to protect concurrent map access.
This issue was previously flagged. The services map is accessed concurrently by the UI goroutine (in View) and the monitoring goroutine (via p.Send). Without synchronization, this causes race conditions.
🔒 Proposed fix
 type deploymentModel struct {
 	services map[string]*serviceState
+	mu       sync.RWMutex
 	quitting bool
 	updateCh chan serviceUpdate
 }
Then protect access in Update (around line 99):
 case serviceUpdate:
+	m.mu.Lock()
 	if svc, exists := m.services[msg.name]; exists {
 		svc.status = msg.status
 	}
+	m.mu.Unlock()
And in View (around line 120):
 func (m *deploymentModel) View() string {
+	m.mu.RLock()
+	defer m.mu.RUnlock()
+
 	if m.quitting {
373-378: Include BUILD_FAILED state when identifying services for build log output.
This issue was previously flagged. The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). Services that failed during build should be included to surface build logs on deployment failure.
🐛 Proposed fix
 var unbuiltServices = make([]string, 0, len(project.Services))
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }
src/pkg/cli/subscribe.go (1)
71-74: Goroutine can exit without signaling, causing indefinite hang.
When serverStream.Receive() returns false and serverStream.Err() returns nil (clean stream close), the goroutine exits without sending anything to errChan. The main loop at line 90 blocks indefinitely waiting for a message that will never arrive.
🔒 Proposed fix
 if err := serverStream.Err(); err != nil {
 	errChan <- err
+} else {
+	errChan <- io.EOF // or a sentinel error to signal clean close
 }
 return
Then handle the sentinel appropriately in the main loop.
🧹 Nitpick comments (3)
src/pkg/cli/tailAndMonitor.go (1)
53-56: Consider making the 2-second delay configurable or documented.
The hardcoded 2-second sleep before returning appears to be a buffer for receiving final status messages. Consider either documenting this magic number with a comment explaining its purpose, or making it a named constant.
+const finalMessageGracePeriod = 2 * time.Second // Allow time for final status messages to arrive
+
 wg.Wait()
-pkg.SleepWithContext(ctx, 2*time.Second)
+pkg.SleepWithContext(ctx, finalMessageGracePeriod)
src/cmd/cli/command/compose.go (2)
131-140: Consider handling BUILD_FAILED state in the spinner/status display.
The View method shows "✓" for DEPLOYMENT_COMPLETED and "✗" for DEPLOYMENT_FAILED, but BUILD_FAILED would still show a spinning indicator. Consider adding explicit handling for BUILD_FAILED:
 switch svc.status {
 case "DEPLOYMENT_COMPLETED":
 	spinnerOrCheck = "✓"
-case "DEPLOYMENT_FAILED":
+case "DEPLOYMENT_FAILED", "BUILD_FAILED":
 	spinnerOrCheck = "✗"
 default:
 	spinnerOrCheck = svc.spinner.View()
 }
156-203: Consider handling UI run errors.
The UI goroutine silently ignores errors from p.Run(). While UI errors may not be critical, logging them at debug level would aid troubleshooting:
 go func() {
 	defer wg.Done()
 	if _, err := p.Run(); err != nil {
-		// Handle UI errors if needed
+		term.Debug("UI error:", err)
 	}
 }()
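The status-display nitpick above could be factored into a small pure function, sketched here under the assumption that statuses are compared as strings as in the quoted snippet. `statusIndicator` is a hypothetical helper, and the spinner frame is passed in by the caller instead of coming from a Bubble Tea spinner.

```go
package main

import "fmt"

// statusIndicator is a hypothetical helper: terminal states get a fixed glyph,
// and any other state falls back to the caller-supplied spinner frame.
func statusIndicator(status, spinnerFrame string) string {
	switch status {
	case "DEPLOYMENT_COMPLETED":
		return "✓"
	case "DEPLOYMENT_FAILED", "BUILD_FAILED":
		return "✗"
	default:
		return spinnerFrame
	}
}

func main() {
	for _, s := range []string{"BUILD_QUEUED", "BUILD_FAILED", "DEPLOYMENT_COMPLETED"} {
		fmt.Printf("%s %s\n", statusIndicator(s, "⠋"), s)
	}
}
```

Keeping the mapping in one function makes it straightforward to unit test without running the UI.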
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (10)
- pkgs/defang/cli.nix
- src/cmd/cli/command/compose.go
- src/go.mod
- src/pkg/cli/client/errors.go
- src/pkg/cli/deploymentinfo.go
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/getServices_test.go
- src/pkg/cli/subscribe.go
- src/pkg/cli/subscribe_test.go
- src/pkg/cli/tailAndMonitor.go
🚧 Files skipped from review as they are similar to previous changes (2)
- src/pkg/cli/getServices_test.go
- src/pkg/cli/client/errors.go
🧰 Additional context used
🧠 Learnings (9)
📚 Learning: 2026-01-07T03:07:56.002Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:448-450
Timestamp: 2026-01-07T03:07:56.002Z
Learning: In src/pkg/cli/client/byoc/gcp/byoc.go, the GetDeploymentStatus method intentionally does not pre-validate b.cdExecution before calling b.driver.GetBuildStatus. If b.cdExecution is empty, it represents an error state that will be surfaced by the GCP API as an "invalid operation name" error, which is the intended behavior.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/tailAndMonitor.go
- src/cmd/cli/command/compose.go
- src/pkg/cli/subscribe_test.go
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In Defang's Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/tailAndMonitor.go
- src/cmd/cli/command/compose.go
- src/pkg/cli/deploymentinfo.go
- src/pkg/cli/subscribe.go
- src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-09T20:12:21.986Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:30-30
Timestamp: 2026-01-09T20:12:21.986Z
Learning: In Go files, recognize and accept the import path go.yaml.in/yaml/v3 as the maintained fork of the YAML library. Do not flag this import as incorrect; this fork supersedes the archived gopkg.in/yaml.v3 path. If you encounter this or similar forked import paths, treat them as valid Go imports and do not raise review flags.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/tailAndMonitor.go
- src/cmd/cli/command/compose.go
- src/pkg/cli/deploymentinfo.go
- src/pkg/cli/subscribe.go
- src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-13T17:46:06.788Z
Learnt from: jordanstephens
Repo: DefangLabs/defang PR: 1754
File: src/pkg/agent/tools/provider.go:19-27
Timestamp: 2026-01-13T17:46:06.788Z
Learning: Go interfaces should be consumer-defined and small, defined in the package that uses them, rather than centralized, broad interfaces. Prefer interfaces that capture the specific methods required by a consumer, enabling implicit satisfaction and easier testing. Do not assume a single, global interface name across packages; it is acceptable (and sometimes intentional) for different packages to define interfaces with the same name but different method sets. Apply this guideline across all Go files in the repository.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/tailAndMonitor.go
- src/cmd/cli/command/compose.go
- src/pkg/cli/deploymentinfo.go
- src/pkg/cli/subscribe.go
- src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-09T20:19:04.424Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/clouds/gcp/cloudbuild.go:62-86
Timestamp: 2026-01-09T20:19:04.424Z
Learning: In src/pkg/clouds/gcp/cloudbuild.go, BuildTag.Parse should fail on unexpected tag formats (tags that don't have 3-4 underscore-separated parts or aren't DefangCDBuildTag) because build tags are strictly controlled and only created in two places: (1) running CD in cloudbuild by CLI, and (2) building images by CD. Unexpected tags indicate an error case.
Applied to files:
- src/pkg/cli/tailAndMonitor.go
- src/cmd/cli/command/compose.go
📚 Learning: 2025-12-31T13:47:20.413Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:20.413Z
Learning: In the Defang codebase, prefer allowing panics in mock objects within test files rather than adding defensive error handling, as the panic behavior helps catch issues during testing.
Applied to files:
src/pkg/cli/tailAndMonitor.go
📚 Learning: 2026-01-09T20:31:23.614Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/stream.go:497-512
Timestamp: 2026-01-09T20:31:23.614Z
Learning: In src/pkg/cli/client/byoc/gcp/stream.go, the getReadyServicesCompletedResps helper function intentionally uses variable shadowing. The loop variable `status` from `readyServices` map represents individual service statuses, while the function parameter (to be renamed `cdStatus`) represents only the CD service (defangCD) completion status. Each ready service should retain its original status from the map.
Applied to files:
- src/cmd/cli/command/compose.go
- src/pkg/cli/deploymentinfo.go
- src/pkg/cli/subscribe.go
📚 Learning: 2026-01-09T20:12:31.961Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:30-30
Timestamp: 2026-01-09T20:12:31.961Z
Learning: The import path `go.yaml.in/yaml/v3` is correct and refers to the YAML organization's fork (yaml/go-yaml) which is the maintained successor after the original gopkg.in/yaml.v3 (go-yaml/yaml) was archived in April 2025. Do not flag this import path as incorrect.
Applied to files:
src/go.mod
🧬 Code graph analysis (2)
src/pkg/cli/tailAndMonitor.go (5)
src/pkg/cli/client/provider.go (1)
Provider(64-90)
src/pkg/cli/subscribe.go (2)
ServiceStates(15-15), WatchServiceState(17-116)
src/pkg/cli/waitForCdTaskExit.go (1)
WaitForCdTaskExit(14-41)
src/pkg/utils.go (1)
SleepWithContext(120-129)
src/pkg/cli/tail.go (1)
TailOptions(45-58)
src/cmd/cli/command/compose.go (5)
src/pkg/cli/client/provider.go (1)
Provider(64-90)
src/protos/io/defang/v1/fabric.pb.go (18)
Provider(28-28), Provider(66-68), Provider(70-72), Provider(79-81), ServiceState(135-135), ServiceState(199-201), ServiceState(203-205), ServiceState(212-214), SubscribeResponse(4758-4767), SubscribeResponse(4780-4780), SubscribeResponse(4795-4797), Deployment(3322-3342), Deployment(3355-3355), Deployment(3370-3372), ServiceState_BUILD_STOPPING(145-145), Stack(683-696), Stack(709-709), Stack(724-726)
src/pkg/cli/tailAndMonitor.go (2)
Monitor(18-57), TailAndMonitor(59-118)
src/pkg/cli/tail.go (2)
TailOptions(45-58), Tail(118-153)
src/pkg/logs/log_type.go (3)
LogType(8-8), LogTypeAll(23-23), LogTypeBuild(21-21)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Analyze (go)
- GitHub Check: go-test
🔇 Additional comments (15)
src/pkg/cli/deploymentinfo.go (1)
61-61: LGTM!
The removal of the "Status" column from the table output aligns with the PR objective to improve signal-to-noise ratio during deployment. The Status field is appropriately retained in the Service struct and still populated by GetServiceStatesAndEndpoints, preserving the data for other potential consumers while simplifying the default display.
src/pkg/cli/deploymentinfo_test.go (1)
40-42: LGTM!
Test expectations correctly updated to reflect the removal of the STATUS column from the table output. The test input data appropriately retains the Status field values since the underlying struct still requires them, while the expected output no longer includes the column.
Also applies to: 61-63, 80-82
pkgs/defang/cli.nix (1)
10-10: LGTM!
The vendorHash update correctly reflects the new dependencies added in go.mod (charmbracelet/bubbles, bubbletea, lipgloss, and related transitive dependencies).
src/go.mod (2)
37-39: LGTM!
The Bubble Tea ecosystem dependencies (bubbles, bubbletea, lipgloss) are well-established libraries for building terminal UIs in Go. These additions support the new interactive deployment status UI described in the PR objectives.
56-56: LGTM!
The termenv upgrade to v0.16.0 is appropriate as it's a transitive dependency of the Bubble Tea ecosystem and ensures compatibility with the new UI components.
src/pkg/cli/subscribe_test.go (1)
222-234: LGTM!
The assertion message correction is appropriate. The previous message "Unexpected error: %v" with a nil error would be confusing; "Expected error but got nil" correctly describes the failure condition in error-expecting test cases.
src/pkg/cli/tailAndMonitor.go (1)
120-132: LGTM!
The CanMonitorService and splitManagedAndUnmanagedServices helper functions are well-structured. The logic correctly identifies services that can be monitored based on restart policy and compute service classification.
src/cmd/cli/command/compose.go (6)
57-61: LGTM!
The adaptive color styles for spinner, status, and service names provide good visual feedback and adapt to light/dark terminal themes appropriately.
63-81: LGTM!
The newDeploymentModel constructor correctly initializes spinners for each service with a reasonable buffer size (100) for the update channel.
83-113: LGTM!
The Bubble Tea Init and Update methods are correctly implemented:
- Init properly batches spinner tick commands for all services
- Update handles Ctrl+C for graceful exit, service updates, and spinner ticks
- The type switch pattern is idiomatic for Bubble Tea
349-367: LGTM!
The conditional monitoring path correctly:
- Uses TailAndMonitor for verbose or non-interactive modes (full log tailing)
- Uses monitorWithUI for interactive sessions (compact UI)
- Provides helpful hint about tailing logs with the deployment ID
410-412: LGTM!
The updated call to cli.PrintServices correctly uses session.Provider and handles errors appropriately with a warning rather than failing the command.
466-466: LGTM!
The updated message "This project has already deployed to the following locations:" is clearer than the previous wording.
src/pkg/cli/subscribe.go (2)
90-115: Main loop logic is correct, contingent on fixing goroutine signaling.
The select-based message handling and callback invocation are well-structured. The early filtering of unexpected service names (lines 99-102) is good defensive coding.
118-150: Wrapper and helper functions are well-implemented.
WaitServiceState cleanly wraps WatchServiceState with appropriate callback logic for detecting failures and successful completion. The allInState helper is straightforward and correct.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/pkg/cli/deploymentinfo.go (1)
68-76: Fix column header names to match struct fields.
The attrs slice contains headers that don't match the Service struct field names. The term.Table function uses FieldByName() for exact matching (see src/pkg/term/table.go line 57), so mismatches result in "N/A" being printed instead of actual values. Correct the following headers:
- "Deployment Status" → "Deployment"
- "State" → "DeploymentState"
src/pkg/cli/deploymentinfo_test.go (1)
86-105: Tests make real HTTP requests, causing potential flakiness.
These tests invoke PrintServiceStatesAndEndpoints, which calls GetHealthcheckResults. This function makes actual HTTP requests to endpoints via http.DefaultClient.Do() without mocking:
- Test "Service with Domainname" expects "healthy" for example.com (line 62), requiring a real HTTPS request to return a 2xx status.
- Test "endpoint without port" expects "unreachable" for service1 (line 81), which attempts a real network request.
- Network unavailability or changes in endpoint behavior will cause test failures.
Consider mocking http.DefaultClient using an injected HTTP client interface or by setting http.DefaultClient.Transport to a mock RoundTripper, as done in other tests in this directory (e.g., new_test.go).
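A minimal sketch of the RoundTripper approach suggested above, assuming a healthcheck that only inspects the response status code. The names roundTripFunc and stubClient are hypothetical and do not come from the repository; the existing mock in new_test.go may look different.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// roundTripFunc (hypothetical helper) lets a plain function act as an http.RoundTripper.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// stubClient (hypothetical helper) returns a client that answers every request
// with the given status code and an empty body, without touching the network.
func stubClient(status int) *http.Client {
	return &http.Client{Transport: roundTripFunc(func(r *http.Request) (*http.Response, error) {
		return &http.Response{
			StatusCode: status,
			Body:       io.NopCloser(strings.NewReader("")),
			Header:     make(http.Header),
			Request:    r,
		}, nil
	})}
}

func main() {
	resp, err := stubClient(http.StatusOK).Get("https://example.com/")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.StatusCode)
}
```

In a test, the same transport could be assigned to http.DefaultClient.Transport for the duration of the test and restored afterwards, or the client could be injected into the function under test if its signature allows it.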
🤖 Fix all issues with AI agents
In `@src/pkg/cli/getServices.go`:
- Around line 57-106: GetHealthcheckResults spawns goroutines that write
concurrently to the shared results map and also mutates serviceInfo.State,
causing data races; protect concurrent access by adding synchronization (e.g., a
sync.Mutex or replace results with a sync.Map) around writes to results inside
the goroutine and when reading/writing it elsewhere, and avoid mutating the
input ServiceInfo in-place from goroutines (e.g., send status back on a channel
and update serviceInfo.State serially after wg.Wait or make a copy of
serviceInfo per goroutine before mutating). Target symbols:
GetHealthcheckResults, results, serviceInfo.State; ensure all writes to results
and any ServiceInfo state changes are synchronized or moved out of concurrent
goroutines.
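One way to address the race described in this prompt, sketched under assumptions: the function checkAll and its check parameter are hypothetical stand-ins for GetHealthcheckResults and the real HTTP probe, whose actual signatures are not shown in this thread. Each goroutine does its slow work outside the lock and only holds the mutex while writing to the shared map.

```go
package main

import (
	"fmt"
	"sync"
)

// checkAll (hypothetical) runs check for every endpoint concurrently and
// collects the results. check stands in for the real HTTP healthcheck.
func checkAll(endpoints []string, check func(string) string) map[string]string {
	results := make(map[string]string, len(endpoints))
	var mu sync.Mutex // guards results
	var wg sync.WaitGroup
	for _, ep := range endpoints {
		wg.Add(1)
		go func(ep string) {
			defer wg.Done()
			status := check(ep) // slow work happens outside the lock
			mu.Lock()
			results[ep] = status
			mu.Unlock()
		}(ep)
	}
	wg.Wait() // after this point results is only read by the caller
	return results
}

func main() {
	got := checkAll([]string{"a.example", "b.example"}, func(string) string { return "healthy" })
	fmt.Println(len(got))
}
```

Any state that must be written back to the input (such as serviceInfo.State) could then be updated serially from the returned map after wg.Wait, rather than from inside the goroutines.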
🧹 Nitpick comments (1)
src/pkg/cli/deploymentinfo.go (1)
47-56: Consider nil check for serviceInfo.Service.
Line 49 accesses serviceInfo.Service.Name without checking if Service is nil. While this may not occur in practice, a defensive check would prevent potential panics if malformed data is received.
💡 Optional defensive check
 for _, serviceInfo := range serviceInfos {
+    if serviceInfo.Service == nil {
+        continue
+    }
     fqdn := serviceInfo.PublicFqdn
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/pkg/agent/tools/default_tool_cli.go
- src/pkg/cli/deploymentinfo.go
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/getServices.go
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.
Applied to files:
src/pkg/cli/deploymentinfo_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In Defang's Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/getServices.go
- src/pkg/agent/tools/default_tool_cli.go
- src/pkg/cli/deploymentinfo.go
📚 Learning: 2026-01-09T20:12:21.986Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:30-30
Timestamp: 2026-01-09T20:12:21.986Z
Learning: In Go files, recognize and accept the import path go.yaml.in/yaml/v3 as the maintained fork of the YAML library. Do not flag this import as incorrect; this fork supersedes the archived gopkg.in/yaml.v3 path. If you encounter this or similar forked import paths, treat them as valid Go imports and do not raise review flags.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/getServices.go
- src/pkg/agent/tools/default_tool_cli.go
- src/pkg/cli/deploymentinfo.go
📚 Learning: 2026-01-13T17:46:06.788Z
Learnt from: jordanstephens
Repo: DefangLabs/defang PR: 1754
File: src/pkg/agent/tools/provider.go:19-27
Timestamp: 2026-01-13T17:46:06.788Z
Learning: Go interfaces should be consumer-defined and small, defined in the package that uses them, rather than centralized, broad interfaces. Prefer interfaces that capture the specific methods required by a consumer, enabling implicit satisfaction and easier testing. Do not assume a single, global interface name across packages; it is acceptable (and sometimes intentional) for different packages to define interfaces with the same name but different method sets. Apply this guideline across all Go files in the repository.
Applied to files:
- src/pkg/cli/deploymentinfo_test.go
- src/pkg/cli/getServices.go
- src/pkg/agent/tools/default_tool_cli.go
- src/pkg/cli/deploymentinfo.go
📚 Learning: 2026-01-09T20:31:23.614Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/stream.go:497-512
Timestamp: 2026-01-09T20:31:23.614Z
Learning: In src/pkg/cli/client/byoc/gcp/stream.go, the getReadyServicesCompletedResps helper function intentionally uses variable shadowing. The loop variable `status` from `readyServices` map represents individual service statuses, while the function parameter (to be renamed `cdStatus`) represents only the CD service (defangCD) completion status. Each ready service should retain its original status from the map.
Applied to files:
- src/pkg/cli/getServices.go
- src/pkg/agent/tools/default_tool_cli.go
- src/pkg/cli/deploymentinfo.go
🧬 Code graph analysis (4)
src/pkg/cli/deploymentinfo_test.go (1)
src/pkg/cli/deploymentinfo.go (1)
PrintServiceStatesAndEndpoints(62-83)
src/pkg/cli/getServices.go (2)
src/pkg/cli/deploymentinfo.go (2)
PrintServiceStatesAndEndpoints(62-83), Service(10-18)
src/protos/io/defang/v1/fabric.pb.go (9)
ServiceInfo(2568-2591), ServiceInfo(2604-2604), ServiceInfo(2619-2621), Service(4391-4396), Service(4409-4409), Service(4424-4426), Status(3780-3785), Status(3798-3798), Status(3813-3815)
src/pkg/agent/tools/default_tool_cli.go (1)
src/pkg/cli/deploymentinfo.go (1)
GetServiceStatesAndEndpoints(20-60)
src/pkg/cli/deploymentinfo.go (2)
src/protos/io/defang/v1/fabric.pb.go (13)
Service(4391-4396), Service(4409-4409), Service(4424-4426), ServiceState(135-135), ServiceState(199-201), ServiceState(203-205), ServiceState(212-214), Status(3780-3785), Status(3798-3798), Status(3813-3815), ServiceInfo(2568-2591), ServiceInfo(2604-2604), ServiceInfo(2619-2621)
src/pkg/cli/getServices.go (1)
GetHealthcheckResults(57-107)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Analyze (go)
🔇 Additional comments (3)
src/pkg/agent/tools/default_tool_cli.go (1)
73-81: LGTM!
The change correctly threads the context to GetServiceStatesAndEndpoints, enabling proper timeout and cancellation handling for the healthcheck requests performed downstream.
src/pkg/cli/getServices.go (1)
44-55: LGTM!
PrintServices cleanly delegates to PrintServiceStatesAndEndpoints, properly threading the context for healthcheck processing.
src/pkg/cli/deploymentinfo.go (1)
10-18: LGTM on struct updates.
The Service struct is well-organized with the new DeploymentState and Healthcheck fields. The field ordering and types are appropriate.
The WatchServiceState function was blocking on serverStream.Receive() even when the context was cancelled by cancelSvcStatus. This change moves the stream receiving to a separate goroutine and uses channels with a select statement to properly handle context cancellation.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
The test was flaky in CI due to a race condition in the select statement. When a stream ended after sending the final message (e.g., BUILD_FAILED), both msgChan and errChan could have data ready simultaneously:
- msgChan had the failure state message
- errChan had nil (from stream ending without error)
The select statement could non-deterministically choose either channel, causing the test to sometimes return nil error instead of ErrDeploymentFailed.
Fixed by:
1. Not sending nil errors to errChan when stream ends normally
2. Fixing misleading error message in test (cosmetic)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.vscode/launch.json:
- Around line 14-20: The launch.json contains a developer-specific hardcoded
path in the "args" array and invalid JSON trailing commas; replace the absolute
path entry with a workspace-relative variable (e.g., use
${workspaceFolder}/samples/... in the "args" array where the hardcoded path
appears), remove trailing commas after the last array/object entries to produce
valid JSON, and either add .vscode/launch.json to .gitignore or move this config
to a workspace-specific settings file so other developers aren't forced to use
the same path.
In `@src/pkg/cli/subscribe.go`:
- Around line 55-88: When the receive loop sees serverStream.Receive() == false
and serverStream.Err() == nil (clean closure) it should explicitly notify the
main watcher instead of silently returning; update the goroutine that reads from
serverStream in WatchServiceState so that on clean closure you send a sentinel
(e.g., io.EOF or a dedicated sentinel error value) into errChan (using a
non-blocking send/select that respects ctx.Done()) rather than just returning,
and keep the existing behavior for non-nil errors from serverStream.Err();
reference serverStream.Receive(), serverStream.Err(), errChan, msgChan, and
caller functions WatchServiceState / Monitor / WaitForCdTaskExit /
cancelSvcStatus to locate and coordinate this change.
♻️ Duplicate comments (3)
src/cmd/cli/command/compose.go (1)
373-378: BUILD_FAILED state is still excluded from build log collection.
The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). This was flagged in a previous review and marked as addressed, but the fix doesn't appear in this code. Services that failed during build should be included to properly surface build logs on deployment failure.
🐛 Proposed fix
 var unbuiltServices = make([]string, 0, len(project.Services))
 for service, state := range serviceStates {
-    if state <= defangv1.ServiceState_BUILD_STOPPING {
+    if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
         unbuiltServices = append(unbuiltServices, service)
     }
 }
src/pkg/cli/tailAndMonitor.go (2)
18-21: Replace panic with error return for production code.
Using panic for input validation in production code can crash the entire application. Return an error instead to allow graceful handling.
🛡️ Proposed fix
 func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
     if deploymentID == "" {
-        panic("deploymentID must be a valid deployment ID")
+        return nil, errors.New("deploymentID must be a valid deployment ID")
     }
59-63: Replace panic with error return for production code.
Similar to the Monitor function, using panic here can crash the application. Return an error instead.
🛡️ Proposed fix
 func TailAndMonitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, tailOptions TailOptions) (ServiceStates, error) {
     tailOptions.Follow = true
     if tailOptions.Deployment == "" {
-        panic("tailOptions.Deployment must be a valid deployment ID")
+        return nil, errors.New("tailOptions.Deployment must be a valid deployment ID")
     }
🧹 Nitpick comments (3)
src/cmd/cli/command/compose.go (2)
71-74: Initial status string doesn't match actual ServiceState enum.
The hardcoded "DEPLOYMENT_QUEUED" doesn't correspond to any value in the ServiceState enum (which has BUILD_QUEUED, UPDATE_QUEUED, DEPLOYMENT_PENDING, etc.). Consider using defangv1.ServiceState_DEPLOYMENT_PENDING.String() or similar for consistency with the actual state values.
♻️ Proposed fix
 services[name] = &serviceState{
-    status: "DEPLOYMENT_QUEUED",
+    status: defangv1.ServiceState_DEPLOYMENT_PENDING.String(),
     spinner: s,
 }
176-181: UI errors are silently discarded.
The error from p.Run() is ignored. While terminal UI errors may not be critical, logging them would help debugging display issues.
♻️ Proposed fix
 go func() {
     defer wg.Done()
     if _, err := p.Run(); err != nil {
-        // Handle UI errors if needed
+        term.Debug("UI error:", err)
     }
 }()
src/pkg/cli/tailAndMonitor.go (1)
33-56: Consider removing redundant sleep or documenting its purpose.
Line 54 adds a 2-second sleep after wg.Wait(). Combined with the additional 2-second sleep at line 79 in TailAndMonitor, there's a total 4-second delay. If the intent is to capture final status messages, consider consolidating these sleeps or adding a comment explaining why both are necessary.
 wg.Wait()
-pkg.SleepWithContext(ctx, 2*time.Second)
+pkg.SleepWithContext(ctx, 2*time.Second) // Allow time for final status messages to arrive
 return serviceStates, errors.Join(cdErr, svcErr)
"args": [
    "-C",
    "/Users/jordan/wk/samples/samples/html-css-js",
    "deploy",
    "-s",
    "beta",
],
Developer-specific configuration should not be committed.
This launch configuration contains a hardcoded user-specific path (/Users/jordan/wk/samples/...) that won't work for other developers. Additionally, the static analysis correctly identifies invalid JSON syntax - trailing commas on lines 20 and 25 are not valid in standard JSON.
Consider:
- Adding .vscode/launch.json to .gitignore
- Or using workspace-relative variables like ${workspaceFolder}/samples
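As a quick check of the trailing-comma point above, Go's encoding/json can be used as a strict JSON validator. The helper name strictJSON is hypothetical and only wraps json.Valid; the sample documents are illustrative and are not the actual launch.json contents.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// strictJSON (hypothetical helper) reports whether s is valid standard JSON.
func strictJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	withTrailingComma := `{"args": ["-s", "beta",]}`
	withoutTrailingComma := `{"args": ["-s", "beta"]}`
	fmt.Println(strictJSON(withTrailingComma))    // false
	fmt.Println(strictJSON(withoutTrailingComma)) // true
}
```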
🐛 Proposed fix for JSON syntax
"args": [
"-C",
- "/Users/jordan/wk/samples/samples/html-css-js",
+ "${workspaceFolder}/samples/html-css-js",
"deploy",
"-s",
- "beta",
+ "beta"
],
"env": {
// "GCP_PROJECT_ID": "jordan-project-463223"
"DEFANG_PROVIDER": "aws",
- "AWS_REGION": "us-west-2",
+ "AWS_REGION": "us-west-2"
}
- },
+ }
]
}
🧰 Tools
🪛 Biome (2.1.2)
[error] 20-20: Expected an array, an object, or a literal but instead found ']'.
Expected an array, an object, or a literal here.
(parse)
🤖 Prompt for AI Agents
In @.vscode/launch.json around lines 14 - 20, The launch.json contains a
developer-specific hardcoded path in the "args" array and invalid JSON trailing
commas; replace the absolute path entry with a workspace-relative variable
(e.g., use ${workspaceFolder}/samples/... in the "args" array where the
hardcoded path appears), remove trailing commas after the last array/object
entries to produce valid JSON, and either add .vscode/launch.json to .gitignore
or move this config to a workspace-specific settings file so other developers
aren't forced to use the same path.
| go func() { | ||
| for { | ||
| if !serverStream.Receive() { | ||
| // Reconnect on Error: internal: stream error: stream ID 5; INTERNAL_ERROR; received from peer | ||
| if isTransientError(serverStream.Err()) { | ||
| if err := provider.DelayBeforeRetry(ctx); err != nil { | ||
| errChan <- err | ||
| return | ||
| } | ||
| serverStream, err = provider.Subscribe(ctx, &subscribeRequest) | ||
| if err != nil { | ||
| errChan <- err | ||
| return | ||
| } | ||
| continue | ||
| } | ||
| serverStream, err = provider.Subscribe(ctx, &subscribeRequest) | ||
| if err != nil { | ||
| return serviceStates, err | ||
| if err := serverStream.Err(); err != nil { | ||
| errChan <- err | ||
| } | ||
| return | ||
| } | ||
|
|
||
| msg := serverStream.Msg() | ||
| if msg == nil { | ||
| continue | ||
| } | ||
| return serviceStates, serverStream.Err() | ||
| } | ||
|
|
||
| msg := serverStream.Msg() | ||
| if msg == nil { | ||
| continue | ||
| select { | ||
| case msgChan <- msg: | ||
| case <-ctx.Done(): | ||
| return | ||
| } | ||
| } | ||
| }() |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check how WatchServiceState is called and context management
cd /tmp && git ls-files src/pkg/cli/ | head -20
Repository: DefangLabs/defang
Length of output: 129
🏁 Script executed:
#!/bin/bash
# Get the full context of subscribe.go
wc -l src/pkg/cli/subscribe.go
Repository: DefangLabs/defang
Length of output: 89
🏁 Script executed:
#!/bin/bash
# Read the subscribe.go file to understand the full context
cat -n src/pkg/cli/subscribe.go | head -100
Repository: DefangLabs/defang
Length of output: 3400
🏁 Script executed:
#!/bin/bash
# Search for WatchServiceState usage
rg -nA5 'WatchServiceState\(' --type=go
Repository: DefangLabs/defang
Length of output: 1338
🏁 Script executed:
#!/bin/bash
# Read tailAndMonitor.go around line 42 to see context management
cat -n src/pkg/cli/tailAndMonitor.go | head -80
Repository: DefangLabs/defang
Length of output: 3255
🏁 Script executed:
#!/bin/bash
# Search for other calls to WatchServiceState
rg -nB3 'WatchServiceState\(' --type=go
Repository: DefangLabs/defang
Length of output: 958
🏁 Script executed:
#!/bin/bash
# Check how the main loop in subscribe.go ends - see the rest of the function
cat -n src/pkg/cli/subscribe.go | sed -n '90,150p'
Repository: DefangLabs/defang
Length of output: 2249
🏁 Script executed:
#!/bin/bash
# Search for calls to WaitServiceState
rg -nB2 -A2 'WaitServiceState\(' --type=go
Repository: DefangLabs/defang
Length of output: 1809
🏁 Script executed:
#!/bin/bash
# Search more broadly for WatchServiceState calls to ensure we have all callers
rg -n 'WatchServiceState\(' --type=go
Repository: DefangLabs/defang
Length of output: 448
🏁 Script executed:
#!/bin/bash
# Check what WaitForCdTaskExit does to understand context cancellation timing
rg -nA10 'func WaitForCdTaskExit' --type=go
Repository: DefangLabs/defang
Length of output: 799
Stream goroutine exits silently on clean stream closure, depending on external context cancellation.
When serverStream.Receive() returns false and serverStream.Err() returns nil (clean closure), the goroutine exits without signaling errChan. The main loop then depends on the parent context (passed into WatchServiceState) being cancelled to unblock.
In Monitor, this works as designed: WaitForCdTaskExit polls for deployment completion and calls cancelSvcStatus() when done, which cancels the context passed to WatchServiceState. However, this creates an implicit dependency where the receive loop's termination relies on external context management rather than on the stream state itself.
For robustness, consider signaling the main loop on clean stream closure (e.g., sending nil or a sentinel to errChan) instead of relying on the caller to always provide guaranteed context cancellation.
🤖 Prompt for AI Agents
In `@src/pkg/cli/subscribe.go` around lines 55 - 88, When the receive loop sees
serverStream.Receive() == false and serverStream.Err() == nil (clean closure) it
should explicitly notify the main watcher instead of silently returning; update
the goroutine that reads from serverStream in WatchServiceState so that on clean
closure you send a sentinel (e.g., io.EOF or a dedicated sentinel error value)
into errChan (using a non-blocking send/select that respects ctx.Done()) rather
than just returning, and keep the existing behavior for non-nil errors from
serverStream.Err(); reference serverStream.Receive(), serverStream.Err(),
errChan, msgChan, and caller functions WatchServiceState / Monitor /
WaitForCdTaskExit / cancelSvcStatus to locate and coordinate this change.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/pkg/cli/getServices_test.go (1)
196-213: Test expectation does not match implementation.
This test expects Endpoint: "N/A" for an endpoint without a port, but ServiceEndpointsFromServiceInfo would produce "https://service1" for input "service1". There's no "N/A" handling in the implementation.
This discrepancy won't surface until the missing Test prefix is fixed. Please verify the intended behavior and align either the test expectations or the implementation.
🤖 Fix all issues with AI agents
In `@src/pkg/cli/getServices_test.go`:
- Line 150: The test function ServiceEndpointFromServiceInfo is missing the
required Test prefix so the Go test runner skips it; rename the function to
start with Test (e.g., TestServiceEndpointFromServiceInfo) and keep the
signature func TestServiceEndpointFromServiceInfo(t *testing.T) so the test
harness discovers and executes the endpoint scenarios in getServices_test.go.
🧹 Nitpick comments (1)
src/pkg/cli/getServices.go (1)
182-190: Consider simplifying the return signature.
This function always returns nil for the error. If error handling isn't planned, consider removing the error return to simplify the API. Otherwise, this is fine as-is for future extensibility.
♻️ Optional simplification
-func ServiceEndpointsFromServiceInfos(serviceInfos []*defangv1.ServiceInfo) ([]ServiceEndpoint, error) {
+func ServiceEndpointsFromServiceInfos(serviceInfos []*defangv1.ServiceInfo) []ServiceEndpoint {
 	var serviceTableItems []ServiceEndpoint
 	for _, serviceInfo := range serviceInfos {
 		serviceTableItems = append(serviceTableItems, ServiceEndpointsFromServiceInfo(serviceInfo)...)
 	}
-	return serviceTableItems, nil
+	return serviceTableItems
 }
This would require updating the caller at line 70 in GetServices.
Previously, partially completed deployments appeared like this:
```
 * Live tail logs with `defang tail --deployment=6k69je47hxiy`
⢿ [app] DEPLOYMENT_PENDING
✓ [embedding] DEPLOYMENT_COMPLETED
✓ [llm] DEPLOYMENT_COMPLETED
⢿ [postgres] DEPLOYMENT_QUEUED
⢿ [redis] DEPLOYMENT_QUEUED
✓ [worker] DEPLOYMENT_COMPLETED
```
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/pkg/cli/getServices.go`:
- Around line 152-168: ServiceEndpointsFromServiceInfo currently only prepends
"https://" when no port is present, leaving ported endpoints like "service1:80"
unschemed; update ServiceEndpointsFromServiceInfo to detect host:port endpoints
(e.g., via net.SplitHostPort result) and prepend "http://" for those cases so
Endpoint values become "http://service:80" (ensuring healthcheck logic that
checks for absence of "https://" works), and also rename the test function
ServiceEndpointFromServiceInfo to TestServiceEndpointFromServiceInfo so the test
actually runs and catches this mismatch.
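One way to read the prompt above is the sketch below. The helper name `endpointURL` is hypothetical, and the rule it encodes (host:port gets `http://`, a bare host gets `https://`) is taken from the review text rather than from the actual implementation.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// endpointURL is a hypothetical helper: it leaves endpoints that already carry
// a scheme untouched, prefixes host:port endpoints with http://, and prefixes
// bare hosts with https://.
func endpointURL(endpoint string) string {
	if strings.Contains(endpoint, "://") {
		return endpoint
	}
	// net.SplitHostPort returns an error when the address has no port.
	if _, _, err := net.SplitHostPort(endpoint); err == nil {
		return "http://" + endpoint
	}
	return "https://" + endpoint
}

func main() {
	for _, e := range []string{"service1", "service1:80", "https://example.com"} {
		fmt.Println(endpointURL(e))
	}
}
```

A table-driven test named with the `Test` prefix would then exercise each of these three cases.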
♻️ Duplicate comments (2)
src/cmd/cli/command/compose.go (1)
380-385: Include BUILD_FAILED state when collecting services for build logs.
The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). Services that failed during build won't have their build logs displayed when deployment fails.
🐛 Proposed fix
 var unbuiltServices = make([]string, 0, len(project.Services))
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }
src/pkg/cli/getServices_test.go (1)
150-150: Missing Test prefix: test function will not execute.
The function is named ServiceEndpointFromServiceInfo but Go's test framework requires functions to start with Test to be recognized and executed. This entire test function will be silently skipped.
Additionally, the test expectations don't match the current implementation (e.g., expecting "N/A" or "http://service1:80" for endpoints that the code produces differently), which would be caught once the test runs.
🐛 Proposed fix
-func ServiceEndpointFromServiceInfo(t *testing.T) {
+func TestServiceEndpointFromServiceInfo(t *testing.T) {
🧹 Nitpick comments (2)
src/cmd/cli/command/compose.go (1)
176-181: Consider logging UI errors for debugging.
UI errors are silently discarded. While rare, they could make debugging difficult if issues arise.
♻️ Optional: Log UI errors
 go func() {
 	defer wg.Done()
 	if _, err := p.Run(); err != nil {
-		// Handle UI errors if needed
+		term.Debugf("UI error: %v", err)
 	}
 }()
src/pkg/cli/getServices.go (1)
225-235: Input slice mutation may be unexpected by callers.
The function modifies serviceEndpoints in-place, clearing Service, Deployment, and State fields for duplicate rows. While current callers don't reuse the data, this side effect could cause subtle bugs if callers expect unchanged data.
Consider documenting this behavior or working on a copy if this becomes a concern.
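A minimal sketch of the "work on a copy" option, assuming a simplified `ServiceEndpoint` with only the four fields named in this thread. The function name `blankRepeated` is hypothetical and is not the PR's code.

```go
package main

import "fmt"

// ServiceEndpoint is a simplified stand-in for the PR's type.
type ServiceEndpoint struct {
	Service, Deployment, State, Endpoint string
}

// blankRepeated returns a copy of in where the Service, Deployment, and State
// columns are cleared on rows that repeat the previous row's service.
// The input slice is left untouched.
func blankRepeated(in []ServiceEndpoint) []ServiceEndpoint {
	out := make([]ServiceEndpoint, len(in))
	copy(out, in)
	for i := 1; i < len(out); i++ {
		// Compare against the original values, not the already-blanked copy.
		if in[i].Service == in[i-1].Service {
			out[i].Service, out[i].Deployment, out[i].State = "", "", ""
		}
	}
	return out
}

func main() {
	rows := []ServiceEndpoint{
		{"app", "d1", "DEPLOYMENT_COMPLETED", "https://app.example.com"},
		{"app", "d1", "DEPLOYMENT_COMPLETED", "http://app.internal:80"},
		{"db", "d1", "DEPLOYMENT_COMPLETED", "http://db.internal:5432"},
	}
	for _, r := range blankRepeated(rows) {
		fmt.Printf("%q %q %q %q\n", r.Service, r.Deployment, r.State, r.Endpoint)
	}
}
```

The copy costs one allocation per call, which is usually negligible for a table that is printed once.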
This reverts commit 7b53dbc.
when monitoring completes successfully
1d6019e to
07e4a61
Compare
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/cmd/cli/command/compose.go`:
- Around line 187-198: The code currently only routes to TailAndMonitor when
global.Verbose or global.NonInteractive are true; update the conditional to also
check the CLI --tail flag so interactive users can opt into live tailing. Modify
the if that decides between TailAndMonitor and MonitorWithUI to include the tail
flag (the parsed boolean flag for "--tail") alongside
global.Verbose/global.NonInteractive, and when true set tailOptions.Follow =
true and call cli.TailAndMonitor(ctx, project, session.Provider,
waitTimeoutDuration, tailOptions) (handling err as already done); otherwise call
cli.MonitorWithUI(ctx, project, session.Provider, waitTimeoutDuration,
deploy.Etag). Ensure you reference and use the existing symbols tailOptions,
TailAndMonitor, MonitorWithUI, global.Verbose, global.NonInteractive, and
deploy.Etag.
♻️ Duplicate comments (1)
src/cmd/cli/command/compose.go (1)
205-208: Include BUILD_FAILED when collecting unbuilt services for build logs.
The check state <= BUILD_STOPPING excludes BUILD_FAILED, so services that already failed during build won't be captured for build-log tailing.
🐛 Proposed fix
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }
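The effect of the proposed condition can be checked in isolation. The sketch below uses a local `ServiceState` type instead of the generated `defangv1` enum; only the two numeric values quoted in the review (6 and 11) come from this thread, and the other states in the example are made-up placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// ServiceState is a local stand-in for defangv1.ServiceState.
type ServiceState int32

const (
	BuildStopping ServiceState = 6  // value quoted in the review
	BuildFailed   ServiceState = 11 // value quoted in the review
)

// needsBuildLogs reports whether a service should have its build logs shown:
// either it never got past the build phase, or its build failed outright.
func needsBuildLogs(state ServiceState) bool {
	return state <= BuildStopping || state == BuildFailed
}

// unbuiltServices returns the sorted names of services that need build logs.
func unbuiltServices(states map[string]ServiceState) []string {
	out := make([]string, 0, len(states))
	for service, state := range states {
		if needsBuildLogs(state) {
			out = append(out, service)
		}
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}

func main() {
	// 3 and 9 are placeholder values for "still building" and "past the build phase".
	fmt.Println(unbuiltServices(map[string]ServiceState{"app": BuildFailed, "db": 3, "web": 9}))
}
```

Without the extra `== BuildFailed` clause, `app` would be missing from the result even though its build logs are the ones most worth showing.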
waitTimeoutDuration := time.Duration(waitTimeout) * time.Second
var serviceStates map[string]defangv1.ServiceState
if global.Verbose || global.NonInteractive {
	tailOptions.Follow = true
	serviceStates, err = cli.TailAndMonitor(ctx, project, session.Provider, waitTimeoutDuration, tailOptions)
	if err != nil {
		return err
	}
} else {
	term.Info("Live tail logs with `defang tail --deployment=" + deploy.Etag + "`")
	serviceStates, err = cli.MonitorWithUI(ctx, project, session.Provider, waitTimeoutDuration, deploy.Etag)
}
🧩 Analysis chain
🏁 Script executed:
rg -nP '\bGetBool\("tail"\)' --type=go
Repository: DefangLabs/defang
Length of output: 43
🏁 Script executed:
rg -nA5 -nB5 'tail.*Flag\|Flag.*tail' src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 483
🏁 Script executed:
grep -n "tail" src/cmd/cli/command/compose.go | head -20
Repository: DefangLabs/defang
Length of output: 1694
🏁 Script executed:
sed -n '170,260p' src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 3581
🏁 Script executed:
sed -n '1,50p' src/cmd/cli/command/compose.go | grep -A20 "func.*compose"
Repository: DefangLabs/defang
Length of output: 43
🏁 Script executed:
sed -n '130,180p' src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 1764
🏁 Script executed:
sed -n '100,140p' src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 1645
🏁 Script executed:
sed -n '50,100p' src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 1582
🏁 Script executed:
rg -n "composeUpCmd.*&cobra.Command" src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 96
🏁 Script executed:
sed -n '60,75p' src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 432
🏁 Script executed:
sed -n '60,68p' src/cmd/cli/command/compose.go
Repository: DefangLabs/defang
Length of output: 293
Enable the --tail flag to opt into live log tailing in interactive mode.
The --tail flag is currently defined as a no-op but never checked in the code, so defang compose up --tail behaves like the UI path regardless of the flag. Per the PR objective, this should enable live-tail behavior. The condition at lines 189–198 gates only on global.Verbose / global.NonInteractive, so add a check for the --tail flag and route to TailAndMonitor when it's set.
✅ Suggested update
 waitTimeoutDuration := time.Duration(waitTimeout) * time.Second
 var serviceStates map[string]defangv1.ServiceState
-if global.Verbose || global.NonInteractive {
+tail, _ := cmd.Flags().GetBool("tail")
+if global.Verbose || global.NonInteractive || tail {
 	tailOptions.Follow = true
 	serviceStates, err = cli.TailAndMonitor(ctx, project, session.Provider, waitTimeoutDuration, tailOptions)
 	if err != nil {
 		return err
 	}
 } else {
 	term.Info("Live tail logs with `defang tail --deployment=" + deploy.Etag + "`")
 	serviceStates, err = cli.MonitorWithUI(ctx, project, session.Provider, waitTimeoutDuration, deploy.Etag)
 }
🤖 Prompt for AI Agents
In `@src/cmd/cli/command/compose.go` around lines 187 - 198, The code currently
only routes to TailAndMonitor when global.Verbose or global.NonInteractive are
true; update the conditional to also check the CLI --tail flag so interactive
users can opt into live tailing. Modify the if that decides between
TailAndMonitor and MonitorWithUI to include the tail flag (the parsed boolean
flag for "--tail") alongside global.Verbose/global.NonInteractive, and when true
set tailOptions.Follow = true and call cli.TailAndMonitor(ctx, project,
session.Provider, waitTimeoutDuration, tailOptions) (handling err as already
done); otherwise call cli.MonitorWithUI(ctx, project, session.Provider,
waitTimeoutDuration, deploy.Etag). Ensure you reference and use the existing
symbols tailOptions, TailAndMonitor, MonitorWithUI, global.Verbose,
global.NonInteractive, and deploy.Etag.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
"permissions": {
	"allow": [
		"Bash(go test:*)",
		"Bash(make:*)",
this is potentially destructive, so perhaps we shouldn't include it here
		"Bash(make:*)",
		"Bash(golangci-lint run:*)",
		"Bash(go build:*)",
		"Bash(find:*)",
also potentially destructive because it has -delete and -exec
 version = "git";
 src = lib.cleanSource ../../src;
-vendorHash = "sha256-A+DwCvfNUKY8TjxyAe+abiT9xIyy5p7VIh5T5ofeZIg="; # TODO: use fetchFromGitHub
+vendorHash = "sha256-saGEuoB8Eeh/4SASPeyxW/xWArC7+oW88wUg6EJ39Fc="; # TODO: use fetchFromGitHub
The comment was false:
-vendorHash = "sha256-saGEuoB8Eeh/4SASPeyxW/xWArC7+oW88wUg6EJ39Fc="; # TODO: use fetchFromGitHub
+vendorHash = "sha256-saGEuoB8Eeh/4SASPeyxW/xWArC7+oW88wUg6EJ39Fc=";
		return err
	}
} else {
	term.Info("Live tail logs with `defang tail --deployment=" + deploy.Etag + "`")
-term.Info("Live tail logs with `defang tail --deployment=" + deploy.Etag + "`")
+term.Infof("See live tail logs with `defang tail %v`", tailOptions)
to ensure the command line works as-is.
	Deployment: deploy.Etag,
	LogType: logs.LogTypeAll,
	Since: since,
	Verbose: true,
-	Verbose: true,
+	Verbose: true,
+	Stack: session.Stack.Name,
	return strings.Compare(a.Endpoint, b.Endpoint)
})

// to reduce noise, print empty "Service", "Deployment", and "State" columns
I've gone through this path before and I didn't think it was worth it. It makes scripting impossible. Sure, --json and all that, but folks will script whatever they currently get. I wouldn't do this.
// Reconnect on Error: internal: stream error: stream ID 5; INTERNAL_ERROR; received from peer
if isTransientError(serverStream.Err()) {
	if err := provider.DelayBeforeRetry(ctx); err != nil {
		errChan <- err
Potentially blocking write. You might need a select here. I'm not 100%, because you do create the channel with buffer 1.
}
serverStream, err = provider.Subscribe(ctx, &subscribeRequest)
if err != nil {
	errChan <- err

if err != nil {
	return serviceStates, err
if err := serverStream.Err(); err != nil {
	errChan <- err
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/pkg/cli/getServices_test.go (2)
174-194: Test case expectations don't match function behavior: root cause of pipeline failure.
This test case has 2 items in Endpoints plus a Domainname, but only expects 1 result. Per ServiceEndpointsFromServiceInfo logic, the function produces one endpoint per Endpoints entry plus an additional endpoint for the Domainname. This should yield 3 endpoints, not 1.
This mismatch causes the panic at line 296 when the loop tries to access expectedServiceEndpoints[1] which doesn't exist.
🐛 Proposed fix
 {
 	name: "Service with Domainname",
 	serviceinfo: &defangv1.ServiceInfo{
 		Service: &defangv1.Service{
 			Name: "service1",
 		},
 		Status: "UNKNOWN",
 		Domainname: "example.com",
 		Endpoints: []string{
 			"example.com",
 			"service1.internal:80",
 		},
 	},
 	expectedServiceEndpoints: []ServiceEndpoint{
 		{
 			Service: "service1",
 			Status: "UNKNOWN",
 			Endpoint: "https://example.com",
 		},
+		{
+			Service: "service1",
+			Status: "UNKNOWN",
+			Endpoint: "http://service1.internal:80",
+		},
+		{
+			Service: "service1",
+			Status: "UNKNOWN",
+			Endpoint: "https://example.com",
+		},
 	},
 },
Alternatively, if the intent is to test filtering/deduplication logic, verify that ServiceEndpointsFromServiceInfo actually implements that behavior and adjust the test case inputs accordingly.
292-303: Add length assertion to prevent panic and provide clearer failure message.
The loop iterates over serviceEndpoints (actual) but indexes into expectedServiceEndpoints. When actual has more items than expected, this causes a panic instead of a clear test failure. Add a length check first.
♻️ Proposed fix
 for _, tt := range tests {
 	t.Run(tt.name, func(t *testing.T) {
 		serviceEndpoints := ServiceEndpointsFromServiceInfo(tt.serviceinfo)
+		require.Len(t, serviceEndpoints, len(tt.expectedServiceEndpoints), "endpoint count mismatch")
 		for i, endpoint := range serviceEndpoints {
 			assert.Equal(t, tt.expectedServiceEndpoints[i].Service, endpoint.Service)
 			assert.Equal(t, tt.expectedServiceEndpoints[i].Status, endpoint.Status)
 			assert.Equal(t, tt.expectedServiceEndpoints[i].Endpoint, endpoint.Endpoint)
 			assert.Equal(t, tt.expectedServiceEndpoints[i].AcmeCertUsed, endpoint.AcmeCertUsed)
 		}
 	})
 }
Description
This PR aims to improve the signal-to-noise ratio during deployment by refurbishing the CLI deployment "UI". Before this PR, the deployment "interface" was a live tail stream of logs from Pulumi, build runners, and services. This PR changes the default behaviour during defang up. Instead of printing logs, we print a list of compose services and live-update their service statuses. Logs are only printed when there is a failure, and we only print the logs associated with the failure. The existing interface can still be invoked by using defang compose up --tail.
TODO:
Here is an example of a successful crewai deployment:
Linked Issues
Checklist
Summary by CodeRabbit
New Features
Improvements
User-Facing Data/Display
✏️ Tip: You can customize this high-level summary in your review settings.