
Improve signal-to-noise ratio during deployment (TUI)#1750

Open
jordanstephens wants to merge 43 commits into main from jordan/deployment-status-updates

Conversation

@jordanstephens
Member

@jordanstephens jordanstephens commented Jan 6, 2026

Description

This PR aims to improve the signal-to-noise ratio during deployment by refurbishing the CLI deployment "UI". Before this PR, the deployment "interface" was a live tail stream of logs from Pulumi, build runners, and services. This PR changes the default behaviour during defang up. Instead of printing logs, we print a list of compose services and live-update their statuses. Logs are printed only when there is a failure, and then only the logs associated with that failure. The existing interface can still be invoked with defang compose up --tail.

TODO:

  • build states are not being tracked

Here is an example of a successful crewai deployment:

defang deploy -s jordan
 * Using the "jordan" stack on gcp from --stack flag and local stack file
 ! Defang cannot monitor status of the following managed service(s): [postgres redis].
   To check if the managed service is up, check the status of the service which depends on it.
 * Service environment variables resolution summary:

SERVICE    ENVIRONMENT             VALUE                                                                                SOURCE
app        CELERY_BROKER_URL       redis://redis:6379/0                                                                 Compose                
app        CELERY_RESULT_BACKEND   redis://redis:6379/0                                                                 Compose                
app        DATABASE_URL            postgres://postgres:${POSTGRES_PASSWORD}@postgres:5432/postgres?sslmode=${SSL_MODE}  Config (interpolated)  
app        DJANGO_SECRET_KEY       *****                                                                                Config                 
app        DJANGO_SETTINGS_MODULE  config.settings                                                                      Compose                
app        REDIS_URL               redis://redis:6379/0                                                                 Compose                
embedding  DEBUG                   true                                                                                 Compose                
embedding  OPENAI_API_KEY          defang                                                                               Compose                
llm        DEBUG                   true                                                                                 Compose                
llm        OPENAI_API_KEY          defang                                                                               Compose                
postgres   POSTGRES_PASSWORD       *****                                                                                Config                 
worker     DATABASE_URL            postgres://postgres:${POSTGRES_PASSWORD}@postgres:5432/postgres?sslmode=${SSL_MODE}  Config (interpolated)  
worker     DJANGO_SECRET_KEY       *****                                                                                Config                 
worker     DJANGO_SETTINGS_MODULE  config.settings                                                                      Compose                
worker     OPENAI_API_KEY          defang                                                                               Compose                
worker     REDIS_URL               redis://redis:6379/0                                                                 Compose                
 * Packaging the project files for app at /Users/jordan/wk/samples/samples/crewai/app
 * Uploading the project files for app
 * Setting up defang CD in GCP project jordan-project-463223, this could take a few minutes
 * Packaging the project files for worker at /Users/jordan/wk/samples/samples/crewai/app
 * Uploading the project files for worker

SERVICE    DEPLOYMENT    STATE                 HEALTHCHECK  ENDPOINT
app        0yj1840g9bmo  DEPLOYMENT_COMPLETED  healthy      https://app--8000.crewai-jordan.jordanstephens.defang.app                  
                                               healthy      https://defang-crewai-jordan-app-a03f73a-200650937057.us-central1.run.app  
embedding  0yj1840g9bmo  DEPLOYMENT_COMPLETED  -            embedding.google.internal:80                                               
llm        0yj1840g9bmo  DEPLOYMENT_COMPLETED  -            llm.google.internal:80                                                     
postgres   0yj1840g9bmo  DEPLOYMENT_COMPLETED  -            postgres.google.internal:5432                                              
redis      0yj1840g9bmo  DEPLOYMENT_COMPLETED  -            redis.google.internal:6379                                                 
 * Done.

Linked Issues

Checklist

  • I have performed a self-review of my code
  • I have added appropriate tests
  • I have updated the Defang CLI docs and/or README to reflect my changes, if necessary

Summary by CodeRabbit

  • New Features

    • Interactive terminal UI for real-time deployment monitoring with per-service spinners and clear success/failure indicators.
    • Interactive monitor that can display live deployment state in-session.
  • Improvements

    • Conditional behavior: interactive UI when appropriate, otherwise tails logs with smarter fallbacks.
    • More resilient streaming with automatic reconnects and focused logs for failing/unbuilt services.
    • Clarified post-deploy messaging.
  • User-Facing Data/Display

    • Service listing now shows endpoints and healthcheck path/status (FQDN removed).


@coderabbitai
Contributor

coderabbitai bot commented Jan 6, 2026

📝 Walkthrough

Walkthrough

Adds a Bubble Tea TUI for compose up, replaces polling with a streaming WatchServiceState+callback, introduces Monitor to coordinate monitoring and tailing, refactors service listing to ServiceEndpoint with healthcheck logic, updates agent CLI interfaces/mocks, and bumps a Nix vendor hash and Go deps.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Compose command & TUI: src/cmd/cli/command/compose.go, src/pkg/cli/composeUpTui.go, src/go.mod | Add Bubble Tea-based MonitorWithUI; wire interactive vs verbose/tail flows; add TUI deps in go.mod. |
| Streaming / watch API: src/pkg/cli/subscribe.go, src/pkg/cli/subscribe_test.go | Replace polling WaitServiceState with streaming WatchServiceState(ctx,..., cb) and callback-driven completion; keep wrapper WaitServiceState; update tests. |
| Monitor & tail orchestration: src/pkg/cli/tailAndMonitor.go, src/cmd/cli/command/compose.go | Add Monitor(...) delegating to WatchServiceState; run Monitor concurrently in TailAndMonitor; coordinate cancellation and aggregate monitor/tail errors; selective tailing on failures. |
| Service model & healthchecks: src/pkg/cli/getServices.go, src/pkg/cli/getServices_test.go | Replace ServiceLineItem with ServiceEndpoint; add HealthcheckPath/Healthcheck, remove Fqdn; add ServiceEndpointsFromServiceInfo(s); refactor healthcheck logic and printing/sorting. |
| Agent interfaces & tests: src/pkg/agent/tools/interfaces.go, src/pkg/agent/tools/default_tool_cli.go, src/pkg/agent/tools/services_test.go | Change CLIInterface.GetServices and implementations/tests to return []cli.ServiceEndpoint instead of []cli.ServiceLineItem; update mocks and tests. |
| Client sentinel error: src/pkg/cli/client/errors.go | Add exported sentinel ErrDeploymentSucceeded for success signaling in watch callbacks. |
| Tailing / error-path logging: src/pkg/cli/tailAndMonitor.go, src/cmd/cli/command/compose.go | Make tail behavior conditional (tail all vs selective service logs) when monitor reports failures; switch error-path stack/provider refs to use session stack. |
| Build/package/configs: pkgs/defang/cli.nix, .claude/settings.local.json, .vscode/launch.json | Update Nix vendorHash; add local Claude permissions file and VSCode Go launch configuration. |

Sequence Diagram(s)

sequenceDiagram
    participant User as User/CLI
    participant Compose as Compose Command
    participant UI as Bubble Tea UI
    participant Monitor as Monitor / WatchServiceState
    participant Provider as Backend Provider

    User->>Compose: defang compose up
    Compose->>Compose: choose mode (interactive vs verbose/tail)

    alt Interactive
        Compose->>UI: start MonitorWithUI
        Compose->>Monitor: Monitor(ctx, project, provider, timeout, deploymentID, callback)
        Monitor->>Provider: subscribe stream
        Provider-->>Monitor: SubscribeResponse messages
        Monitor->>UI: invoke callback / send updates
        UI->>User: render live service statuses
        Monitor->>Compose: return final states
        Compose->>UI: stop UI and optionally tail logs
    else Verbose / Non-interactive
        Compose->>Monitor: TailAndMonitor (monitor + tail)
        Monitor->>Provider: stream states and logs
        Provider-->>User: logs streamed to terminal
        Monitor->>Compose: return final states
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • edwardrf

Poem

🐇
I hop and stir the terminal light,
Spinners whirl through deploy‑night;
Quiet until a fault appears,
Then logs unfurl to calm our fears.
Hooray — the callback hums delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 3.45% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Improve signal-to-noise ratio during deployment (TUI)' clearly summarizes the main change: introducing a Text User Interface to reduce noise in deployment logs by showing only service states instead of full log streams. |
| Linked Issues check | ✅ Passed | The PR implements all core requirements from issue #1749: replaces live-tail log stream with live-updating service state list, displays proper service states (BUILD_, DEPLOYMENT_), handles failure-specific logging (build logs for build failures, Pulumi logs for deployment failures), and preserves full logs via --tail flag. |
| Out of Scope Changes check | ✅ Passed | Changes are scope-aligned: TUI monitoring (composeUpTui.go, tailAndMonitor.go, subscribe.go), service endpoint refactoring (getServices.go), CLI command flow updates (compose.go), and dependency additions (go.mod) all support the core objective. Configuration files (.claude, .vscode) are developer-facing and minimal. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@jordanstephens jordanstephens force-pushed the jordan/deployment-status-updates branch from d17d7ad to 2fa03a8 Compare January 6, 2026 21:04
@jordanstephens jordanstephens marked this pull request as ready for review January 7, 2026 01:15
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @src/pkg/cli/subscribe.go:
- Around line 71-74: The goroutine that reads from serverStream can exit without
sending anything to errChan when serverStream.Receive() returns false and
serverStream.Err() is nil, which causes the main select loop to block; modify
the reader goroutine (the one calling serverStream.Receive() and
serverStream.Err()) to always signal completion by sending a sentinel error
(e.g., io.EOF or a package-level errStreamClosed) to errChan or by closing
errChan before returning, and update the main select loop that waits on errChan
to treat that sentinel/closed channel as stream completion and exit or continue
appropriately (handle errStreamClosed/io.EOF in the select case that currently
expects errors from errChan).
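
A minimal sketch of the fix described above, assuming a hypothetical stream interface and a package-level errStreamClosed sentinel (neither is copied from subscribe.go). The reader goroutine sends exactly one value on errChan before it exits, so the select loop can always terminate, even when the stream ends without an error.

```go
package main

import (
	"errors"
	"fmt"
)

// errStreamClosed is a hypothetical sentinel meaning "the stream ended cleanly".
var errStreamClosed = errors.New("stream closed")

// stream is a hypothetical stand-in for the server stream type.
type stream interface {
	Receive() bool
	Msg() string
	Err() error
}

// fakeStream yields a fixed list of messages and then ends with err (possibly nil).
type fakeStream struct {
	msgs []string
	cur  string
	err  error
}

func (s *fakeStream) Receive() bool {
	if len(s.msgs) == 0 {
		return false
	}
	s.cur, s.msgs = s.msgs[0], s.msgs[1:]
	return true
}
func (s *fakeStream) Msg() string { return s.cur }
func (s *fakeStream) Err() error  { return s.err }

// drain reads the stream in a goroutine and returns the received messages.
// The reader always sends one value on errChan before exiting.
func drain(s stream) ([]string, error) {
	msgChan := make(chan string)
	errChan := make(chan error, 1) // buffered so the reader never blocks on exit

	go func() {
		for s.Receive() {
			msgChan <- s.Msg()
		}
		if err := s.Err(); err != nil {
			errChan <- err
			return
		}
		errChan <- errStreamClosed // clean end of stream
	}()

	var got []string
	for {
		select {
		case m := <-msgChan:
			got = append(got, m)
		case err := <-errChan:
			if errors.Is(err, errStreamClosed) {
				return got, nil // treat the sentinel as normal completion
			}
			return got, err
		}
	}
}

func main() {
	msgs, err := drain(&fakeStream{msgs: []string{"first", "second"}})
	fmt.Println(msgs, err)
}
```

Closing errChan instead of sending a sentinel would also work; the sentinel keeps "clean end" distinguishable from a real error in a single receive.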
🧹 Nitpick comments (4)
src/cmd/cli/command/compose.go (2)

174-179: Silently swallowing UI errors may hide issues.

The error from p.Run() is discarded. While UI errors may not be critical, at minimum consider logging them for debugging purposes.

🔎 Proposed fix
 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debug("UI error:", err)
 		}
 	}()

203-215: Duplicate function: splitManagedAndUnmanagedServices exists in tailAndMonitor.go.

This function is duplicated from src/pkg/cli/tailAndMonitor.go (lines 134-145). Consider importing it from the cli package instead of duplicating it here.

🔎 Proposed fix
-func splitManagedAndUnmanagedServices(serviceInfos compose.Services) ([]string, []string) {
-	var managedServices []string
-	var unmanagedServices []string
-	for _, service := range serviceInfos {
-		if cli.CanMonitorService(&service) {
-			unmanagedServices = append(unmanagedServices, service.Name)
-		} else {
-			managedServices = append(managedServices, service.Name)
-		}
-	}
-
-	return managedServices, unmanagedServices
-}

Then use the function from the cli package directly where needed, or export splitManagedAndUnmanagedServices from the cli package if it isn't already.

src/pkg/cli/tailAndMonitor.go (2)

19-21: Panic for invalid input may be too aggressive for a library function.

Panicking on empty deploymentID will crash the program. If this is called with user-provided or dynamically-determined input, consider returning an error instead. If this is strictly an internal function where the caller guarantees validity, the panic is acceptable as a programming error detector.

🔎 Proposed fix to return an error instead
 func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
 	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
+		return nil, errors.New("deploymentID must be a valid deployment ID")
 	}

53-56: Clarify the purpose of the 2-second sleep before returning.

The SleepWithContext call at line 54 delays the return by 2 seconds. There's also another 2-second sleep at line 79 in TailAndMonitor. If both are executed in the TailAndMonitor path, that's 4 seconds of delay. Consider adding a comment explaining why this delay is needed, or consolidate if both serve the same purpose.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 347a98f and 40fd11a.

⛔ Files ignored due to path filters (1)
  • src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
  • pkgs/defang/cli.nix
  • src/cmd/cli/command/compose.go
  • src/go.mod
  • src/pkg/cli/client/errors.go
  • src/pkg/cli/subscribe.go
  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/tailAndMonitor.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.

Applied to files:

  • src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In the Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/client/errors.go
  • src/pkg/cli/tailAndMonitor.go
  • src/pkg/cli/subscribe.go
  • src/cmd/cli/command/compose.go
🧬 Code graph analysis (2)
src/pkg/cli/tailAndMonitor.go (6)
src/pkg/cli/client/provider.go (1)
  • Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (10)
  • Provider (28-28)
  • Provider (66-68)
  • Provider (70-72)
  • Provider (79-81)
  • SubscribeResponse (4202-4211)
  • SubscribeResponse (4224-4224)
  • SubscribeResponse (4239-4241)
  • Deployment (2822-2838)
  • Deployment (2851-2851)
  • Deployment (2866-2868)
src/pkg/cli/subscribe.go (2)
  • ServiceStates (15-15)
  • WatchServiceState (17-114)
src/pkg/cli/waitForCdTaskExit.go (1)
  • WaitForCdTaskExit (14-41)
src/pkg/utils.go (1)
  • SleepWithContext (116-125)
src/pkg/cli/tail.go (1)
  • TailOptions (45-58)
src/pkg/cli/subscribe.go (3)
src/protos/io/defang/v1/fabric.pb.go (15)
  • SubscribeResponse (4202-4211)
  • SubscribeResponse (4224-4224)
  • SubscribeResponse (4239-4241)
  • Status (3248-3253)
  • Status (3266-3266)
  • Status (3281-3283)
  • ServiceState (135-135)
  • ServiceState (199-201)
  • ServiceState (203-205)
  • ServiceState (212-214)
  • ServiceState_BUILD_FAILED (151-151)
  • ServiceState_DEPLOYMENT_FAILED (150-150)
  • Service (3835-3840)
  • Service (3853-3853)
  • Service (3868-3870)
src/pkg/types/etag.go (1)
  • ETag (9-9)
src/pkg/cli/client/errors.go (2)
  • ErrDeploymentSucceeded (8-8)
  • ErrDeploymentFailed (10-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: nix-shell-test
  • GitHub Check: go-playground-test
  • GitHub Check: Analyze (go)
🔇 Additional comments (9)
src/pkg/cli/client/errors.go (1)

3-8: LGTM! Sentinel error for deployment success signaling.

Using ErrDeploymentSucceeded as a sentinel to break out of the streaming loop is a clean pattern. The callback in WatchServiceState correctly converts this to a nil return, so it won't propagate as an actual error to callers.
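
The sentinel pattern can be sketched as follows. This is an illustration only: watch, stopOnTerminalState, and the lower-case errDeploymentSucceeded are hypothetical stand-ins, states are plain strings rather than the protobuf enum, and the real WatchServiceState reads from a stream rather than a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// errDeploymentSucceeded is a hypothetical stand-in for client.ErrDeploymentSucceeded.
var errDeploymentSucceeded = errors.New("deployment succeeded")

// watch feeds each state to cb and stops at the first non-nil result.
// The success sentinel is translated to nil so callers never see it as a failure.
func watch(states []string, cb func(state string) error) error {
	for _, s := range states {
		if err := cb(s); err != nil {
			if errors.Is(err, errDeploymentSucceeded) {
				return nil
			}
			return err
		}
	}
	return errors.New("stream ended before the deployment finished")
}

// stopOnTerminalState is an example callback: it returns the sentinel on
// success, a real error on failure, and nil to keep watching.
func stopOnTerminalState(state string) error {
	switch state {
	case "DEPLOYMENT_COMPLETED":
		return errDeploymentSucceeded
	case "DEPLOYMENT_FAILED", "BUILD_FAILED":
		return fmt.Errorf("deployment failed in state %s", state)
	}
	return nil
}

func main() {
	fmt.Println(watch([]string{"IN_PROGRESS", "DEPLOYMENT_COMPLETED"}, stopOnTerminalState))
	fmt.Println(watch([]string{"IN_PROGRESS", "BUILD_FAILED"}, stopOnTerminalState))
}
```

Comparing with errors.Is rather than == keeps the check working if a callback wraps the sentinel with extra context.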

pkgs/defang/cli.nix (1)

10-10: LGTM! Vendor hash updated for new dependencies.

The hash change is expected given the new Bubble Tea ecosystem dependencies added to go.mod.

src/pkg/cli/subscribe_test.go (1)

225-227: LGTM! Corrected test assertion message.

The previous message "Unexpected error" was misleading since this branch handles the case where err == nil but an error was expected. "Expected error but got nil" accurately describes the failure condition.

src/cmd/cli/command/compose.go (2)

374-374: Redundant error check after monitorWithUI.

Line 374 checks err != nil. Because the tail branch returns early at line 368 (return err inside the if tail block), this check only applies to the monitorWithUI path, where err is assigned by serviceStates, err = monitorWithUI(...) at line 372. The logic is correct, but the control flow is hard to follow.


362-373: Verify behavior when monitorWithUI returns both states and error.

When monitorWithUI returns an error, the code continues to lines 374+ which handles deployment errors. However, if serviceStates is partially populated on error, the subsequent logic at lines 379-388 (checking for unbuilt services) depends on those states being correct. Ensure this is the intended behavior.

src/pkg/cli/tailAndMonitor.go (1)

65-66: Using context.Background() for tailCtx disconnects it from the parent context.

The tail context is derived from context.Background() rather than ctx, meaning parent cancellation (e.g., user pressing Ctrl+C) won't directly cancel tailing. Cancellation only happens via cancelTail(errMonitoringDone) when monitoring completes. This appears intentional to ensure logs continue until monitoring finishes, but verify this matches the expected UX when a user cancels mid-deployment.
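
To make the behavior in question concrete, here is a small sketch using only the standard context package. The errMonitoringDone value and the tailOutcome helper are hypothetical stand-ins for illustration, not code from tailAndMonitor.go.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errMonitoringDone is a hypothetical stand-in for the cancellation cause
// passed to cancelTail in the comment above.
var errMonitoringDone = errors.New("monitoring done")

// tailOutcome shows how a tail context rooted at context.Background() behaves:
// cancelling the parent does not cancel it; only cancelTail does.
func tailOutcome() (aliveAfterParentCancel bool, cause error) {
	parent, cancelParent := context.WithCancel(context.Background())
	tailCtx, cancelTail := context.WithCancelCause(context.Background())
	defer cancelTail(nil) // no-op if already cancelled; the first cause wins

	cancelParent() // e.g. the user presses Ctrl+C
	<-parent.Done()
	aliveAfterParentCancel = tailCtx.Err() == nil

	cancelTail(errMonitoringDone)
	return aliveAfterParentCancel, context.Cause(tailCtx)
}

func main() {
	alive, cause := tailOutcome()
	fmt.Println("tail alive after parent cancel:", alive)
	fmt.Println("tail cancel cause:", cause)
}
```

If the tail should keep the parent's values while still ignoring its cancellation, context.WithoutCancel (Go 1.21+) is an alternative to starting from context.Background().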

src/pkg/cli/subscribe.go (2)

82-86: Good pattern: non-blocking send with context check.

The select with ctx.Done() prevents the goroutine from blocking forever if the main loop has exited due to context cancellation. This is the correct approach for channel sends in goroutines.
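
A minimal sketch of the pattern, with hypothetical helper names (trySend, demo). Strictly speaking the send still blocks until the receiver is ready or the context is cancelled; what the select guarantees is that it cannot block forever once ctx is done.

```go
package main

import (
	"context"
	"fmt"
)

// trySend delivers v on ch unless ctx is cancelled first.
// It reports whether the value was delivered.
func trySend(ctx context.Context, ch chan<- string, v string) bool {
	select {
	case ch <- v:
		return true
	case <-ctx.Done():
		return false
	}
}

// demo sends once while the buffer has room, then once more after the
// context is cancelled and the buffer is full.
func demo() (first, second bool) {
	ctx, cancel := context.WithCancel(context.Background())
	ch := make(chan string, 1)

	first = trySend(ctx, ch, "update 1") // buffer has room, ctx still live
	cancel()
	second = trySend(ctx, ch, "update 2") // buffer full, ctx cancelled
	return first, second
}

func main() {
	first, second := demo()
	fmt.Println(first, second)
}
```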


116-139: LGTM! Clean wrapper preserving the original API.

WaitServiceState provides a clean wrapper around WatchServiceState with a callback that implements the original target-state-based completion logic. The use of ErrDeploymentSucceeded as a sentinel to signal success is handled correctly by WatchServiceState (converted to nil return at line 108).

src/go.mod (1)

37-39: LGTM! Bubble Tea ecosystem dependencies for TUI.

The Charm Bracelet libraries (bubbles v0.21.0, bubbletea v1.3.10, lipgloss v1.1.0) are well-established for building terminal UIs in Go. These versions are current stable releases: lipgloss v1.1.0 is the latest available, while bubbles and bubbletea have stable v0.21.0 and v1.3.10 releases respectively (newer pre-release versions exist but are not yet stable).

@jordanstephens jordanstephens force-pushed the jordan/deployment-status-updates branch from 40fd11a to e492b30 Compare January 8, 2026 00:23
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/pkg/cli/subscribe.go (1)

39-42: Critical: Data race on serverStream variable between goroutines.

The cleanup goroutine (lines 39-42) reads and calls serverStream.Close(), while the receiver goroutine (line 64) reassigns serverStream during reconnection. This creates a data race that could result in:

  • Closing the wrong stream (old vs. new)
  • Double-close panics
  • Accessing a closed stream

The receiver goroutine should manage its own stream lifecycle, or use synchronization (e.g., a mutex) to protect concurrent access.

🔒 Proposed fix using atomic value or local variable

Option 1: Keep stream local to receiver goroutine

-	serverStream, err := provider.Subscribe(ctx, &subscribeRequest)
-	if err != nil {
-		return nil, err
-	}
-
 	ctx, cancel := context.WithCancel(ctx)
 	defer cancel() // to ensure we close the stream and clean-up this context
 
-	go func() {
-		<-ctx.Done()
-		serverStream.Close()
-	}()
-
 	// ... 
 	
 	// Run stream receiving in a separate goroutine
 	go func() {
+		serverStream, err := provider.Subscribe(ctx, &subscribeRequest)
+		if err != nil {
+			errChan <- err
+			return
+		}
+		defer serverStream.Close()
+
 		for {
+			// Check context before blocking on Receive
+			select {
+			case <-ctx.Done():
+				return
+			default:
+			}
+
 			if !serverStream.Receive() {
 				// ... reconnection logic
 			}
 			// ...
 		}
 	}()

This eliminates the race by making the receiver goroutine own the stream lifecycle entirely.

Option 2: Use sync.Mutex to protect access

Add a mutex to protect concurrent access to serverStream if it must be shared.

Also applies to: 64-68
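
Option 2 could look roughly like the sketch below. The streamHolder type, the closer interface, and fakeStream are hypothetical names introduced for illustration; the real stream type and reconnection logic in subscribe.go may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is a hypothetical stand-in for the server stream type.
type closer interface{ Close() error }

// streamHolder guards the current stream so that a reconnecting goroutine and
// a cleanup goroutine never touch the field at the same time.
type streamHolder struct {
	mu     sync.Mutex
	cur    closer
	closed bool
}

// Swap installs a new stream after a reconnect and closes the previous one.
// If the holder was already closed, next is closed and false is returned.
func (h *streamHolder) Swap(next closer) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.closed {
		next.Close()
		return false
	}
	if h.cur != nil {
		h.cur.Close()
	}
	h.cur = next
	return true
}

// Close closes the current stream at most once.
func (h *streamHolder) Close() {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.closed {
		return
	}
	h.closed = true
	if h.cur != nil {
		h.cur.Close()
	}
}

// fakeStream counts how many times it was closed.
type fakeStream struct{ closes int }

func (f *fakeStream) Close() error { f.closes++; return nil }

func main() {
	a, b, c := &fakeStream{}, &fakeStream{}, &fakeStream{}
	h := &streamHolder{}
	h.Swap(a)
	h.Swap(b) // reconnect: a is closed, b becomes current
	h.Close() // cleanup: b is closed
	h.Close() // second call is a no-op
	accepted := h.Swap(c) // late reconnect is rejected and c is closed
	fmt.Println(a.closes, b.closes, c.closes, accepted)
}
```

Compared with Option 1, this keeps the stream shared but makes every close go through one place, which rules out both the double close and closing a stale stream.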

🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 42-46: The services map on deploymentModel is accessed
concurrently; add a sync.RWMutex field to deploymentModel and use it to guard
all accesses to services (use Lock/Unlock when adding/removing/modifying entries
in Update and the monitoring goroutine that calls p.Send, and RLock/RUnlock when
reading/iterating the map in View). Ensure you acquire the mutex before any map
read/write, and when iterating in View prefer to either hold RLock for the
duration of the read or copy the needed keys/values under lock into a local
slice then release the lock before doing heavier UI work.
- Around line 203-215: The function splitManagedAndUnmanagedServices is
duplicated; extract a single implementation into the cli package and have both
call sites use it: move the logic into the existing cli package as an exported
function (e.g., SplitManagedAndUnmanagedServices or keep the same name but
exported if needed), preserving the signature (accepting compose.Services and
returning ([]string, []string)), update the places that currently define their
own copy (the command package and the tail/monitor code) to call
cli.SplitManagedAndUnmanagedServices, and remove the duplicate implementations;
ensure imports and visibility are adjusted so compose.Services is referenced
correctly from the new shared function.

In @src/pkg/cli/subscribe.go:
- Around line 71-74: The receiver goroutine can exit without signaling when
serverStream.Receive() returns false and serverStream.Err() is nil; modify the
goroutine that calls serverStream.Receive() to push a sentinel value (e.g., a
predefined error variable like errStreamClosed) onto errChan before returning
whenever serverStream.Err() == nil, and ensure the main select/loop that reads
from errChan (the loop handling responses and errors) treats errStreamClosed as
a clean/finished condition (stop waiting and exit gracefully) instead of
blocking; reference the receiver that calls serverStream.Receive(), the check
serverStream.Err(), and the errChan/errStreamClosed symbols so the change is
applied in the right places.

In @src/pkg/cli/tailAndMonitor.go:
- Around line 19-21: Replace the panic on invalid deploymentID with a proper
error return: change the check that currently does if deploymentID == "" {
panic(...) } to return an error (e.g., return fmt.Errorf("invalid deploymentID:
empty string")) from the enclosing function, update the function signature to
return (error) if needed, and propagate that error to callers so they can handle
it instead of crashing the process; reference the deploymentID validation block
in tailAndMonitor.go and update any callers to handle the returned error.
- Around line 61-63: The TailAndMonitor function currently calls
panic("tailOptions.Deployment must be a valid deployment ID") when
tailOptions.Deployment is empty; change this to return a proper error instead:
validate tailOptions.Deployment at the start of TailAndMonitor, and if empty
return a descriptive error (e.g., fmt.Errorf or errors.New) rather than
panicking so callers can handle it; update the function signature if necessary
to return an error and propagate that error from callers of TailAndMonitor
(search for TailAndMonitor and tailOptions.Deployment to locate sites to
update).
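
The mutex guidance for deploymentModel.services (first item above) can be sketched as follows. serviceTable is a hypothetical stand-in for the model, and states are plain strings; the sketch only shows the locking discipline, including the "copy under lock, format afterwards" approach suggested for View.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// serviceTable is a hypothetical stand-in for the services map on deploymentModel.
type serviceTable struct {
	mu       sync.RWMutex
	services map[string]string // service name -> state
}

func newServiceTable() *serviceTable {
	return &serviceTable{services: map[string]string{}}
}

// Set records the latest state for a service (the writer path).
func (t *serviceTable) Set(name, state string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.services[name] = state
}

// View copies the rows under the read lock, then formats without holding it.
func (t *serviceTable) View() string {
	t.mu.RLock()
	rows := make([]string, 0, len(t.services))
	for name, state := range t.services {
		rows = append(rows, name+"  "+state)
	}
	t.mu.RUnlock()

	sort.Strings(rows) // map iteration order is random; sort for stable output
	return strings.Join(rows, "\n")
}

func main() {
	table := newServiceTable()
	var wg sync.WaitGroup
	for _, name := range []string{"app", "worker", "llm"} {
		wg.Add(1)
		go func(n string) { // concurrent writers, as from a monitoring goroutine
			defer wg.Done()
			table.Set(n, "DEPLOYMENT_COMPLETED")
		}(name)
	}
	wg.Wait()
	fmt.Println(table.View())
}
```

Whether the lock is needed depends on where the map is actually mutated. If it is only ever changed inside Update in response to messages delivered with p.Send, the Bubble Tea event loop may already serialize access, so it is worth confirming the race (for example with go test -race) before adding the mutex.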
🧹 Nitpick comments (5)
src/pkg/cli/client/errors.go (1)

8-8: Document the unconventional use of an error for success.

Using an error to signal successful deployment completion is unconventional and may confuse maintainers. Consider adding a comment explaining that this sentinel error is used to signal completion in callback-based monitoring flows, distinguishing it from actual failure errors.

📝 Suggested documentation
+// ErrDeploymentSucceeded is a sentinel error returned by watch callbacks
+// to signal successful deployment completion (not an actual error condition).
 var ErrDeploymentSucceeded = errors.New("deployment succeeded")
src/cmd/cli/command/compose.go (3)

80-80: Close the update channel to prevent goroutine leaks.

The updateCh is never closed, which could lead to goroutines waiting on this channel to never terminate properly. While the Bubble Tea program's Quit will stop the UI, it's good practice to close channels when done sending.

Consider closing the channel in the monitoring goroutine after all updates are sent:

go func() {
	defer wg.Done()
	serviceStates, monitorErr = cli.Monitor(ctx, project, provider, waitTimeout, deploymentID, func(msg *defangv1.SubscribeResponse, states *cli.ServiceStates) error {
		// Send service status updates to the bubbletea model
		for name, state := range *states {
			p.Send(serviceUpdate{
				name:   name,
				status: state.String(),
			})
		}
		return nil
	})
	close(model.updateCh) // Close channel when done
	// Quit the UI when monitoring is done
	p.Quit()
}()

Note: Verify that no other goroutines are sending to this channel before closing it.


174-179: Handle UI errors instead of silently ignoring them.

The error returned by p.Run() is currently ignored. While UI errors might be rare, they should at least be logged for debugging purposes.

 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debug("Bubble Tea UI error:", err)
 		}
 	}()

474-474: Minor: Grammar improvement.

The message changed from "was previously deployed" to "has already deployed". While both are understandable, the grammar could be improved.

✏️ Suggested grammar fix
-	term.Info("This project has already deployed to the following locations:")
+	term.Info("This project has already been deployed to the following locations:")
src/pkg/cli/subscribe.go (1)

54-88: Consider explicit context check in receiver loop for clarity.

The receiver goroutine doesn't explicitly check ctx.Done() before calling the blocking serverStream.Receive(). While the cleanup goroutine (lines 39-42) closes the stream on context cancellation, adding an explicit check would make the cancellation path more evident:

for {
    select {
    case <-ctx.Done():
        return
    default:
    }
    
    if !serverStream.Receive() {
        // ...
    }
    // ...
}

This is especially helpful given that the stream reassignment race (identified separately) means the cleanup goroutine might not reliably close the current stream.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40fd11a and e492b30.

⛔ Files ignored due to path filters (1)
  • src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
  • pkgs/defang/cli.nix
  • src/cmd/cli/command/compose.go
  • src/go.mod
  • src/pkg/cli/client/errors.go
  • src/pkg/cli/subscribe.go
  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/tailAndMonitor.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkgs/defang/cli.nix
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-07T03:07:48.228Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:448-450
Timestamp: 2026-01-07T03:07:48.228Z
Learning: In src/pkg/cli/client/byoc/gcp/byoc.go, the GetDeploymentStatus method intentionally does not pre-validate b.cdExecution before calling b.driver.GetBuildStatus. If b.cdExecution is empty, it represents an error state that will be surfaced by the GCP API as an "invalid operation name" error, which is the intended behavior.

Applied to files:

  • src/pkg/cli/client/errors.go
  • src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In the Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/pkg/cli/client/errors.go
  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/subscribe.go
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/tailAndMonitor.go
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.

Applied to files:

  • src/pkg/cli/subscribe_test.go
🧬 Code graph analysis (2)
src/pkg/cli/subscribe.go (4)
src/pkg/cli/client/provider.go (1)
  • Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (19)
  • Provider (28-28)
  • Provider (66-68)
  • Provider (70-72)
  • Provider (79-81)
  • SubscribeResponse (4286-4295)
  • SubscribeResponse (4308-4308)
  • SubscribeResponse (4323-4325)
  • Status (3332-3337)
  • Status (3350-3350)
  • Status (3365-3367)
  • ServiceState (135-135)
  • ServiceState (199-201)
  • ServiceState (203-205)
  • ServiceState (212-214)
  • ServiceState_BUILD_FAILED (151-151)
  • ServiceState_DEPLOYMENT_FAILED (150-150)
  • Service (3919-3924)
  • Service (3937-3937)
  • Service (3952-3954)
src/pkg/types/etag.go (1)
  • ETag (9-9)
src/pkg/cli/client/errors.go (2)
  • ErrDeploymentSucceeded (8-8)
  • ErrDeploymentFailed (10-13)
src/pkg/cli/tailAndMonitor.go (5)
src/pkg/cli/client/provider.go (1)
  • Provider (54-80)
src/pkg/cli/subscribe.go (2)
  • ServiceStates (15-15)
  • WatchServiceState (17-114)
src/pkg/cli/waitForCdTaskExit.go (1)
  • WaitForCdTaskExit (14-41)
src/pkg/utils.go (1)
  • SleepWithContext (116-125)
src/pkg/cli/tail.go (1)
  • TailOptions (45-58)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Analyze (go)
  • GitHub Check: go-test
🔇 Additional comments (7)
src/pkg/cli/subscribe_test.go (1)

226-226: LGTM! Clearer error message.

The updated message accurately describes the failure condition when the test expects an error but receives nil, improving test readability.

src/cmd/cli/command/compose.go (2)

80-80: Consider whether channel buffer size is sufficient.

The updateCh has a buffer of 100 messages. For deployments with many services or rapid state transitions, this buffer could fill up and cause the monitoring goroutine to block when sending updates via p.Send.

Can you estimate the maximum number of state updates expected during a typical deployment? Each service goes through multiple states (BUILD_QUEUED, BUILD_PROVISIONING, BUILD_RUNNING, etc.), so for N services, you might see 5-10 updates per service. With 100 services, that could exceed the buffer.

Consider either:

  1. Increasing the buffer size significantly (e.g., 1000)
  2. Using an unbuffered channel (relying on Bubble Tea's internal buffering)
  3. Making the channel buffering configurable based on the number of services
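The third option could look roughly like the sketch below. It assumes about ten state transitions per service (the upper estimate above) and keeps the current capacity of 100 as a floor; `updateBufferSize` and the `serviceUpdate` field types are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// serviceUpdate mirrors the shape of a status message; the field types are assumed.
type serviceUpdate struct {
	name   string
	status string
}

// updateBufferSize returns a channel capacity large enough to hold
// updatesPerService messages for every service, but never less than minSize.
func updateBufferSize(numServices, updatesPerService, minSize int) int {
	size := numServices * updatesPerService
	if size < minSize {
		return minSize
	}
	return size
}

func main() {
	services := []string{"app", "worker", "web"}
	// Assumption: roughly 10 state transitions per service.
	updateCh := make(chan serviceUpdate, updateBufferSize(len(services), 10, 100))
	fmt.Println(cap(updateCh)) // prints 100
}
```

With 100 services the same helper would yield a capacity of 1000, so small projects keep the current behaviour while large ones get proportionally more headroom.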

364-373: LGTM! Clean separation of tail vs. UI monitoring paths.

The conditional logic appropriately preserves the tail behavior when --tail is specified while introducing the new UI-based monitoring as the default. The error handling for both paths is consistent.

src/pkg/cli/tailAndMonitor.go (2)

18-57: LGTM! Well-structured monitoring function.

The Monitor function properly:

  • Validates inputs (though should return error instead of panic)
  • Applies timeout when specified
  • Runs monitoring and CD task checking concurrently
  • Aggregates errors from both sources
  • Provides callback mechanism for state updates

The use of sync.WaitGroup and proper context cancellation ensures clean goroutine lifecycle management.
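As a rough illustration of the structure described above, the sketch below runs two tasks under a shared, optionally time-bounded context and joins their errors. It is not the PR's `Monitor` implementation: `runConcurrently`, `demo`, and `errCD` are hypothetical names, and `errors.Join` (Go 1.20+) is only one possible way to aggregate the errors.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// errCD is a hypothetical error standing in for a failed CD task.
var errCD = errors.New("cd task failed")

// runConcurrently runs both tasks under a shared context, optionally bounded
// by timeout, and returns their errors joined together (nil if both succeed).
func runConcurrently(ctx context.Context, timeout time.Duration, watch, waitCD func(context.Context) error) error {
	if timeout > 0 {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, timeout)
		defer cancel()
	}

	var (
		wg              sync.WaitGroup
		watchErr, cdErr error
	)
	wg.Add(2)
	go func() {
		defer wg.Done()
		watchErr = watch(ctx)
	}()
	go func() {
		defer wg.Done()
		cdErr = waitCD(ctx)
	}()
	wg.Wait() // both goroutines have finished, so reading the errors is safe

	return errors.Join(watchErr, cdErr)
}

// demo simulates a successful watch and a failing CD task.
func demo() error {
	watch := func(ctx context.Context) error { return nil }
	waitCD := func(ctx context.Context) error { return errCD }
	return runConcurrently(context.Background(), time.Second, watch, waitCD)
}

func main() {
	err := demo()
	fmt.Println(errors.Is(err, errCD))
}
```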


59-118: Verify the 2-second sleep is necessary.

Lines 54 and 79 include a 2-second sleep before cancellation. While the comment at line 79 explains it's to "make sure we get last status messages," this seems like a brittle approach that relies on timing rather than proper synchronization.

Consider whether there's a more reliable way to ensure all status messages are received before cancellation, such as:

  1. Waiting for a completion signal from the subscription
  2. Using a done channel to signal message processing is complete
  3. Flushing any buffered messages

The hardcoded 2-second delay may be insufficient under heavy load or excessive when messages arrive quickly.
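A minimal sketch of the done-channel idea (option 2), assuming the producer of status messages can close its channel when it has nothing more to send. Nothing in the PR confirms that the subscription exposes such a signal; `drain` and `collect` are hypothetical helpers.

```go
package main

import "fmt"

// drain consumes every status message until the producer closes the channel,
// then closes done to signal that nothing is left to process.
func drain(statuses <-chan string, done chan<- struct{}, seen *[]string) {
	defer close(done)
	for s := range statuses {
		*seen = append(*seen, s)
	}
}

// collect sends msgs, closes the stream, and waits on done instead of
// sleeping for a fixed interval before returning what the consumer saw.
func collect(msgs []string) []string {
	statuses := make(chan string)
	done := make(chan struct{})
	var seen []string

	go drain(statuses, done, &seen)
	for _, m := range msgs {
		statuses <- m
	}
	close(statuses) // producer side: no more messages
	<-done          // consumer side: everything has been processed

	return seen
}

func main() {
	fmt.Println(collect([]string{"BUILD_RUNNING", "DEPLOYMENT_COMPLETED"}))
}
```

Because the caller blocks on `done` rather than on a timer, it waits exactly as long as the consumer needs, whether that is microseconds or several seconds under load.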

src/go.mod (1)

37-39: Bubble Tea library versions are valid and secure.

The added dependencies (bubbles v0.21.0, bubbletea v1.3.10, lipgloss v1.1.0, termenv v0.16.0) are the latest stable releases as of January 2026 with no known published vulnerabilities or CVEs.

src/pkg/cli/subscribe.go (1)

77-80: The nil check after Receive() is necessary and justified.

The serverStream.Msg() call can legitimately return nil even after Receive() returns true, as evidenced by the test implementation in subscribe_test.go where MockSubscribeServerStreamForReconnectTest.Msg() explicitly returns nil. This defensive check is appropriate for robust stream handling. Consider adding a comment explaining this edge case for future maintainers.

Comment on lines 203 to 215
func splitManagedAndUnmanagedServices(serviceInfos compose.Services) ([]string, []string) {
	var managedServices []string
	var unmanagedServices []string
	for _, service := range serviceInfos {
		if cli.CanMonitorService(&service) {
			unmanagedServices = append(unmanagedServices, service.Name)
		} else {
			managedServices = append(managedServices, service.Name)
		}
	}

	return managedServices, unmanagedServices
}
Contributor

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n "func splitManagedAndUnmanagedServices" --type=go

Repository: DefangLabs/defang

Length of output: 313


🏁 Script executed:

# Read the function from tailAndMonitor.go
echo "=== tailAndMonitor.go ===" && sed -n '134,146p' src/pkg/cli/tailAndMonitor.go

# Read the function from compose.go
echo "" && echo "=== compose.go ===" && sed -n '203,215p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 940


🏁 Script executed:

# Check package declarations
echo "=== tailAndMonitor.go package ===" && head -20 src/pkg/cli/tailAndMonitor.go | grep -E "^package"
echo "=== compose.go package ===" && head -20 src/cmd/cli/command/compose.go | grep -E "^package"

# Check imports to understand CanMonitorService references
echo "" && echo "=== tailAndMonitor.go imports ===" && sed -n '1,30p' src/pkg/cli/tailAndMonitor.go | grep -A 20 "^import"
echo "" && echo "=== compose.go imports ===" && sed -n '1,30p' src/cmd/cli/command/compose.go | grep -A 20 "^import"

Repository: DefangLabs/defang

Length of output: 1448


Extract duplicate function to shared location to avoid code duplication.

The splitManagedAndUnmanagedServices function is duplicated in two locations with identical logic:

  • src/pkg/cli/tailAndMonitor.go (line 134, package cli)
  • src/cmd/cli/command/compose.go (line 203, package command)

Since compose.go already imports the cli package, this function should be moved to src/pkg/cli to eliminate the duplication and follow the DRY principle.

🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 203 - 215, The function
splitManagedAndUnmanagedServices is duplicated; extract a single implementation
into the cli package and have both call sites use it: move the logic into the
existing cli package as an exported function (e.g.,
SplitManagedAndUnmanagedServices or keep the same name but exported if needed),
preserving the signature (accepting compose.Services and returning ([]string,
[]string)), update the places that currently define their own copy (the command
package and the tail/monitor code) to call cli.SplitManagedAndUnmanagedServices,
and remove the duplicate implementations; ensure imports and visibility are
adjusted so compose.Services is referenced correctly from the new shared
function.
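A self-contained sketch of what the shared helper might look like. `Service`, its `Managed` flag, and `canMonitor` are stand-ins for `compose.Services` entries and `cli.CanMonitorService`, whose real definitions are not shown here; the map-typed parameter and the sorting step are assumptions added to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// Service is a hypothetical stand-in for a compose service definition.
type Service struct {
	Name    string
	Managed bool // assumption: managed services cannot be monitored
}

// canMonitor is a placeholder for cli.CanMonitorService.
func canMonitor(s *Service) bool {
	return !s.Managed
}

// SplitManagedAndUnmanagedServices partitions service names by whether
// their status can be monitored. Both slices are sorted by name.
func SplitManagedAndUnmanagedServices(services map[string]Service) (managed, unmanaged []string) {
	for _, s := range services {
		if canMonitor(&s) {
			unmanaged = append(unmanaged, s.Name)
		} else {
			managed = append(managed, s.Name)
		}
	}
	sort.Strings(managed)
	sort.Strings(unmanaged)
	return managed, unmanaged
}

func main() {
	services := map[string]Service{
		"app":      {Name: "app"},
		"postgres": {Name: "postgres", Managed: true},
		"redis":    {Name: "redis", Managed: true},
	}
	managed, unmanaged := SplitManagedAndUnmanagedServices(services)
	fmt.Println(managed, unmanaged)
}
```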

Comment on lines +19 to 21
if deploymentID == "" {
	panic("deploymentID must be a valid deployment ID")
}
Contributor

⚠️ Potential issue | 🟠 Major

Replace panic with error return for production code.

Using panic for input validation in production code can crash the entire application. This should return an error instead, allowing callers to handle the validation failure gracefully.

🛡️ Proposed fix
 func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
 	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
+		return nil, errors.New("deploymentID must be a valid deployment ID")
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-if deploymentID == "" {
-	panic("deploymentID must be a valid deployment ID")
-}
+if deploymentID == "" {
+	return nil, errors.New("deploymentID must be a valid deployment ID")
+}
🤖 Prompt for AI Agents
In @src/pkg/cli/tailAndMonitor.go around lines 19 - 21, Replace the panic on
invalid deploymentID with a proper error return: change the check that currently
does if deploymentID == "" { panic(...) } to return an error (e.g., return
fmt.Errorf("invalid deploymentID: empty string")) from the enclosing function,
update the function signature to return (error) if needed, and propagate that
error to callers so they can handle it instead of crashing the process;
reference the deploymentID validation block in tailAndMonitor.go and update any
callers to handle the returned error.
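To illustrate why the error return is easier on callers, here is a small standalone sketch. `validateDeploymentID` and `errEmptyDeploymentID` are hypothetical names; the PR performs this check inline in `Monitor` rather than in a helper.

```go
package main

import (
	"errors"
	"fmt"
)

// errEmptyDeploymentID is a hypothetical error reusing the message from the review.
var errEmptyDeploymentID = errors.New("deploymentID must be a valid deployment ID")

// validateDeploymentID reports an error instead of panicking, so the caller
// decides how to surface the problem.
func validateDeploymentID(id string) error {
	if id == "" {
		return errEmptyDeploymentID
	}
	return nil
}

func main() {
	if err := validateDeploymentID(""); err != nil {
		// The process keeps running; the caller can print a message and exit cleanly.
		fmt.Println("error:", err)
	}
}
```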

Comment on lines +61 to +63
if tailOptions.Deployment == "" {
	panic("tailOptions.Deployment must be a valid deployment ID")
}
Contributor

⚠️ Potential issue | 🟠 Major

Replace panic with error return for production code.

Similar to the issue in the Monitor function, using panic for input validation in TailAndMonitor can crash the application. Return an error instead.

🛡️ Proposed fix
 	tailOptions.Follow = true
 	if tailOptions.Deployment == "" {
-		panic("tailOptions.Deployment must be a valid deployment ID")
+		return nil, errors.New("tailOptions.Deployment must be a valid deployment ID")
 	}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @src/pkg/cli/tailAndMonitor.go around lines 61 - 63, The TailAndMonitor
function currently calls panic("tailOptions.Deployment must be a valid
deployment ID") when tailOptions.Deployment is empty; change this to return a
proper error instead: validate tailOptions.Deployment at the start of
TailAndMonitor, and if empty return a descriptive error (e.g., fmt.Errorf or
errors.New) rather than panicking so callers can handle it; update the function
signature if necessary to return an error and propagate that error from callers
of TailAndMonitor (search for TailAndMonitor and tailOptions.Deployment to
locate sites to update).

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 42-46: Add a sync.RWMutex field to the deploymentModel struct to
protect concurrent access to the services map, then wrap all writes to the map
(in the Update method) with mutex.Lock() / mutex.Unlock() and wrap all reads of
the map (in the View method) with mutex.RLock() / mutex.RUnlock(); ensure you
reference the deploymentModel.services accesses in Update and View and hold the
lock only for the minimal time needed to modify or read the map to avoid
blocking other goroutines.
- Around line 378-389: The loop that collects unbuiltServices currently checks
serviceStates against defangv1.ServiceState_BUILD_STOPPING and thus misses
services in defangv1.ServiceState_BUILD_FAILED; update the condition in the for
loop that populates unbuiltServices (the variable serviceStates and the check
using defangv1.ServiceState_BUILD_STOPPING) to also include
defangv1.ServiceState_BUILD_FAILED (e.g., change the if to include state ==
defangv1.ServiceState_BUILD_FAILED or an equivalent check) so
tailOptions.Services and tailOptions.LogType = logs.LogTypeBuild correctly
include failed builds.
🧹 Nitpick comments (1)
src/cmd/cli/command/compose.go (1)

157-202: Consider logging UI errors for diagnostics.

The UI goroutine silently discards errors from p.Run() (line 177). While these errors may not be critical to deployment success, logging them could aid debugging.

🔍 Optional improvement to log UI errors
 	// Start the bubbletea UI in a goroutine
 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debugf("UI error: %v", err)
 		}
 	}()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e492b30 and 52e8045.

📒 Files selected for processing (1)
  • src/cmd/cli/command/compose.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In the Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/cmd/cli/command/compose.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (8)
src/cmd/cli/command/compose.go (8)

10-12: LGTM: Imports align with new UI requirements.

The added imports (sort, sync, Bubble Tea libraries) are appropriate for the terminal UI and concurrent monitoring features introduced in this PR.

Also applies to: 33-35


64-82: LGTM: Constructor properly initializes UI model.

The initialization logic correctly sets up the services map, spinners, and update channel. The buffer size of 100 for updateCh provides reasonable headroom for status updates.


84-90: LGTM: Standard Bubble Tea initialization pattern.

The Init method correctly batches spinner tick commands for all services.


92-114: LGTM: Correct Bubble Tea update logic.

The Update method properly handles keyboard input, service status updates, and spinner ticks. The existence check on line 100 prevents panics from unknown service names.


116-155: LGTM: Clean UI rendering with stable ordering.

The View method correctly sorts services for consistent display and uses appropriate visual indicators (✓, ✗, spinner) based on service state.


356-377: LGTM: Clean separation between tail and UI monitoring paths.

The branching logic correctly preserves the full log tailing behavior behind the --tail flag while defaulting to the new UI-based monitoring. The info message on line 372 helpfully guides users to access logs when needed.


475-475: LGTM: Text improvement enhances clarity.

The updated message is more concise and natural.


204-216: Extract duplicate function to avoid code duplication.

This function is duplicated in src/pkg/cli/tailAndMonitor.go (line 134). Move it to a shared location in the cli package to follow the DRY principle, as the command package already imports cli.

Likely an incorrect or invalid review comment.

Comment on lines 42 to 45
type deploymentModel struct {
	services map[string]*serviceState
	quitting bool
	updateCh chan serviceUpdate
}
Contributor

⚠️ Potential issue | 🔴 Critical

Add mutex to protect concurrent map access.

The services map is accessed concurrently by the UI goroutine (in View at lines 124-151) and the monitoring goroutine (via p.Send at line 188). Without synchronization, this causes race conditions and potential panics.

Add a sync.RWMutex field to deploymentModel and protect all map accesses:

  • Use Lock/Unlock when modifying the map in Update (lines 100-101)
  • Use RLock/RUnlock when reading the map in View (lines 124-151)
🔒 Proposed fix to add mutex protection
 type deploymentModel struct {
 	services map[string]*serviceState
+	mu       sync.RWMutex
 	quitting bool
 	updateCh chan serviceUpdate
 }

Then protect access in the Update method around line 100:

 	case serviceUpdate:
+		m.mu.Lock()
 		if svc, exists := m.services[msg.name]; exists {
 			svc.status = msg.status
 		}
+		m.mu.Unlock()
 		return m, nil

And in the View method around line 122:

 func (m *deploymentModel) View() string {
 	if m.quitting {
 		return ""
 	}
 
+	m.mu.RLock()
+	defer m.mu.RUnlock()
+
 	var lines []string
 	// Sort services by name for consistent ordering
 	var serviceNames []string

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 42 - 46, Add a sync.RWMutex
field to the deploymentModel struct to protect concurrent access to the services
map, then wrap all writes to the map (in the Update method) with mutex.Lock() /
mutex.Unlock() and wrap all reads of the map (in the View method) with
mutex.RLock() / mutex.RUnlock(); ensure you reference the
deploymentModel.services accesses in Update and View and hold the lock only for
the minimal time needed to modify or read the map to avoid blocking other
goroutines.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 368-372: The loop that collects unbuiltServices currently checks
if state <= defangv1.ServiceState_BUILD_STOPPING which misses
defangv1.ServiceState_BUILD_FAILED; update the condition in the serviceStates
iteration (the block that appends to unbuiltServices) to also include
BUILD_FAILED (e.g., state <= defangv1.ServiceState_BUILD_STOPPING || state ==
defangv1.ServiceState_BUILD_FAILED) so services that failed during build are
captured for build log output.
🧹 Nitpick comments (1)
src/cmd/cli/command/compose.go (1)

177-182: Consider logging UI errors for debugging.

UI errors are silently ignored. While this may be intentional to not disrupt the user experience, consider logging them at debug level for troubleshooting.

 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debugf("UI error: %v", err)
 		}
 	}()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52e8045 and c8b40fd.

📒 Files selected for processing (5)
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/getServices_test.go
  • src/pkg/cli/subscribe.go
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In the Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/getServices_test.go
  • src/pkg/cli/deploymentinfo_test.go
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/subscribe.go
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.

Applied to files:

  • src/pkg/cli/getServices_test.go
  • src/pkg/cli/deploymentinfo_test.go
📚 Learning: 2026-01-07T03:07:48.228Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:448-450
Timestamp: 2026-01-07T03:07:48.228Z
Learning: In src/pkg/cli/client/byoc/gcp/byoc.go, the GetDeploymentStatus method intentionally does not pre-validate b.cdExecution before calling b.driver.GetBuildStatus. If b.cdExecution is empty, it represents an error state that will be surfaced by the GCP API as an "invalid operation name" error, which is the intended behavior.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/cmd/cli/command/compose.go
📚 Learning: 2026-01-07T17:31:18.174Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/clouds/gcp/cloudbuild.go:185-204
Timestamp: 2026-01-07T17:31:18.174Z
Learning: In src/pkg/clouds/gcp/cloudbuild.go, the GetBuildStatus method is intentionally designed as a non-blocking status check. It uses op.Poll(ctx) rather than op.Wait(ctx), and returns nil when the build is nil (operation still in progress), allowing the caller to control the polling loop. The method should not be changed to use Wait() as the waiting logic is handled by the caller.

Applied to files:

  • src/pkg/cli/subscribe.go
🧬 Code graph analysis (1)
src/pkg/cli/subscribe.go (3)
src/pkg/cli/client/provider.go (1)
  • Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (20)
  • Provider (28-28)
  • Provider (66-68)
  • Provider (70-72)
  • Provider (79-81)
  • SubscribeResponse (4286-4295)
  • SubscribeResponse (4308-4308)
  • SubscribeResponse (4323-4325)
  • Status (3332-3337)
  • Status (3350-3350)
  • Status (3365-3367)
  • ServiceState_NOT_SPECIFIED (138-138)
  • ServiceState (135-135)
  • ServiceState (199-201)
  • ServiceState (203-205)
  • ServiceState (212-214)
  • ServiceState_BUILD_FAILED (151-151)
  • ServiceState_DEPLOYMENT_FAILED (150-150)
  • Service (3919-3924)
  • Service (3937-3937)
  • Service (3952-3954)
src/pkg/cli/client/errors.go (2)
  • ErrDeploymentSucceeded (8-8)
  • ErrDeploymentFailed (10-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: nix-shell-test
  • GitHub Check: Analyze (go)
🔇 Additional comments (7)
src/pkg/cli/deploymentinfo.go (1)

61-61: LGTM - STATUS column removal aligns with PR objectives.

This change correctly removes the STATUS column from the service table output, keeping only the canonical "State" as specified in the PR objectives to improve signal-to-noise ratio.

src/cmd/cli/command/compose.go (3)

42-46: Race condition concern addressed by Bubble Tea's design.

The services map is accessed by Update and View methods which are called sequentially on Bubble Tea's main goroutine. The p.Send call from the monitoring goroutine is thread-safe as it uses an internal channel. No mutex is needed here.
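The single-owner pattern described here can be sketched without Bubble Tea: one goroutine runs the event loop and is the only code that touches the map, while other goroutines only send messages. This is a simplified, library-free model of the idea, not Bubble Tea's actual implementation; `model`, `run`, and `demo` are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type serviceUpdate struct {
	name   string
	status string
}

// model owns the services map; only the event loop goroutine touches it.
type model struct {
	services map[string]string
}

// update applies a status change, ignoring unknown service names.
func (m *model) update(msg serviceUpdate) {
	if _, ok := m.services[msg.name]; ok {
		m.services[msg.name] = msg.status
	}
}

// view renders the services sorted by name for stable output.
func (m *model) view() string {
	names := make([]string, 0, len(m.services))
	for n := range m.services {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "%s: %s\n", n, m.services[n])
	}
	return b.String()
}

// run is the event loop: update and view are called sequentially on this
// goroutine, so the map needs no mutex.
func run(m *model, msgs <-chan serviceUpdate) string {
	for msg := range msgs {
		m.update(msg)
	}
	return m.view()
}

func demo() string {
	m := &model{services: map[string]string{"app": "BUILD_QUEUED", "web": "BUILD_QUEUED"}}
	msgs := make(chan serviceUpdate)
	result := make(chan string)
	go func() { result <- run(m, msgs) }()

	// The sender never reads or writes m.services; it only passes messages.
	msgs <- serviceUpdate{name: "app", status: "DEPLOYMENT_COMPLETED"}
	msgs <- serviceUpdate{name: "unknown", status: "BUILD_FAILED"} // ignored
	close(msgs)
	return <-result
}

func main() {
	fmt.Print(demo())
}
```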


64-155: Well-structured Bubble Tea UI implementation.

The model correctly handles:

  • Initialization with spinner ticks for all services
  • State updates via messages
  • Consistent rendering with sorted service names
  • Terminal state indicators (✓/✗) replacing spinners

404-406: Reasonable error handling for PrintServices.

Downgrading the error to a warning here is appropriate since the deployment itself succeeded and this is just a display issue.

src/pkg/cli/subscribe.go (1)

118-141: Clean refactoring to callback-based approach.

The WaitServiceState wrapper correctly delegates to WatchServiceState with a callback that handles terminal states. Using ErrDeploymentSucceeded as a sentinel for successful completion is an idiomatic pattern.
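A hedged sketch of the sentinel pattern mentioned here: a callback returns a well-known error value to stop the watch loop, and the wrapper translates that value back into success. Only the name `ErrDeploymentSucceeded` comes from the PR; its message text, `watch`, `wait`, `errStreamEnded`, and the string-typed states are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDeploymentSucceeded is a sentinel: it signals "stop watching, all good".
var ErrDeploymentSucceeded = errors.New("deployment succeeded")

// errStreamEnded is a hypothetical error for a stream that ends early.
var errStreamEnded = errors.New("stream ended before a terminal state")

// watch feeds each state to the callback and stops at the first non-nil error.
func watch(states []string, callback func(string) error) error {
	for _, s := range states {
		if err := callback(s); err != nil {
			return err
		}
	}
	return errStreamEnded
}

// wait wraps watch with a callback that recognises terminal states and
// maps the success sentinel back to a nil error.
func wait(states []string) error {
	err := watch(states, func(s string) error {
		switch s {
		case "DEPLOYMENT_COMPLETED":
			return ErrDeploymentSucceeded
		case "DEPLOYMENT_FAILED", "BUILD_FAILED":
			return fmt.Errorf("service failed: %s", s)
		}
		return nil
	})
	if errors.Is(err, ErrDeploymentSucceeded) {
		return nil
	}
	return err
}

func main() {
	fmt.Println(wait([]string{"BUILD_QUEUED", "DEPLOYMENT_COMPLETED"}))
	fmt.Println(wait([]string{"BUILD_QUEUED", "BUILD_FAILED"}))
}
```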

src/pkg/cli/getServices_test.go (1)

89-91: Test expectations correctly updated.

The expected output aligns with the production change that removes the STATUS column from the short form table output while preserving status information in the long form YAML output.

src/pkg/cli/deploymentinfo_test.go (1)

40-41: Test expectations correctly aligned with STATUS column removal.

All three test cases have been updated to reflect the removal of the STATUS column from the table output, matching the production change in deploymentinfo.go.

Also applies to: 61-62, 80-81

Comment on lines +368 to +378
for service, state := range serviceStates {
	if state <= defangv1.ServiceState_BUILD_STOPPING {
		unbuiltServices = append(unbuiltServices, service)
	}
}
Contributor

@coderabbitai coderabbitai bot Jan 8, 2026

⚠️ Potential issue | 🟠 Major

Include BUILD_FAILED state when identifying services for build log output.

The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). Services that failed during the build phase should also be included to properly surface build logs on deployment failure.

🐛 Proposed fix
 var unbuiltServices = make([]string, 0, len(project.Services))
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }
🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 368 - 372, The loop that
collects unbuiltServices currently checks if state <=
defangv1.ServiceState_BUILD_STOPPING which misses
defangv1.ServiceState_BUILD_FAILED; update the condition in the serviceStates
iteration (the block that appends to unbuiltServices) to also include
BUILD_FAILED (e.g., state <= defangv1.ServiceState_BUILD_STOPPING || state ==
defangv1.ServiceState_BUILD_FAILED) so services that failed during build are
captured for build log output.

Member

Shouldn't we show any service with state != DEPLOYMENT_COMPLETED?

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the ServiceState enum definition to understand all states and their values
rg -n -A 30 'enum ServiceState' --type=proto

Length of output: 29


🏁 Script executed:

#!/bin/bash
# Find the ServiceState enum definition
rg -n -A 30 'ServiceState.*=.*iota|const.*ServiceState_' --type=go

# Also search in proto files
fd -e proto -x cat {} | head -200

Length of output: 7507


🏁 Script executed:

#!/bin/bash
# Search for ServiceState enum in proto files
rg -n 'enum ServiceState' src/protos/ -A 30

# Also check the generated Go file for ServiceState constants
rg -n 'ServiceState_BUILD|ServiceState_DEPLOYMENT' src/protos/ | head -40

Length of output: 3191


@lionello Thanks for the question! I think the narrower condition is actually more appropriate here. Let me explain based on the ServiceState enum values:

State progression:

  • Build phase: BUILD_QUEUED (1) → BUILD_PROVISIONING (2) → ... → BUILD_STOPPING (6)
  • Build outcome: Either BUILD_FAILED (11) or moves to deployment phase
  • Deployment phase: DEPLOYMENT_PENDING (8) → DEPLOYMENT_COMPLETED (9) or DEPLOYMENT_FAILED (10)

Why state != DEPLOYMENT_COMPLETED would be too broad:

The comment says "if any services failed to build, only show build logs for those services". The intent is to show BUILD logs specifically for services that failed during the BUILD phase.

Using state != DEPLOYMENT_COMPLETED would include:

  • DEPLOYMENT_FAILED (10): These services successfully built but failed during deployment. Showing their build logs wouldn't help debug the deployment failure - we'd need Pulumi/deployment logs instead.
  • DEPLOYMENT_PENDING (8): These might not have failed at all, just pending.
  • UPDATE_QUEUED (7): Using existing image, no build occurred.

The correct condition should be:

Services that are in the build phase (BUILD_QUEUED through BUILD_STOPPING) OR explicitly failed during build (BUILD_FAILED):

if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
    unbuiltServices = append(unbuiltServices, service)
}

This way, if a service reaches DEPLOYMENT_FAILED, we'll show all logs (including Pulumi logs), not just build logs, which is what you'd need to debug deployment issues.

Does that make sense?
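The condition can be checked against the enum values quoted above with a small standalone sketch. The constants below copy only the numeric values cited in this thread; `ServiceState` here is a local stand-in for `defangv1.ServiceState`, and `needsBuildLogs` is a hypothetical helper name.

```go
package main

import "fmt"

// ServiceState is a local stand-in for defangv1.ServiceState.
type ServiceState int32

// Values as quoted in the discussion above; intermediate build states are omitted.
const (
	BuildQueued         ServiceState = 1
	BuildProvisioning   ServiceState = 2
	BuildStopping       ServiceState = 6
	UpdateQueued        ServiceState = 7
	DeploymentPending   ServiceState = 8
	DeploymentCompleted ServiceState = 9
	DeploymentFailed    ServiceState = 10
	BuildFailed         ServiceState = 11
)

// needsBuildLogs reports whether a service is still in the build phase or
// explicitly failed during build.
func needsBuildLogs(state ServiceState) bool {
	return state <= BuildStopping || state == BuildFailed
}

func main() {
	for _, s := range []ServiceState{BuildQueued, BuildFailed, UpdateQueued, DeploymentFailed, DeploymentCompleted} {
		fmt.Printf("state %d needs build logs: %t\n", s, needsBuildLogs(s))
	}
}
```

Note that `state <= BuildStopping` also matches any value below `BuildQueued`, so an unset state would be included as well.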


🧬 Code Graph Analysis Results

[src/cmd/cli/command/compose.go] (summary)

  • Summary: The file prints the current service states after deployment via cli.PrintServices(project.Name, provider). The user question about showing services with state != DEPLOYMENT_COMPLETED relates to how states are presented and possibly filtered in the code path that monitors/tails deployment progress and displays statuses. The file also uses a UI path (MonitorWithUI) that can adjust service states (e.g., marking DEPLOYMENT_COMPLETED) during monitoring. No separate filtering logic is present in this file snippet; any filtering would be in the underlying PrintServices/monitoring logic.

[src/protos/io/defang/v1/fabric.pb.go] 2571-2594

  • Snippet: ServiceInfo struct

  • Content:
    type ServiceInfo struct {
        state protoimpl.MessageState `protogen:"open.v1"`
        Service *Service `protobuf:"bytes,1,opt,name=service,proto3" json:"service,omitempty"`
        Endpoints []string `protobuf:"bytes,2,rep,name=endpoints,proto3" json:"endpoints,omitempty"` // list of endpoints, one for each port
        Project string `protobuf:"bytes,3,opt,name=project,proto3" json:"project,omitempty"` // was: tenant; defaults to tenant ID
        Etag string `protobuf:"bytes,4,opt,name=etag,proto3" json:"etag,omitempty"`
        Status string `protobuf:"bytes,5,opt,name=status,proto3" json:"status,omitempty"`
        NatIps []string `protobuf:"bytes,6,rep,name=nat_ips,json=natIps,proto3" json:"nat_ips,omitempty"` // comma-separated list of NAT IPs
        LbIps []string `protobuf:"bytes,7,rep,name=lb_ips,json=lbIps,proto3" json:"lb_ips,omitempty"` // comma-separated list of internal CIDRs for the load-balancer
        PrivateFqdn string `protobuf:"bytes,8,opt,name=private_fqdn,json=privateFqdn,proto3" json:"private_fqdn,omitempty"` // fully qualified domain name (host)
        PublicFqdn string `protobuf:"bytes,9,opt,name=public_fqdn,json=publicFqdn,proto3" json:"public_fqdn,omitempty"` // fully qualified domain name (ingress)
        CreatedAt *timestamppb.Timestamp `protobuf:"bytes,10,opt,name=created_at,json=createdAt,proto3" json:"created_at,omitempty"`
        UpdatedAt *timestamppb.Timestamp `protobuf:"bytes,11,opt,name=updated_at,json=updatedAt,proto3" json:"updated_at,omitempty"`
        ZoneId string `protobuf:"bytes,12,opt,name=zone_id,json=zoneId,proto3" json:"zone_id,omitempty"` // zone ID for byod domain
        UseAcmeCert bool `protobuf:"varint,13,opt,name=use_acme_cert,json=useAcmeCert,proto3" json:"use_acme_cert,omitempty"` // If we should setup the facilities to use ACME(let's encrypt) certs
        State ServiceState `protobuf:"varint,15,opt,name=state,proto3,enum=io.defang.v1.ServiceState" json:"state,omitempty"` // enumerated status of the service
        Domainname string `protobuf:"bytes,16,opt,name=domainname,proto3" json:"domainname,omitempty"` // domain name for the service
        LbDnsName string `protobuf:"bytes,17,opt,name=lb_dns_name,json=lbDnsName,proto3" json:"lb_dns_name,omitempty"` // fully qualified domain name for the load-balancer
        AllowScaling bool `protobuf:"varint,18,opt,name=allow_scaling,json=allowScaling,proto3" json:"allow_scaling,omitempty"` // true if service is allowed to autoscale
        HealthcheckPath string `protobuf:"bytes,19,opt,name=healthcheck_path,json=healthcheckPath,proto3" json:"healthcheck_path,omitempty"`
        Type ResourceType `protobuf:"varint,21,opt,name=type,proto3,enum=io.defang.v1.ResourceType" json:"type,omitempty"`
        unknownFields protoimpl.UnknownFields
        sizeCache protoimpl.SizeCache
    }

  • Relevance: This shows that a ServiceInfo has a State field of type ServiceState, which is the source of the status values (including DEPLOYMENT_COMPLETED) that the user comment concerns.

[src/cmd/cli/command/composeUpTui.go] 142-201

  • Snippet: MonitorWithUI usage of DEPLOYMENT_COMPLETED

  • Content (abridged to key parts):
    func MonitorWithUI(ctx context.Context, project compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string) (map[string]defangv1.ServiceState, error) {
        // ... setup ...
        go func() {
            // UI runs in a goroutine
            if _, err := p.Run(); err != nil { /* ... */ }
        }()

        // Start monitoring in a goroutine
        go func() {
            // Monitor and update UI with service state changes
            serviceStates, monitorErr = Monitor(ctx, project, provider, waitTimeout, deploymentID, func(msg *defangv1.SubscribeResponse, states *ServiceStates) error {
                for name, state := range *states {
                    p.Send(serviceUpdate{name: name, status: state})
                }
                return nil
            })

            if monitorErr == nil {
                // When monitoring completes, mark any still-DEPLOYMENT_PENDING as DEPLOYMENT_COMPLETED
                for _, serviceName := range servicesNames {
                    if serviceStates[serviceName] == defangv1.ServiceState_DEPLOYMENT_PENDING {
                        p.Send(serviceUpdate{name: serviceName, status: defangv1.ServiceState_DEPLOYMENT_COMPLETED})
                    }
                }
            }
            p.Quit()
        }()

        // wait for UI/monitor goroutines
        wg.Wait()

        return serviceStates, monitorErr
    }

  • Relevance: This shows explicit handling of the DEPLOYMENT_COMPLETED state in the UI; it demonstrates how DEPLOYMENT_PENDING can be transformed into DEPLOYMENT_COMPLETED, which is directly relevant to the idea of filtering or deciding which services to show based on completion state.

Notes and how they relate to user comment

  • The user asks: "Shouldn't we show any service with state != DEPLOYMENT_COMPLETED?" The relevant code paths indicate:
    • Service state is represented by ServiceState, with DEPLOYMENT_COMPLETED being a possible state (as used in MonitorWithUI).
    • There is UI logic that explicitly updates any remaining DEPLOYMENT_PENDING to DEPLOYMENT_COMPLETED upon successful completion of monitoring.
    • Print/monitoring functions (PrintServices, Tail/Monitor) operate on a collection of service states but the provided snippets do not show an explicit filter to exclude non-COMPLETED services from being displayed; rather, the UI path actively updates statuses and shows them, including potentially non-COMPLETED states until completion.
  • If you want to enforce showing only services with state DEPLOYMENT_COMPLETED, you would likely implement a filter around the display paths (e.g., PrintObject/PrintServices) or in the monitor UI to skip services whose final state is not DEPLOYMENT_COMPLETED. The MonitorWithUI snippet demonstrates where such a logic point could be inserted (e.g., filter or hide entries with non-COMPLETED states, or add a flag to show only completed services).

🧠 Learnings used
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/clouds/gcp/cloudbuild.go:62-86
Timestamp: 2026-01-09T20:19:04.424Z
Learning: In src/pkg/clouds/gcp/cloudbuild.go, BuildTag.Parse should fail on unexpected tag formats (tags that don't have 3-4 underscore-separated parts or aren't DefangCDBuildTag) because build tags are strictly controlled and only created in two places: (1) running CD in cloudbuild by CLI, and (2) building images by CD. Unexpected tags indicate an error case.

Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/stream.go:497-512
Timestamp: 2026-01-09T20:31:23.614Z
Learning: In src/pkg/cli/client/byoc/gcp/stream.go, the getReadyServicesCompletedResps helper function intentionally uses variable shadowing. The loop variable `status` from `readyServices` map represents individual service statuses, while the function parameter (to be renamed `cdStatus`) represents only the CD service (defangCD) completion status. Each ready service should retain its original status from the map.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/pkg/cli/subscribe.go (1)

39-42: Data race on serverStream during reconnection.

The cancellation goroutine (lines 39-42) captures serverStream by reference. When the receive goroutine reassigns serverStream at line 64 during reconnection, there's an unsynchronized concurrent read/write:

  • Goroutine 1 may read serverStream to call Close()
  • Goroutine 2 may be writing to serverStream during reconnection

Additionally, on reconnection the old stream is never explicitly closed, potentially leaking resources.

🔒 Proposed fix using a mutex or local variable

Option 1: Use a mutex to synchronize access:

+var streamMu sync.Mutex

 go func() {
 	<-ctx.Done()
+	streamMu.Lock()
 	serverStream.Close()
+	streamMu.Unlock()
 }()

 // In receive goroutine, when reconnecting:
+streamMu.Lock()
+serverStream.Close() // close old stream
 serverStream, err = provider.Subscribe(ctx, &subscribeRequest)
+streamMu.Unlock()

Option 2: Avoid sharing the variable by closing from within the receive goroutine only, and using context cancellation to signal exit.

Also applies to: 64-64
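A minimal sketch of Option 1, assuming a small wrapper type that owns the stream. `streamHolder`, `closer`, and `fakeStream` are hypothetical names for illustration and are not part of the PR or of the Defang client API. The wrapper also closes the old stream on reconnect and refuses to keep a new stream once cancellation has already happened.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is the minimal surface this sketch needs from a stream (assumption).
type closer interface {
	Close() error
}

// streamHolder guards the current stream so the cancellation goroutine and the
// reconnecting goroutine never read and write it at the same time.
type streamHolder struct {
	mu      sync.Mutex
	current closer
	done    bool
}

// Swap closes the previous stream, if any, and installs the new one.
// After Close has been called, the new stream is closed immediately instead.
func (h *streamHolder) Swap(next closer) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.done {
		next.Close() // already cancelled: do not keep a stream nobody will close
		return
	}
	if h.current != nil {
		h.current.Close()
	}
	h.current = next
}

// Close closes whichever stream is current. It is safe to call from the
// goroutine that waits on ctx.Done().
func (h *streamHolder) Close() {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.done = true
	if h.current != nil {
		h.current.Close()
		h.current = nil
	}
}

// fakeStream counts Close calls so the behaviour can be observed.
type fakeStream struct{ closed int }

func (f *fakeStream) Close() error {
	f.closed++
	return nil
}

func main() {
	first, second := &fakeStream{}, &fakeStream{}
	h := &streamHolder{}
	h.Swap(first)
	h.Swap(second) // reconnect: first is closed here
	h.Close()      // cancellation: second is closed here
	fmt.Println(first.closed, second.closed) // prints "1 1"
}
```

Running this kind of code under `go test -race` is the usual way to confirm the unsynchronized access is gone.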

♻️ Duplicate comments (5)
src/pkg/cli/tailAndMonitor.go (2)

18-21: Replace panic with error return for production code.

This issue was previously flagged. Using panic for input validation can crash the application. Return an error instead.

🛡️ Proposed fix
 func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
 	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
+		return nil, errors.New("deploymentID must be a valid deployment ID")
 	}

59-63: Replace panic with error return for production code.

This issue was previously flagged. Similar to the Monitor function, using panic for input validation should be replaced with returning an error.

🛡️ Proposed fix
 func TailAndMonitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, tailOptions TailOptions) (ServiceStates, error) {
 	tailOptions.Follow = true
 	if tailOptions.Deployment == "" {
-		panic("tailOptions.Deployment must be a valid deployment ID")
+		return nil, errors.New("tailOptions.Deployment must be a valid deployment ID")
 	}
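
As a rough illustration of what the error-returning form gives callers, the sketch below uses a sentinel error so the empty-ID case can be detected and handled. `errMissingDeploymentID` and `validateDeploymentID` are hypothetical names; the proposed fix above simply returns `errors.New(...)` inline.

```go
package main

import (
	"errors"
	"fmt"
)

// errMissingDeploymentID is a hypothetical sentinel for this sketch.
var errMissingDeploymentID = errors.New("deploymentID must be a valid deployment ID")

// validateDeploymentID returns an error instead of panicking, so the caller
// decides how to report the problem.
func validateDeploymentID(id string) error {
	if id == "" {
		return errMissingDeploymentID
	}
	return nil
}

func main() {
	if err := validateDeploymentID(""); err != nil {
		// The CLI can print a normal error message and exit non-zero here.
		fmt.Println("cannot monitor deployment:", err)
	}
}
```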
src/cmd/cli/command/compose.go (2)

41-45: Add mutex to protect concurrent map access.

This issue was previously flagged. The services map is accessed concurrently by the UI goroutine (in View) and the monitoring goroutine (via p.Send). Without synchronization, this causes race conditions.

🔒 Proposed fix
 type deploymentModel struct {
 	services map[string]*serviceState
+	mu       sync.RWMutex
 	quitting bool
 	updateCh chan serviceUpdate
 }

Then protect access in Update (around line 99):

 	case serviceUpdate:
+		m.mu.Lock()
 		if svc, exists := m.services[msg.name]; exists {
 			svc.status = msg.status
 		}
+		m.mu.Unlock()

And in View (around line 120):

 func (m *deploymentModel) View() string {
+	m.mu.RLock()
+	defer m.mu.RUnlock()
+
 	if m.quitting {

373-378: Include BUILD_FAILED state when identifying services for build log output.

This issue was previously flagged. The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). Services that failed during build should be included to surface build logs on deployment failure.

🐛 Proposed fix
 var unbuiltServices = make([]string, 0, len(project.Services))
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }
src/pkg/cli/subscribe.go (1)

71-74: Goroutine can exit without signaling, causing indefinite hang.

When serverStream.Receive() returns false and serverStream.Err() returns nil (clean stream close), the goroutine exits without sending anything to errChan. The main loop at line 90 blocks indefinitely waiting for a message that will never arrive.

🔒 Proposed fix
 if err := serverStream.Err(); err != nil {
 	errChan <- err
+} else {
+	errChan <- io.EOF // or a sentinel error to signal clean close
 }
 return

Then handle the sentinel appropriately in the main loop.
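
One possible shape for that handling, sketched under assumptions: `receiver` mirrors the Receive/Err pair described above, while `pump`, `endedCleanly`, and `fakeStream` are hypothetical names for illustration. The point is that the goroutine always sends exactly one value, so the waiting side can never block forever.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// receiver mirrors the Receive/Err shape described above (assumption).
type receiver interface {
	Receive() bool
	Err() error
}

// errBroken is a sample stream failure used by the demo below.
var errBroken = errors.New("connection reset")

// pump drains the stream and always reports how it ended: the stream error,
// or io.EOF when the stream closed cleanly.
func pump(s receiver, errChan chan<- error) {
	for s.Receive() {
		// A real implementation would forward the received message here.
	}
	if err := s.Err(); err != nil {
		errChan <- err
		return
	}
	errChan <- io.EOF
}

// endedCleanly waits for pump's result and reports whether it was a clean close.
func endedCleanly(s receiver) bool {
	errChan := make(chan error, 1)
	go pump(s, errChan)
	return errors.Is(<-errChan, io.EOF)
}

// fakeStream yields a fixed number of messages, then stops with err.
type fakeStream struct {
	remaining int
	err       error
}

func (f *fakeStream) Receive() bool {
	if f.remaining == 0 {
		return false
	}
	f.remaining--
	return true
}

func (f *fakeStream) Err() error { return f.err }

func main() {
	fmt.Println(endedCleanly(&fakeStream{remaining: 3}))  // clean close
	fmt.Println(endedCleanly(&fakeStream{err: errBroken})) // stream error
}
```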

🧹 Nitpick comments (3)
src/pkg/cli/tailAndMonitor.go (1)

53-56: Consider making the 2-second delay configurable or documented.

The hardcoded 2-second sleep before returning appears to be a buffer for receiving final status messages. Consider either documenting this magic number with a comment explaining its purpose, or making it a named constant.

+const finalMessageGracePeriod = 2 * time.Second // Allow time for final status messages to arrive
+
 	wg.Wait()
-	pkg.SleepWithContext(ctx, 2*time.Second)
+	pkg.SleepWithContext(ctx, finalMessageGracePeriod)
src/cmd/cli/command/compose.go (2)

131-140: Consider handling BUILD_FAILED state in the spinner/status display.

The View method shows "✓" for DEPLOYMENT_COMPLETED and "✗" for DEPLOYMENT_FAILED, but BUILD_FAILED would still show a spinning indicator. Consider adding explicit handling for BUILD_FAILED:

 		switch svc.status {
 		case "DEPLOYMENT_COMPLETED":
 			spinnerOrCheck = "✓"
-		case "DEPLOYMENT_FAILED":
+		case "DEPLOYMENT_FAILED", "BUILD_FAILED":
 			spinnerOrCheck = "✗"
 		default:
 			spinnerOrCheck = svc.spinner.View()
 		}

156-203: Consider handling UI run errors.

The UI goroutine silently ignores errors from p.Run(). While UI errors may not be critical, logging them at debug level would aid troubleshooting:

 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debug("UI error:", err)
 		}
 	}()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c8b40fd and 454c06f.

⛔ Files ignored due to path filters (1)
  • src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (10)
  • pkgs/defang/cli.nix
  • src/cmd/cli/command/compose.go
  • src/go.mod
  • src/pkg/cli/client/errors.go
  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/getServices_test.go
  • src/pkg/cli/subscribe.go
  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/tailAndMonitor.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/pkg/cli/getServices_test.go
  • src/pkg/cli/client/errors.go
🧰 Additional context used
🧠 Learnings (9)
📚 Learning: 2026-01-07T03:07:56.002Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:448-450
Timestamp: 2026-01-07T03:07:56.002Z
Learning: In src/pkg/cli/client/byoc/gcp/byoc.go, the GetDeploymentStatus method intentionally does not pre-validate b.cdExecution before calling b.driver.GetBuildStatus. If b.cdExecution is empty, it represents an error state that will be surfaced by the GCP API as an "invalid operation name" error, which is the intended behavior.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/tailAndMonitor.go
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/subscribe_test.go
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In the Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/tailAndMonitor.go
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/subscribe.go
  • src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-09T20:12:21.986Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:30-30
Timestamp: 2026-01-09T20:12:21.986Z
Learning: In Go files, recognize and accept the import path go.yaml.in/yaml/v3 as the maintained fork of the YAML library. Do not flag this import as incorrect; this fork supersedes the archived gopkg.in/yaml.v3 path. If you encounter this or similar forked import paths, treat them as valid Go imports and do not raise review flags.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/tailAndMonitor.go
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/subscribe.go
  • src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-13T17:46:06.788Z
Learnt from: jordanstephens
Repo: DefangLabs/defang PR: 1754
File: src/pkg/agent/tools/provider.go:19-27
Timestamp: 2026-01-13T17:46:06.788Z
Learning: Go interfaces should be consumer-defined and small, defined in the package that uses them, rather than centralized, broad interfaces. Prefer interfaces that capture the specific methods required by a consumer, enabling implicit satisfaction and easier testing. Do not assume a single, global interface name across packages; it is acceptable (and sometimes intentional) for different packages to define interfaces with the same name but different method sets. Apply this guideline across all Go files in the repository.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/tailAndMonitor.go
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/subscribe.go
  • src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-09T20:19:04.424Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/clouds/gcp/cloudbuild.go:62-86
Timestamp: 2026-01-09T20:19:04.424Z
Learning: In src/pkg/clouds/gcp/cloudbuild.go, BuildTag.Parse should fail on unexpected tag formats (tags that don't have 3-4 underscore-separated parts or aren't DefangCDBuildTag) because build tags are strictly controlled and only created in two places: (1) running CD in cloudbuild by CLI, and (2) building images by CD. Unexpected tags indicate an error case.

Applied to files:

  • src/pkg/cli/tailAndMonitor.go
  • src/cmd/cli/command/compose.go
📚 Learning: 2025-12-31T13:47:20.413Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:20.413Z
Learning: In the Defang codebase, prefer allowing panics in mock objects within test files rather than adding defensive error handling, as the panic behavior helps catch issues during testing.

Applied to files:

  • src/pkg/cli/tailAndMonitor.go
📚 Learning: 2026-01-09T20:31:23.614Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/stream.go:497-512
Timestamp: 2026-01-09T20:31:23.614Z
Learning: In src/pkg/cli/client/byoc/gcp/stream.go, the getReadyServicesCompletedResps helper function intentionally uses variable shadowing. The loop variable `status` from `readyServices` map represents individual service statuses, while the function parameter (to be renamed `cdStatus`) represents only the CD service (defangCD) completion status. Each ready service should retain its original status from the map.

Applied to files:

  • src/cmd/cli/command/compose.go
  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/subscribe.go
📚 Learning: 2026-01-09T20:12:31.961Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:30-30
Timestamp: 2026-01-09T20:12:31.961Z
Learning: The import path `go.yaml.in/yaml/v3` is correct and refers to the YAML organization's fork (yaml/go-yaml) which is the maintained successor after the original gopkg.in/yaml.v3 (go-yaml/yaml) was archived in April 2025. Do not flag this import path as incorrect.

Applied to files:

  • src/go.mod
🧬 Code graph analysis (2)
src/pkg/cli/tailAndMonitor.go (5)
src/pkg/cli/client/provider.go (1)
  • Provider (64-90)
src/pkg/cli/subscribe.go (2)
  • ServiceStates (15-15)
  • WatchServiceState (17-116)
src/pkg/cli/waitForCdTaskExit.go (1)
  • WaitForCdTaskExit (14-41)
src/pkg/utils.go (1)
  • SleepWithContext (120-129)
src/pkg/cli/tail.go (1)
  • TailOptions (45-58)
src/cmd/cli/command/compose.go (5)
src/pkg/cli/client/provider.go (1)
  • Provider (64-90)
src/protos/io/defang/v1/fabric.pb.go (18)
  • Provider (28-28)
  • Provider (66-68)
  • Provider (70-72)
  • Provider (79-81)
  • ServiceState (135-135)
  • ServiceState (199-201)
  • ServiceState (203-205)
  • ServiceState (212-214)
  • SubscribeResponse (4758-4767)
  • SubscribeResponse (4780-4780)
  • SubscribeResponse (4795-4797)
  • Deployment (3322-3342)
  • Deployment (3355-3355)
  • Deployment (3370-3372)
  • ServiceState_BUILD_STOPPING (145-145)
  • Stack (683-696)
  • Stack (709-709)
  • Stack (724-726)
src/pkg/cli/tailAndMonitor.go (2)
  • Monitor (18-57)
  • TailAndMonitor (59-118)
src/pkg/cli/tail.go (2)
  • TailOptions (45-58)
  • Tail (118-153)
src/pkg/logs/log_type.go (3)
  • LogType (8-8)
  • LogTypeAll (23-23)
  • LogTypeBuild (21-21)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Analyze (go)
  • GitHub Check: go-test
🔇 Additional comments (15)
src/pkg/cli/deploymentinfo.go (1)

61-61: LGTM!

The removal of the "Status" column from the table output aligns with the PR objective to improve signal-to-noise ratio during deployment. The Status field is appropriately retained in the Service struct and still populated by GetServiceStatesAndEndpoints, preserving the data for other potential consumers while simplifying the default display.

src/pkg/cli/deploymentinfo_test.go (1)

40-42: LGTM!

Test expectations correctly updated to reflect the removal of the STATUS column from the table output. The test input data appropriately retains the Status field values since the underlying struct still requires them, while the expected output no longer includes the column.

Also applies to: 61-63, 80-82

pkgs/defang/cli.nix (1)

10-10: LGTM!

The vendorHash update correctly reflects the new dependencies added in go.mod (charmbracelet/bubbles, bubbletea, lipgloss, and related transitive dependencies).

src/go.mod (2)

37-39: LGTM!

The Bubble Tea ecosystem dependencies (bubbles, bubbletea, lipgloss) are well-established libraries for building terminal UIs in Go. These additions support the new interactive deployment status UI described in the PR objectives.


56-56: LGTM!

The termenv upgrade to v0.16.0 is appropriate as it's a transitive dependency of the Bubble Tea ecosystem and ensures compatibility with the new UI components.

src/pkg/cli/subscribe_test.go (1)

222-234: LGTM!

The assertion message correction is appropriate. The previous message "Unexpected error: %v" with a nil error would be confusing; "Expected error but got nil" correctly describes the failure condition in error-expecting test cases.

src/pkg/cli/tailAndMonitor.go (1)

120-132: LGTM!

The CanMonitorService and splitManagedAndUnmanagedServices helper functions are well-structured. The logic correctly identifies services that can be monitored based on restart policy and compute service classification.

src/cmd/cli/command/compose.go (6)

57-61: LGTM!

The adaptive color styles for spinner, status, and service names provide good visual feedback and adapt to light/dark terminal themes appropriately.


63-81: LGTM!

The newDeploymentModel constructor correctly initializes spinners for each service with a reasonable buffer size (100) for the update channel.


83-113: LGTM!

The Bubble Tea Init and Update methods are correctly implemented:

  • Init properly batches spinner tick commands for all services
  • Update handles Ctrl+C for graceful exit, service updates, and spinner ticks
  • The type switch pattern is idiomatic for Bubble Tea

349-367: LGTM!

The conditional monitoring path correctly:

  • Uses TailAndMonitor for verbose or non-interactive modes (full log tailing)
  • Uses monitorWithUI for interactive sessions (compact UI)
  • Provides helpful hint about tailing logs with the deployment ID

410-412: LGTM!

The updated call to cli.PrintServices correctly uses session.Provider and handles errors appropriately with a warning rather than failing the command.


466-466: LGTM!

The updated message "This project has already deployed to the following locations:" is clearer than the previous wording.

src/pkg/cli/subscribe.go (2)

90-115: Main loop logic is correct, contingent on fixing goroutine signaling.

The select-based message handling and callback invocation are well-structured. The early filtering of unexpected service names (lines 99-102) is good defensive coding.


118-150: Wrapper and helper functions are well-implemented.

WaitServiceState cleanly wraps WatchServiceState with appropriate callback logic for detecting failures and successful completion. The allInState helper is straightforward and correct.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/pkg/cli/deploymentinfo.go (1)

68-76: Fix column header names to match struct fields.

The attrs slice contains headers that don't match the Service struct field names. The term.Table function uses FieldByName() for exact matching (see src/pkg/term/table.go line 57), so mismatches result in "N/A" being printed instead of actual values. Correct the following headers:

  • "Deployment Status" → "Deployment"
  • "State" → "DeploymentState"
src/pkg/cli/deploymentinfo_test.go (1)

86-105: Tests make real HTTP requests, causing potential flakiness.

These tests invoke PrintServiceStatesAndEndpoints, which calls GetHealthcheckResults. This function makes actual HTTP requests to endpoints via http.DefaultClient.Do() without mocking:

  • Test "Service with Domainname" expects "healthy" for example.com (line 62), requiring a real HTTPS request to return a 2xx status.
  • Test "endpoint without port" expects "unreachable" for service1 (line 81), which attempts a real network request.
  • Network unavailability or changes in endpoint behavior will cause test failures.

Consider mocking http.DefaultClient using an injected HTTP client interface or by setting http.DefaultClient.Transport to a mock RoundTripper, as done in other tests in this directory (e.g., new_test.go).
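
A minimal sketch of the RoundTripper approach, using only the standard library. `roundTripFunc`, `stubClient`, and `checkHealth` are hypothetical names, and `checkHealth` is only a stand-in for the real healthcheck; the "unhealthy" label in particular is an assumption, since the tests above only mention "healthy" and "unreachable".

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// roundTripFunc adapts a plain function to the http.RoundTripper interface.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// errDial simulates a network failure.
var errDial = errors.New("dial failed")

// stubClient returns a client that never touches the network. Every request
// gets the given status code, or fails with err when err is non-nil.
func stubClient(status int, err error) *http.Client {
	return &http.Client{Transport: roundTripFunc(func(r *http.Request) (*http.Response, error) {
		if err != nil {
			return nil, err
		}
		return &http.Response{
			StatusCode: status,
			Body:       io.NopCloser(strings.NewReader("")),
			Header:     make(http.Header),
			Request:    r,
		}, nil
	})}
}

// checkHealth is a hypothetical stand-in for the healthcheck under test.
func checkHealth(c *http.Client, url string) string {
	resp, err := c.Get(url)
	if err != nil {
		return "unreachable"
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
		return "healthy"
	}
	return "unhealthy"
}

func main() {
	fmt.Println(checkHealth(stubClient(http.StatusOK, nil), "https://example.com/"))
	fmt.Println(checkHealth(stubClient(0, errDial), "https://example.com/"))
}
```

Injecting the client as a parameter, as `checkHealth` does here, avoids mutating `http.DefaultClient`, which is shared global state and can leak between tests that run in parallel.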

🤖 Fix all issues with AI agents
In `@src/pkg/cli/getServices.go`:
- Around line 57-106: GetHealthcheckResults spawns goroutines that write
concurrently to the shared results map and also mutates serviceInfo.State,
causing data races; protect concurrent access by adding synchronization (e.g., a
sync.Mutex or replace results with a sync.Map) around writes to results inside
the goroutine and when reading/writing it elsewhere, and avoid mutating the
input ServiceInfo in-place from goroutines (e.g., send status back on a channel
and update serviceInfo.State serially after wg.Wait or make a copy of
serviceInfo per goroutine before mutating). Target symbols:
GetHealthcheckResults, results, serviceInfo.State; ensure all writes to results
and any ServiceInfo state changes are synchronized or moved out of concurrent
goroutines.
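
The mutex variant of that fix can be sketched roughly as follows. `collectResults` and `probe` are hypothetical names that only model the shape of the problem (one goroutine per service writing into a shared map); they are not the actual GetHealthcheckResults signature.

```go
package main

import (
	"fmt"
	"sync"
)

// probe is a hypothetical stand-in for a single healthcheck request.
type probe func(name string) string

// collectResults runs one probe per service concurrently. The shared map is
// only written while holding mu, and the inputs are never mutated.
func collectResults(services []string, p probe) map[string]string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]string, len(services))
	)
	for _, name := range services {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			status := p(name) // the slow work happens outside the lock
			mu.Lock()
			results[name] = status
			mu.Unlock()
		}(name)
	}
	wg.Wait() // after this point no goroutine touches results
	return results
}

func main() {
	got := collectResults([]string{"app", "worker"}, func(name string) string {
		return "healthy"
	})
	fmt.Println(len(got), got["app"], got["worker"]) // prints "2 healthy healthy"
}
```

Any update to a per-service state field can then be applied serially from the returned map once `wg.Wait()` has returned.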
🧹 Nitpick comments (1)
src/pkg/cli/deploymentinfo.go (1)

47-56: Consider nil check for serviceInfo.Service.

Line 49 accesses serviceInfo.Service.Name without checking if Service is nil. While this may not occur in practice, a defensive check would prevent potential panics if malformed data is received.

💡 Optional defensive check
 	for _, serviceInfo := range serviceInfos {
+		if serviceInfo.Service == nil {
+			continue
+		}
 		fqdn := serviceInfo.PublicFqdn
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 454c06f and 370d92b.

📒 Files selected for processing (4)
  • src/pkg/agent/tools/default_tool_cli.go
  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/getServices.go
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In the Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/getServices.go
  • src/pkg/agent/tools/default_tool_cli.go
  • src/pkg/cli/deploymentinfo.go
📚 Learning: 2026-01-09T20:12:21.986Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:30-30
Timestamp: 2026-01-09T20:12:21.986Z
Learning: In Go files, recognize and accept the import path go.yaml.in/yaml/v3 as the maintained fork of the YAML library. Do not flag this import as incorrect; this fork supersedes the archived gopkg.in/yaml.v3 path. If you encounter this or similar forked import paths, treat them as valid Go imports and do not raise review flags.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/getServices.go
  • src/pkg/agent/tools/default_tool_cli.go
  • src/pkg/cli/deploymentinfo.go
📚 Learning: 2026-01-13T17:46:06.788Z
Learnt from: jordanstephens
Repo: DefangLabs/defang PR: 1754
File: src/pkg/agent/tools/provider.go:19-27
Timestamp: 2026-01-13T17:46:06.788Z
Learning: Go interfaces should be consumer-defined and small, defined in the package that uses them, rather than centralized, broad interfaces. Prefer interfaces that capture the specific methods required by a consumer, enabling implicit satisfaction and easier testing. Do not assume a single, global interface name across packages; it is acceptable (and sometimes intentional) for different packages to define interfaces with the same name but different method sets. Apply this guideline across all Go files in the repository.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/getServices.go
  • src/pkg/agent/tools/default_tool_cli.go
  • src/pkg/cli/deploymentinfo.go
📚 Learning: 2026-01-09T20:31:23.614Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/stream.go:497-512
Timestamp: 2026-01-09T20:31:23.614Z
Learning: In src/pkg/cli/client/byoc/gcp/stream.go, the getReadyServicesCompletedResps helper function intentionally uses variable shadowing. The loop variable `status` from `readyServices` map represents individual service statuses, while the function parameter (to be renamed `cdStatus`) represents only the CD service (defangCD) completion status. Each ready service should retain its original status from the map.

Applied to files:

  • src/pkg/cli/getServices.go
  • src/pkg/agent/tools/default_tool_cli.go
  • src/pkg/cli/deploymentinfo.go
🧬 Code graph analysis (4)
src/pkg/cli/deploymentinfo_test.go (1)
src/pkg/cli/deploymentinfo.go (1)
  • PrintServiceStatesAndEndpoints (62-83)
src/pkg/cli/getServices.go (2)
src/pkg/cli/deploymentinfo.go (2)
  • PrintServiceStatesAndEndpoints (62-83)
  • Service (10-18)
src/protos/io/defang/v1/fabric.pb.go (9)
  • ServiceInfo (2568-2591)
  • ServiceInfo (2604-2604)
  • ServiceInfo (2619-2621)
  • Service (4391-4396)
  • Service (4409-4409)
  • Service (4424-4426)
  • Status (3780-3785)
  • Status (3798-3798)
  • Status (3813-3815)
src/pkg/agent/tools/default_tool_cli.go (1)
src/pkg/cli/deploymentinfo.go (1)
  • GetServiceStatesAndEndpoints (20-60)
src/pkg/cli/deploymentinfo.go (2)
src/protos/io/defang/v1/fabric.pb.go (13)
  • Service (4391-4396)
  • Service (4409-4409)
  • Service (4424-4426)
  • ServiceState (135-135)
  • ServiceState (199-201)
  • ServiceState (203-205)
  • ServiceState (212-214)
  • Status (3780-3785)
  • Status (3798-3798)
  • Status (3813-3815)
  • ServiceInfo (2568-2591)
  • ServiceInfo (2604-2604)
  • ServiceInfo (2619-2621)
src/pkg/cli/getServices.go (1)
  • GetHealthcheckResults (57-107)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (3)
src/pkg/agent/tools/default_tool_cli.go (1)

73-81: LGTM!

The change correctly threads the context to GetServiceStatesAndEndpoints, enabling proper timeout and cancellation handling for the healthcheck requests performed downstream.

src/pkg/cli/getServices.go (1)

44-55: LGTM!

PrintServices cleanly delegates to PrintServiceStatesAndEndpoints, properly threading the context for healthcheck processing.

src/pkg/cli/deploymentinfo.go (1)

10-18: LGTM on struct updates.

The Service struct is well-organized with the new DeploymentState and Healthcheck fields. The field ordering and types are appropriate.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

jordanstephens and others added 17 commits January 16, 2026 10:06
The WatchServiceState function was blocking on serverStream.Receive()
even when the context was cancelled by cancelSvcStatus. This change
moves the stream receiving to a separate goroutine and uses channels
with a select statement to properly handle context cancellation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
The test was flaky in CI due to a race condition in the select statement.
When a stream ended after sending the final message (e.g., BUILD_FAILED),
both msgChan and errChan could have data ready simultaneously:
- msgChan had the failure state message
- errChan had nil (from stream ending without error)

The select statement could non-deterministically choose either channel,
causing the test to sometimes return nil error instead of ErrDeploymentFailed.

Fixed by:
1. Not sending nil errors to errChan when stream ends normally
2. Fixing misleading error message in test (cosmetic)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
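
To illustrate the first fix listed above, here is a hedged sketch in which the producer never sends a nil error, so the consumer's select cannot pick errChan ahead of a pending failure message. `pump`, `watch`, and `errDeploymentFailed` are hypothetical names for illustration; only the channel names and the BUILD_FAILED state come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// errDeploymentFailed is a hypothetical stand-in for ErrDeploymentFailed.
var errDeploymentFailed = errors.New("deployment failed")

// pump forwards states and reports only a real (non-nil) stream error.
// A clean end of stream is signalled by closing msgChan, never by sending nil.
func pump(states []string, streamErr error, msgChan chan<- string, errChan chan<- error) {
	defer close(msgChan)
	for _, s := range states {
		msgChan <- s
	}
	if streamErr != nil {
		errChan <- streamErr
	}
}

// watch consumes both channels. On a clean end errChan stays empty, so the
// failure message is always observed before the function returns.
func watch(states []string, streamErr error) error {
	msgChan := make(chan string, len(states))
	errChan := make(chan error, 1)
	go pump(states, streamErr, msgChan, errChan)
	for {
		select {
		case s, ok := <-msgChan:
			if !ok {
				// Stream ended; surface a pending stream error if there is one.
				select {
				case err := <-errChan:
					return err
				default:
					return nil
				}
			}
			if s == "BUILD_FAILED" {
				return errDeploymentFailed
			}
		case err := <-errChan:
			return err
		}
	}
}

func main() {
	for i := 0; i < 1000; i++ {
		err := watch([]string{"BUILD_QUEUED", "BUILD_FAILED"}, nil)
		if !errors.Is(err, errDeploymentFailed) {
			panic("lost the failure state")
		}
	}
	fmt.Println("failure state observed on every run")
}
```
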
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.vscode/launch.json:
- Around line 14-20: The launch.json contains a developer-specific hardcoded
path in the "args" array and invalid JSON trailing commas; replace the absolute
path entry with a workspace-relative variable (e.g., use
${workspaceFolder}/samples/... in the "args" array where the hardcoded path
appears), remove trailing commas after the last array/object entries to produce
valid JSON, and either add .vscode/launch.json to .gitignore or move this config
to a workspace-specific settings file so other developers aren't forced to use
the same path.

In `@src/pkg/cli/subscribe.go`:
- Around line 55-88: When the receive loop sees serverStream.Receive() == false
and serverStream.Err() == nil (clean closure) it should explicitly notify the
main watcher instead of silently returning; update the goroutine that reads from
serverStream in WatchServiceState so that on clean closure you send a sentinel
(e.g., io.EOF or a dedicated sentinel error value) into errChan (using a
non-blocking send/select that respects ctx.Done()) rather than just returning,
and keep the existing behavior for non-nil errors from serverStream.Err();
reference serverStream.Receive(), serverStream.Err(), errChan, msgChan, and
caller functions WatchServiceState / Monitor / WaitForCdTaskExit /
cancelSvcStatus to locate and coordinate this change.
♻️ Duplicate comments (3)
src/cmd/cli/command/compose.go (1)

373-378: BUILD_FAILED state is still excluded from build log collection.

The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). This was flagged in a previous review and marked as addressed, but the fix doesn't appear in this code. Services that failed during build should be included to properly surface build logs on deployment failure.

🐛 Proposed fix
 var unbuiltServices = make([]string, 0, len(project.Services))
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }
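
The ordering problem behind this comment can be shown in isolation. The sketch below uses a hypothetical `serviceState` type rather than the generated `defangv1.ServiceState`; the only values taken from the review are BUILD_STOPPING = 6 and BUILD_FAILED = 11, and the other numbers are arbitrary placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// serviceState is a hypothetical stand-in for defangv1.ServiceState.
type serviceState int

const (
	buildStopping serviceState = 6  // value quoted in the review comment
	buildFailed   serviceState = 11 // value quoted in the review comment
)

// needsBuildLogs reports whether a service should be included when build
// logs are collected: anything still in a build phase, plus failed builds.
func needsBuildLogs(s serviceState) bool {
	return s <= buildStopping || s == buildFailed
}

func main() {
	// 3 and 9 are placeholder states on either side of BUILD_STOPPING.
	serviceStates := map[string]serviceState{"app": buildFailed, "worker": 3, "llm": 9}
	var unbuilt []string
	for name, state := range serviceStates {
		if needsBuildLogs(state) {
			unbuilt = append(unbuilt, name)
		}
	}
	sort.Strings(unbuilt) // map iteration order is random
	fmt.Println(unbuilt)  // prints [app worker]
}
```

A plain `<=` comparison only works while every "not yet built" state sorts below the threshold, which is why the failed state has to be named explicitly.
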
src/pkg/cli/tailAndMonitor.go (2)

18-21: Replace panic with error return for production code.

Using panic for input validation in production code can crash the entire application. Return an error instead to allow graceful handling.

🛡️ Proposed fix
 func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
 	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
+		return nil, errors.New("deploymentID must be a valid deployment ID")
 	}

59-63: Replace panic with error return for production code.

Similar to the Monitor function, using panic here can crash the application. Return an error instead.

🛡️ Proposed fix
 func TailAndMonitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, tailOptions TailOptions) (ServiceStates, error) {
 	tailOptions.Follow = true
 	if tailOptions.Deployment == "" {
-		panic("tailOptions.Deployment must be a valid deployment ID")
+		return nil, errors.New("tailOptions.Deployment must be a valid deployment ID")
 	}
🧹 Nitpick comments (3)
src/cmd/cli/command/compose.go (2)

71-74: Initial status string doesn't match actual ServiceState enum.

The hardcoded "DEPLOYMENT_QUEUED" doesn't correspond to any value in the ServiceState enum (which has BUILD_QUEUED, UPDATE_QUEUED, DEPLOYMENT_PENDING, etc.). Consider using defangv1.ServiceState_DEPLOYMENT_PENDING.String() or similar for consistency with the actual state values.

♻️ Proposed fix
 		services[name] = &serviceState{
-			status:  "DEPLOYMENT_QUEUED",
+			status:  defangv1.ServiceState_DEPLOYMENT_PENDING.String(),
 			spinner: s,
 		}

176-181: UI errors are silently discarded.

The error from p.Run() is ignored. While terminal UI errors may not be critical, logging them would help debugging display issues.

♻️ Proposed fix
 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debug("UI error:", err)
 		}
 	}()
src/pkg/cli/tailAndMonitor.go (1)

33-56: Consider removing redundant sleep or documenting its purpose.

Line 54 adds a 2-second sleep after wg.Wait(). Combined with the additional 2-second sleep at line 79 in TailAndMonitor, there's a total 4-second delay. If the intent is to capture final status messages, consider consolidating these sleeps or adding a comment explaining why both are necessary.

 	wg.Wait()
-	pkg.SleepWithContext(ctx, 2*time.Second)
+	pkg.SleepWithContext(ctx, 2*time.Second) // Allow time for final status messages to arrive

 	return serviceStates, errors.Join(cdErr, svcErr)

Comment on lines +14 to +20
"args": [
"-C",
"/Users/jordan/wk/samples/samples/html-css-js",
"deploy",
"-s",
"beta",
],
Contributor

⚠️ Potential issue | 🟡 Minor

Developer-specific configuration should not be committed.

This launch configuration contains a hardcoded user-specific path (/Users/jordan/wk/samples/...) that won't work for other developers. Additionally, the static analysis correctly identifies invalid JSON syntax - trailing commas on lines 20 and 25 are not valid in standard JSON.

Consider:

  1. Adding .vscode/launch.json to .gitignore
  2. Or using workspace-relative variables like ${workspaceFolder}/samples
🐛 Proposed fix for JSON syntax
       "args": [
         "-C",
-        "/Users/jordan/wk/samples/samples/html-css-js",
+        "${workspaceFolder}/samples/html-css-js",
         "deploy",
         "-s",
-        "beta",
+        "beta"
       ],
       "env": {
         // "GCP_PROJECT_ID": "jordan-project-463223"
         "DEFANG_PROVIDER": "aws",
-        "AWS_REGION": "us-west-2",
+        "AWS_REGION": "us-west-2"
       }
-    },
+    }
   ]
 }
🧰 Tools
🪛 Biome (2.1.2)

[error] 20-20: Expected an array, an object, or a literal but instead found ']'.

Expected an array, an object, or a literal here.

(parse)

Comment on lines +55 to +88
	go func() {
		for {
			if !serverStream.Receive() {
				// Reconnect on Error: internal: stream error: stream ID 5; INTERNAL_ERROR; received from peer
				if isTransientError(serverStream.Err()) {
					if err := provider.DelayBeforeRetry(ctx); err != nil {
						errChan <- err
						return
					}
					serverStream, err = provider.Subscribe(ctx, &subscribeRequest)
					if err != nil {
						errChan <- err
						return
					}
					continue
				}
				if err := serverStream.Err(); err != nil {
					errChan <- err
				}
				return
			}

			msg := serverStream.Msg()
			if msg == nil {
				continue
			}
			select {
			case msgChan <- msg:
			case <-ctx.Done():
				return
			}
		}
	}()
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how WatchServiceState is called and context management
cd /tmp && git ls-files src/pkg/cli/ | head -20

Repository: DefangLabs/defang

Length of output: 129


🏁 Script executed:

#!/bin/bash
# Get the full context of subscribe.go
wc -l src/pkg/cli/subscribe.go

Repository: DefangLabs/defang

Length of output: 89


🏁 Script executed:

#!/bin/bash
# Read the subscribe.go file to understand the full context
cat -n src/pkg/cli/subscribe.go | head -100

Repository: DefangLabs/defang

Length of output: 3400


🏁 Script executed:

#!/bin/bash
# Search for WatchServiceState usage
rg -nA5 'WatchServiceState\(' --type=go

Repository: DefangLabs/defang

Length of output: 1338


🏁 Script executed:

#!/bin/bash
# Read tailAndMonitor.go around line 42 to see context management
cat -n src/pkg/cli/tailAndMonitor.go | head -80

Repository: DefangLabs/defang

Length of output: 3255


🏁 Script executed:

#!/bin/bash
# Search for other calls to WatchServiceState
rg -nB3 'WatchServiceState\(' --type=go

Repository: DefangLabs/defang

Length of output: 958


🏁 Script executed:

#!/bin/bash
# Check how the main loop in subscribe.go ends - see the rest of the function
cat -n src/pkg/cli/subscribe.go | sed -n '90,150p'

Repository: DefangLabs/defang

Length of output: 2249


🏁 Script executed:

#!/bin/bash
# Search for calls to WaitServiceState
rg -nB2 -A2 'WaitServiceState\(' --type=go

Repository: DefangLabs/defang

Length of output: 1809


🏁 Script executed:

#!/bin/bash
# Search more broadly for WatchServiceState calls to ensure we have all callers
rg -n 'WatchServiceState\(' --type=go

Repository: DefangLabs/defang

Length of output: 448


🏁 Script executed:

#!/bin/bash
# Check what WaitForCdTaskExit does to understand context cancellation timing
rg -nA10 'func WaitForCdTaskExit' --type=go

Repository: DefangLabs/defang

Length of output: 799


Stream goroutine exits silently on clean stream closure, depending on external context cancellation.

When serverStream.Receive() returns false and serverStream.Err() returns nil (clean closure), the goroutine exits without signaling errChan. The main loop then depends on the parent context (passed into WatchServiceState) being cancelled to unblock.

In Monitor, this works as designed: WaitForCdTaskExit polls for deployment completion and calls cancelSvcStatus() when done, which cancels the context passed to WatchServiceState. However, this creates an implicit dependency where the receive loop's termination relies on external context management rather than on the stream state itself.

For robustness, consider signaling the main loop on clean stream closure (e.g., sending nil or a sentinel to errChan) instead of relying on the caller to always provide guaranteed context cancellation.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/pkg/cli/getServices_test.go (1)

196-213: Test expectation does not match implementation.

This test expects Endpoint: "N/A" for an endpoint without a port, but ServiceEndpointsFromServiceInfo would produce "https://service1" for input "service1". There's no "N/A" handling in the implementation.

This discrepancy won't surface until the missing Test prefix is fixed. Please verify the intended behavior and align either the test expectations or the implementation.

🤖 Fix all issues with AI agents
In `@src/pkg/cli/getServices_test.go`:
- Line 150: The test function ServiceEndpointFromServiceInfo is missing the
required Test prefix so the Go test runner skips it; rename the function to
start with Test (e.g., TestServiceEndpointFromServiceInfo) and keep the
signature func TestServiceEndpointFromServiceInfo(t *testing.T) so the test
harness discovers and executes the endpoint scenarios in getServices_test.go.
🧹 Nitpick comments (1)
src/pkg/cli/getServices.go (1)

182-190: Consider simplifying the return signature.

This function always returns nil for the error. If error handling isn't planned, consider removing the error return to simplify the API. Otherwise, this is fine as-is for future extensibility.

♻️ Optional simplification
-func ServiceEndpointsFromServiceInfos(serviceInfos []*defangv1.ServiceInfo) ([]ServiceEndpoint, error) {
+func ServiceEndpointsFromServiceInfos(serviceInfos []*defangv1.ServiceInfo) []ServiceEndpoint {
 	var serviceTableItems []ServiceEndpoint
 
 	for _, serviceInfo := range serviceInfos {
 		serviceTableItems = append(serviceTableItems, ServiceEndpointsFromServiceInfo(serviceInfo)...)
 	}
 
-	return serviceTableItems, nil
+	return serviceTableItems
 }

This would require updating the caller at line 70 in GetServices.

@lionello changed the title from "Improve signal-to-noise ratio during deployment" to "Improve signal-to-noise ratio during deployment (TUI)" on Jan 19, 2026
Previously, partially completed deployments appeared like this:

```
 * Live tail logs with `defang tail --deployment=6k69je47hxiy`
⢿  [app] DEPLOYMENT_PENDING
✓ [embedding] DEPLOYMENT_COMPLETED
✓ [llm] DEPLOYMENT_COMPLETED
⢿  [postgres] DEPLOYMENT_QUEUED
⢿  [redis] DEPLOYMENT_QUEUED
✓ [worker] DEPLOYMENT_COMPLETED
```
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/pkg/cli/getServices.go`:
- Around line 152-168: ServiceEndpointsFromServiceInfo currently only prepends
"https://" when no port is present, leaving ported endpoints like "service1:80"
unschemed; update ServiceEndpointsFromServiceInfo to detect host:port endpoints
(e.g., via net.SplitHostPort result) and prepend "http://" for those cases so
Endpoint values become "http://service:80" (ensuring healthcheck logic that
checks for absence of "https://" works), and also rename the test function
ServiceEndpointFromServiceInfo to TestServiceEndpointFromServiceInfo so the test
actually runs and catches this mismatch.
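
The scheme selection described here could look roughly like the sketch below. `withScheme` is a hypothetical helper, not the body of `ServiceEndpointsFromServiceInfo`, and treating every host:port endpoint as plain HTTP is an assumption carried over from the comment.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// withScheme prefixes an endpoint with a scheme: "http://" when the endpoint
// is a host:port pair, "https://" when it is a bare host name.
func withScheme(endpoint string) string {
	if strings.Contains(endpoint, "://") {
		return endpoint // already has a scheme
	}
	// net.SplitHostPort returns an error when no port is present.
	if _, _, err := net.SplitHostPort(endpoint); err == nil {
		return "http://" + endpoint
	}
	return "https://" + endpoint
}

func main() {
	fmt.Println(withScheme("service1:80")) // prints "http://service1:80"
	fmt.Println(withScheme("example.com")) // prints "https://example.com"
}
```
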
♻️ Duplicate comments (2)
src/cmd/cli/command/compose.go (1)

380-385: Include BUILD_FAILED state when collecting services for build logs.

The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). Services that failed during build won't have their build logs displayed when deployment fails.

🐛 Proposed fix
 var unbuiltServices = make([]string, 0, len(project.Services))
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }
src/pkg/cli/getServices_test.go (1)

150-150: Missing Test prefix — test function will not execute.

The function is named ServiceEndpointFromServiceInfo but Go's test framework requires functions to start with Test to be recognized and executed. This entire test function will be silently skipped.

Additionally, the test expectations don't match the current implementation (e.g., expecting "N/A" or "http://service1:80" for endpoints that the code produces differently), which would be caught once the test runs.

🐛 Proposed fix
-func ServiceEndpointFromServiceInfo(t *testing.T) {
+func TestServiceEndpointFromServiceInfo(t *testing.T) {
🧹 Nitpick comments (2)
src/cmd/cli/command/compose.go (1)

176-181: Consider logging UI errors for debugging.

UI errors are silently discarded. While rare, they could make debugging difficult if issues arise.

♻️ Optional: Log UI errors
 go func() {
 	defer wg.Done()
 	if _, err := p.Run(); err != nil {
-		// Handle UI errors if needed
+		term.Debugf("UI error: %v", err)
 	}
 }()
src/pkg/cli/getServices.go (1)

225-235: Input slice mutation may be unexpected by callers.

The function modifies serviceEndpoints in-place, clearing Service, Deployment, and State fields for duplicate rows. While current callers don't reuse the data, this side effect could cause subtle bugs if callers expect unchanged data.

Consider documenting this behavior or working on a copy if this becomes a concern.
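
If working on a copy is preferred, a sketch along these lines would avoid the side effect. `serviceEndpoint` is a hypothetical two-field type standing in for the real struct, and `blankRepeatedServices` is an assumed name; `slices.Clone` requires Go 1.21 or later.

```go
package main

import (
	"fmt"
	"slices"
)

// serviceEndpoint is a hypothetical, trimmed-down row type.
type serviceEndpoint struct {
	Service  string
	Endpoint string
}

// blankRepeatedServices returns a copy in which the Service column is
// cleared on rows that repeat the previous row's service. The input slice
// is left untouched.
func blankRepeatedServices(in []serviceEndpoint) []serviceEndpoint {
	out := slices.Clone(in)
	for i := 1; i < len(out); i++ {
		// Compare against the original rows, since out[i-1] may already be blanked.
		if in[i].Service == in[i-1].Service {
			out[i].Service = ""
		}
	}
	return out
}

func main() {
	rows := []serviceEndpoint{
		{"app", "https://a.example"},
		{"app", "https://b.example"},
		{"web", "https://c.example"},
	}
	out := blankRepeatedServices(rows)
	fmt.Printf("%q %q\n", rows[1].Service, out[1].Service) // prints "app" ""
}
```
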

@jordanstephens force-pushed the jordan/deployment-status-updates branch from 1d6019e to 07e4a61 on January 19, 2026 20:28
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/cmd/cli/command/compose.go`:
- Around line 187-198: The code currently only routes to TailAndMonitor when
global.Verbose or global.NonInteractive are true; update the conditional to also
check the CLI --tail flag so interactive users can opt into live tailing. Modify
the if that decides between TailAndMonitor and MonitorWithUI to include the tail
flag (the parsed boolean flag for "--tail") alongside
global.Verbose/global.NonInteractive, and when true set tailOptions.Follow =
true and call cli.TailAndMonitor(ctx, project, session.Provider,
waitTimeoutDuration, tailOptions) (handling err as already done); otherwise call
cli.MonitorWithUI(ctx, project, session.Provider, waitTimeoutDuration,
deploy.Etag). Ensure you reference and use the existing symbols tailOptions,
TailAndMonitor, MonitorWithUI, global.Verbose, global.NonInteractive, and
deploy.Etag.
♻️ Duplicate comments (1)
src/cmd/cli/command/compose.go (1)

205-208: Include BUILD_FAILED when collecting unbuilt services for build logs.

The check state <= BUILD_STOPPING excludes BUILD_FAILED, so services that already failed during build won’t be captured for build-log tailing.

🐛 Proposed fix
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }

Comment on lines +187 to 198
	waitTimeoutDuration := time.Duration(waitTimeout) * time.Second
	var serviceStates map[string]defangv1.ServiceState
	if global.Verbose || global.NonInteractive {
		tailOptions.Follow = true
		serviceStates, err = cli.TailAndMonitor(ctx, project, session.Provider, waitTimeoutDuration, tailOptions)
		if err != nil {
			return err
		}
	} else {
		term.Info("Live tail logs with `defang tail --deployment=" + deploy.Etag + "`")
		serviceStates, err = cli.MonitorWithUI(ctx, project, session.Provider, waitTimeoutDuration, deploy.Etag)
	}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -nP '\bGetBool\("tail"\)' --type=go

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

rg -nA5 -nB5 'tail.*Flag\|Flag.*tail' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 483


🏁 Script executed:

grep -n "tail" src/cmd/cli/command/compose.go | head -20

Repository: DefangLabs/defang

Length of output: 1694


🏁 Script executed:

sed -n '170,260p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 3581


🏁 Script executed:

sed -n '1,50p' src/cmd/cli/command/compose.go | grep -A20 "func.*compose"

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

sed -n '130,180p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 1764


🏁 Script executed:

sed -n '100,140p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 1645


🏁 Script executed:

sed -n '50,100p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 1582


🏁 Script executed:

rg -n "composeUpCmd.*&cobra.Command" src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 96


🏁 Script executed:

sed -n '60,75p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 432


🏁 Script executed:

sed -n '60,68p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 293


Enable the --tail flag to opt into live log tailing in interactive mode.

The --tail flag is currently defined as a no-op but never checked in the code, so defang compose up --tail behaves like the UI path regardless of the flag. Per the PR objective, this should enable live-tail behavior. The condition at lines 189–198 gates only on global.Verbose / global.NonInteractive, so add a check for the --tail flag and route to TailAndMonitor when it's set.

✅ Suggested update
 waitTimeoutDuration := time.Duration(waitTimeout) * time.Second
 var serviceStates map[string]defangv1.ServiceState
-if global.Verbose || global.NonInteractive {
+tail, _ := cmd.Flags().GetBool("tail")
+if global.Verbose || global.NonInteractive || tail {
 	tailOptions.Follow = true
 	serviceStates, err = cli.TailAndMonitor(ctx, project, session.Provider, waitTimeoutDuration, tailOptions)
 	if err != nil {
 		return err
 	}
 } else {
 	term.Info("Live tail logs with `defang tail --deployment=" + deploy.Etag + "`")
 	serviceStates, err = cli.MonitorWithUI(ctx, project, session.Provider, waitTimeoutDuration, deploy.Etag)
 }

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
"permissions": {
"allow": [
"Bash(go test:*)",
"Bash(make:*)",
Member

this is potentially destructive, so perhaps we shouldn't include it here

"Bash(make:*)",
"Bash(golangci-lint run:*)",
"Bash(go build:*)",
"Bash(find:*)",
Member

also potentially destructive because it has -delete and -exec

 version = "git";
 src = lib.cleanSource ../../src;
-vendorHash = "sha256-A+DwCvfNUKY8TjxyAe+abiT9xIyy5p7VIh5T5ofeZIg="; # TODO: use fetchFromGitHub
+vendorHash = "sha256-saGEuoB8Eeh/4SASPeyxW/xWArC7+oW88wUg6EJ39Fc="; # TODO: use fetchFromGitHub
Member

The comment was false:

Suggested change
-vendorHash = "sha256-saGEuoB8Eeh/4SASPeyxW/xWArC7+oW88wUg6EJ39Fc="; # TODO: use fetchFromGitHub
+vendorHash = "sha256-saGEuoB8Eeh/4SASPeyxW/xWArC7+oW88wUg6EJ39Fc=";

return err
}
} else {
term.Info("Live tail logs with `defang tail --deployment=" + deploy.Etag + "`")
Member

Suggested change
-term.Info("Live tail logs with `defang tail --deployment=" + deploy.Etag + "`")
+term.Infof("See live tail logs with `defang tail %v`", tailOptions)

to ensure the command line works as-is.

Deployment: deploy.Etag,
LogType: logs.LogTypeAll,
Since: since,
Verbose: true,
Member

Suggested change
-Verbose: true,
+Verbose: true,
+Stack: session.Stack.Name,

return strings.Compare(a.Endpoint, b.Endpoint)
})

// to reduce noise, print empty "Service", "Deployment", and "State" columns
Member

I've gone down this path before and I didn't think it was worth it. It makes scripting impossible. Sure, there's --json and all that, but folks will script against whatever output they currently get. I wouldn't do this.

// Reconnect on Error: internal: stream error: stream ID 5; INTERNAL_ERROR; received from peer
if isTransientError(serverStream.Err()) {
if err := provider.DelayBeforeRetry(ctx); err != nil {
errChan <- err
Member

Potentially blocking write. You might need a select here. I'm not 100% sure, because you do create the channel with a buffer of 1.

}
serverStream, err = provider.Subscribe(ctx, &subscribeRequest)
if err != nil {
errChan <- err
Member

ditto

if err != nil {
return serviceStates, err
if err := serverStream.Err(); err != nil {
errChan <- err
Member

ditto

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/pkg/cli/getServices_test.go (2)

174-194: Test case expectations don't match function behavior — root cause of pipeline failure.

This test case has 2 items in Endpoints plus a Domainname, but only expects 1 result. Per ServiceEndpointsFromServiceInfo logic, the function produces one endpoint per Endpoints entry plus an additional endpoint for the Domainname. This should yield 3 endpoints, not 1.

This mismatch causes the panic at line 296 when the loop tries to access expectedServiceEndpoints[1] which doesn't exist.

🐛 Proposed fix
 		{
 			name: "Service with Domainname",
 			serviceinfo: &defangv1.ServiceInfo{
 				Service: &defangv1.Service{
 					Name: "service1",
 				},
 				Status:     "UNKNOWN",
 				Domainname: "example.com",
 				Endpoints: []string{
 					"example.com",
 					"service1.internal:80",
 				},
 			},
 			expectedServiceEndpoints: []ServiceEndpoint{
 				{
 					Service:  "service1",
 					Status:   "UNKNOWN",
 					Endpoint: "https://example.com",
 				},
+				{
+					Service:  "service1",
+					Status:   "UNKNOWN",
+					Endpoint: "http://service1.internal:80",
+				},
+				{
+					Service:  "service1",
+					Status:   "UNKNOWN",
+					Endpoint: "https://example.com",
+				},
 			},
 		},

Alternatively, if the intent is to test filtering/deduplication logic, verify that ServiceEndpointsFromServiceInfo actually implements that behavior and adjust the test case inputs accordingly.


292-303: Add length assertion to prevent panic and provide clearer failure message.

The loop iterates over serviceEndpoints (actual) but indexes into expectedServiceEndpoints. When actual has more items than expected, this causes a panic instead of a clear test failure. Add a length check first.

♻️ Proposed fix
 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
 			serviceEndpoints := ServiceEndpointsFromServiceInfo(tt.serviceinfo)
+			require.Len(t, serviceEndpoints, len(tt.expectedServiceEndpoints), "endpoint count mismatch")
 			for i, endpoint := range serviceEndpoints {
 				assert.Equal(t, tt.expectedServiceEndpoints[i].Service, endpoint.Service)
 				assert.Equal(t, tt.expectedServiceEndpoints[i].Status, endpoint.Status)
 				assert.Equal(t, tt.expectedServiceEndpoints[i].Endpoint, endpoint.Endpoint)
 				assert.Equal(t, tt.expectedServiceEndpoints[i].AcmeCertUsed, endpoint.AcmeCertUsed)
 			}
 		})
 	}

@jordanstephens added the "wip" label (Work in progress; don't merge) on Jan 20, 2026

Labels

wip Work in progress; don't merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Hide logs during compose up unless there is a problem

2 participants