fix: address bugs found in a multi-agent codebase audit#481
Conversation
Five parallel review agents each swept a different area (self-update, TUI keybindings, agent/session handling, sandbox/shell safety, and auth/credential storage) and surfaced nine concrete defects, all fixed here with regression tests: - sandbox: grant_scope path comparisons were case-sensitive, letting a persisted deny grant be bypassed by spelling a Windows path with different case - securefile/oauth: createSecretFile's Windows ERROR_ACCESS_DENIED handling was only applied to the oauth token store, not the shared credstore backend; both also masked the real lock-contention error behind a misleading "file does not exist" timeout - cli: firstUsableProvider ignored OAuth logins when picking a fallback provider, forcing unnecessary re-onboarding for an already-authenticated user - sessions: exec prompt summarization truncated on a raw byte offset, which could split a multi-byte UTF-8 rune and embed invalid UTF-8 into a resumed session's prompt - tui: configurable keybindings had no collision detection, so a remapped chord could silently make a hardcoded shortcut (or another configured binding) permanently unreachable - tools: the Windows shell-syntax pre-flight check wasn't quote-aware and could block harmless commands whose quoted arguments merely contained operator-like text - update: checksum verification didn't cross-check the archive filename it verified against the one being extracted; extractZip didn't reject symlink-mode entries like extractTarGz already did; replaceBinary's restore-on-failure rename had no retry, risking a permanently missing binary on a transient Windows file lock
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughThis PR fixes provider fallback, lock contention, path matching, truncation, shell detection, keybinding collisions, and update safety checks across CLI, auth, sandbox, sessions, TUI, and updater code, with regression tests added for each change. ChangesMulti-area audit bug fixes
Estimated code review effort: 4 (Complex) | ~60 minutes Possibly related PRs
Suggested reviewers: 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/cli/setup.go (1)
335-351: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick winDon't treat store errors as “no OAuth logins.”
oauthLoggedInProviders()returns an empty set on anyNewStore/Statuserror, so a temporary backend failure makes bothsetupRequired()andfirstUsableProvider()fall back as if the user had no OAuth login. Logging that error at debug/warn would distinguish “empty store” from “store unavailable” and avoid silently re-triggering onboarding.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/cli/setup.go` around lines 335 - 351, The oauthLoggedInProviders() helper is swallowing oauth.NewStore and store.Status failures by returning an empty map, which makes setupRequired() and firstUsableProvider() misread backend outages as “no OAuth logins.” Update oauthLoggedInProviders() to log the error with context (debug or warn) before returning, using the oauth.NewStore and Status call sites as the key locations, so callers can distinguish an empty store from an unavailable one.
🧹 Nitpick comments (4)
internal/sessions/exec_session.go (1)
224-241: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDuplicate of existing rune-boundary truncation helper.
truncateUTF8reimplementscutPromptRuneBoundaryininternal/sessions/replay.go, which does the exact same byte-truncation-with-rune-backoff logic. Reuse the existing helper instead of adding a near-identical one in the same package.♻️ Proposed consolidation
- if len(text) > 500 { - return truncateUTF8(text, 500) - } - return text -} - -// truncateUTF8 returns the longest prefix of s that is at most n bytes, -// backing off to the nearest rune boundary so a multi-byte character isn't -// split — a split rune here would embed invalid UTF-8 into the exec prompt. -func truncateUTF8(s string, n int) string { - if len(s) <= n { - return s - } - for n > 0 && !utf8.RuneStart(s[n]) { - n-- - } - return s[:n] -} + if len(text) > 500 { + return cutPromptRuneBoundary(text, 500) + } + return text +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/sessions/exec_session.go` around lines 224 - 241, The new truncateUTF8 helper duplicates the existing rune-boundary truncation logic already provided by cutPromptRuneBoundary in the same package. Replace the local truncation logic in exec_session.go with a call to cutPromptRuneBoundary so there is only one implementation of byte truncation with rune backoff. Keep the behavior identical for long and short strings, and remove the redundant truncateUTF8 helper if it is no longer needed.internal/tools/shell_runtime.go (1)
80-104: 🎯 Functional Correctness | 🔵 Trivial | 💤 Low valueBackslash-escaped quotes aren't handled — narrow edge case.
A span like
echo "foo \" ls -la"treats the backslash-escaped\"as a real closing quote, sols -laafter it would be unmasked and could still be (mis)matched. This mirrors typical POSIX escaping, not cmd.exe's""-doubling convention, so real-world impact on Windows commands is low, but worth a mental note if this masking logic gets reused elsewhere.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/tools/shell_runtime.go` around lines 80 - 104, stripQuotedSpans currently treats every quote byte as closing, so escaped quotes inside a quoted span can prematurely end masking. Update stripQuotedSpans in shell_runtime.go to recognize backslash-escaped quotes while inside single/double quotes and keep them masked instead of toggling quote state; preserve the existing behavior for normal quote boundaries and the current length-preserving space replacement.internal/cli/setup_fallback_test.go (1)
72-96: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winStrengthen the regression test with a before/after assertion.
The test only checks the state after seeding the OAuth token. Add an assertion beforehand (e.g.
okis false, or a different provider is chosen) to prove the token — not some unrelated path — is what makesxaiselectable; otherwise this could keep "passing" even if the OAuth-login check regresses.🧪 Proposed strengthening
providers := []config.ProviderProfile{ {Name: "xai", CatalogID: "xai", APIKeyEnv: "XAI_API_KEY"}, // no inline key/env, but logged in via OAuth } + if _, ok := firstUsableProvider(providers); ok { + t.Fatalf("want no usable provider before OAuth login is seeded") + } got, ok := firstUsableProvider(providers) if !ok || got.Name != "xai" { t.Fatalf("want OAuth-logged-in provider (xai), got %q ok=%v", got.Name, ok) }Note: move this check above
store.Save, not after, to test the pre-seed state.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/cli/setup_fallback_test.go` around lines 72 - 96, Strengthen TestFirstUsableProviderRecognizesOAuthLogin by adding a pre-seed assertion before oauth.Store.Save so the test proves the OAuth token changes the outcome. Verify firstUsableProvider with the same providers slice returns no usable provider, or a different result, before seeding the token, then keep the existing post-seed assertion that xai becomes selectable. Use the existing symbols firstUsableProvider, oauth.NewStore, and store.Save to anchor the before/after check.internal/update/replace_windows.go (1)
11-13: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winRetry window may be too short for typical AV/indexer lock durations.
10 attempts × 100ms caps the restore retry at ~1s total. Given the stated motivation (antivirus/indexer scanning holding a transient lock), real-world lock durations can plausibly exceed 1s, which would mean the retry rarely helps in the exact scenario it was built for. Consider a longer total window (e.g., more attempts or a slightly longer/backoff delay) since the cost of retrying is only paid on the already-slow error path.
Also applies to: 39-50
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/update/replace_windows.go` around lines 11 - 13, The restore rename retry window in replace_windows.go is too short for transient AV/indexer locks. Update the retry constants used by the restore rename path (restoreRenameRetryAttempts and restoreRenameRetryDelay) to allow a longer total wait, ideally with a modest backoff or more attempts, and make sure the retry loop that uses these constants in the restore/rename logic covers the intended transient-lock scenario.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@internal/cli/setup.go`:
- Around line 335-351: The oauthLoggedInProviders() helper is swallowing
oauth.NewStore and store.Status failures by returning an empty map, which makes
setupRequired() and firstUsableProvider() misread backend outages as “no OAuth
logins.” Update oauthLoggedInProviders() to log the error with context (debug or
warn) before returning, using the oauth.NewStore and Status call sites as the
key locations, so callers can distinguish an empty store from an unavailable
one.
---
Nitpick comments:
In `@internal/cli/setup_fallback_test.go`:
- Around line 72-96: Strengthen TestFirstUsableProviderRecognizesOAuthLogin by
adding a pre-seed assertion before oauth.Store.Save so the test proves the OAuth
token changes the outcome. Verify firstUsableProvider with the same providers
slice returns no usable provider, or a different result, before seeding the
token, then keep the existing post-seed assertion that xai becomes selectable.
Use the existing symbols firstUsableProvider, oauth.NewStore, and store.Save to
anchor the before/after check.
In `@internal/sessions/exec_session.go`:
- Around line 224-241: The new truncateUTF8 helper duplicates the existing
rune-boundary truncation logic already provided by cutPromptRuneBoundary in the
same package. Replace the local truncation logic in exec_session.go with a call
to cutPromptRuneBoundary so there is only one implementation of byte truncation
with rune backoff. Keep the behavior identical for long and short strings, and
remove the redundant truncateUTF8 helper if it is no longer needed.
In `@internal/tools/shell_runtime.go`:
- Around line 80-104: stripQuotedSpans currently treats every quote byte as
closing, so escaped quotes inside a quoted span can prematurely end masking.
Update stripQuotedSpans in shell_runtime.go to recognize backslash-escaped
quotes while inside single/double quotes and keep them masked instead of
toggling quote state; preserve the existing behavior for normal quote boundaries
and the current length-preserving space replacement.
In `@internal/update/replace_windows.go`:
- Around line 11-13: The restore rename retry window in replace_windows.go is
too short for transient AV/indexer locks. Update the retry constants used by the
restore rename path (restoreRenameRetryAttempts and restoreRenameRetryDelay) to
allow a longer total wait, ideally with a modest backoff or more attempts, and
make sure the retry loop that uses these constants in the restore/rename logic
covers the intended transient-lock scenario.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a5013253-c52d-49b3-a102-974530d4d57d
📒 Files selected for processing (19)
internal/cli/setup.gointernal/cli/setup_fallback_test.gointernal/oauth/encrypt.gointernal/sandbox/grant_scope.gointernal/sandbox/grant_scope_test.gointernal/securefile/securefile.gointernal/sessions/exec_session.gointernal/sessions/store_test.gointernal/tools/bash_tool_test.gointernal/tools/shell_runtime.gointernal/tui/binding_test.gointernal/tui/keybindings.gointernal/tui/model.gointernal/update/apply.gointernal/update/apply_test.gointernal/update/extract.gointernal/update/extract_test.gointernal/update/replace_windows.gointernal/update/replace_windows_test.go
Vasanthdev2004
left a comment
There was a problem hiding this comment.
Approve — all 9 audit fixes verified real and correctly fixed; two notes + a merge-order
I put this through a 9-way independent audit (one reviewer per claimed bug) plus an adversarial verify pass on each, ground-truthing every fix against the pre-fix code on main. All 9 are real defects with correct, net-improving fixes — nothing fabricated, and the skeptic pass agreed keep on all nine. Regression tests present for 8/9 (the securefile one rides the existing concurrency test). Genuinely strong work, @PierrunoYT — thanks.
Highlights I confirmed by tracing the actual code:
- grant-scope case bypass (the important one): real. Pre-fix
grantCoversused raw==/HasPrefix; on case-insensitive Windows a persisted specific-path deny could be slipped by re-casing the path (C:\proj\secret.txtvsc:\…) into a broader allow. The fix routes through the existingfilepath.Rel-basedpathWithinRoot(folds case viaEqualFoldon Windows, byte-identical on Unix) — reuses the codebase's established helper rather than inventing logic. Correct. - securefile lock: real — #445's Windows
ERROR_ACCESS_DENIEDretry was only in the oauth store, not the shared credstore backend. The fix faithfully mirrors #445 (doesn't revert it) and stops masking contention behind a misleading "file does not exist" timeout.
Two non-blocking things worth folding in:
- Reclassify two of the "critical" update fixes.
update-checksum-filenameandupdate-zip-symlinkare real code gaps and the fixes are correct, but they're defense-in-depth, not live vulns — both fail safe onmaintoday (noos.Symlinkis ever materialized,safeExtractPathalready blocks name traversal, extraction is SHA256-gated, and the checksum-filename path force-aborts viaassertSafeChecksumFileName). Worth landing — just soften the "critical/supply-chain" framing so they aren't read as fixing an exploitable hole. - The keybinding-collision fix has a coverage gap (follow-up, not a blocker).
reservedBindingsprotectsctrl+c/esc/enter/shift+tab/ctrl+f/?but omits the bare-chord hardcoded handlers —tab,backspace,up/down,pgup/pgdown. Those are config-settable and matched withMod==0, so binding e.g.toggleDetailed="tab"would still silently shadow Tab navigation (="up"kills history recall) — the exact class this fix closes, uncaught and unwarned. The fix still strictly improves onmainand covers the critical keys, so ship it; just file a follow-up to extend the reserved set.
Merge order — land this after the euxaristia shell cluster (#476 → #468 → #465). #481's shell-preflight-quotes hunk makes detection quote-aware by running the patterns against masked := stripQuotedSpans(trimmed) instead of raw trimmed. Open #476/#468 are actively reworking those very detectors in shell_runtime.go. If #481 rebases last, the masked wrapper cleanly covers whatever final detector set the cluster produces — with the explicit rule that any new detector pattern must match against masked, not trimmed, or quoted content silently gets re-flagged. #481's other 8 fixes are in disjoint files and can land anytime. (If the cluster stalls, land #481 now and make the cluster own the shell_runtime.go reconciliation with that same masked requirement.)
Approving on the merits — CI's green on all 7 checks and every fix holds up.
jatmn
left a comment
There was a problem hiding this comment.
I found an issue that needs to be addressed before this is ready.
Findings
-
[P2] Include default and bare-key shortcuts in keybinding collision checks
internal/tui/keybindings.go:274
The new sanitizer only compares explicitly configured bindings againstreservedBindingsand the other non-zero configured bindings, so it still misses collisions with an action's default shortcut and with hardcoded bare-key handlers that are not inreservedBindings. For example,toggleDetailed = "ctrl+t"is accepted with no warning even though the dispatch switch checkstoggleDetailedbefore the defaultcycleReasoningCtrl+Thandler, making cycle-reasoning unreachable; similarlytoggleDetailed = "tab"is accepted and shadows the later Tab navigation/prompt handling. Please model the default bindings and the remaining hardcoded bare-key shortcuts in the collision set so the class of silent shortcut shadowing this PR is fixing is actually closed. -
[P2] Add regression coverage for the securefile lock-contention fix
internal/securefile/securefile.go:126
This patch changes the encrypted-file credential backend to treat Windowsos.ErrPermissionfrom the secret lock as contention, but the PR does not add or update asecurefile/credstoreregression test for that path; only the implementation file changes. The existing OAuth concurrency test exercises the separate OAuth copy of this logic, not the sharedsecurefilebackend that stores provider API keys, so this Windows-only credential-storage fix can regress while the new test coverage still passes. Please add focused coverage for the securefile/credstore branch, or factor the lock acquisition so theos.ErrPermissioncontention behavior is testable without relying on a real Windows delete-pending race.
There was a problem hiding this comment.
Pull request overview
This pull request fixes nine concrete defects identified in a multi-area audit (update flow, sandbox grants, OAuth/credential storage, sessions, TUI keybindings, and Windows shell preflight checks), and adds targeted regression tests to prevent reintroductions.
Changes:
- Hardens the self-update path (checksum filename cross-checking, zip entry type validation, and safer Windows binary replacement behavior).
- Fixes Windows-specific correctness/security issues (case-insensitive persisted sandbox grants; consistent lock-contention handling for encrypted secret stores).
- Improves UX/correctness in sessions/tools/TUI (UTF-8-safe prompt summarization truncation, quote-aware Windows shell syntax detection, and keybinding collision detection with user-facing warnings).
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/update/replace_windows.go | Adds restore rename retry logic for Windows binary replacement. |
| internal/update/replace_windows_test.go | Adds Windows-only tests for replaceBinary and renameWithRetry. |
| internal/update/extract.go | Rejects non-regular zip entries to avoid extracting special types (e.g., symlinks). |
| internal/update/extract_test.go | Adds a regression test ensuring symlink-mode zip entries are rejected. |
| internal/update/apply.go | Adds verifyArchiveChecksum to cross-check checksum filename vs downloaded archive. |
| internal/update/apply_test.go | Adds a regression test for checksum filename mismatch rejection. |
| internal/tui/model.go | Wires in keybinding sanitization and surfaces collision warnings as notices. |
| internal/tui/keybindings.go | Implements reserved/collision detection for configurable keybindings. |
| internal/tui/binding_test.go | Adds coverage for reserved-binding and mutual-collision sanitization behavior. |
| internal/tools/shell_runtime.go | Makes Windows shell-syntax preflight checks ignore operator-like text inside quotes. |
| internal/tools/bash_tool_test.go | Adds regression tests covering quoted-content false positives. |
| internal/sessions/store_test.go | Adds regression coverage for UTF-8-safe truncation behavior. |
| internal/sessions/exec_session.go | Introduces UTF-8 rune-boundary truncation for exec prompt summarization. |
| internal/securefile/securefile.go | Treats Windows ErrPermission as lock contention and preserves correct timeout error cause. |
| internal/securefile/securefile_test.go | Adds regression test for permission-contention retry behavior. |
| internal/sandbox/grant_scope.go | Makes file/dir grant comparisons case-insensitive on Windows via filepath.Rel helpers. |
| internal/sandbox/grant_scope_test.go | Adds a Windows-only regression test for case-insensitive grant coverage. |
| internal/oauth/encrypt.go | Preserves the lock-creation error as the timeout cause to avoid misleading diagnostics. |
| internal/cli/setup.go | Extends provider fallback selection to treat stored OAuth logins as usable credentials. |
| internal/cli/setup_fallback_test.go | Adds regression test ensuring OAuth-only providers can be selected as fallback. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for attempt := 0; attempt < restoreRenameRetryAttempts; attempt++ { | ||
| if err := os.Rename(oldPath, newPath); err == nil { | ||
| return nil | ||
| } else { | ||
| lastErr = err | ||
| } | ||
| time.Sleep(restoreRenameRetryDelay) | ||
| } |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Summary
Fixes #480. Five parallel review passes swept different areas of the codebase (self-update, TUI keybindings, agent/session handling, sandbox/shell safety, and auth/credential storage) and surfaced nine concrete defects, all fixed here with regression tests.
grant_scopepath comparisons were case-sensitive, letting a persisted deny grant be bypassed by spelling a Windows path with different casecreateSecretFile's WindowsERROR_ACCESS_DENIEDhandling was only applied to the oauth token store, not the shared credstore backend; both also masked the real lock-contention error behind a misleading "file does not exist" timeoutfirstUsableProviderignored OAuth logins when picking a fallback provider, forcing unnecessary re-onboarding for an already-authenticated userextractZipdidn't reject symlink-mode entries likeextractTarGzalready did;replaceBinary's restore-on-failure rename had no retry, risking a permanently missing binary on a transient Windows file lockTest plan
go build ./...andgo vet ./...clean across the reposandbox,oauth,securefile,credstore,cli,sessions,tui,tools,update)tuitest failure (TestProviderWizardAdvancesProviderAPIKeyAndModelSteps) confirmed to fail identically on unmodifiedmainSummary by CodeRabbit