feat(web): binary file attachments for Ask#1375
Walkthrough
Adds persisted chat attachments with upload, commit, serving, preview, and cleanup flows. The change also threads image-size limits through the app, updates model capability handling, and adds background pruning for unlinked attachments.
Changes: Chat Image Attachment Feature
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
@coderabbitai review
✅ Action performed: Review finished.
Actionable comments posted: 17
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/backend/src/attachmentPruner.ts`:
- Around line 38-40: The startup call to pruneOrphanedAttachments() is firing
without any await or error handling, which can surface as an unhandled rejection
and crash the worker during initialization. Update the startup path in
attachmentPruner so the first prune is either awaited from an async
initialization flow or wrapped with explicit catch/log handling, while keeping
the recurring setIntervalAsync scheduling intact.
- Around line 55-80: The orphan cleanup in attachmentPruner’s batch loop deletes
files based only on the initial findMany result, so a PENDING attachment can be
unlinked even if it becomes linked or non-orphaned before deletion. Re-check
each attachment’s current state in the same batch before calling unlink, ideally
by verifying the row still matches the orphan criteria in this method before
file removal and before deleteMany. Use the existing attachmentPruner loop, the
db.attachment queries, and the unlink call to keep only still-orphaned
attachments eligible for byte deletion.
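The first finding for `attachmentPruner.ts` (an un-awaited startup prune that can surface as an unhandled rejection) can be illustrated with a minimal sketch. `runInitialPrune` and the `PruneLogger` shape are hypothetical names introduced here, not identifiers from the PR; the recurring `setIntervalAsync` scheduling is deliberately left out.

```typescript
// Hypothetical logger shape; the real pruner uses its own logger.
type PruneLogger = { error: (message: string) => void };

// Hypothetical helper: runs the first prune and converts any failure into a
// log entry, so the caller can fire it at startup without risking an
// unhandled rejection.
async function runInitialPrune(
    prune: () => Promise<void>,
    logger: PruneLogger,
): Promise<void> {
    try {
        await prune();
    } catch (error) {
        logger.error(`Initial attachment prune failed: ${String(error)}`);
    }
}
```

A startup path could call `void runInitialPrune(...)` and then register the interval as before; because the returned promise never rejects, discarding it is safe.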
In
`@packages/db/prisma/migrations/20260627000032_add_chat_attachments/migration.sql`:
- Around line 42-43: The Attachment foreign key on uploadedById currently
cascades deletes from User, which causes committed attachments and their
ChatAttachment links to disappear when an uploader is removed. Update the
migration and the corresponding Prisma model for Attachment/uploadedById so
deletion of a User does not delete Attachment rows; use a non-cascading delete
behavior that preserves historical attachments for existing chats.
In
`@packages/web/src/app/api/`(server)/ee/chat/[chatId]/attachments/[attachmentId]/route.ts:
- Line 101: The attachment response in the route handler for chat attachments is
being cached too aggressively via the Cache-Control header. Update the
response-building logic in the attachment route so access-controlled content is
not reused for an hour by the browser; use a no-store/no-cache style policy (or
equivalent short-lived revalidation) for the attachment fetch path, especially
where the route checks authorization before returning the file.
In `@packages/web/src/app/api/`(server)/ee/chat/attachments/route.ts:
- Around line 18-24: The attachment size check in the chat upload route is still
bypassable because it only relies on content-length before calling
req.formData() and file.arrayBuffer() in the route handler. Update the
attachment handling flow in the route’s upload path to enforce an authoritative
limit after parsing by rejecting oversized File objects before buffering their
contents, and keep the existing maxImageBytes check as a secondary guard. Use
the existing attachment route logic and the file size handling around
req.formData(), file.arrayBuffer(), and maxImageBytes to ensure the request is
rejected even when content-length is missing or chunked.
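As a rough illustration of the size-check finding above, the sketch below rejects on the parsed object's reported size before reading its bytes, then re-checks the buffered length as a secondary guard. `SizedUpload`, `readUploadWithinLimit`, and the result shape are hypothetical; the real route works with `File` objects from `req.formData()` and its own error responses.

```typescript
// Minimal structural type so the sketch does not depend on a specific File
// implementation. Any object exposing `size` and `arrayBuffer()` fits.
type SizedUpload = { size: number; arrayBuffer: () => Promise<ArrayBuffer> };

type UploadReadResult =
    | { ok: true; bytes: Uint8Array }
    | { ok: false; reason: string };

// Hypothetical helper: enforces the limit independently of any request header.
async function readUploadWithinLimit(
    upload: SizedUpload,
    maxBytes: number,
): Promise<UploadReadResult> {
    // Primary check: reject before calling arrayBuffer().
    if (upload.size > maxBytes) {
        return { ok: false, reason: `reported size ${upload.size} exceeds ${maxBytes}` };
    }
    const bytes = new Uint8Array(await upload.arrayBuffer());
    // Secondary guard: the buffered length wins if the reported size was wrong.
    if (bytes.byteLength > maxBytes) {
        return { ok: false, reason: `buffered size ${bytes.byteLength} exceeds ${maxBytes}` };
    }
    return { ok: true, bytes };
}
```

Returning a result object rather than throwing keeps the mapping to an HTTP status in the route handler, which is an assumption about how the handler is structured.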
In `@packages/web/src/app/api/`(server)/ee/chat/route.ts:
- Line 79: Reject empty message arrays in the chat route before accessing
latestMessage, since messages: [] currently passes validation and leaves
latestMessage undefined. Update the request handling in the route’s
message-processing logic around latestMessage to explicitly validate that
messages has at least one entry and return a typed 400 for empty arrays before
any downstream dereference of parts.
- Around line 98-104: Validate the model in the chat route before calling
commitMessageAttachments. In the route handler for the chat API, move the
languageModelConfig check ahead of the attachment commit so a bad model request
returns 400 before blobs are linked/flipped. Update the flow around
latestMessage, commitMessageAttachments, and languageModelConfig so attachments
are only persisted after the model request is confirmed.
In `@packages/web/src/ee/features/chat/agent.ts`:
- Around line 139-143: The omission note in agent.ts uses the wrong reason when
the latest user turn has images but all image reads fail. Update the logic
around the imageBlobs.length > 0 branch so the reason distinguishes between
unsupported image models, images added on a different turn, and images that were
added on this turn but could not be loaded. Use the existing identifiers
isLatestUserTurn, supportsImages, and imageBlobs to select the correct message
before appending to baseText.
In `@packages/web/src/features/chat/actions.ts`:
- Around line 306-322: Snapshot the source chat’s attachment links before
persisting the duplicate chat so the copy isn’t affected by a concurrent
delete/cascade. In the chat-duplication flow in actions.ts, read originalLinks
from prisma.chatAttachment.findMany using originalChat.id before creating
newChat, then create newChat and use the saved links in
prisma.chatAttachment.createMany. Keep the fix localized to the duplication
logic that handles originalChat, newChat, and chatAttachment.
- Around line 196-214: The attachment snapshot in the chat delete flow can miss
links created concurrently, causing orphaned blobs after deletion. Update the
delete path in actions.ts around the linkedAttachments fetch and
prisma.chat.delete call so orphan cleanup is based on a post-delete or
transaction-safe view of attachments, and ensure deleteOrphanedAttachments runs
with all attachmentIds still associated with the chat at the moment of deletion.
Use the existing deleteOrphanedAttachments helper and
prisma.chatAttachment/prisma.chat delete logic to keep the cleanup atomic or
re-read after delete before sweeping.
In `@packages/web/src/features/chat/attachments/filename.ts`:
- Around line 1-15: The sanitizeFilename helper only removes control characters
and whitespace, but it still allows markup-significant characters that can break
the <attachment filename="..."> boundary. Update sanitizeFilename to also strip
or escape characters like double quotes, angle brackets, and ampersands while
keeping the basename and existing fallback behavior intact.
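A minimal sketch of the hardening described above, assuming the helper keeps only the basename, collapses whitespace, and falls back to a fixed name. `sanitizeFilenameSketch` and `FALLBACK_FILENAME` are hypothetical; the actual `sanitizeFilename` in `filename.ts` may differ in how it treats whitespace and which fallback it uses.

```typescript
// Assumed fallback; the real helper's fallback value is not shown in the review.
const FALLBACK_FILENAME = "attachment";

function sanitizeFilenameSketch(input: string): string {
    // Keep only the basename (handles both "/" and "\" separators).
    const basename = input.split(/[\\/]/).pop() ?? "";
    const cleaned = basename
        .replace(/[\u0000-\u001f\u007f]/g, "") // control characters
        .replace(/["<>&]/g, "")                // characters that could break the attribute boundary
        .replace(/\s+/g, " ")                  // collapse whitespace runs (assumption)
        .trim();
    return cleaned.length > 0 ? cleaned : FALLBACK_FILENAME;
}
```

Stripping is the simpler of the two options the finding allows; escaping would preserve more of the original name but requires the consumer to unescape consistently.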
In `@packages/web/src/features/chat/components/chatBox/chatBox.tsx`:
- Around line 281-298: The submit gating in chatBox should also catch image
attachments that are failed or malformed, not just `uploading`, because
`attachmentData` later drops those silently and can still allow an empty-text
send. Update the submit-disabled logic in the `chatBox` submit-state helper to
treat non-sendable image attachments the same as uploading ones, using the same
image attachment status checks that feed `attachmentData` so the UI blocks
submit whenever an image won’t actually be included.
In `@packages/web/src/features/chat/constants.ts`:
- Around line 21-24: ATTACHMENT_MAX_IMAGE_BYTES is hard-coded in constants.ts
even though the upload limit is configurable on the server, so the client can
diverge from the authoritative value. Update the client-side early-rejection
logic to source the max image bytes from the same configurable setting used by
the upload route (env.SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES) rather than a
fixed 10 MiB constant, and keep the existing chat attachment constants
synchronized with the server-facing limit.
In `@packages/web/src/features/chat/modelsDevCatalog.server.ts`:
- Around line 123-129: The cold-start gating in ModelsDevCatalog.server.ts only
uses hasAttempted derived from catalogFetchedAt and lastFailedAt, so it stays
false while inFlightFetch is still pending and allows repeated short waits.
Update the awaitWhenEmpty path to track the cold-start wait attempt separately
from fetch settlement, and use that flag in the condition around Promise.race so
only one process-wide COLD_START_BLOCK_BUDGET_MS wait can occur. Keep the
existing inFlightFetch and cachedCatalog behavior, but mark the short-wait as
attempted as soon as it is started.
In `@packages/web/src/features/chat/utils.server.ts`:
- Around line 222-242: The orphan cleanup in the attachment deletion flow needs
a final safety check before removing rows, because a concurrent relink can
happen after the initial lookup. Update the logic in the utility that computes
orphanedIds and calls prisma.attachment.deleteMany so the delete is conditional
on the attachment still having no chatAttachment references at delete time,
rather than deleting by bare id from the earlier snapshot.
- Around line 159-199: The attachment claim flow in the chat utility is
validating and then committing outside a single atomic check, so two concurrent
sends can both attach the same upload. Move the PENDING/ownership check into the
commit path in the `createMany`/`updateMany` transaction inside the chat
attachment helper in `utils.server.ts`, and ensure the `attachment` update only
succeeds when the row is still `AttachmentStatus.PENDING` and belongs to the
expected `userId`/`orgId`. If the conditional update affects fewer rows than
`idsToCommit`, treat it as an invalid request and do not create any
`chatAttachment` links.
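The claim-then-link rule in the last finding can be sketched with an in-memory stand-in for the database. In the real code this would be a conditional `updateMany` inside the transaction; here `commitPendingAttachments`, `ClaimRow`, and the `Set` of link keys are hypothetical and only model the "affected rows must equal `idsToCommit`" check.

```typescript
type ClaimStatus = "PENDING" | "COMMITTED";
type ClaimRow = { id: string; status: ClaimStatus; userId: string; orgId: number };

// Hypothetical in-memory model of the transaction: rows are claimed only if
// every requested id is still PENDING and owned by the caller; otherwise
// nothing is mutated (the equivalent of rolling back).
function commitPendingAttachments(
    rows: Map<string, ClaimRow>,
    links: Set<string>,
    chatId: string,
    idsToCommit: string[],
    userId: string,
    orgId: number,
): boolean {
    const claimable: ClaimRow[] = [];
    for (const id of idsToCommit) {
        const row = rows.get(id);
        if (row !== undefined && row.status === "PENDING" && row.userId === userId && row.orgId === orgId) {
            claimable.push(row);
        }
    }
    // Fewer matches than requested ids: treat the whole request as invalid.
    if (claimable.length < idsToCommit.length) {
        return false;
    }
    for (const row of claimable) {
        row.status = "COMMITTED";
        links.add(`${chatId}:${row.id}`);
    }
    return true;
}
```

With this shape, the second of two concurrent sends referencing the same upload sees zero claimable rows and creates no link.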
In `@packages/web/src/lib/posthogEvents.ts`:
- Around line 207-218: The PostHog event schema for chat attachment events
currently leaves source optional, which allows indistinguishable cross-surface
emissions from the upload flow. Update the type definitions in posthogEvents.ts
for chat_attachment_uploaded and chat_attachment_degraded so source is required,
or alternatively rename these events to use the wa_ prefix if they are truly
web-only; make sure the emitting call sites match the chosen contract,
especially the new upload route, and keep the schema aligned with the intended
event origin.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 79aa38e4-8079-483a-9868-4da770dbc165
📒 Files selected for processing (31)
- docs/docs/configuration/environment-variables.mdx
- packages/backend/src/attachmentPruner.ts
- packages/backend/src/index.ts
- packages/db/prisma/migrations/20260627000032_add_chat_attachments/migration.sql
- packages/db/prisma/schema.prisma
- packages/shared/src/env.server.ts
- packages/web/src/app/api/(server)/ee/chat/[chatId]/attachments/[attachmentId]/route.ts
- packages/web/src/app/api/(server)/ee/chat/attachments/route.ts
- packages/web/src/app/api/(server)/ee/chat/route.ts
- packages/web/src/ee/features/chat/agent.ts
- packages/web/src/ee/features/chat/components/chatThread/chatThreadListItem.tsx
- packages/web/src/ee/features/chat/components/chatThread/messageAttachments.tsx
- packages/web/src/features/chat/actions.ts
- packages/web/src/features/chat/attachmentUtils.ts
- packages/web/src/features/chat/attachments/attachmentPreviewCache.ts
- packages/web/src/features/chat/attachments/filename.ts
- packages/web/src/features/chat/attachments/storage.ts
- packages/web/src/features/chat/attachments/validation.ts
- packages/web/src/features/chat/components/chatBox/attachmentButton.tsx
- packages/web/src/features/chat/components/chatBox/attachmentTray.tsx
- packages/web/src/features/chat/components/chatBox/attachmentViewerDialog.tsx
- packages/web/src/features/chat/components/chatBox/chatBox.tsx
- packages/web/src/features/chat/components/chatBox/chatPaneDropzone.tsx
- packages/web/src/features/chat/constants.ts
- packages/web/src/features/chat/modelCapabilities.server.test.ts
- packages/web/src/features/chat/modelCapabilities.server.ts
- packages/web/src/features/chat/modelsDevCatalog.server.ts
- packages/web/src/features/chat/types.ts
- packages/web/src/features/chat/utils.server.ts
- packages/web/src/features/chat/utils.ts
- packages/web/src/lib/posthogEvents.ts
Squashed onto main after PR #1374 (text file attachments) was squash-merged, which orphaned the stacked text-attachment commits this branch carried.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
04fdb14 to b30cea1
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/web/src/app/api/(server)/ee/chat/route.ts (1)
30-31: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Validate and budget all user turns, not only the latest.
`messages` is still client-supplied `z.any()[]`, but `createMessageStream` folds every user turn into the prompt. A caller can place an oversized text attachment in an earlier user message and keep the latest message under the limit. Parse message shape before helper calls and reject any user message over `ATTACHMENT_MAX_TURN_TEXT_BYTES`.
Suggested direction

     const chatRequestSchema = z.object({
    -    messages: z.array(z.any()),
    +    messages: z.array(z.object({
    +        role: z.string(),
    +        parts: z.array(z.any()),
    +    }).passthrough()).min(1),
         id: z.string(),
         ...additionalChatRequestParamsSchema.shape,
     })

    -    if (
    -        latestMessage.role === 'user' &&
    -        getMessageTextBytes(latestMessage) > ATTACHMENT_MAX_TURN_TEXT_BYTES
    -    ) {
    +    const hasOversizedUserMessage = messages.some((message) =>
    +        message.role === 'user' &&
    +        getMessageTextBytes(message) > ATTACHMENT_MAX_TURN_TEXT_BYTES
    +    );
    +    if (hasOversizedUserMessage) {

As per coding guidelines, route handlers should validate inputs using Zod schemas for request bodies in POST/PUT/PATCH requests.
Also applies to: 92-98
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/web/src/app/api/`(server)/ee/chat/route.ts around lines 30 - 31, The chat route currently accepts a client-controlled messages array with z.any(), so earlier user turns can bypass the turn text budget even if the latest message is small. Update chatRequestSchema and the route handler in chat/route.ts to parse each message’s shape before calling createMessageStream, then validate every user message against ATTACHMENT_MAX_TURN_TEXT_BYTES and reject the request if any user turn exceeds it. Use the existing request-body Zod validation path to enforce this across all messages, including the logic around the createMessageStream call.
Source: Coding guidelines
♻️ Duplicate comments (1)
packages/backend/src/attachmentPruner.ts (1)
69-87: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Delete the DB row before unlinking bytes.
The stale-batch race is still present for the blob: line 71 deletes bytes before line 81 re-checks `PENDING`. If a send commits after `findMany`, `deleteMany` will skip the row, but the now-committed attachment’s file is already gone.
Proposed fix

    -    await Promise.all(batch.map(async (attachment) => {
    -        try {
    -            await this.storage.delete(attachment.storageKey);
    -        } catch (error) {
    -            logger.warn(`Failed to delete bytes for orphaned attachment ${attachment.id}: ${error}`);
    -        }
    -    }));
    -
    -    // Re-assert the orphan criteria in the delete itself: a concurrent
    -    // send could have committed (PENDING -> COMMITTED + linked) a row in
    -    // this batch after the findMany, and deleting by bare id would
    -    // cascade that live link away.
    -    const result = await this.db.attachment.deleteMany({
    -        where: {
    -            id: { in: batch.map((attachment) => attachment.id) },
    -            status: AttachmentStatus.PENDING,
    -            createdAt: { lt: cutoff },
    -        },
    -    });
    -    totalDeleted += result.count;
    +    const deletedCounts = await Promise.all(batch.map(async (attachment) => {
    +        const result = await this.db.attachment.deleteMany({
    +            where: {
    +                id: attachment.id,
    +                status: AttachmentStatus.PENDING,
    +                createdAt: { lt: cutoff },
    +            },
    +        });
    +
    +        if (result.count === 0) {
    +            return 0;
    +        }
    +
    +        try {
    +            await this.storage.delete(attachment.storageKey);
    +        } catch (error) {
    +            logger.warn(`Failed to delete bytes for orphaned attachment ${attachment.id}: ${error}`);
    +        }
    +
    +        return result.count;
    +    }));
    +    totalDeleted += deletedCounts.reduce((sum, count) => sum + count, 0);

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/backend/src/attachmentPruner.ts` around lines 69 - 87, The orphan cleanup in attachmentPruner’s batch delete still unlinks storage before confirming the row is still deletable, leaving a stale-batch race. Rework the flow in the batch loop and the subsequent attachment.deleteMany call so the database row is removed/re-checked first, and only delete the blob via this.storage.delete after the row has been confirmed deleted or otherwise safely marked orphaned. Keep the existing AttachmentStatus.PENDING and createdAt cutoff guard in place, and use the attachment.id/storageKey values to drive the post-delete byte cleanup.
🧹 Nitpick comments (1)
packages/web/src/ee/features/chat/agent.test.ts (1)
34-40: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Make the storage backend mock stable.
Returning a new object from `getStorageBackend()` makes it hard to configure `get` or assert storage reads in attachment tests. Prefer a shared mock object returned by the factory.
Suggested refactor

    +const mockStorageBackend = {
    +    get: vi.fn(),
    +    put: vi.fn(),
    +    stat: vi.fn(),
    +    createReadStream: vi.fn(),
    +    delete: vi.fn(),
    +};
    +
     // inside vi.mock("@sourcebot/shared", ...)
    -    getStorageBackend: () => ({
    -        get: vi.fn(),
    -        put: vi.fn(),
    -        stat: vi.fn(),
    -        createReadStream: vi.fn(),
    -        delete: vi.fn(),
    -    }),
    +    getStorageBackend: () => mockStorageBackend,

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/web/src/ee/features/chat/agent.test.ts` around lines 34 - 40, Make the storage backend mock stable by returning the same shared mock object from getStorageBackend() instead of creating a new object each time. Update the test setup in agent.test.ts so the backend methods (especially get) can be configured and asserted consistently across attachment tests, while keeping the existing method stubs on the shared mock.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/web/src/ee/features/chat/agent.ts`:
- Around line 90-112: The attachment loading in getMediaBlobs/acceptedBlobs is
only bounded per message, so long chats can still accumulate too many media
bytes before the model call. Add a global prompt-level cap (byte and/or total
attachment count) in agent.ts before querying Prisma and calling storage.get,
and enforce bounded concurrency instead of unbounded Promise.all when populating
the result map. Use the acceptedBlobs, records, and storage.get flow as the
place to apply the limit so no more than the configured aggregate media budget
is loaded.
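The two mechanisms named in this finding, an aggregate media budget and bounded concurrency, can be sketched independently of Prisma and the storage backend. `selectWithinBudget`, `loadBounded`, and `BlobRef` are hypothetical names; the selection order (first come, first served) is an assumption, and the real code might prefer the newest attachments instead.

```typescript
type BlobRef = { id: string; sizeBytes: number };

// Picks refs in order, skipping any that would push the total over the byte
// cap, and stops once the count cap is reached.
function selectWithinBudget(refs: BlobRef[], maxTotalBytes: number, maxCount: number): BlobRef[] {
    const selected: BlobRef[] = [];
    let totalBytes = 0;
    for (const ref of refs) {
        if (selected.length >= maxCount) break;
        if (totalBytes + ref.sizeBytes > maxTotalBytes) continue;
        selected.push(ref);
        totalBytes += ref.sizeBytes;
    }
    return selected;
}

// Loads every id with at most `concurrency` reads in flight at once, using a
// shared cursor that each worker advances.
async function loadBounded<T>(
    ids: string[],
    load: (id: string) => Promise<T>,
    concurrency: number,
): Promise<Map<string, T>> {
    const results = new Map<string, T>();
    let cursor = 0;
    const worker = async (): Promise<void> => {
        while (cursor < ids.length) {
            const id = ids[cursor++];
            if (id === undefined) break;
            results.set(id, await load(id));
        }
    };
    const workerCount = Math.max(1, Math.min(concurrency, ids.length));
    await Promise.all(Array.from({ length: workerCount }, () => worker()));
    return results;
}
```

Selecting before loading means over-budget blobs are never read from storage at all, which is the property the finding asks for.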
---
Outside diff comments:
In `@packages/web/src/app/api/`(server)/ee/chat/route.ts:
- Around line 30-31: The chat route currently accepts a client-controlled
messages array with z.any(), so earlier user turns can bypass the turn text
budget even if the latest message is small. Update chatRequestSchema and the
route handler in chat/route.ts to parse each message’s shape before calling
createMessageStream, then validate every user message against
ATTACHMENT_MAX_TURN_TEXT_BYTES and reject the request if any user turn exceeds
it. Use the existing request-body Zod validation path to enforce this across all
messages, including the logic around the createMessageStream call.
---
Duplicate comments:
In `@packages/backend/src/attachmentPruner.ts`:
- Around line 69-87: The orphan cleanup in attachmentPruner’s batch delete still
unlinks storage before confirming the row is still deletable, leaving a
stale-batch race. Rework the flow in the batch loop and the subsequent
attachment.deleteMany call so the database row is removed/re-checked first, and
only delete the blob via this.storage.delete after the row has been confirmed
deleted or otherwise safely marked orphaned. Keep the existing
AttachmentStatus.PENDING and createdAt cutoff guard in place, and use the
attachment.id/storageKey values to drive the post-delete byte cleanup.
---
Nitpick comments:
In `@packages/web/src/ee/features/chat/agent.test.ts`:
- Around line 34-40: Make the storage backend mock stable by returning the same
shared mock object from getStorageBackend() instead of creating a new object
each time. Update the test setup in agent.test.ts so the backend methods
(especially get) can be configured and asserted consistently across attachment
tests, while keeping the existing method stubs on the shared mock.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 10fe4848-13b9-4cc8-8e37-578497a8ea97
📒 Files selected for processing (42)
- CHANGELOG.md
- docs/docs/configuration/environment-variables.mdx
- packages/backend/src/attachmentPruner.ts
- packages/backend/src/index.ts
- packages/db/prisma/migrations/20260627000032_add_chat_attachments/migration.sql
- packages/db/prisma/schema.prisma
- packages/shared/src/env.server.ts
- packages/shared/src/index.server.ts
- packages/shared/src/storage.ts
- packages/web/src/app/(app)/askgh/[owner]/[repo]/components/landingPage.tsx
- packages/web/src/app/(app)/askgh/[owner]/[repo]/page.tsx
- packages/web/src/app/(app)/chat/[id]/page.tsx
- packages/web/src/app/(app)/chat/chatLandingPage.tsx
- packages/web/src/app/(app)/chat/components/landingPageChatBox.tsx
- packages/web/src/app/api/(server)/ee/chat/[chatId]/attachments/[attachmentId]/route.ts
- packages/web/src/app/api/(server)/ee/chat/attachments/route.ts
- packages/web/src/app/api/(server)/ee/chat/route.ts
- packages/web/src/ee/features/chat/agent.test.ts
- packages/web/src/ee/features/chat/agent.ts
- packages/web/src/ee/features/chat/components/chatThread/chatThread.tsx
- packages/web/src/ee/features/chat/components/chatThread/chatThreadListItem.tsx
- packages/web/src/ee/features/chat/components/chatThread/messageAttachments.tsx
- packages/web/src/ee/features/chat/components/chatThreadPanel.test.tsx
- packages/web/src/ee/features/chat/components/chatThreadPanel.tsx
- packages/web/src/features/chat/actions.ts
- packages/web/src/features/chat/attachmentUtils.ts
- packages/web/src/features/chat/attachments/filename.ts
- packages/web/src/features/chat/attachments/modality.ts
- packages/web/src/features/chat/attachments/validation.ts
- packages/web/src/features/chat/components/chatBox/attachmentButton.tsx
- packages/web/src/features/chat/components/chatBox/attachmentTray.tsx
- packages/web/src/features/chat/components/chatBox/attachmentViewerDialog.tsx
- packages/web/src/features/chat/components/chatBox/chatBox.tsx
- packages/web/src/features/chat/components/chatBox/chatPaneDropzone.tsx
- packages/web/src/features/chat/constants.ts
- packages/web/src/features/chat/modelCapabilities.server.test.ts
- packages/web/src/features/chat/modelCapabilities.server.ts
- packages/web/src/features/chat/modelsDevCatalog.server.ts
- packages/web/src/features/chat/types.ts
- packages/web/src/features/chat/utils.server.ts
- packages/web/src/features/chat/utils.ts
- packages/web/src/lib/posthogEvents.ts
✅ Files skipped from review due to trivial changes (4)
- packages/web/src/app/(app)/chat/chatLandingPage.tsx
- docs/docs/configuration/environment-variables.mdx
- packages/web/src/features/chat/components/chatBox/chatPaneDropzone.tsx
- CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (18)
- packages/web/src/features/chat/modelCapabilities.server.ts
- packages/web/src/features/chat/attachments/filename.ts
- packages/web/src/ee/features/chat/components/chatThread/chatThreadListItem.tsx
- packages/shared/src/env.server.ts
- packages/web/src/features/chat/types.ts
- packages/web/src/features/chat/components/chatBox/attachmentButton.tsx
- packages/web/src/lib/posthogEvents.ts
- packages/db/prisma/migrations/20260627000032_add_chat_attachments/migration.sql
- packages/web/src/features/chat/components/chatBox/attachmentViewerDialog.tsx
- packages/backend/src/index.ts
- packages/web/src/features/chat/modelCapabilities.server.test.ts
- packages/web/src/features/chat/utils.server.ts
- packages/web/src/features/chat/components/chatBox/attachmentTray.tsx
- packages/web/src/features/chat/modelsDevCatalog.server.ts
- packages/web/src/app/api/(server)/ee/chat/attachments/route.ts
- packages/web/src/features/chat/actions.ts
- packages/web/src/features/chat/attachmentUtils.ts
- packages/db/prisma/schema.prisma
…rphans
The orphan sweep deleted blob bytes before the guarded row delete, so a PENDING attachment committed by a concurrent send (PENDING -> COMMITTED + linked) between the findMany and the byte delete kept its DB row and link but lost its bytes, leaving a permanently broken attachment. Reorder to delete the row first (re-asserting the orphan criteria), then delete bytes only for batch rows that no longer exist, i.e. the rows the sweep actually removed. A deleted row can never reappear and a survivor is never deleted by the loop, so the check cannot misclassify.
Also add a COMMITTED-with-zero-links sweep as a backstop for an interrupted web-app chat-delete sweep, which would otherwise leak those blobs forever (the pruner previously only touched PENDING rows).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The add_chat_attachments migration (20260627000032) predated add_oauth_dpop_binding (20260629193000), which merged to main after this branch was cut, tripping the CI migration-ordering check. The two migrations are independent (dpop touches none of the attachment tables; the attachment migration only references the long-existing Org/User/Chat tables), so resequencing it to run last is safe. Renamed to 20260629200000_add_chat_attachments.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Deletion paths previously removed the DB row, then deleted bytes best-effort. Since the row is the only durable handle to the bytes, a failed/interrupted byte delete after the row was gone leaked those bytes with no way to ever find them again.
Add a DELETING tombstone state. All reclamation now: (1) atomically flips the orphan to DELETING (the claim doubles as the concurrency guard, replacing the survivor-recheck), (2) deletes the bytes, (3) removes the row only once the bytes are confirmed gone. A failed byte delete leaves the row DELETING for the pruner's reclaim sweep to retry, so a transient storage error can never orphan bytes.
- schema: add AttachmentStatus.DELETING (+ migration)
- deleteOrphanedAttachments: claim -> DELETING, inline best-effort byte delete, remove only reclaimed rows; the rest fall through to the pruner
- pruner: condemn PENDING + zero-link COMMITTED orphans to DELETING, then a single reclaim sweep deletes bytes and rows for all tombstones (also picking up tombstones the web app left behind)
This unifies byte deletion into one retryable place and matters most ahead of a remote (S3) storage driver, where delete failures are routine.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
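The three-step reclamation described in this commit message (claim to DELETING, delete bytes, remove the row) can be sketched against in-memory stand-ins. `reclaimAttachment`, `TombstoneRow`, and `ByteStore` are hypothetical, and the sketch simplifies one point: it treats every COMMITTED row as live, whereas the commit also condemns COMMITTED rows with zero links.

```typescript
type TombstoneStatus = "PENDING" | "COMMITTED" | "DELETING";
type TombstoneRow = { id: string; status: TombstoneStatus; storageKey: string };

// Hypothetical minimal storage interface for the sketch.
interface ByteStore {
    delete(key: string): Promise<void>;
}

async function reclaimAttachment(
    rows: Map<string, TombstoneRow>,
    store: ByteStore,
    id: string,
): Promise<"reclaimed" | "retry-later" | "skipped"> {
    const row = rows.get(id);
    // Step 1: claim. Live (COMMITTED) or missing rows are left alone.
    if (row === undefined || row.status === "COMMITTED") {
        return "skipped";
    }
    row.status = "DELETING";
    // Step 2: delete bytes. On failure the tombstone stays for a later sweep.
    try {
        await store.delete(row.storageKey);
    } catch {
        return "retry-later";
    }
    // Step 3: drop the row only after the bytes are confirmed gone.
    rows.delete(id);
    return "reclaimed";
}
```

Because the row outlives a failed byte delete, the storage key is never lost, which is the property the commit is after.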
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/backend/src/attachmentPruner.ts`:
- Around line 115-119: The attachment pruning loop in attachmentPruner.ts keeps
appending every failed delete to failedIds and reusing it in the
attachment.findMany notIn filter, which can grow without bound when storage is
down. Update the pruning logic in the prune routine to cap the number of failed
deletions handled per run, stop the loop once that limit is reached, and let the
next scheduled run retry the remaining tombstones. Make sure the bound is
applied consistently across the batch-processing path and the failure handling
around failedIds so the query size stays limited.
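One way to bound the loop described above is to stop as soon as a per-run failure cap is reached, leaving the remaining tombstones for the next scheduled run. This is a sketch under assumptions: `reclaimWithFailureCap`, `fetchBatch`, and `deleteOne` are hypothetical stand-ins for the `attachment.findMany` query with its `notIn` filter and the per-row byte/row deletion.

```typescript
async function reclaimWithFailureCap(
    // Returns the next batch of tombstone ids, excluding the given ids.
    fetchBatch: (excludeIds: string[]) => Promise<string[]>,
    // Returns true when the tombstone was fully reclaimed.
    deleteOne: (id: string) => Promise<boolean>,
    maxFailures: number,
): Promise<{ deleted: number; failedIds: string[] }> {
    const failedIds: string[] = [];
    let deleted = 0;
    // The cap bounds both the loop and the size of the exclusion filter.
    while (failedIds.length < maxFailures) {
        const batch = await fetchBatch(failedIds);
        if (batch.length === 0) break;
        for (const id of batch) {
            if (await deleteOne(id)) {
                deleted++;
            } else {
                failedIds.push(id);
                if (failedIds.length >= maxFailures) break;
            }
        }
    }
    return { deleted, failedIds };
}
```

When storage is fully down, the run ends after `maxFailures` attempts instead of walking every tombstone and growing the exclusion list without limit.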
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: eacd9c6b-92a6-4269-9031-918cc1b3e9a2
📒 Files selected for processing (4)
- packages/backend/src/attachmentPruner.ts
- packages/db/prisma/migrations/20260629210000_add_attachment_deleting_status/migration.sql
- packages/db/prisma/schema.prisma
- packages/web/src/features/chat/utils.server.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/db/prisma/schema.prisma
- packages/web/src/features/chat/utils.server.ts
Adds support for binary (image) file attachments in Ask Sourcebot, building on the inline-text attachment work in the base branch. Users can attach PNG/JPEG/WebP/GIF images (drag-and-drop or file picker) to a chat message; the bytes are uploaded to app-mediated blob storage and sent to vision-capable models as native image content. Unlike text attachments, image bytes never travel in the `messages` JSON; they're referenced by id and served through an access-controlled route.

This is an enterprise (`ee`) feature, gated by the `ask` entitlement.

What's included

Data model & storage
- `Attachment` model + `ChatAttachment` join table + `AttachmentStatus` (`PENDING` → `COMMITTED`) enum, with migration. Attachments are uploaded before any chat exists and linked to chats via `ChatAttachment`, keeping access purely chat-derived.
- `StorageBackend` abstraction in `@sourcebot/shared` with a `LocalFsStorageBackend` (bytes under `DATA_CACHE_DIR/attachments`), shared by the web app and the backend orphan pruner. An S3 driver is planned as a follow-up.
- `blob` variant on the `AttachmentData` discriminated union (references stored bytes by id; bytes stay out of message JSON).

Upload & serving
- `POST /api/ee/chat/attachments`: authenticated (no anonymous uploads), entitlement-gated. Decodes the image with `sharp` to authoritatively determine the format (never the client MIME/extension; SVG excluded) and enforces server-side byte and pixel-dimension caps, the latter guarding against decompression bombs. Returns the `attachmentId`; images upload on select.
- `commitMessageAttachments` atomically links referenced blobs to the chat and flips `PENDING → COMMITTED`, rejecting forged/unauthorized ids before the agent runs.
- `GET /api/ee/chat/{chatId}/attachments/{attachmentId}`: serves bytes to the uploader, or to any caller who can view the chat and has a `ChatAttachment` link for it. Sets `X-Content-Type-Options: nosniff`, `Cache-Control: private, no-store`, and a header-safe `Content-Disposition`.

Agent / model integration
- `mediaType → modality` resolver and `→ model content part` builder drive attachment handling, so support for additional modalities (PDF/audio/video) extends one place.
- `resolveLatestTurnMedia` loads bytes from storage for the latest user turn (only blobs linked to that chat and accepted by the model) and `buildUserModelMessage` attaches native content parts. Media bytes are only sent on the turn they were added; a short marker is left when attachments are dropped, distinguishing an older turn, an unsupported modality, and a failed read.

Client UI
- `accept` includes image types only when supported.
- `uploading`/`uploaded`/`error`), on-hover preview, and a full viewer dialog. Submit is blocked while uploads are in flight or in an error state.
- `ChatAttachment` link exists.

Lifecycle & cleanup
- `PENDING` (uploaded-but-never-sent) blobs older than a TTL, along with their bytes, via the shared `StorageBackend`.

Config / observability
- `SOURCEBOT_CHAT_ATTACHMENT_MAX_IMAGE_BYTES` (default 10 MiB) and `SOURCEBOT_CHAT_ATTACHMENT_ORPHAN_TTL_HOURS` (default 24, `0` disables). Documented in `environment-variables.mdx`.
- `chat_attachment_uploaded`, `chat_attachment_degraded`.

Screen.Recording.2026-06-29.at.9.44.35.AM.mov
Screen.Recording.2026-06-29.at.9.46.31.AM.mov
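For readers unfamiliar with the response headers listed under Upload & serving, here is a rough sketch of how they might be assembled. `buildAttachmentHeaders` is a hypothetical helper, the `inline` disposition type is an assumption, and the filename encoding follows the general RFC 6266 / RFC 5987 pattern rather than anything confirmed from the PR's route code.

```typescript
function buildAttachmentHeaders(mediaType: string, filename: string): Record<string, string> {
    // ASCII-only fallback for the quoted form: non-printable and non-ASCII
    // characters, quotes, and backslashes become underscores.
    const asciiFallback = filename.replace(/[^\x20-\x7e]/g, "_").replace(/["\\]/g, "_");
    // Extended form: percent-encode, including the characters that
    // encodeURIComponent leaves alone but RFC 5987 does not allow bare.
    const encoded = encodeURIComponent(filename).replace(
        /['()*]/g,
        (char) => `%${char.charCodeAt(0).toString(16).toUpperCase()}`,
    );
    return {
        "Content-Type": mediaType,
        "X-Content-Type-Options": "nosniff",
        "Cache-Control": "private, no-store",
        "Content-Disposition": `inline; filename="${asciiFallback}"; filename*=UTF-8''${encoded}`,
    };
}
```

The point of the two-form filename is that neither form can carry a raw quote, CR, or LF into the header value, which is one reading of "header-safe" in the description.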