Make detection of binary files more robust#515
Conversation
| (if (and mime-type | ||
| (not (or (string-prefix-p "text/" mime-type) | ||
| (string-match-p "\\(?:json\\|xml\\|yaml\\)$" mime-type)))) | ||
| ;; Read contents as base-64 encoded binary |
There was a problem hiding this comment.
Won't small non-image binary files (PDFs, .docx, etc.) that pass the size check will now be embedded as "resource" blocks with base64-encoded content in the text field? Before this PR, json-serialize would error on non-UTF-8 bytes and the condition-case would catch it, falling back to a plain text mention, somewhat accidentally the right behavior.
Maybe resource embed branch in the caller probably needs a (not (map-elt file :base64-p)) guard so binary non-image files fall through to resource_link instead?
There was a problem hiding this comment.
I worked on this, because I @-mentioned a small PDF file that was then embedded in the message but not base64 encoded. agent-shell then refused any further cooperation, when sending the request failed on json-serialize in acp.
| (push `((type . "resource") | ||
| (resource . ((uri . ,(concat "file://" resolved-path)) | ||
| (text . ,(map-elt file :content)) | ||
| (,(if (map-elt file :base64-p) 'blob 'text) . ,(map-elt file :content)) |
There was a problem hiding this comment.
This embeds small non-text files as binary resources now as documented here. And I think at least claude would also support embedded PDFs, if these API docs also apply to ACP.
Regarding and agent-shell-text-file-capabilities above, why is this necessary to work with embedded resources? From the naming and the pattern in the image case, I would think that supports-embedded-context would be the right test to check if the agent can deal with embedded context.
There was a problem hiding this comment.
Regarding and agent-shell-text-file-capabilities above, why is this necessary to work with embedded resources?
IIRC, this was somewhat relevant to running agents in a container and preventing access to files outside the container, but this is a moot point since we're crafting a prompt so the user intent is indeed to send the file. We can likely remove it.
0727958 to
934e3c1
Compare
934e3c1 to
388ac64
Compare
388ac64 to
3d02592
Compare
xenodium
left a comment
There was a problem hiding this comment.
Sorry for the delay. Just saw you made further changes. Thank you!
| (is-binary nil) | ||
| (content | ||
| (unless shallow | ||
| ;; If we have a known MIME type that is not explicitly a text type |
There was a problem hiding this comment.
Won't we get false positives here for a lot of text cases?
file mailcap mime classified as
/tmp/pr515-demo.el application/emacs-lisp BINARY (base64)
/tmp/pr515-demo.sh application/x-sh BINARY (base64)
/tmp/pr515-demo.py nil text
/tmp/pr515-demo.json nil text
/tmp/pr515-demo.md nil text
How about we consider peeking at file to determine if binary and use mailcap for mime-type? Maybe something like:
(defun agent-shell--file-binary-p (file-path)
"Return non-nil if FILE-PATH appears to be binary.
Sniffs for a NUL byte in the first 4 KB — the same fast-path
heuristic used by file(1), git, and grep -I. Reads at most 4 KB
regardless of file size."
(with-temp-buffer
(set-buffer-multibyte nil)
(insert-file-contents-literally file-path nil 0 4096)
(goto-char (point-min))
(search-forward "\0" nil t)))
The previous heuristic would try to embed non-image files as text, which would fail, for example, for small PDFs. For those files, which contain non-UTF-8 bytes,
json-serializewould raise an error, because it would be an invalid UTF-8 string.The new function has more robust MIME type detection based on Emacs'
mailcapand ensures that only files with valid multi-byte content will not be base64 encoded.Checklist
M-x checkdocandM-x byte-compile-file.