runtime/claude_code: materialize image blocks to tmpfile + extract image_cache module#1151
runtime/claude_code: materialize image blocks to tmpfile + extract image_cache module#1151benhoverter wants to merge 8 commits into
Conversation
|
Code review LGTM. Real fix for a real driver limitation, atomic publish, mtime refresh, the file:// gating in compactor is a nice security touch, and the tests are solid (mtime refresh + sweep regression both covered). Two things before merge:
Rebase and I'll merge. |
…tool can view them
Adds image materialization in the claude_code driver: inbound image
ContentBlocks are written to $HOME/.openfang/tmp/images/ so the Claude
CLI can view them via its Read tool (it cannot fetch URLs or read
in-memory bytes).
Also introduces `ContentBlock::Image::source_url: Option<String>` in
openfang-types and threads it through every consumer:
- openfang-channels/bridge.rs (sets source_url when downloading from URL)
- openfang-api/openai_compat.rs + routes.rs (serde round-trip)
- openfang-runtime drivers: anthropic, gemini, openai, vertex (pattern
match exhaustiveness)
- openfang-runtime: agent_loop, compactor (pattern match exhaustiveness)
The `source_url` field is the load-bearing schema change for downstream
work: later commits in this series consume it for cache lookup, and
outbound Discord (separate PR) uses it to round-trip image URLs back to
Discord without re-uploading bytes. No driver reads it on this branch
yet — it is wired through serde and pattern matches only.
`materialize_image` previously wrote bytes directly to the content-addressed destination via `fs::write`, which truncates-then-writes non-atomically. Two concurrent renders of the same image (or a re-render racing a Read tool invocation) could produce a torn, partially-written file readable by the CLI's Read tool — a real risk under load now that file-sharing is a first-class feature. Switch to write-tmp + rename(2): write the decoded bytes to a unique sibling tmpfile (suffixed with pid + nanos), then atomically rename into the content-addressed destination. rename(2) is atomic on the same filesystem, so readers either see the full file or nothing. Loser of a race still rename-replaces with byte-identical content. Orphan .tmp files from crashed processes are reaped by the existing 24h TTL sweep (mtime-based).
The CLI's Read tool refuses paths outside the agent's working directory unless explicitly granted via --add-dir (or unless --dangerously-skip-permissions is set, which we don't want to rely on as the only escape hatch). Materialized images live under $HOME/.openfang/tmp/images/, which is outside the agent workspace, so without --add-dir the materialization is a dead-end whenever skip_permissions is false. Append --add-dir <image_tmp_dir> to both the non-streaming and streaming Command builders. The directory is per-user and content-addressed, so the grant is narrow and idempotent.
Lift the content-addressed image tmpfile cache out of the Claude Code driver and into a sibling module so the upcoming outbound Discord file-sharing path can reuse the same cache. No behavior change — the extracted helpers (image_tmp_dir, ext_for_mime, materialize_image, sweep_old_image_tmpfiles, IMAGE_TMP_TTL_SECS, sweep guard) are byte-identical to the previous private impl. The driver now imports image_tmp_dir / materialize_image / spawn_sweep_once from crate::image_cache. The new module is publicly exported so producers outside the runtime crate (channel adapters) can resolve materialized paths from base64 image blocks before forwarding them to outbound transports.
`materialize_image` is content-addressed and idempotent: a re-render of the same image returns the existing path without rewriting. As a side effect, the tmpfile's mtime never advanced past its original write — so the 24h TTL sweep, which gates on `meta.modified()`, could GC a tmpfile still actively referenced by an in-scope ContentBlock::Image in a long-running conversation. Refresh mtime via `File::set_modified(SystemTime::now())` (futimens on Unix) on every cache hit. Read-only fd is sufficient: futimens only requires file ownership, not write access. Best-effort: any failure is debug-logged and the cached path is returned anyway — worst case is the prior 24h-GC behavior. Tests: cache-hit refreshes mtime and survives a sweep that would otherwise GC the file; companion test confirms the sweep does remove genuinely stale files.
ContentBlock::Image was being stringified to `[Image: {mime}]`,
silently dropping the `source_url` populated by the inbound Discord
path. That field exists so the outbound path (PR-C) can re-fetch
the original CDN-hosted image and re-attach it post-compaction —
without it, every compaction event quietly severed an image from
its CDN origin.
Emit `[Image: {mime} @ {url}]` when `source_url` is `http://` or
`https://`. `file://` (local tmpfile materialization) and any other
schemes fall back to the legacy mime-only form: those are internal
and must not leak into compacted summaries that may be persisted,
logged, or shipped across processes.
Tests cover all four arms (https, http, file://, None).
…or Windows Matches the HOME → USERPROFILE convention used elsewhere in the kernel (see crates/openfang-runtime/src/drivers/mod.rs:429). Falls back to std::env::temp_dir() only when neither is set. Addresses reviewer feedback on PR RightNow-AI#1151.
Upstream added msg_id/provider_msg_id fields to Message after this branch forked. Switch to ..Default::default() in the three source_url test cases to match the kernel-wide pattern and stay forward-compatible. No behavior change — these are test-only initializers.
a13171d to
66fcce1
Compare
|
Rebased onto
Ready for another look. |
Summary
Makes inbound image
ContentBlocks viewable by the Claude Code CLI driver, which cannot fetch URLs or read in-memory bytes — it can onlyReadfiles on disk. Images are now materialized to a content-addressed tmpfile under$HOME/.openfang/tmp/images/and the directory is granted to the CLI via--add-dir. Also extracts the cache into a sibling module so future producers (outbound Discord) can reuse it.Four commits, no merge commits, builds clean workspace-wide.
Commits
1203647runtime/claude_code: materialize images to tmpfile so the CLI's Read tool can view themAdds tmpfile materialization in the
claude_codedriver. Also introducesContentBlock::Image::source_url: Option<String>inopenfang-typesand threads it through every consumer (api, channels/bridge, all four model drivers for pattern-match exhaustiveness, agent_loop, compactor). The field is wired through serde and matches but is not consumed by any driver on this branch — it is the load-bearing schema change for the rest of the series and for outbound Discord (separate PR). Includes a regression test (bridge::tests::download_image_to_blocks_populates_source_url) assertingsource_urlis populated on the HTTP fetch path.021c6dffix(runtime/claude_code): atomically publish materialized image tmpfilesReplaces the non-atomic
fs::writewith write-tmp +rename(2). Closes a torn-file window where a re-render racing aReadtool call could read a partially-written image. Orphan.tmpfiles are reaped by the existing 24h TTL sweep.46e5c8cfix(runtime/claude_code): pass--add-dirfor materialized image tmpfilesThe CLI's
Readtool refuses paths outside the agent workspace unless--add-dir'd (we don't want--dangerously-skip-permissionsas the only escape hatch). Adds the grant to both streaming and non-streaming command builders. The directory is per-user and content-addressed, so the grant is narrow.c93318frefactor(runtime): extract image_cache moduleLifts the cache helpers (
image_tmp_dir,ext_for_mime,materialize_image,sweep_old_image_tmpfiles,IMAGE_TMP_TTL_SECS, sweep guard) out of the driver intocrates/openfang-runtime/src/image_cache.rs. No behavior change — extracted helpers are byte-identical. Module ispubso channel adapters can resolve cached paths from base64 image blocks before forwarding them outbound.Notes for reviewers
download_image_to_blocksincrates/openfang-channels/src/bridge.rs— fix(channels/discord): surface image attachments to text-only providers #1143 addedfile://support; this PR setssource_urlon the resultingContentBlock::Image. The conflict is mechanical (one extra field on a struct literal) and I'll resolve it on whichever branch lands second.source_urlfield added but not read on this branch. It's wired through serde and every pattern match for type-safety, but no driver consumes it here. The consumer ships in the outbound-Discord PR (forthcoming) where it lets us round-trip a Discord CDN URL back to Discord without re-uploading bytes.bridge.rsedit only sets the new field on download).Test plan
cargo check --workspaceclean (only pre-existingimap-protofuture-incompat warning)cargo test -p openfang-channels --libclean (477 passed)Readsucceeds (validated end-to-end with a real receipt image; daemon log confirms materialization + CLI describes contents)download_image_to_blockspopulatessource_urlon the resultingContentBlock::Image(new regression test,bridge::tests::download_image_to_blocks_populates_source_url)Atomicity (commit 2) and the 24h TTL sweep (commit 4) are verified by code inspection:
rename(2)is POSIX-atomic and the module extraction is byte-identical with no call-site change.