sandbox: Add more file read methods#68
sandbox: Add more file read methods#68scotttrinh wants to merge 7 commits intosandbox/better-error-hierarchyfrom
Conversation
- `iter_file`: a chunked bytes iterator - `download_file`: stream to a file
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Pull request overview
This PR expands the Sandbox filesystem API to support streamed reads and direct downloads to disk, and refines error typing by introducing a dedicated 404 “not found” sandbox error.
Changes:
- Add
iter_file()(chunked iterator) anddownload_file()(stream-to-file) to both sync and async Sandbox APIs. - Introduce
SandboxNotFoundErrorand map HTTP 404 responses to it. - Change
read_file()behavior to raise on 404 (instead of returningNone), with updated tests/examples.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/test_sandbox_errors.py | Adds coverage for the new SandboxNotFoundError and 404 classification. |
| tests/live/test_sandbox_live.py | Updates live file-operation expectations (404 now raises) and exercises download_file(). |
| tests/integration/test_sandbox_sync_async.py | Adds integration coverage for iter_file()/download_file() and updates 404 expectations for read_file(). |
| tests/integration/test_sandbox_streaming_errors.py | Extends streaming error matrix to include 404 → SandboxNotFoundError. |
| src/vercel/sandbox/sandbox.py | Exposes iter_file()/download_file() and updates read_file() return type/behavior in public API. |
| src/vercel/sandbox/command.py | Updates command kill behavior to catch SandboxNotFoundError for 404s. |
| src/vercel/sandbox/init.py | Re-exports SandboxNotFoundError publicly. |
| src/vercel/_internal/sandbox/errors.py | Defines the new SandboxNotFoundError type. |
| src/vercel/_internal/sandbox/core.py | Maps 404s to SandboxNotFoundError; implements streaming read + download plumbing in ops clients. |
| src/vercel/_internal/sandbox/init.py | Re-exports SandboxNotFoundError for internal consumers. |
| examples/sandbox_02_fs_ops.py | Demonstrates iter_file() and download_file() usage. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| async def read_file(self, path: str, *, cwd: str | None = None) -> bytes: | ||
| return await self.client.read_file(sandbox_id=self.sandbox.id, path=path, cwd=cwd) |
There was a problem hiding this comment.
read_file() now returns bytes and raises SandboxNotFoundError on 404 (previously it appears to have returned None for missing files, per updated tests). This is a breaking change for API consumers and isn’t mentioned in the PR title/description; consider either (a) documenting this behavior change prominently (README / docstring / release notes) or (b) preserving the old behavior via a separate method/flag (e.g., read_file(..., missing_ok=True)), so existing callers don’t unexpectedly start raising.
There was a problem hiding this comment.
Yeah, this is breaking... @vercel/team-python what do we think about making this dynamic via missing_ok (default True) so that it's backwards compatible?
src/vercel/_internal/sandbox/core.py
Outdated
| with open(temp_path, "wb") as f: | ||
| async for chunk in stream: | ||
| if chunk: | ||
| f.write(chunk) | ||
| os.replace(temp_path, destination) |
There was a problem hiding this comment.
BaseSandboxOpsClient.download_file() performs synchronous filesystem I/O (open(...), f.write(...), os.replace, os.remove) inside an async method. When used from AsyncSandbox.download_file(), this will block the event loop during writes, which can degrade async app responsiveness. Consider switching to anyio.open_file / anyio.Path (already a dependency) or running the blocking file operations via anyio.to_thread.run_sync / asyncio.to_thread to keep the method truly non-blocking for async callers.
There was a problem hiding this comment.
Nice catch, made a similar "sync vs async runtime" kind of split that we have in our http module to handle this kind of thing, now sync works via iter_coroutine and async works via anyio's file system API.
Introduce sync and async filesystem primitives that fit the iter_coroutine architecture.
Inject runtime-specific filesystem clients into sandbox ops and route local download path setup through the shared fs layer.
Route sandbox download writes through the filesystem client so async callers avoid blocking disk operations.
Keep download_file path validation in the shared sandbox ops client instead of duplicating it in the public wrappers.
iter_file: a chunked bytes iteratordownload_file: stream to a file