feat: Add fips-agents add vision (closes #20)#39
Merged
Conversation
Closes #20. Drops `examples/vision_client.py` showing the three `image_url` URL forms the agent runtime accepts: inline `data:` URIs, remote `https://` URLs, and the internal `file_id:<id>` scheme that the agent rewrites to a `data:` URI server-side. Image input flows through the agent's existing `model.endpoint` — no separate vision endpoint split. Set `MODEL_ENDPOINT` and `MODEL_NAME` to a vision-capable model (Granite Vision 3.2-2B and others) before running the agent. Precondition: `server.files.enabled` must be `true` in agent.yaml. The `file_id:<id>` URL scheme resolves bytes via the BytesStore, which only exists when files is enabled. The command refuses to apply (with an actionable hint) until `fips-agents add files` has been run. Requires fipsagents>=0.20.0 in the project's dependencies. Notes on issue #20's wording: - No `vision:` section is added to agent.yaml. The agent-template audit for issue #101 explicitly chose a single multimodal endpoint via existing `model.endpoint` — adding a `vision:` block now would bake in a split that hasn't been needed. - Example code lives at `examples/vision_client.py` (client-side), not `src/agent.py`. Content blocks are constructed by callers and flow through the agent runtime automatically; the agent code itself doesn't need to change. Assisted-by: Claude Code (Opus 4.7)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Wires multimodal (image input) example client into existing agent
projects. Closes #20.
The new `add vision` command:
the `file_id:` URL scheme resolves bytes via the BytesStore,
which only exists when files is enabled). Refuses to apply with an
actionable hint when not satisfied.
forms the agent runtime accepts: inline `data:`, remote `https://`,
and internal `file_id:`.
vision-capable model (Granite Vision 3.2-2B canonical example) and
run the example script.
Pairs with `fipsagents 0.20.0` (image input in OpenAI content blocks)
and `fips-agents add files`.
Notes on the issue wording
audit for #101 (image input on the runtime side) explicitly chose a
single multimodal endpoint via the existing `model.endpoint` —
adding a `vision:` block now would bake in a split that hasn't
been needed yet.
not `src/agent.py`. Content blocks are constructed by callers; the
agent runtime resolves them automatically. The agent code itself
doesn't need to change for image input.
Test plan
+ spec shape).
without files enabled fails with the right hint; after
`fips-agents add files`, succeeds and drops the example.
`already exists` and exits 0.