Feat/image input compression #2964

Open
afjcjsbx wants to merge 22 commits into sipeed:main from afjcjsbx:feat/image-input-compression


@afjcjsbx
Collaborator

📝 Description

This PR adds configurable inbound image compression for PicoClaw's vision pipeline.

Previously, inbound images from channels were constrained only by max_media_size; no configurable, multi-level compression policy was applied before the model payload was built. As a result, inline image payloads could be oversized and put unnecessary load on multimodal providers.

With this change, PicoClaw now supports a new agents.defaults.image_input configuration block that allows:

  • enabling or disabling automatic inline attachment of user images
  • choosing a compression preset (off, low, balanced, aggressive, extreme)
  • bounding inline payload size
  • resizing images with max width / height limits
  • tuning JPEG quality
  • selecting the target output format (auto, jpeg, png)
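
To make the option list above concrete, here is a minimal sketch of how such a block could be decoded and validated in Go. The struct, its field names, and every JSON key (`auto_attach`, `max_inline_bytes`, and so on) are assumptions made for illustration; only the preset and format values come from this description, and the schema in the PR may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ImageInputConfig mirrors the options listed in the PR description.
// All field names and JSON keys are hypothetical, not the PR's schema.
type ImageInputConfig struct {
	AutoAttach     bool   `json:"auto_attach"`      // attach user images inline
	Preset         string `json:"preset"`           // off, low, balanced, aggressive, extreme
	MaxInlineBytes int    `json:"max_inline_bytes"` // 0 means no byte limit
	MaxWidth       int    `json:"max_width"`        // 0 means no width limit
	MaxHeight      int    `json:"max_height"`       // 0 means no height limit
	JPEGQuality    int    `json:"jpeg_quality"`     // 1..100, as in Go's image/jpeg
	Format         string `json:"format"`           // auto, jpeg, png
}

// The allowed values are taken from the PR description.
var validPresets = map[string]bool{
	"off": true, "low": true, "balanced": true, "aggressive": true, "extreme": true,
}
var validFormats = map[string]bool{"auto": true, "jpeg": true, "png": true}

// Validate rejects values outside the documented presets and formats.
func (c ImageInputConfig) Validate() error {
	if !validPresets[c.Preset] {
		return fmt.Errorf("unknown preset %q", c.Preset)
	}
	if !validFormats[c.Format] {
		return fmt.Errorf("unknown format %q", c.Format)
	}
	if c.JPEGQuality < 1 || c.JPEGQuality > 100 {
		return fmt.Errorf("jpeg quality %d out of range 1..100", c.JPEGQuality)
	}
	if c.MaxInlineBytes < 0 || c.MaxWidth < 0 || c.MaxHeight < 0 {
		return fmt.Errorf("size limits must not be negative")
	}
	return nil
}

// parseImageInput decodes the JSON block and validates it in one step.
func parseImageInput(raw []byte) (ImageInputConfig, error) {
	var cfg ImageInputConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return cfg, err
	}
	return cfg, cfg.Validate()
}

func main() {
	raw := []byte(`{
		"auto_attach": true,
		"preset": "balanced",
		"max_inline_bytes": 1048576,
		"max_width": 1536,
		"max_height": 1536,
		"jpeg_quality": 80,
		"format": "auto"
	}`)
	cfg, err := parseImageInput(raw)
	if err != nil {
		fmt.Println("invalid image_input config:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Validating at load time means a mistyped preset fails fast instead of silently falling back to uncompressed payloads.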

The implementation also preserves the existing local path-tag behavior, so images remain accessible through file references while vision-capable providers can receive a compressed inline image payload when enabled.

🗣️ Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 📖 Documentation update
  • ⚡ Code refactoring (no functional changes, no api changes)

🤖 AI Code Generation

  • 🤖 Fully AI-generated (100% AI, 0% Human)
  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)
  • 👨‍💻 Mostly Human-written (Human lead, AI assisted or none)

🔗 Related Issue

N/A

📚 Technical Context (Skip for Docs)

  • Reference URL: N/A
  • Reasoning: The previous inbound media flow did not provide a configurable compression strategy for user-supplied images before sending them to multimodal models. This PR introduces a production-ready, config-driven image optimization layer that reduces the risk of oversized payloads while preserving compatibility with the existing media resolution flow.

🧪 Test Environment

  • Hardware:
  • OS:
  • Model/Provider:
  • Channels:

📸 Evidence (Optional)


☑️ Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

afjcjsbx requested a review from alexhoshina on May 28, 2026, 22:32
