GitHub - x-hannibal/open-webui-easymage: Multi-engine image generation filter for Open WebUI. Features automated prompt enhancement, multi-language support, and real-time Vision QC scoring. Supports A1111, ComfyUI, and OpenAI backends with integrated performance telemetry.

✨Easymage: Generative Imaging & Prompt Engineering Filter

Easymage is a professional-grade orchestration filter for Open WebUI designed to transform your image generation workflow into a unified and intelligent experience. By simply prepending img triggers to any message, you activate an advanced pipeline that handles everything from multilingual prompt engineering to multi-engine generation and post-creation technical analysis.

This filter acts as an Intelligent Dispatcher, unlocking advanced, engine-specific parameters such as seed, style, quality, distilled CFG, and many others that are not natively exposed through the standard Open WebUI interface. Version 0.9.1 introduces a streamlined Subcommand Architecture and a Unified Help System, creating a seamless bridge between local power-user tools (Forge, ComfyUI) and cloud simplicity (OpenAI, Gemini).

🆕 What's New in v0.9.3 (vs v0.9.2-beta.2)

Aspect Ratio Intelligence: Fixed a critical bug where a hidden default aspect ratio was overriding custom dimensions. The system now mathematically infers the correct ratio from your Size setting automatically.
CLI Typographic Normalization: The Prompt Parser now intercepts and normalizes typographic dashes (like the em-dash — or en-dash – automatically inserted by mobile keyboards) back into valid CLI parameters (--), preventing syntax errors on smartphones.
Privacy & Local Setup: Added a dedicated section explaining how to force 100% local execution using the easy_cloud_mode toggle.
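The dash normalization described above can be sketched in a few lines. This is a minimal illustration, not the parser's actual code: the function name `normalize_dashes` is hypothetical, and it assumes that only an em-dash or en-dash at the start of a token should be treated as a mistyped `--`.

```python
import re

# Assumption: a typographic dash at the start of a token (preceded by
# whitespace or the start of the message) stands in for the ASCII "--".
# Dashes inside words are left alone.
_TOKEN_INITIAL_DASH = re.compile(r"(?<!\S)[\u2014\u2013]")


def normalize_dashes(message: str) -> str:
    """Replace token-initial em/en dashes with a double hyphen."""
    return _TOKEN_INITIAL_DASH.sub("--", message)


if __name__ == "__main__":
    # Prints the message with both dashes turned into "--".
    print(normalize_dashes("img neon \u2014 a robot \u2014no people"))
```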

⚡ Quick Start (Copy & Paste)

New to Easymage? Try these commands to see what it can do.

Goal	Command	What it does
Simple	`img A futuristic city made of glass`	Generates a 1024x1024 image using your default settings.
Portrait	`img ar=9:16 An astronaut in a flower field`	Creates a vertical image (Instagram/Reels format).
Advanced	`img sz=1280x720 +h -- A cyberpunk street rain`	Widescreen HD image with High-Res Fix enabled for extra detail.
Random	`img:r watercolor, peaceful`	Let the AI decide the subject, using "watercolor" as a style guide.
Native Lang	`img Un gatto che beve caffè`	Write in any language! Easymage translates and optimizes it for the model.

✨ Key Features

🌍 Multilingual Native: You don't need to speak English to get professional results. Write your prompt in any language. Easymage detects the language, translates it, and expands it into technical English optimized for the specific generation model.
Advanced Multi-Engine Routing: Native Direct HTTPX support for Forge, OpenAI (DALL-E-3 / DALL-E-2), and the full Google Ecosystem (Imagen 4, Veo, Nano Banana), with a standard API fallback for ComfyUI. Easymage translates universal commands into the specific "technical dialect" of each API.

📝 Compatibility Note: Throughout this documentation, we use "Forge" to refer to local Stable Diffusion backends. Easymage is fully compatible with both WebUI Forge and the classic Automatic1111 SD-WebUI, as they share the same API structure.
VRAM Auto-Optimization: Automatically manages your GPU memory. Before generating an image, Easymage checks your Ollama server and unloads unused models to ensure Forge or ComfyUI have enough VRAM to operate without crashing.
Zero-Latency Dispatch: Uses a persistent connection pool to communicate with backends. This reduces the time-to-first-token and image generation start time by eliminating repetitive network handshakes.
Selective Negation Strategy: A smart logic core that determines how to handle negative prompts. It automatically decides whether to use native API fields (for Gemini or Forge with Hires Fix) or to "inject" the negation into the LLM-enhanced description (for OpenAI and ComfyUI), ensuring perfect visual results.
Automated Prompt Engineering: Expands minimalist user input into high-fidelity technical prompts. It dynamically incorporates details about lighting, camera angles, textures, environment, and artistic style based on your requirements.
Vision Quality Audit (QC): Provides a real-time technical critique of the generated image using Vision LLMs. It assigns a numerical score (0-100%) and evaluates specific artifacts like noise, grain, melting, and aliasing (jaggies).
Universal CLI Syntax: Control every generation aspect with a single syntax (sz=, stp=, ar=, etc.). No more switching between different interfaces or learning complex API structures.
Performance Analytics: Every generation tracks precise metrics, including total execution time, image generation latency, and LLM throughput (Tokens per second).

🧠 The Three-Step Pipeline

Expansion & Optimization: The system detects the language, identifies your requested styles and exclusions, and uses an LLM to build a professional prompt.
Smart Generation (w/ VRAM Handover):
- Easymage cleans the VRAM (unloading idle LLMs).
- The request is routed to the chosen engine via direct, optimized HTTPX calls.
- If native parameters are supported, they are passed directly; otherwise, logic fallbacks apply.
Visual Verification: If a Vision-capable model is available, the final image is audited for prompt alignment and technical integrity.

Universal CLI (Command Line Interface)

💡 Core Usage & Syntax

Easymage is activated by prepending a specific trigger command to your message. It parses technical instructions, stylistic choices, and the subject in a single pass.

Subcommand Architecture

img [prompt]: Standard Generation. Triggers the full pipeline: Prompt Enhancement → Image Generation → Vision Audit.
img:p [prompt]: Prompt Only Mode. Executes language detection and prompt enhancement logic but stops before generating the image. Useful for iterating on prompt engineering without burning compute credits or time.
img:r [styles]: Random / "I'm Feeling Lucky" Mode. The LLM acts as an "Engine of Total Entropy," rolling virtual dice to select a subject and style from diverse categories (Nature, Street Photography, Art, Pop Culture, Architecture, Abstract). Any text you provide is treated as a Style Constraint, not a subject.
img ?: Manual Mode. Displays the help menu, shortcuts, and parameter tables directly in the chat. Typing just img (with no prompt) also triggers this mode.

Command Structure

The universal format is: img:[subcmd] [ flags ] [ parameters ] [ styles -- ] [ subject ] [ --no negative prompt ]

⚠️ CRITICAL SYNTAX RULE: If a parameter value contains spaces, you MUST use double quotes!

✅ Correct: smp="Euler a"

❌ Wrong: smp=Euler a

Part	Example	Description
Parameters	`sz=1024x1024 stp=30`	`key=value` pairs to control the generation (size, steps, etc.). Text values with spaces (e.g. sampler names) must be enclosed in double quotes: `smp="Euler a"`.
Flags	`+h -a`	Single-character toggles to enable (`+`) or disable (`-`) features.
Styles	`neon, 8k, macro`	Stylistic keywords that steer the LLM enhancer.
Subject	`-- a lonely robot`	The main content of your image. The `--` separator is mandatory only if you are specifying Styles to distinguish them from the Subject.
Negative Prompt	`--no people, blur`	Elements to exclude. The `--no` (or `—no`) separator triggers the logic fallback system.

💡 Smart Parsing: If you only use technical parameters (like sz=1024 or +h), Easymage automatically identifies the Subject as everything following the last parameter. You only need the -- separator when you want to provide descriptive Styles (e.g., img sz=1024 cinematic, low-angle -- a giant tree).
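To make the command structure concrete, here is a rough single-pass parser sketch. It is an illustration under stated assumptions, not Easymage's real parser: `parse_command` and the result keys are hypothetical names, flags and parameters are assumed to come before any style or subject text, and the text after `img:r` is simply returned as the subject.

```python
import re

_TRIGGER = re.compile(r"^img(?::(\w+))?(?:\s+|$)")
_PARAM = re.compile(r'^(\w+)=("([^"]*)"|\S+)\s*')   # key=value or key="two words"
_FLAG = re.compile(r"^([+-])([a-z])(?:\s+|$)")      # +h, -a, ...


def parse_command(message: str):
    """Split an img command into subcommand, flags, params, styles, subject, negative."""
    trigger = _TRIGGER.match(message)
    if not trigger:
        return None  # not an Easymage command
    result = {"subcmd": trigger.group(1) or "", "flags": {}, "params": {},
              "styles": "", "subject": "", "negative": ""}
    rest = message[trigger.end():]
    # Consume leading key=value pairs and single-character flags in any order.
    while True:
        param, flag = _PARAM.match(rest), _FLAG.match(rest)
        if param:
            quoted = param.group(3)
            result["params"][param.group(1)] = quoted if quoted is not None else param.group(2)
            rest = rest[param.end():]
        elif flag:
            result["flags"][flag.group(2)] = flag.group(1) == "+"
            rest = rest[flag.end():]
        else:
            break
    # Everything after "--no" is the negative prompt.
    parts = re.split(r"(?:^|\s)--no\s+", rest, maxsplit=1)
    body = parts[0]
    if len(parts) == 2:
        result["negative"] = parts[1].strip()
    # A standalone "--" separates styles from the subject.
    pieces = re.split(r"(?:^|\s)--(?:\s|$)", body, maxsplit=1)
    if len(pieces) == 2:
        result["styles"], result["subject"] = pieces[0].strip(), pieces[1].strip()
    else:
        result["subject"] = body.strip()
    return result
```

For example, `parse_command('img sz=1024 +h cinematic -- a giant tree --no people')` yields `sz` in `params`, `h` enabled in `flags`, `cinematic` as the style, `a giant tree` as the subject, and `people` as the negative prompt.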

Configuration Hierarchy & Precedence

Easymage uses an intelligent, three-layer system to determine which settings to apply for each image generation. This tiered approach gives you maximum flexibility, allowing for quick, one-time experiments without altering your preferred defaults. The rule is simple: more specific settings always override more general ones.

Here is the order of priority, from highest to lowest:

Priority	Source	Description
1 (Highest)	Command Line (CLI)	Parameters typed directly into your chat message (e.g., `sz=1024x1024`). These are temporary and last for only one generation.
2 (Medium)	User Valves (Chat Controls)	Your personal defaults, configured via the Controls icon within a chat. These apply to all your `img` commands in that context unless a CLI parameter is used.
3 (Lowest)	Global OWUI Settings	The server-wide defaults for image generation, configured by an admin in `Admin Panel > Settings > Images`. Easymage uses these as a final fallback.

Practical Example

Let's see how this works for the size parameter:

Global Setting: Your Open WebUI server is configured with a default image size of 1024x1024.
Valve Override: You go into your Easymage Valves and set the Size to 512x512. From now on, every time you type img a cat, the image will be 512x512.
CLI Override: You want to create a single widescreen image. You type img sz=1792x1024 a cat. For this specific request, Easymage will ignore both the Global setting and your Valve, generating a 1792x1024 image.

The next time you type img a dog, it will revert to using your Valve setting of 512x512.
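The precedence rule can be expressed as a simple lookup that walks the layers from most to least specific. This is a minimal sketch: `resolve_setting` is a hypothetical helper, and it assumes that `None` or an empty string means "not set at this layer".

```python
def resolve_setting(key, cli, valves, global_settings, default=None):
    """Return the first value set for `key`, checking CLI, then Valves, then globals."""
    for layer in (cli, valves, global_settings):
        value = layer.get(key)
        if value not in (None, ""):  # assumption: empty means "unset"
            return value
    return default


if __name__ == "__main__":
    global_settings = {"size": "1024x1024"}
    valves = {"size": "512x512"}
    # "img sz=1792x1024 a cat": the CLI value wins for this request only.
    print(resolve_setting("size", {"size": "1792x1024"}, valves, global_settings))
    # "img a dog": no CLI value, so the Valve applies again.
    print(resolve_setting("size", {}, valves, global_settings))
```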

⚙️ Command Line Parameters

These parameters allow you to override backend settings directly from the chat.

Parameter	Example	Description	Supported Engines
`ge`	`ge=o`	Generation Engine. Selects the generation backend.	All
`mdl`	`mdl=d3`	Model. Selects a specific checkpoint or API model version.	All
`sz`	`sz=800`	Size. Accepts `WxH` or a single value `N` (auto-converted to square `NxN`). Dimensions are then normalized based on engine constraints and the `ar` parameter.	All
`ar`	`ar=16:9`	Aspect Ratio. Automatically calculates size based on ratio.	All
`stp`	`stp=25`	Steps. Number of sampling iterations.	Forge / A1111
`sd`	`sd=42`	Seed. Numeric value for reproducible generations.	Forge / A1111, Gemini
`cs`	`cs=7.0`	CFG Scale. Classifier-Free Guidance intensity.	Forge, Comfy
`dcs`	`dcs=3.5`	Distilled CFG Scale. Specialized for Flux/SD3 models.	Forge / A1111
`stl`	`stl=v`	Style. Choose between `v` (vivid) or `n` (natural).	OpenAI
`smp`	`smp=d2s`	Sampler. Selects the sampling algorithm.	Forge / A1111
`sch`	`sch=k`	Scheduler. Selects the noise schedule type.	Forge / A1111
`hr`	`hr=2.0`	High-Res Scale. Enables Hires Fix and sets the multiplier.	Forge / A1111
`hru`	`hru=Latent`	High-Res Upscaler. Specifies the upscaling model.	Forge / A1111
`hdcs`	`hdcs=3.5`	Hires Distilled CFG. Distilled scale during the HR pass.	Forge / A1111
`dns`	`dns=0.45`	Denoising Strength. Intensity of the HR fix pass.	Forge / A1111
`auth`	`auth=sk..`	Authentication. Overrides global/valve keys for this request.	All

Command Line Flags (Toggles)

Flags provide a quick way to override your default Valves configuration for a single message.

Flag	Name	Description
`+h` / `-h`	High-Res	Toggles High-Res Fix. For OpenAI, `+h` activates `quality: hd`.
`+p` / `-p`	Prompt	Toggles the LLM Prompt Enhancer.
`+a` / `-a`	Audit	Toggles the post-generation Vision Quality Audit.
`+d` / `-d`	Debug	Toggles Debug Mode (prints internal state JSON in chat).

⚡ Shortcut Tables

Easymage uses optimized shortcodes to minimize typing while maintaining full technical control. However, for those who prefer clarity or are using scripts, full parameter names are also supported.

⚡ Engine Shortcuts (`ge=...`) ▼

Shortcut	Engine Name	Backend API
`f` / `a`	`forge` / `automatic1111`	SD Forge / A1111 (Direct HTTPX)
`o`	`openai`	DALL-E 3 (Direct HTTPX)
`g`	`gemini`	Imagen 3 (Direct HTTPX)
`c`	`comfyui`	ComfyUI (Open WebUI API)

⚡ Model Shortcuts (`mdl=...`) ▼

Shortcut	API Model Name / Identifier	Note / Description
OpenAI
`d3`	`dall-e-3`	Standard DALL-E 3
`d2`	`dall-e-2`	Legacy DALL-E 2
`g4o`	`gpt-4o`	Multimodal
`g4om`	`gpt-4o-mini`	Multimodal Mini
Google Imagen Series
`i4`	`imagen-4.0-generate-001`	Imagen 4 (Full) - Standard 2026
`veo`	`veo-3.0-generate-preview`	Veo 3 (Frame Gen / 4K)
Google Gemini Series
`g2.5f`	`gemini-2.5-flash-image`	Nano Banana (Flash / Fast)
`g3p`	`gemini-3-pro-image-preview`	Nano Banana Pro (High Quality)
Local / Forge
`flux`	`flux1-dev.safetensors`	Flux 1 Dev
`sdxl`	`sd_xl_base_1.0.safetensors`	SDXL Base

⚡ Aspect Ratio Shortcuts (`ar=...`) ▼

Shortcut	Ratio	Common Use Case
`1`	`1:1`	Square (Social Media)
`16`	`16:9`	Cinematic Widescreen
`9`	`9:16`	Vertical / Reels
`4`	`4:3`	Photography Standard
`3`	`3:4`	Portrait
`21`	`21:9`	Ultra-Widescreen

⚡ Sampler Shortcuts (`smp=...`) ▼

Code	Full Sampler Name	Code	Full Sampler Name
`d3s`	DPM++ 3M SDE	`df`	DPM fast
`d2sh`	DPM++ 2M SDE Heun	`dad`	DPM adaptive
`d2s`	DPM++ 2M SDE	`r`	Restart
`d2m`	DPM++ 2M	`h2`	HeunPP2
`d2sa`	DPM++ 2S a	`ip`	IPNDM
`ds`	DPM++ SDE	`ipv`	IPNDM_V
`ea`	Euler a	`de`	DEIS
`e`	Euler	`u`	UniPC
`l`	LMS	`lcm`	LCM
`h`	Heun	`di`	DDIM
`d2`	DPM2	`dic`	DDIM CFG++
`d2a`	DPM2 a	`dp`	DDPM

⚡ Scheduler Shortcuts (`sch=...`) ▼

Code	Full Scheduler Name	Code	Full Scheduler Name
`a`	Automatic	`ays`	Align Your Steps
`u`	Uniform	`aysg`	Align Your Steps GITS
`k`	Karras	`ays11`	Align Your Steps 11
`e`	Exponential	`ays32`	Align Your Steps 32
`pe`	Polyexponential	`s`	Simple
`su`	SGM Uniform	`n`	Normal
`ko`	KL Optimal	`di`	DDIM
`b`	Beta	`t`	Turbo

💡 Note: Text values with spaces (e.g. sampler names) must be enclosed in double quotes: smp="Euler a".

The Selective Negation Logic

🧠 The Selective Negation Strategy

One of the most complex challenges in AI image generation is "negation" (telling the AI what not to include). Most engines are designed to follow positive instructions and often struggle with negative ones unless they are formatted specifically for their architecture.

Easymage introduces a Selective Negation Strategy that automatically chooses the best method to handle your --no requirements.

1. Native API Handling (High Fidelity)

If the generation engine has a dedicated technical field for negative prompts, Easymage passes your exclusions directly to the API. This provides a "surgical" removal of elements without affecting the creative description of the main subject.

Gemini (Imagen 3): Uses the negativePrompt parameter natively.
Forge (A1111): Uses the negative_prompt field only when High-Res Fix (+h) is enabled, ensuring maximum quality during the upscale pass.

2. LLM Fallback Integration (Natural Description)

Many engines (like DALL-E 3) do not have a native "Negative Prompt" field. For these cases, or when Forge is used in standard mode, Easymage uses its LLM Prompt Engineer to "digest" the exclusions.

Instead of simply appending a list of words, the LLM rewrites the description to ensure those elements are logically absent.
Example: If you specify --no people, the LLM won't just say "no people"; it will describe a "completely deserted, silent landscape where no human presence is visible," which is much more effective for models like DALL-E.

Summary Logic Table

Engine / Condition	Method	How it works
Gemini	Native	Sent to the `negativePrompt` API field.
OpenAI (DALL-E 3)	LLM Fallback	Integrated into the descriptive flow of the prompt.
Forge (Hires Fix ON)	Native	Sent to the technical `negative_prompt` field.
Forge (Standard)	LLM Fallback	Integrated into the descriptive prompt by the LLM.
ComfyUI (Fallback)	LLM Fallback	Integrated into the descriptive prompt by the LLM.
`img:p` Trigger	LLM Fallback	Always integrated into the text to provide a ready-to-use natural prompt.
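The table above reduces to a small decision function. The sketch below only mirrors the documented rules; the function name `negation_method` and the return strings are illustrative, not identifiers from easymage.py.

```python
def negation_method(engine: str, hires_fix: bool = False, prompt_only: bool = False) -> str:
    """Return "native" or "llm_fallback" following the summary table."""
    if prompt_only:
        # img:p always folds exclusions into the text.
        return "llm_fallback"
    if engine == "gemini":
        return "native"
    if engine in ("forge", "automatic1111") and hires_fix:
        return "native"
    # OpenAI, ComfyUI, and Forge without Hires Fix.
    return "llm_fallback"
```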

🪄 The "AVOID" Protocol

When the LLM Fallback is active, the system prompt for the Enhancer is dynamically updated with the MANDATORY AVOID rule. This forces the LLM to verify that the forbidden elements are not just ignored, but that the scene is described in a way that confirms their absence, significantly improving the Vision Quality Audit success rate.

Intelligent Engine Router

⚙️ Engine & Parameter Mapping

Easymage acts as a high-level abstraction layer. It converts universal parameters into the specific JSON payloads required by each engine. Below are the technical mapping details.

⚡➡️ View Full Mapping Tables

1. Forge / Automatic1111 (Direct HTTPX)

Connection: Sends a direct POST request to /sdapi/v1/txt2img.

EasyMage Parameter	Forge API Field	Technical Logic
enhanced_prompt	`prompt`	Final expanded text from LLM.
negative_prompt	`negative_prompt`	⚠️ Conditional: Sent natively only if `+h` (Hires Fix) is ON. If OFF, it's integrated into the prompt text.
size	`width` / `height`	The `WxH` string is split into two integers.
stp / sd / cs	`steps` / `seed` / `cfg_scale`	Passed directly to the API.
dcs	`distilled_cfg_scale`	Used for Flux/SD3. Forces native `cfg_scale` to 1.0.
smp / sch	`sampler_name` / `scheduler`	Converted via `SAMPLER_MAP` and `SCHEDULER_MAP`.
hr / hru / dns	`enable_hr` / `hr_upscaler` / `denoising_strength`	Activates the Hires Fix pipeline.

2. OpenAI (DALL-E 3) (Direct HTTPX)

Connection: Sends a direct POST request to /images/generations.

EasyMage Parameter	OpenAI API Field	Technical Logic
enhanced_prompt	`prompt`	Main prompt.
negative_prompt	-	❌ LLM Fallback: Always integrated into the descriptive text.
enable_hr (`+h`)	`quality`	`True` → `hd`, `False` → `standard`.
stl	`style`	`v` → `vivid` (default), `n` → `natural`.
size	`size`	Snaps to `1024x1024`, `1792x1024`, or `1024x1792`.
user (Context)	`user`	Automatically passes your Open WebUI User ID.
-	`n`	⚠️ Hardcoded to 1 (API limitation).
-	`response_format`	⚠️ Hardcoded to `b64_json`.

3. Gemini (Imagen 3) (Direct HTTPX)

Connection: Sends a direct POST request to Google's :predict or :generateContent endpoint.

EasyMage Parameter	Gemini API Field	Technical Logic
enhanced_prompt	`instances[0].prompt`	Main prompt.
negative_prompt	`parameters.negativePrompt`	✅ Always Native: Supported directly by Gemini.
n	`parameters.sampleCount`	Number of images (1-4).
sd	`parameters.seed`	Supported because `addWatermark` is set to false.
ar / sz	`parameters.aspectRatio`	⚠️ Calculated: Pixel dimensions are converted back to a ratio string (`1:1`, `16:9`, etc.).
-	`parameters.safetySetting`	⚠️ Hardcoded to `block_none` for maximum flexibility.
-	`parameters.personGeneration`	⚠️ Hardcoded to `allow_all`.
-	`parameters.addWatermark`	⚠️ Hardcoded to `false`.
-	`parameters.includeReasoning`	⚠️ Hardcoded to `false`.

4. ComfyUI / Fallback (via Open WebUI API)

Connection: Uses the internal image_generations router of Open WebUI.

EasyMage Parameter	OWUI Form Field	Technical Logic
enhanced_prompt	`prompt`	Main prompt.
negative_prompt	-	❌ LLM Fallback: Integrated into the descriptive text.
mdl	`model`	Checkpoint name.
sz	`size`	Size string.
n	`n`	Number of images.
All others	-	❌ Unsupported: The standard OWUI router does not expose seed, steps, or CFG Scale for this method.

📐 Size & Aspect Ratio Logic

Easymage handles dimensions dynamically to satisfy different engine requirements:

Input Parsing: Using sz=N automatically sets the target to NxN. Using sz=WxH sets specific targets.
Normalization:
- OpenAI (DALL-E 3): Ignores specific pixels and "snaps" to the closest supported HD resolution (1024x1024, 1792x1024, or 1024x1792) based on the calculated aspect ratio.
- Gemini (Imagen 3): Converts the dimensions into one of the supported ratio strings (1:1, 4:3, 16:9, etc.).
- Forge / A1111: Uses the requested pixels but rounds them to the nearest multiple of 8 to ensure hardware compatibility.
Aspect Ratio Intelligence: The ar (Aspect Ratio) parameter has mathematical priority over the sz (Size) parameter. If an explicit Aspect Ratio is provided (via CLI or User Valves), the width of your sz is kept as the base reference, and the height is automatically recalculated to perfectly match the requested ratio (Height = Width / Ratio). If no ar is provided, the system infers the exact ratio automatically from your sz dimensions using their greatest common divisor.
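The arithmetic behind these rules is short enough to sketch. All function names below are hypothetical, and two details are assumptions: "closest" DALL-E 3 resolution is measured by the difference in width/height ratio, and ties when rounding to a multiple of 8 follow Python's default rounding.

```python
from math import gcd

DALLE3_SIZES = [(1024, 1024), (1792, 1024), (1024, 1792)]


def parse_size(sz: str):
    """sz=N becomes NxN; sz=WxH is taken as given."""
    parts = sz.lower().split("x")
    width = int(parts[0])
    height = int(parts[1]) if len(parts) > 1 else width
    return width, height


def infer_ratio(width: int, height: int) -> str:
    """Reduce WxH to its simplest ratio using the greatest common divisor."""
    d = gcd(width, height)
    return f"{width // d}:{height // d}"


def apply_ratio(width: int, ratio: str):
    """Keep the width and recompute the height: Height = Width / Ratio."""
    rw, rh = (int(v) for v in ratio.split(":"))
    return width, round(width * rh / rw)


def round_to_multiple_of_8(value: int) -> int:
    """Forge / A1111: round a dimension to the nearest multiple of 8."""
    return max(8, round(value / 8) * 8)


def snap_to_dalle3(width: int, height: int):
    """Pick the supported size whose aspect ratio is closest (assumed metric)."""
    target = width / height
    return min(DALLE3_SIZES, key=lambda s: abs(s[0] / s[1] - target))
```

For instance, `img sz=1024 ar=16:9` keeps the width of 1024 and recomputes the height as 1024 × 9 / 16 = 576, while `sz=1280x720` with no `ar` is inferred as `16:9`.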

🔧 Configuration & Valves

Easymage uses a hierarchical configuration system. Settings are applied in the following order of precedence (highest priority first):

CLI Command (e.g., img sz=1024x1024) -> Overrides everything for a single request.
User Valves (Personal Settings) -> Overrides Admin defaults. Accessed via Chat Controls.
Admin Valves (System Settings) -> Overrides Open WebUI global variables. Managed by Admins.
Global OWUI Settings -> Native Open WebUI settings (e.g., Admin Panel > Settings > Images).

1. User Valves (Personal Preferences)

These settings are specific to your account or current chat context. You can access them via the Controls icon (sliders) directly within a chat.

Valve	Default	Description
🔐 Authentication
`openai_auth`	`""`	Your personal OpenAI API Key. Overrides the system default.
`gemini_auth`	`""`	Your personal Google Gemini API Key. Overrides the system default.
`automatic1111_auth`	`""`	Your personal Forge credentials (`user:password`).
🎨 Workflow
`enhanced_prompt`	`True`	If enabled, the LLM rewrites and improves your prompt before generation.
`quality_audit`	`True`	Enables the Vision LLM to critique the generated image and assign a score.
`strict_audit`	`False`	Enables "Ruthless Mode" for the audit (penalizes hallucinations severely).
`debug`	`False`	Prints the full internal state JSON and API payloads to the server console.
⚙️ Generation Parameters
`model`	`None`	Forces a specific checkpoint (e.g., `flux1-dev.safetensors`) for all requests.
`size`	`1024x1024`	Default image resolution (`WxH`). It serves as the absolute base reference for the image width.
`aspect_ratio`	`""` (Empty)	Target aspect ratio (e.g., `16:9`, `4:3`). Behavior: If left empty, it is automatically derived from the `size`. If explicitly set, it overrides the height of your `size` parameter to enforce perfect proportions (Height = Width / Ratio).
`steps`	`20`	Number of sampling steps (Forge / A1111 only).
`seed`	`-1`	Default seed (`-1` = Random).
`cfg_scale`	`1.0`	Classifier-Free Guidance scale.
`distilled_cfg_scale`	`3.5`	Specific CFG for distilled models like Flux/SD3.
🛠️ Forge / A1111 Specifics
`sampler_name`	`Euler`	The sampling algorithm (e.g., `DPM++ 2M SDE`).
`scheduler`	`Simple`	The noise schedule type (e.g., `Karras`, `SGM Uniform`).
`enable_hr`	`False`	Enables High-Res Fix (Forge) or HD Quality (OpenAI).
`hr_scale`	`2.0`	Upscale multiplier (e.g., 1.5x, 2.0x).
`hr_upscaler`	`Latent`	The algorithm used for the upscaling pass.
`hr_distilled_cfg`	`3.5`	CFG scale used specifically during the Hires pass.
`denoising_strength`	`0.45`	How much the upscaler can modify the original image (0.0 - 1.0).

2. Admin Valves (System Defaults)

These settings are managed by the Administrator and apply to all users unless overridden. They control the infrastructure and connection behavior.

Valve	Default	Description
`easy_cloud_mode`	`True`	If `True`, ignores custom/local URLs for OpenAI/Gemini and uses the official public endpoints (`api.openai.com`, etc.). Disable this if you use a reverse proxy.
`generation_timeout`	`120`	Maximum time (seconds) to wait for an API response before failing.
`extreme_vram_cleanup`	`False`	Memory Safety: If `True`, unloads everything (including the active Chat LLM) before generation. If `False`, only unloads other idle models.
`persistent_vision_cache`	`False`	If `True`, saves the "Vision Capability" test results to a JSON file to speed up server restarts.
🔑 Global Auth Defaults
`openai_auth`	`""`	Global fallback API Key for OpenAI.
`gemini_auth`	`""`	Global fallback API Key for Gemini.
`automatic1111_auth`	`""`	Global fallback credentials for Forge (`user:password`).

🛡️ Privacy & Local Execution

Easymage is designed to respect your privacy. If you use local backends like LM Studio, Ollama, or Local Forge/ComfyUI, ensure that you configure the settings correctly to prevent external API calls.

Local-Only Mode: By default, Easymage might attempt to use official API endpoints for OpenAI/Gemini to increase performance.
The Switch: To keep everything 100% local, go to Admin Valves and set easy_cloud_mode to False. This forces the filter to use your local Open WebUI proxies and custom URLs instead of reaching out to the internet.
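The effect of the toggle can be pictured as a small URL selection step. This is a sketch under assumptions: `resolve_base_url` is a hypothetical helper, only the `api.openai.com` host is named in this document, and the Gemini host and both URL paths are illustrative.

```python
# Assumption: illustrative public base URLs; only api.openai.com is named above.
OFFICIAL_ENDPOINTS = {
    "openai": "https://api.openai.com/v1",
    "gemini": "https://generativelanguage.googleapis.com/v1beta",
}


def resolve_base_url(engine: str, custom_url: str, easy_cloud_mode: bool) -> str:
    """Use the official endpoint in cloud mode, otherwise the configured URL."""
    if easy_cloud_mode and engine in OFFICIAL_ENDPOINTS:
        return OFFICIAL_ENDPOINTS[engine]
    # easy_cloud_mode=False, or a local engine such as Forge: stay on the custom URL.
    return custom_url
```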

📌 Output, Citations & Performance

Easymage provides a transparent output system. Beneath the generated image, you will see three linked citations and a performance status bar.

1. The Citation System

[🚀 PROMPT]: Shows the Enhanced Prompt (the text actually seen by the GPU) followed by a structured recap of your original Styles and Negative Prompt.
[🟢 SCORE: XX%]: The result of the Visual Quality Audit. It includes a technical critique and a colored emoji indicator based on the score (80+ 🟢, 70+ 🔵, 60+ 🟡, 40+ 🟠, <40 🔴).
[🔍 DETAILS]: A full technical recap including the backend Engine used, the active Model, Resolution, and specific latency for each pipeline stage.
[ℹ️ INFO]: (Help Mode only) Displays version metadata and project links.
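The score-to-emoji thresholds can be written as a simple lookup. `score_indicator` is a hypothetical name, and the sketch assumes each threshold is inclusive (a score of exactly 80 is green).

```python
def score_indicator(score: float) -> str:
    """Map an audit score (0-100) to its colored indicator."""
    if score >= 80:
        return "🟢"
    if score >= 70:
        return "🔵"
    if score >= 60:
        return "🟡"
    if score >= 40:
        return "🟠"
    return "🔴"
```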

2. Real-Time Performance Tracking

The final status bar provides a detailed snapshot of the generation efficiency: [Total Time]s total | [Image Gen]s img | [Total Tokens] tk | [Throughput] tk/s

Total Time: The entire duration from your message to the final output.
img: The time spent waiting specifically for the Image Engine.
tk / tk/s: Token count and speed of the LLM during the Prompt Enhancement phase.
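A status line in this shape could be assembled as follows. The helper name, the one-decimal formatting, and the throughput formula (tokens divided by the time spent in the Prompt Enhancement call) are assumptions for illustration.

```python
def format_status(total_s: float, image_s: float, tokens: int, llm_s: float) -> str:
    """Build the performance bar: total time, image time, tokens, tokens/second."""
    # Assumption: throughput is measured over the LLM enhancement phase only.
    throughput = tokens / llm_s if llm_s > 0 else 0.0
    return f"{total_s:.1f}s total | {image_s:.1f}s img | {tokens} tk | {throughput:.1f} tk/s"


if __name__ == "__main__":
    print(format_status(12.34, 8.0, 150, 3.0))
```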

🛠️ Diagnostics & Debugging

If you encounter issues or want to see how Easymage is "thinking," you can activate Debug Mode via the Valve or by adding +d to your message.

In-Chat Debug: Easymage will print two formatted JSON blocks containing the current Internal Model State (parsed values, calculated ratios, selected engine) and the current Valve Configuration.
Docker Logs: Detailed "⚡ EASYMAGE DEBUG" logs are printed to the server console, including the raw System Prompts sent to the LLM and the raw responses from the image APIs.
Error Handling: If an engine fails, Easymage will intercept the error and display a detailed "❌ EASYMAGE ERROR" message in the chat, preventing the filter from crashing.

🔮 Future Developments

Easymage is constantly evolving. Here are the key features currently in the pipeline:

AWS Nova Canvas Integration: Integration of Amazon’s Nova Canvas model via AWS Bedrock into the EM image generation pipeline.
Filter Chainability: Insert EM logic into the native filters sequence, allowing it to interact with, modify, or pass data to other active filters.
Multi-Image & Batching: Support for generating multiple iterations in a single call and automated batch processing for high-volume workflows.
ComfyUI Native Integration: Bridging the gap with ComfyUI backends to leverage its node-based power directly through Easymage's streamlined syntax.
Fine-Tuned Control (LoRAs): Comprehensive support for custom LoRA injection (A1111/Forge & ComfyUI), enabling precise style and character consistency.
Image-to-Image (Img2Img): Implementation of the Img2Img pipeline, allowing users to use reference images as a foundation for Easymage-driven transformations.

📄 License

Easymage is released under the MIT License. Feel free to use, modify, and distribute it within the Open WebUI community.

🤝 Contributing & Support

Easymage is an orchestration layer for a complex and fragmented ecosystem. While developed on high-end hardware, its core mission is universal compatibility and robust control. Given the thousands of possible combinations between LLMs, Image Engines, and UI parameters, this version is a Public Beta.

We actively encourage feedback and issue reports regarding:

Engine Mappings: Incorrect parameter translations or missing features.
Runtime Errors: Crashes, hangs, or unexpected behavior in the Open WebUI pipeline.
Environment Issues: Compatibility bugs across different hardware or Docker setups.

Help us harden the orchestration logic by reporting any anomaly you encounter.

If you encounter bugs or have feature requests, please open an issue on the GitHub Repository or contact the author through the Open WebUI community portal.
