x-hannibal/open-webui-easymage

✨ Easymage: Generative Imaging & Prompt Engineering Filter

Easymage is a professional-grade orchestration filter for Open WebUI designed to transform your image generation workflow into a unified and intelligent experience. By prepending the img trigger to any message, you activate an advanced pipeline that handles everything from multilingual prompt engineering to multi-engine generation and post-creation technical analysis.

This filter acts as an Intelligent Dispatcher, unlocking advanced, engine-specific parameters such as seed, style, quality, distilled CFG, and many others that are not natively exposed through the standard Open WebUI interface. Version 0.9.1 introduced a streamlined Subcommand Architecture and a Unified Help System, creating a seamless bridge between local power-user tools (Forge, ComfyUI) and cloud simplicity (OpenAI, Gemini).


🆕 What's New in v0.9.3 (vs v0.9.2-beta.2)

  • Aspect Ratio Intelligence: Fixed a critical bug where a hidden default aspect ratio was overriding custom dimensions. The system now mathematically infers the correct ratio from your Size setting automatically.
  • CLI Typographic Normalization: The Prompt Parser now intercepts and normalizes typographic dashes (like the em-dash or en-dash automatically inserted by mobile keyboards) back into valid CLI parameters (--), preventing syntax errors on smartphones.
  • Privacy & Local Setup: Added a dedicated section explaining how to force 100% local execution using the easy_cloud_mode toggle.
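The dash normalization described above can be illustrated with a small sketch. This is not the filter's actual code: the function name `normalize_dashes` and the exact matching rule (only dashes that start a token are rewritten) are assumptions made for illustration.

```python
import re

# Typographic dashes that mobile keyboards commonly substitute for "--".
_TYPO_DASHES = "\u2014\u2013"  # em dash, en dash


def normalize_dashes(message: str) -> str:
    """Turn an em/en dash that starts a token into a plain double hyphen."""
    # Only a dash at the start of the message or right after whitespace is
    # rewritten, so dashes inside ordinary words are left untouched.
    return re.sub(rf"(?:(?<=\s)|^)[{_TYPO_DASHES}]", "--", message)


print(normalize_dashes("img neon \u2014 a robot \u2014no people"))
```

Running the normalization before any other parsing step means the rest of the parser only ever has to recognize the ASCII forms `--` and `--no`.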

⚡ Quick Start (Copy & Paste)

New to Easymage? Try these commands to see what it can do.

| Goal | Command | What it does |
| --- | --- | --- |
| Simple | img A futuristic city made of glass | Generates a 1024x1024 image using your default settings. |
| Portrait | img ar=9:16 An astronaut in a flower field | Creates a vertical image (Instagram/Reels format). |
| Advanced | img sz=1280x720 +h -- A cyberpunk street rain | Widescreen HD image with High-Res Fix enabled for extra detail. |
| Random | img:r watercolor, peaceful | Let the AI decide the subject, using "watercolor" as a style guide. |
| Native Lang | img Un gatto che beve caffè | Write in any language! Easymage translates and optimizes it for the model. |

✨ Key Features

  • 🌍 Multilingual Native: You don't need to speak English to get professional results. Write your prompt in any language. Easymage detects the language, translates it, and expands it into technical English optimized for the specific generation model.
  • Advanced Multi-Engine Routing: Native Direct HTTPX support for Forge, OpenAI (DALL-E-3 / DALL-E-2), and the full Google Ecosystem (Imagen4, Veo, Nano Banana), with a standard API fallback for ComfyUI. Easymage translates universal commands into the specific "technical dialect" of each API.

    📝 Compatibility Note: Throughout this documentation, we use "Forge" to refer to local Stable Diffusion backends. Easymage is fully compatible with both WebUI Forge and the classic Automatic1111 SD-WebUI, as they share the same API structure.

  • VRAM Auto-Optimization: Automatically manages your GPU memory. Before generating an image, Easymage checks your Ollama server and unloads unused models to ensure Forge or ComfyUI have enough VRAM to operate without crashing.
  • Zero-Latency Dispatch: Uses a persistent connection pool to communicate with backends. This reduces the time-to-first-token and image generation start time by eliminating repetitive network handshakes.
  • Selective Negation Strategy: A smart logic core that determines how to handle negative prompts. It automatically decides whether to use native API fields (for Gemini or Forge with Hires Fix) or to "inject" the negation into the LLM-enhanced description (for OpenAI and ComfyUI), ensuring perfect visual results.
  • Automated Prompt Engineering: Expands minimalist user input into high-fidelity technical prompts. It dynamically incorporates details about lighting, camera angles, textures, environment, and artistic style based on your requirements.
  • Vision Quality Audit (QC): Provides a real-time technical critique of the generated image using Vision LLMs. It assigns a numerical score (0-100%) and evaluates specific artifacts like noise, grain, melting, and aliasing (jaggies).
  • Universal CLI Syntax: Control every generation aspect with a single syntax (sz=, stp=, ar=, etc.). No more switching between different interfaces or learning complex API structures.
  • Performance Analytics: Every generation tracks precise metrics, including total execution time, image generation latency, and LLM throughput (Tokens per second).

🧠 The Three-Step Pipeline

  1. Expansion & Optimization: The system detects the language, identifies your requested styles and exclusions, and uses an LLM to build a professional prompt.
  2. Smart Generation (w/ VRAM Handover):
    • Easymage cleans the VRAM (unloading idle LLMs).
    • The request is routed to the chosen engine via direct, optimized HTTPX calls.
    • If native parameters are supported, they are passed directly; otherwise, logic fallbacks apply.
  3. Visual Verification: If a Vision-capable model is available, the final image is audited for prompt alignment and technical integrity.
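As a rough illustration of the VRAM handover in step 2, the sketch below decides which Ollama models to unload. It assumes the filter reads the list of loaded models from Ollama's `/api/ps` endpoint and unloads a model by sending a request with `keep_alive` set to 0; whether Easymage does exactly this is an assumption, and `plan_unloads` is a hypothetical helper name. The `extreme` argument mirrors the `extreme_vram_cleanup` valve.

```python
from typing import Optional


def plan_unloads(ps_response: dict, active_model: Optional[str],
                 extreme: bool = False) -> list:
    """Build one unload payload per Ollama model that should leave VRAM.

    ps_response is the parsed JSON of Ollama's /api/ps endpoint.
    Each returned payload is meant for a POST to /api/generate.
    """
    payloads = []
    for entry in ps_response.get("models", []):
        name = entry.get("name")
        if not name:
            continue
        # Keep the active chat model loaded unless extreme cleanup is on.
        if name == active_model and not extreme:
            continue
        payloads.append({"model": name, "keep_alive": 0})
    return payloads


loaded = {"models": [{"name": "llama3:8b"}, {"name": "llava:13b"}]}
print(plan_unloads(loaded, active_model="llama3:8b"))
```

Keeping the decision in a pure function makes it easy to test without a running Ollama server; the actual HTTP calls would be issued separately.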

Universal CLI (Command Line Interface)

💡 Core Usage & Syntax

Easymage is activated by prepending a specific trigger command to your message. It parses technical instructions, stylistic choices, and the subject in a single pass.

Subcommand Architecture

  • img [prompt]: Standard Generation. Triggers the full pipeline: Prompt Enhancement → Image Generation → Vision Audit.
  • img:p [prompt]: Prompt Only Mode. Executes language detection and prompt enhancement logic but stops before generating the image. Useful for iterating on prompt engineering without burning compute credits or time.
  • img:r [styles]: Random / "I'm Feeling Lucky" Mode. The LLM acts as an "Engine of Total Entropy," rolling virtual dice to select a subject and style from diverse categories (Nature, Street Photography, Art, Pop Culture, Architecture, Abstract). Any text you provide is treated as a Style Constraint, not a subject.
  • img ?: Manual Mode. Displays the help menu, shortcuts, and parameter tables directly in the chat. Typing just img (with no prompt) also triggers this mode.

Command Structure

The universal format is: img:[subcmd] [ flags ] [ parameters ] [ styles -- ] [ subject] [ --no negative prompt ]

⚠️ CRITICAL SYNTAX RULE: If a parameter value contains spaces, you MUST use double quotes!

  • ✅ Correct: smp="Euler a"
  • ❌ Wrong: smp=Euler a

| Part | Example | Description |
| --- | --- | --- |
| Parameters | sz=1024x1024 stp=30 | key=value pairs to control the generation (size, steps, etc.). Text values with spaces (e.g. sampler names) must be enclosed in double quotes: smp="Euler a". |
| Flags | +h -a | Single-character toggles to enable (+) or disable (-) features. |
| Styles | neon, 8k, macro | Stylistic keywords that steer the LLM enhancer. |
| Subject | -- a lonely robot | The main content of your image. The -- separator is mandatory only if you are specifying Styles to distinguish them from the Subject. |
| Negative Prompt | --no people, blur | Elements to exclude. The --no (or —no) separator triggers the logic fallback system. |

💡 Smart Parsing: If you only use technical parameters (like sz=1024 or +h), Easymage automatically identifies the Subject as everything following the last parameter. You only need the -- separator when you want to provide descriptive Styles (e.g., img sz=1024 cinematic, low-angle -- a giant tree).
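The command structure above can be sketched as a small single-pass parser. This is a simplified illustration, not the filter's real Prompt Parser: the names `ParsedCommand` and `parse_command` and the regular expressions are assumptions, and edge cases such as help mode are left out.

```python
import re
from dataclasses import dataclass, field

# key=value, where the value is either "double quoted" or a run of non-spaces.
PARAM = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))\s*')
# +x / -x single-character toggles.
FLAG = re.compile(r'([+-])([a-z])(?:\s+|$)')


@dataclass
class ParsedCommand:
    subcmd: str = ""
    params: dict = field(default_factory=dict)
    flags: dict = field(default_factory=dict)
    styles: list = field(default_factory=list)
    subject: str = ""
    negative: str = ""


def parse_command(message: str):
    """Return a ParsedCommand, or None if the message is not an img command."""
    text = message.strip()
    trigger = re.match(r'img(?::(\w+))?(?:\s+|$)', text)
    if trigger is None:
        return None
    cmd = ParsedCommand(subcmd=trigger.group(1) or "")
    rest = text[trigger.end():]

    # 1. Everything after "--no" is the negative prompt.
    head, *tail = re.split(r'(?:^|\s)--no\s+', rest, maxsplit=1)
    cmd.negative = tail[0].strip() if tail else ""

    # 2. Consume leading parameters and flags, in any order.
    while True:
        if m := PARAM.match(head):
            quoted, bare = m.group(2), m.group(3)
            cmd.params[m.group(1)] = quoted if quoted is not None else bare
        elif m := FLAG.match(head):
            cmd.flags[m.group(2)] = m.group(1) == "+"
        else:
            break
        head = head[m.end():]

    # 3. "styles -- subject"; without a separator the remainder is the subject.
    parts = re.split(r'(?:^|\s+)--(?:\s+|$)', head, maxsplit=1)
    if len(parts) == 2:
        cmd.styles = [s.strip() for s in parts[0].split(",") if s.strip()]
        cmd.subject = parts[1].strip()
    else:
        cmd.subject = head.strip()
    return cmd


print(parse_command('img sz=1024 smp="Euler a" +h cinematic, low-angle -- a giant tree --no people, blur'))
```

Splitting off the negative prompt first keeps the later steps simple: parameters and flags are peeled from the front, and whatever remains is either "styles -- subject" or just the subject.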

Configuration Hierarchy & Precedence

Easymage uses an intelligent, three-layer system to determine which settings to apply for each image generation. This tiered approach gives you maximum flexibility, allowing for quick, one-time experiments without altering your preferred defaults. The rule is simple: more specific settings always override more general ones.

Here is the order of priority, from highest to lowest:

| Priority | Source | Description |
| --- | --- | --- |
| 1 (Highest) | Command Line (CLI) | Parameters typed directly into your chat message (e.g., sz=1024x1024). These are temporary and last for only one generation. |
| 2 (Medium) | User Valves (Chat Controls) | Your personal defaults, configured via the Controls icon within a chat. These apply to all your img commands in that context unless a CLI parameter is used. |
| 3 (Lowest) | Global OWUI Settings | The server-wide defaults for image generation, configured by an admin in Admin Panel > Settings > Images. Easymage uses these as a final fallback. |

Practical Example

Let's see how this works for the size parameter:

  1. Global Setting: Your Open WebUI server is configured with a default image size of 1024x1024.
  2. Valve Override: You go into your Easymage Valves and set the Size to 512x512. From now on, every time you type img a cat, the image will be 512x512.
  3. CLI Override: You want to create a single widescreen image. You type img sz=1792x1024 a cat. For this specific request, Easymage will ignore both the Global setting and your Valve, generating a 1792x1024 image.

The next time you type img a dog, it will revert to using your Valve setting of 512x512.
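The same precedence rule can be written as a short lookup. This is a minimal sketch with hypothetical names (`resolve_setting` and the three plain dictionaries); the real filter reads its values from Open WebUI valve objects rather than dicts.

```python
def resolve_setting(name, cli, user_valves, global_settings):
    """Return the first non-empty value, checking the most specific layer first."""
    for layer in (cli, user_valves, global_settings):
        value = layer.get(name)
        if value not in (None, ""):
            return value
    return None


global_settings = {"size": "1024x1024"}
user_valves = {"size": "512x512"}

# "img sz=1792x1024 a cat": the CLI value wins for this one request.
print(resolve_setting("size", {"size": "1792x1024"}, user_valves, global_settings))
# "img a dog": no CLI value, so the Valve applies again.
print(resolve_setting("size", {}, user_valves, global_settings))
```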


⚙️ Command Line Parameters

These parameters allow you to override backend settings directly from the chat.

| Parameter | Example | Description | Supported Engines |
| --- | --- | --- | --- |
| ge | ge=o | Generation Engine. Selects the generation backend. | All |
| mdl | mdl=d3 | Model. Selects a specific checkpoint or API model version. | All |
| sz | sz=800 | Size. Accepts WxH or a single value N (auto-converted to square NxN). Dimensions are then normalized based on engine constraints and the ar parameter. | All |
| ar | ar=16:9 | Aspect Ratio. Automatically calculates size based on ratio. | All |
| stp | stp=25 | Steps. Number of sampling iterations. | Forge / A1111 |
| sd | sd=42 | Seed. Numeric value for reproducible generations. | Forge / A1111, Gemini |
| cs | cs=7.0 | CFG Scale. Classifier-Free Guidance intensity. | Forge, Comfy |
| dcs | dcs=3.5 | Distilled CFG Scale. Specialized for Flux/SD3 models. | Forge / A1111 |
| stl | stl=v | Style. Choose between v (vivid) or n (natural). | OpenAI |
| smp | smp=d2s | Sampler. Selects the sampling algorithm. | Forge / A1111 |
| sch | sch=k | Scheduler. Selects the noise schedule type. | Forge / A1111 |
| hr | hr=2.0 | High-Res Scale. Enables Hires Fix and sets the multiplier. | Forge / A1111 |
| hru | hru=Latent | High-Res Upscaler. Specifies the upscaling model. | Forge / A1111 |
| hdcs | hdcs=3.5 | Hires Distilled CFG. Distilled scale during the HR pass. | Forge / A1111 |
| dns | dns=0.45 | Denoising Strength. Intensity of the HR fix pass. | Forge / A1111 |
| auth | auth=sk.. | Authentication. Overrides global/valve keys for this request. | All |

Command Line Flags (Toggles)

Flags provide a quick way to override your default Valves configuration for a single message.

| Flag | Name | Description |
| --- | --- | --- |
| +h / -h | High-Res | Toggles High-Res Fix. For OpenAI, +h activates quality: hd. |
| +p / -p | Prompt | Toggles the LLM Prompt Enhancer. |
| +a / -a | Audit | Toggles the post-generation Vision Quality Audit. |
| +d / -d | Debug | Toggles Debug Mode (prints internal state JSON in chat). |

⚡ Shortcut Tables

Easymage uses optimized shortcodes to minimize typing while maintaining full technical control. However, for those who prefer clarity or are using scripts, full parameter names are also supported.

⚡ Engine Shortcuts (`ge=...`)

| Shortcut | Engine Name | Backend API |
| --- | --- | --- |
| f / a | forge / automatic1111 | SD Forge / A1111 (Direct HTTPX) |
| o | openai | DALL-E 3 (Direct HTTPX) |
| g | gemini | Imagen 3 (Direct HTTPX) |
| c | comfyui | ComfyUI (Open WebUI API) |

⚡ Model Shortcuts (`mdl=...`)

| Shortcut | API Model Name / Identifier | Note / Description |
| --- | --- | --- |
| OpenAI | | |
| d3 | dall-e-3 | Standard DALL-E 3 |
| d2 | dall-e-2 | Legacy DALL-E 2 |
| g4o | gpt-4o | Multimodal |
| g4om | gpt-4o-mini | Multimodal Mini |
| Google Imagen Series | | |
| i4 | imagen-4.0-generate-001 | Imagen 4 (Full) - Standard 2026 |
| veo | veo-3.0-generate-preview | Veo 3 (Frame Gen / 4K) |
| Google Gemini Series | | |
| g2.5f | gemini-2.5-flash-image | Nano Banana (Flash / Fast) |
| g3p | gemini-3-pro-image-preview | Nano Banana Pro (High Quality) |
| Local / Forge | | |
| flux | flux1-dev.safetensors | Flux 1 Dev |
| sdxl | sd_xl_base_1.0.safetensors | SDXL Base |

⚡ Aspect Ratio Shortcuts (`ar=...`)

| Shortcut | Ratio | Common Use Case |
| --- | --- | --- |
| 1 | 1:1 | Square (Social Media) |
| 16 | 16:9 | Cinematic Widescreen |
| 9 | 9:16 | Vertical / Reels |
| 4 | 4:3 | Photography Standard |
| 3 | 3:4 | Portrait |
| 21 | 21:9 | Ultra-Widescreen |

⚡ Sampler Shortcuts (`smp=...`)

| Code | Full Sampler Name | Code | Full Sampler Name |
| --- | --- | --- | --- |
| d3s | DPM++ 3M SDE | df | DPM fast |
| d2sh | DPM++ 2M SDE Heun | dad | DPM adaptive |
| d2s | DPM++ 2M SDE | r | Restart |
| d2m | DPM++ 2M | h2 | HeunPP2 |
| d2sa | DPM++ 2S a | ip | IPNDM |
| ds | DPM++ SDE | ipv | IPNDM_V |
| ea | Euler a | de | DEIS |
| e | Euler | u | UniPC |
| l | LMS | lcm | LCM |
| h | Heun | di | DDIM |
| d2 | DPM2 | dic | DDIM CFG++ |
| d2a | DPM2 a | dp | DDPM |

⚡ Scheduler Shortcuts (`sch=...`)

| Code | Full Scheduler Name | Code | Full Scheduler Name |
| --- | --- | --- | --- |
| a | Automatic | ays | Align Your Steps |
| u | Uniform | aysg | Align Your Steps GITS |
| k | Karras | ays11 | Align Your Steps 11 |
| e | Exponential | ays32 | Align Your Steps 32 |
| pe | Polyexponential | s | Simple |
| su | SGM Uniform | n | Normal |
| ko | KL Optimal | di | DDIM |
| b | Beta | t | Turbo |

💡 Note: Text values with spaces (e.g. sampler names) must be enclosed in double quotes: smp="Euler a".
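Shortcut expansion amounts to a dictionary lookup that falls back to the value as typed, which is how full names remain usable. The sketch below shows only a handful of entries copied from the sampler and scheduler tables; the real SAMPLER_MAP and SCHEDULER_MAP contents and the helper name `expand_shortcut` are assumptions.

```python
# A few entries copied from the shortcut tables, for illustration only.
SAMPLER_MAP = {"ea": "Euler a", "e": "Euler", "d2s": "DPM++ 2M SDE", "d2m": "DPM++ 2M"}
SCHEDULER_MAP = {"a": "Automatic", "k": "Karras", "s": "Simple", "su": "SGM Uniform"}


def expand_shortcut(value: str, table: dict) -> str:
    """Return the full name for a known shortcode, else the value unchanged."""
    return table.get(value.lower(), value)


print(expand_shortcut("d2s", SAMPLER_MAP))      # shortcode
print(expand_shortcut("Euler a", SAMPLER_MAP))  # full name passes through
```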


The Selective Negation Logic

🧠 The Selective Negation Strategy

One of the most complex challenges in AI image generation is "negation" (telling the AI what not to include). Most engines are designed to follow positive instructions and often struggle with negative ones unless they are formatted specifically for their architecture.

Easymage introduces a Selective Negation Strategy that automatically chooses the best method to handle your --no requirements.

1. Native API Handling (High Fidelity)

If the generation engine has a dedicated technical field for negative prompts, Easymage passes your exclusions directly to the API. This provides a "surgical" removal of elements without affecting the creative description of the main subject.

  • Gemini (Imagen 3): Uses the negativePrompt parameter natively.
  • Forge (A1111): Uses the negative_prompt field only when High-Res Fix (+h) is enabled, ensuring maximum quality during the upscale pass.

2. LLM Fallback Integration (Natural Description)

Many engines (like DALL-E 3) do not have a native "Negative Prompt" field. For these cases, or when Forge is used in standard mode, Easymage uses its LLM Prompt Engineer to "digest" the exclusions.

  • Instead of simply appending a list of words, the LLM rewrites the description to ensure those elements are logically absent.
  • Example: If you specify --no people, the LLM won't just say "no people"; it will describe a "completely deserted, silent landscape where no human presence is visible," which is much more effective for models like DALL-E.

Summary Logic Table

| Engine / Condition | Method | How it works |
| --- | --- | --- |
| Gemini | Native | Sent to the negativePrompt API field. |
| OpenAI (DALL-E 3) | LLM Fallback | Integrated into the descriptive flow of the prompt. |
| Forge (Hires Fix ON) | Native | Sent to the technical negative_prompt field. |
| Forge (Standard) | LLM Fallback | Integrated into the descriptive prompt by the LLM. |
| ComfyUI (Fallback) | LLM Fallback | Integrated into the descriptive prompt by the LLM. |
| img:p Trigger | LLM Fallback | Always integrated into the text to provide a ready-to-use natural prompt. |
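The summary table reduces to a small decision function. The sketch below encodes only the rows listed above; the function name `negation_method` and the string return values are hypothetical.

```python
def negation_method(engine: str, hires_fix: bool = False,
                    prompt_only: bool = False) -> str:
    """Return "native" or "llm_fallback" for handling a --no request."""
    if prompt_only:
        # img:p always folds the exclusions into the text.
        return "llm_fallback"
    if engine == "gemini":
        return "native"
    if engine in ("forge", "automatic1111") and hires_fix:
        return "native"
    # OpenAI, ComfyUI, and Forge in standard mode.
    return "llm_fallback"


for engine, hires in [("gemini", False), ("forge", True), ("forge", False), ("openai", False)]:
    print(engine, hires, negation_method(engine, hires))
```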

🪄 The "AVOID" Protocol

When the LLM Fallback is active, the system prompt for the Enhancer is dynamically updated with the MANDATORY AVOID rule. This forces the LLM to verify that the forbidden elements are not just ignored, but that the scene is described in a way that confirms their absence, significantly improving the Vision Quality Audit success rate.


Intelligent Engine Router

⚙️ Engine & Parameter Mapping

Easymage acts as a high-level abstraction layer. It converts universal parameters into the specific JSON payloads required by each engine. Below are the technical mapping details.


1. Forge / Automatic1111 (Direct HTTPX)

Connection: Sends a direct POST request to /sdapi/v1/txt2img.

| EasyMage Parameter | Forge API Field | Technical Logic |
| --- | --- | --- |
| enhanced_prompt | prompt | Final expanded text from LLM. |
| negative_prompt | negative_prompt | ⚠️ Conditional: Sent natively only if +h (Hires Fix) is ON. If OFF, it's integrated into the prompt text. |
| size | width / height | The WxH string is split into two integers. |
| stp / sd / cs | steps / seed / cfg_scale | Passed directly to the API. |
| dcs | distilled_cfg_scale | Used for Flux/SD3. Forces native cfg_scale to 1.0. |
| smp / sch | sampler_name / scheduler | Converted via SAMPLER_MAP and SCHEDULER_MAP. |
| hr / hru / dns | enable_hr / hr_upscaler / denoising_strength | Activates the Hires Fix pipeline. |
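To make the Forge mapping concrete, here is a sketch that assembles a txt2img payload from the fields in the table. The function name `build_forge_payload`, its signature, and its default values are assumptions; only the payload keys come from the table above. Sending the request is left out.

```python
def build_forge_payload(prompt, size, negative="", steps=20, seed=-1,
                        cfg_scale=7.0, distilled_cfg=None, hires=False,
                        hr_scale=2.0, hr_upscaler="Latent", denoising=0.45):
    """Assemble a JSON body for POST /sdapi/v1/txt2img."""
    # The "WxH" string is split into two integers.
    width, height = (int(v) for v in size.lower().split("x"))
    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "steps": steps,
        "seed": seed,
        "cfg_scale": cfg_scale,
    }
    if distilled_cfg is not None:
        # Flux/SD3: use the distilled scale and pin the native CFG to 1.0.
        payload["distilled_cfg_scale"] = distilled_cfg
        payload["cfg_scale"] = 1.0
    if hires:
        payload.update(enable_hr=True, hr_scale=hr_scale,
                       hr_upscaler=hr_upscaler, denoising_strength=denoising)
        # The native negative field is only used when Hires Fix is on.
        if negative:
            payload["negative_prompt"] = negative
    return payload


print(build_forge_payload("a giant tree", "1024x768", negative="people", hires=True))
```

When Hires Fix is off, the negative prompt is expected to have been merged into `prompt` by the LLM beforehand, so this function simply leaves it out.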

2. OpenAI (DALL-E 3) (Direct HTTPX)

Connection: Sends a direct POST request to /images/generations.

| EasyMage Parameter | OpenAI API Field | Technical Logic |
| --- | --- | --- |
| enhanced_prompt | prompt | Main prompt. |
| negative_prompt | - | LLM Fallback: Always integrated into the descriptive text. |
| enable_hr (+h) | quality | True → hd, False → standard. |
| stl | style | v → vivid (default), n → natural. |
| size | size | Snaps to 1024x1024, 1792x1024, or 1024x1792. |
| user (Context) | user | Automatically passes your Open WebUI User ID. |
| - | n | ⚠️ Hardcoded to 1 (API limitation). |
| - | response_format | ⚠️ Hardcoded to b64_json. |
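A possible shape for the OpenAI mapping, including the size snapping, is sketched below. The helper names and the "closest ratio on a log scale" rule are assumptions; the README only states that sizes snap to one of the three supported resolutions.

```python
import math

DALLE3_SIZES = ("1024x1024", "1792x1024", "1024x1792")


def snap_size(width: int, height: int) -> str:
    """Pick the supported size whose aspect ratio is closest to width/height."""
    target = math.log(width / height)

    def distance(size: str) -> float:
        w, h = (int(v) for v in size.split("x"))
        return abs(target - math.log(w / h))

    return min(DALLE3_SIZES, key=distance)


def build_openai_payload(prompt, width, height, hires=False, style="v", user_id=None):
    """Assemble a JSON body for POST /images/generations."""
    payload = {
        "model": "dall-e-3",
        "prompt": prompt,  # exclusions are already folded into the text
        "size": snap_size(width, height),
        "quality": "hd" if hires else "standard",
        "style": {"v": "vivid", "n": "natural"}.get(style, "vivid"),
        "n": 1,
        "response_format": "b64_json",
    }
    if user_id:
        payload["user"] = user_id
    return payload


print(build_openai_payload("a cyberpunk street in the rain", 1280, 720, hires=True))
```

Comparing ratios on a log scale treats "twice as wide" and "twice as tall" symmetrically, which a plain difference of ratios would not.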

3. Gemini (Imagen 3) (Direct HTTPX)

Connection: Sends a direct POST request to Google's :predict or :generateContent endpoint.

| EasyMage Parameter | Gemini API Field | Technical Logic |
| --- | --- | --- |
| enhanced_prompt | instances[0].prompt | Main prompt. |
| negative_prompt | parameters.negativePrompt | Always Native: Supported directly by Gemini. |
| n | parameters.sampleCount | Number of images (1-4). |
| sd | parameters.seed | Supported because addWatermark is set to false. |
| ar / sz | parameters.aspectRatio | ⚠️ Calculated: Pixel dimensions are converted back to a ratio string (1:1, 16:9, etc.). |
| - | parameters.safetySetting | ⚠️ Hardcoded to block_none for maximum flexibility. |
| - | parameters.personGeneration | ⚠️ Hardcoded to allow_all. |
| - | parameters.addWatermark | ⚠️ Hardcoded to false. |
| - | parameters.includeReasoning | ⚠️ Hardcoded to false. |
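The Gemini mapping can be sketched the same way. The payload keys below are copied from the table; the list of supported ratio strings, the nearest-ratio rule, and the function names are assumptions made for this example.

```python
import math

# Assumed set of ratio strings accepted by the endpoint.
SUPPORTED_RATIOS = ("1:1", "3:4", "4:3", "9:16", "16:9")


def to_ratio_string(width: int, height: int) -> str:
    """Convert pixel dimensions back to the closest supported ratio string."""
    target = math.log(width / height)

    def distance(label: str) -> float:
        a, b = label.split(":")
        return abs(target - math.log(int(a) / int(b)))

    return min(SUPPORTED_RATIOS, key=distance)


def build_gemini_payload(prompt, width, height, negative="", seed=None, n=1):
    """Assemble a :predict request body using the fields from the table."""
    parameters = {
        "sampleCount": n,
        "aspectRatio": to_ratio_string(width, height),
        "safetySetting": "block_none",
        "personGeneration": "allow_all",
        "addWatermark": False,
        "includeReasoning": False,
    }
    if negative:
        parameters["negativePrompt"] = negative
    if seed is not None:
        parameters["seed"] = seed
    return {"instances": [{"prompt": prompt}], "parameters": parameters}


print(build_gemini_payload("a lonely robot", 1280, 720, negative="people", seed=42))
```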

4. ComfyUI / Fallback (via Open WebUI API)

Connection: Uses the internal image_generations router of Open WebUI.

| EasyMage Parameter | OWUI Form Field | Technical Logic |
| --- | --- | --- |
| enhanced_prompt | prompt | Main prompt. |
| negative_prompt | - | LLM Fallback: Integrated into the descriptive text. |
| mdl | model | Checkpoint name. |
| sz | size | Size string. |
| n | n | Number of images. |
| All others | - | Unsupported: The standard OWUI router does not expose seed, steps, or CFG Scale for this method. |

📐 Size & Aspect Ratio Logic

Easymage handles dimensions dynamically to satisfy different engine requirements:

  1. Input Parsing: Using sz=N automatically sets the target to NxN. Using sz=WxH sets specific targets.
  2. Normalization:
    • OpenAI (DALL-E 3): Ignores specific pixels and "snaps" to the closest supported HD resolution (1024x1024, 1792x1024, or 1024x1792) based on the calculated aspect ratio.
    • Gemini (Imagen 3): Converts the dimensions into one of the supported ratio strings (1:1, 4:3, 16:9, etc.).
    • Forge / A1111: Uses the requested pixels but rounds them to the nearest multiple of 8 to ensure hardware compatibility.
  3. Aspect Ratio Intelligence: The ar (Aspect Ratio) parameter has mathematical priority over the sz (Size) parameter. If an explicit Aspect Ratio is provided (via CLI or User Valves), the width of your sz is kept as the base reference, and the height is automatically recalculated to perfectly match the requested ratio (Height = Width / Ratio). If no ar is provided, the system infers the exact ratio automatically from your sz dimensions using their greatest common divisor.
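The arithmetic behind these steps is short enough to show directly. The sketch below uses hypothetical helper names and covers input parsing, ratio inference via the greatest common divisor, the Height = Width / Ratio rule, and rounding to a multiple of 8.

```python
from math import gcd


def parse_size(sz: str):
    """"800" becomes (800, 800); "1280x720" becomes (1280, 720)."""
    if "x" in sz.lower():
        w, h = sz.lower().split("x")
        return int(w), int(h)
    return int(sz), int(sz)


def infer_ratio(width: int, height: int) -> str:
    """Reduce width:height by their greatest common divisor."""
    d = gcd(width, height)
    return f"{width // d}:{height // d}"


def apply_ratio(width: int, ratio: str):
    """Keep the width and recompute the height: Height = Width / Ratio."""
    a, b = (int(v) for v in ratio.split(":"))
    return width, round(width * b / a)


def round_to_8(value: int) -> int:
    """Round to the nearest multiple of 8 (ties round up)."""
    return max(8, (value + 4) // 8 * 8)


print(parse_size("800"), infer_ratio(1280, 720), apply_ratio(1024, "16:9"), round_to_8(1001))
```

For example, `sz=1024 ar=16:9` keeps the width of 1024 and yields a height of 1024 × 9 / 16 = 576.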

🔧 Configuration & Valves

Easymage uses a hierarchical configuration system. Counting the Admin Valves as their own layer, settings are applied in the following order of precedence (highest priority first):

  1. CLI Command (e.g., img sz=1024x1024) -> Overrides everything for a single request.
  2. User Valves (Personal Settings) -> Overrides Admin defaults. Accessed via Chat Controls.
  3. Admin Valves (System Settings) -> Overrides Open WebUI global variables. Managed by Admins.
  4. Global OWUI Settings -> Native Open WebUI settings (e.g., Admin Panel > Settings > Images).

1. User Valves (Personal Preferences)

These settings are specific to your account or current chat context. You can access them via the Controls icon (sliders) directly within a chat.

| Valve | Default | Description |
| --- | --- | --- |
| 🔐 Authentication | | |
| openai_auth | "" | Your personal OpenAI API Key. Overrides the system default. |
| gemini_auth | "" | Your personal Google Gemini API Key. Overrides the system default. |
| automatic1111_auth | "" | Your personal Forge credentials (user:password). |
| 🎨 Workflow | | |
| enhanced_prompt | True | If enabled, the LLM rewrites and improves your prompt before generation. |
| quality_audit | True | Enables the Vision LLM to critique the generated image and assign a score. |
| strict_audit | False | Enables "Ruthless Mode" for the audit (penalizes hallucinations severely). |
| debug | False | Prints the full internal state JSON and API payloads to the server console. |
| ⚙️ Generation Parameters | | |
| model | None | Forces a specific checkpoint (e.g., flux1-dev.safetensors) for all requests. |
| size | 1024x1024 | Default image resolution (WxH). It serves as the absolute base reference for the image width. |
| aspect_ratio | "" (Empty) | Target aspect ratio (e.g., 16:9, 4:3). Behavior: If left empty, it is automatically derived from the size. If explicitly set, it overrides the height of your size parameter to enforce perfect proportions (Height = Width / Ratio). |
| steps | 20 | Number of sampling steps (Forge / A1111 only). |
| seed | -1 | Default seed (-1 = Random). |
| cfg_scale | 1.0 | Classifier-Free Guidance scale. |
| distilled_cfg_scale | 3.5 | Specific CFG for distilled models like Flux/SD3. |
| 🛠️ Forge / A1111 Specifics | | |
| sampler_name | Euler | The sampling algorithm (e.g., DPM++ 2M SDE). |
| scheduler | Simple | The noise schedule type (e.g., Karras, SGM Uniform). |
| enable_hr | False | Enables High-Res Fix (Forge) or HD Quality (OpenAI). |
| hr_scale | 2.0 | Upscale multiplier (e.g., 1.5x, 2.0x). |
| hr_upscaler | Latent | The algorithm used for the upscaling pass. |
| hr_distilled_cfg | 3.5 | CFG scale used specifically during the Hires pass. |
| denoising_strength | 0.45 | How much the upscaler can modify the original image (0.0 - 1.0). |

2. Admin Valves (System Defaults)

These settings are managed by the Administrator and apply to all users unless overridden. They control the infrastructure and connection behavior.

| Valve | Default | Description |
| --- | --- | --- |
| easy_cloud_mode | True | If True, ignores custom/local URLs for OpenAI/Gemini and uses the official public endpoints (api.openai.com, etc.). Disable this if you use a reverse proxy. |
| generation_timeout | 120 | Maximum time (seconds) to wait for an API response before failing. |
| extreme_vram_cleanup | False | Memory Safety: If True, unloads everything (including the active Chat LLM) before generation. If False, only unloads other idle models. |
| persistent_vision_cache | False | If True, saves the "Vision Capability" test results to a JSON file to speed up server restarts. |
| 🔑 Global Auth Defaults | | |
| openai_auth | "" | Global fallback API Key for OpenAI. |
| gemini_auth | "" | Global fallback API Key for Gemini. |
| automatic1111_auth | "" | Global fallback credentials for Forge (user:password). |

🛡️ Privacy & Local Execution

Easymage is designed to respect your privacy. If you use local backends like LM Studio, Ollama, or Local Forge/ComfyUI, ensure that you configure the settings correctly to prevent external API calls.

  • Default Behavior: With easy_cloud_mode left at its default (True), Easymage ignores custom or local URLs for OpenAI/Gemini and calls the official public API endpoints.
  • The Switch: To keep everything 100% local, go to Admin Valves and set easy_cloud_mode to False. This forces the filter to use your local Open WebUI proxies and custom URLs instead of reaching out to the internet.

📌 Output, Citations & Performance

Easymage provides a transparent output system. Beneath the generated image, you will see three linked citations and a performance status bar.

1. The Citation System

  • [🚀 PROMPT]: Shows the Enhanced Prompt (the text actually seen by the GPU) followed by a structured recap of your original Styles and Negative Prompt.
  • [🟢 SCORE: XX%]: The result of the Visual Quality Audit. It includes a technical critique and a colored emoji indicator based on the score (80+ 🟢, 70+ 🔵, 60+ 🟡, 40+ 🟠, <40 🔴).
  • [🔍 DETAILS]: A full technical recap including the backend Engine used, the active Model, Resolution, and specific latency for each pipeline stage.
  • [ℹ️ INFO]: (Help Mode only) Displays version metadata and project links.
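The score-to-indicator thresholds listed for the SCORE citation map onto a simple cascade. A minimal sketch, with a hypothetical function name:

```python
def score_indicator(score: int) -> str:
    """Map an audit score (0-100) to the colored emoji used in the citation."""
    if score >= 80:
        return "🟢"
    if score >= 70:
        return "🔵"
    if score >= 60:
        return "🟡"
    if score >= 40:
        return "🟠"
    return "🔴"


for s in (92, 75, 61, 40, 12):
    print(f"[{score_indicator(s)} SCORE: {s}%]")
```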

2. Real-Time Performance Tracking

The final status bar provides a detailed snapshot of the generation efficiency: [Total Time]s total | [Image Gen]s img | [Total Tokens] tk | [Throughput] tk/s

  • Total Time: The entire duration from your message to the final output.
  • img: The time spent waiting specifically for the Image Engine.
  • tk / tk/s: Token count and speed of the LLM during the Prompt Enhancement phase.
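The status bar can be reproduced from four measurements. In the sketch below the function name, the one-decimal formatting, and the choice to divide the token count by the LLM phase duration (rather than by the total time) are assumptions.

```python
def format_status(total_s: float, image_s: float, tokens: int, llm_s: float) -> str:
    """Build the "[Total]s total | [Img]s img | [Tokens] tk | [Speed] tk/s" line."""
    # Guard against a zero-length LLM phase (e.g. prompt enhancement disabled).
    throughput = tokens / llm_s if llm_s > 0 else 0.0
    return f"{total_s:.1f}s total | {image_s:.1f}s img | {tokens} tk | {throughput:.1f} tk/s"


print(format_status(total_s=12.5, image_s=8.0, tokens=300, llm_s=3.0))
```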

🛠️ Diagnostics & Debugging

If you encounter issues or want to see how Easymage is "thinking," you can activate Debug Mode via the Valve or by adding +d to your message.

  • In-Chat Debug: Easymage will print two formatted JSON blocks containing the current Internal Model State (parsed values, calculated ratios, selected engine) and the current Valve Configuration.
  • Docker Logs: Detailed "⚡ EASYMAGE DEBUG" logs are printed to the server console, including the raw System Prompts sent to the LLM and the raw responses from the image APIs.
  • Error Handling: If an engine fails, Easymage will intercept the error and display a detailed "❌ EASYMAGE ERROR" message in the chat, preventing the filter from crashing.

🔮 Future Developments

Easymage is constantly evolving. Here are the key features currently in the pipeline:

  • AWS Nova Canvas Integration: Integration of Amazon’s Nova Canvas model via AWS Bedrock into the EM image generation pipeline.

  • Filter Chainability: Insert EM logic into the native filters sequence, allowing it to interact with, modify, or pass data to other active filters.

  • Multi-Image & Batching: Support for generating multiple iterations in a single call and automated batch processing for high-volume workflows.

  • ComfyUI Native Integration: Bridging the gap with ComfyUI backends to leverage its node-based power directly through Easymage's streamlined syntax.

  • Fine-Tuned Control (LoRAs): Comprehensive support for custom LoRA injection (A1111/Forge & ComfyUI), enabling precise style and character consistency.

  • Image-to-Image (Img2Img): Implementation of the Img2Img pipeline, allowing users to use reference images as a foundation for Easymage-driven transformations.


📄 License

Easymage is released under the MIT License. Feel free to use, modify, and distribute it within the Open WebUI community.


🤝 Contributing & Support

Easymage is an orchestration layer for a complex and fragmented ecosystem. While developed on high-end hardware, its core mission is universal compatibility and robust control. Given the thousands of possible combinations between LLMs, Image Engines, and UI parameters, this version is a Public Beta.

We actively encourage feedback and issue reports regarding:

  • Engine Mappings: Incorrect parameter translations or missing features.

  • Runtime Errors: Crashes, hangs, or unexpected behavior in the Open WebUI pipeline.

  • Environment Issues: Compatibility bugs across different hardware or Docker setups.

Help us harden the orchestration logic by reporting any anomaly you encounter.

If you encounter bugs or have feature requests, please open an issue on the GitHub Repository or contact the author through the Open WebUI community portal.

About

Multi-engine image generation filter for Open WebUI. Features automated prompt enhancement, multi-language support, and real-time Vision QC scoring. Supports A1111, ComfyUI, and OpenAI backends with integrated performance telemetry.
