
EN First Generation

AnonBOTpl edited this page May 25, 2026 · 3 revisions

First Generation (English)

Opening the Application

After running start.bat (Windows) or python main.py (Linux), the SwiftDiffusion main window appears. If this is your first run, the WelcomeDialog will guide you through language and theme selection.


Generating Your First Image

  1. Select a model from the dropdown at the top. Models are loaded from the models/ directory (.safetensors, .pth, .onnx). The Generate buttons are disabled until a model finishes loading.

  2. Write a prompt in the Text2Image prompt field. Example:

    a majestic mountain landscape at sunset, volumetric lighting, highly detailed
    
  3. Click Generate in the Text2Image tab. A live preview shows generation progress in real time with a pulsing border animation.

  4. Once complete, the image appears in the result area and is automatically saved to the output gallery.
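To illustrate how the model dropdown in step 1 might be populated, here is a minimal sketch that lists files in models/ with the three extensions named above. The function name list_models and the non-recursive scan are assumptions for illustration; the application's real loader may work differently.

```python
from pathlib import Path

# Extensions this page lists for the models/ directory.
MODEL_EXTENSIONS = {".safetensors", ".pth", ".onnx"}


def list_models(models_dir="models"):
    """Return sorted file names in models_dir that have a supported extension.

    Hypothetical helper: only the top level is scanned, so subfolders such
    as models/upscaler/ are ignored.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in MODEL_EXTENSIONS
    )
```

If the returned list is empty, the dropdown would have nothing to load and the Generate buttons would stay disabled.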


Generation Modes

SwiftDiffusion offers 6 generation modes, each with its own tab:

Text2Image

The default tab. Generates images from a text prompt alone.

Parameter         Description
----------------  ------------------------------------------------------------
Prompt            What you want the image to contain. Supports wildcards (__word__ → random line from wildcards/word.txt).
Negative Prompt   What you want to exclude.
Width / Height    Output resolution (256–2048, defaults to 512).
CFG Scale         How closely to follow the prompt (1.0–30.0, default 7.0).
Sampler           Sampling method (Euler a, DPM++ 2M Karras, etc.).
Scheduler         Scheduler type (Normal, Karras, Exponential, etc.).
Steps             Number of denoising steps (1–150, default 20).
Seed              Random seed for reproducibility (-1 for random).
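
As a rough illustration of the wildcard syntax and the -1 seed convention described above, the sketch below replaces each __word__ token with a random non-empty line from wildcards/word.txt and picks a concrete seed when Seed is -1. The function names, the regular expression, the handling of missing files, and the 32-bit seed range are assumptions for illustration, not SwiftDiffusion's actual implementation.

```python
import random
import re
from pathlib import Path

# Matches tokens such as __animal__ or __hair_color__ (assumed syntax).
WILDCARD = re.compile(r"__([A-Za-z0-9_-]+?)__")


def expand_wildcards(prompt, wildcard_dir="wildcards", rng=None):
    """Replace each __word__ with a random line from <wildcard_dir>/word.txt."""
    rng = rng or random.Random()

    def pick(match):
        path = Path(wildcard_dir) / f"{match.group(1)}.txt"
        if not path.is_file():
            return match.group(0)  # unknown wildcard: leave the token as typed
        lines = [
            ln.strip()
            for ln in path.read_text(encoding="utf-8").splitlines()
            if ln.strip()
        ]
        return rng.choice(lines) if lines else match.group(0)

    return WILDCARD.sub(pick, prompt)


def resolve_seed(seed):
    """Return the seed unchanged, or a fresh random one when seed is -1."""
    return random.randrange(2**32) if seed == -1 else seed
```

Logging the resolved seed alongside each saved image is what makes a "random" result reproducible later.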

Img2Img

Transforms an existing image using a prompt.

  • Source Image -- Upload an image via the file dialog.
  • Denoising Strength -- How much to change the image (0.0 = none, 1.0 = completely new).
  • All Text2Image parameters apply.
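
To make Denoising Strength more concrete: in many Stable Diffusion pipelines (the Hugging Face diffusers img2img pipeline is one example), strength determines how many of the requested steps actually run on the source image. The sketch below assumes SwiftDiffusion behaves the same way, which this page does not confirm; effective_steps is a hypothetical helper name.

```python
def effective_steps(steps, strength):
    """Number of denoising steps actually run for a given strength.

    Assumes the diffusers-style rule min(int(steps * strength), steps).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return min(int(steps * strength), steps)


# With the default 20 steps:
#   strength 0.0  -> 0 steps  (image unchanged)
#   strength 0.5  -> 10 steps
#   strength 1.0  -> 20 steps (source image fully replaced)
```

Under this assumption, low strength values also finish faster, because fewer steps are executed.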

Inpainting

Fills or replaces specific areas of an image.

  • Upload an image and paint a mask using the built-in canvas editor (Undo/Redo with Ctrl+Z/Y).
  • The split preview shows the mask input on the left and the result on the right.
  • Adjust mask blur and other settings.

ControlNet (Canny)

Guides generation using edge detection of a reference image.

  • Load a reference image; Canny edge detection runs automatically.
  • The split preview shows the reference on the left and the result on the right.
  • Adjust ControlNet weight to control its influence.
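
For readers unfamiliar with edge maps, the sketch below shows the kind of black-and-white outline image that guides this mode. It is deliberately not Canny: it only thresholds the Sobel gradient magnitude, whereas a full Canny detector additionally applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding. The function name and threshold value are illustrative assumptions.

```python
def edge_map(gray, threshold=100):
    """Simplified edge detector (Sobel magnitude threshold, not full Canny).

    gray is a list of rows of 0-255 intensities. Returns a same-sized grid
    with 255 at edge pixels and 0 elsewhere; border pixels are left at 0.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]) - (
                gray[y - 1][x - 1] + 2 * gray[y][x - 1] + gray[y + 1][x - 1]
            )
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]) - (
                gray[y - 1][x - 1] + 2 * gray[y - 1][x] + gray[y - 1][x + 1]
            )
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                out[y][x] = 255
    return out
```

The resulting map keeps outlines and discards color and texture, which is why the generated image follows the reference's composition but not its appearance.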

ADetailer

Automatic face enhancement using YOLOv8 face detection. Runs a separate inpainting pass on detected faces for improved quality.

  • Works as a standalone tab with its own prompt fields and Generate button.
  • No extra VRAM cost since it runs sequentially on the same pipeline.

Upscaler

High-quality upscaling using the spandrel library.

  • Select an upscaling model from the dropdown (models loaded from models/upscaler/).
  • Choose the scale factor (2x, 4x, etc.).

Performance Settings

Found in the Settings dialog (gear icon in sidebar):

Setting             Description
------------------  ----------------------------------------------------------
VRAM Slicing        Reduces VRAM usage by processing layers sequentially. Enable if you get OOM errors.
Attention Slicing   Slices attention computation to reduce VRAM.
Tiled VAE           Processes VAE decoder in tiles. Useful for high-resolution outputs (works independently from VRAM Slicing).
CPU Offload         Offloads model weights to RAM when not in active use.
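
To show what "processing in tiles" means for Tiled VAE, here is a minimal sketch that splits an output area into overlapping tiles so that each tile can be decoded separately and the seams blended. The tile size of 512, the overlap of 64, and both function names are assumptions chosen for illustration; the application's actual tiling parameters are not documented on this page.

```python
def tile_starts(length, tile, overlap):
    """Start offsets so tiles of size `tile` cover [0, length) with overlap."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]  # a single tile already covers everything
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes covering a width x height area."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]
```

Peak memory then depends on the tile size rather than the full output resolution, which is why tiling helps most with large images.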

Stop Button

During generation, the Generate button changes to a STOP button (red). Click it to abort generation immediately. This works in all generation tabs.
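
One common way to implement a stop button like this is cooperative cancellation: the UI sets a flag, and the generation loop checks it before every denoising step. The sketch below illustrates that pattern with a threading.Event; run_generation and step_fn are hypothetical names, and whether SwiftDiffusion uses this exact mechanism is an assumption.

```python
import threading


def run_generation(steps, stop_event, step_fn):
    """Run up to `steps` denoising steps, checking stop_event before each one.

    Returns the number of steps completed. step_fn(i) stands in for one
    denoising step (hypothetical callback).
    """
    for i in range(steps):
        if stop_event.is_set():
            return i  # stopped early: i steps were completed
        step_fn(i)
    return steps


# In a GUI, the STOP button's click handler would simply call
# stop_event.set() from the UI thread while this loop runs in a worker.
```

With this pattern the abort takes effect at the next step boundary, so the delay is at most the duration of a single step.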


Next Steps
