EN First Generation

First Generation (English)

Opening the Application

After running start.bat (Windows) or python main.py (Linux), the SwiftDiffusion main window appears. If this is your first run, the WelcomeDialog will guide you through language and theme selection.

Generating Your First Image

Select a model from the dropdown at the top. Models are loaded from the models/ directory (.safetensors, .pth, .onnx). The Generate buttons are disabled until a model finishes loading.

Write a prompt in the Text2Image prompt field. Example:

a majestic mountain landscape at sunset, volumetric lighting, highly detailed

Click Generate in the Text2Image tab. A live preview shows generation progress in real time with a pulsing border animation.
Once complete, the image appears in the result area and is automatically saved to the output gallery.
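
As an illustration of how a model dropdown like this could be populated, the sketch below scans a directory for the three extensions listed above. The function name `find_models` is hypothetical, and the application's own loader may work differently (for example, it may also search subfolders).

```python
from pathlib import Path

# Extensions listed in this guide as loadable model formats.
MODEL_EXTENSIONS = {".safetensors", ".pth", ".onnx"}


def find_models(models_dir):
    """Return sorted file names in models_dir that have a supported extension.

    Hypothetical helper: only the top level of the directory is scanned.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in MODEL_EXTENSIONS
    )
```

Calling `find_models("models")` from the application folder would list the candidate files; an empty list suggests the dropdown has nothing to offer yet.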

Generation Modes

SwiftDiffusion offers 6 generation modes, each with its own tab:

Text2Image

The default tab. Generates images from a text prompt alone.

Parameter	Description
Prompt	What you want the image to contain. Supports wildcards (`__word__` → random line from `wildcards/word.txt`).
Negative Prompt	What you want to exclude.
Width / Height	Output resolution in pixels (256–2048, default 512).
CFG Scale	How closely to follow the prompt (1.0–30.0, default 7.0).
Sampler	Sampling method (Euler a, DPM++ 2M Karras, etc.).
Scheduler	Scheduler type (Normal, Karras, Exponential, etc.).
Steps	Number of denoising steps (1–150, default 20).
Seed	Random seed for reproducibility (-1 for random).
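
The wildcard syntax in the Prompt row can be pictured with a minimal sketch: every `__word__` token is replaced by a random non-empty line from `wildcards/word.txt`. The function name `expand_wildcards`, the regular expression, and the choice to leave unknown wildcards untouched are assumptions made for illustration, not the application's actual code.

```python
import random
import re
from pathlib import Path

# Matches tokens such as __color__ and captures the name between the underscores.
WILDCARD_PATTERN = re.compile(r"__(\w+?)__")


def expand_wildcards(prompt, wildcard_dir="wildcards", rng=None):
    """Replace each __word__ with a random line from <wildcard_dir>/word.txt."""
    rng = rng or random.Random()

    def replace(match):
        path = Path(wildcard_dir) / f"{match.group(1)}.txt"
        if not path.is_file():
            return match.group(0)  # assumption: unknown wildcards stay as typed
        lines = [ln.strip() for ln in path.read_text(encoding="utf-8").splitlines()]
        lines = [ln for ln in lines if ln]  # skip blank lines
        return rng.choice(lines) if lines else match.group(0)

    return WILDCARD_PATTERN.sub(replace, prompt)
```

Passing a seeded generator, such as `random.Random(1234)`, makes the substitution repeatable in the same way a fixed Seed makes the image repeatable.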

Img2Img

Transforms an existing image using a prompt.

Source Image -- Upload an image via the file dialog.
Denoising Strength -- How much to change the image (0.0 = none, 1.0 = completely new).
All Text2Image parameters apply.
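
One common convention, used for example by the img2img pipelines in Hugging Face diffusers, is to let Denoising Strength decide how many of the configured steps actually run. Whether SwiftDiffusion follows exactly this rule is an assumption, and the helper name `effective_steps` is hypothetical.

```python
def effective_steps(steps, strength):
    """Number of denoising steps that run for a given strength (0.0 to 1.0).

    Assumed convention: the source image is noised part of the way, so only
    a matching fraction of the schedule has to be denoised again.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return min(int(steps * strength), steps)
```

Under this convention, 20 steps at strength 0.5 run 10 denoising steps, and strength 0.0 runs none, which matches the "0.0 = none" end of the scale.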

Inpainting

Fills or replaces specific areas of an image.

Upload an image and paint a mask using the built-in canvas editor (Undo/Redo with Ctrl+Z/Y).
The split preview shows the mask input on the left and the result on the right.
Adjust mask blur and other settings.

ControlNet (Canny)

Guides generation using edge detection of a reference image.

Load a reference image; Canny edge detection runs automatically.
The split preview shows the reference on the left and the result on the right.
Adjust ControlNet weight to control its influence.

ADetailer

Automatic face enhancement using YOLOv8 face detection. Runs a separate inpainting pass on detected faces for improved quality.

Works as a standalone tab with its own prompt fields and Generate button.
It adds no extra VRAM cost because it runs sequentially on the same pipeline.

Upscaler

High-quality upscaling using the spandrel library.

Select an upscaling model from the dropdown (models loaded from models/upscaler/).
Choose the scale factor (2x, 4x, etc.).

Performance Settings

Found in the Settings dialog (gear icon in sidebar):

Setting	Description
VRAM Slicing	Reduces VRAM usage by processing layers sequentially. Enable it if you get out-of-memory (OOM) errors.
Attention Slicing	Slices the attention computation to reduce VRAM usage.
Tiled VAE	Processes the VAE decoder in tiles. Useful for high-resolution outputs; works independently of VRAM Slicing.
CPU Offload	Offloads model weights to RAM when not in active use.
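
For readers curious how toggles like these are often wired up, the sketch below maps three of the settings to method names that exist on Hugging Face diffusers pipelines. That SwiftDiffusion uses these calls is an assumption; the setting keys and the function name are hypothetical, and VRAM Slicing is left out because this guide does not say which mechanism backs it.

```python
# Assumed mapping from settings in this guide to diffusers-style method names.
SETTING_TO_METHOD = {
    "attention_slicing": "enable_attention_slicing",
    "tiled_vae": "enable_vae_tiling",
    "cpu_offload": "enable_model_cpu_offload",
}


def apply_performance_settings(pipe, settings):
    """Call the matching enable_* method for every setting that is switched on.

    `pipe` is any pipeline-like object; methods it lacks are skipped.
    Returns the names of the methods that were actually called.
    """
    called = []
    for key, method_name in SETTING_TO_METHOD.items():
        method = getattr(pipe, method_name, None)
        if settings.get(key) and callable(method):
            method()
            called.append(method_name)
    return called
```

Skipping missing methods keeps the sketch usable with pipeline objects that support only some of the optimizations.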

Stop Button

During generation, the Generate button changes to a STOP button (red). Click it to abort generation immediately. This works in all generation tabs.
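
A stop control of this kind is commonly built on a cooperative cancellation flag that the generation loop checks between denoising steps. The sketch below shows that pattern with the standard library only; how SwiftDiffusion implements it is not stated in this guide, and all names here are hypothetical.

```python
import threading


class GenerationCancelled(Exception):
    """Raised inside the worker when the user asks to stop."""


def run_generation(steps, stop_event, do_step):
    """Run up to `steps` iterations, checking stop_event before each one.

    `do_step` stands in for one denoising step. Returns the number of
    completed steps, or raises GenerationCancelled if the flag was set.
    """
    completed = 0
    for index in range(steps):
        if stop_event.is_set():
            raise GenerationCancelled(f"stopped after {completed} steps")
        do_step(index)
        completed += 1
    return completed
```

In this pattern the button handler only calls `stop_event.set()`; the worker notices the flag at the next step boundary, so the abort takes effect once the step in progress has finished.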

Next Steps

Prompt Builder -- Build prompts visually with tags.
Style Presets -- Apply predefined style templates.
Custom Tag Categories -- Create your own tag groups.


