Latent Space Optimization using Diffusion Models

Thesis Overview

This thesis investigates latent space optimization (LSO) using diffusion models, aiming to scale black-box optimization to larger generators and high-resolution images. We explore how to maximize a quantifiable attribute (e.g., smiling) while preserving realism and faithfulness. We compare three search spaces: (i) Stable Diffusion autoencoder latents, (ii) a compact learned latent (LatentVQVAE over SD latents), and (iii) LoRA space optimization (LoRASO), which optimizes low-dimensional adapter embeddings. Under matched settings, LoRASO achieves the strongest attribute gains and realism, highlighting the effectiveness of conditioning space optimization over traditional latent space methods.

Repository Description

Official code repository for the Master's thesis Latent Space Optimization using Diffusion Models, submitted to the Data and Web Science Group (Prof. Dr.-Ing. Margret Keuper) at the University of Mannheim.

Setup Guide

Environment

conda create -n optdif1 python=3.12
conda activate optdif1
pip install -r requirements.txt

Data Import

The FFHQ dataset is too large to be included in this repository. The images1024x1024 version of the FFHQ dataset can be downloaded from here. Copy the Google Drive folder data to the root directory of the workspace, and unzip the image archive. The directory structure should look like this:

├── data
│   └── ffhq
│       ├── images1024x1024
│       │   ├── 00000.png
│       │   ├── ...
│       │   ├── 69999.png
│       ├── ffhq-dataset-v2.json
│       ├── smile_scores.json
│       └── smile_scores_scaled.json

Classifier Import

The CelebA classifier can be downloaded from here. Copy the Google Drive folder models to the root directory of the workspace.

Model Training

This repository provides scripts for training various autoencoder models and finetuning the Stable Diffusion VAE or other components. The model implementations can be found under src/models/, and their training scripts are located in src/run/. Training can be started by adapting and running the appropriate Slurm scripts under slurm/train/.

Optimization

The primary work of this thesis is to perform optimization within autoencoder latent spaces or alternative embedding spaces. For optimization, we rely on direct gradient-based optimization (src/gbo/) or a Bayesian optimization framework (src/bo/). We consider three main approaches to optimization and their implementation can be found directly in src/lso_<approach>.py and launched via the respective Slurm script in slurm/lso/:

Optimization in Stable Diffusion Latent Space: Direct optimization in a Stable Diffusion autoencoder latent space.
Optimization in LatentVQVAE Latent Space: Optimization in a compact learned latent space, exemplified by a LatentVQVAE that further encodes Stable Diffusion (SD) latents.
Optimization in LoRA Conditioning Space: This approach builds on the CTRLorALTer paper and exploits intermediate representations of a LoRAdapter for optimization.

Credits

Thanks to the author of Latent Space Optimization via Weighted Retraining of Deep Generative Models with Application on Image Data for implementations of latent space optimization upon which parts of this repository are based.
Thanks to the authors of Stable Diffusion for the autoencoder implementation reused here.
Thanks to the authors of CTRLorALTer for providing their LoRAdapter implementation.

Name		Name	Last commit message	Last commit date
Latest commit History 139 Commits
data		data
demo		demo
models		models
slurm		slurm
src		src
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Latent Space Optimization using Diffusion Models

Thesis Overview

Repository Description

Setup Guide

Environment

Data Import

Classifier Import

Model Training

Optimization

Credits

About

Uh oh!

Releases

Packages

Uh oh!

Languages

mograev/OptDif

Folders and files

Latest commit

History

Repository files navigation

Latent Space Optimization using Diffusion Models

Thesis Overview

Repository Description

Setup Guide

Environment

Data Import

Classifier Import

Model Training

Optimization

Credits

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages