PartEdit: Fine-Grained Image Editing using Pre-Trained Diffusion Models


🎉News

  • [2025-12-15] Released the training code with Hugging Face dataset support (an example dataset is also available on Hugging Face).
  • [2025-09-02] PartEdit local Gradio demo released; a hosted demo is live on Hugging Face.
  • [2025-09-01] PartEdit embeddings and the custom training data used are released on Hugging Face.
  • [2025-06-02] An updated version of PartEdit is now on arXiv.
  • [2025-04-01] PartEdit was accepted to the SIGGRAPH 2025 Conference Track.
  • [2025-03-09] The PartEdit benchmark is available on Hugging Face.
  • [2025-02-06] PartEdit is now available on arXiv.

📖Introduction

We present the first text-based image editing approach for object parts based on pre-trained diffusion models. Diffusion-based image editing approaches have capitalized on diffusion models' deep understanding of image semantics to perform a variety of edits. However, existing diffusion models lack sufficient understanding of many object parts, hindering fine-grained edits requested by users. To address this, we propose to expand the knowledge of pre-trained diffusion models so that they understand various object parts, enabling them to perform fine-grained edits. We achieve this by learning special textual tokens that correspond to different object parts through an efficient token optimization process. These tokens are optimized to produce reliable localization masks at each inference step to localize the editing region. Leveraging these masks, we design feature-blending and adaptive-thresholding strategies to execute the edits seamlessly. To evaluate our approach, we establish a benchmark and an evaluation protocol for part editing. Experiments show that our approach outperforms existing editing methods on all metrics and is preferred by users 66-90% of the time in conducted user studies.
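The per-step mask-guided blending described above can be sketched as follows. This is a conceptual illustration only, not the repository's implementation: the soft localization scores, the mean-based `adaptive_threshold` cutoff, and the element-wise blend are all simplifying assumptions.

```python
def adaptive_threshold(scores):
    """Binarize soft localization scores with a per-step cutoff.
    NOTE: using the mean score as the cutoff is an illustrative
    assumption, not the paper's actual thresholding rule."""
    flat = [s for row in scores for s in row]
    cutoff = sum(flat) / len(flat)
    return [[1.0 if s >= cutoff else 0.0 for s in row] for row in scores]

def blend_features(original, edited, scores):
    """Blend edited features into the original only inside the part mask,
    leaving the rest of the image untouched."""
    mask = adaptive_threshold(scores)
    return [
        [m * e + (1.0 - m) * o for m, e, o in zip(mrow, erow, orow)]
        for mrow, erow, orow in zip(mask, edited, original)
    ]

# Toy 2x2 "feature maps": the scores highlight only the top-left entry,
# so only that entry receives the edited value.
original = [[0.0, 0.0], [0.0, 0.0]]
edited = [[1.0, 1.0], [1.0, 1.0]]
scores = [[0.9, 0.2], [0.1, 0.2]]
blended = blend_features(original, edited, scores)
```

The key design point this illustrates: because the mask gates the blend at every step, edits stay confined to the localized part while the background is reproduced from the original features.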

🚀Getting Started

Installation

# from the folder containing environment.yaml
conda env create -f environment.yaml
# (or, faster) 
mamba env create -f environment.yaml

followed by

conda activate partedit

Note: newer PyTorch versions are distributed via pip only (conda packages are no longer published), so the environment may install PyTorch through pip.

Notebook example

The Jupyter notebook getting_started.ipynb contains a full example of how to use PartEdit with SDXL.

Gradio demo

To run the demo, first log in to Hugging Face (the model and embeddings are downloaded automatically):

hf login # if you have a token
# get a token from https://huggingface.co/settings/tokens
# older versions use `huggingface-cli login`

followed by

python app.py

Then open your browser at http://localhost:7860 (or the link provided in the terminal).

Different Stable Diffusion versions

The code has been tested with the diffusers library; results for some samples may differ slightly between diffusers versions.

Data

The datasets used in the experiments are derived from Pascal Part and PartImageNet. The Human Torso, Human Head, and Human Hair tokens are trained on Pascal Part; the remaining non-custom parts are trained on PartImageNet. An extracted example dataset is hosted on Hugging Face, which also removes the hard dependency on detectron2 that training previously required.

💖Acknowledgements

We thank the authors of Prompt-to-Prompt-with-sdxl, DAAM, StabilityAI (Stable Diffusion XL), OVAM (the basis of our OVAMXL training code), and SLiMe (layer-selection optimization).

Todos

  • Fix fp16 training

Training

To train, install the updated environment.yaml, or update your existing environment with pip install torchmetrics git+https://github.com/Gorluxor/ovamxl.git

python -m src.unified_training --config configs/quadruped_head.yaml 

Check the example dataset for other classes, or create your own from synthetic or real data.

Note: training takes around 64 GB of GPU memory with fp32 and 8 selected layers on 100 images for 2000 steps (roughly 1.5 hours on an NVIDIA A100 80GB). Training computes full gradients over the small dataset.
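Conceptually, the token optimization behind training can be pictured as gradient descent on a token embedding so that the attention it induces matches a target part mask. The toy below is a drastically simplified stand-in, not the repository's code: a 1-D embedding, per-pixel scalar "keys", a sigmoid attention score, and a hand-derived MSE gradient are all illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def optimize_token(keys, target, lr=0.5, steps=200):
    """Fit a scalar token embedding e so that sigmoid(e * k_i) matches a
    binary part mask. Toy analogue of optimizing a textual token until its
    cross-attention map localizes the desired part."""
    e = 0.0
    for _ in range(steps):
        grad = 0.0
        for k, t in zip(keys, target):
            s = sigmoid(e * k)
            # d/de of (s - t)^2, with ds/de = s * (1 - s) * k
            grad += 2.0 * (s - t) * s * (1.0 - s) * k
        e -= lr * grad / len(keys)
    return e

keys = [2.0, 1.5, -2.0, -1.0]   # toy per-pixel key projections
target = [1.0, 1.0, 0.0, 0.0]   # binary part mask: first two pixels are the part
e = optimize_token(keys, target)
scores = [sigmoid(e * k) for k in keys]  # high inside the mask, low outside
```

Only the tiny embedding is updated while everything else stays frozen, which is why the real procedure is cheap relative to fine-tuning the diffusion model itself.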

🎈Citation

BibTeX:

@inproceedings{cvejic2025partedit,
  title={PartEdit: Fine-Grained Image Editing using Pre-Trained Diffusion Models},
  author={Cvejic, Aleksandar and Eldesokey, Abdelrahman and Wonka, Peter},
  booktitle={Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers},
  pages={1--11},
  year={2025}
}

APA:

Cvejic, A., Eldesokey, A., & Wonka, P. (2025, August). PartEdit: Fine-Grained Image Editing using Pre-Trained Diffusion Models. In Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers (pp. 1-11).
