Project | Paper | Huggingface
Kaixin Ding1, Yang Zhou2, Xi Chen1, Miao Yang3, Jiarong Ou3, Rui Chen3, Xin Tao3, Hengshuang Zhao1.
HKU1, SCUT2, Kuaishou Technology, Kling team3
Recent advances in Text-to-Image (T2I) generative models, such as Imagen, Stable Diffusion, and FLUX, have led to remarkable improvements in visual quality. However, their performance is fundamentally limited by the quality of training data. Web-crawled and synthetic image datasets often contain low-quality or redundant samples, which lead to degraded visual fidelity, unstable training, and inefficient computation. Hence, effective data selection is crucial for improving data efficiency. Existing approaches to Text-to-Image data filtering rely on costly manual curation or heuristic scoring based on single-dimensional features. Although meta-learning-based methods have been explored for LLMs, they have not been adapted to image modalities. To this end, we propose Alchemist, a meta-gradient-based framework that selects a suitable subset from large-scale text-image pairs. Our approach automatically learns to assess the influence of each sample by iteratively optimizing the model from a data-centric perspective. Alchemist consists of two key stages: data rating and data pruning. We train a lightweight rater to estimate each sample's influence based on gradient information, enhanced with multi-granularity perception. We then use the Shift-Gsampling strategy to select informative subsets for efficient model training. Alchemist is the first automatic, scalable, meta-gradient-based data selection framework for Text-to-Image model training. Experiments on both synthetic and web-crawled datasets demonstrate that Alchemist consistently improves visual quality and downstream performance. Training on an Alchemist-selected 50% of the data can outperform training on the full dataset.
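As a rough, conceptual illustration of the meta-gradient idea (not Alchemist's exact formulation; the `loss_fn` interface and the cosine-similarity scoring below are assumptions for exposition only), a sample's influence can be estimated from how well its training gradient aligns with the gradient of a held-out validation loss:

```python
# Conceptual sketch only: score a sample by the alignment between its training
# gradient and a validation gradient. This is NOT the paper's exact method.
import torch

def influence_score(model, loss_fn, sample, val_batch):
    # Gradient of the loss on a single training sample
    g_train = torch.autograd.grad(loss_fn(model, sample), list(model.parameters()))
    # Gradient of the loss on a held-out validation batch
    g_val = torch.autograd.grad(loss_fn(model, val_batch), list(model.parameters()))
    # Cosine similarity between the flattened gradients: higher = more helpful sample
    flat = lambda grads: torch.cat([g.reshape(-1) for g in grads])
    return torch.nn.functional.cosine_similarity(flat(g_train), flat(g_val), dim=0)
```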
Create a conda/micromamba environment with all dependencies:
# Using conda
conda env create -f environment.yaml
conda activate alchemist
Before training, download the training dataset from https://laion.ai/ and split off 1M samples for the validation set.
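As a minimal sketch of this split (the file names below are placeholders, not part of the released code):

```python
# Hypothetical sketch: carve a 1M-sample validation split out of a metadata CSV.
# "laion_metadata.csv", "train.csv", and "val.csv" are placeholder paths.
import pandas as pd

df = pd.read_csv("laion_metadata.csv")
val = df.sample(n=1_000_000, random_state=0)  # hold out 1M samples for validation
train = df.drop(val.index)

train.to_csv("train.csv", index=False)
val.to_csv("val.csv", index=False)
```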
Train the rater with this script:
bash train_rater.sh
All training configurations are stored in configs/config_rater.json (a sketch of a complete config is given after the field list below).
Dataset Settings:
train_csv_path: Path to the training CSV file
val_csv_path: Path to the validation CSV file
text_enc_path: Path to the text encoder model
ckpt_path: Checkpoint directory path
Model Architecture:
depth: Model depth (default: 16)
raterDepth: Rater depth (default: 8)
patch_size: Patch size (default: 16)
patch_nums: Multi-scale patch numbers
Training Hyperparameters:
bs: Batch size (default: 64)
ep: Number of epochs (default: 10)
tlr: Learning rate (default: 1e-4)
twd: Weight decay (default: 0.05)
cfg: Classifier-free guidance scale (default: 4.0)
Optimization:
opt: Optimizer type (default: adamw)
sche: Learning rate scheduler (default: lin0)
wp: Warmup proportion
fp16: Enable mixed-precision training
Output:
local_out_dir_path: Output directory for checkpoints and logs
exp_name: Experiment name
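For reference, a config matching the fields above could be assembled as follows (the paths, patch_nums schedule, and warmup proportion are illustrative assumptions; the remaining values are the defaults listed above):

```python
# Sketch of writing configs/config_rater.json; undocumented values are assumptions.
import json

config = {
    # Dataset settings (placeholder paths)
    "train_csv_path": "data/train.csv",
    "val_csv_path": "data/val.csv",
    "text_enc_path": "checkpoints/text_encoder",
    "ckpt_path": "checkpoints/",
    # Model architecture
    "depth": 16,
    "raterDepth": 8,
    "patch_size": 16,
    "patch_nums": [1, 2, 4, 8, 16],  # assumed multi-scale schedule (placeholder)
    # Training hyperparameters
    "bs": 64,
    "ep": 10,
    "tlr": 1e-4,
    "twd": 0.05,
    "cfg": 4.0,
    # Optimization
    "opt": "adamw",
    "sche": "lin0",
    "wp": 0.01,      # assumed warmup proportion
    "fp16": True,
    # Output
    "local_out_dir_path": "output/",
    "exp_name": "alchemist_rater",
}

with open("configs/config_rater.json", "w") as f:
    json.dump(config, f, indent=2)
```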
Rate the data with this script:
bash infer_rater.sh
Then rank the data samples by their ratings in descending order.
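For example, if infer_rater.sh writes per-sample ratings to a CSV (the file name and "rating" column below are assumptions about the output format), the pruning step can be as simple as:

```python
# Hypothetical sketch: rank samples by rating and keep the top 50% for training.
# "ratings.csv" and the "rating" column are assumptions about the rater's output.
import pandas as pd

df = pd.read_csv("ratings.csv")
ranked = df.sort_values("rating", ascending=False)  # descending by rating
selected = ranked.head(len(ranked) // 2)            # e.g. keep the top 50%
selected.to_csv("train_selected.csv", index=False)
```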
Our code is built upon STAR-T2I and SEAL.
If you use Alchemist in your research, please cite:
@article{ding2025alchemist,
title={Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection},
author={Ding, Kaixin and Zhou, Yang and Chen, Xi and Yang, Miao and Ou, Jiarong and Chen, Rui and Tao, Xin and Zhao, Hengshuang},
journal={arXiv preprint arXiv:2512.16905},
year={2025},
}