We use samples/faces as the example, with the additional token <face>.
accelerate launch scripts/train.py --config-file configs/train/textual_inversion.py
accelerate launch scripts/train.py --config-file configs/train/dreambooth.py
We use image_dataset in unidiffusion/datasets/image_dataset to load the images and insert the text token into the prompt "a photo of <>".
dataset = get_config("common/data/image_dataset.py").dataset
dataset.path = 'samples/faces'
dataset.placeholder = None
dataset.inversion_placeholder = '<face>'  # set textual inversion tokens
dataset.inversion_placeholder specifies an additional textual inversion token (we suggest the <xxx> format), while dataset.placeholder uses an existing token from the tokenizer, which will not be trained.
For the training arguments, we configure unet and text_encoder:
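To make the difference between the two placeholder options concrete, here is a minimal sketch of how a prompt such as "a photo of <>" could be filled in. The helper `build_prompt` is hypothetical and not part of UniDiffusion, and the precedence rule (the inversion token wins when both are set) is an assumption made for illustration only.

```python
from typing import Optional


def build_prompt(
    template: str,
    placeholder: Optional[str],
    inversion_placeholder: Optional[str],
) -> str:
    """Hypothetical helper: fill the "<>" slot of a prompt template."""
    # Assumption: a new inversion token takes precedence over an existing token.
    token = inversion_placeholder if inversion_placeholder is not None else placeholder
    if token is None:
        raise ValueError("set either placeholder or inversion_placeholder")
    return template.replace("<>", token)


# Mirrors the settings above: placeholder = None, inversion_placeholder = '<face>'.
prompt = build_prompt("a photo of <>", placeholder=None, inversion_placeholder="<face>")
print(prompt)  # a photo of <face>
```

With placeholder set to an existing word instead (for example "dog"), the same template would yield "a photo of dog", and no new token would be trained.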
# textual inversion does not set unet.training_args
# set mode to 'lora' to enable dreambooth_lora
unet.training_args = {
    '': {
        'mode': 'finetune',
        'optim_kwargs': {'lr': '${optimizer.lr}'}
    }
}
text_encoder.training_args = {
    'text_embedding': {
        'initial': True,  # whether to init additional token by their text.
        'optim_kwargs': {'lr': '${optimizer.lr}'}
    }
}
}
We use the pokemon-blip-captions dataset as the example.
accelerate launch scripts/train.py --config-file configs/train/lora_pokemon.py
accelerate launch scripts/train.py --config-file configs/train/text_to_image_finetune.py
dataset.path = 'lambdalabs/pokemon-blip-captions'
unet.training_args = {
    '': {
        'mode': 'finetune',  # or 'lora'
        'optim_kwargs': {'lr': '${optimizer.lr}'}
    }
}
Set mode to 'finetune' or 'lora' to select the corresponding fine-tuning mechanism.
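As a rough illustration of the mode switch and of the '${optimizer.lr}' reference, the sketch below builds the same dictionary for either mode and resolves the reference against a plain Python dict. Everything here is an assumption for illustration: `make_unet_training_args`, `lookup`, and `resolve` are hypothetical helpers, the learning rate is a made-up value, and the resolver only mimics how an interpolation string of this form might be replaced by the value stored at optimizer.lr. It is not UniDiffusion's actual config machinery.

```python
import re
from typing import Any, Dict

# Matches strings that consist solely of a reference such as "${optimizer.lr}".
_REF = re.compile(r"^\$\{([\w.]+)\}$")


def make_unet_training_args(mode: str) -> Dict[str, Any]:
    """Hypothetical helper: build unet.training_args for a given mode."""
    if mode not in ("finetune", "lora"):
        raise ValueError(f"unsupported mode: {mode!r}")
    return {"": {"mode": mode, "optim_kwargs": {"lr": "${optimizer.lr}"}}}


def lookup(root: Dict[str, Any], dotted_key: str) -> Any:
    """Follow a dotted key such as 'optimizer.lr' through nested dicts."""
    value: Any = root
    for part in dotted_key.split("."):
        value = value[part]
    return value


def resolve(node: Any, root: Dict[str, Any]) -> Any:
    """Recursively replace '${a.b}' strings with the value found at a.b."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, str):
        match = _REF.match(node)
        if match:
            return lookup(root, match.group(1))
    return node


config = {
    "optimizer": {"lr": 1e-4},  # illustrative value, not taken from the source
    "unet": {"training_args": make_unet_training_args("lora")},
}
resolved = resolve(config, config)
print(resolved["unet"]["training_args"][""])
```

Writing the learning rate as a reference rather than a literal keeps a single source of truth: changing optimizer.lr updates every module that points at it.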