- Fix issues in previous LoRA implementation
- Implement custom LoRA interface based on PEFT library
- Add proper injection, saving, loading, and unloading functions for LoRA adapters
- Expose LoRA utility functions in package exports

- Implement multi-resolution packing for CogView4 to improve training efficiency
- Add QLoRA support for both CogView and CogVideo models
- Refactor trainers to fix training bugs and optimize computation pipeline
- Update dataset utilities and fine-tuning base components

This update significantly improves model training efficiency and flexibility.
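For readers unfamiliar with packing, here is a minimal sketch of the general idea, not the code from this PR. It assumes that samples of different resolutions are flattened into token sequences of different lengths and concatenated into one packed sequence, with a block-diagonal mask so that tokens attend only within their own sample. The function name `packed_attention_mask` is hypothetical.

```python
from typing import List


def packed_attention_mask(seq_lens: List[int]) -> List[List[bool]]:
    """Build a block-diagonal attention mask for a packed sequence.

    seq_lens holds the token count of each sample in the pack (hypothetical
    input; images at different resolutions yield different token counts).
    mask[q][k] is True when query token q may attend to key token k,
    i.e. when both tokens belong to the same sample.
    """
    # Record which sample owns each position of the packed sequence.
    owner = [idx for idx, n in enumerate(seq_lens) for _ in range(n)]
    total = len(owner)
    return [[owner[q] == owner[k] for k in range(total)] for q in range(total)]


if __name__ == "__main__":
    # Two samples packed together: one with 2 tokens, one with 1 token.
    for row in packed_attention_mask([2, 1]):
        print(["x" if allowed else "." for allowed in row])
```

Compared with resizing every image to one resolution, this keeps each sample at its native token count; the mask is what prevents samples in the same pack from leaking into each other.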
We will follow up with a comparison of resize training, packing training, and QLoRA + packing training on the pixelart dataset.
TL;DR: On the pixelart dataset, we observe that: 1) packing converges faster than resize; 2) packing yields better generation quality than resize, visible in the detail and sharpness of the generated images (results from the resize approach tend to be blurry); 3) although QLoRA lowers the hardware threshold for fine-tuning, it significantly degrades model quality on the pixelart dataset (producing grid-like images).
Description
This PR introduces significant changes to enable:
Key improvements:
Notes
Multi-resolution packing currently supports only the CogView series.
For CogVideo, QLoRA offers limited benefits because memory usage is dominated by activations, so it is typically suitable only for batch size = 1 scenarios.
On the pixelart dataset, QLoRA fine-tuning may negatively impact final generation quality.
Changes include modifications to the CogView4 attention processor in diffusers to support packed multi-resolution sequences (see this PR).
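The note above about QLoRA on CogVideo can be illustrated with a back-of-the-envelope estimate. The sketch below is an assumption-laden simplification: it counts only base weights (16-bit versus 4-bit) plus a fixed activation footprint, and ignores quantization constants, LoRA parameters, and optimizer state. The parameter count and activation sizes are hypothetical placeholders, not measurements from this PR.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Memory taken by the base weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


def qlora_saving_fraction(
    num_params: float,
    activation_gib: float,
    base_bits: float = 16,
    quant_bits: float = 4,
) -> float:
    """Fraction of (weights + activations) memory saved by quantizing weights."""
    before = weight_memory_gib(num_params, base_bits) + activation_gib
    after = weight_memory_gib(num_params, quant_bits) + activation_gib
    return (before - after) / before


if __name__ == "__main__":
    params = 5e9  # hypothetical model size
    # Hypothetical activation footprints: small (image-like) vs large (video-like).
    for label, act_gib in [("small activations", 4.0), ("large activations", 60.0)]:
        saved = qlora_saving_fraction(params, act_gib)
        print(f"{label}: about {saved:.0%} of weight + activation memory saved")
```

Under these assumptions the relative saving shrinks as the activation term grows, which is consistent with the observation that QLoRA helps mostly when activations are kept small, for example at batch size 1.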