
Flux2 klein #17

Merged

Aatricks merged 34 commits into main from flux2-klein on Feb 4, 2026

Conversation

@Aatricks (Owner) commented on Feb 4, 2026

This pull request implements comprehensive support for the Flux.2 Klein 4B Distilled model, porting and optimizing the architecture from ComfyUI to LightDiffusion-Next. It includes a custom transformer implementation, a specialized text encoder based on the Qwen architecture, and various pipeline optimizations to ensure high-quality, efficient generation.

Summary of Changes

  1. Flux.2 Transformer Architecture
  • Implemented the core Flux2 transformer in src/NeuralNetwork/flux2, featuring joint image-text attention via DoubleStreamBlock and SingleStreamBlock.
  • Added EmbedND for RoPE positional embeddings with Flux-specific axis handling.
  • Implemented QKNorm and high-precision RMSNorm to ensure numerical stability.
  2. Klein Text Encoder (Qwen-based)
  • Created KleinCLIP and KleinEncoder to handle the unique conditioning requirements of the Klein model.
  • Integrated a Qwen3-4B decoder-only transformer as the text backbone, extracting and concatenating features from layers 9, 18, and 27.
  • Implemented a specialized KleinTokenizer that applies the required chat-template formatting for optimal prompt adherence.
  3. Optimized Latent Pipeline & Sampling
  • Added Flux2LatentFormat supporting 128 channels with calibrated scale (0.3611) and shift (0.1159) factors.
  • Implemented ModelSamplingFlux2 with resolution-dependent timestep shifting to align the sampling schedule with Flux's distilled training.
  • Forced a static guidance embedding (default 3.5) to maintain the generation manifold, resolving issues with low-contrast or distorted outputs in distilled weights.
  4. Model Integration & UI Enhancements
  • Enhanced ModelFactory with advanced state_dict detection to automatically identify and load Flux2/Klein checkpoints.
  • Updated the Streamlit UI to support Flux-specific resolution presets and synchronized dimension handling.
  • Updated Img2Img and HiresFix to respect Flux2's unique latent dimensions and sampling parameters.
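The high-precision RMSNorm mentioned above comes down to upcasting to float32 before the reduction. A minimal sketch (layout and names are illustrative, not the exact LightDiffusion-Next code):

```python
import torch

class RMSNorm(torch.nn.Module):
    """Illustrative RMSNorm that normalizes in float32 for numerical
    stability, then casts back to the input dtype."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.scale = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        dtype = x.dtype
        x = x.float()  # upcast before the mean-of-squares reduction
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return (x * rms).to(dtype) * self.scale
```

Keeping the reduction in float32 matters most for half-precision inference, where the sum of squares can otherwise overflow or lose precision.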
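The layer-9/18/27 feature extraction for the Klein text encoder can be sketched as below, assuming a HF-style backbone that returns per-layer hidden states (the helper name is hypothetical):

```python
import torch

def concat_hidden_states(hidden_states, layers=(9, 18, 27)):
    """Concatenate selected decoder-layer outputs along the feature dim.

    `hidden_states` is assumed to be the per-layer tuple a model returns
    with output_hidden_states=True: one (batch, seq, dim) tensor per layer.
    The result has shape (batch, seq, dim * len(layers)).
    """
    return torch.cat([hidden_states[i] for i in layers], dim=-1)
```

Concatenating early, middle, and late layers gives the conditioning both low-level token features and high-level semantics from a single forward pass.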
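The calibrated scale and shift factors are applied in opposite directions on the way into and out of the model. A sketch of the latent format, assuming the usual subtract-then-scale convention (the class layout is illustrative):

```python
class Flux2LatentFormat:
    """Sketch of a 128-channel latent format; factors are the values
    quoted in this PR, the method layout is an assumption."""
    latent_channels = 128
    scale_factor = 0.3611
    shift_factor = 0.1159

    def process_in(self, latent):
        # VAE latent -> model space
        return (latent - self.shift_factor) * self.scale_factor

    def process_out(self, latent):
        # model space -> VAE latent
        return latent / self.scale_factor + self.shift_factor
```

The two methods are exact inverses, so encode/decode round-trips leave the latent unchanged.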
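The resolution-dependent timestep shifting follows the usual Flux pattern: a shift parameter mu is interpolated from the image-token count, then each sigma is remapped. A sketch under those assumptions (the constants are common Flux defaults, not read from the LightDiffusion-Next source):

```python
import math

def flux_time_shift(mu: float, t: float) -> float:
    # Flux-style sigma shift: e^mu / (e^mu + (1/t - 1)).
    # mu = 0 is the identity; mu > 0 pushes sigmas higher.
    return math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0))

def resolution_mu(h: int, w: int, base_shift=0.5, max_shift=1.15,
                  base_seq=256, max_seq=4096):
    """Interpolate mu linearly in the image-token count.

    Assumes an 8x VAE downsample and 2x2 patches, so the token grid is
    (h/16) x (w/16); base/max constants are assumed defaults.
    """
    seq_len = (h // 16) * (w // 16)
    m = (max_shift - base_shift) / (max_seq - base_seq)
    return seq_len * m + base_shift - base_seq * m
```

Larger images get a larger mu, which keeps more of the schedule at high noise levels where global structure is decided.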
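The state_dict-based checkpoint detection in the ModelFactory item boils down to probing for architecture-specific key prefixes. A hypothetical sketch (the prefixes follow typical Flux-style naming and are an assumption, not the exact factory logic):

```python
def looks_like_flux2(state_dict) -> bool:
    """Heuristic: Flux-style checkpoints expose double/single stream
    block weights; SD-style checkpoints do not."""
    return any(k.startswith(("double_blocks.", "single_blocks."))
               for k in state_dict)
```

Key-prefix probing is cheap because it never touches tensor data, so it can run before any weights are materialized.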

Bug Fixes & Refinement

  • Geometry Fix: Resolved severe geometric distortion by aligning positional embedding IDs and text padding (left-padding) with the reference implementations.
  • Color/Contrast Fix: Corrected VAE decoding by auto-detecting Flux-specific VAEs and applying the correct RGB conversion factors, eliminating "washed-out" images.
  • Stability: Fixed a series of crashes related to text normalization (txt_norm) and attention mask mismatches.
  • Performance: Optimized VRAM management through partial model loading and improved SDPA backend selection.
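The left-padding fix above is easy to get wrong: decoder-only backbones like Qwen expect padding on the left so that real tokens occupy the final positions. A minimal sketch of the convention (helper names are illustrative):

```python
def left_pad(token_ids, pad_id, max_len):
    """Left-pad a token sequence and build the matching attention mask
    (1 = real token, 0 = padding)."""
    n_pad = max_len - len(token_ids)
    ids = [pad_id] * n_pad + list(token_ids)
    mask = [0] * n_pad + [1] * len(token_ids)
    return ids, mask
```

Right-padding the same sequence would shift the real tokens to different positions, which is exactly the kind of mismatch that produced the geometric distortion described here.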

Verification

  • Conducted side-by-side parity tests against ComfyUI workflows.
  • Verified functionality across standard (512x512) and high-resolution (1024x1024+) presets.
  • Validated that existing SD1.5/SDXL pipelines remain unaffected through regression testing.

- Implement comprehensive unit tests for SD1.5 model components including latent format, model configuration, checkpoint loading, CLIP encoding, and tokenizer functionality.
- Introduce unit tests for SDXL model components covering latent format, model configuration, dual CLIP tokenizer/encoder, resolution handling, and differences from SD1.5.
- Ensure tests validate expected behaviors and configurations for both models, including checks for required attributes, correct dimensionality, and integration with other components.
- Removed references to Flux and GGUF files in model loading and detection.
- Updated `list_available_models` to exclude .gguf files.
- Modified `detect_model_type` to raise ValueError for GGUF files.
- Adjusted pipeline functions to eliminate Flux-specific logic.
- Updated integration and unit tests to reflect the removal of GGUF/FLUX support.
- Changed default model path in webui_settings.json for consistency.
- Implement tests to verify that HiresFix and Adetailer processors are called correctly based on user input.
- Include tests for the Img2Img processor and ensure proper context creation with `PipelineContext.from_kwargs`.
- Validate model type detection and capabilities for SD15 and SDXL models.
- Add a test for context creation with hires settings.
- Ensure all tests report their results clearly.
…on logic

- Updated test cases in `test_feature_triggers.py` to replace `PipelineContext` with `Context`.
- Renamed test functions for clarity and consistency.
- Adjusted assertions in model detection tests to reflect changes in detection logic.
- Updated error messages for clarity regarding unsupported GGUF files.
- Ensured that Juggernaut XL models are correctly detected as SDXL based on new naming conventions.
…dability and performance

- Simplified the calculation of start and end timesteps in `ksampler_util.py`.
- Streamlined pre-run control logic and condition handling.
- Enhanced the `apply_empty_x_to_equal_area` function for better clarity.
- Updated `get_area_and_mult` to optimize batch handling and tensor operations.
- Consolidated multi-scale preset definitions in `multiscale_presets.py` for brevity.
- Improved logging and documentation across various functions in `sampling_util.py`.
- Refined noise sampling methods to enhance efficiency and maintainability.
…ance

- Simplified functions in SDToken.py, including parse_parentheses, token_weights, escape_important, and load_embed.
- Removed unnecessary comments and docstrings for cleaner code.
- Optimized embedding loading logic and error handling in load_embed.
- Enhanced the SDTokenizer class by streamlining the initialization and tokenization processes.
- Updated utility functions in utils.py for better clarity and efficiency, including parse_blocks, convert_time, and get_sigma.
- Improved integration handling in the Integrations class and JHDIntegrations.
- Consolidated and optimized the scale_samples function for better performance.
- Consolidated import statements for better readability.
- Removed unnecessary docstrings and comments to streamline the code.
- Simplified the logic in several methods, including `init_integrations`, `build`, and `get_window_args`.
- Enhanced the handling of scale methods and warnings related to incompatible sizes.
- Improved the structure of the `State` and `Config` classes for clarity.
- Updated the `patch` method to handle YAML parameters more efficiently.
- Refined the `window_partition` and `window_reverse` methods for better performance.
- Adjusted the `get_shift` method to simplify the logic for avoiding duplicate shifts.
- Cleaned up the `ApplyMSWMSAAttentionSimple` class for consistency with the main attention class.
…t for improved model integration, fixed double CFG application in the sampling code
- Added Flux2 model implementation in `src/NeuralNetwork/flux2/model.py`, featuring a dual-stream architecture for image generation.
- Introduced `Flux2Params` dataclass for model configuration parameters.
- Created `Flux2` class with methods for forward pass, patchifying, and unpatchifying images.
- Developed `Flux2` latent format in `src/Utilities/Latent.py` with identity processing and VAE compatibility.
- Implemented Klein text encoder in `src/clip/KleinEncoder.py`, including tokenization and attention mechanisms.
- Updated sampling methods in `src/sample/sampling.py` to support Flux2 models.
- Enhanced model sampling logic to accommodate new Flux2 architecture.
- Enhanced the convert_cond function to include pooled_output as 'y' in model_conds for models requiring it, such as Flux2.
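The patchify/unpatchify pair mentioned in the Flux2 model commit maps between image space and token space. A self-contained sketch, assuming 2x2 patches (function names mirror the commit description; the exact implementation may differ):

```python
import torch

def patchify(x: torch.Tensor, p: int = 2) -> torch.Tensor:
    """(B, C, H, W) -> (B, (H/p)*(W/p), C*p*p): flatten each p x p
    patch into one token."""
    b, c, h, w = x.shape
    x = x.reshape(b, c, h // p, p, w // p, p)
    x = x.permute(0, 2, 4, 1, 3, 5)          # (B, H/p, W/p, C, p, p)
    return x.reshape(b, (h // p) * (w // p), c * p * p)

def unpatchify(x: torch.Tensor, h: int, w: int, p: int = 2) -> torch.Tensor:
    """Inverse of patchify: (B, N, C*p*p) -> (B, C, H, W)."""
    b, n, d = x.shape
    c = d // (p * p)
    x = x.reshape(b, h // p, w // p, c, p, p)
    x = x.permute(0, 3, 1, 4, 2, 5)          # (B, C, H/p, p, W/p, p)
    return x.reshape(b, c, h, w)
```

The two functions are exact inverses, so a patchify/unpatchify round trip reproduces the input tensor bit-for-bit.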
… in Adetailer, HiresFix, Img2Img, and sampling functions
@Aatricks Aatricks self-assigned this Feb 4, 2026
@Aatricks Aatricks merged commit 6b77645 into main Feb 4, 2026
0 of 2 checks passed
@Aatricks Aatricks deleted the flux2-klein branch February 5, 2026 16:21