
Flux2 klein #17

Merged

Aatricks merged 34 commits into main from flux2-klein on Feb 4, 2026

Conversation

@Aatricks (Owner) commented on Feb 4, 2026

This pull request implements comprehensive support for the Flux.2 Klein 4B Distilled model, porting and optimizing the architecture from ComfyUI to LightDiffusion-Next. It includes a custom transformer implementation, a specialized text encoder based on the Qwen architecture, and various pipeline optimizations to ensure high-quality, efficient generation.

Summary of Changes

  1. Flux.2 Transformer Architecture
  • Implemented the core Flux2 transformer in src/NeuralNetwork/flux2, featuring joint image-text attention via DoubleStreamBlock and SingleStreamBlock.
  • Added EmbedND for RoPE positional embeddings with Flux-specific axis handling.
  • Implemented QKNorm and high-precision RMSNorm to ensure numerical stability.
  2. Klein Text Encoder (Qwen-based)
  • Created KleinCLIP and KleinEncoder to handle the unique conditioning requirements of the Klein model.
  • Integrated a Qwen3-4B decoder-only transformer as the text backbone, extracting and concatenating features from layers 9, 18, and 27.
  • Implemented a specialized KleinTokenizer that applies the required chat-template formatting for optimal prompt adherence.
  3. Optimized Latent Pipeline & Sampling
  • Added Flux2LatentFormat supporting 128 channels with calibrated scale (0.3611) and shift (0.1159) factors.
  • Implemented ModelSamplingFlux2 with resolution-dependent timestep shifting to align the sampling schedule with Flux's distilled training.
  • Forced a static guidance embedding (default 3.5) to maintain the generation manifold, resolving issues with low-contrast or distorted outputs in distilled weights.
  4. Model Integration & UI Enhancements
  • Enhanced ModelFactory with advanced state_dict detection to automatically identify and load Flux2/Klein checkpoints.
  • Updated the Streamlit UI to support Flux-specific resolution presets and synchronized dimension handling.
  • Updated Img2Img and HiresFix to respect Flux2's unique latent dimensions and sampling parameters.
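The high-precision RMSNorm mentioned above comes down to upcasting to float32 before the reduction. A minimal sketch (layout and names are illustrative, not the exact LightDiffusion-Next code):

```python
import torch

class RMSNorm(torch.nn.Module):
    """Illustrative RMSNorm that normalizes in float32 for numerical
    stability, then casts back to the input dtype."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.scale = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        dtype = x.dtype
        x = x.float()  # upcast before the mean-of-squares reduction
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return (x * rms).to(dtype) * self.scale
```

Keeping the reduction in float32 matters most for half-precision inference, where the sum of squares can otherwise overflow or lose precision.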
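The layer-9/18/27 feature extraction for the Klein text encoder can be sketched as below, assuming a HF-style backbone that returns per-layer hidden states (the helper name is hypothetical):

```python
import torch

def concat_hidden_states(hidden_states, layers=(9, 18, 27)):
    """Concatenate selected decoder-layer outputs along the feature dim.

    `hidden_states` is assumed to be the per-layer tuple a model returns
    with output_hidden_states=True: one (batch, seq, dim) tensor per layer.
    The result has shape (batch, seq, dim * len(layers)).
    """
    return torch.cat([hidden_states[i] for i in layers], dim=-1)
```

Concatenating early, middle, and late layers gives the conditioning both low-level token features and high-level semantics from a single forward pass.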
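The calibrated scale and shift factors are applied in opposite directions on the way into and out of the model. A sketch of the latent format, assuming the usual subtract-then-scale convention (the class layout is illustrative):

```python
class Flux2LatentFormat:
    """Sketch of a 128-channel latent format; factors are the values
    quoted in this PR, the method layout is an assumption."""
    latent_channels = 128
    scale_factor = 0.3611
    shift_factor = 0.1159

    def process_in(self, latent):
        # VAE latent -> model space
        return (latent - self.shift_factor) * self.scale_factor

    def process_out(self, latent):
        # model space -> VAE latent
        return latent / self.scale_factor + self.shift_factor
```

The two methods are exact inverses, so encode/decode round-trips leave the latent unchanged.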
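The resolution-dependent timestep shifting follows the usual Flux pattern: a shift parameter mu is interpolated from the image-token count, then each sigma is remapped. A sketch under those assumptions (the constants are common Flux defaults, not read from the LightDiffusion-Next source):

```python
import math

def flux_time_shift(mu: float, t: float) -> float:
    # Flux-style sigma shift: e^mu / (e^mu + (1/t - 1)).
    # mu = 0 is the identity; mu > 0 pushes sigmas higher.
    return math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0))

def resolution_mu(h: int, w: int, base_shift=0.5, max_shift=1.15,
                  base_seq=256, max_seq=4096):
    """Interpolate mu linearly in the image-token count.

    Assumes an 8x VAE downsample and 2x2 patches, so the token grid is
    (h/16) x (w/16); base/max constants are assumed defaults.
    """
    seq_len = (h // 16) * (w // 16)
    m = (max_shift - base_shift) / (max_seq - base_seq)
    return seq_len * m + base_shift - base_seq * m
```

Larger images get a larger mu, which keeps more of the schedule at high noise levels where global structure is decided.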
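The state_dict-based checkpoint detection in the ModelFactory item boils down to probing for architecture-specific key prefixes. A hypothetical sketch (the prefixes follow typical Flux-style naming and are an assumption, not the exact factory logic):

```python
def looks_like_flux2(state_dict) -> bool:
    """Heuristic: Flux-style checkpoints expose double/single stream
    block weights; SD-style checkpoints do not."""
    return any(k.startswith(("double_blocks.", "single_blocks."))
               for k in state_dict)
```

Key-prefix probing is cheap because it never touches tensor data, so it can run before any weights are materialized.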

Bug Fixes & Refinement

  • Geometry Fix: Resolved severe geometric distortion by aligning positional embedding IDs and text padding (left-padding) with the reference implementations.
  • Color/Contrast Fix: Corrected VAE decoding by auto-detecting Flux-specific VAEs and applying the correct RGB conversion factors, eliminating "washed-out" images.
  • Stability: Fixed a series of crashes related to text normalization (txt_norm) and attention mask mismatches.
  • Performance: Optimized VRAM management through partial model loading and improved SDPA backend selection.
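The left-padding fix above is easy to get wrong: decoder-only backbones like Qwen expect padding on the left so that real tokens occupy the final positions. A minimal sketch of the convention (helper names are illustrative):

```python
def left_pad(token_ids, pad_id, max_len):
    """Left-pad a token sequence and build the matching attention mask
    (1 = real token, 0 = padding)."""
    n_pad = max_len - len(token_ids)
    ids = [pad_id] * n_pad + list(token_ids)
    mask = [0] * n_pad + [1] * len(token_ids)
    return ids, mask
```

Right-padding the same sequence would shift the real tokens to different positions, which is exactly the kind of mismatch that produced the geometric distortion described here.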

Verification

  • Conducted side-by-side parity tests against ComfyUI workflows.
  • Verified functionality across standard (512x512) and high-resolution (1024x1024+) presets.
  • Validated that existing SD1.5/SDXL pipelines remain unaffected through regression testing.

- Implement comprehensive unit tests for SD1.5 model components including latent format, model configuration, checkpoint loading, CLIP encoding, and tokenizer functionality.
- Introduce unit tests for SDXL model components covering latent format, model configuration, dual CLIP tokenizer/encoder, resolution handling, and differences from SD1.5.
- Ensure tests validate expected behaviors and configurations for both models, including checks for required attributes, correct dimensionality, and integration with other components.
- Removed references to Flux and GGUF files in model loading and detection.
- Updated `list_available_models` to exclude .gguf files.
- Modified `detect_model_type` to raise ValueError for GGUF files.
- Adjusted pipeline functions to eliminate Flux-specific logic.
- Updated integration and unit tests to reflect the removal of GGUF/FLUX support.
- Changed default model path in webui_settings.json for consistency.
- Implement tests to verify that HiresFix and Adetailer processors are called correctly based on user input.
- Include tests for the Img2Img processor and ensure proper context creation with `PipelineContext.from_kwargs`.
- Validate model type detection and capabilities for SD15 and SDXL models.
- Add a test for context creation with hires settings.
- Ensure all tests report their results clearly.
…on logic

- Updated test cases in `test_feature_triggers.py` to replace `PipelineContext` with `Context`.
- Renamed test functions for clarity and consistency.
- Adjusted assertions in model detection tests to reflect changes in detection logic.
- Updated error messages for clarity regarding unsupported GGUF files.
- Ensured that Juggernaut XL models are correctly detected as SDXL based on new naming conventions.
…dability and performance

- Simplified the calculation of start and end timesteps in `ksampler_util.py`.
- Streamlined pre-run control logic and condition handling.
- Enhanced the `apply_empty_x_to_equal_area` function for better clarity.
- Updated `get_area_and_mult` to optimize batch handling and tensor operations.
- Consolidated multi-scale preset definitions in `multiscale_presets.py` for brevity.
- Improved logging and documentation across various functions in `sampling_util.py`.
- Refined noise sampling methods to enhance efficiency and maintainability.
…ance

- Simplified functions in SDToken.py, including parse_parentheses, token_weights, escape_important, and load_embed.
- Removed unnecessary comments and docstrings for cleaner code.
- Optimized embedding loading logic and error handling in load_embed.
- Enhanced the SDTokenizer class by streamlining the initialization and tokenization processes.
- Updated utility functions in utils.py for better clarity and efficiency, including parse_blocks, convert_time, and get_sigma.
- Improved integration handling in the Integrations class and JHDIntegrations.
- Consolidated and optimized the scale_samples function for better performance.
- Consolidated import statements for better readability.
- Removed unnecessary docstrings and comments to streamline the code.
- Simplified the logic in several methods, including `init_integrations`, `build`, and `get_window_args`.
- Enhanced the handling of scale methods and warnings related to incompatible sizes.
- Improved the structure of the `State` and `Config` classes for clarity.
- Updated the `patch` method to handle YAML parameters more efficiently.
- Refined the `window_partition` and `window_reverse` methods for better performance.
- Adjusted the `get_shift` method to simplify the logic for avoiding duplicate shifts.
- Cleaned up the `ApplyMSWMSAAttentionSimple` class for consistency with the main attention class.
…t for improved model integration, fixed double CFG application in the sampling code
- Added Flux2 model implementation in `src/NeuralNetwork/flux2/model.py`, featuring a dual-stream architecture for image generation.
- Introduced `Flux2Params` dataclass for model configuration parameters.
- Created `Flux2` class with methods for forward pass, patchifying, and unpatchifying images.
- Developed `Flux2` latent format in `src/Utilities/Latent.py` with identity processing and VAE compatibility.
- Implemented Klein text encoder in `src/clip/KleinEncoder.py`, including tokenization and attention mechanisms.
- Updated sampling methods in `src/sample/sampling.py` to support Flux2 models.
- Enhanced model sampling logic to accommodate new Flux2 architecture.
- Enhanced the convert_cond function to include pooled_output as 'y' in model_conds for models requiring it, such as Flux2.
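The patchify/unpatchify pair mentioned in the Flux2 model commit maps between image space and token space. A self-contained sketch, assuming 2x2 patches (function names mirror the commit description; the exact implementation may differ):

```python
import torch

def patchify(x: torch.Tensor, p: int = 2) -> torch.Tensor:
    """(B, C, H, W) -> (B, (H/p)*(W/p), C*p*p): flatten each p x p
    patch into one token."""
    b, c, h, w = x.shape
    x = x.reshape(b, c, h // p, p, w // p, p)
    x = x.permute(0, 2, 4, 1, 3, 5)          # (B, H/p, W/p, C, p, p)
    return x.reshape(b, (h // p) * (w // p), c * p * p)

def unpatchify(x: torch.Tensor, h: int, w: int, p: int = 2) -> torch.Tensor:
    """Inverse of patchify: (B, N, C*p*p) -> (B, C, H, W)."""
    b, n, d = x.shape
    c = d // (p * p)
    x = x.reshape(b, h // p, w // p, c, p, p)
    x = x.permute(0, 3, 1, 4, 2, 5)          # (B, C, H/p, p, W/p, p)
    return x.reshape(b, c, h, w)
```

The two functions are exact inverses, so a patchify/unpatchify round trip reproduces the input tensor bit-for-bit.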
… in Adetailer, HiresFix, Img2Img, and sampling functions
@Aatricks Aatricks self-assigned this Feb 4, 2026
@Aatricks Aatricks merged commit 6b77645 into main Feb 4, 2026
0 of 2 checks passed
@Aatricks Aatricks deleted the flux2-klein branch February 5, 2026 16:21