@Pfannkuchensack
Contributor

@Pfannkuchensack Pfannkuchensack commented Dec 4, 2025

Summary

This PR adds Regional Guidance support for Z-Image (S3-DiT Transformer) models, enabling users to apply different prompts to different regions of the image using attention masks.

Key implementation details:

Backend:

  • New ZImageRegionalPromptingExtension class that builds regional attention masks
  • New ZImageTextConditioning and ZImageRegionalTextConditioning dataclasses for managing regional text embeddings
  • Transformer forward patching via patch_transformer_for_regional_prompting context manager
  • Attention mask format: 4D additive float mask (0.0 = attend, -inf = block) in bfloat16 dtype
  • Alternating layer strategy: even layers use regional mask, odd layers use full attention for global coherence
  • Z-Image uses sequence order [img_tokens, txt_tokens] (different from FLUX's [txt_tokens, img_tokens])
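
As a rough illustration of the mask layout described above, the following sketch builds an additive mask for the [img_tokens, txt_tokens] ordering and picks a mask per layer. It is not the code from this PR: `build_regional_mask` and `mask_for_layer` are hypothetical names, plain Python lists stand in for tensors, and the pairing rule (image-to-image attention unrestricted, image and text tokens attending only within the same region) is an assumption made for the example. The real mask would be a 4D bfloat16 tensor; the 2D matrix here would correspond to its last two dimensions.

```python
import math

NEG_INF = -math.inf  # additive mask value that blocks attention


def build_regional_mask(region_img_masks, region_txt_lens):
    """Build a (seq x seq) additive mask for the order [img_tokens, txt_tokens].

    region_img_masks: one list of bools per region, each of length n_img,
        marking which image tokens belong to that region.
    region_txt_lens: number of text tokens in each region's prompt.
    Pairing rule (assumed for this sketch): image tokens always see each
    other; image and text tokens see each other only within one region;
    text tokens see only text tokens of the same prompt.
    """
    n_img = len(region_img_masks[0])
    seq_len = n_img + sum(region_txt_lens)

    # Start fully blocked, then open the allowed pairs with 0.0.
    mask = [[NEG_INF] * seq_len for _ in range(seq_len)]

    # Image tokens come first in the sequence: open the img-img block.
    for q in range(n_img):
        for k in range(n_img):
            mask[q][k] = 0.0

    txt_start = n_img  # text tokens follow the image tokens
    for img_mask, txt_len in zip(region_img_masks, region_txt_lens):
        txt_ids = range(txt_start, txt_start + txt_len)
        img_ids = [i for i, inside in enumerate(img_mask) if inside]
        for t in txt_ids:
            for u in txt_ids:
                mask[t][u] = 0.0  # text attends within its own prompt
            for i in img_ids:
                mask[i][t] = 0.0  # region image token -> region text
                mask[t][i] = 0.0  # region text -> region image token
        txt_start += txt_len
    return mask


def mask_for_layer(layer_idx, regional_mask):
    """Even layers use the regional mask; odd layers use full attention."""
    return regional_mask if layer_idx % 2 == 0 else None
```

With four image tokens split into two regions and prompts of two and three text tokens, `build_regional_mask([[True, True, False, False], [False, False, True, True]], [2, 3])` yields a 9x9 matrix in which image token 0 can attend to the first prompt's text (columns 4 and 5) but not to the second prompt's text (columns 6 to 8).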

Frontend:

  • Updated buildZImageGraph.ts to support regional conditioning collectors
  • Updated addRegions.ts to create z_image_text_encoder nodes for Z-Image regions
  • Updated addZImageLoRAs.ts to handle optional negCond when guidance_scale=0
  • Added Z-Image validation in validators.ts (no IP adapters, no autoNegative support)
  • Negative conditioning nodes only created when guidance_scale > 0

Related Issues / Discussions

#8670
Extends Z-Image support (from the Z-Image-Turbo PR) with regional prompting capabilities.

QA Instructions

  1. Select a Z-Image model (e.g., Z-Image-Turbo)
  2. Create two or more Regional Guidance layers in the Control Layers panel
  3. Draw masks for each region
  4. Add different prompts to each region (e.g., "red apple" for left region, "blue sky" for right region)
  5. Add a global prompt (optional)
  6. Generate image
  7. Verify that different regions follow their respective prompts

Merge Plan

Should be merged after the main Z-Image support PR (feat/z-image-turbo-support), as this builds on top of that implementation.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions github-actions bot added the api, python, Root, invocations, backend, frontend, and python-deps labels Dec 4, 2025
@lstein
Collaborator

lstein commented Dec 16, 2025

I will start my review after the base z-image-turbo support is merged, which should be soon.

@blessedcoolant
Collaborator

Where are we on this one? Now that the primary Z-Image and ControlNet PRs are in, this should be pretty straightforward. Can you rebase this, please? I'll run through the code and tests.

@Pfannkuchensack Pfannkuchensack force-pushed the feat/z-image-regional-guidance branch from f4929ae to 56c7fbc December 23, 2025 01:26
Implements regional prompting for Z-Image (S3-DiT Transformer) allowing
different prompts to affect different image regions using attention masks.

Backend changes:
- Add ZImageRegionalPromptingExtension for mask preparation
- Add ZImageTextConditioning and ZImageRegionalTextConditioning data classes
- Patch transformer forward to inject 4D regional attention masks
- Use additive float mask (0.0 attend, -inf block) in bfloat16 for compatibility
- Alternate regional/full attention layers for global coherence

Frontend changes:
- Update buildZImageGraph to support regional conditioning collectors
- Update addRegions to create z_image_text_encoder nodes for regions
- Update addZImageLoRAs to handle optional negCond when guidance_scale=0
- Add Z-Image validation (no IP adapters, no autoNegative)
Fix windows path again
@Pfannkuchensack Pfannkuchensack force-pushed the feat/z-image-regional-guidance branch from 56c7fbc to 1a37af3 December 23, 2025 01:36
@Pfannkuchensack Pfannkuchensack marked this pull request as ready for review December 23, 2025 02:32
Changed the guidance_scale check from > 0 to > 1 for Z-Image models.
Since Z-Image uses guidance_scale=1.0 as "no CFG" (matching FLUX convention),
negative conditioning should only be created when guidance_scale > 1.
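
To see why a threshold of 1 rather than 0 makes sense, here is a minimal sketch of the usual classifier-free guidance blend. It assumes the standard CFG formula and uses the hypothetical helpers `apply_cfg` and `needs_negative_conditioning`. The actual check in this PR lives in the TypeScript graph builder, so this Python version only illustrates the logic.

```python
def apply_cfg(cond: float, uncond: float, guidance_scale: float) -> float:
    """Standard classifier-free guidance blend (assumed form, scalar for brevity)."""
    return uncond + guidance_scale * (cond - uncond)


def needs_negative_conditioning(guidance_scale: float) -> bool:
    """Negative conditioning only matters once the scale exceeds 1.0."""
    return guidance_scale > 1.0
```

At `guidance_scale=1.0` the blend reduces to the conditional prediction alone, so the negative branch contributes nothing and its nodes can be skipped.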