feat(rebase): Transformers 4.57.3 cache/rebase fixes and stabilization #865

Open
vbaddi wants to merge 4 commits into quic:main from vbaddi:dev/rebase_transformers_v4_57_3


vbaddi commented Mar 17, 2026

Rebased the QEff wrappers onto transformers==4.57.3 with minimal, targeted compatibility updates for cache handling and modeling code.

What changed

  • Updated cache handling to match the cache object behavior in HF transformers 4.57.3 (the layers / get_seq_length access patterns).

    • Standardized kv_seq_len resolution across affected modeling wrappers.
    • Moved resolve_kv_seq_len utility to QEfficient/utils/_utils.py and imported it safely to avoid circular imports.
    • Updated affected model wrappers to use the shared helper.
    • Added compatibility updates in quantizer paths (including AWQ-related drift).
    • Fixed quickcheck VLM fallback handling for qwen2_5_vl_text.
  • Ran:
    python -m pytest -q tests/test_model_quickcheck.py -n auto
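
The shared kv_seq_len helper is not shown in this description. A minimal sketch of what cross-version cache-length resolution could look like follows. The name `resolve_kv_seq_len_sketch`, its signature, the `fallback_len` parameter, and the legacy tuple layout `[batch, heads, seq_len, head_dim]` are all assumptions for illustration, not the actual code in QEfficient/utils/_utils.py.

```python
from types import SimpleNamespace


def resolve_kv_seq_len_sketch(past_key_value, layer_idx, fallback_len):
    """Hypothetical helper: best-effort cached sequence length for one layer."""
    # No cache yet (e.g. first forward pass): use the caller-supplied length.
    if past_key_value is None:
        return fallback_len

    # Cache-style objects: assumed to expose get_seq_length(layer_idx).
    get_len = getattr(past_key_value, "get_seq_length", None)
    if callable(get_len):
        return int(get_len(layer_idx))

    # Legacy format (assumed): a tuple of per-layer (key, value) pairs,
    # with key shaped [batch, heads, seq_len, head_dim].
    if isinstance(past_key_value, (tuple, list)) and len(past_key_value) > layer_idx:
        key = past_key_value[layer_idx][0]
        return int(key.shape[-2])

    # Unknown cache type: fall back rather than guess.
    return fallback_len


# Usage with stand-in objects (no torch needed for the sketch).
fake_key = SimpleNamespace(shape=(1, 2, 5, 4))
legacy_cache = ((fake_key, fake_key),)
print(resolve_kv_seq_len_sketch(legacy_cache, layer_idx=0, fallback_len=3))
```

Keeping this logic in one utility means each model wrapper makes a single call instead of repeating version-specific branches.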

vbaddi added 4 commits March 17, 2026 17:53
- Pin transformers to 4.57.3
- Keep QEff cache internals self-owned (CacheLayerMixin/Cache adapter path), with legacy interop.
- Update model kv_seq_len calls to use cross-version cache-length resolution.
- Add small quantizer compatibility guards (AWQ/update_dtype paths).
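
Because the commit pins transformers to 4.57.3, a quick runtime check can confirm that the local environment matches before running the quickcheck tests. The sketch below is an illustration only; the helper names `installed_version` and `matches_pin` are hypothetical and not part of this PR.

```python
from importlib import metadata

PINNED_TRANSFORMERS = "4.57.3"  # version named in this PR


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned=PINNED_TRANSFORMERS):
    """True only when an installed version exists and equals the pin exactly."""
    return installed is not None and installed == pinned


if __name__ == "__main__":
    found = installed_version("transformers")
    status = "ok" if matches_pin(found) else "mismatch"
    print(f"transformers installed={found} pinned={PINNED_TRANSFORMERS} -> {status}")
```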

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>
vbaddi added the enhancement label Mar 17, 2026