
Remove head_mask and attention weights from VideoGPT #536

Open
stashuk-olek wants to merge 2 commits into facebookresearch:main from stashuk-olek:export-D92927089

Conversation

@stashuk-olek

Summary:
Remove dead head_mask, return_attn_weights, and attention_weights from the VideoGPT stack. These features were never used by any consumer — head_mask was always None or all-ones, and return_attn_weights was always False except in tests that verified the feature itself.

This removes:

  • attention_weights field from TransformerDecoderOutput and TransformerLayerOutput NamedTuples
  • head_mask and return_attn_weights params from MultimodalGPT, MultimodalTransformerDecoder, TransformerDecoder, and TransformerDecoderLayer
  • head_mask param from AxialAttention.forward in video_vqvae.py
  • return_attn_weights param from GenerationUtil.sample
  • All head_mask and return_attn_weights usage from tests
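
To illustrate the NamedTuple change, here is a minimal sketch in plain Python. The field names mirror the description above, but the classes are simplified, hypothetical stand-ins, not the actual torchmultimodal definitions:

```python
from typing import NamedTuple, Optional

# Hypothetical, simplified stand-in for the pre-cleanup output type:
# the layer output carried an attention_weights slot that every caller
# left as None.
class TransformerLayerOutputBefore(NamedTuple):
    hidden_states: list
    attention_weights: Optional[list]  # dead: always None in practice
    past_key_values: Optional[tuple]

# After the cleanup the dead field is gone, so no caller can come to
# depend on it.
class TransformerLayerOutput(NamedTuple):
    hidden_states: list
    past_key_values: Optional[tuple]

out = TransformerLayerOutput(hidden_states=[1.0, 2.0], past_key_values=None)
print(out.hidden_states)                   # [1.0, 2.0]
print(hasattr(out, "attention_weights"))   # False
```

Because NamedTuples are positional, dropping a middle field like this is an intentional breaking change for any caller that unpacked the tuple by position, which is why the cleanup also touches every call site and test.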

Differential Revision: D92927089

meta-cla bot added the CLA Signed label on Feb 11, 2026

meta-codesync bot commented Feb 11, 2026

@stashuk-olek has exported this pull request. If you are a Meta employee, you can view the originating Diff in D92927089.

stashuk-olek added a commit to stashuk-olek/multimodal that referenced this pull request Feb 11, 2026
…h#536)

Summary:

Remove dead `head_mask`, `return_attn_weights`, and `attention_weights` from the VideoGPT stack. These features were never used by any consumer — `head_mask` was always `None` or all-ones, and `return_attn_weights` was always `False` except in tests that verified the feature itself.

This removes:
- `attention_weights` field from `TransformerDecoderOutput` and `TransformerLayerOutput` NamedTuples
- `head_mask` and `return_attn_weights` params from `MultimodalGPT`, `MultimodalTransformerDecoder`, `TransformerDecoder`, and `TransformerDecoderLayer`
- `head_mask` param from `AxialAttention.forward` in video_vqvae.py
- `return_attn_weights` param from `GenerationUtil.sample`
- All `head_mask` and `return_attn_weights` usage from tests

Differential Revision: D92927089
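
A sketch of why removing `return_attn_weights` simplifies the whole call chain: when a boolean flag is always `False`, each layer of the chain still has to thread and branch on it. The functions below are hypothetical stand-ins in plain Python, not the real `TransformerDecoder` code:

```python
from typing import List, Optional, Tuple

# Before: every level threads a flag that real callers never set.
def layer_forward(
    x: List[float], return_attn_weights: bool = False
) -> Tuple[List[float], Optional[List[float]]]:
    out = [v * 2.0 for v in x]  # stand-in for the real layer computation
    if return_attn_weights:
        return out, [1.0 / len(x)] * len(x)
    return out, None

def decoder_forward(x: List[float], return_attn_weights: bool = False):
    weights = []
    for _ in range(3):  # three stacked layers
        x, w = layer_forward(x, return_attn_weights)
        weights.append(w)
    return x, weights

# After: with the dead flag removed, the chain collapses to the
# False path and stops returning a list of Nones.
def decoder_forward_cleaned(x: List[float]) -> List[float]:
    for _ in range(3):
        x = [v * 2.0 for v in x]
    return x

hidden, attn = decoder_forward([1.0, 2.0])
assert hidden == decoder_forward_cleaned([1.0, 2.0])
assert attn == [None, None, None]  # the "weights" were never real data
```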

stashuk-olek added a commit to stashuk-olek/multimodal that referenced this pull request Feb 13, 2026
…h#536)

Summary:

Remove dead `head_mask`, `return_attn_weights`, and `attention_weights` from the VideoGPT stack. These features were never used by any consumer — `head_mask` was always `None` or all-ones, and `return_attn_weights` was always `False` except in tests that verified the feature itself.

Reviewed By: OmarPavel

Differential Revision: D92927089
… weights in FLAVA (facebookresearch#535)

Summary:

The `attentions` field on `TransformerOutput` and the `return_attn_weights`/`head_mask` parameters in the FLAVA encoder stack were never used by any consumer.

This diff cleans them up. The longer-term intent is to simplify attention usage and move to a common API for it.
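
To see why an always-`None`-or-all-ones `head_mask` is dead code: the mask multiplies attention probabilities per head, so a mask of all ones is a no-op. A dependency-free sketch (a toy softmax, not the real FLAVA attention):

```python
import math
from typing import List, Optional

# Toy single-head attention-probability computation. In the real models
# head_mask multiplies the post-softmax probabilities per head; here one
# list of scores stands in for one head.
def attention_probs(
    scores: List[float], head_mask: Optional[List[float]] = None
) -> List[float]:
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    if head_mask is not None:
        probs = [p * m for p, m in zip(probs, head_mask)]
    return probs

scores = [0.1, 0.5, 0.4]
# Multiplying by 1.0 is exact in IEEE floats, so an all-ones mask
# produces bit-identical output to passing no mask at all.
assert attention_probs(scores) == attention_probs(scores, [1.0, 1.0, 1.0])
```

Since every call site fell into one of these two no-op cases, deleting the parameter changes no observable behavior.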

Reviewed By: OmarPavel

Differential Revision: D92927086

Labels: CLA Signed, fb-exported, meta-exported

1 participant