Are all I-frame tokens intended to be preserved in the current implementation?

Hi, thanks for the great work on this HEVC-based token selection pipeline. I have a question about how I-frames are handled in ap_dataloader_dali_codec.py.

My understanding from the paper is that all tokens from I-frames are preserved, while Top-K selection is only applied to P-frame patches based on codec-derived saliency. In particular, Equation (2) seems to describe the HEVC input as keeping the full patchified I-frame and applying the visibility mask only to decoded P-frames.
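To make my reading of the paper concrete, here is a minimal sketch of the selection I expected. It is not taken from the repository: `select_tokens`, `scores`, `is_iframe`, and `k` are hypothetical names, and I am assuming a single global Top-K budget over all P-frame patches (the paper may instead intend a per-frame budget).

```python
import numpy as np

def select_tokens(scores, is_iframe, k):
    """Hypothetical sketch: keep every I-frame token plus the Top-K P-frame tokens.

    scores    : (T, N) array of per-patch saliency (T frames, N patches per frame)
    is_iframe : (T,) boolean array, True where the frame is an I-frame
    k         : assumed global token budget for P-frame patches
    Returns a sorted array of flat token indices (t * N + n).
    """
    T, N = scores.shape
    flat_ids = np.arange(T * N).reshape(T, N)

    # All I-frame tokens are kept unconditionally, regardless of score.
    keep_i = flat_ids[is_iframe].ravel()

    # Top-K is applied to P-frame patches only.
    p_ids = flat_ids[~is_iframe].ravel()
    p_scores = scores[~is_iframe].ravel()
    k = min(k, p_ids.size)
    top = np.argsort(-p_scores, kind="stable")[:k]
    keep_p = p_ids[top]

    return np.sort(np.concatenate([keep_i, keep_p]))

# Tiny example: frame 0 is an I-frame with all-zero scores, yet all 4 of its tokens survive.
example_scores = np.array([[0, 0, 0, 0],
                           [1, 5, 2, 4],
                           [3, 0, 6, 0]], dtype=np.float32)
example_keep = select_tokens(example_scores, np.array([True, False, False]), k=3)
```

Under this reading, the I-frame tokens never compete in the ranking, so their residual scores would not matter.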

However, in get_frame_id_list, I noticed that residuals at I-frame positions are explicitly zeroed out:

```python
if pos in I_pos_set:
    residuals_y[pos] = np.zeros((H0, W0), dtype=dtype0 or np.uint8)
```

Since patch scores in compute_visible_indices_cpu are computed from residual energy, this seems to imply that all I-frame patches receive a score of 0 and therefore would not be selected by Top-K, except possibly through tie-breaking or the static_fallback path.
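Here is a small standalone reproduction of the effect I am describing. It does not call the repository's `compute_visible_indices_cpu`; `patch_energy` is a hypothetical stand-in, and I am assuming the score is the sum of absolute residual values per non-overlapping patch, with one shared Top-K over all frames.

```python
import numpy as np

def patch_energy(residual, patch):
    """Assumed scoring: sum of absolute residual values in each non-overlapping patch."""
    H, W = residual.shape
    r = residual.astype(np.float32).reshape(H // patch, patch, W // patch, patch)
    return np.abs(r).sum(axis=(1, 3)).ravel()

rng = np.random.default_rng(0)
H0, W0, patch = 8, 8, 4

# Three frames with strictly positive residuals; position 0 plays the role of the I-frame.
residuals_y = [rng.integers(1, 255, size=(H0, W0), dtype=np.uint8) for _ in range(3)]
I_pos_set = {0}
for pos in I_pos_set:
    residuals_y[pos] = np.zeros((H0, W0), dtype=np.uint8)  # mirrors the zeroing shown above

scores = np.stack([patch_energy(r, patch) for r in residuals_y])  # shape (3, 4)

# Shared Top-K over all frames: zero-scored I-frame patches rank last.
k = 4
topk = np.argsort(-scores.ravel(), kind="stable")[:k]
topk_frames = topk // scores.shape[1]  # frame index of each selected patch
```

In this toy setup every selected patch comes from frames 1 and 2. The I-frame patches would only be picked once `k` exceeds the number of P-frame patches with non-zero score.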

So I wanted to check whether I am misunderstanding the implementation, or whether the current code is using a different behavior from what I inferred from the paper. If I-frame tokens are indeed intended to be fully preserved, could you clarify where that happens in the pipeline?

Thanks!

Are all I-frame tokens intended to be preserved in the current implementation? #113
