How to achieve frame-level interaction #6

@LeeKeyu

Hi, thanks for your impressive work!
After reading your paper, I have a question about frame-level interaction control. As I understand it, the actions are injected as a sequence of length (1+n) to generate (1+n) images together, and this is then extended autoregressively to produce a long video.

So during inference, is it possible to provide one action at a time to generate the next frame? If not, how do you define frame-level control? Thanks a lot in advance.
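
To make the question concrete, here is a toy sketch of how I picture the rollout. It is not taken from the paper or the repository: `toy_generate_chunk`, `rollout`, and `chunk_len` are hypothetical names, and frames and actions are plain integers standing in for images and control signals. The only thing it illustrates is the difference between feeding a whole chunk of actions at once and feeding one action per step.

```python
from typing import Callable, List, Sequence

Frame = int   # stand-in for an image (assumption, for illustration only)
Action = int  # stand-in for a control signal (assumption)


def toy_generate_chunk(context: Frame, actions: Sequence[Action]) -> List[Frame]:
    """Hypothetical stand-in for the model: returns one frame per action.

    Each toy frame depends only on the previous frame and the current action.
    A real video model may not have this property.
    """
    frames: List[Frame] = []
    current = context
    for action in actions:
        current = current + action
        frames.append(current)
    return frames


def rollout(
    first_frame: Frame,
    actions: Sequence[Action],
    chunk_len: int,
    generate_chunk: Callable[[Frame, Sequence[Action]], List[Frame]] = toy_generate_chunk,
) -> List[Frame]:
    """Extend a video autoregressively, `chunk_len` actions at a time.

    The last generated frame is used as the context for the next chunk.
    """
    video = [first_frame]
    for start in range(0, len(actions), chunk_len):
        chunk_actions = actions[start:start + chunk_len]
        video.extend(generate_chunk(video[-1], chunk_actions))
    return video


actions = [1, 2, 3, 4, 5, 6]
chunked = rollout(0, actions, chunk_len=3)   # several actions per generation call
stepwise = rollout(0, actions, chunk_len=1)  # one action per generation call
print(chunked)
print(stepwise)
```

In this toy the two rollouts agree because each frame depends only on the previous frame and the current action. Whether the real model behaves the same way with `chunk_len=1`, or whether it needs the full (1+n) action sequence up front, is what I am asking.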
