Is there a padding mask when training the model? #29

Open, opened on Feb 26, 2024

Hi, I noticed that no padding mask is used when training the model.
However, a batch of data usually contains many padding tokens.

How does Mamba handle these padding tokens?
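
To make the question concrete, here is a minimal sketch of the kind of padding mask I mean. It is plain Python with no framework, and `PAD_ID`, `pad_batch`, and `masked_mean` are hypothetical names for illustration only. It is not taken from this repository and does not claim to show how Mamba or the training code here treats padding.

```python
# Hypothetical helpers and values for illustration; not from the repository.
PAD_ID = 0  # assumed padding token id


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad token-id lists to the longest length; return (batch, mask)."""
    max_len = max(len(seq) for seq in sequences)
    batch, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        batch.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks a padded position
        mask.append([1] * len(seq) + [0] * n_pad)
    return batch, mask


def masked_mean(per_token_loss, mask):
    """Average per-token losses over real tokens only, skipping padding."""
    total = sum(
        loss * m
        for loss_row, mask_row in zip(per_token_loss, mask)
        for loss, m in zip(loss_row, mask_row)
    )
    count = sum(sum(row) for row in mask)
    return total / count


# Three sequences of different lengths end up in one rectangular batch.
sequences = [[5, 8, 2], [7, 3, 9, 4, 6], [1]]
batch, mask = pad_batch(sequences)

# Share of positions in the batch that are padding rather than real tokens.
pad_fraction = 1 - sum(map(sum, mask)) / (len(batch) * len(batch[0]))
```

In this toy batch, 6 of the 15 positions are padding. A mask like this would let the loss (and possibly the model itself) skip those positions, which is the mechanism I could not find in the training code.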
