In `FastInst/fastinst/modeling/transformer_decoder/fastinst_decoder.py` (line 52, commit 4996a61):

```python
self.meta_pos_embed = nn.Parameter(torch.empty(1, hidden_dim, meta_pos_size, meta_pos_size))
```
a set of weights is created that represents the learnable positional embeddings for the pixel features, as described in the paper. However, since `torch.empty` allocates uninitialized memory, the parameter never receives an explicit initialization, so its initial values are whatever happens to be in memory and can even be NaN. Why is this? Wouldn't it make sense to give it some kind of initialization, either sine/cosine or normally distributed?
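For reference, a common pattern in ViT-style models is to initialize learnable positional embeddings with a truncated normal distribution. A minimal sketch of what that could look like here (the `hidden_dim` and `meta_pos_size` values below are purely illustrative, and the truncated-normal choice is my assumption, not something FastInst prescribes):

```python
import torch
import torch.nn as nn

# Illustrative sizes, not taken from the FastInst configs
hidden_dim, meta_pos_size = 256, 10

# Same shape as the parameter in fastinst_decoder.py
meta_pos_embed = nn.Parameter(torch.empty(1, hidden_dim, meta_pos_size, meta_pos_size))

# ViT-style truncated-normal init; guarantees finite starting values
# (an assumption on my part, not the repo's actual init scheme)
nn.init.trunc_normal_(meta_pos_embed, std=0.02)
```

With an init like this, the embeddings start small and finite rather than depending on uninitialized memory.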
Also, could you provide some insight into why the resolution of the learnable positional embeddings is directly tied to the number of queries? If one only wants to detect 2-3 objects per image, there is no need for 100 queries, yet 100 positional embedding values are still required.
Thank you!