
Conversation


@young917 young917 commented Sep 26, 2020

Remove the restriction that sequences must have length 7.

  • Before passing input to the Encoder or Decoder (def controller_train / controller_infer / generate_synthetic_controller_data):

sort the sequences in each batch by length (pack_padded_sequence expects descending order).

  • Decoder:

pack_padded_sequence before the rnn: makes the rnn ignore inputs beyond each sequence's length.

pad_packed_sequence after the rnn: restores the original shape.

pass a mask to the attention model: because sequence lengths can differ within a batch, the mask makes attention ignore positions beyond each sequence's length (a sketch of the whole pack/pad/mask pattern follows this list).

  • Encoder:

pack_padded_sequence before the rnn: makes the rnn ignore inputs beyond each sequence's length.

pad_packed_sequence after the rnn: restores the original shape.

  • DataLoader:

add 'input_len' (the list of sequence lengths in each batch) and pass it to the encoder / decoder

  • remains compatible with our nasbench API
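
A minimal sketch of the pack/pad/mask pattern described above (illustration only, not the repo code; the LSTM, layer sizes, and example lengths here are made up):

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

bsz, max_len, emb, hidden = 4, 7, 16, 32
rnn = nn.LSTM(emb, hidden, batch_first=True)

x = torch.randn(bsz, max_len, emb)        # padded batch
input_len = torch.tensor([7, 5, 4, 2])    # true lengths, sorted descending

# pack: the rnn skips time steps beyond each sequence's length
packed = pack_padded_sequence(x, input_len, batch_first=True)
packed_out, _ = rnn(packed)

# unpack: restore the (bsz, max_len, hidden) padded shape
out, _ = pad_packed_sequence(packed_out, batch_first=True, total_length=max_len)

# mask for attention: 1 at valid positions, 0 at padding
mask = (torch.arange(max_len)[None, :] < input_len[:, None]).float()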

@young917 young917 requested a review from juice500ml September 26, 2020 00:17
@young917 young917 self-assigned this Sep 26, 2020

young917 commented Sep 26, 2020

Checked that the loss decreases during training.
Checked that the final results include sequences with length < 7.


@juice500ml juice500ml left a comment


Updating this code to self-supervised-nas will take more time. Let's make sure this works!

mask = torch.zeros(bsz, x.size(1))
for i, l in enumerate(x_len):
    for j in range(l):
        mask[i][j] = 1


What about

for i, l in enumerate(x_len):
    mask[i, :l] = 1

?
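
A fully vectorized alternative (illustration only, not part of this PR) builds the same mask with a broadcasted comparison instead of Python loops:

# mask[i, j] == 1.0 exactly when j < x_len[i]
mask = (torch.arange(x.size(1))[None, :] < torch.as_tensor(x_len)[:, None]).float()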

if len(fixed_stat['module_operations']) < 7:
    continue
#if len(fixed_stat['module_operations']) < 7:
#    continue


Remove plz


def forward(self, x, encoder_hidden=None, encoder_outputs=None):
def forward(self, x, x_len, encoder_hidden=None, encoder_outputs=None):
# x is decoder_inputs = [0] + encoder_inputs[:-1]


Remove this

self.offsets.append( (i + 3) * i // 2 - 1)

def forward(self, x, encoder_hidden=None, encoder_outputs=None):
def forward(self, x, x_len, encoder_hidden=None, encoder_outputs=None):


x_len feels like a somewhat misnomer. If I understood it correctly, it is somewhat like this:
x_len = [len(x) for x in xs]
So, maybe x_len_per_elem? x_len_list? Or at least some comments would be helpful!

x = (residual + x) * math.sqrt(0.5)
predicted_softmax = F.log_softmax(self.out(x.view(-1, self.hidden_size)), dim=-1)
predicted_softmax = predicted_softmax.view(bsz, tgt_len, -1)
return predicted_softmax, None


Does the predicted softmax return sane values? If padded elements are zero-initialized, then the probability values will be broken.
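
One way to address this concern (a sketch of an assumed fix, not the repo's code) is to mask padded positions out of the loss so their log-probabilities, whatever they are, never contribute:

import torch

def masked_nll(predicted_softmax, target, mask):
    # predicted_softmax: (bsz, tgt_len, vocab) log-probabilities from the decoder
    # target: (bsz, tgt_len) token indices (arbitrary at padded positions)
    # mask: (bsz, tgt_len), 1 at valid positions, 0 at padding
    nll = -predicted_softmax.gather(-1, target.unsqueeze(-1)).squeeze(-1)
    return (nll * mask).sum() / mask.sum()   # average over valid positions only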
