Remove restriction #3

base: self-supervised-nas
Conversation
juice500ml left a comment
Updating this code to self-supervised-nas will take more time. Let's make sure this works!
    mask = torch.zeros(bsz, x.size(1))
    for i, l in enumerate(x_len):
        for j in range(l):
            mask[i][j] = 1
What about

for i, l in enumerate(x_len):
    mask[i, :l] = 1

?
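For reference, the same mask can also be built with no Python loop at all; a minimal sketch, assuming x_len is a list or 1-D tensor holding each row's true length and x has shape (bsz, max_len, ...):

import torch

# Positions 0..max_len-1 compared against each row's length:
# (1, max_len) < (bsz, 1) broadcasts to a (bsz, max_len) boolean grid.
lengths = torch.as_tensor(x_len)
mask = (torch.arange(x.size(1)).unsqueeze(0) < lengths.unsqueeze(1)).float()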
-    if len(fixed_stat['module_operations']) < 7:
-        continue
+    #if len(fixed_stat['module_operations']) < 7:
+    #    continue
Remove plz
-    def forward(self, x, encoder_hidden=None, encoder_outputs=None):
+    def forward(self, x, x_len, encoder_hidden=None, encoder_outputs=None):
         # x is decoder_inputs = [0] + encoder_inputs[:-1]
Remove this
     self.offsets.append( (i + 3) * i // 2 - 1)

-    def forward(self, x, encoder_hidden=None, encoder_outputs=None):
+    def forward(self, x, x_len, encoder_hidden=None, encoder_outputs=None):
x_len feels like a somewhat of a misnomer. If I understood it correctly, it is roughly this:

x_len = [len(x) for x in xs]

So maybe x_len_per_elem? x_len_list? Or at least some comments would be helpful!
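A minimal sketch of the kind of comment that would help, assuming x_len holds one true (pre-padding) length per row; the comment text itself is hypothetical, not from this repo:

def forward(self, x, x_len, encoder_hidden=None, encoder_outputs=None):
    # x:     padded batch of sequences, shape (bsz, max_len, ...)
    # x_len: true length of each row in x before padding,
    #        i.e. x_len = [len(seq) for seq in the unpadded batch]
    ...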
    x = (residual + x) * math.sqrt(0.5)
    predicted_softmax = F.log_softmax(self.out(x.view(-1, self.hidden_size)), dim=-1)
    predicted_softmax = predicted_softmax.view(bsz, tgt_len, -1)
    return predicted_softmax, None
Does predicted_softmax return sane values? If the padded elements are zero-initialized, the probability values will be broken.
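One common way to keep padded positions from corrupting training is to mask them in the loss rather than trusting the outputs there; a minimal sketch, where PAD_ID and target are assumptions, not names from this repo (F.nll_loss pairs with the log_softmax output above):

import torch.nn.functional as F

PAD_ID = 0  # assumed padding token id
# predicted_softmax: (bsz, tgt_len, vocab) log-probabilities from above
# target:            (bsz, tgt_len) gold token ids, PAD_ID at padded steps
loss = F.nll_loss(
    predicted_softmax.view(-1, predicted_softmax.size(-1)),
    target.view(-1),
    ignore_index=PAD_ID,  # padded positions contribute nothing to the loss
)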
Remove the restriction on sequence length (= 7):
sort the sequences in each batch by length.
pack_padded_sequence before the RNN: makes the RNN ignore inputs beyond each sequence's length.
pad_packed_sequence after the RNN: restores the original padded shape.
pass a mask to the attention model: because sequence lengths can differ within one batch, use the mask to ignore out-of-length positions.
add 'input_len' (list of the sequence lengths in one batch) and pass it to the encoder / decoder.
A sketch of this pipeline follows below.
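A minimal sketch of the steps above, under these assumptions: rnn is a torch.nn.GRU built with batch_first=True, x is a padded (bsz, max_len, input_size) batch, and input_len lists each row's true length; none of these names are taken from the repo.

import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

def encode(rnn, x, input_len):
    # 1. Sort by length, longest first (pack_padded_sequence expects
    #    descending lengths unless enforce_sorted=False is passed).
    lengths = torch.as_tensor(input_len)
    sorted_len, sort_idx = lengths.sort(descending=True)
    x = x[sort_idx]

    # 2. Pack so the RNN skips each sequence's padded tail.
    packed = pack_padded_sequence(x, sorted_len.tolist(), batch_first=True)
    packed_out, hidden = rnn(packed)

    # 3. Unpack to restore the padded (bsz, max_len, hidden_size) shape.
    out, _ = pad_packed_sequence(packed_out, batch_first=True)

    # 4. Undo the sort so rows match the original batch order again.
    _, unsort_idx = sort_idx.sort()
    out = out[unsort_idx]
    hidden = hidden[:, unsort_idx]  # GRU hidden: (num_layers, bsz, hidden_size)

    # 5. Attention mask: True at valid positions, False at padding.
    mask = torch.arange(out.size(1)).unsqueeze(0) < lengths.unsqueeze(1)
    return out, hidden, mask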