Issue Description:
Hi there 👋,
While reviewing and testing the provided Transformer architecture in PyTorch, I came across several bugs and typos that need fixing for the model to run correctly and behave as expected. Below is a breakdown of each issue for clarity and maintainability:
🔧 1. Incorrect Variable Name: quereis = self.quereis(queries)
- Location: SelfAttention.forward()
- Problem: Typo in both the variable name and the method call:
  quereis = self.quereis(queries)
  Should likely be:
  queries = self.queries(query)
- Fix:
  queries = self.queries(query)
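To illustrate the corrected line, here is a minimal sketch of the projection step, assuming (hypothetically) that SelfAttention holds an nn.Linear layer named queries and that forward() receives a query tensor of shape (N, seq_len, embed_size):

```python
import torch
import torch.nn as nn

embed_size = 256
queries_proj = nn.Linear(embed_size, embed_size)  # stands in for self.queries

query = torch.randn(2, 10, embed_size)  # (N, seq_len, embed_size)
queries = queries_proj(query)           # the corrected line: queries = self.queries(query)

print(queries.shape)  # torch.Size([2, 10, 256])
```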
🔧 2. Softmax Missing on Attention Scores
- Location: SelfAttention.forward()
- Problem: After calculating energy, the code attempts:
  attention = torch.masked_fill(mask == 0, float("-1e20"))
  This is syntactically incorrect (masked_fill is a tensor method, not a torch-level call with these arguments) and the softmax over the attention scores is missing entirely. A softmax along the key dimension should be applied after masking.
- Fix:
  energy = energy.masked_fill(mask == 0, float("-1e20"))
  attention = torch.softmax(energy, dim=-1)
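The fix above can be sketched end to end. The shapes below are hypothetical but match the usual convention where energy is (N, heads, query_len, key_len) and the padding mask broadcasts against it:

```python
import torch

N, heads, query_len, key_len = 2, 4, 5, 5
energy = torch.randn(N, heads, query_len, key_len)
# Padding mask: 1 where a token is real, 0 where it is padding
mask = torch.randint(0, 2, (N, 1, 1, key_len))

# masked_fill is a tensor method: push padded positions toward -inf
energy = energy.masked_fill(mask == 0, float("-1e20"))

# Softmax over the key dimension turns scores into attention weights
attention = torch.softmax(energy, dim=-1)

# Each query's weights now sum to 1 over the keys
print(torch.allclose(attention.sum(dim=-1), torch.ones(N, heads, query_len)))
```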
🔧 3. Typo in Decoder Positional Embedding Line
- Location: Decoder.forward()
- Problem: The line
  x = self.dropout((self.word_embedding(x) + self_position_embedding(embeddings)))
  references self_position_embedding and embeddings, neither of which is defined.
- Fix:
  x = self.dropout(self.word_embedding(x) + self.position_embedding(positions))
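For context, the positions tensor used in the fix is typically built from the input's sequence length. A minimal sketch, assuming x is a batch of token indices of shape (N, seq_length):

```python
import torch

N, seq_length = 2, 7
x = torch.randint(0, 100, (N, seq_length))  # token indices

# One row of [0, 1, ..., seq_length - 1] per example,
# fed to self.position_embedding alongside the word embedding of x
positions = torch.arange(0, seq_length).expand(N, seq_length)

print(positions.shape)  # torch.Size([2, 7])
```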
🔧 4. Misspelled Class Name: Tranformer → Transformer
- Location: Class declaration
- Problem: class Tranformer(nn.Module): is misspelled, so any import or reference that uses the correct name will fail.
- Fix:
  class Transformer(nn.Module):
🔧 5. Incorrect Class Reference in super()
- Location: Inside the Transformer class
- Problem: The line
  super(Trasnformer, self).__init__()
  misspells the class name again, raising a NameError at construction time.
- Fix:
  super(Transformer, self).__init__()
🔧 6. Bug in make_src_mask() Function
- Location: Transformer class
- Problem: The line
  src_mask = (src != self.src.pad_idx)... uses self.src.pad_idx, which is incorrect; the attribute is src_pad_idx.
- Fix:
  src_mask = (src != self.src_pad_idx).unsqueeze(1).unsqueeze(2)
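The two unsqueeze calls in the fix reshape the mask so it broadcasts against the attention scores. A small sketch, assuming a hypothetical pad index of 0 and energy of shape (N, heads, query_len, key_len):

```python
import torch

src_pad_idx = 0
src = torch.tensor([[5, 3, 0, 0],
                    [7, 2, 9, 0]])  # (N, src_len) with trailing padding

# (N, src_len) -> (N, 1, 1, src_len): True for real tokens, False for padding
src_mask = (src != src_pad_idx).unsqueeze(1).unsqueeze(2)

print(src_mask.shape)  # torch.Size([2, 1, 1, 4])
```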
✅ Suggestions
- Add import torch.nn.functional as F to support softmax, if not already present.
- Consider adding assert statements to check tensor shapes where appropriate.
- Modularize testing of each block to catch bugs early.
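As a sketch of the last two suggestions, here is a hypothetical smoke test for a single block with a shape assertion. nn.MultiheadAttention stands in for the repository's own SelfAttention; swap in the real class when wiring this up:

```python
import torch
import torch.nn as nn

embed_size, heads = 256, 8
layer = nn.MultiheadAttention(embed_size, heads, batch_first=True)

x = torch.randn(2, 10, embed_size)  # (N, seq_len, embed_size)
out, _ = layer(x, x, x)             # self-attention: query = key = value = x

# Self-attention must preserve the input shape
assert out.shape == x.shape, f"unexpected output shape {out.shape}"
print("block OK")
```

Running one such test per module (attention, encoder block, decoder block) catches typos like those above before the full model is assembled.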