
❗ Bugs in Transformer Model Implementation (Attention Layer, Typo, and Masking Issues) #1

@ShigrafS

Description


Issue Description:

Hi there 👋,

While reviewing and testing the provided Transformer architecture in PyTorch, I came across several bugs and typos that need fixing for the model to run correctly and behave as expected. Below is a breakdown of each issue for clarity and maintainability:


🔧 1. Incorrect Variable Name: quereis = self.quereis(queries)

  • Location: SelfAttention.forward()
  • Problem:
    • Typo in variable and method call: quereis = self.quereis(queries)
    • Should likely be: queries = self.queries(query)
  • Fix:
    queries = self.queries(query)
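For context, here is a minimal sketch of the projection step with the typo fixed. Layer names (`values`, `keys`, `queries`) are assumed from the issue text, and the multi-head split is omitted for brevity:

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Sketch of the corrected projection step (head split omitted)."""
    def __init__(self, embed_size, heads):
        super().__init__()
        self.heads = heads
        self.head_dim = embed_size // heads
        self.values = nn.Linear(embed_size, embed_size, bias=False)
        self.keys = nn.Linear(embed_size, embed_size, bias=False)
        self.queries = nn.Linear(embed_size, embed_size, bias=False)

    def forward(self, values, keys, query, mask=None):
        values = self.values(values)
        keys = self.keys(keys)
        queries = self.queries(query)  # fixed: was `quereis = self.quereis(queries)`
        return values, keys, queries
```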

🔧 2. Softmax Missing on Attention Scores

  • Location: SelfAttention.forward()
  • Problem:
    • After calculating energy, the code attempts:
      attention = torch.masked_fill(mask == 0, float("-1e20"))
    • This is syntactically incorrect (masked_fill is a Tensor method, not a torch-level function with this signature) and the softmax over the scores is missing.
    • F.softmax(energy, dim=-1) (or equivalently torch.softmax) should be applied after masking.
  • Fix:
    energy = energy.masked_fill(mask == 0, float("-1e20"))
    attention = torch.softmax(energy, dim=-1)
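The corrected mask-then-softmax step can be demonstrated in isolation. Shapes below are illustrative (batch N, heads, query length, key length):

```python
import torch

N, heads, q_len, k_len = 2, 4, 5, 5
energy = torch.randn(N, heads, q_len, k_len)  # raw attention scores
mask = torch.ones(N, 1, 1, k_len)
mask[:, :, :, -2:] = 0                        # pretend the last two keys are padding

# masked_fill is a Tensor method; the mask broadcasts over heads and query positions
energy = energy.masked_fill(mask == 0, float("-1e20"))
attention = torch.softmax(energy, dim=-1)     # normalize over the key axis
# each row of `attention` now sums to 1, with ~zero weight on masked keys
```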

🔧 3. Typo in Decoder Positional Embedding Line

  • Location: Decoder.forward()
  • Problem:
    • Line: x = self.dropout((self.word_embedding(x) + self_position_embedding(embeddings)))
    • Issue: self_position_embedding and embeddings are undefined.
  • Fix:
    x = self.dropout(self.word_embedding(x) + self.position_embedding(positions))
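For the fix above to work, a `positions` tensor must be defined earlier in `Decoder.forward()`. A self-contained sketch of how it is typically built (module and variable names assumed from the fix line; sizes are illustrative):

```python
import torch
import torch.nn as nn

N, seq_length, embed_size, max_length = 2, 6, 16, 100
word_embedding = nn.Embedding(1000, embed_size)
position_embedding = nn.Embedding(max_length, embed_size)
dropout = nn.Dropout(0.1)

x = torch.randint(0, 1000, (N, seq_length))
# positions: 0..seq_length-1, repeated for every example in the batch
positions = torch.arange(0, seq_length).expand(N, seq_length)

out = dropout(word_embedding(x) + position_embedding(positions))
```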

🔧 4. Misspelled Class Name: Tranformer → Transformer

  • Location: Class Declaration
  • Problem:
    • class Tranformer(nn.Module): is misspelled and will break when imported or referenced.
  • Fix:
    class Transformer(nn.Module):

🔧 5. Incorrect Class Reference in super()

  • Location: Inside Transformer class
  • Problem:
    • Line: super(Trasnformer, self).__init__() (typo in class name again)
  • Fix:
    super(Transformer, self).__init__()

🔧 6. Bug in make_src_mask() Function

  • Location: Transformer class
  • Problem:
    • Line: src_mask = (src != self.src.pad_idx)...; self.src.pad_idx is incorrect and should be self.src_pad_idx.
  • Fix:
    src_mask = (src != self.src_pad_idx).unsqueeze(1).unsqueeze(2)
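The corrected mask can be verified standalone. The two `unsqueeze` calls reshape (N, src_len) to (N, 1, 1, src_len) so the mask broadcasts over heads and query positions (values here are illustrative, with 0 as the padding index):

```python
import torch

src_pad_idx = 0
src = torch.tensor([[5, 7, 0, 0],
                    [3, 0, 0, 0]])

# (N, src_len) -> (N, 1, 1, src_len): True where a real token, False where padding
src_mask = (src != src_pad_idx).unsqueeze(1).unsqueeze(2)
```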

✅ Suggestions

  • Add import torch.nn.functional as F to support softmax if not already present.
  • Consider adding assert statements to check tensor shapes where appropriate.
  • Modularize testing of each block to catch bugs early.
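As one illustration of the shape-assert suggestion, a hypothetical helper (the function name and checks are mine, not from the repo) that could be called at the top of `SelfAttention.forward()`:

```python
import torch

def check_qkv_shapes(queries, keys, values):
    """Hypothetical helper: fail fast on inconsistent Q/K/V shapes."""
    N, q_len, d = queries.shape
    assert keys.shape[0] == N and values.shape[0] == N, "batch size mismatch"
    assert keys.shape[2] == d and values.shape[2] == d, "embed dim mismatch"
    assert keys.shape[1] == values.shape[1], "key/value length mismatch"

# keys/values may have a different length than queries (cross-attention)
check_qkv_shapes(torch.randn(2, 5, 8), torch.randn(2, 7, 8), torch.randn(2, 7, 8))
```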
