
❗ Bugs in Transformer Model Implementation (Attention Layer, Typo, and Masking Issues) #1

@ShigrafS

Description


Issue Description:

Hi there 👋,

While reviewing and testing the provided Transformer architecture in PyTorch, I came across several bugs and typos that need fixing for the model to run correctly and behave as expected. Below is a breakdown of each issue for clarity and maintainability:


🔧 1. Incorrect Variable Name: quereis = self.quereis(queries)

  • Location: SelfAttention.forward()
  • Problem:
    • Typo in variable and method call: quereis = self.quereis(queries)
    • Should likely be: queries = self.queries(query)
  • Fix:
    queries = self.queries(query)
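For context, here is a minimal sketch of the projection step with the typo fixed. Layer names (`values`, `keys`, `queries`) are assumed from the issue text, and the multi-head split is omitted for brevity:

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Sketch of the corrected projection step (head split omitted)."""
    def __init__(self, embed_size, heads):
        super().__init__()
        self.heads = heads
        self.head_dim = embed_size // heads
        self.values = nn.Linear(embed_size, embed_size, bias=False)
        self.keys = nn.Linear(embed_size, embed_size, bias=False)
        self.queries = nn.Linear(embed_size, embed_size, bias=False)

    def forward(self, values, keys, query, mask=None):
        values = self.values(values)
        keys = self.keys(keys)
        queries = self.queries(query)  # fixed: was `quereis = self.quereis(queries)`
        return values, keys, queries
```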

🔧 2. Softmax Missing on Attention Scores

  • Location: SelfAttention.forward()
  • Problem:
    • After calculating energy, the code attempts:
      attention = torch.masked_fill(mask == 0, float("-1e20"))
    • This is syntactically incorrect (masked_fill is a Tensor method, not a torch-level function with this signature) and the softmax over the scores is missing.
    • F.softmax(energy, dim=-1) (or equivalently torch.softmax) should be applied after masking.
  • Fix:
    energy = energy.masked_fill(mask == 0, float("-1e20"))
    attention = torch.softmax(energy, dim=-1)
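The corrected mask-then-softmax step can be demonstrated in isolation. Shapes below are illustrative (batch N, heads, query length, key length):

```python
import torch

N, heads, q_len, k_len = 2, 4, 5, 5
energy = torch.randn(N, heads, q_len, k_len)  # raw attention scores
mask = torch.ones(N, 1, 1, k_len)
mask[:, :, :, -2:] = 0                        # pretend the last two keys are padding

# masked_fill is a Tensor method; the mask broadcasts over heads and query positions
energy = energy.masked_fill(mask == 0, float("-1e20"))
attention = torch.softmax(energy, dim=-1)     # normalize over the key axis
# each row of `attention` now sums to 1, with ~zero weight on masked keys
```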

🔧 3. Typo in Decoder Positional Embedding Line

  • Location: Decoder.forward()
  • Problem:
    • Line: x = self.dropout((self.word_embedding(x) + self_position_embedding(embeddings)))
    • Issue: self_position_embedding and embeddings are undefined.
  • Fix:
    x = self.dropout(self.word_embedding(x) + self.position_embedding(positions))
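For the fix above to work, a `positions` tensor must be defined earlier in `Decoder.forward()`. A self-contained sketch of how it is typically built (module and variable names assumed from the fix line; sizes are illustrative):

```python
import torch
import torch.nn as nn

N, seq_length, embed_size, max_length = 2, 6, 16, 100
word_embedding = nn.Embedding(1000, embed_size)
position_embedding = nn.Embedding(max_length, embed_size)
dropout = nn.Dropout(0.1)

x = torch.randint(0, 1000, (N, seq_length))
# positions: 0..seq_length-1, repeated for every example in the batch
positions = torch.arange(0, seq_length).expand(N, seq_length)

out = dropout(word_embedding(x) + position_embedding(positions))
```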

🔧 4. Misspelled Class Name: Tranformer → Transformer

  • Location: Class Declaration
  • Problem:
    • class Tranformer(nn.Module): is misspelled and will break when imported or referenced.
  • Fix:
    class Transformer(nn.Module):

🔧 5. Incorrect Class Reference in super()

  • Location: Inside Transformer class
  • Problem:
    • Line: super(Trasnformer, self).__init__() (typo in class name again)
  • Fix:
    super(Transformer, self).__init__()

🔧 6. Bug in make_src_mask() Function

  • Location: Transformer class
  • Problem:
    • Line: src_mask = (src != self.src.pad_idx)...; self.src.pad_idx is incorrect and should be self.src_pad_idx.
  • Fix:
    src_mask = (src != self.src_pad_idx).unsqueeze(1).unsqueeze(2)
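The corrected mask can be verified standalone. The two `unsqueeze` calls reshape (N, src_len) to (N, 1, 1, src_len) so the mask broadcasts over heads and query positions (values here are illustrative, with 0 as the padding index):

```python
import torch

src_pad_idx = 0
src = torch.tensor([[5, 7, 0, 0],
                    [3, 0, 0, 0]])

# (N, src_len) -> (N, 1, 1, src_len): True where a real token, False where padding
src_mask = (src != src_pad_idx).unsqueeze(1).unsqueeze(2)
```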

✅ Suggestions

  • Add import torch.nn.functional as F to support softmax if not already present.
  • Consider adding assert statements to check tensor shapes where appropriate.
  • Modularize testing of each block to catch bugs early.
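As one illustration of the shape-assert suggestion, a hypothetical helper (the function name and checks are mine, not from the repo) that could be called at the top of `SelfAttention.forward()`:

```python
import torch

def check_qkv_shapes(queries, keys, values):
    """Hypothetical helper: fail fast on inconsistent Q/K/V shapes."""
    N, q_len, d = queries.shape
    assert keys.shape[0] == N and values.shape[0] == N, "batch size mismatch"
    assert keys.shape[2] == d and values.shape[2] == d, "embed dim mismatch"
    assert keys.shape[1] == values.shape[1], "key/value length mismatch"

# keys/values may have a different length than queries (cross-attention)
check_qkv_shapes(torch.randn(2, 5, 8), torch.randn(2, 7, 8), torch.randn(2, 7, 8))
```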
