Title: Weight slicing bypasses LoRA layers
Problem:
# Current - bypasses LoRA: slicing the raw weight matrix never invokes
# the module's forward(), so the adapter update is silently dropped
gate_proj = self.gate_proj.weight[:m]
Fix: Use output slicing instead:
# Fixed - LoRA applied (requires torch.nn.functional imported as F;
# dff is the full intermediate width, m the active width)
gate_out = self.gate_proj(x)[:, :, :m]  # module forward runs, so the low-rank delta is included
up_out = self.up_proj(x)[:, :, :m]
hidden = gate_out * up_out  # if the MLP is gated (e.g. SwiGLU), apply the activation to gate_out first
output = self.down_proj(F.pad(hidden, (0, dff - m)))  # zero-pad back to dff; padded positions contribute nothing
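Note: this trades compute for correctness. Each projection runs at the full dff width and columns past m are discarded, and because the padding is zeros, the padded positions contribute nothing through down_proj - the result is equivalent to using only its first m input columns.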
Why: Direct .weight access reads only the frozen base matrix W. LoRA adds its low-rank delta BA inside the module's forward(), so a matmul built from the sliced weight uses W alone and the W + BA computation never happens.
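To make the failure concrete, here is a minimal sketch of a LoRA-wrapped linear layer. LoRALinear, lora_A, lora_B, and scaling are illustrative names for this note, not any particular library's API; PEFT-style wrappers differ in detail but share this structure.

import torch.nn as nn

class LoRALinear(nn.Module):
    # Sketch of a LoRA wrapper: y = x W^T + scaling * (x A^T) B^T
    def __init__(self, base: nn.Linear, r: int = 8, scaling: float = 1.0):
        super().__init__()
        self.base = base  # frozen W lives in base.weight
        self.lora_A = nn.Linear(base.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base.out_features, bias=False)
        self.scaling = scaling

    @property
    def weight(self):
        # Mirrors wrappers that expose the base weight directly:
        # this is W alone, so .weight[:m] silently drops the adapter.
        return self.base.weight

    def forward(self, x):
        # The low-rank delta BA is applied here, at call time; it is
        # never materialized into base.weight (unless explicitly merged).
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

Against this wrapper, self.gate_proj.weight[:m] reproduces only the frozen-W term, while self.gate_proj(x)[:, :, :m], as in the fix above, goes through forward() and picks up the adapter.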