- Create a full transformer with dense layers and everything else we've built up to this point
- Add some metrics, such as BLEU
- It should produce decent outputs
- It should have the same structure as GPT-2 so that we can load pretrained weights into it later
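
As a starting point for the BLEU metric mentioned in the list, here is a minimal sketch of sentence-level BLEU in plain Python. It assumes pre-tokenized input (lists of tokens), uniform weights over 1- to 4-grams, and no smoothing, so any n-gram order with zero overlap gives a score of 0. The names `ngrams` and `bleu` are illustrative and not part of any existing code in this project; for reporting real results, an established implementation would likely be a better choice.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing.

    candidate: list of tokens; references: list of token lists.
    """
    if not candidate or not references:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # candidate has fewer than n tokens

        # Clip each n-gram count by its maximum count in any single reference.
        max_ref_counts = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)

        clipped = sum(min(count, max_ref_counts[gram])
                      for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # no overlap at this order; unsmoothed BLEU is 0
        log_precision_sum += math.log(clipped / total) / max_n

    # Brevity penalty: compare against the reference length closest to the
    # candidate length (ties go to the shorter reference).
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_precision_sum)
```

Usage might look like `bleu(generated.split(), [target.split()])`, where `generated` and `target` are hypothetical strings holding the model output and the reference text. Whitespace splitting is only a placeholder; scores are only comparable when both sides use the same tokenization.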