This Colab contains code taken from the references below, to which I have added my own comments and tweaks on my journey to understand how the transformer works and the intuition behind it.
Building Large Language Model from Scratch
Happy learning!
code_llm_from_scratch.ipynb