This repository contains an implementation of a simple Multi-Layer Perceptron (MLP) model inspired by Andrej Karpathy's "makemore" series. The model generates text by predicting the next character or word in a sequence from the preceding tokens.
The MLP class is built using PyTorch and consists of:
- Token Embedding Layer (`nn.Embedding`): converts token indices into dense vector representations.
- Multi-Layer Perceptron (`nn.Sequential`):
  - Fully connected layers (`nn.Linear`)
  - `Tanh` activation function
  - Final output layer predicting the next token
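The architecture above can be sketched as a small PyTorch module. This is a minimal illustration, not the repository's exact code; the sizes (`vocab_size`, `block_size`, `n_embd`, `n_hidden`) are assumed values for demonstration.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration; the repo's actual values may differ.
vocab_size, block_size, n_embd, n_hidden = 27, 3, 10, 64

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        # token embedding table; one extra row reserved for a special token
        self.wte = nn.Embedding(vocab_size + 1, n_embd)
        # MLP over the concatenation of block_size context embeddings
        self.mlp = nn.Sequential(
            nn.Linear(block_size * n_embd, n_hidden),
            nn.Tanh(),
            nn.Linear(n_hidden, vocab_size),  # logits for the next token
        )

    def forward(self, idx):
        # idx: (B, block_size) integer token indices
        x = self.wte(idx).view(idx.size(0), -1)  # concatenate embeddings
        return self.mlp(x)                       # (B, vocab_size)

model = MLP()
logits = model(torch.randint(0, vocab_size, (4, block_size)))
print(logits.shape)  # torch.Size([4, 27])
```

Concatenating the context embeddings before the first linear layer is what lets a plain feed-forward network condition on multiple previous tokens at once.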
- The input sequence of tokens is embedded into dense vectors.
- The model shifts the input tokens and replaces the first token with a special `<BLANK>` token.
- The embeddings of the previous tokens are concatenated and passed through the MLP.
- The model outputs logits (predictions) for the next token.
- During training, the model computes cross-entropy loss if target labels are provided.
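The shift-and-concatenate forward pass and the training loss can be sketched end to end. This is a hedged reconstruction of the mechanism described above, following the style of Karpathy's makemore MLP; the token id chosen for `<BLANK>` and all sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
# Hypothetical sizes for illustration; the repo's values may differ.
vocab_size, block_size, n_embd, n_hidden = 27, 3, 10, 64
BLANK = vocab_size  # assumed id of the special <BLANK> token

emb = nn.Embedding(vocab_size + 1, n_embd)
mlp = nn.Sequential(
    nn.Linear(block_size * n_embd, n_hidden),
    nn.Tanh(),
    nn.Linear(n_hidden, vocab_size),
)

idx = torch.randint(0, vocab_size, (2, 5))      # (B, T) input tokens
targets = torch.randint(0, vocab_size, (2, 5))  # (B, T) next-token labels

# Collect embeddings of the block_size most recent tokens at each position
# by repeatedly shifting the sequence right and padding slot 0 with <BLANK>.
embs = []
for _ in range(block_size):
    embs.append(emb(idx))             # (B, T, n_embd)
    idx = torch.roll(idx, 1, dims=1)  # shift tokens one step right
    idx[:, 0] = BLANK                 # first position has no predecessor
x = torch.cat(embs, dim=-1)           # (B, T, block_size * n_embd)

logits = mlp(x)                       # (B, T, vocab_size)
# Cross-entropy over all positions; computed only when targets are given.
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))
print(logits.shape)
```

Flattening `(B, T, vocab_size)` logits to `(B*T, vocab_size)` lets a single `cross_entropy` call score every position in the batch at once.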
This project is open-source and available under the MIT License.