🤖 Neural Machine Translation with Sequence-to-Sequence Learning

🎯 Project Overview

A PyTorch implementation of Neural Machine Translation (NMT) using a sequence-to-sequence architecture built from LSTM networks. The project follows the approach described in the paper "Sequence to Sequence Learning with Neural Networks" (Sutskever et al., 2014), with settings tuned for modern GPU acceleration.

✨ Key Features

🏗️ Architecture: Multi-layer LSTM-based Encoder-Decoder model
📈 Performance: Achieved a BLEU score of 23.0 on the English-French translation task
🚀 GPU Optimization: CUDA-accelerated with support for RTX series GPUs
🔄 Data Processing: Efficient handling of WMT'14 dataset with dynamic batching
📚 Vocabulary Management: Frequency-based vocabulary construction

🛠️ Technical Implementation

🧬 Model Architecture

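# Core model parameters
BATCH_SIZE = 16    # sentence pairs per batch
ENC_EMB_DIM = 768  # encoder embedding size
DEC_EMB_DIM = 768  # decoder embedding size
HID_DIM = 768      # LSTM hidden state size
N_LAYERS = 4       # stacked LSTM layers
N_EPOCHS = 15      # training epochs
CLIP = 5           # gradient clipping threshold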
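
To make the role of these parameters concrete, here is a minimal sketch of a multi-layer LSTM encoder and decoder. It is not the project's actual code: the class names, the `dropout` argument, the vocabulary sizes, and the toy dimensions are all assumptions chosen so the example runs quickly.

```python
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Hypothetical encoder: embeds the source and returns the final LSTM state."""

    def __init__(self, vocab_size, emb_dim, hid_dim, n_layers, dropout):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, n_layers, dropout=dropout)
        self.dropout = nn.Dropout(dropout)

    def forward(self, src):                  # src: (src_len, batch)
        embedded = self.dropout(self.embedding(src))
        _, (hidden, cell) = self.rnn(embedded)
        return hidden, cell                  # each: (n_layers, batch, hid_dim)


class Decoder(nn.Module):
    """Hypothetical decoder: one step maps a token and state to vocabulary logits."""

    def __init__(self, vocab_size, emb_dim, hid_dim, n_layers, dropout):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, n_layers, dropout=dropout)
        self.fc_out = nn.Linear(hid_dim, vocab_size)  # output projection layer
        self.dropout = nn.Dropout(dropout)

    def forward(self, token, hidden, cell):  # token: (batch,)
        embedded = self.dropout(self.embedding(token.unsqueeze(0)))
        output, (hidden, cell) = self.rnn(embedded, (hidden, cell))
        return self.fc_out(output.squeeze(0)), hidden, cell


# Toy sizes for illustration only; the parameters above use 768 dims and 4 layers.
enc = Encoder(vocab_size=50, emb_dim=16, hid_dim=32, n_layers=2, dropout=0.1)
dec = Decoder(vocab_size=60, emb_dim=16, hid_dim=32, n_layers=2, dropout=0.1)
src = torch.randint(0, 50, (7, 3))           # 7 source tokens, batch of 3
hidden, cell = enc(src)
first_input = torch.zeros(3, dtype=torch.long)  # assumed start-of-sentence id 0
logits, hidden, cell = dec(first_input, hidden, cell)
```

The encoder's final hidden and cell states initialize the decoder, which is why both sides share the same hidden size and layer count in this sketch.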

🔋 Key Components

🔍 Encoder: Multi-layer LSTM with dropout for regularization
🎯 Decoder: Multi-layer LSTM with output projection layer
📊 Embedding: 768-dimensional word embeddings
⚡ Optimization: Adam optimizer with gradient clipping
📉 Loss Function: Cross-entropy with padding mask
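
The last two components can be sketched in a few lines. This is a hedged illustration, not the project's training loop: the `nn.Linear` stand-in model, `PAD_IDX = 0`, the toy tensor sizes, and the choice of norm-based clipping are assumptions. The padding mask is expressed through the `ignore_index` argument of `nn.CrossEntropyLoss`.

```python
import torch
import torch.nn as nn

PAD_IDX = 0      # assumed index of the padding token
VOCAB_SIZE = 10  # toy value for illustration
CLIP = 5         # same threshold as CLIP in the model parameters

torch.manual_seed(0)
# Stand-in for the decoder's output projection; not the project's model.
model = nn.Linear(8, VOCAB_SIZE)
optimizer = torch.optim.Adam(model.parameters())
# ignore_index drops padded target positions from the averaged loss.
criterion = nn.CrossEntropyLoss(ignore_index=PAD_IDX)

hidden = torch.randn(6, 8)  # (batch * seq_len, features)
targets = torch.tensor([3, 7, 1, PAD_IDX, PAD_IDX, 4])

optimizer.zero_grad()
loss = criterion(model(hidden), targets)
loss.backward()
# Rescale gradients in place so their global L2 norm is at most CLIP.
# The return value is the norm measured before clipping.
grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP)
optimizer.step()
```

Clipping is applied after `backward()` and before `step()`, so the optimizer only ever sees the rescaled gradients.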

💻 GPU Optimizations

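import torch

torch.backends.cudnn.benchmark = True         # let cuDNN auto-tune kernels for the input shapes it sees
torch.backends.cuda.matmul.allow_tf32 = True  # allow TF32 matmuls (Ampere and newer GPUs)
torch.backends.cudnn.allow_tf32 = True        # allow TF32 inside cuDNN operations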

📊 Results

📈 BLEU Score: 23.0 on WMT'14 test set
⚡ Training Hardware: Tuned for an RTX 3060 GPU
💾 Memory Efficiency: Batch size tuned to fit within 6GB of VRAM

🚀 Installation and Setup

Clone the repository:

git clone https://github.com/yourusername/nmt-project.git
cd nmt-project

Install requirements:

pip install -r requirements.txt
python -m spacy download en_core_web_sm
python -m spacy download fr_core_news_sm

Prepare the environment:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512

🎮 Usage

🏃‍♂️ Training

python train.py --max_samples 1000 --force_rebuild_vocab

🔄 Translation

python utils.py --model_path checkpoints/seq2seq-model-final.pt --mode translate --text "Hello, how are you?"

📊 Evaluation

python utils.py --model_path checkpoints/seq2seq-model-final.pt --mode evaluate --num_samples 100

🔬 Technical Details

🔄 Data Processing Pipeline

📝 Tokenization: spaCy-based tokenization for both languages
📚 Vocabulary Building: Frequency-based, with special tokens added
🔄 Batch Processing: Dynamic batching with padding
🔀 Source Sentence Reversal: Source sentences are reversed before encoding, as in the original paper
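
The pipeline steps above can be sketched in plain Python. This is an illustrative sketch under assumptions, not the project's code: the function names, the special token set and its ordering, the `min_freq` threshold, and wrapping reversed source sentences in `<sos>`/`<eos>` are all hypothetical. Tokenization is assumed to have already happened.

```python
from collections import Counter

# Special tokens and their order are assumptions for this sketch.
SPECIALS = ["<pad>", "<unk>", "<sos>", "<eos>"]


def build_vocab(tokenized_sentences, min_freq=2):
    """Map each token seen at least min_freq times to an integer id."""
    counts = Counter(tok for sent in tokenized_sentences for tok in sent)
    itos = list(SPECIALS)
    # Sort by descending frequency, then alphabetically, for a stable order.
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq:
            itos.append(tok)
    return {tok: idx for idx, tok in enumerate(itos)}


def encode(tokens, vocab, reverse=False):
    """Convert tokens to ids, optionally reversing the sentence first."""
    if reverse:
        tokens = tokens[::-1]
    unk = vocab["<unk>"]
    ids = [vocab.get(tok, unk) for tok in tokens]
    return [vocab["<sos>"]] + ids + [vocab["<eos>"]]


def pad_batch(sequences, pad_id):
    """Pad every sequence to the length of the longest one in the batch."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]


corpus = [["the", "cat", "sat"], ["the", "dog", "sat", "down"]]
vocab = build_vocab(corpus, min_freq=1)
batch = pad_batch([encode(s, vocab, reverse=True) for s in corpus], vocab["<pad>"])
```

Padding to the longest sentence in each batch, rather than to a global maximum, is what keeps the batches compact when sentence lengths vary.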

🎯 Model Features

📊 Gradient Clipping: Prevents exploding gradients
🎓 Teacher Forcing: Implemented during training
🎲 Dropout Regularization: Prevents overfitting
💾 Checkpoint Management: Saves best and final models
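
Teacher forcing is easiest to see in isolation. The sketch below is a simplified illustration, not the project's decoder loop: `decode_with_teacher_forcing`, `step_fn`, and the `ratio` parameter are hypothetical names, and a real decoder step would also carry the LSTM hidden and cell states.

```python
import random


def decode_with_teacher_forcing(step_fn, target_ids, sos_id, ratio=0.5, rng=random):
    """Decode step by step, feeding back gold tokens with probability `ratio`.

    step_fn(prev_token) returns the predicted next token; it stands in for
    one decoder step.
    """
    predictions = []
    prev = sos_id
    for gold in target_ids:
        pred = step_fn(prev)
        predictions.append(pred)
        # Teacher forcing: use the reference token as the next input;
        # otherwise feed the model's own prediction back in.
        prev = gold if rng.random() < ratio else pred
    return predictions


def toy_step(token):
    """Toy stand-in for a decoder step: predicts the previous token plus one."""
    return token + 1


forced = decode_with_teacher_forcing(toy_step, [5, 9, 2], sos_id=0, ratio=1.0)
free_running = decode_with_teacher_forcing(toy_step, [5, 9, 2], sos_id=0, ratio=0.0)
```

With `ratio=1.0` every step conditions on the reference translation, while `ratio=0.0` matches inference, where the model only sees its own outputs.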

🔮 Future Improvements

⚡ Implementation of attention mechanism
🔍 Beam search for better inference
📈 Learning rate scheduling
⚖️ Layer normalization
🔄 Bidirectional encoder

📋 Requirements

🐍 Python 3.8+
🔥 PyTorch 2.0+
💻 CUDA-compatible GPU
💾 6GB+ VRAM
🔤 spaCy
📊 sacrebleu
📈 tqdm

📚 Citation

@article{sutskever2014sequence,
  title={Sequence to Sequence Learning with Neural Networks},
  author={Sutskever, Ilya and Vinyals, Oriol and Le, Quoc V},
  journal={arXiv preprint arXiv:1409.3215},
  year={2014}
}

📄 License

MIT License

🙏 Acknowledgments

📊 WMT'14 dataset providers
🔥 PyTorch team for the deep learning framework
📚 Original Seq2Seq paper authors

🤝 Contributing

Feel free to:

🐛 Report bugs
💡 Suggest features
🔀 Submit PRs
