This repository contains a Neural Machine Translation (NMT) system implemented in PyTorch. It translates English to German using an LSTM-based Encoder-Decoder architecture.
The project explores two main architectures:
- Basic Encoder-Decoder LSTM: A standard Seq2Seq model with a bidirectional LSTM encoder and a unidirectional LSTM decoder.
- Encoder-Decoder with Attention: Enhances the basic model with an attention mechanism to improve translation quality for longer sequences (a minimal sketch of both architectures follows below).
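As a rough illustration, here is a minimal PyTorch sketch of the two components. The layer sizes, the bridging of the bidirectional states, and the dot-product attention variant are all assumptions for illustration; the notebooks contain the actual implementations.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Bidirectional LSTM encoder (sizes are illustrative assumptions)."""
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        # Project the concatenated forward/backward final states down to the
        # unidirectional decoder's hidden size.
        self.bridge_h = nn.Linear(2 * hid_dim, hid_dim)
        self.bridge_c = nn.Linear(2 * hid_dim, hid_dim)

    def forward(self, src):                               # src: (batch, src_len)
        outputs, (h, c) = self.lstm(self.embedding(src))  # outputs: (batch, src_len, 2*hid_dim)
        h = torch.tanh(self.bridge_h(torch.cat([h[0], h[1]], dim=-1))).unsqueeze(0)
        c = torch.tanh(self.bridge_c(torch.cat([c[0], c[1]], dim=-1))).unsqueeze(0)
        return outputs, (h, c)

class AttentionDecoder(nn.Module):
    """Unidirectional LSTM decoder with (assumed) dot-product attention."""
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim + 2 * hid_dim, hid_dim, batch_first=True)
        self.attn = nn.Linear(hid_dim, 2 * hid_dim)  # map decoder state to encoder space
        self.out = nn.Linear(hid_dim + 2 * hid_dim, vocab_size)

    def forward(self, token, hidden, enc_outputs):
        # token: (batch, 1); enc_outputs: (batch, src_len, 2*hid_dim)
        query = self.attn(hidden[0][-1]).unsqueeze(1)           # (batch, 1, 2*hid_dim)
        scores = torch.bmm(query, enc_outputs.transpose(1, 2))  # (batch, 1, src_len)
        weights = torch.softmax(scores, dim=-1)                 # attention distribution
        context = torch.bmm(weights, enc_outputs)               # (batch, 1, 2*hid_dim)
        emb = self.embedding(token)
        output, hidden = self.lstm(torch.cat([emb, context], dim=-1), hidden)
        logits = self.out(torch.cat([output, context], dim=-1)) # (batch, 1, vocab)
        return logits.squeeze(1), hidden, weights
```

Dropping the attention pieces (the `attn` projection, the `context` vector, and the concatenations) recovers the basic encoder-decoder variant.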
The project uses the Multi30k dataset provided by torchtext.
- Source Language: English
- Target Language: German
- Vocabulary: Built using spaCy tokenizers for both languages (see the sketch below).
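A minimal sketch of how the vocabularies could be built, assuming a recent torchtext API (`Multi30k(split=..., language_pair=...)` and `build_vocab_from_iterator`); the special-token names `<unk>`, `<pad>`, `<sos>`, `<eos>` are assumptions here:

```python
from torchtext.data.utils import get_tokenizer
from torchtext.datasets import Multi30k
from torchtext.vocab import build_vocab_from_iterator

# spaCy tokenizers for the source (English) and target (German) languages.
en_tokenizer = get_tokenizer("spacy", language="en_core_web_sm")
de_tokenizer = get_tokenizer("spacy", language="de_core_news_sm")

def yield_tokens(tokenizer, index):
    # Multi30k yields (en, de) pairs in the order given by language_pair.
    for pair in Multi30k(split="train", language_pair=("en", "de")):
        yield tokenizer(pair[index])

specials = ["<unk>", "<pad>", "<sos>", "<eos>"]  # assumed special tokens
en_vocab = build_vocab_from_iterator(yield_tokens(en_tokenizer, 0), specials=specials)
de_vocab = build_vocab_from_iterator(yield_tokens(de_tokenizer, 1), specials=specials)
en_vocab.set_default_index(en_vocab["<unk>"])    # map out-of-vocabulary words to <unk>
de_vocab.set_default_index(de_vocab["<unk>"])
```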
The models are evaluated using the BLEU score metric.
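For reference, a minimal sketch of computing a corpus-level BLEU score with NLTK (the tokenized sentences below are made-up placeholders):

```python
from nltk.translate.bleu_score import corpus_bleu

# One list of reference translations per hypothesis; all pre-tokenized.
references = [[["ein", "mann", "geht", "die", "straße", "entlang"]]]
hypotheses = [["ein", "mann", "geht", "die", "straße", "hinunter"]]

# Default weights give standard 4-gram BLEU; multiply by 100 for the
# conventional 0-100 scale used in the table below.
print(f"BLEU: {corpus_bleu(references, hypotheses) * 100:.2f}")
```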
| Model Configuration | BLEU Score | Description |
|---|---|---|
| Basic LSTM | ~25.66–26.78 | Bidirectional Encoder, Greedy/Beam Search Decoding |
| LSTM + Attention | ~30.07 | Adds Attention mechanism, significantly improving performance |
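A hedged sketch of greedy decoding, reusing the `Encoder`/`AttentionDecoder` classes from the architecture sketch above; beam search instead keeps the k highest-scoring partial hypotheses at each step. The `sos_idx`/`eos_idx` parameters are the (assumed) start- and end-of-sequence token ids:

```python
import torch

@torch.no_grad()
def greedy_decode(encoder, decoder, src, sos_idx, eos_idx, max_len=50):
    enc_outputs, hidden = encoder(src)                # encode the source once
    token = torch.full((src.size(0), 1), sos_idx, dtype=torch.long)
    generated = []
    for _ in range(max_len):
        logits, hidden, _ = decoder(token, hidden, enc_outputs)
        token = logits.argmax(dim=-1, keepdim=True)   # greedy: take the best token
        generated.append(token)
        if (token == eos_idx).all():                  # stop once every sentence ended
            break
    return torch.cat(generated, dim=1)                # (batch, <= max_len) token ids
```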
- Python 3.10+
- PyTorch
- TorchText
- spaCy (with `en_core_web_sm` and `de_core_news_sm` models)
- NLTK (for BLEU score calculation)
- NumPy, Matplotlib, Seaborn, tqdm
- Install dependencies:

  ```bash
  pip install torch torchtext spacy nltk numpy matplotlib seaborn tqdm
  python -m spacy download en_core_web_sm
  python -m spacy download de_core_news_sm
  ```
- Run the notebooks:
  - `nmt_encoder_decoder_lstm_basic_25.66.ipynb`: Training and evaluation of the basic LSTM model with greedy decoding.
  - `nmt_encoder_decoder_lstm_atten.ipynb`: Training and evaluation of the basic and attention-based models with beam search.
- `Output model/`: Directory containing the saved model weights (`.pth` files); a loading sketch follows below.
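To reuse a saved checkpoint, instantiate the same architecture and load the weights. A minimal sketch; the filename and vocabulary size below are placeholders, so substitute the actual `.pth` file from `Output model/` and the sizes used at training time:

```python
import torch

# Must match the trained architecture (here the Encoder sketch from above).
model = Encoder(vocab_size=10_000)
state_dict = torch.load("Output model/model.pth", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # switch to inference mode
```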
Members:
- Huỳnh Thanh Tuấn
- Nguyễn Trọng Nghĩa
