This project replicates the core empirical claim of Hinton & Salakhutdinov (2006), Reducing the Dimensionality of Data with Neural Networks, using modern tools and training practices.
The original paper showed that nonlinear autoencoders can achieve lower reconstruction error than PCA when compressing high-dimensional data. In this project, we compare a standard PCA baseline with a multilayer perceptron autoencoder under matched latent dimensionality constraints.
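As a rough illustration of the PCA side of this comparison, the baseline can be sketched with a few lines of numpy: project centered data onto its top-k principal components, decode back, and measure mean squared error. This is a hedged stand-in (synthetic data replaces the real inputs; the project's actual baseline lives in `src/`):

```python
import numpy as np

# Illustrative PCA baseline: reconstruct data from its top-k principal
# components and measure mean squared reconstruction error.
# Synthetic Gaussian data stands in for flattened image vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))          # n_samples x n_features
X = X - X.mean(axis=0)                  # PCA assumes centered data

def pca_reconstruction_mse(X, k):
    """Encode X with the top-k principal directions, decode, and score."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[:k]                          # k x n_features projection matrix
    X_hat = X @ W.T @ W                 # encode, then decode
    return np.mean((X - X_hat) ** 2)

# Reconstruction error shrinks as the latent dimension grows.
errors = {k: pca_reconstruction_mse(X, k) for k in (2, 16, 32, 64)}
```

Matching the autoencoder's latent dimensionality to `k` here is what makes the comparison fair: both methods get the same bottleneck budget.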
📄 Technical Report:
Revisiting Dimensionality Reduction with Autoencoders — A Modern Empirical Replication (PDF)
- Dataset: MNIST (28×28 grayscale images)
- Methods: PCA, MLP Autoencoder
- Latent dimensions: 2, 16, 32, 64
- Metric: mean squared reconstruction error
- Qualitative comparison via reconstruction visualizations
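The autoencoder side of the setup can be sketched in miniature as well. The snippet below is a deliberately tiny numpy stand-in, not the project's training code (which lives in `src/` and uses modern tooling): one tanh hidden layer encodes to a k-dimensional latent, a linear layer decodes, and plain gradient descent minimizes the same MSE metric used in the experiments. All sizes and names here are illustrative:

```python
import numpy as np

# Toy one-hidden-layer MLP autoencoder trained by gradient descent.
# Synthetic data stands in for the real flattened images.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 32))           # n_samples x n_features
d, k = X.shape[1], 8                     # input dim, latent dim

W1 = rng.normal(scale=0.1, size=(d, k))  # encoder weights
W2 = rng.normal(scale=0.1, size=(k, d))  # decoder weights
lr = 0.5

losses = []
for _ in range(200):
    Z = np.tanh(X @ W1)                  # encode to k-dim latent
    X_hat = Z @ W2                       # linear decode
    losses.append(np.mean((X - X_hat) ** 2))
    # Backpropagate the MSE loss through both layers.
    dX_hat = 2 * (X_hat - X) / X.size
    dW2 = Z.T @ dX_hat
    dZ = dX_hat @ W2.T
    dW1 = X.T @ (dZ * (1 - Z ** 2))      # tanh derivative: 1 - tanh^2
    W1 -= lr * dW1
    W2 -= lr * dW2
```

Because encoder and decoder are nonlinear in general, the trained autoencoder can (per the original paper's claim) reach a lower reconstruction MSE than PCA at the same latent dimension.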
- `src/` — core implementations (data loading, models, training, baselines)
- `scripts/` — experiment runners
- `notebooks/` — analysis and figure generation
- `results/` — metrics and visual outputs
- `report/` — LaTeX write-up of results
```shell
# Set up an isolated environment and install dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run the latent-dimension sweep, then generate figures
python -m scripts.sweep_latents
python -m scripts.make_figures
```