A weekend project exploring adversarial attacks on convolutional neural networks using PyTorch. It demonstrates how to generate images that fool a pretrained ResNet-18 model into making incorrect predictions. There is surely literature on how to do this better, but I deliberately haven't looked into existing work — the idea came to me while swimming a couple of days ago and I wanted to try it from scratch.
The notebook (`inverse_resnet.ipynb`) contains two main experiments:

- **Inverse Image Generation**: starting from random noise, optimize pixel values to generate images that the model classifies as a target class (goldfish).
- **Adversarial Image Manipulation**: take real ImageNet images and subtly modify them so the model misclassifies them as a different target class, while keeping the images visually similar to the originals.
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate    # Linux/Mac
# or: venv\Scripts\activate # Windows

# Install dependencies
pip install -r requirements.txt
```

- Download ImageNet sample images to `./imagenet-sample-images/`
- Ensure `imagenet_class_idx.json` contains the ImageNet class labels
- Run the Jupyter notebook:

```bash
jupyter notebook inverse_resnet.ipynb
```

Requirements:

- Python 3.10+
- PyTorch with CUDA support (optional, for GPU acceleration)
- See `requirements.txt` for full dependencies

