#Image Classification using Convolutional Neural Networks (CNN) on CIFAR-10 Dataset#
Overview This project demonstrates how to build a Convolutional Neural Network (CNN) to classify images from the CIFAR-10 dataset. The CIFAR-10 dataset consists of 60,000 32x32 color images in 10 different categories (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck). The model is trained to classify these images into their respective categories.
Project Features: Built using Python and TensorFlow/Keras. CNN architecture with multiple convolutional layers, max pooling, and fully connected layers. Uses the Adam optimizer and categorical cross-entropy loss. Data normalization and one-hot encoding of labels. Evaluation on a separate test dataset with performance metrics including accuracy and confusion matrix. Visualization of training performance and sample predictions. Dataset The CIFAR-10 dataset is a widely used dataset for image classification. It contains:
50,000 training images. 10,000 test images. Images are categorized into 10 different classes: Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck. All images are 32x32 pixels with three color channels (RGB).
Project Structure bash
Copy code ├── README.md # Project documentation ├── cifar10_image_classification.py # Python script containing the project code ├── requirements.txt # Required libraries └── results # Directory to store result plots (accuracy, confusion matrix, etc.) Model Architecture The Convolutional Neural Network (CNN) used in this project consists of:
Input layer: Accepts 32x32x3 images (CIFAR-10 format). Convolutional Layers: 3 convolutional layers with increasing number of filters (32, 64, 128). ReLU activation for non-linearity. Max Pooling after each convolutional block to downsample the feature maps. Fully Connected Layer: A fully connected layer with 128 neurons followed by a dropout layer to reduce overfitting. Output Layer: A softmax layer with 10 neurons for multi-class classification. Installation Prerequisites: To run this project, you need to have Python 3.x installed along with the following libraries:
tensorflow numpy matplotlib seaborn scikit-learn You can install the dependencies using the following command:
bash Copy code pip install -r requirements.txt Cloning the repository: You can clone this project using:
bash Copy code git clone <repository_link> Running the Project Download and Prepare Dataset: The CIFAR-10 dataset is loaded directly from keras.datasets, so no manual download is necessary. Run the Script: To train the model, evaluate it on the test set, and visualize results, run the Python script: bash Copy code python cifar10_image_classification.py Evaluation and Results The model is trained for 25 epochs with a batch size of 64. After training, the model is evaluated on the test set, which contains 10,000 images.
Key Metrics: Accuracy: The model achieves a certain level of accuracy (to be measured after training). Confusion Matrix: Displays the number of correct and incorrect classifications for each class. Loss and Accuracy Plots: The training and validation loss/accuracy are visualized over the epochs. Sample Visualizations: Confusion Matrix: Shows how well the model performs across the 10 categories.
Training and Validation Accuracy/Loss: Plots showing the model’s performance over the training epochs.
Sample Predictions: The script shows a few test images with their true and predicted labels.
Future Improvements Data Augmentation: To improve the generalization of the model, we can apply techniques such as rotation, zoom, flipping, etc. Hyperparameter Tuning: Experimenting with different learning rates, batch sizes, number of filters, and layers can improve the model. Transfer Learning: Using a pre-trained model (such as VGG or ResNet) could improve accuracy. Conclusion This project successfully demonstrates image classification using CNNs on the CIFAR-10 dataset. While the model achieves good performance, further improvements can be made through hyperparameter tuning and data augmentation.
License This project is open-source and available under the MIT License.