jacekkala/food_classification_cnn

🍔 Food & Beverage Classification using CNN

📌 Project Overview

This project classifies food and beverage images using a Convolutional Neural Network (CNN). The dataset consists of 9323 training images and 484 test images across 61 classes (e.g., water, pizza-margherita-baked, broccoli, salad, egg).

🔹 Objective

  • Train a CNN model to classify food items from images.
  • Improve generalization using data augmentation.
  • Monitor training progress with validation curves.
  • Provide model interpretability using Grad-CAM visualizations.

📂 Dataset Details

Directories

data/
├── training_set_128/    # 9323 images (train + validation)
├── test_set_128/        # 484 images (unlabeled test data)
loss_accuracy_curves/    # images for some of the models tested
├── accuracy/
├── loss/
saved_models/            # best model trained
images/                  # for README

Example Classes & Image Distribution

| Class       | Number of Images |
|-------------|------------------|
| Water       | 863              |
| Bread-White | 595              |
| Salad-Leaf  | 535              |
| Pickle      | 28               |

Example Training Images

Below are some sample images from the training set. Some of them are difficult to recognize even for a human (hard-cheese is hard indeed!):

Training Images


🔹 Data Preprocessing & Augmentation

To improve generalization, the following transformations are applied to the images:

  • Rotation (±30°)
  • Zooming (20%)
  • Shifting (20%)
  • Horizontal Flipping
  • Rescaling (0-255 → 0-1)
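
To make these ranges concrete, here is a minimal, framework-independent sketch of how one random set of augmentation parameters could be drawn for a 128×128 image. The helper names `sample_augmentation` and `rescale` and the choice of uniform sampling are assumptions for illustration; the project itself may configure these settings through its deep learning library's augmentation utilities.

```python
import random

IMAGE_SIZE = 128  # images are 128x128, matching the model's input_shape


def sample_augmentation(rng: random.Random) -> dict:
    """Draw one random augmentation setting within the listed ranges.

    Hypothetical helper for illustration; uniform sampling is an assumption.
    """
    max_shift_px = 0.20 * IMAGE_SIZE  # 20% of the image width/height
    return {
        "rotation_deg": rng.uniform(-30.0, 30.0),  # rotation within +/-30 degrees
        "zoom_factor": rng.uniform(0.8, 1.2),      # zoom in or out by up to 20%
        "shift_x_px": rng.uniform(-max_shift_px, max_shift_px),
        "shift_y_px": rng.uniform(-max_shift_px, max_shift_px),
        "horizontal_flip": rng.random() < 0.5,     # flip roughly half the images
    }


def rescale(pixel_value: int) -> float:
    """Map an 8-bit pixel value from [0, 255] to [0, 1]."""
    return pixel_value / 255.0


params = sample_augmentation(random.Random(0))
print(params)
print(rescale(0), rescale(255))
```

Drawing fresh parameters for every image in every epoch means the network rarely sees exactly the same pixels twice, which is what helps generalization.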

🏗️ Model Architecture

The CNN consists of:

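  • 4 convolutional blocks (32, 64, 128, and 256 filters), each a 3×3 convolution with ReLU activation followed by 2×2 max pooling
  • A fully connected layer with 128 units and ReLU activation
  • A 61-unit output layer with softmax activation for multi-class classification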
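
# Imports assume the Keras API bundled with TensorFlow.
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D
from tensorflow.keras.models import Sequential

model = Sequential([
    # Block 1: 128x128 RGB input, 32 filters
    Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    MaxPooling2D((2, 2)),

    # Block 2: 64 filters
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Block 3: 128 filters
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Block 4: 256 filters
    Conv2D(256, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Classifier head: 128 hidden units, then one softmax output per class
    Flatten(),
    Dense(128, activation='relu'),
    Dense(61, activation='softmax')
])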
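
As a sanity check on the architecture, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults for the layers shown (`padding='valid'` and stride 1 for the convolutions, non-overlapping 2×2 pooling); the helper name `trace_shapes` is illustrative and not part of the project.

```python
def trace_shapes(input_size=128, in_channels=3, filters=(32, 64, 128, 256),
                 dense_units=128, num_classes=61):
    """Return (flattened feature count, total trainable parameters)."""
    size, channels, total_params = input_size, in_channels, 0
    for f in filters:
        total_params += 3 * 3 * channels * f + f  # 3x3 kernel weights + biases
        size = size - 2    # a 3x3 'valid' convolution trims one pixel per side
        size = size // 2   # 2x2 max pooling halves the size (rounding down)
        channels = f
        print(f"after block with {f} filters: {size}x{size}x{channels}")
    flat = size * size * channels
    total_params += flat * dense_units + dense_units         # Dense(128)
    total_params += dense_units * num_classes + num_classes  # Dense(61)
    return flat, total_params


flat, total = trace_shapes()
print(flat, total)  # 9216 1576061
```

Under these assumptions the spatial size shrinks 128 → 63 → 30 → 14 → 6, and roughly three quarters of the parameters (1,179,776 of 1,576,061) sit in the first dense layer.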

📊 Model Training & Evaluation

  • Optimizer: Adam (learning_rate=0.001)
  • Loss Function: Categorical Crossentropy
  • Metrics: Accuracy, Precision, Recall
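
To show what the loss and metrics listed here measure, the following framework-independent sketch computes them by hand on a toy 3-class batch. The function names and the toy data are made up for illustration. Precision and recall are macro-averaged, which is an assumption: the averaging used in the project is not stated.

```python
import math


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -sum(t * log(p)) over samples, with one-hot targets."""
    losses = [
        -sum(t * math.log(max(p, eps)) for t, p in zip(true_row, prob_row))
        for true_row, prob_row in zip(y_true, y_prob)
    ]
    return sum(losses) / len(losses)


def accuracy_precision_recall(labels, preds, num_classes):
    """Accuracy plus macro-averaged precision and recall."""
    accuracy = sum(t == p for t, p in zip(labels, preds)) / len(labels)
    precisions, recalls = [], []
    for c in range(num_classes):
        true_pos = sum(t == c and p == c for t, p in zip(labels, preds))
        predicted = sum(p == c for p in preds)
        actual = sum(t == c for t in labels)
        precisions.append(true_pos / predicted if predicted else 0.0)
        recalls.append(true_pos / actual if actual else 0.0)
    return accuracy, sum(precisions) / num_classes, sum(recalls) / num_classes


# Toy batch: 4 samples, 3 classes (illustrative values only).
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_prob = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]
labels = [row.index(1) for row in y_true]
preds = [row.index(max(row)) for row in y_prob]

print(categorical_crossentropy(y_true, y_prob))
print(accuracy_precision_recall(labels, preds, num_classes=3))
```

With class counts as uneven as 863 versus 28 images, accuracy alone can look good while rare classes are mostly missed, which is why precision and recall are worth tracking alongside it.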

🖥️ Training Curves

Loss & Accuracy over epochs:

Training Curves Placeholder

🔍 Confusion Matrix

Visualizing which classes are confused with each other:

Confusion Matrix Placeholder
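
For reference, a confusion matrix simply counts how often each true class is predicted as each class. The sketch below builds one in plain Python and lists the most frequent mistakes; the helper names and the toy labels are illustrative and not taken from the project.

```python
from collections import Counter


def confusion_matrix(labels, preds, num_classes):
    """matrix[i][j] = number of samples of true class i predicted as class j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_class, predicted_class in zip(labels, preds):
        matrix[true_class][predicted_class] += 1
    return matrix


def most_confused(labels, preds, top_k=3):
    """Most frequent (true, predicted) pairs among the misclassified samples."""
    errors = Counter((t, p) for t, p in zip(labels, preds) if t != p)
    return errors.most_common(top_k)


# Toy example with 3 classes (made-up labels).
labels = [0, 0, 1, 1, 2, 2, 2]
preds = [0, 1, 1, 1, 0, 2, 0]
print(confusion_matrix(labels, preds, num_classes=3))
print(most_confused(labels, preds))
```

Correct predictions land on the diagonal, so large off-diagonal cells point directly at the class pairs the model mixes up.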


🎨 Model Interpretability

We use Grad-CAM to highlight the image regions that most influenced a given prediction.

Example Grad-CAM Visualization:

Grad-CAM Placeholder
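
For orientation, the core Grad-CAM computation can be sketched without any framework. Given the activations of the last convolutional layer and the gradients of the class score with respect to them, each channel is weighted by its spatially averaged gradient, the weighted channels are summed, and negative values are clipped with ReLU. The sketch uses tiny hand-written arrays in place of real activations and gradients, which in practice would come from the trained model via automatic differentiation; the function name `grad_cam_heatmap` is hypothetical.

```python
def grad_cam_heatmap(activations, gradients):
    """Combine feature maps into a Grad-CAM heatmap.

    activations, gradients: nested lists shaped [channels][height][width].
    Returns a [height][width] heatmap normalized to [0, 1].
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weights: global average of the gradients over spatial positions.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps, followed by ReLU.
    heatmap = [
        [
            max(0.0, sum(w * act[y][x] for w, act in zip(weights, activations)))
            for x in range(width)
        ]
        for y in range(height)
    ]

    # Normalize to [0, 1] for display (skipped if the map is all zeros).
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


# Toy input: 2 channels of 2x2 feature maps (made-up numbers).
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam_heatmap(activations, gradients))
```

The resulting heatmap has the spatial resolution of the convolutional layer, so it is typically upsampled to the input image size before being overlaid on the image.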


📌 Future Improvements

  • Use Transfer Learning (e.g., MobileNet, ResNet) for better accuracy
  • Optimize hyperparameters using KerasTuner
