prabhaankala18/Yoga-Pose-Detection-System

🧘 Yoga Pose Classification with Real-Time Feedback (GCN + Pose Keypoints)

📌 Overview

This project classifies 5 yoga poses using MediaPipe Pose keypoints and a Hybrid Graph Neural Network (GCN + GAT).
It supports:

  • Training the model in Kaggle.
  • Real‑time webcam pose classification in VS Code.
  • Live pose correction feedback for improving alignment.

Classes:
Downdog, Goddess, Plank, Tree, Warrior2


🚀 Key Features

  • Skeleton-based representation — robust to lighting/background changes.
  • Hybrid GNN model:
    GCN → GCN → GCN → GAT → GAT → (Mean + Max + Sum Pooling) → MLP
    
  • Training enhancements: AdamW optimizer, cosine annealing learning rate schedule, label smoothing, early stopping, gradient clipping.
  • Data augmentation: horizontal flip, brightness variation.
  • Real-time webcam inference with pose names.
  • Live feedback system — joint-angle & alignment checks to give instant corrective tips.
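
The combined readout at the end of the pipeline above can be illustrated without any deep-learning dependency. The following is a minimal sketch, assuming node embeddings arrive as a plain list of per-joint feature vectors; the function name `readout` and the toy values are hypothetical and not taken from the training notebook. In PyTorch Geometric the same idea is usually expressed by concatenating `global_mean_pool`, `global_max_pool`, and `global_add_pool`.

```python
from typing import List


def readout(node_features: List[List[float]]) -> List[float]:
    """Concatenate mean, max, and sum pooling over all nodes of one graph."""
    if not node_features:
        raise ValueError("graph has no nodes")
    num_nodes = len(node_features)
    # Transpose so that each tuple holds one feature channel across all nodes.
    columns = list(zip(*node_features))
    summed = [sum(col) for col in columns]
    mean = [s / num_nodes for s in summed]
    maximum = [max(col) for col in columns]
    # The graph-level vector is three times as wide as a node embedding.
    return mean + maximum + summed


# Three nodes with two feature channels each (toy values, not real embeddings).
graph = [[1.0, 4.0], [2.0, 0.0], [3.0, 2.0]]
pooled = readout(graph)
print(pooled)  # [2.0, 2.0, 3.0, 4.0, 6.0, 6.0]
```

The resulting vector is what the final MLP would consume; tripling the width lets the classifier see the average pose, the most activated joint, and the overall magnitude at once.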

📂 Repository Structure

yoga-pose-classification/
├── All_Models (1).ipynb           # Baseline ML models (SVM, KNN, etc.)
├── gcn-mlp-att-3 (2) (1).ipynb    # Main GCN+MLP+Attention training notebook
├── best_gcn_mlp_att_model (1).pth # Pre-trained model weights
├── realtime_gcn_mlp_3layers.py    # Real-time classification + feedback
└── README.md                      # This file

⚡ Quick Start

1️⃣ Train on Kaggle (Optional)

If you want to retrain instead of using the provided .pth weights:

  1. Open Kaggle → New Notebook.
  2. Upload:
    • gcn-mlp-att-3 (2) (1).ipynb
    • Or All_Models (1).ipynb for classic ML baselines.
  3. Attach dataset:
    • Kaggle → “Add Dataset” → Search yoga-poses-dataset (by niharika41298)
  4. Run all cells. When training completes, best_gcn_mlp_att_model.pth is written to /working/.

2️⃣ Download Model

  • From Kaggle output, download best_gcn_mlp_att_model.pth
  • Place it in the same folder as realtime_gcn_mlp_3layers.py on your PC.

3️⃣ Run Real-Time on VS Code

  1. Install Python 3.9+ and VS Code.
  2. In a terminal, install the dependencies:
    pip install torch torchvision torchaudio
    pip install torch-geometric
    pip install mediapipe opencv-python numpy
  3. Run:
    python realtime_gcn_mlp_3layers.py --model_path best_gcn_mlp_att_model.pth
  4. The webcam window opens and shows the predicted pose together with live corrective feedback.

🛠 Feedback System

In real time, joint angles & alignments are checked per pose:

Pose        Example Tips
Warrior2    Bend front knee to ~90°, Arms at shoulder height, Keep torso upright
Tree        Level hips, Straighten standing leg, Adjust foot placement
Plank       Keep straight line shoulders→heels, Avoid sagging/raising hips
Downdog     Push hips back, Press heels toward floor
Goddess     Align knees over toes, Lower hips for depth

Rules are hardcoded inside realtime_gcn_mlp_3layers.py and can be tuned.
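
A joint-angle rule of this kind can be sketched as follows. This is an illustrative example only: the function names `joint_angle` and `warrior2_knee_tip`, the 90° target, and the 15° tolerance are assumptions, not the values used in realtime_gcn_mlp_3layers.py. Keypoints are taken as 2D (x, y) pairs in pixel coordinates; with MediaPipe's normalized coordinates, the values would first need scaling by frame width and height so that angles are not distorted.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at vertex b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("coincident keypoints")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def warrior2_knee_tip(hip: Point, knee: Point, ankle: Point,
                      target: float = 90.0,
                      tolerance: float = 15.0) -> Optional[str]:
    """Return a corrective tip if the front knee is too straight (hypothetical rule)."""
    if joint_angle(hip, knee, ankle) > target + tolerance:
        return "Bend front knee to ~90°"
    return None


# A fully straight leg (180° at the knee) triggers the tip.
print(warrior2_knee_tip((0.0, 0.0), (0.0, 1.0), (0.0, 2.0)))
```

Tuning a rule then amounts to changing `target` and `tolerance` per pose and joint.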


📊 Performance Summary

Overall

  • Validation Accuracy: 96.99%
  • Test Accuracy: 99.14%
  • Macro F1 (Test): 0.9913
  • Test Loss: ~0.051

Class-wise Report

              precision    recall  f1-score   support
downdog            1.00      1.00      1.00        93
goddess            0.99      0.95      0.97        80
plank              0.99      1.00      1.00       115
tree               0.97      1.00      0.99        68
warrior2           0.98      0.98      0.98       107
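
For readers unfamiliar with the metrics, the sketch below shows how a per-class F1 score follows from precision and recall, and how macro F1 is the unweighted mean over classes. It uses only the rounded values printed in the table above, so the recomputed macro average (about 0.988) only approximates the reported 0.9913; the helper name `f1` is illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Per-class F1 scores as printed in the class-wise report (rounded to 2 decimals).
f1_by_class = {
    "downdog": 1.00,
    "goddess": 0.97,
    "plank": 1.00,
    "tree": 0.99,
    "warrior2": 0.98,
}

# Macro F1 weights every class equally, regardless of its support.
macro_f1 = sum(f1_by_class.values()) / len(f1_by_class)

print(round(f1(0.99, 0.95), 2))  # goddess row: 0.97
print(round(macro_f1, 3))        # 0.988
```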

🧪 Detailed Analysis

Confusion Matrix Insights

  • Best-recognized poses: Downdog (1.00 precision and recall) and Plank (0.99 precision, 1.00 recall), which have very distinct geometric signatures.
  • Minor confusions: Goddess ↔ Warrior2 due to similar stance & leg positioning.
  • Tree misclassifications occur only when the lifted leg's position is unclear or out of frame.

Why Accuracy is High

  • Landmark-based features independent of background/colors.
  • GCN captures local anatomical relationships.
  • GAT layers highlight key discriminative joints.
  • Combined pooling extracts richer graph features.
  • Strong data augmentation improves generalization.
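
To make the "local anatomical relationships" point concrete, here is a minimal sketch of how a skeleton can be turned into the undirected edge list that graph layers operate on. The landmark indices are MediaPipe Pose's standard ones; the choice of these 12 joints and 12 bones, and the names `BONES` and `build_edge_index`, are assumptions for illustration, since the notebook may use all 33 landmarks and a different connection set.

```python
from typing import List, Sequence, Tuple

# MediaPipe Pose landmark indices for the main body joints.
LANDMARKS = {
    "left_shoulder": 11, "right_shoulder": 12,
    "left_elbow": 13, "right_elbow": 14,
    "left_wrist": 15, "right_wrist": 16,
    "left_hip": 23, "right_hip": 24,
    "left_knee": 25, "right_knee": 26,
    "left_ankle": 27, "right_ankle": 28,
}

# Hypothetical bone list: arms, torso, and legs.
BONES: List[Tuple[str, str]] = [
    ("left_shoulder", "right_shoulder"),
    ("left_shoulder", "left_elbow"), ("left_elbow", "left_wrist"),
    ("right_shoulder", "right_elbow"), ("right_elbow", "right_wrist"),
    ("left_shoulder", "left_hip"), ("right_shoulder", "right_hip"),
    ("left_hip", "right_hip"),
    ("left_hip", "left_knee"), ("left_knee", "left_ankle"),
    ("right_hip", "right_knee"), ("right_knee", "right_ankle"),
]


def build_edge_index(bones: Sequence[Tuple[str, str]]) -> List[List[int]]:
    """Return a 2 x E edge list with both directions of every bone."""
    sources: List[int] = []
    targets: List[int] = []
    for a, b in bones:
        i, j = LANDMARKS[a], LANDMARKS[b]
        sources += [i, j]  # store i->j and j->i so the graph is undirected
        targets += [j, i]
    return [sources, targets]


edge_index = build_edge_index(BONES)
print(len(edge_index[0]))  # 24 directed edges for 12 bones
```

Wrapped as `torch.tensor(edge_index, dtype=torch.long)`, a list in this layout matches the `edge_index` format that PyTorch Geometric's `GCNConv` and `GATConv` layers accept.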

Pose Behavior

  • Downdog: Hip height & spine–arm line make it distinctive.
  • Plank: The straight line from torso to heels is the key feature.
  • Warrior2 vs Goddess: Arm elevation is the main discriminator.
  • Tree: Balance on one leg is a strong cue.

Real-Time Reliability

  • Runs consistently in webcam mode due to stable landmark extraction.
  • Temporal smoothing further reduces flicker.
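
One common way to implement temporal smoothing is a majority vote over the most recent frame predictions. The sketch below assumes that approach; the class name `MajorityVoteSmoother` and the window size are hypothetical, and the actual script may smooth differently (for example by averaging class probabilities).

```python
from collections import Counter, deque
from typing import Deque


class MajorityVoteSmoother:
    """Report the most frequent label among the last `window` predictions."""

    def __init__(self, window: int = 10) -> None:
        # deque with maxlen drops the oldest prediction automatically.
        self.history: Deque[str] = deque(maxlen=window)

    def update(self, label: str) -> str:
        self.history.append(label)
        return Counter(self.history).most_common(1)[0][0]


smoother = MajorityVoteSmoother(window=5)
frames = ["Tree", "Tree", "Plank", "Tree", "Tree"]  # one spurious frame
smoothed = [smoother.update(label) for label in frames]
print(smoothed)  # ['Tree', 'Tree', 'Tree', 'Tree', 'Tree']
```

A larger window suppresses more flicker at the cost of a slower response when the user actually changes pose.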
