This project classifies 5 yoga poses using MediaPipe Pose keypoints and a Hybrid Graph Neural Network (GCN + GAT).
It supports:
- Training the model in Kaggle.
- Real‑time webcam pose classification in VS Code.
- Live pose correction feedback for improving alignment.
Classes:
Downdog, Goddess, Plank, Tree, Warrior2
- Skeleton-based representation — robust to lighting/background changes.
- Hybrid GNN model:
  GCN → GCN → GCN → GAT → GAT → (Mean + Max + Sum Pooling) → MLP
- Training enhancements: AdamW optimizer, cosine annealing learning rate schedule, label smoothing, early stopping, gradient clipping.
- Data augmentation: horizontal flip, brightness variation.
- Real-time webcam inference with pose names.
- Live feedback system — joint-angle & alignment checks to give instant corrective tips.
yoga-pose-classification/
├── All_Models (1).ipynb # Baseline ML models (SVM, KNN, etc.)
├── gcn-mlp-att-3 (2) (1).ipynb # Main GCN+MLP+Attention training notebook
├── best_gcn_mlp_att_model (1).pth # Pre-trained model weights
├── realtime_gcn_mlp_3layers.py # Real-time classification + feedback
├── README.md # This file
If you want to retrain instead of using the given .pth:
- Open Kaggle → New Notebook.
- Upload gcn-mlp-att-3 (2) (1).ipynb, or All_Models (1).ipynb for classic ML baselines.
- Attach dataset: Kaggle → “Add Dataset” → Search yoga-poses-dataset (by niharika41298).
- Run all cells — after completion you’ll get best_gcn_mlp_att_model.pth in /kaggle/working/.
- From Kaggle output, download best_gcn_mlp_att_model.pth and place it in the same folder as realtime_gcn_mlp_3layers.py on your PC.
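As a quick sanity check that the downloaded weights deserialize correctly, you can load them on CPU. The helper below is hypothetical (the script's own loading code may differ), and the `"model_state_dict"` unwrapping is an assumption about how some notebooks save checkpoints.

```python
import torch

def load_state(path):
    """Load a checkpoint on CPU and return its state dict."""
    state = torch.load(path, map_location="cpu")
    # Some notebooks wrap weights as {"model_state_dict": ...}; unwrap if so (assumption).
    if isinstance(state, dict) and "model_state_dict" in state:
        state = state["model_state_dict"]
    return state
```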
- Install Python 3.9+ and VS Code.
- In terminal:
  pip install torch torchvision torchaudio
  pip install torch-geometric
  pip install mediapipe opencv-python numpy
- Run:
python realtime_gcn_mlp_3layers.py --model_path best_gcn_mlp_att_model.pth
- Webcam will open → shows predicted pose + live feedback corrections.
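Under the hood, each webcam frame's MediaPipe landmarks must be converted into graph inputs for the model. A minimal sketch of that conversion is below; the function name `landmarks_to_graph` is hypothetical and the edge list is only a small illustrative subset of the 33-landmark skeleton, not necessarily the exact graph the script builds.

```python
import numpy as np
import torch

# A few MediaPipe Pose connections (subset; the script may use the full skeleton)
EDGES = [(11, 13), (13, 15), (12, 14), (14, 16),   # arms
         (23, 25), (25, 27), (24, 26), (26, 28),   # legs
         (11, 12), (23, 24), (11, 23), (12, 24)]   # torso

def landmarks_to_graph(landmarks):
    """landmarks: iterable of 33 (x, y, z) tuples from MediaPipe Pose.
    Returns node features [33, 3] and an undirected edge_index [2, 2E]."""
    x = torch.tensor(np.asarray(landmarks, dtype=np.float32))
    src, dst = zip(*EDGES)
    # Duplicate each edge in both directions so message passing is symmetric
    edge_index = torch.tensor([src + dst, dst + src], dtype=torch.long)
    return x, edge_index
```

The resulting tensors can be fed directly to the model's forward pass each frame.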
In real time, joint angles & alignments are checked per pose:
| Pose | Example Tips |
|---|---|
| Warrior2 | Bend front knee to ~90°, Arms at shoulder height, Keep torso upright |
| Tree | Level hips, Straighten standing leg, Adjust foot placement |
| Plank | Keep straight line shoulders→heels, Avoid sagging/raising hips |
| Downdog | Push hips back, Press heels toward floor |
| Goddess | Align knees over toes, Lower hips for depth |
Rules are hardcoded inside realtime_gcn_mlp_3layers.py and can be tuned.
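The angle checks behind these tips boil down to computing the angle at a joint from three keypoints. A sketch of that computation is below; the 110° threshold and the `warrior2_knee_tip` helper are illustrative assumptions, not the script's exact values.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at point b (degrees) formed by segments b->a and b->c.
    Points are (x, y) pixel or normalized coordinates."""
    a, b, c = map(np.asarray, (a, b, c))
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def warrior2_knee_tip(hip, knee, ankle):
    """Return a corrective tip if the front knee is too straight (threshold is illustrative)."""
    if joint_angle(hip, knee, ankle) > 110:
        return "Bend front knee closer to 90 degrees"
    return None
```

A knee bent at a right angle (hip above, ankle in front) yields ~90° and no tip; a nearly straight leg yields ~180° and triggers the correction.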
- Validation Accuracy: 96.99%
- Test Accuracy: 99.14%
- Macro F1 (Test): 0.9913
- Test Loss: ~0.051
| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| downdog | 1.00 | 1.00 | 1.00 | 93 |
| goddess | 0.99 | 0.95 | 0.97 | 80 |
| plank | 0.99 | 1.00 | 1.00 | 115 |
| tree | 0.97 | 1.00 | 0.99 | 68 |
| warrior2 | 0.98 | 0.98 | 0.98 | 107 |
- Perfect predictions: Downdog & Plank — very distinct geometric signatures.
- Minor confusions: Goddess ↔ Warrior2 due to similar stance & leg positioning.
- Tree misclassifications occur only when lifted leg position is unclear or out of frame.
- Landmark-based features independent of background/colors.
- GCN captures local anatomical relationships.
- GAT layers highlight key discriminative joints.
- Combined pooling extracts richer graph features.
- Strong data augmentation improves generalization.
- Downdog: Hip height & spine–arm line make it distinctive.
- Plank: A straight torso-to-heels line is the key feature.
- Warrior2 vs Goddess: Arm elevation is main discriminator.
- Tree: Balance on one leg is a strong cue.
- Runs consistently in webcam mode due to stable landmark extraction.
- Temporal smoothing further reduces flicker.
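One simple way to smooth per-frame predictions is a majority vote over a sliding window. The sketch below is an assumption about the approach; the script may use a different scheme (e.g. averaging class probabilities) and a different window size.

```python
from collections import Counter, deque

class PredictionSmoother:
    """Majority vote over the last `window` frame-level predictions."""
    def __init__(self, window=9):
        self.buf = deque(maxlen=window)

    def update(self, label):
        """Add this frame's predicted label; return the smoothed label."""
        self.buf.append(label)
        return Counter(self.buf).most_common(1)[0][0]
```

A single misclassified frame in a run of consistent predictions no longer changes the displayed pose, which is what removes the flicker.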