A real-time computer vision application that detects and classifies hand gestures from a webcam feed using a powerful combination of deep learning and machine learning models. This repository contains both a local desktop version and a web-based Streamlit application.
This project uses:
- YOLOv8 for initial, robust hand region detection.
- MediaPipe for accurate, high-fidelity hand landmark extraction.
- TensorFlow/Keras for custom gesture classification.
- OpenCV for local video capture and rendering.
- Streamlit and Streamlit-WebRTC for the interactive web interface.
The application follows a multi-stage pipeline to achieve efficient and accurate gesture recognition:
- Video Input: The app can use a local OpenCV window (`yolodetect.py`) or a browser's webcam feed via Streamlit-WebRTC (`app_streamlit.py`).
- Region of Interest (ROI) Detection: Instead of scanning the entire frame, the app first uses a pre-trained YOLOv8 model to detect a person. This quickly and reliably identifies the main area where a hand is likely to be.
- Landmark Extraction: The detected ROI is cropped and passed to the MediaPipe Hands model to extract 21 detailed 3D landmarks for the hand.
- Gesture Classification: The 3D coordinates of the landmarks are flattened and fed into a custom Keras neural network, which classifies the gesture into predefined categories.
This pipeline approach is highly efficient, as the heavy-duty landmark extraction is only performed on a small, relevant section of the video frame.
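As a rough illustration, the sketch below strings these stages together in a standalone loop. It is not the repository's code: the `ultralytics`, MediaPipe, OpenCV, and TensorFlow calls follow their public APIs, while the gesture labels and the small Dense classifier are hypothetical stand-ins for the model built by `create_gesture_model()` in the scripts.

```python
# Minimal sketch of the detection pipeline (not the repository's exact code).
import cv2
import numpy as np
import mediapipe as mp
import tensorflow as tf
from ultralytics import YOLO

GESTURE_LABELS = ["open_palm", "fist", "thumbs_up"]  # hypothetical labels

yolo = YOLO("yolov8n.pt")  # pre-trained COCO model; class 0 = "person"
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)

# Placeholder classifier: 21 landmarks x 3 coordinates = 63 inputs.
gesture_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(63,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(len(GESTURE_LABELS), activation="softmax"),
])

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break

    # 1) ROI detection: find a person with YOLOv8 instead of scanning the whole frame.
    det = yolo(frame, classes=[0], verbose=False)[0]
    if len(det.boxes) > 0:
        x1, y1, x2, y2 = det.boxes.xyxy[0].int().tolist()
        roi = frame[y1:y2, x1:x2]

        # 2) Landmark extraction: MediaPipe Hands runs only on the cropped ROI.
        result = hands.process(cv2.cvtColor(roi, cv2.COLOR_BGR2RGB))
        if result.multi_hand_landmarks:
            lm = result.multi_hand_landmarks[0].landmark
            landmarks = np.array([[p.x, p.y, p.z] for p in lm],
                                 dtype=np.float32).flatten()

            # 3) Gesture classification: flattened 63-value vector -> Keras model.
            probs = gesture_model.predict(landmarks[None, :], verbose=0)[0]
            label = GESTURE_LABELS[int(np.argmax(probs))]
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, label, (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

    cv2.imshow("Gesture Recognition", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # ESC closes the window
        break

cap.release()
cv2.destroyAllWindows()
```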
- Dual Versions: Run the app locally in a desktop window or as an interactive web application in your browser.
- Real-Time Performance: Optimized pipeline runs smoothly on a standard webcam.
- Robust Detection: Uses YOLOv8 to reliably find the hand's location.
- Extensible: Easily train the Keras model to recognize your own custom gestures.
- Well-Documented Code: Scripts are cleaned and commented for easy understanding and modification.
- Python 3.9+
- A webcam
- A modern web browser (for the Streamlit version)
- Clone the repository:
  ```bash
  git clone https://github.com/chmj/app_yolodetect.git
  cd app_yolodetect
  ```
- Create and activate a virtual environment:
  - macOS/Linux:
    ```bash
    python3 -m venv .venv
    source .venv/bin/activate
    ```
  - Windows:
    ```bash
    python -m venv .venv
    .\.venv\Scripts\activate
    ```
- Install the required dependencies: the `requirements.txt` file contains all dependencies for both the desktop and web versions.
  ```bash
  pip install -r requirements.txt
  ```
You can run either the local desktop version or the Streamlit web app.
This will open a standard OpenCV window on your desktop to display the webcam feed.
```bash
python3 yolodetect.py
```
Press the ESC key to close the application window.
This will launch a local web server and open the application in your browser.
```bash
streamlit run app_streamlit.py
```
Your browser will open a new tab. Click the "START" button and grant webcam permissions when prompted.
The included Keras model is a placeholder and is not trained. To recognize your own gestures:
- Collect Landmark Data: Modify either `yolodetect.py` or `app_streamlit.py` to save the flattened landmark vectors (the `landmarks` variable in the code) to a CSV file. Create separate files or use labels for each gesture you want to train.
- Train the Keras Model: Create a separate Python script (see the sketch after this list) to:
  - Load the data from your CSV files.
  - Build the model with `create_gesture_model()`.
  - Train the model on your landmark data.
  - Save the trained model's weights: `model.save_weights('my_gesture_model.h5')`.
- Load Your Trained Model: In `yolodetect.py` or `app_streamlit.py`, uncomment the following line and update the path to your saved weights file: `# gesture_model.load_weights('my_gesture_model.h5')`
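A training script along these lines could serve as a starting point. This is a minimal sketch, not part of the repository: the file name `gestures.csv` and its layout (63 landmark values per row plus a `label` column) are assumptions about how you collected the data, and the model here is a stand-in that should be replaced with the architecture returned by `create_gesture_model()` so the saved weights remain loadable by the main scripts.

```python
# Sketch of a training script for custom gestures (assumed CSV layout: 63 landmark
# values per row plus a "label" column; adjust to match your collected data).
import numpy as np
import pandas as pd
import tensorflow as tf

data = pd.read_csv("gestures.csv")                 # hypothetical file from step 1
labels, class_names = pd.factorize(data["label"])  # map gesture names to integer ids
features = data.drop(columns=["label"]).to_numpy(dtype=np.float32)

# Stand-in architecture: replace with create_gesture_model() from the repo so the
# saved weights can be loaded back by yolodetect.py / app_streamlit.py.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(features.shape[1],)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(len(class_names), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(features, labels, epochs=50, batch_size=32, validation_split=0.2)

# Save only the weights, matching the commented-out load_weights() call above.
model.save_weights("my_gesture_model.h5")
```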
This project is licensed under the MIT License. See the LICENSE file for details.
- Charles Majola
- GitHub: @chmj