👁️ VISION: AI Virtual Mouse & Assistant

Vision is a comprehensive, multimodal Human-Computer Interaction (HCI) system designed to bridge the gap between physical peripherals and natural user behavior. It replaces the traditional mouse and keyboard with Hand Gestures, Eye Gaze, and Voice Commands, powered by a robust Generative AI backend for system automation.

🚀 Features

1. 👋 Virtual Mouse (Hand Gestures)

Control your cursor with precision using natural hand movements.
Technology: Google MediaPipe Hands (21-point Landmark Detection).
Gestures:

Move Cursor: Index Finger Up.
Left Click: Pinch (Thumb + Index).
Right Click: Victory Sign (Index + Middle).
Scroll: Pinky Finger Up.
Smoothing: Weighted Moving Average algorithm eliminates cursor jitter.

2. 👁️ Virtual Mouse (Eye Tracking)

Hands-free navigation for complete accessibility.
Technology: Dlib 68-Point Facial Landmark Predictor.
Functionality:

Navigation: Head Pose Estimation (Nose Tip tracking).
Clicking: Eye Aspect Ratio (EAR) based blink detection.
Auto-Calibration: 3-second setup on launch to determine neutral head position.

3. 🧠 AI Assistant (Gemini Brain)

An intelligent agent that understands context and controls the OS.
Backend: Custom Raw HTTP Wrapper for Google Gemini 2.5 Flash.
Capabilities:

Context-Aware Chat: Maintains conversation history.
System Automation: Opens apps (e.g., "Open Chrome"), checks time, and manages system processes.
Heuristic Routing: Local execution for system commands (<0.1s latency) vs. Cloud execution for complex reasoning.

4. 🗣️ Voice Command & Automation

Smart Phonetic Correction: Automatically fixes common speech-to-text errors (e.g., "Start High" $\rightarrow$ "Start Eye").
Process Orchestration: Manages system resources by automatically killing conflicting vision processes (e.g., stopping Hand Tracking before starting Eye Tracking).

🛠️ System Architecture

The project utilizes a Multi-Process, Event-Driven Architecture to ensure responsiveness.

Frontend: CustomTkinter (Modern Floating Dock UI).
Vision Engine: OpenCV, MediaPipe, Dlib.
Intelligence: Google Gemini API, SpeechRecognition.
Automation: PyAutoGUI, Psutil, Subprocess.

💻 Installation

Prerequisites

Python 3.8.10 (Recommended for MediaPipe stability).
Visual Studio Build Tools (Required for Dlib).
Webcam & Microphone.

Setup Steps

i) Clone the Repository

git clone https://github.com/Syed-centem/Virtual-Mouse.git cd Virtual-Mouse

ii) Install Dependencies

pip install -r requirements.txt

Note: If dlib fails, ensure CMake is installed (pip install cmake).

iii) Configure Environment Create a .Env file in the root directory and add your API Key:

GOOGLE_API_KEY=AIzaSy...YourKeyHere

iV) Run the Application

python Main.py

🎮 Usage Guide

Voice Commands

"Start Hand": Activates Hand Gesture Mouse.
"Start Eye": Activates Eye Tracking Mouse.
"Stop": Stops all vision tracking.

Gestures

Hand: Index Up to move, Pinch to click.
Eye: Look to move cursor, Blink to click.

🔮 Future Scope

[ ] IoT Integration: Control smart home devices via gestures.

[ ] Offline LLM: Integrate Llama-3 for privacy-focused, offline reasoning.

[ ] Gaze Typing: On-screen keyboard for paralyzed users.

👥 Contributors

SYED JUNAID HUSSAIN

MOHAMMED SUBHANI MUSHARRAF

Name		Name	Last commit message	Last commit date
Latest commit History 35 Commits
src		src
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

👁️ VISION: AI Virtual Mouse & Assistant

🚀 Features

🛠️ System Architecture

💻 Installation

🎮 Usage Guide

🔮 Future Scope

👥 Contributors

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

👁️ VISION: AI Virtual Mouse & Assistant

🚀 Features

🛠️ System Architecture

💻 Installation

🎮 Usage Guide

🔮 Future Scope

👥 Contributors

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages