SimoneAndreaCilia/Virtual-AI-Spotter

Virtual AI Spotter

🚀 Major Update: The core engine has been refactored for production readiness. The rewrite is built around FSM-based counting (debouncing + hysteresis), a modular Feedback System, One Euro Filter signal smoothing, and a pure-math Geometry Engine (zero NumPy overhead). Architecture highlights: Factory + Registry extensibility, Protocol-based DI, Session Manager with set/rest orchestration, hands-free Gesture Control, i18n (IT/EN), SQLite persistence, and an optimized HUD with ROI alpha blending. Everything is validated by a 150+ test suite, and the app runs at 30+ FPS on CPU.

☁️ NEW β€” AWS Cloud Integration: Workout sessions are now persisted to the cloud via API Gateway β†’ Lambda β†’ DynamoDB pipeline. Data Batching pattern sends a single JSON payload per session. Configurable via .env, fully optional (app works offline with SQLite only).

Project Overview

Virtual AI Spotter is a real-time Computer Vision assistant designed to act as an intelligent personal trainer. It utilizes state-of-the-art Deep Learning and geometric analysis to provide automatic repetition counting, exercise suggestions, and instant feedback on execution form.

Technology Stack

  • Core AI: YOLOv8 (Pose Estimation)
  • Framework: PyTorch
  • Computer Vision: OpenCV
  • Logic: 📐 Geometric Vector Analysis & ⚙️ Finite State Machines (FSM)
  • Cloud: AWS (Lambda, DynamoDB, S3)
  • Database: SQLite (Local), DynamoDB (Cloud)

Key Features

  • Real-time Pose Estimation: High-speed, accurate body tracking using YOLOv8-pose.
  • Action Classification: Distinguishes between different exercises and movement phases.
  • Automatic Rep Counting: Precision counting based on Finite State Machines (FSM) with debouncing and hysteresis.
  • Form Correction: Instant feedback on posture (e.g., "Lower your hips", "Straighten back") using a modular Feedback System.
  • Multi-language Support: Fully localized interface (Italian/English) with dynamic switching.
  • High-Performance HUD: Optimized Visualizer engine using ROI-based Alpha Blending for smooth, transparent overlays.
  • Gesture Control: Hands-free interaction using pose-based gestures (e.g., raised arm to skip rest periods).
  • Extensible Architecture: Factory + Registry Pattern enables adding new exercises without modifying core code. Dependency Injection via Python Protocols for testability.
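
As a rough illustration of the Gesture Control idea above, the sketch below flags a "raised arm" when a wrist stays above its shoulder for several consecutive frames. The class name, default thresholds, and the (x, y, confidence) keypoint layout are assumptions made for illustration, not the project's actual gesture_detector.py API. Keypoint indices follow the COCO 17-point convention used by YOLOv8-pose.

```python
from typing import Sequence, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence), assumed layout

# COCO 17-keypoint indices
LEFT_SHOULDER, RIGHT_SHOULDER = 5, 6
LEFT_WRIST, RIGHT_WRIST = 9, 10


class RaisedArmDetector:
    """Hypothetical gesture detector: fires once the arm is held up long enough."""

    def __init__(self, hold_frames: int = 15, min_conf: float = 0.5) -> None:
        self.hold_frames = hold_frames
        self.min_conf = min_conf
        self._streak = 0  # consecutive frames with an arm raised

    def _arm_raised(self, kps: Sequence[Keypoint], shoulder: int, wrist: int) -> bool:
        _, shoulder_y, shoulder_conf = kps[shoulder]
        _, wrist_y, wrist_conf = kps[wrist]
        if shoulder_conf < self.min_conf or wrist_conf < self.min_conf:
            return False
        # Image coordinates grow downward, so "above" means a smaller y.
        return wrist_y < shoulder_y

    def update(self, kps: Sequence[Keypoint]) -> bool:
        """Feed one frame of keypoints; returns True when the gesture triggers."""
        raised = (
            self._arm_raised(kps, LEFT_SHOULDER, LEFT_WRIST)
            or self._arm_raised(kps, RIGHT_SHOULDER, RIGHT_WRIST)
        )
        self._streak = self._streak + 1 if raised else 0
        if self._streak >= self.hold_frames:
            self._streak = 0  # reset so one long hold does not fire every frame
            return True
        return False
```

Requiring a hold over several frames keeps a brief arm swing during an exercise from being mistaken for a command.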

MVP Scope (Minimum Viable Product)

The initial release focuses on 4 fundamental exercises that test different aspects of the tracking engine:

  1. Squat (Lower Body)

    • Focus: Knee and hip angles.
    • Logic: Standard FSM (Down < Threshold, Up > Threshold).
    • Feedback: Squat depth and back alignment.
  2. Push-up (Upper Body)

    • Focus: Body alignment and elbow extension.
    • Challenges: Robustness against occlusion (body close to floor).
    • Feedback: "Keep back straight" via body angle analysis.
  3. Bicep Curl (Isolation)

    • Focus: Elbow flexion/extension.
    • Logic: Inverted FSM Logic (Up/Flexion < Threshold, Down/Extension > Threshold).
    • Feedback: Full extension check.
  4. Plank (Static Core)

    • Focus: Maintaining a straight line (Shoulder-Hip-Ankle alignment).
    • Logic: StaticDurationCounter FSM with countdown, active timer, and form break detection.
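
The standard and inverted counting logic described above can be sketched as a small state machine. This is a minimal sketch under stated assumptions, not the project's fsm.py: the class name RepCounter, the threshold values, and the rule "count a rep on returning to the high-angle zone" are all illustrative. Hysteresis comes from the gap between the two thresholds, and debouncing from requiring a new zone to persist for several frames.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    UP = "up"
    DOWN = "down"


class RepCounter:
    """Hypothetical angle-driven rep counter with hysteresis and debouncing."""

    def __init__(self, low: float, high: float,
                 inverted: bool = False, debounce_frames: int = 3) -> None:
        if low >= high:
            raise ValueError("low threshold must be below high threshold")
        self.low, self.high = low, high
        self.debounce_frames = debounce_frames
        # Standard (squat): small angle = DOWN. Inverted (curl): small angle = UP.
        self._low_label = Phase.UP if inverted else Phase.DOWN
        self._high_label = Phase.DOWN if inverted else Phase.UP
        self.phase = self._high_label  # start in the extended position
        self.reps = 0
        self._candidate: Optional[Phase] = None
        self._streak = 0

    def update(self, angle: float) -> int:
        """Feed one joint angle in degrees; returns the running rep count."""
        if angle < self.low:
            target: Optional[Phase] = self._low_label
        elif angle > self.high:
            target = self._high_label
        else:
            target = None  # inside the hysteresis band: keep the current phase
        if target is None or target == self.phase:
            self._candidate, self._streak = None, 0
            return self.reps
        if target == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = target, 1
        if self._streak >= self.debounce_frames:  # transition confirmed
            self.phase = target
            self._candidate, self._streak = None, 0
            if target == self._high_label:  # back to extension: one full rep
                self.reps += 1
        return self.reps
```

With these assumptions, a squat might use `RepCounter(low=90, high=160)` and a curl `RepCounter(low=50, high=150, inverted=True)`. Only the phase labels differ between the two modes.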

System Architecture

The project follows a Layered Architecture with clear separation of concerns, enabling testability, extensibility (Open/Closed Principle), and adherence to Domain-Driven Design (DDD) principles.

Data Flow Diagram

%%{init: {'theme': 'neutral'}}%%
graph LR
    subgraph INFRA["🔌 Infrastructure Layer"]
        direction TB
        A["📷 Webcam"] --> B["🤖 YOLOv8 Pose"]
        B --> C["🔑 Keypoint Extractor"]
    end

    subgraph CORE["⚡ Core Domain"]
        direction TB
        D["📐 Geometry Engine"] --> E["⚙️ FSM"]
        E --> F["💬 Feedback"]
    end

    subgraph UI["🎨 Presentation Layer"]
        direction TB
        G["🖥️ Visualizer"] --> H["📊 Renderers"]
    end
    end

    C ==> D
    F ==> G

    %% Layer styling - pastel fills, semantic strokes
    style INFRA fill:#f3f4f6,stroke:#6b7280,stroke-width:2px,color:#374151
    style CORE fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#1e40af
    style UI fill:#f5f3ff,stroke:#8b5cf6,stroke-width:2px,color:#5b21b6

    %% Node styling - light backgrounds, dark text
    style A fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style B fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style C fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style D fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style E fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style F fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style G fill:#ffffff,stroke:#a78bfa,stroke-width:1px,color:#4c1d95
    style H fill:#ffffff,stroke:#a78bfa,stroke-width:1px,color:#4c1d95

    linkStyle default stroke:#64748b,stroke-width:1px
    linkStyle 3 stroke:#059669,stroke-width:2px
    linkStyle 4 stroke:#059669,stroke-width:2px

1. Core Domain (src/core)

Business logic is fully isolated from external dependencies:

  • Entities (src/core/entities/): Domain objects following DDD: Session, User, WorkoutState, UIState.
  • FSM Core (fsm.py): Reusable RepetitionCounter with debouncing, hysteresis, and support for standard/inverted logic.
  • Feedback System (feedback.py): Aggregates form-check rules and prioritizes messages.
  • Factory + Registry (factory.py, registry.py): Exercises self-register via the @register_exercise decorator, so no if/elif chains are needed.
  • Session Manager (session_manager.py): Orchestrates workout flow, rest periods, and set progression.
  • Dependency Injection: Abstractions defined in protocols.py (PoseDetector, KeypointExtractor, DatabaseManagerProtocol) enable mock injection for CI/CD testing.
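
The Factory + Registry idea from the list above can be sketched as follows. The decorator name @register_exercise comes from the README, but its signature, the Exercise base class, and create_exercise shown here are assumptions for illustration only.

```python
from typing import Any, Callable, Dict, Type


class Exercise:
    """Hypothetical base class; the real one lives in interfaces.py."""

    def __init__(self, **config: Any) -> None:
        self.config = config


_REGISTRY: Dict[str, Type[Exercise]] = {}


def register_exercise(name: str) -> Callable[[Type[Exercise]], Type[Exercise]]:
    """Class decorator that stores the exercise under a case-insensitive name."""

    def decorator(cls: Type[Exercise]) -> Type[Exercise]:
        key = name.lower()
        if key in _REGISTRY:
            raise ValueError(f"exercise {name!r} is already registered")
        _REGISTRY[key] = cls
        return cls

    return decorator


def create_exercise(name: str, **config: Any) -> Exercise:
    """Factory: look the class up in the registry instead of an if/elif chain."""
    try:
        cls = _REGISTRY[name.lower()]
    except KeyError:
        raise ValueError(f"unknown exercise: {name!r}") from None
    return cls(**config)


@register_exercise("squat")
class Squat(Exercise):
    """Adding a new exercise only requires a new decorated class."""
```

Because registration happens when the module is imported, a package-level `__init__.py` that imports every exercise module is enough to populate the registry, which matches the "Auto-imports for registration" note in the file tree.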

2. Infrastructure Layer (src/infrastructure)

Handles external integrations, decoupled from business logic:

  • AI Inference (ai_inference.py): YOLO model wrapper implementing PoseDetector protocol.
  • Keypoint Extractor (keypoint_extractor.py): Transforms raw YOLO output to standardized 17×3 arrays.
  • Webcam (webcam.py): Frame capture abstraction for easy replacement with video files or streams.
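
To show why a Protocol-based PoseDetector makes the infrastructure swappable, here is a minimal sketch. The protocol name comes from the README, but the detect method, its signature, and the helper function are assumptions, not the contents of protocols.py.

```python
from typing import Any, List, Protocol, Sequence, Tuple, runtime_checkable

Keypoint = Tuple[float, float, float]  # (x, y, confidence), assumed layout


@runtime_checkable
class PoseDetector(Protocol):
    """Structural interface: any object with a matching detect() qualifies."""

    def detect(self, frame: Any) -> List[Keypoint]:
        ...


class FakePoseDetector:
    """Test double that returns canned keypoints, with no model or GPU needed."""

    def __init__(self, keypoints: Sequence[Keypoint]) -> None:
        self._keypoints = list(keypoints)
        self.calls = 0

    def detect(self, frame: Any) -> List[Keypoint]:
        self.calls += 1
        return self._keypoints


def count_confident_keypoints(detector: PoseDetector, frame: Any,
                              min_conf: float = 0.5) -> int:
    """Core-side code depends only on the protocol, never on YOLO directly."""
    return sum(1 for _, _, conf in detector.detect(frame) if conf >= min_conf)
```

The fake never inherits from PoseDetector; structural typing means a real YOLO wrapper and a mock are interchangeable as long as both expose the same method.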

3. UI & Visualization (src/ui)

Presentation layer with separated rendering responsibilities:

  • Visualizer (visualizer.py): Facade coordinating all renderers.
  • Dashboard Renderer: Draws HUD panels (reps, sets, feedback text).
  • Overlay Renderer: Transparent overlays using ROI-based alpha blending.
  • Skeleton Renderer: Draws pose skeleton connections.
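
The point of ROI-based alpha blending is to blend only the rectangle covered by a panel instead of the whole frame. The sketch below shows that idea on a plain nested-list image so it stays dependency-free. The function name and data layout are hypothetical, and the real renderer presumably works on OpenCV/NumPy frames rather than Python lists.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (b, g, r)
Frame = List[List[Pixel]]      # rows of pixels, assumed layout for this sketch


def blend_roi(frame: Frame, x: int, y: int, w: int, h: int,
              color: Pixel, alpha: float) -> None:
    """Blend `color` over the rectangle (x, y, w, h) in place.

    Only pixels inside the region of interest are touched, so the cost
    scales with the panel size rather than the full frame size.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    rows = len(frame)
    cols = len(frame[0]) if rows else 0
    # Clip the rectangle to the frame bounds.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, cols), min(y + h, rows)
    for r in range(y0, y1):
        row = frame[r]
        for c in range(x0, x1):
            row[c] = tuple(
                round(alpha * overlay + (1.0 - alpha) * base)
                for overlay, base in zip(color, row[c])
            )
```

For a 1280×720 frame with a 300×150 HUD panel, this touches about 5% of the pixels that a full-frame blend would process.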

4. Signal Processing (src/utils)

  • Geometry Engine (geometry.py): Pure math-based vector calculations (no NumPy overhead).
  • Smoothing (smoothing.py): One Euro Filter for jitter reduction.
  • Circular Buffer: collections.deque for temporal smoothing (30-frame window).
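
The two signal-processing pieces above can be sketched with the standard library alone. This is an illustrative sketch, not the project's geometry.py or smoothing.py: the function and parameter names are assumptions, and the filter follows the published One Euro Filter formulation (a low-pass filter whose cutoff rises with the signal's speed).

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def angle_deg(a: Point, b: Point, c: Point) -> float:
    """Angle at vertex b (degrees) formed by segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # degenerate input: two keypoints coincide
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding drift
    return math.degrees(math.acos(cos_theta))


class OneEuroFilter:
    """Adaptive low-pass filter: smooth when slow, responsive when fast."""

    def __init__(self, min_cutoff: float = 1.0, beta: float = 0.0,
                 d_cutoff: float = 1.0) -> None:
        self.min_cutoff = min_cutoff
        self.beta = beta
        self.d_cutoff = d_cutoff
        self._x_prev: Optional[float] = None
        self._dx_prev = 0.0
        self._t_prev = 0.0

    @staticmethod
    def _alpha(cutoff: float, dt: float) -> float:
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / dt)

    def __call__(self, x: float, t: float) -> float:
        """Filter sample x taken at time t (seconds)."""
        if self._x_prev is None:
            self._x_prev, self._t_prev = x, t
            return x
        dt = t - self._t_prev
        if dt <= 0.0:
            return self._x_prev  # ignore out-of-order or duplicate timestamps
        # Smooth the derivative, then use its magnitude to adapt the cutoff.
        dx = (x - self._x_prev) / dt
        a_d = self._alpha(self.d_cutoff, dt)
        dx_hat = a_d * dx + (1.0 - a_d) * self._dx_prev
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff, dt)
        x_hat = a * x + (1.0 - a) * self._x_prev
        self._x_prev, self._dx_prev, self._t_prev = x_hat, dx_hat, t
        return x_hat
```

A typical use would be one filter instance per keypoint coordinate (or per joint angle), fed with the frame timestamp, before the angle reaches the FSM.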

5. Hybrid Cloud Architecture (AWS)

The project implements a Hybrid Edge–Cloud model: real-time inference runs locally, so feedback never waits on a network round trip, while session data is persisted to the cloud asynchronously after each workout.

%%{init: {'theme': 'neutral'}}%%
graph LR
    subgraph EDGE["💻 Edge (Local)"]
        direction TB
        A["📷 Webcam"] --> B["🤖 YOLOv8 Pose"]
        B --> C["⚙️ FSM + Feedback"]
        C --> D["💾 SQLite"]
    end

    subgraph CLOUD["☁️ AWS Cloud"]
        direction TB
        E["🌐 API Gateway"] --> F["⚡ Lambda"]
        F --> G["🗄️ DynamoDB"]
    end

    D ==>|"POST /sessions\(Data Batching)"| E

    style EDGE fill:#f3f4f6,stroke:#6b7280,stroke-width:2px,color:#374151
    style CLOUD fill:#fff7ed,stroke:#f97316,stroke-width:2px,color:#9a3412
    style A fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style B fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style C fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style D fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style E fill:#ffffff,stroke:#fb923c,stroke-width:1px,color:#9a3412
    style F fill:#ffffff,stroke:#fb923c,stroke-width:1px,color:#9a3412
    style G fill:#ffffff,stroke:#fb923c,stroke-width:1px,color:#9a3412
    linkStyle default stroke:#64748b,stroke-width:1px
    linkStyle 3 stroke:#f97316,stroke-width:2px
  • Edge (Local): Real-time inference on local PC/GPU for zero-latency feedback. Sessions are saved to SQLite.
  • Cloud (AWS): After each workout, a single JSON payload (Data Batching) is sent to API Gateway (POST /sessions), which triggers a Lambda function that validates the data and writes it to DynamoDB.
  • Security: API Key authentication via x-api-key header. IAM policy follows Least Privilege (only dynamodb:PutItem + CloudWatch logs).
  • Configuration: All AWS settings are loaded from .env via config/settings.py. Cloud upload is fully optional: without a .env file, the app works entirely offline.
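
The client side of the upload flow above might look like the sketch below: one JSON payload per session, an x-api-key header, and a silent fallback to offline mode when no cloud settings exist. The environment variable names SPOTTER_API_URL and SPOTTER_API_KEY, the function name, and the payload fields are hypothetical. The real names live in config/settings.py.

```python
import json
import os
import urllib.request
from typing import Any, Dict, Mapping, Optional


def build_session_request(
    session: Dict[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[urllib.request.Request]:
    """Build the POST /sessions request, or return None in offline mode.

    The whole workout is serialized once (Data Batching), so a session
    costs a single HTTP call regardless of how many reps it contains.
    """
    env = os.environ if env is None else env
    url = env.get("SPOTTER_API_URL")   # hypothetical variable name
    key = env.get("SPOTTER_API_KEY")   # hypothetical variable name
    if not url or not key:
        return None  # no cloud config: keep working with SQLite only
    body = json.dumps(session).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": key},
    )
```

Sending is then a single `urllib.request.urlopen(request, timeout=5)` call wrapped in error handling, ideally off the render loop so a slow network never stalls the HUD.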

6. Quality Assurance

  • Test Suite (tests/): 150+ automated tests across 14 test files, covering FSM, Geometry, SessionManager, Gesture Detection, DI mocks, exercise integration, state display, and AWS Lambda.
  • Verification Scripts: Manual validation tools for debouncing, i18n, refactoring.

📂 View Project Structure (File Tree)
├── 📁 .github
│   └── 📁 workflows                          # CI/CD pipeline definitions
├── 📁 assets
│   └── 📁 models
│       └── 📄 yolov8n-pose.pt                 # Pre-trained YOLOv8 pose model
├── 📁 aws                                     # ☁️ AWS backend infrastructure
│   ├── 📁 lambda                              # Lambda function package
│   │   ├── 🐍 lambda_function.py              # Session Logger (validate → DynamoDB)
│   │   └── 📄 requirements.txt               # Lambda dependencies (boto3)
│   ├── 📄 iam-policy.json                     # Least-privilege IAM policy
│   └── 📝 README.md                           # AWS deploy instructions & API docs
├── 📁 config
│   ├── 🐍 settings.py                         # Global constants, thresholds, AWS config
│   └── 🐍 translation_strings.py              # i18n strings (IT/EN)
├── 📁 scripts
│   ├── 🐍 check_cam.py                        # Camera connectivity check
│   └── 🐍 verify_refactor.py                  # Post-refactor sanity checks
├── 📁 src
│   ├── 📁 core                                # Business logic (framework-agnostic)
│   │   ├── 📁 entities                        # Domain objects (DDD)
│   │   │   ├── 🐍 session.py                  # Workout session dataclass
│   │   │   ├── 🐍 ui_state.py                 # Rendering state container
│   │   │   ├── 🐍 user.py                     # User profile dataclass
│   │   │   └── 🐍 workout_state.py            # Workout FSM states (ACTIVE/REST/FINISHED)
│   │   ├── 🐍 app.py                          # Composition root & main loop
│   │   ├── 🐍 config_types.py                 # TypedDict definitions for configs
│   │   ├── 🐍 exceptions.py                   # Custom exception hierarchy (SpotterError)
│   │   ├── 🐍 factory.py                      # Exercise factory (creates instances)
│   │   ├── 🐍 feedback.py                     # Rule-based form correction engine
│   │   ├── 🐍 fsm.py                          # RepetitionCounter & StaticDurationCounter
│   │   ├── 🐍 gesture_detector.py             # Pose-based gesture recognition
│   │   ├── 🐍 interfaces.py                   # ABCs: Exercise, VideoSource, StateDisplayInfo
│   │   ├── 🐍 protocols.py                    # DI protocols: PoseDetector, DBManager
│   │   ├── 🐍 registry.py                     # @register_exercise decorator & registry
│   │   └── 🐍 session_manager.py              # Set/rest/rep orchestration
│   ├── 📁 data                                # Persistence layer
│   │   ├── 🐍 db_manager.py                   # SQLite CRUD operations
│   │   └── 📄 schema.sql                      # Database schema definition
│   ├── 📁 exercises                           # Concrete exercise implementations
│   │   ├── 🐍 __init__.py                     # Auto-imports for registration
│   │   ├── 🐍 curl.py                         # Bicep Curl (inverted FSM)
│   │   ├── 🐍 plank.py                        # Plank (static hold timer)
│   │   ├── 🐍 pushup.py                       # Push-Up (bilateral + form check)
│   │   └── 🐍 squat.py                        # Squat (standard FSM)
│   ├── 📁 infrastructure                      # External system adapters
│   │   ├── 🐍 ai_inference.py                 # YOLO model wrapper (PoseDetector)
│   │   ├── 🐍 keypoint_extractor.py           # Raw YOLO output → 17×3 arrays
│   │   └── 🐍 webcam.py                       # OpenCV camera capture (VideoSource)
│   ├── 📁 ui                                  # Presentation layer
│   │   ├── 🐍 cli.py                          # Interactive workout setup prompts
│   │   ├── 🐍 dashboard_renderer.py           # HUD panel (reps, sets, state)
│   │   ├── 🐍 overlay_renderer.py             # Full-screen REST/FINISHED overlays
│   │   ├── 🐍 skeleton_renderer.py            # Pose skeleton & angle arcs
│   │   └── 🐍 visualizer.py                   # Renderer facade (delegates to above)
│   └── 📁 utils                               # Signal processing utilities
│       ├── 🐍 geometry.py                     # Pure-math angle calculations
│       ├── 🐍 performance.py                  # FPS counter & timing helpers
│       └── 🐍 smoothing.py                    # One Euro Filter for jitter reduction
├── 📁 tests                                   # Automated test suite (150+ tests)
│   ├── 📁 mocks                               # Test doubles
│   │   ├── 🐍 __init__.py
│   │   ├── 🐍 mock_pose.py                    # Fake PoseDetector for DI tests
│   │   └── 🐍 mock_video.py                   # Fake VideoSource for DI tests
│   ├── 🐍 __init__.py
│   ├── 🐍 helpers.py                          # Shared fixtures (UIState, dummy frames)
│   ├── 🐍 test_app_di.py                      # Dependency injection wiring tests
│   ├── 🐍 test_db_manual.py                   # SQLite persistence tests
│   ├── 🐍 test_entities_manual.py             # Domain entity tests
│   ├── 🐍 test_exercise_integration.py        # End-to-end rep counting & form feedback
│   ├── 🐍 test_exercises.py                   # Exercise process_frame unit tests
│   ├── 🐍 test_fsm.py                         # FSM state transitions & debouncing
│   ├── 🐍 test_geometry.py                    # Angle calculation edge cases
│   ├── 🐍 test_gesture.py                     # Gesture recognition tests
│   ├── 🐍 test_lambda.py                      # ☁️ AWS Lambda handler & validation tests
│   ├── 🐍 test_plank.py                       # Plank lifecycle & timer tests
│   ├── 🐍 test_pose_estimator.py              # PoseEstimator protocol tests
│   ├── 🐍 test_session_manager.py             # Workout flow & state transitions
│   ├── 🐍 test_smoothing.py                   # One Euro Filter convergence tests
│   ├── 🐍 test_visualizer.py                  # Renderer + state display mapping tests
│   ├── 🐍 verify_debouncing.py                # Manual debouncing validation
│   ├── 🐍 verify_features.py                  # Manual feature smoke tests
│   ├── 🐍 verify_i18n.py                      # Manual i18n string verification
│   └── 🐍 verify_refactor.py                  # Manual refactor validation
├── ⚙️ .env                                     # AWS credentials & cloud config
├── ⚙️ .gitignore
├── 📄 LICENSE                                  # AGPL v3
├── 📝 README.md
├── 🐍 main.py                                 # Application entry point
└── 📄 requirements.txt                        # Python dependencies

🗺️ Roadmap

  • Project Initialization
    • Architecture & Tech Stack Definition
    • Repository Structure & .gitignore
  • Core Engineering
    • Abstract Exercise Class
    • YOLOv8 Integration
    • FSM & Feedback Architecture Refactoring
    • Performance Optimization (Math + ROI Visualizer)
  • Exercise Logic (MVP)
    • Squat (Depth & Form)
    • Push-up (Occlusion handling)
    • Bicep Curl (Inverted Logic)
    • Plank (Static stability check)
  • Cloud & DevOps
    • AWS Lambda Session Logger (aws/lambda/lambda_function.py)
    • DynamoDB Table
    • API Gateway HTTP API (POST /sessions + API Key auth)
    • IAM Least-Privilege Policy (aws/iam-policy.json)
    • Cloud config in settings.py + .env support
    • Lambda unit tests (tests/test_lambda.py, 25 tests)
    • Unit Testing Suite (tests/)
    • CI/CD Pipeline (GitHub Actions)
