Major Update: The core engine has been refactored for production readiness: a full rewrite around FSM-based counting (debouncing + hysteresis), a modular Feedback System, One Euro Filter signal smoothing, and a pure-math Geometry Engine (zero NumPy overhead). Architecture highlights: Factory + Registry extensibility, Protocol-based DI, a Session Manager with set/rest orchestration, hands-free Gesture Control, i18n (IT/EN), SQLite persistence, and an optimized HUD with ROI alpha blending, all validated by a 150+ test suite running at 30+ FPS on CPU.
NEW: AWS Cloud Integration: Workout sessions are now persisted to the cloud via an API Gateway → Lambda → DynamoDB pipeline. The Data Batching pattern sends a single JSON payload per session. Configurable via `.env`, and fully optional (the app works offline with SQLite only).
Virtual AI Spotter is a real-time Computer Vision assistant designed to act as an intelligent personal trainer. It uses state-of-the-art Deep Learning and geometric analysis to provide automatic repetition counting, exercise suggestions, and instant feedback on execution form.
- Core AI: YOLOv8 (Pose Estimation)
- Framework: PyTorch
- Computer Vision: OpenCV
- Logic: Geometric Vector Analysis & Finite State Machines (FSM)
- Cloud: AWS (Lambda, DynamoDB, S3)
- Database: SQLite (Local), DynamoDB (Cloud)
- Real-time Pose Estimation: High-speed, accurate body tracking using YOLOv8-pose.
- Action Classification: Distinguishes between different exercises and movement phases.
- Automatic Rep Counting: Precision counting based on Finite State Machines (FSM) with debouncing and hysteresis.
- Form Correction: Instant feedback on posture (e.g., "Lower your hips", "Straighten back") using a modular Feedback System.
- Multi-language Support: Fully localized interface (Italian/English) with dynamic switching.
- High-Performance HUD: Optimized Visualizer engine using ROI-based Alpha Blending for smooth, transparent overlays.
- Gesture Control: Hands-free interaction using pose-based gestures (e.g., raised arm to skip rest periods).
- Extensible Architecture: Factory + Registry Pattern enables adding new exercises without modifying core code. Dependency Injection via Python Protocols for testability.
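To make the debouncing + hysteresis idea concrete, here is a minimal, self-contained sketch. The class name, thresholds, and API are illustrative; the project's actual implementation is the `RepetitionCounter` in `src/core/fsm.py`.

```python
from enum import Enum, auto

class Phase(Enum):
    UP = auto()
    DOWN = auto()

class RepCounterSketch:
    """Count reps from a joint angle using hysteresis + debouncing.

    Hysteresis: separate low/high thresholds create a dead zone so the
    state cannot oscillate around a single cutoff. Debouncing: a phase
    change must persist for `debounce_frames` consecutive frames before
    it is accepted, filtering out single-frame detection jitter.
    """

    def __init__(self, down_threshold=90.0, up_threshold=160.0, debounce_frames=3):
        self.down_threshold = down_threshold   # angle below this -> DOWN phase
        self.up_threshold = up_threshold       # angle above this -> UP phase
        self.debounce_frames = debounce_frames
        self.phase = Phase.UP
        self.reps = 0
        self._pending = None                   # candidate phase awaiting confirmation
        self._count = 0                        # consecutive frames in candidate phase

    def update(self, angle: float) -> int:
        if angle < self.down_threshold:
            candidate = Phase.DOWN
        elif angle > self.up_threshold:
            candidate = Phase.UP
        else:
            candidate = self.phase             # dead zone: keep current phase
        if candidate != self.phase:
            if candidate == self._pending:
                self._count += 1
            else:
                self._pending, self._count = candidate, 1
            if self._count >= self.debounce_frames:
                # a confirmed DOWN -> UP transition completes one rep
                if self.phase is Phase.DOWN and candidate is Phase.UP:
                    self.reps += 1
                self.phase = candidate
                self._pending, self._count = None, 0
        else:
            self._pending, self._count = None, 0
        return self.reps
```

A single noisy frame below the threshold never flips the phase, because the debounce counter resets as soon as the candidate phase disappears.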
The initial release focuses on 4 fundamental exercises that test different aspects of the tracking engine:

- Squat (Lower Body)
  - Focus: Knee and hip angles.
  - Logic: Standard FSM (Down < threshold, Up > threshold).
  - Feedback: Squat depth and back alignment.
- Push-up (Upper Body)
  - Focus: Body alignment and elbow extension.
  - Challenges: Robustness against occlusion (body close to the floor).
  - Feedback: "Keep back straight" via body angle analysis.
- Bicep Curl (Isolation)
  - Focus: Elbow flexion/extension.
  - Logic: Inverted FSM logic (Up/Flexion < threshold, Down/Extension > threshold).
  - Feedback: Full extension check.
- Plank (Static Core)
  - Focus: Maintaining a straight line (shoulder-hip-ankle alignment).
  - Logic: `StaticDurationCounter` FSM with countdown, active timer, and form-break detection.
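One way to picture the standard vs inverted FSM logic is as a per-exercise threshold table. The values below are illustrative placeholders, not the project's tuned thresholds (those live in `config/settings.py`):

```python
# Illustrative threshold configs (degrees). "Inverted" means the counted
# phase is flexion: for the curl, a small elbow angle is the UP position.
EXERCISE_FSM = {
    "squat": {"low": 90.0, "high": 160.0, "inverted": False},
    "curl":  {"low": 50.0, "high": 150.0, "inverted": True},
}

def phase_for(exercise: str, angle: float) -> str:
    """Map a joint angle to a movement phase for the given exercise."""
    cfg = EXERCISE_FSM[exercise]
    if angle < cfg["low"]:
        raw = "bottom"                 # small angle (deep squat / full flexion)
    elif angle > cfg["high"]:
        raw = "top"                    # large angle (standing / full extension)
    else:
        return "transition"            # hysteresis dead zone
    if cfg["inverted"]:
        # curl: full flexion ("bottom" by angle) is the UP phase
        return "up" if raw == "bottom" else "down"
    return "down" if raw == "bottom" else "up"
```

The same counting machinery then works for both cases; only the angle-to-phase mapping is swapped.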
The project follows a Layered Architecture with clear separation of concerns, enabling testability, extensibility (Open/Closed Principle), and adherence to Domain-Driven Design (DDD) principles.
```mermaid
%%{init: {'theme': 'neutral'}}%%
graph LR
    subgraph INFRA["Infrastructure Layer"]
        direction TB
        A["Webcam"] --> B["YOLOv8 Pose"]
        B --> C["Keypoint Extractor"]
    end
    subgraph CORE["Core Domain"]
        direction TB
        D["Geometry Engine"] --> E["FSM"]
        E --> F["Feedback"]
    end
    subgraph UI["Presentation Layer"]
        direction TB
        G["Visualizer"] --> H["Renderers"]
    end
    C ==> D
    F ==> G
    %% Layer styling - pastel fills, semantic strokes
    style INFRA fill:#f3f4f6,stroke:#6b7280,stroke-width:2px,color:#374151
    style CORE fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#1e40af
    style UI fill:#f5f3ff,stroke:#8b5cf6,stroke-width:2px,color:#5b21b6
    %% Node styling - light backgrounds, dark text
    style A fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style B fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style C fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style D fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style E fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style F fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style G fill:#ffffff,stroke:#a78bfa,stroke-width:1px,color:#4c1d95
    style H fill:#ffffff,stroke:#a78bfa,stroke-width:1px,color:#4c1d95
    %% Cross-layer links (C ==> D and F ==> G are links 5 and 6 in source order)
    linkStyle default stroke:#64748b,stroke-width:1px
    linkStyle 5 stroke:#059669,stroke-width:2px
    linkStyle 6 stroke:#059669,stroke-width:2px
```
Business logic is fully isolated from external dependencies:

- Entities (`src/core/entities/`): Domain objects following DDD: `Session`, `User`, `WorkoutState`, `UIState`.
- FSM Core (`fsm.py`): Reusable `RepetitionCounter` with debouncing, hysteresis, and support for standard/inverted logic.
- Feedback System (`feedback.py`): Aggregates form-check rules and prioritizes messages.
- Factory + Registry (`factory.py`, `registry.py`): Exercises self-register via the `@register_exercise` decorator, so there are no if/elif chains.
- Session Manager (`session_manager.py`): Orchestrates workout flow, rest periods, and set progression.
- Dependency Injection: Abstractions defined in `protocols.py` (`PoseDetector`, `KeypointExtractor`, `DatabaseManagerProtocol`) enable mock injection for CI/CD testing.
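The Factory + Registry mechanism can be sketched in a few lines. The function signatures below are illustrative, not the exact ones in `factory.py`/`registry.py`:

```python
# Minimal sketch of the Factory + Registry pattern described above.
_EXERCISE_REGISTRY: dict[str, type] = {}

def register_exercise(name: str):
    """Class decorator: adds the exercise to the registry at import time."""
    def wrap(cls: type) -> type:
        _EXERCISE_REGISTRY[name] = cls
        return cls
    return wrap

def create_exercise(name: str, **kwargs):
    """Factory lookup -- no if/elif chain; unknown names fail loudly."""
    try:
        return _EXERCISE_REGISTRY[name](**kwargs)
    except KeyError:
        raise ValueError(f"Unknown exercise: {name!r}") from None

@register_exercise("squat")
class Squat:
    """Toy exercise; thresholds here are placeholders."""
    def __init__(self, low=90.0, high=160.0):
        self.low, self.high = low, high
```

Because registration happens as a side effect of the class definition, adding a new exercise means adding one decorated module that `src/exercises/__init__.py` imports; the core never changes.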
Handles external integrations, decoupled from business logic:

- AI Inference (`ai_inference.py`): YOLO model wrapper implementing the `PoseDetector` protocol.
- Keypoint Extractor (`keypoint_extractor.py`): Transforms raw YOLO output into standardized 17×3 arrays.
- Webcam (`webcam.py`): Frame capture abstraction for easy replacement with video files or streams.
Presentation layer with separated rendering responsibilities:

- Visualizer (`visualizer.py`): Facade coordinating all renderers.
- Dashboard Renderer: Draws HUD panels (reps, sets, feedback text).
- Overlay Renderer: Transparent overlays using ROI-based alpha blending.
- Skeleton Renderer: Draws pose skeleton connections.
- Geometry Engine (`geometry.py`): Pure `math`-based vector calculations (no NumPy overhead).
- Smoothing (`smoothing.py`): One Euro Filter for jitter reduction.
- Circular Buffer: `collections.deque` for temporal smoothing (30-frame window).
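As an illustration of the pure-math approach, here is how a joint angle and a deque-backed smoothing window might look. Names and signatures are hypothetical; the real code lives in `src/utils/geometry.py` and `src/utils/smoothing.py` (which uses a One Euro Filter rather than this simpler moving average):

```python
import math
from collections import deque

def joint_angle(a, b, c):
    """Angle ABC in degrees at vertex b, from three (x, y) keypoints.

    Pure-math version (no NumPy): atan2 of each limb segment, then the
    absolute difference folded into [0, 180].
    """
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0])
        - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang) % 360.0
    return 360.0 - ang if ang > 180.0 else ang

class AngleSmoother:
    """Circular buffer via collections.deque(maxlen=...): old samples
    fall off automatically once the window is full."""
    def __init__(self, window: int = 30):
        self.buf = deque(maxlen=window)

    def push(self, value: float) -> float:
        self.buf.append(value)
        return sum(self.buf) / len(self.buf)
```

For per-frame use on 17 keypoints, plain `math` calls like these avoid the array-allocation overhead NumPy would add for such tiny inputs.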
The project implements a hybrid Edge-Cloud model: real-time inference runs locally for zero-latency feedback, while session data is persisted to the cloud asynchronously after each workout.
```mermaid
%%{init: {'theme': 'neutral'}}%%
graph LR
    subgraph EDGE["Edge (Local)"]
        direction TB
        A["Webcam"] --> B["YOLOv8 Pose"]
        B --> C["FSM + Feedback"]
        C --> D["SQLite"]
    end
    subgraph CLOUD["AWS Cloud"]
        direction TB
        E["API Gateway"] --> F["Lambda"]
        F --> G["DynamoDB"]
    end
    D ==>|"POST /sessions\n(Data Batching)"| E
    style EDGE fill:#f3f4f6,stroke:#6b7280,stroke-width:2px,color:#374151
    style CLOUD fill:#fff7ed,stroke:#f97316,stroke-width:2px,color:#9a3412
    style A fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style B fill:#ffffff,stroke:#9ca3af,stroke-width:1px,color:#1f2937
    style C fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style D fill:#ffffff,stroke:#60a5fa,stroke-width:1px,color:#1e3a8a
    style E fill:#ffffff,stroke:#fb923c,stroke-width:1px,color:#9a3412
    style F fill:#ffffff,stroke:#fb923c,stroke-width:1px,color:#9a3412
    style G fill:#ffffff,stroke:#fb923c,stroke-width:1px,color:#9a3412
    %% The edge-to-cloud link (D ==> E) is link 5 in source order
    linkStyle default stroke:#64748b,stroke-width:1px
    linkStyle 5 stroke:#f97316,stroke-width:2px
```
- Edge (Local): Real-time inference on the local PC/GPU for zero-latency feedback. Sessions are saved to SQLite.
- Cloud (AWS): After each workout, a single JSON payload (Data Batching) is sent to API Gateway (`POST /sessions`), which triggers a Lambda function that validates the data and writes it to DynamoDB.
- Security: API Key authentication via the `x-api-key` header. The IAM policy follows Least Privilege (only `dynamodb:PutItem` plus CloudWatch logs).
- Configuration: All AWS settings are loaded from `.env` via `config/settings.py`. Cloud upload is fully optional: without a `.env` file, the app works entirely offline.
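A hedged sketch of the client side of the Data Batching upload, using only the standard library. The environment variable names (`AWS_API_URL`, `AWS_API_KEY`) are assumptions for illustration, not the project's actual settings keys:

```python
import json
import os
import urllib.request

def build_session_payload(session: dict) -> bytes:
    """Serialize one finished workout session as a single JSON body.

    Data Batching: one POST per session, not one request per rep."""
    return json.dumps(session).encode("utf-8")

def upload_session(session: dict) -> int:
    """POST the batched payload to API Gateway with the API key header.

    Returns the HTTP status code. Env var names are illustrative; the
    real configuration is read from .env via config/settings.py."""
    req = urllib.request.Request(
        os.environ["AWS_API_URL"],               # e.g. the /sessions endpoint
        data=build_session_payload(session),
        headers={
            "Content-Type": "application/json",
            "x-api-key": os.environ["AWS_API_KEY"],
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

If the environment variables are absent, a wrapper can simply skip the call, which is how the offline-only mode described above stays free of cloud dependencies.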
- Test Suite (`tests/`): 150+ automated tests across 14 test files covering FSM, Geometry, SessionManager, Gesture Detection, DI mocks, exercise integration, state display, and the AWS Lambda handler.
- Verification Scripts: Manual validation tools for debouncing, i18n, and refactoring checks.
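A sketch of how Protocol-based DI enables camera-free CI tests: a scripted test double satisfies the detector protocol, so consumers can be exercised without YOLO or OpenCV. The `PoseDetector` shape shown here is illustrative; the real definition is in `src/core/protocols.py`.

```python
from typing import Protocol

class PoseDetector(Protocol):
    """Illustrative shape of the DI protocol (not the exact project API)."""
    def detect(self, frame) -> list[tuple[float, float, float]]: ...

class MockPoseDetector:
    """Test double: replays scripted keypoints instead of running YOLO."""
    def __init__(self, scripted):
        self.scripted = list(scripted)
        self.calls = 0

    def detect(self, frame):
        # replay the script, holding the last frame once exhausted
        kps = self.scripted[min(self.calls, len(self.scripted) - 1)]
        self.calls += 1
        return kps

def count_detections(detector: PoseDetector, frames) -> int:
    """Toy consumer: depends only on the protocol, never on hardware."""
    return sum(1 for f in frames if detector.detect(f))

def test_mock_injection():
    mock = MockPoseDetector([[(0.5, 0.5, 0.9)] * 17])   # one 17-keypoint pose
    assert count_detections(mock, frames=[None, None]) == 2
    assert mock.calls == 2
```

Because `Protocol` uses structural typing, `MockPoseDetector` needs no inheritance; anything with a matching `detect` method is accepted, which is what makes the mocks in `tests/mocks/` drop-in replacements.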
View Project Structure (File Tree)
```
├── .github
│   └── workflows                    # CI/CD pipeline definitions
├── assets
│   └── models
│       └── yolov8n-pose.pt          # Pre-trained YOLOv8 pose model
├── aws                              # AWS backend infrastructure
│   ├── lambda                       # Lambda function package
│   │   ├── lambda_function.py       # Session Logger (validate -> DynamoDB)
│   │   └── requirements.txt         # Lambda dependencies (boto3)
│   ├── iam-policy.json              # Least-privilege IAM policy
│   └── README.md                    # AWS deploy instructions & API docs
├── config
│   ├── settings.py                  # Global constants, thresholds, AWS config
│   └── translation_strings.py       # i18n strings (IT/EN)
├── scripts
│   ├── check_cam.py                 # Camera connectivity check
│   └── verify_refactor.py           # Post-refactor sanity checks
├── src
│   ├── core                         # Business logic (framework-agnostic)
│   │   ├── entities                 # Domain objects (DDD)
│   │   │   ├── session.py           # Workout session dataclass
│   │   │   ├── ui_state.py          # Rendering state container
│   │   │   ├── user.py              # User profile dataclass
│   │   │   └── workout_state.py     # Workout FSM states (ACTIVE/REST/FINISHED)
│   │   ├── app.py                   # Composition root & main loop
│   │   ├── config_types.py          # TypedDict definitions for configs
│   │   ├── exceptions.py            # Custom exception hierarchy (SpotterError)
│   │   ├── factory.py               # Exercise factory (creates instances)
│   │   ├── feedback.py              # Rule-based form correction engine
│   │   ├── fsm.py                   # RepetitionCounter & StaticDurationCounter
│   │   ├── gesture_detector.py      # Pose-based gesture recognition
│   │   ├── interfaces.py            # ABCs: Exercise, VideoSource, StateDisplayInfo
│   │   ├── protocols.py             # DI protocols: PoseDetector, DBManager
│   │   ├── registry.py              # @register_exercise decorator & registry
│   │   └── session_manager.py       # Set/rest/rep orchestration
│   ├── data                         # Persistence layer
│   │   ├── db_manager.py            # SQLite CRUD operations
│   │   └── schema.sql               # Database schema definition
│   ├── exercises                    # Concrete exercise implementations
│   │   ├── __init__.py              # Auto-imports for registration
│   │   ├── curl.py                  # Bicep Curl (inverted FSM)
│   │   ├── plank.py                 # Plank (static hold timer)
│   │   ├── pushup.py                # Push-Up (bilateral + form check)
│   │   └── squat.py                 # Squat (standard FSM)
│   ├── infrastructure               # External system adapters
│   │   ├── ai_inference.py          # YOLO model wrapper (PoseDetector)
│   │   ├── keypoint_extractor.py    # Raw YOLO output -> 17x3 arrays
│   │   └── webcam.py                # OpenCV camera capture (VideoSource)
│   ├── ui                           # Presentation layer
│   │   ├── cli.py                   # Interactive workout setup prompts
│   │   ├── dashboard_renderer.py    # HUD panel (reps, sets, state)
│   │   ├── overlay_renderer.py      # Full-screen REST/FINISHED overlays
│   │   ├── skeleton_renderer.py     # Pose skeleton & angle arcs
│   │   └── visualizer.py            # Renderer facade (delegates to above)
│   └── utils                        # Signal processing utilities
│       ├── geometry.py              # Pure-math angle calculations
│       ├── performance.py           # FPS counter & timing helpers
│       └── smoothing.py             # One Euro Filter for jitter reduction
├── tests                            # Automated test suite (150+ tests)
│   ├── mocks                        # Test doubles
│   │   ├── __init__.py
│   │   ├── mock_pose.py             # Fake PoseDetector for DI tests
│   │   └── mock_video.py            # Fake VideoSource for DI tests
│   ├── __init__.py
│   ├── helpers.py                   # Shared fixtures (UIState, dummy frames)
│   ├── test_app_di.py               # Dependency injection wiring tests
│   ├── test_db_manual.py            # SQLite persistence tests
│   ├── test_entities_manual.py      # Domain entity tests
│   ├── test_exercise_integration.py # End-to-end rep counting & form feedback
│   ├── test_exercises.py            # Exercise process_frame unit tests
│   ├── test_fsm.py                  # FSM state transitions & debouncing
│   ├── test_geometry.py             # Angle calculation edge cases
│   ├── test_gesture.py              # Gesture recognition tests
│   ├── test_lambda.py               # AWS Lambda handler & validation tests
│   ├── test_plank.py                # Plank lifecycle & timer tests
│   ├── test_pose_estimator.py       # PoseEstimator protocol tests
│   ├── test_session_manager.py      # Workout flow & state transitions
│   ├── test_smoothing.py            # One Euro Filter convergence tests
│   ├── test_visualizer.py           # Renderer + state display mapping tests
│   ├── verify_debouncing.py         # Manual debouncing validation
│   ├── verify_features.py           # Manual feature smoke tests
│   ├── verify_i18n.py               # Manual i18n string verification
│   └── verify_refactor.py           # Manual refactor validation
├── .env                             # AWS credentials & cloud config
├── .gitignore
├── LICENSE                          # AGPL v3
├── README.md
├── main.py                          # Application entry point
└── requirements.txt                 # Python dependencies
```
- Project Initialization
  - Architecture & Tech Stack Definition
  - Repository Structure & `.gitignore`
- Core Engineering
  - Abstract `Exercise` Class
  - YOLOv8 Integration
  - FSM & Feedback Architecture Refactoring
  - Performance Optimization (Math + ROI Visualizer)
- Exercise Logic (MVP)
  - Squat (Depth & Form)
  - Push-up (Occlusion handling)
  - Bicep Curl (Inverted Logic)
  - Plank (Static stability check)
- Cloud & DevOps
  - AWS Lambda Session Logger (`aws/lambda/lambda_function.py`)
  - DynamoDB Table
  - API Gateway HTTP API (`POST /sessions` + API Key auth)
  - IAM Least-Privilege Policy (`aws/iam-policy.json`)
  - Cloud config in `settings.py` + `.env` support
  - Lambda unit tests (`tests/test_lambda.py`, 25 tests)
  - Unit Testing Suite (`tests/`)
  - CI/CD Pipeline (GitHub Actions)