Examguard IA is a web platform that lets users closely track a person's activity in a video recording to identify fraudulent behavior along the timeline. The system tracks head position, detecting when the head is turned right, left, up, or down, or facing forward. It also detects the presence of the face and flags when it is absent. Another feature is object detection, which tracks elements such as mobile phones or computer peripherals in the video. All of this information is collected after the video is processed and is then presented in an interactive, dynamic report where users can view the specific moments at which detected events occurred, along with visual evidence and various statistics.
The AI component of Examguard IA was built using Python. For face detection, the cvlib library is used. Head position tracking is performed using the PnP (Perspective-n-Point) algorithm, supported by MediaPipe to obtain facial landmark points. The processing of these landmark points and the iteration over video frames are handled with OpenCV, which is also used for object detection. The system incorporates a custom object detection model trained with YOLOv8, using images extracted from freely available datasets like COCO, OpenDataset, and other datasets available on Roboflow. All these techniques are integrated into a single Python algorithm that processes the videos received through an API built in Flask, returning the analyzed results.
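As an illustration of the head-pose step, the Euler angles produced by the PnP solve can be bucketed into discrete head-pose events. The sketch below is not the project's actual code: the function name and the threshold values are illustrative assumptions.

```python
# Illustrative sketch: mapping pitch/yaw angles (in degrees) from the PnP
# step to discrete head-pose events. The 15-degree threshold is a
# hypothetical value, not necessarily the one Examguard IA uses.
def classify_head_pose(pitch: float, yaw: float, threshold: float = 15.0) -> str:
    """Return a HEAD_POSE_* label from Euler angles in degrees."""
    if yaw < -threshold:
        return "HEAD_POSE_LEFT"
    if yaw > threshold:
        return "HEAD_POSE_RIGHT"
    if pitch > threshold:
        return "HEAD_POSE_UP"
    if pitch < -threshold:
        return "HEAD_POSE_DOWN"
    return "HEAD_POSE_FORWARD"

print(classify_head_pose(pitch=2.0, yaw=-30.0))  # HEAD_POSE_LEFT
print(classify_head_pose(pitch=0.0, yaw=5.0))    # HEAD_POSE_FORWARD
```

In practice the angles would come from `cv2.solvePnP` applied to MediaPipe landmark points, with the thresholds tuned to the camera setup.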
The Examguard IA website is meticulously built in Java, using the Spring Boot framework. Authentication and authorization are managed with Spring Security. On the frontend, HTML, CSS, JavaScript, and Bootstrap are used, based on the free dashboard from CoreUI. When a user uploads a video to the Spring Boot platform, it connects with the Python API implemented in Flask. Communication between Flask and Spring is managed via JWT (JSON Web Tokens). The Flask API stores the video in a local file system and, if instructed, processes it second by second (not frame by frame, to optimize processing time).
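To illustrate the JWT handshake between the two services, here is a stdlib-only sketch of HS256 signing and verification. It is conceptual: the real project presumably uses dedicated JWT libraries on each side, and the function names here are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    # JWT uses unpadded base64url encoding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(payload: dict, secret: str) -> str:
    """Create an HS256 JWT (illustrative; a real app would use a JWT library)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = _b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token: str, secret: str) -> bool:
    """Check the HMAC signature of a token against a shared secret."""
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = _b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)

token = sign_jwt({"sub": "spring-boot", "exp": int(time.time()) + 300}, "my-secret")
print(verify_jwt(token, "my-secret"))  # True
print(verify_jwt(token, "wrong-key"))  # False
```

Both services share the same secret (see the `JWT_SECRET_KEY` variable in the installation steps), so Flask can reject any request that was not signed by Spring Boot.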
- Features
- Algorithm Preview
- Application
- Tools Used
- Installation
- Areas for Improvement
- Contributors
- License
- Contact Me
Features of Examguard IA
- Account Management: Users can create accounts and log in, with the option to remember the session for future access.
- Recording Management by Folders: Users can create folders to organize different recordings. For instance, a folder might represent an entire class. Each folder can have a title and description, and upon accessing it, general statistics calculated from the videos it contains are displayed, such as Overall Fraud Percentage and Total Fraudulent Events.
Important: An event is defined as any detected act at a specific second of the recording.
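Given that definition, counting fraudulent events amounts to filtering per-second detections against the event types the user marked as fraudulent. A minimal sketch, with hypothetical function and field names:

```python
# Sketch: an "event" is any detected act at a specific second, so counting
# fraudulent events filters (second, event_type) pairs against the event
# types the user selected as fraudulent. Names are illustrative.
def count_fraudulent_events(events: list[tuple[int, str]],
                            fraudulent_types: set[str]) -> int:
    """events: (second, event_type) pairs produced by the analysis."""
    return sum(1 for _, etype in events if etype in fraudulent_types)

events = [
    (10, "PHONE_DETECTED"),
    (11, "PHONE_DETECTED"),
    (40, "HEAD_POSE_FORWARD"),
    (55, "NOT_FACE_DETECTED"),
]
print(count_fraudulent_events(events, {"PHONE_DETECTED", "NOT_FACE_DETECTED"}))  # 3
```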
- Upload and Processing of Multiple Recordings: Users can upload multiple recordings simultaneously. Before processing them, they have the option to define what is considered fraudulent based on the context, selecting from the following options:
| Detected Event | Description |
| --- | --- |
| PHONE_DETECTED | Mobile phone detected |
| MOUSE_DETECTED | Mouse detected |
| KEYBOARD_DETECTED | Keyboard detected |
| MONITOR_DETECTED | Monitor detected |
| LAPTOP_DETECTED | Laptop detected |
| NOT_FACE_DETECTED | Face not detected |
| MULTIPLE_FACE_DETECTED | Multiple faces detected |
| HEAD_POSE_LEFT | Head turned left |
| HEAD_POSE_RIGHT | Head turned right |
| HEAD_POSE_UP | Head looking up |
| HEAD_POSE_DOWN | Head looking down |
| HEAD_POSE_FORWARD | Head looking forward |
| HEAD_POSE_UNKNOWN | Unknown head position |
| SAFE | Action considered safe |

- Detailed Recording Analysis: After processing, users can access an in-depth detail of each recording, where individual statistics are displayed, such as the fraud percentage, graphs of detected fraud by event type, and gaze-trend graphs.
- Interaction with the Fraudulent Events Timeline: In the recording detail, a timeline is generated that marks each detected fraudulent event. Clicking on a segment of the timeline displays a panel showing visual evidence of what was detected, including images and GIF sequences to help better understand the context.
Note: The detected fraud percentage is based on the amount of time in which events were detected relative to the total duration of the video. This calculation can be misleading: for example, if someone uses a phone for 5 seconds in a one-hour video, the fraud percentage will be extremely low.
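Under that definition, the metric reduces to simple arithmetic. A minimal sketch (function name and rounding are illustrative):

```python
# Sketch of the fraud-percentage metric described above: time with
# fraudulent events relative to the total video duration.
def fraud_percentage(fraud_seconds: int, total_seconds: int) -> float:
    if total_seconds == 0:
        return 0.0
    return fraud_seconds / total_seconds * 100

# The example from the note: 5 seconds of phone use in a one-hour video.
print(round(fraud_percentage(5, 3600), 2))  # 0.14
```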
- Association of Recordings to People: Users can associate a recording with a specific student or person and send emails with the obtained results.
This section provides an overview of how the algorithm processes a video to detect faces, estimate head rotation, and identify objects.
- Video Input: The algorithm starts by receiving a video file.
- Frame Extraction: The video is decomposed into frames, with one frame selected per second.
- Face Detection:
- Single Face: If a face is detected, the algorithm estimates the head's rotation.
- No Face: It registers the absence of a face.
- Multiple Faces: It detects and records the presence of multiple faces.
- Object Detection: Simultaneously, the YOLO model identifies and classifies objects in the frame.
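The frame-extraction step above can be sketched in plain Python: given the video's frame rate and frame count, pick the frame index at each whole second. The function name is illustrative; in the real pipeline OpenCV would seek to each index (e.g. via `cv2.CAP_PROP_POS_FRAMES`) before running detection.

```python
# Sketch of the "one frame per second" sampling: given fps and total
# frame count, compute which frame indices to analyze.
def frames_to_sample(fps: float, total_frames: int) -> list[int]:
    duration = int(total_frames / fps)  # whole seconds in the video
    return [int(second * fps) for second in range(duration)]

print(frames_to_sample(fps=30.0, total_frames=150))  # [0, 30, 60, 90, 120]
```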
The medium YOLOv8 model effectively identified objects across various conditions, showing improvements in key metrics over 30 epochs:
- Precision: Increased from 0.66988 to 0.77241.
- Recall: Improved from 0.66753 to 0.74952.
- mAP50: Rose from 0.68484 to 0.79044.
- Validation Losses: Decreased consistently, indicating strong generalization.
Using cvlib, face detection was successful, though it struggled with faces that were too close to or too far from the camera. MediaPipe provided precise, detailed landmark estimations, crucial for understanding facial geometry.
Important: Review and validate the detected events before sending the results, as the interpretation of the context may vary from case to case.
Welcome View
This initial view displays the Examguard IA branding. The main image is an owl, symbolizing the application's motto: "With Examguard, protect your assessments as an owl guards its territory."
Login/Register View
In this view, users can log in or register on the platform. There is also the option to remember the session for future logins.
Warning: Videos cannot be processed, nor can operations be performed in the application or API, without being registered and authenticated. The Flask API is for internal use by Spring Boot only.
Home View
A simple view welcoming the user, displaying their personal information and offering the option to change the password.
Folder List View
Here, the folders created by the user are listed. They appear as cards where you can see the folder name, description, number of recordings contained, and a progress bar indicating the amount of detected fraud. Additionally, you can create a new folder through a collapsible side panel.
Recording List View
This view shows all recordings, regardless of the folder they are in. It presents detailed statistics, including:
- Fraud Percentage
- Total Events
- Total Fraudulent Events
- Total Recordings
- Unprocessed Recordings
- Total Time in Recordings
- Average Time per Recording
- Processed Recordings
Student List View
In this view, students or people that can be associated with a recording are listed. Users have the option to edit or delete registered students.
Folder Detail View
This view first presents a panel with general statistics about the folder. Next, there is a drag-and-drop area for uploading or dragging recordings. Following this, reference graphs are displayed, followed by a table listing the recordings with their metadata (duration, processing status, associated student, fraud percentage, etc.). The table also includes a button panel to view the recording details, process it, or delete it. When selecting process, a modal opens allowing the user to choose which events to detect.
Recording Detail View
This is one of the most important views on the platform. It first displays the detected statistics for the recording, followed by two graphs: a bar chart showing the Amount of Detected Fraud by Event Type and a radar chart presenting Gaze Trend. Next, a panel indicates the previously selected events. Finally, the crucial section of the timeline list is displayed, where a timeline is generated for each type of detected fraudulent event. If fraud is detected in a time segment, it is marked with a different color. Clicking on that segment opens a panel showing a list of images corroborating the detection. Users can navigate through these images, between segments, and even generate GIFs with the sequence of images within a segment.
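The timeline described above can be thought of as grouping consecutive seconds that share the same detected event into segments. The sketch below is a hypothetical illustration of that grouping; the segment structure and function name are assumptions, not the project's actual data model.

```python
# Sketch: group consecutive seconds with the same detected event into
# timeline segments (start_second, end_second, event_type).
def build_segments(per_second_events: list[str]) -> list[tuple[int, int, str]]:
    segments: list[tuple[int, int, str]] = []
    for second, etype in enumerate(per_second_events):
        if segments and segments[-1][2] == etype:
            # Extend the current segment to include this second.
            start, _, _ = segments[-1]
            segments[-1] = (start, second, etype)
        else:
            segments.append((second, second, etype))
    return segments

timeline = build_segments(["SAFE", "SAFE", "PHONE_DETECTED", "PHONE_DETECTED", "SAFE"])
print(timeline)  # [(0, 1, 'SAFE'), (2, 3, 'PHONE_DETECTED'), (4, 4, 'SAFE')]
```

Each non-SAFE segment would then be rendered as a colored region of the timeline, with its evidence images attached.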
- Java: Main programming language used for the backend development of the platform.
- Spring Boot: Framework used to build the backend of the application, providing a robust and scalable architecture.
- Spring Security: Implemented for authentication and authorization within the platform, ensuring that only registered and authenticated users can access the functions.
- JWT (JSON Web Tokens): Used to manage authentication between the web application and the Flask API, ensuring secure communication.
- MongoDB GridFS: Used to store the evidence images detected in videos, allowing efficient management of large binary files.
- HTML, CSS, JS, Bootstrap: Front-end technologies used to create an interactive and responsive user interface. Bootstrap was used as the base for the dashboard design (using the free CoreUI dashboard).
- Gif Encoder by Square: Java library used for creating GIFs from sequences of images detected in the videos.
- Python: Language used to develop the video processing and artificial intelligence logic.
- Flask: Lightweight web framework used to create the REST API that processes videos and returns results to the Java application.
- Local File System: Used to temporarily store video recordings before they are processed by the Flask API.
- OpenCV: Computer vision library used to iterate over video frames and perform the necessary analysis.
- MediaPipe: Used for facial landmark estimation, allowing precise tracking of head position and other facial movements.
- YOLOv8: Object detection model trained to identify specific elements within recordings.
- SendGrid: Email service used to notify users about evaluation results.
- Docker and Docker Compose: Tools used to containerize and orchestrate the application and its services, facilitating deployment and scalability.
- Database: Configure a local or cloud MongoDB database.
- Development Environment: If not using Docker, ensure Python 3.12 and Java 17 are installed on your system.
- PyTorch Model: You will need a PyTorch model for object detection.
Important: The pre-trained model is not included due to licensing restrictions.
You can install the application in two ways: using Docker Compose or setting it up manually in an IDE or console.
Step 1: Clone the Repository
Clone the repository to your local machine.
```shell
git clone https://github.com/darvybm/examguard-ai
```

Step 2: Configure Model Classes
Before starting, you need to configure the model classes in the cheat_detector_optimized.py file. Locate the file in ModelAPI/cheat_detector_optimized.py and find the following code snippet:
```python
def __init__(self, model_path):
    self.model = YOLO(model_path, task='detect')
    self.classes = ["keyboard", "laptop", "monitor", "mouse", "phone"]
    self.mp_face_mesh = mp.solutions.face_mesh
    self.face_mesh = self.mp_face_mesh.FaceMesh(
        static_image_mode=False,
        refine_landmarks=True,
        max_num_faces=1,
        min_detection_confidence=0.5
    )
    self.lock = Lock()
    self.width = 480
    self.height = 384
```

Here, you can add or remove classes based on what your model can detect.
Warning: The model file must have the .pt extension.
Step 3: Adjust Event Types
If your model classes differ from those shown, you need to add or remove those event types in both Python and Java.
In Python (EventType):
Edit the EventType class in ModelAPI/app.py:
```python
class EventType(Enum):
    PHONE_DETECTED = "PHONE_DETECTED"
    MOUSE_DETECTED = "MOUSE_DETECTED"
    KEYBOARD_DETECTED = "KEYBOARD_DETECTED"
    LAPTOP_DETECTED = "LAPTOP_DETECTED"
    MONITOR_DETECTED = "MONITOR_DETECTED"
    NOT_FACE_DETECTED = "NOT_FACE_DETECTED"
    MULTIPLE_FACE_DETECTED = "MULTIPLE_FACE_DETECTED"
    HEAD_POSE_LEFT = "HEAD_POSE_LEFT"
    HEAD_POSE_RIGHT = "HEAD_POSE_RIGHT"
    HEAD_POSE_UP = "HEAD_POSE_UP"
    HEAD_POSE_DOWN = "HEAD_POSE_DOWN"
    HEAD_POSE_FORWARD = "HEAD_POSE_FORWARD"
    SAFE = "SAFE"
```

In Java (EventType):
Edit the EventType enum in ExamGuard/src/main/java/pucmm/eict/proyectofinal/examguard/model/enums/EventType.java:
```java
public enum EventType {
    // Objects
    PHONE_DETECTED,
    MOUSE_DETECTED,
    KEYBOARD_DETECTED,
    MONITOR_DETECTED,
    LAPTOP_DETECTED,
    // Faces
    NOT_FACE_DETECTED,
    MULTIPLE_FACE_DETECTED,
    // Head Pose Estimation
    HEAD_POSE_LEFT,
    HEAD_POSE_RIGHT,
    HEAD_POSE_UP,
    HEAD_POSE_DOWN,
    HEAD_POSE_FORWARD,
    HEAD_POSE_UNKNOWN,
    SAFE
}
```

- Create a `.env` file in the root of the project with the following environment variables:

  ```
  JWT_SECRET_KEY=<your-jwt-secret>
  JWT_EXPIRATION_TIME=<jwt-expiration-time>
  FLASK_API_URL=http://flask-api:5000/api
  SPRING_DATA_MONGODB_URI=mongodb://mongo:27017/examguard
  ```

- Start the project with Docker Compose:

  ```shell
  docker-compose up --build
  ```
- Configure a Python Virtual Environment:
  - Navigate to the `ModelAPI` folder and create a Python virtual environment.

    ```shell
    python3.12 -m venv venv
    source venv/bin/activate  # On Linux/Mac
    venv\Scripts\activate     # On Windows
    ```

  - Install dependencies from the `requirements.txt` file.

    ```shell
    pip install -r requirements.txt
    ```

- Adjust Environment Variables:
  - Configure the necessary variables in the `application.properties` file for Spring Boot (located in `ExamGuard/src/main/resources/application.properties`) and in `app.py` for Flask (located in `ModelAPI/app.py`).
Important: Add the JWT_SECRET_KEY variable in both app.py and application.properties.
- Start Services:
  - Start the Flask REST API:

    ```shell
    python ModelAPI/app.py
    ```

  - Run the Spring Boot application with Gradle:

    ```shell
    ./gradlew bootRun
    ```
- Voice-to-Text Integration: Integrate a voice-to-text system to transcribe verbal interactions during assessments, facilitating the detection of suspicious behaviors such as phone calls between peers, exchange of answers, or other unauthorized verbal communication.
- Object Detection Model Optimization: Refine the object detection model to improve accuracy, especially under varying lighting conditions or when objects are partially obscured.
- Perspective-n-Point (PnP) Algorithm Enhancement: Improve the PnP algorithm to handle faces in full profile, ensuring more precise detection of head orientations.
- Cloud File Management System Implementation: Implement a high-performance cloud file management system for data management and storage, facilitating scaling and efficient handling of large volumes of information.
| Contributor | Role |
| --- | --- |
| Darvy Betances | Code Creator |
| Pontificia Universidad Católica Madre y Maestra (PUCMM) | Educational Institution |
Acknowledgments:
I want to thank my family for their constant support throughout this process. Their backing has been essential for the development of this project.
I also thank my advisor, Máximo Medrano, for his guidance and assistance throughout the project. His knowledge and help were crucial for its success.
Additionally, I appreciate my professors for providing me with the necessary tools and knowledge, and my fellow students for their support and collaboration over the years. Their camaraderie has been very valuable to me. Thank you all!
This project is intended exclusively for educational and research purposes. It is not authorized for commercial use. All rights are reserved, and any commercial use of the code or any part of the project is prohibited.