Easy-Touch is a 2024-2025 College Student Innovation and Entrepreneurship Training Program project at Northeastern University. It aims to replace interaction methods that rely on depth sensors with a more easily deployable solution: an ordinary camera captures the laser point on the projected image, object detection and perspective transformation map the point's position to computer screen coordinates, and the system's low-level API is then called to perform click operations.
- YOLOv8-based Laser Point Detection: Uses a trained YOLO model to identify laser points in the projected image, minimizing reliance on specialized hardware.
- Automatic Projection Area Recognition: Combines OpenCV contour detection methods to extract the four corner points of the projection area for subsequent coordinate transformation.
- Perspective Transformation Coordinate Mapping: Uses perspective transformation to convert coordinates from the camera's perspective to the screen coordinate system, reducing position deviations caused by installation angles.
- Basic Interaction Simulation: Currently supports left-click, right-click, and double-click, utilizing a simple time threshold to reduce continuous false touches.
- Visual Control Panel: Provides a simple Tkinter-based GUI to pause recognition and adjust horizontal and vertical offsets.
This project mainly relies on the following core libraries:
- ultralytics (YOLOv8)
- opencv-python
- numpy
- keyboard
Installation method:
pip install -r src/requirements.txt

- Device Preparation: Project the computer screen onto a wall or curtain, connect a camera to the computer, and point it at the projection area.
- Model Preparation: Ensure that the trained laser pen recognition model weights are placed at src/model/best.pt.
- Start the System: Run the main program script.
python src/predictByCap.py
- Calibration and Operation:
- After startup, the program will automatically attempt to recognize the boundaries of the projection area.
- In the EasyTouch_V1.0 window that pops up, you can select the mouse operation you want to execute (Click / Right Click / Double Click).
- If you notice a slight deviation in the click position, you can calibrate in real time using the Horizontal Offset and Vertical Offset sliders at the bottom of the interface.
- Shine the laser pen within the projection area, and the system will respond with the corresponding mouse action in real time.
First, we collected and constructed a dataset of about 100 laser pen images.
💡 Tuning Tip: During the collection step, you can appropriately lower the camera's exposure. This can significantly highlight the relative brightness of the red laser without affecting the overall projection effect, thereby greatly improving the subsequent model's recognition accuracy.
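As a minimal sketch of how that exposure adjustment could be done with OpenCV (the device index and the exact property values are assumptions and are driver-dependent):

```python
import cv2

# Open the camera that watches the projection area (device index 0 is an assumption).
cap = cv2.VideoCapture(0)

# Many UVC cameras require disabling auto-exposure before a manual value takes effect.
# The values 0.25 and -7 are typical for some backends but may need tuning for your camera.
cap.set(cv2.CAP_PROP_AUTO_EXPOSURE, 0.25)
cap.set(cv2.CAP_PROP_EXPOSURE, -7)

ret, frame = cap.read()
if ret:
    cv2.imwrite("exposure_check.jpg", frame)  # inspect whether the laser dot stands out
cap.release()
```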
The collected images were uploaded to Roboflow for annotation and preprocessing, mainly including:
- Bounding box annotation for the laser points.
- Data augmentation such as rotation, scaling, and brightness adjustment.
- Splitting the data into training, validation, and test sets.
Figure 1: A single collection sample, with the yellow box marking the laser point position.

Figure 2: A collage of some collection samples, showing laser points against different backgrounds and positions.

Model training is based on the YOLOv8 framework provided by Ultralytics:
yolo task=detect mode=train model=yolov8n.pt data=path/to/your/data.yaml epochs=50 imgsz=640 device=0

The camera and the projector usually do not face the same plane perfectly, so the projection area seen by the camera is generally a tilted quadrilateral rather than a standard rectangle. If these coordinates are used directly to control the mouse, the deviation is quite noticeable.
To solve this problem, the program first identifies the boundary of the projection area and extracts the four corner points to serve as inputs for the subsequent perspective transformation.
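One plausible way to extract those corner points with OpenCV is sketched below. It assumes the projected screen is the largest bright quadrilateral in the frame; the thresholding strategy and the helper name `find_projection_corners` are illustrative, not the exact code in `predictByCap.py`.

```python
import cv2
import numpy as np

def find_projection_corners(frame):
    """Return the 4 corners of the projection area as a (4, 2) float32 array, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # The projected screen is usually much brighter than the surrounding wall.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    # Take the largest contour and approximate it with a polygon.
    largest = max(contours, key=cv2.contourArea)
    approx = cv2.approxPolyDP(largest, 0.02 * cv2.arcLength(largest, True), True)
    if len(approx) != 4:
        return None  # not a clean quadrilateral; retry on the next frame
    return approx.reshape(4, 2).astype(np.float32)
```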
After obtaining the four corner points, the program calculates the perspective transformation matrix using cv2.getPerspectiveTransform to map the projection area in the camera frame to the computer screen coordinate system. The purpose of doing this is to minimize the distortion effects caused by the shooting angle.
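A minimal sketch of that mapping step is shown below, assuming the corners are ordered top-left, top-right, bottom-right, bottom-left and a 1920x1080 screen; both assumptions would need to match the actual setup.

```python
import cv2
import numpy as np

SCREEN_W, SCREEN_H = 1920, 1080  # assumed screen resolution

def build_mapping(corners):
    """corners: the 4 projection-area corners in the camera frame,
    ordered top-left, top-right, bottom-right, bottom-left."""
    src = np.float32(corners)
    dst = np.float32([[0, 0], [SCREEN_W, 0], [SCREEN_W, SCREEN_H], [0, SCREEN_H]])
    return cv2.getPerspectiveTransform(src, dst)

def camera_to_screen(point, matrix):
    """Map a single (x, y) point from camera coordinates to screen coordinates."""
    pt = np.float32([[point]])  # shape (1, 1, 2), as cv2.perspectiveTransform expects
    mapped = cv2.perspectiveTransform(pt, matrix)
    return int(mapped[0, 0, 0]), int(mapped[0, 0, 1])
```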
The left side of the figure below shows the projection area from the camera's perspective, and the right side is a schematic diagram corresponding to the standard screen plane after perspective transformation.
When YOLO detects a laser point, the system calculates its center position and triggers a click, right-click, or double-click according to the current mode. To prevent a laser point from continuously triggering multiple operations in a short period, a simple debounce logic based on a time threshold is added to the program.
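The sketch below illustrates this step: running the trained model on a frame, taking the center of the most confident detection, and gating clicks with a time threshold. The 0.5 s cooldown is an assumed value, and the helper names are illustrative.

```python
import time
from ultralytics import YOLO

model = YOLO("src/model/best.pt")   # weights path described in the setup section
COOLDOWN = 0.5                      # seconds; assumed debounce threshold
last_trigger = 0.0

def detect_laser_center(frame):
    """Return the (x, y) center of the most confident laser detection, or None."""
    results = model(frame, verbose=False)[0]
    if len(results.boxes) == 0:
        return None
    box = max(results.boxes, key=lambda b: float(b.conf))
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    return (x1 + x2) / 2, (y1 + y2) / 2

def should_trigger():
    """Simple debounce: allow at most one mouse action per COOLDOWN window."""
    global last_trigger
    now = time.time()
    if now - last_trigger < COOLDOWN:
        return False
    last_trigger = now
    return True
```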
The simulation of mouse events currently relies on the Windows low-level API ctypes.windll.user32, so this interaction logic only works in Windows environments for now.
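A minimal Windows-only sketch of such a click via ctypes.windll.user32 is shown below; the flag constants are standard Win32 values, and the helper names are illustrative rather than the project's actual functions.

```python
import ctypes

user32 = ctypes.windll.user32  # Windows only

# Standard Win32 mouse_event flags
MOUSEEVENTF_LEFTDOWN  = 0x0002
MOUSEEVENTF_LEFTUP    = 0x0004
MOUSEEVENTF_RIGHTDOWN = 0x0008
MOUSEEVENTF_RIGHTUP   = 0x0010

def click_at(x, y, button="left"):
    """Move the cursor to screen position (x, y) and simulate a click."""
    user32.SetCursorPos(int(x), int(y))
    if button == "left":
        down, up = MOUSEEVENTF_LEFTDOWN, MOUSEEVENTF_LEFTUP
    else:
        down, up = MOUSEEVENTF_RIGHTDOWN, MOUSEEVENTF_RIGHTUP
    user32.mouse_event(down, 0, 0, 0, 0)
    user32.mouse_event(up, 0, 0, 0, 0)

def double_click_at(x, y):
    click_at(x, y)
    click_at(x, y)
```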
- It is still quite sensitive to ambient light and camera parameters. Moving to a different venue may require re-adjusting the exposure and offsets.
- Currently, it mainly supports click-based operations; the interaction methods are not rich enough yet.
- Kinect Depth Sensing Solution, by czaoth
- Perspective Transformation, by Arthur Wang
- Perspective Transformation, by MaWB
This project is licensed under the MIT License - see the LICENSE file for details.


