Real-time vision-based obstacle avoidance system combining YOLOv4-tiny object detection with MiDaS depth estimation for autonomous robot navigation.
This system processes camera input to detect obstacles, estimate their distances, and calculate safe navigation corridors in real-time. By combining 2D object detection with 3D depth information, it enables intelligent decision-making for autonomous mobile robots.
- Platform: NVIDIA RTX 4050 Laptop GPU (development), WSL2 Ubuntu 22.04
- Detection: YOLOv4-tiny identifies up to 80 object classes from the COCO dataset
- Depth Estimation: MiDaS small generates dense depth maps from monocular images
- Processing: Complete GPU-accelerated pipeline using CUDA
- Framework: ROS2 Humble with modular node architecture
Unlike typical approaches that sample only the center of bounding boxes, this system samples depth at three points (left, center, right) on each detected object. This enables correct handling of oblique obstacles where distance varies significantly across the object width.
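A minimal sketch of that 3-point sampling, assuming a NumPy depth map and `(x_min, y_min, x_max, y_max)` boxes; the function and parameter names are illustrative, not taken from the repository:

```python
import numpy as np

def sample_depths(depth_map: np.ndarray, box: tuple) -> dict:
    """Sample depth at the left, center, and right of a bounding box,
    along the box's vertical midline."""
    x_min, y_min, x_max, y_max = box
    row = (y_min + y_max) // 2
    # Inset the side samples slightly so they stay on the object,
    # not on the background just outside the box edge.
    inset = max(1, (x_max - x_min) // 10)
    cols = {
        "left": x_min + inset,
        "center": (x_min + x_max) // 2,
        "right": x_max - inset,
    }
    return {side: float(depth_map[row, col]) for side, col in cols.items()}

# Example: a synthetic depth map where depth increases left to right,
# as it would for an oblique wall seen at an angle.
depth = np.tile(np.linspace(1.0, 5.0, 100), (100, 1))
print(sample_depths(depth, (10, 40, 90, 60)))
```

With a single center sample, the oblique wall above would read as ~3 m everywhere; the left sample correctly reports the near edge at under 2 m.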
Seven independent ROS2 nodes communicate via dedicated topics:
- camera_node: Image acquisition and resolution publishing
- yolo_node: Object detection with GPU-accelerated NMS
- midas_node: Depth estimation and multi-point sampling
- avoidance_node: 3D geometric trajectory calculation
- servo_node: Safe corridor visualization
- save_images: Automatic result logging
- start_node: Processing cycle orchestration
| Component | Processing Time | FPS |
|---|---|---|
| YOLOv4-tiny inference | 15-18ms | 55-65 |
| NMS (GPU-accelerated) | 1-2ms | - |
| MiDaS small inference | 30-35ms | 28-33 |
| Complete pipeline | 50-60ms | 16-20 |
Optimization: GPU-accelerated NMS reduces processing time by 85% compared to CPU implementation (10-20ms → 1-2ms).
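For reference, a plain CPU implementation of the greedy NMS being accelerated here; the GPU path performs the same suppression on-device (e.g. via `torchvision.ops.nms`), avoiding the Python loop below:

```python
import numpy as np

def nms_cpu(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.45) -> list:
    """Greedy NMS over (x1, y1, x2, y2) boxes; returns kept indices."""
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # IoU between the top-scoring box and all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Drop every box that overlaps the kept one too strongly
        order = order[1:][iou <= iou_thr]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms_cpu(boxes, scores))  # → [0, 2]: the two overlapping boxes collapse to one
```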
Requirements
- ROS2 Humble (Ubuntu 22.04)
- Python 3.10+
- PyTorch 2.x with CUDA 12.4+
- CUDA-compatible NVIDIA GPU
- OpenCV 4.x
Installation
```bash
# Install ROS2 Humble
# Follow: https://docs.ros.org/en/humble/Installation.html

# Install Python dependencies
pip install torch torchvision opencv-python

# Clone repository
git clone https://github.com/andymisu/obstacle-avoidance-ros2.git
cd obstacle-avoidance-ros2

# Build ROS2 workspace
cd ros2_ws
colcon build --packages-select obstacle_avoidance
source install/setup.bash
```
Model Weights
Download YOLOv4-tiny weights (~23MB):
```bash
cd src/obstacle_avoidance/obstacle_avoidance/
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4-tiny.weights
```
The MiDaS model downloads automatically on first run via PyTorch Hub.
Launch Complete System
```bash
ros2 launch obstacle_avoidance obstacle_launch.py
```
Run Individual Nodes (debugging)
```bash
ros2 run obstacle_avoidance camera_node
ros2 run obstacle_avoidance yolo_node
ros2 run obstacle_avoidance midas_node
ros2 run obstacle_avoidance avoidance_node
```
Visualize Topics
```bash
# View detection results
ros2 topic echo /yolo/detections

# Visualize images
ros2 run rqt_image_view rqt_image_view
```
| Topic | Message Type | Description |
|---|---|---|
| /camera/image_raw | sensor_msgs/Image | Raw camera input |
| /camera/resolution | std_msgs/Int32MultiArray | Image dimensions |
| /yolo/detections | vision_msgs/Detection2DArray | Detected objects with bounding boxes |
| /yolo/image_annotated | sensor_msgs/Image | Annotated detection visualization |
| /midas/depth_detections | custom | Objects with 3-point depth measurements |
| /avoidance/steering_angle | std_msgs/Int16 | Calculated safe angle |
| /servo/corridor_image | sensor_msgs/Image | Safe corridor visualization |
MiDaS generates relative (inverse) depth, not metric distance. The system normalizes it to a 0-10 m range using:
- Known scene parameters
- Camera focal length (calculated from FOV)
- Calibration with known distances
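The focal-length step follows the standard pinhole relation f = (W/2) / tan(FOV/2). A quick sketch with example values (640 px width, 60° horizontal FOV; these are illustrative, not the project's actual calibration):

```python
import math

# Focal length in pixels from horizontal field of view (pinhole model).
width_px = 640
fov_deg = 60.0
focal_px = (width_px / 2) / math.tan(math.radians(fov_deg) / 2)
print(round(focal_px, 1))  # → 554.3
```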
Obstacle Classification
- Dangerous zone: <2m
- Warning zone: 2-4m
3D Geometric Calculation
- Convert pixel positions to angles using: θ = arctan(pixel_offset / focal_length)
- Build occupied angular intervals with adaptive safety margins
- Identify free zones between obstacles
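The steps above can be sketched as follows; the 640 px width, 554 px focal length, and 3° base margin are assumed example values:

```python
import math

def pixel_to_angle(px: float, width: int, focal_px: float) -> float:
    """Map a pixel column to a horizontal angle in degrees, 0 = straight ahead."""
    return math.degrees(math.atan((px - width / 2) / focal_px))

def occupied_interval(x_min, x_max, width, focal_px, margin_deg=3.0):
    """Angular interval blocked by one bounding box, padded by a safety margin."""
    return (pixel_to_angle(x_min, width, focal_px) - margin_deg,
            pixel_to_angle(x_max, width, focal_px) + margin_deg)

# A box on the right half of a 640 px frame blocks roughly 5°-21°.
print(occupied_interval(400, 500, 640, 554.0))
```

Free zones are then the gaps between the merged occupied intervals across all detections.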
Trajectory Selection
- Score free zones based on: width, proximity to 0° (straight), bonus for continuing current direction
- Select highest-scoring zone
- Emergency fallback: steer away from closest obstacle edge
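A sketch of the zone scoring described above; the weights and minimum-width cutoff are illustrative, not the project's tuned values:

```python
def score_zone(zone, current_angle, min_width=10.0):
    """Score a free angular zone (degrees): wider is better, nearer
    straight ahead is better, and a zone containing the current heading
    earns a continuity bonus."""
    lo, hi = zone
    width = hi - lo
    if width < min_width:
        return float("-inf")           # too narrow to enter safely
    center = (lo + hi) / 2
    score = width - 0.5 * abs(center)  # prefer wide zones near 0 deg
    if lo <= current_angle <= hi:
        score += 5.0                   # bonus for continuing current direction
    return score

zones = [(-40.0, -25.0), (-5.0, 20.0), (30.0, 38.0)]
best = max(zones, key=lambda z: score_zone(z, current_angle=0.0))
print(best)  # → (-5.0, 20.0): wide and nearly straight ahead
```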
For objects with depth variation >0.5m across width:
- Identify closest side (left or right)
- Apply asymmetric safety margins (1.5x on close side, 0.8x on far side)
- Prioritize avoidance toward safer side
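The asymmetric-margin rule can be sketched as below; the function name and the 3° base margin are illustrative, while the 0.5 m threshold and the 1.5×/0.8× factors come from the rules above:

```python
def asymmetric_margins(depth_left, depth_right,
                       base_margin_deg=3.0, oblique_thr_m=0.5):
    """Return (left_margin, right_margin) in degrees for one obstacle.

    For oblique obstacles (depth variation above the threshold), widen
    the margin on the closer side (1.5x) and shrink it on the farther
    side (0.8x); otherwise keep margins symmetric."""
    if abs(depth_left - depth_right) <= oblique_thr_m:
        return base_margin_deg, base_margin_deg
    if depth_left < depth_right:   # left side is closer
        return base_margin_deg * 1.5, base_margin_deg * 0.8
    return base_margin_deg * 0.8, base_margin_deg * 1.5

print(asymmetric_margins(1.2, 2.4))  # left side closer → wider left margin
```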
The system generates four visualization types per cycle:
- YOLO Detections: Bounding boxes with class labels and confidence scores
- MiDaS Depth Map: Color-coded depth (red=close, blue=far)
- Combined Depth+YOLO: Depth map with overlaid detections and 3-point distance measurements
- Safe Corridor: Original image with green line indicating calculated navigation direction
- Depth estimation is relative, not absolute metric
- Performance degrades with poor lighting or strong backlighting
- Resolution >2000×2000 reduces FPS below real-time threshold
- Requires GPU for acceptable performance (CPU: 4-6 FPS vs GPU: 16-20 FPS)
- Mobile robot navigation (indoor/outdoor)
- Autonomous wheelchairs for accessibility assistance
- ADAS components for autonomous vehicles
- Drone navigation in confined spaces
Technical Improvements
- Upgrade to YOLOv8/YOLOv9 for better accuracy
- Metric depth calibration using known markers
- IMU integration for odometry
- SLAM implementation for spatial memory
Performance Optimization
- TensorRT conversion for faster inference
- FP16/INT8 quantization for reduced memory
- Asynchronous pipeline with buffering
Platform Porting
- Resolve PyTorch compatibility on Jetson with JetPack 6+
- Test on NVIDIA Jetson AGX Orin
- Explore ONNX Runtime alternatives
Mihai Andrei Mărtinaș
- GitHub: @andymisu
- LinkedIn: Mihai Andrei Mărtinaș
ROS2 Humble, Python 3.10, PyTorch 2.x, CUDA 12.4, OpenCV 4.x, YOLOv4-tiny, MiDaS, WSL2