Computer Vision Algorithms Overview

This document provides a deeper look into the algorithms implemented in the ComputerVisionLab project.

📋 Table of Contents

  1. Face Detection (Haar Cascades)
  2. Canny Edge Detection
  3. Image Segmentation (Watershed)
  4. ORB Keypoint Detection
  5. Contour Analysis (Coin Counting)
  6. Image Thresholding

1. Face Detection (Haar Cascades)

Face detection uses Haar feature-based cascade classifiers. This is a machine learning approach in which a cascade function is trained on many positive images (containing faces) and negative images (without faces).

Implementation Details:

  • Classifier: haarcascade_frontalface_default.xml
  • Logic:
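    import cv2

    # FACE_DETECTOR_PATH points to haarcascade_frontalface_default.xml
    detector = cv2.CascadeClassifier(FACE_DETECTOR_PATH)
    # Returns one (x, y, w, h) bounding box per detected face
    rects = detector.detectMultiScale(
        image,
        scaleFactor=1.1,   # shrink the image by about 10% at each scale
        minNeighbors=5,    # neighboring detections needed to keep a candidate
        minSize=(30, 30),  # ignore candidates smaller than 30x30 pixels
        flags=cv2.CASCADE_SCALE_IMAGE
    )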
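
To give a sense of what each stage of the cascade evaluates, here is a minimal NumPy sketch of a single two-rectangle Haar-like feature. A Haar-like feature is the difference between pixel sums in adjacent rectangles, and an integral image lets each rectangle sum be read with four lookups. The helper names (`integral_image`, `rect_sum`, `two_rect_feature`) are hypothetical and are not part of OpenCV or the lab code; the trained classifier in the XML file combines many such features with learned thresholds.

```python
import numpy as np

def integral_image(gray):
    """Summed-area table padded with a zero first row and column."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])

def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half of a w-by-h window (w is assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# Synthetic 4x4 patch: bright left half, dark right half
patch = np.zeros((4, 4), dtype=np.uint8)
patch[:, :2] = 10

ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)  # 80: the left half sums to 8 * 10, the right half to 0
```

A large absolute response means the window contains a strong vertical light/dark transition, while a uniform patch gives a response of zero.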

2. Canny Edge Detection

The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images.

Stages:

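  1. Noise Reduction: Apply a Gaussian filter to smooth the image.
  2. Finding Intensity Gradient: Compute the gradient magnitude and direction at each pixel.
  3. Non-maximum Suppression: Thin the edges by discarding pixels that are not local maxima along the gradient direction.
  4. Hysteresis Thresholding: Decide which candidate edges are real. Pixels above the high threshold are kept, pixels below the low threshold are discarded, and pixels in between are kept only if they connect to a strong edge.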

3. Image Segmentation (Watershed)

The Watershed algorithm is a classic segmentation technique that is especially useful for separating touching objects in an image.

Workflow:

  1. Thresholding: Find approximate foreground and background.
  2. Distance Transform: Calculate the distance from each foreground pixel to the nearest background pixel.
  3. Marker Creation: Use the peaks of the distance transform as markers for the objects.
  4. Watershed: "Fill" the markers with color until they meet at the boundaries.

4. ORB Keypoint Detection

ORB (Oriented FAST and Rotated BRIEF) is a fast, robust local feature detector, first presented by Ethan Rublee et al. in 2011. It builds on the FAST keypoint detector and a modified version of the BRIEF (Binary Robust Independent Elementary Features) descriptor.

Why ORB?

  • It is an efficient alternative to SIFT and SURF: its binary descriptors can be compared with the Hamming distance.
  • It is computationally efficient and free from patent restrictions.

5. Contour Analysis (Coin Counting)

A contour is a curve joining all the continuous points along a boundary that share the same color or intensity.

Steps in the Lab:

  1. Grayscale conversion.
  2. Gaussian Blur to reduce noise.
  3. Canny edge detection.
  4. cv2.findContours() to extract object boundaries.

6. Image Thresholding

Thresholding is the simplest method of image segmentation. From a grayscale image, thresholding can be used to create binary images.

Methods in the Lab:

  • Global Thresholding: A single fixed value is used as the threshold for every pixel.
  • Adaptive Mean Thresholding: The threshold for each pixel is the mean of its neighborhood, minus a constant C.
  • Adaptive Gaussian Thresholding: The threshold for each pixel is a Gaussian-weighted sum of its neighborhood values, minus a constant C.