The Fourier Transform is just one of many face recognition methods developed over the last 25 years. Compared to the machine learning approach, it is a very simple and fast algorithm: it extracts the frequency features of a face rather than analyzing the image pattern with a convolutional network. The main idea is to find the most variant frequencies across the image database and identify faces by matching these frequencies.
The face on the left is used as input in this algorithm. The predicted face is on the right:
The formula of the Fourier Transform says that an image of size N x M can be decomposed into frequencies along two directions, u and v. u corresponds to the horizontal direction, v to the vertical direction, and x and y are the pixel coordinates along them.
Euler's formula just says that each frequency component is made of a cos wave and a sin wave, written as a complex number: the cos is the real part and the sin is the imaginary part.
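Written out (this is the standard 2D discrete Fourier Transform, which the two paragraphs above describe), the transform of an N x M image f(x, y) is:

$$F(u, v) = \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} f(x, y)\, e^{-i 2\pi \left(\frac{ux}{N} + \frac{vy}{M}\right)}$$

and Euler's formula splits each exponential into its cos (real) and sin (imaginary) parts:

$$e^{-i\theta} = \cos\theta - i\sin\theta$$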
The math may seem complicated, but the two formulas express a simple idea: an image is composed of various frequencies. Here are some examples of the Fourier Transform:
| 5 horizontal waves | 10 horizontal waves | 15 diagonal waves |
|---|---|---|
| ![]() | ![]() | ![]() |
| FFT of 5 horizontal waves | FFT of 10 horizontal waves | FFT of 15 diagonal waves |
| ![]() | ![]() | ![]() |
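The single-wave panels are easy to reproduce. Here is a minimal NumPy sketch (the post's own code is Matlab) of the "5 horizontal waves" case: a sinusoid with 5 cycles from top to bottom, whose spectrum contains exactly two bright pixels symmetric about the origin.

```python
import numpy as np

# A sinusoid with 5 full cycles in the vertical direction
# (so its crests are horizontal lines), on a 64 x 64 grid.
N = 64
y = np.arange(N)
img = np.sin(2 * np.pi * 5 * y / N)[:, None] * np.ones((1, N))

# FFT, shifted so the zero frequency (the origin) sits at the center
F = np.fft.fftshift(np.fft.fft2(img))
mag = np.abs(F)

# Find the bright pixels: there should be exactly two,
# at offsets (+5, 0) and (-5, 0) from the origin.
peaks = np.argwhere(mag > mag.max() / 2)
print(peaks - N // 2)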
The images above show that after the Fourier Transform, each frequency becomes two white pixels, symmetric about the origin (0, 0). Higher frequencies lie farther from the origin, and their direction aligns with the wave direction. Overlaying the three images above and taking the Fourier Transform of the result, we get this:
| Three waves combined | FFT of three waves combined |
|---|---|
| ![]() | ![]() |
Take a face image and pad it to 128 x 128 pixels. The Fourier Transform of the image is on the right:
| Input Sample Image | FFT of Sample |
|---|---|
| ![]() | ![]() |
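The padding step can be sketched in NumPy as follows. The face array here is a random stand-in for a real grayscale image, and the 112 x 92 size and top-left placement are assumptions for illustration:

```python
import numpy as np

# Stand-in for a grayscale face image smaller than 128 x 128
# (112 x 92 is a common face-database size, used here as an example).
rng = np.random.default_rng(0)
face = rng.random((112, 92))

# Zero-pad the image into a 128 x 128 frame (top-left placement assumed)
padded = np.zeros((128, 128))
padded[:face.shape[0], :face.shape[1]] = face

# Fourier Transform of the padded image, with the origin centered
F = np.fft.fftshift(np.fft.fft2(padded))
print(F.shape)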
However, this alone is not very helpful: we don't know which frequencies in this image differ most from the rest. We need to apply the Fourier Transform to the whole image database and find the most variant frequencies.
| The variance of frequency in the image database |
|---|
| ![]() |
The plot above shows the variance of each frequency throughout the whole image database. If we decompose this variance plot into its real and imaginary parts, we get the two plots below. (The real part is the cos wave and the imaginary part is the sin wave, as explained above.) We can see something very interesting: two bright diamonds around the origin! Looking closely, the brightest spots are the most variant frequencies, the ones that distinguish the facial features among all the images. They are the feature frequencies!
| real feature frequency | imaginary feature frequency |
|---|---|
| ![]() | ![]() |
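The variance maps can be computed per frequency across the whole stack of spectra. A NumPy sketch, assuming a stack of K padded 128 x 128 images (random stand-ins here), with the maps normalized so fixed thresholds are meaningful:

```python
import numpy as np

# Stand-in database: K padded 128 x 128 grayscale images
rng = np.random.default_rng(1)
K = 10
images = rng.random((K, 128, 128))

# FFT every image, origin centered
spectra = np.fft.fftshift(np.fft.fft2(images), axes=(-2, -1))

# Variance of each frequency coefficient across the database,
# separately for the real (cos) and imaginary (sin) parts
real_var_map = np.var(spectra.real, axis=0)
imag_var_map = np.var(spectra.imag, axis=0)

# Normalize to [0, 1] so fixed thresholds (e.g. 0.7 / 0.05) make sense
real_var_map /= real_var_map.max()
imag_var_map /= imag_var_map.max()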
By setting a threshold for both the real and the imaginary part, we can filter out those feature frequencies:
```matlab
% locate the big variance in both parts of the image data
big_variance_real = real_var_map > 0.7;
big_variance_imag = imag_var_map > 0.05;
```

Plot one quarter of the "diamond" (the lower-right quadrant); a value of 1 marks a most-variant frequency throughout the image database. We can use the two matrices as masks to filter out the feature frequencies.
| Location of real feature frequency | Location of imaginary feature frequency |
|---|---|
After deleting these feature frequencies, the input sample loses many details:
| original input | feature frequencies deleted |
|---|---|
| ![]() | ![]() |
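The "deletion" experiment amounts to zeroing the masked coefficients and inverting the transform. A NumPy sketch, where the image and the mask are made-up stand-ins (the real mask comes from the variance thresholds above):

```python
import numpy as np

# Stand-in input image and a hypothetical mask of feature frequencies
rng = np.random.default_rng(2)
img = rng.random((128, 128))
big_variance_real = np.zeros((128, 128), dtype=bool)
big_variance_real[60:68, 60:68] = True   # made-up region near the origin

# Delete the feature frequencies and reconstruct the image
F = np.fft.fftshift(np.fft.fft2(img))
F[big_variance_real] = 0
back = np.fft.ifft2(np.fft.ifftshift(F)).real

# The mean absolute difference measures the detail lost by the deletion
print(np.abs(img - back).mean())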
The algorithm matches only the frequencies at these positions. The distances between the feature vector of the input face and those of all faces in the image database are calculated; the smallest distance gives the closest match.
```matlab
function x = feature_distance(A, B)
    % The L1 distance between the feature vectors
    M = abs(A - B);
    x = sum(sum(M));
end
```

The Matlab code takes a sample face as input, finds the closest distance in the feature space, and outputs that face from the image database. I found that the real-part frequencies matter more than the imaginary ones; matching on them gives above 95% accuracy. So I tried rotating and flipping the input image and checked the accuracy again.
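The matching loop can be sketched in NumPy as follows. `features_db` and `query` are random stand-ins for the masked FFT coefficients of the database faces and the input face; `feature_distance` mirrors the Matlab function above (summed absolute difference, i.e. the L1 distance):

```python
import numpy as np

def feature_distance(A, B):
    # Same as the Matlab version: sum of absolute differences
    return np.abs(A - B).sum()

# Stand-in feature database: 40 faces, 200 feature frequencies each
rng = np.random.default_rng(3)
features_db = rng.random((40, 200))

# Query: a slightly noisy copy of face #17
query = features_db[17] + 0.01 * rng.random(200)

# Nearest neighbor in feature space is the predicted identity
distances = [feature_distance(query, f) for f in features_db]
best = int(np.argmin(distances))
print(best)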
| original input # 17 | # 17 flipped | # 17 rotated 90° |
|---|---|---|
| ![]() | ![]() | ![]() |
| output: real 17, imag 17 | output: real 17, imag 9 | output: real 37, imag 29 |
| avg acc > 95% | avg acc > 90% | avg acc < 10% |
This algorithm does not seem very robust to rotations, but I doubt people would rotate their head 90 degrees when taking pictures. ¯\\\_(ツ)\_/¯