This repository is a collection of classical IQA methods, long forgotten in favour of deep learning-based IQA methods. I am doing this because many papers cite classical methods when evaluating new IQA methods, yet many of those classical methods are poorly documented (GM-LOG, SSEQ, CORNIA, LFA, HOSA...). If all of them were implemented in a single package, it would be much easier to try them.
This is my implementation of the SSEQ index. The full details of SSEQ can be found in the paper: No-reference image quality assessment based on spatial and spectral entropies (Liu et al.). The original MATLAB implementation is here.
I wasn't able to find a complete Python implementation of this index, so I decided to use Aca4peop's code as a starting point and then add my own modifications. The main highlight of this version is the vectorized computation of patch spatial entropy and of the DCT for spectral entropy (more info here).
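To illustrate the vectorized idea, here is a minimal sketch of block-wise spectral entropy in the SSEQ spirit: the image is split into non-overlapping blocks with a single reshape, and the 2-D DCT is applied to all blocks at once. Function and parameter names are my own, and for brevity this keeps the DC coefficient that the paper excludes.

```python
import numpy as np
from scipy.fftpack import dct

def patch_spectral_entropy(gray, block=8):
    """Hypothetical sketch of SSEQ-style block spectral entropy (names are mine).

    `gray` is a 2-D float array. The DC term is kept here for brevity, although
    the SSEQ paper excludes it.
    """
    h, w = gray.shape
    h, w = h - h % block, w - w % block
    # Split the image into non-overlapping block x block tiles in one reshape
    blocks = gray[:h, :w].reshape(h // block, block, w // block, block).swapaxes(1, 2)
    # 2-D DCT over the last two axes, vectorized across all blocks at once
    coeffs = dct(dct(blocks, axis=-1, norm='ortho'), axis=-2, norm='ortho')
    # Turn squared coefficients into a per-block probability map
    power = coeffs ** 2
    p = power / (power.sum(axis=(-2, -1), keepdims=True) + 1e-12)
    # Shannon entropy of each block's spectral distribution
    return -(p * np.log2(p + 1e-12)).sum(axis=(-2, -1))

ent = patch_spectral_entropy(np.random.rand(64, 64), block=8)
print(ent.shape)  # one entropy value per 8x8 block -> (8, 8)
```

The same reshape trick works for spatial entropy: histogram each block's pixel values instead of its DCT coefficients.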
This measure was proposed in Blind Image Quality Assessment Using Joint Statistics of Gradient Magnitude and Laplacian Features (Xue et al., 2014). The authors shared a MATLAB implementation that I used as a starting point.
This one was tougher to implement. The authors shared their MATLAB implementation, but the code was not very well documented and the paper doesn't help either (the explanation of the key features doesn't go very deep, and the features are computed in a different order than in their MATLAB code!).
In fact, I think their work is flawed because they trained their models on LIVEIQA and tested them on TID2013... And both datasets have some images that are identical. But that's another story and clearly out of the scope of this repository.
This measure was proposed in Local Feature Aggregation for Blind Image Quality Assessment (Xu et al. 2015), and it was the precursor of other measures (like HOSA). There are some things to consider:
- Using 16-bit precision whenever possible: The construction of the visual codebook is memory-hungry, and probably not intended to be done on a laptop. Each local feature corresponds to a BxB patch, which results in (HxW)/(BxB) patches, and that can take a lot of RAM if you are using a large image dataset. For example, if we resized the images from KonIQ-10k to 512x382 and used 7x7 patches, each image would produce 3992 local features, which would result in more than 39M features for the whole dataset! Just imagine if we used the original image size...
- Image resizing: Related to my first point, it's not clear whether the images undergo any resizing or if LFA was designed to work for all image sizes. To make it comparable to other IQA measures in this repository, I'm resizing the images to 512px prior to feature extraction.
- Mini-batch K-means: If we use larger IQA datasets, K-means will be very slow, so I decided to try this variation (available in scikit-learn) to see if it can speed up the learning phase.
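The two memory/speed tricks above can be sketched together: local features stored as float16 (halving the footprint of float32) and the codebook learned with scikit-learn's MiniBatchKMeans, which fits on small random batches instead of the full feature matrix. Shapes and parameters below are illustrative, not the repo's actual settings.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Illustrative stand-in for the pooled local features of a dataset:
# e.g. flattened 7x7 patches (49 dims) stored in half precision.
rng = np.random.default_rng(0)
features = rng.standard_normal((20000, 49)).astype(np.float16)
print(features.nbytes / 1024, "KiB")  # half of what float32 would take

# Mini-batch K-means learns the visual codebook from small random batches.
# scikit-learn upcasts the input anyway, so we cast explicitly at fit time.
codebook = MiniBatchKMeans(n_clusters=100, batch_size=1024, n_init=3, random_state=0)
codebook.fit(features.astype(np.float32))
print(codebook.cluster_centers_.shape)  # (100, 49)
```

The float16 storage only pays off while the features sit in RAM; they are upcast for the actual clustering.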
I implemented HOSA according to Blind Image Quality Assessment Based on High Order Statistics Aggregation (Xu et al.). As HOSA follows a similar approach (generating a codebook and computing statistics from each cluster's assignments), I've also implemented it with the same tricks I used for LFA.
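The "high order statistics" part of HOSA can be sketched like this: for each cluster, pool the dimension-wise mean, variance and skewness of the features assigned to it. This is a deliberately simplified stand-in (the real method uses soft assignment to the nearest codewords and residuals against each codeword's own statistics); the function name is mine.

```python
import numpy as np

def per_cluster_high_order_stats(features, assignments, k):
    """Simplified sketch of HOSA-style pooling (not the full method):
    per-cluster mean, variance and skewness of the assigned local features.
    """
    d = features.shape[1]
    stats = np.zeros((k, 3 * d))
    for c in range(k):
        x = features[assignments == c]
        if len(x) == 0:
            continue  # empty cluster: leave zeros
        mu = x.mean(axis=0)
        var = x.var(axis=0)
        centred = x - mu
        # Dimension-wise skewness (third standardized moment)
        skew = (centred ** 3).mean(axis=0) / (var ** 1.5 + 1e-12)
        stats[c] = np.concatenate([mu, var, skew])
    return stats.ravel()  # final image descriptor: k * 3d values

rng = np.random.default_rng(0)
f = rng.standard_normal((500, 8))
desc = per_cluster_high_order_stats(f, rng.integers(0, 10, 500), k=10)
print(desc.shape)  # (240,)
```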
CORNIA is a very famous no-reference IQA measure that also makes use of visual codebooks. It was presented in Unsupervised Feature Learning Framework for No-reference Image Quality Assessment (Ye et al., 2012) and, after reading the other papers more carefully, I discovered that CORNIA was the starting point for many codebook-based methods (SOM, LFA, HOSA...).
Ye and Doermann (2011), the same authors of CORNIA, were the first ones to propose the use of visual codebooks for image quality assessment. Their first approach, before CORNIA, consisted of using Gabor-filter-based features to generate visual codewords, which I've reproduced thanks to Scikit-image's Gabor filters. The reason I don't use OpenCV's Gabor filter is that it requires specifying the kernel size, and the authors never specified that parameter. After some searching, it looks like MATLAB's implementation of Gabor filters determines the kernel size automatically, which makes it very similar to Scikit-image's implementation.
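For context, a small Gabor filter bank with scikit-image looks like this: `skimage.filters.gabor` derives the kernel extent from the frequency and bandwidth, with no explicit kernel-size argument (unlike OpenCV's getGaborKernel). The frequencies and orientations below are illustrative, not the paper's.

```python
import numpy as np
from skimage.filters import gabor

# Illustrative filter bank: 3 frequencies x 4 orientations (not the paper's values).
image = np.random.rand(64, 64)
responses = []
for frequency in (0.1, 0.2, 0.4):
    for theta in np.arange(4) * np.pi / 4:
        # gabor() returns the real and imaginary filter responses
        real, imag = gabor(image, frequency=frequency, theta=theta)
        responses.append(np.sqrt(real ** 2 + imag ** 2))  # response magnitude

features = np.stack(responses)  # one response map per (frequency, orientation)
print(features.shape)  # (12, 64, 64)
```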
This one was presented in SOM: Semantic Obviousness Metric for Image Quality Assessment (Zhang et al., 2015). This measure uses the BING model for saliency detection. You can find the BING model files in this repo.
Unfortunately, the paper did not provide enough details for the implementation, such as the computing requirements or any link to their code. I had to make some assumptions and tweak some parameters to make it less memory-hungry, and also use 16-bit precision and mini-batch K-means.
Moreover, the OpenCV implementation of the BING object-like detector doesn't seem to work as expected either, as it is much slower than what the authors of BING reported (300fps). After a quick search, this appears to have happened to other people in the past, and the OpenCV documentation for these saliency models is useless and almost non-existent. The objectness scores don't seem to work either, and it has been reported to the opencv-contrib team for a long, long time.
I have simply taken the BRISQUE package by rehangua and wrapped it to make it work like all my other models. This makes it possible to generate BRISQUE features for new datasets and train custom regression models. This measure was proposed in No-Reference Image Quality Assessment in the Spatial Domain (Mittal et al., 2012).
This one was easy, as I only had to create a wrapper for Scikit-image's measure.blur_effect. I'm adding this one to the repo just to have more IQA methods available in a single place. It was proposed in The blur effect: perception and estimation with a new no-reference perceptual blur metric (Crete et al., 2007).
In Analysis of focus measure operators in shape-from-focus (Pertuz et al., 2012) there are a lot of focus measures that could be easy to implement. Although they're probably not very good for IQA, they can still be used to compute some interesting measures.
- Brenner's focus measure (MIS2), proposed in An automated microscope for cytologic research a preliminary evaluation (Brenner et al., 1976).
- Image contrast (MIS3 in the paper), which was originally proposed in Practical calibrations for a real-time digital omnidirectional camera (Nanda and Cutler, 2001). I've called it NandaCutlerContrast.
- Helmli and Scherer's mean method for contrast (MIS5), proposed in Adaptive shape from focus with an error estimation in light microscopy (Helmli and Scherer, 2001). I've called it MeanMethodFocus.
- Variance of Laplacian (LAP4). I used the same method as in this Pyimagesearch post. The original measure was proposed in Diatom autofocusing in brightfield microscopy: a comparative study (Pacheco et al., 2000).
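The variance-of-Laplacian measure, for instance, is a one-liner. The Pyimagesearch post uses cv2.Laplacian; the sketch below swaps in `scipy.ndimage.laplace` as an equivalent 3x3 stand-in, so treat it as an approximation rather than the repo's exact code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

def variance_of_laplacian(gray):
    """LAP4-style focus measure: variance of the Laplacian response.
    (cv2.Laplacian in the Pyimagesearch post; scipy.ndimage.laplace here.)
    """
    return laplace(gray.astype(np.float64)).var()

# Blurring suppresses high frequencies, so the score drops on a blurred copy.
rng = np.random.default_rng(0)
img = rng.random((128, 128))
print(variance_of_laplacian(img) > variance_of_laplacian(gaussian_filter(img, 2)))  # True
```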
This is the method proposed in Sharpness Estimation for Document and Scene Images (Kumar et al., 2012). There is already a Python implementation of this metric (here), but I preferred to reimplement it to match the style of my other models. I've called it DeltaDifferences. I've also decided to make some contributions:
- Normalising the score instead of working in the [0, $\sqrt{2}$] range.
- Including the MaxDoM-based sharpness proposed by the authors in their patent (i.e. taking the max sharpness value at each window).
I wasn't able to find an "official" sharpness threshold, so I've used the one from umang-singhal's repo. But it can easily be changed by the user depending on the image to analyse.
For all these models I'm following the same approach: splitting every dataset into a training and a test set. I use the training sets with K-fold cross-validation to get the best parameters for each regression model. As of today, it's possible to fit an SVR or an MLP.
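The training protocol above can be sketched with scikit-learn: hold out a test split, then run K-fold cross-validation over the SVR hyper-parameters on the training split only. The feature matrix, target and parameter grid below are placeholders, not the repo's actual configuration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Placeholder data: one feature vector per image, with a stand-in for MOS scores.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 36))
y = X[:, 0] * 2 + rng.normal(0, 0.1, 200)

# Hold out a test set first; the grid search never sees it.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

grid = GridSearchCV(
    make_pipeline(StandardScaler(), SVR()),
    param_grid={'svr__C': [1, 10, 100], 'svr__gamma': ['scale', 0.01]},
    cv=5,  # 5-fold cross-validation on the training split
)
grid.fit(X_tr, y_tr)
print(grid.best_params_, grid.score(X_te, y_te))  # R^2 on the held-out split
```

Swapping SVR for MLPRegressor (and its own grid) follows the same pattern.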
I'm using the following datasets:
You can find some results here.