Skip to content

Commit aa5fd89

Browse files
hjmjohnsonclaude
andcommitted
ENH: Add StructuralSimilarityImageFilter for SSIM image quality
Implements the Structural Similarity Index Measure as an N-dimensional, multi-threaded ITK filter in the ImageCompare module. Closes #6030, supersedes #6031. == Algorithm and reference sources == The implementation follows the canonical SSIM formulation of Wang, Bovik, Sheikh, and Simoncelli, "Image Quality Assessment: From Error Visibility to Structural Similarity," IEEE Trans. Image Processing 13(4), 2004. The reference materials consulted while building this filter were: - Wang et al. 2004 paper (full text): https://www.cns.nyu.edu/pub/eero/wang03-reprint.pdf - Wang et al. SSIM MATLAB reference (utlive/ssim/ssim_index.m): https://github.com/utlive/ssim/blob/main/ssim_index.m - scikit-image v0.25 structural_similarity implementation: https://github.com/scikit-image/scikit-image/blob/v0.25.0/skimage/metrics/_structural_similarity.py - Wang, Simoncelli, Bovik, "Multi-Scale Structural Similarity for Image Quality Assessment," Asilomar 2003 (MS-SSIM, future extension): https://www.cns.nyu.edu/pub/eero/wang03b.pdf - Wikipedia SSIM article (formulas, defaults): https://en.wikipedia.org/wiki/Structural_similarity_index_measure For two images x and y the filter computes local statistics by convolving with a discrete Gaussian (sigma=1.5, 11x11 default, matching Wang et al. and the default of skimage.metrics.structural_similarity): mu_x = G_sigma * x mu_y = G_sigma * y var_x = G_sigma * (x*x) - mu_x^2 var_y = G_sigma * (y*y) - mu_y^2 cov = G_sigma * (x*y) - mu_x * mu_y The three SSIM components are l(x,y) = (2*mu_x*mu_y + C1) / (mu_x^2 + mu_y^2 + C1) c(x,y) = (2*sigma_x*sigma_y + C2) / (var_x + var_y + C2) s(x,y) = (cov + C3) / (sigma_x*sigma_y + C3) with C1 = (K1*L)^2, C2 = (K2*L)^2, C3 = C2/2, K1=0.01, K2=0.03, L the dynamic range. The combined SSIM is l^alpha * c^beta * s^gamma. When alpha=beta=gamma=1 (the default), the filter takes the simplified product fast path SSIM = (2*mu_x*mu_y + C1)*(2*cov + C2) / ((mu_x^2 + mu_y^2 + C1)*(var_x + var_y + C2)) which is what Wang et al.'s reference distributes and what skimage uses by default. == Filter architecture == The filter is structured as a composite ImageToImageFilter: 1. Internal sub-pipeline of five DiscreteGaussianImageFilter passes (mu_x, mu_y, mu_xx, mu_yy, mu_xy) reusing ITK's well-optimized, multi-threaded smoothing. 2. A parallelized per-pixel combination via MultiThreaderBase::ParallelizeImageRegion that reads from the five smoothed buffers and writes the SSIM map. 3. The mean SSIM is accumulated only over the interior region (cropped by half the Gaussian kernel width), matching scikit-image and the MATLAB reference, since pixels within the kernel half-width use boundary-extended values inside the convolution and are less reliable. Inputs: two images of the same template type and identical region. Outputs: - a per-pixel SSIM map (TOutputImage, default Image<float, D>) - GetMeanSSIM() returns the scalar after Update() Configurable runtime parameters (covering all of issue #6030): - GaussianSigma (default 1.5) - MaximumKernelWidth (default 11) - K1, K2 stability constants (defaults 0.01, 0.03) - DynamicRange L (NumericTraits-derived: 1.0 for float/double, 255 for uchar, 65535 for ushort, ...) - LuminanceExponent alpha (default 1.0) - ContrastExponent beta (default 1.0) - StructureExponent gamma (default 1.0) - ScaleWeights array (default {1.0} -> single-scale). Multi-element arrays reserve API space for a future MS-SSIM extension and currently throw a not-yet-implemented exception in BeforeGenerate. == Test strategy == The filter ships with 30 GoogleTest assertions in itkStructuralSimilarityImageFilterGTest.cxx, organized into four classes of expected values that decouple correctness checks from sensitivity to the discrete-Gaussian implementation: Class 1 -- mathematical identities (kernel-independent, tolerance 1e-9) SSIM(x, x) = 1 exactly for any image (constant, random, gradient). SSIM is symmetric: SSIM(a,b) == SSIM(b,a). Class 2 -- closed-form analytic checks for constant inputs Constant inputs make all variances and the covariance vanish, so SSIM(constant_a, constant_b) = (2*a*b + C1) / (a^2 + b^2 + C1). Tested at (100, 150) -> 0.9230923 and the textbook (0, 255) -> 0.0000999900. Verified pixel-wise across the output map (every map element matches the closed form). Class 3 -- input validation (exception tests) Mismatched input sizes, missing inputs, non-positive sigma, non-positive dynamic range, empty ScaleWeights, multi-element ScaleWeights (MS-SSIM not yet implemented). Class 4 -- qualitative properties Result range bounded in [-1, 1]. Monotonic decay of mean SSIM as additive Gaussian noise grows (sigma = 2 -> 8 -> 24). Strong anti-correlation for negated images (SSIM(x, 255-x) < -0.5). Class 5 -- cross-checks against scikit-image (loose tolerance 5e-3) Two reference values were computed offline against skimage.metrics.structural_similarity with gaussian_weights=True, sigma=1.5, use_sample_covariance=False, data_range=255, win_size=11 (i.e. the canonical Wang configuration): gradient + 30 luminance shift -> 0.9676912545 gradient * 0.5 contrast -> 0.7550069937 The 5e-3 tolerance absorbs minor discretization differences between ITK's GaussianOperator and scipy's sampled Gaussian (the two libraries do not produce bit-identical Gaussian kernels). Class 6 -- code-path equivalence The simplified-product fast path (alpha=beta=gamma=1) and the general l^alpha*c^beta*s^gamma path are exercised on the same inputs and required to agree to 1e-6. Class 7 -- multi-dimensional and pixel-type coverage 3D and 4D variants of the identity and constant-input tests. Default DynamicRange is correct for unsigned char (255), unsigned short (65535), and float (1.0). The reference values for Class 5 were generated with the following script (commit history; not shipped): import numpy as np from skimage.metrics import structural_similarity as ssim def ref(x, y, L=255.0): return ssim(x, y, gaussian_weights=True, sigma=1.5, use_sample_covariance=False, data_range=L, win_size=11, K1=0.01, K2=0.03) == Results == Local build with GCC 13.3 / Ninja / Release on Ubuntu 24.04: $ cmake --build build-ssim -j48 --target ITKImageCompareGTestDriver [4/4] Linking CXX executable bin/ITKImageCompareGTestDriver $ ./bin/ITKImageCompareGTestDriver \\ --gtest_filter='StructuralSimilarityImageFilter.*' [==========] 30 tests from 1 test suite ran. (60 ms total) [ PASSED ] 30 tests. $ ctest -R 'StructuralSimilarity' --output-on-failure 100% tests passed, 0 tests failed out of 30 Total Test time (real) = 0.30 sec ITKImageCompareHeaderTest1 (the auto-generated header self-test) and the existing ITKImageCompareTestDriver also link cleanly with the new module dependencies. pre-commit (gersemi, clang-format, kw-pre-commit, etc.) reports all checks Passed on every touched file. == Module dependency changes == ITKImageCompare/itk-module.cmake gains: - ITKSmoothing as a COMPILE_DEPENDS (for DiscreteGaussianImageFilter) - ITKGoogleTest as a TEST_DEPENDS (for the new GTest driver) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 8483b54 commit aa5fd89

6 files changed

Lines changed: 1392 additions & 0 deletions

File tree

Lines changed: 299 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,299 @@
1+
/*=========================================================================
2+
*
3+
* Copyright NumFOCUS
4+
*
5+
* Licensed under the Apache License, Version 2.0 (the "License");
6+
* you may not use this file except in compliance with the License.
7+
* You may obtain a copy of the License at
8+
*
9+
* https://www.apache.org/licenses/LICENSE-2.0.txt
10+
*
11+
* Unless required by applicable law or agreed to in writing, software
12+
* distributed under the License is distributed on an "AS IS" BASIS,
13+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
* See the License for the specific language governing permissions and
15+
* limitations under the License.
16+
*
17+
*=========================================================================*/
18+
#ifndef itkStructuralSimilarityImageFilter_h
19+
#define itkStructuralSimilarityImageFilter_h
20+
21+
#include "itkArray.h"
22+
#include "itkImage.h"
23+
#include "itkImageToImageFilter.h"
24+
#include "itkNumericTraits.h"
25+
26+
#include <type_traits>
27+
28+
namespace itk
29+
{
30+
/**
31+
* \class StructuralSimilarityImageFilter
32+
* \brief Computes the Structural Similarity Index Measure (SSIM) between two images.
33+
*
34+
* This filter computes the Structural Similarity Index Measure
35+
* \cite wang2004image between two input images of identical geometry. The
36+
* output image stores the per-pixel SSIM map. The scalar mean SSIM over the
37+
* valid (non-boundary) region is available via GetMeanSSIM() after Update().
38+
*
39+
* \par Algorithm
40+
* For two images \f$x\f$ and \f$y\f$, local statistics are computed by
41+
* convolving with a discrete Gaussian kernel of standard deviation
42+
* \f$\sigma\f$ (default 1.5):
43+
* \f[
44+
* \mu_x = G_\sigma * x, \quad \mu_y = G_\sigma * y,
45+
* \f]
46+
* \f[
47+
* \sigma_x^2 = G_\sigma * x^2 - \mu_x^2,\quad
48+
* \sigma_y^2 = G_\sigma * y^2 - \mu_y^2,\quad
49+
* \sigma_{xy} = G_\sigma * (xy) - \mu_x \mu_y .
50+
* \f]
51+
*
52+
* The three SSIM components are
53+
* \f[
54+
* l(x,y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \qquad
55+
* c(x,y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \qquad
56+
* s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}
57+
* \f]
58+
* with \f$C_1 = (K_1 L)^2\f$, \f$C_2 = (K_2 L)^2\f$, \f$C_3 = C_2/2\f$,
59+
* and \f$L\f$ the dynamic range of the pixel values.
60+
*
61+
* The combined SSIM is
62+
* \f[
63+
* \mathrm{SSIM}(x,y) = [l(x,y)]^{\alpha}\,[c(x,y)]^{\beta}\,[s(x,y)]^{\gamma}.
64+
* \f]
65+
*
66+
* With the default exponents \f$\alpha = \beta = \gamma = 1\f$ and the
67+
* convention \f$C_3 = C_2/2\f$, this collapses to the simplified form
68+
* \f[
69+
* \mathrm{SSIM}(x,y) =
70+
* \frac{(2\mu_x\mu_y + C_1)\,(2\sigma_{xy} + C_2)}
71+
* {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
72+
* \f]
73+
* which matches the reference implementation distributed by Wang et al.
74+
* and the default behavior of \c skimage.metrics.structural_similarity .
75+
*
76+
* \par Properties
77+
* - For identical images, the per-pixel SSIM is exactly 1 and the mean SSIM
78+
* is exactly 1 (subject to floating-point precision).
79+
* - The SSIM index is symmetric: \f$\mathrm{SSIM}(x,y) = \mathrm{SSIM}(y,x)\f$.
80+
* - The SSIM index is bounded above by 1. In typical cases it is
81+
* non-negative; values can be slightly negative for anti-correlated
82+
* regions.
83+
*
84+
* \par Parameters
85+
* - \c GaussianSigma: standard deviation of the Gaussian window
86+
* (default 1.5, matching Wang et al.).
87+
* - \c MaximumKernelWidth: hard limit on the discrete Gaussian kernel width
88+
* (default 11, matching the canonical 11x11 window).
89+
* - \c K1, \c K2: stability constants (defaults 0.01 and 0.03).
90+
* - \c DynamicRange: \f$L\f$ in the formulas above; defaults to the dynamic
91+
* range of the input pixel type via NumericTraits (e.g. 255 for
92+
* \c unsigned char, 1.0 for \c float / \c double). For arbitrary
93+
* floating-point images, set this explicitly to the actual data range.
94+
* - \c LuminanceExponent (\f$\alpha\f$), \c ContrastExponent (\f$\beta\f$),
95+
* \c StructureExponent (\f$\gamma\f$): defaults all 1.0.
96+
* - \c ScaleWeights: array of per-scale weights for multi-scale SSIM
97+
* (MS-SSIM, \cite wang2003multiscale). When the array contains a single
98+
* element (the default), the filter computes ordinary single-scale SSIM.
99+
* Multi-scale evaluation with more than one scale is not yet implemented
100+
* and will raise an exception in BeforeGenerate.
101+
*
102+
* The filter is N-dimensional, multi-threaded, and templated over the input
103+
* and output image types. The output pixel type defaults to \c float.
104+
*
105+
* \sa SimilarityIndexImageFilter
106+
* \sa DiscreteGaussianImageFilter
107+
*
108+
* \ingroup MultiThreaded
109+
* \ingroup ITKImageCompare
110+
*/
111+
template <typename TInputImage, typename TOutputImage = Image<float, TInputImage::ImageDimension>>
112+
class ITK_TEMPLATE_EXPORT StructuralSimilarityImageFilter : public ImageToImageFilter<TInputImage, TOutputImage>
113+
{
114+
public:
115+
ITK_DISALLOW_COPY_AND_MOVE(StructuralSimilarityImageFilter);
116+
117+
/** Standard class type aliases. */
118+
using Self = StructuralSimilarityImageFilter;
119+
using Superclass = ImageToImageFilter<TInputImage, TOutputImage>;
120+
using Pointer = SmartPointer<Self>;
121+
using ConstPointer = SmartPointer<const Self>;
122+
123+
/** Method for creation through the object factory. */
124+
itkNewMacro(Self);
125+
126+
/** \see LightObject::GetNameOfClass() */
127+
itkOverrideGetNameOfClassMacro(StructuralSimilarityImageFilter);
128+
129+
/** Image type aliases. */
130+
using InputImageType = TInputImage;
131+
using OutputImageType = TOutputImage;
132+
using InputPixelType = typename InputImageType::PixelType;
133+
using OutputPixelType = typename OutputImageType::PixelType;
134+
using InputImageRegionType = typename InputImageType::RegionType;
135+
using OutputImageRegionType = typename OutputImageType::RegionType;
136+
using SizeType = typename InputImageType::SizeType;
137+
using IndexType = typename InputImageType::IndexType;
138+
139+
static constexpr unsigned int ImageDimension = InputImageType::ImageDimension;
140+
141+
/** Floating-point type used for all SSIM computations. */
142+
using RealType = typename NumericTraits<InputPixelType>::RealType;
143+
144+
/** Type used for the user-specified array of multi-scale weights. */
145+
using ScaleWeightsType = Array<RealType>;
146+
147+
/** Set/Get the first input image. */
148+
/** @ITKStartGrouping */
149+
void
150+
SetInput1(const InputImageType * image)
151+
{
152+
this->SetInput(image);
153+
}
154+
const InputImageType *
155+
GetInput1() const
156+
{
157+
return this->GetInput(0);
158+
}
159+
/** @ITKEndGrouping */
160+
161+
/** Set/Get the second input image. */
162+
/** @ITKStartGrouping */
163+
void
164+
SetInput2(const InputImageType * image);
165+
const InputImageType *
166+
GetInput2() const;
167+
/** @ITKEndGrouping */
168+
169+
/** Standard deviation \f$\sigma\f$ of the Gaussian window used to compute
170+
* local statistics. Default 1.5 (matching Wang et al. 2004). */
171+
/** @ITKStartGrouping */
172+
itkSetMacro(GaussianSigma, double);
173+
itkGetConstMacro(GaussianSigma, double);
174+
/** @ITKEndGrouping */
175+
176+
/** Maximum width (per dimension) of the discrete Gaussian kernel.
177+
* Default 11, giving an 11x11 window in 2D when sigma=1.5. */
178+
/** @ITKStartGrouping */
179+
itkSetMacro(MaximumKernelWidth, unsigned int);
180+
itkGetConstMacro(MaximumKernelWidth, unsigned int);
181+
/** @ITKEndGrouping */
182+
183+
/** \f$K_1\f$ stability constant. Default 0.01. */
184+
/** @ITKStartGrouping */
185+
itkSetMacro(K1, double);
186+
itkGetConstMacro(K1, double);
187+
/** @ITKEndGrouping */
188+
189+
/** \f$K_2\f$ stability constant. Default 0.03. */
190+
/** @ITKStartGrouping */
191+
itkSetMacro(K2, double);
192+
itkGetConstMacro(K2, double);
193+
/** @ITKEndGrouping */
194+
195+
/** Dynamic range \f$L\f$ of the pixel values used to compute
196+
* \f$C_1 = (K_1 L)^2\f$ and \f$C_2 = (K_2 L)^2\f$. Default depends on
197+
* the input pixel type: 255 for \c unsigned \c char, 65535 for
198+
* \c unsigned \c short, 1.0 for \c float / \c double, etc. */
199+
/** @ITKStartGrouping */
200+
itkSetMacro(DynamicRange, double);
201+
itkGetConstMacro(DynamicRange, double);
202+
/** @ITKEndGrouping */
203+
204+
/** Exponent \f$\alpha\f$ on the luminance term. Default 1.0. */
205+
/** @ITKStartGrouping */
206+
itkSetMacro(LuminanceExponent, double);
207+
itkGetConstMacro(LuminanceExponent, double);
208+
/** @ITKEndGrouping */
209+
210+
/** Exponent \f$\beta\f$ on the contrast term. Default 1.0. */
211+
/** @ITKStartGrouping */
212+
itkSetMacro(ContrastExponent, double);
213+
itkGetConstMacro(ContrastExponent, double);
214+
/** @ITKEndGrouping */
215+
216+
/** Exponent \f$\gamma\f$ on the structure term. Default 1.0. */
217+
/** @ITKStartGrouping */
218+
itkSetMacro(StructureExponent, double);
219+
itkGetConstMacro(StructureExponent, double);
220+
/** @ITKEndGrouping */
221+
222+
/** Per-scale weights for multi-scale SSIM (MS-SSIM). An array of size 1
223+
* (the default) requests ordinary single-scale SSIM and is the only
224+
* configuration currently supported. Setting an array of length greater
225+
* than 1 will currently raise an exception in BeforeGenerate. */
226+
/** @ITKStartGrouping */
227+
void
228+
SetScaleWeights(const ScaleWeightsType & weights);
229+
itkGetConstReferenceMacro(ScaleWeights, ScaleWeightsType);
230+
/** @ITKEndGrouping */
231+
232+
/** Mean SSIM over the valid (non-Gaussian-padded) region. Available
233+
* after Update(). */
234+
itkGetConstMacro(MeanSSIM, double);
235+
236+
itkConceptMacro(InputHasNumericTraitsCheck, (Concept::HasNumericTraits<InputPixelType>));
237+
238+
protected:
239+
StructuralSimilarityImageFilter();
240+
~StructuralSimilarityImageFilter() override = default;
241+
242+
void
243+
PrintSelf(std::ostream & os, Indent indent) const override;
244+
245+
/** Verify that parameters are valid and both inputs are set with matching regions. */
246+
void
247+
VerifyPreconditions() const override;
248+
249+
/** This filter requires its full input to compute correct mean SSIM. */
250+
void
251+
GenerateInputRequestedRegion() override;
252+
253+
/** This filter computes the full output. */
254+
void
255+
EnlargeOutputRequestedRegion(DataObject * data) override;
256+
257+
/** Composite-filter-style: drives the internal sub-pipeline (5 Gaussian
258+
* convolutions plus a parallelized SSIM combination). */
259+
void
260+
GenerateData() override;
261+
262+
private:
263+
double m_GaussianSigma{ 1.5 };
264+
unsigned int m_MaximumKernelWidth{ 11 };
265+
double m_K1{ 0.01 };
266+
double m_K2{ 0.03 };
267+
268+
/** Default dynamic range: 1.0 for floating-point pixels (assume normalized
269+
* data), and \c NumericTraits::max() - \c NumericTraits::min() for integer
270+
* pixels (e.g. 255 for \c unsigned \c char). */
271+
static constexpr double
272+
DefaultDynamicRange()
273+
{
274+
if constexpr (std::is_floating_point_v<InputPixelType>)
275+
{
276+
return 1.0;
277+
}
278+
else
279+
{
280+
return static_cast<double>(NumericTraits<InputPixelType>::max()) -
281+
static_cast<double>(NumericTraits<InputPixelType>::min());
282+
}
283+
}
284+
double m_DynamicRange{ DefaultDynamicRange() };
285+
double m_LuminanceExponent{ 1.0 };
286+
double m_ContrastExponent{ 1.0 };
287+
double m_StructureExponent{ 1.0 };
288+
289+
ScaleWeightsType m_ScaleWeights{};
290+
291+
double m_MeanSSIM{ 0.0 };
292+
};
293+
} // end namespace itk
294+
295+
#ifndef ITK_MANUAL_INSTANTIATION
296+
# include "itkStructuralSimilarityImageFilter.hxx"
297+
#endif
298+
299+
#endif

0 commit comments

Comments
 (0)