This document explains the mathematical ideas behind the transformations and perturbations used in Image Protector.
The goal of the project is to apply controlled, bounded distortions to an image so that:
- The image remains visually understandable to humans
- The pixel-level structure is altered enough to reduce usefulness for automated analysis, scraping, or naive ML pipelines
This is not a formal adversarial ML defense system. Instead, it is a practical, model-agnostic perturbation framework.
A digital image can be represented as a matrix (or tensor):
- Grayscale image:
  $$ I \in \mathbb{R}^{H \times W} $$
- RGB image:
  $$ I \in \mathbb{R}^{H \times W \times 3} $$
Each pixel value is typically in the range $[0, 255]$ (or normalized to $[0, 1]$).
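As a small illustration of this representation, the sketch below builds hypothetical grayscale and RGB arrays with NumPy and converts between the 8-bit range and the unit range. The helper names `to_unit_range` and `to_uint8` are illustrative assumptions, not part of Image Protector.

```python
import numpy as np

# Hypothetical random test images with 8-bit pixel values.
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(4, 6), dtype=np.uint8)    # H x W
rgb = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)  # H x W x 3


def to_unit_range(image):
    """Convert 8-bit values in [0, 255] to floats in [0, 1]."""
    return image.astype(np.float64) / 255.0


def to_uint8(image):
    """Convert floats in [0, 1] back to 8-bit values in [0, 255]."""
    return np.round(np.clip(image, 0.0, 1.0) * 255.0).astype(np.uint8)


gray_f = to_unit_range(gray)
print(gray.shape, rgb.shape)  # (4, 6) (4, 6, 3)
```

Working in floating point and converting back only at the end avoids integer overflow when perturbations are added.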
Most operations in this project can be seen as computing a perturbed image:
$$ I' = I + \Delta $$
where:
- $I$ is the original image
- $\Delta$ is a perturbation matrix
- $I'$ is the protected image
Most methods use a strength parameter $\alpha$:
$$ I' = I + \alpha \cdot P(I) $$
Where:
- $P(I)$ is some perturbation function derived from the image
- $\alpha$ controls how strong the effect is

Clipping is applied to keep values valid:
$$ I' = \text{clip}(I', 0, 255) $$
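The strength-and-clip step can be sketched as follows. This is a minimal illustration that assumes an 8-bit value range of 0 to 255; the function name `apply_perturbation` and its parameters are hypothetical.

```python
import numpy as np


def apply_perturbation(image, perturbation, alpha, max_value=255.0):
    """Return clip(I + alpha * P, 0, max_value) as a float array."""
    image = np.asarray(image, dtype=np.float64)
    perturbed = image + alpha * np.asarray(perturbation, dtype=np.float64)
    # Clipping keeps every pixel inside the valid range.
    return np.clip(perturbed, 0.0, max_value)


image = np.array([[0.0, 100.0], [200.0, 255.0]])
perturbation = np.array([[-10.0, 10.0], [10.0, 10.0]])
protected = apply_perturbation(image, perturbation, alpha=2.0)
print(protected)  # values: [[0, 120], [220, 255]]
```

Note that the two corner pixels hit the range limits, so clipping can make the effective perturbation smaller than $\alpha \cdot P(I)$ near black or white regions.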
The simplest method uses random noise $N$ with the same shape as $I$:
$$ P(I) = N $$
So:
$$ I' = I + \alpha \cdot N $$
This:
- Breaks exact pixel patterns
- Preserves overall structure
- Increases entropy of the image
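A minimal sketch of the noise method is shown below. The noise distribution is not fixed by this document, so zero-mean Gaussian noise with standard deviation `sigma` is an assumption here, as are the function name and the `seed` parameter.

```python
import numpy as np


def noise_perturbation(image, alpha, sigma=1.0, seed=None, max_value=255.0):
    """Add alpha-scaled Gaussian noise (an assumed distribution) and clip."""
    rng = np.random.default_rng(seed)
    image = np.asarray(image, dtype=np.float64)
    noise = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(image + alpha * noise, 0.0, max_value)


# A flat gray image has a single distinct value before perturbation.
image = np.full((8, 8), 128.0)
protected = noise_perturbation(image, alpha=5.0, seed=0)
print(np.unique(image).size, np.unique(protected).size)
```

The count of distinct values grows after perturbation, which is a crude way to see the exact pixel pattern being broken while the overall gray level stays roughly the same.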
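We approximate local image structure using finite differences:
$$ G_x \approx \frac{\partial I}{\partial x}, \qquad G_y \approx \frac{\partial I}{\partial y} $$
Gradient magnitude:
$$ |\nabla I| = \sqrt{G_x^2 + G_y^2} $$
A perturbation $P(I)$ can be built from this gradient magnitude.
Then:
$$ I' = I + \alpha \cdot P(I) $$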
Effect: edges and fine details are disturbed, which affects many vision algorithms.
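One possible way to turn the gradient magnitude into a perturbation is sketched below for a grayscale image. It uses `np.gradient` (central differences in the interior) and, as an assumption of this example rather than the project's documented rule, weights Gaussian noise by the gradient magnitude normalized to $[0, 1]$. The function names are hypothetical.

```python
import numpy as np


def gradient_magnitude(image):
    """Finite-difference gradient magnitude of a 2-D image."""
    image = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(image)  # derivatives along rows (y) and columns (x)
    return np.sqrt(gx ** 2 + gy ** 2)


def edge_perturbation(image, alpha, seed=None, max_value=255.0):
    """Perturb pixels in proportion to local edge strength (assumed rule)."""
    rng = np.random.default_rng(seed)
    image = np.asarray(image, dtype=np.float64)
    magnitude = gradient_magnitude(image)
    peak = magnitude.max()
    weight = magnitude / peak if peak > 0 else np.zeros_like(magnitude)
    noise = rng.normal(0.0, 1.0, size=image.shape)
    return np.clip(image + alpha * weight * noise, 0.0, max_value)


# Test image with a single vertical step edge between columns 2 and 3.
image = np.full((6, 6), 50.0)
image[:, 3:] = 200.0
protected = edge_perturbation(image, alpha=10.0, seed=0)
print(np.argwhere(protected != image)[:, 1])  # column indices of changed pixels
```

Only the columns next to the step edge change; flat regions have zero gradient and are left untouched.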
A simple texture model:
- Compute a smoothed image $S = \text{blur}(I)$
- Extract high-frequency detail: $D = I - S$
- Re-inject or modify it, for example: $I' = S + (1 + \alpha) D = I + \alpha D$
This changes local texture statistics while keeping global structure mostly intact.
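The texture steps can be sketched as below. The blur is assumed to be a simple box blur with edge padding, and the re-injection rule $I' = S + (1 + \alpha) D$ with $D = I - S$ is one plausible choice, not necessarily the one Image Protector uses. `box_blur` and `texture_perturbation` are hypothetical names.

```python
import numpy as np


def box_blur(image, radius=1):
    """Mean filter over a (2*radius+1)^2 window with edge padding."""
    image = np.asarray(image, dtype=np.float64)
    height, width = image.shape
    padded = np.pad(image, radius, mode="edge")
    size = 2 * radius + 1
    total = np.zeros_like(image)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / size ** 2


def texture_perturbation(image, alpha, radius=1, max_value=255.0):
    """Rescale high-frequency detail: I' = S + (1 + alpha) * (I - S)."""
    image = np.asarray(image, dtype=np.float64)
    smooth = box_blur(image, radius)
    detail = image - smooth
    return np.clip(smooth + (1.0 + alpha) * detail, 0.0, max_value)


# Checkerboard test pattern: strong fine detail, constant coarse structure.
checker = np.indices((6, 6)).sum(axis=0) % 2
image = 108.0 + 40.0 * checker
protected = texture_perturbation(image, alpha=0.5)
```

With this rule, `alpha=0` returns the input unchanged, positive values exaggerate fine texture, and `alpha=-1` removes it entirely (leaving only the blurred image).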
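Transform the image into the frequency domain with a transform $\mathcal{T}$ (for example, the DCT):
$$ F = \mathcal{T}(I) $$
Modify coefficients with a coefficient-space perturbation $\Delta_F$:
$$ F' = F + \alpha \cdot \Delta_F $$
Invert the transform:
$$ I' = \mathcal{T}^{-1}(F') $$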
This allows controlled changes to different frequency bands.
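A frequency-domain perturbation could look like the sketch below, written for a grayscale image. It assumes an orthonormal 2-D DCT built from an explicit matrix (so only NumPy is needed) and adds Gaussian noise to the higher-frequency band only. The band selection via `cutoff` and the function names are assumptions of this example.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ x is the DCT of x."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c


def frequency_perturbation(image, alpha, cutoff=0.5, seed=None, max_value=255.0):
    """Add noise to DCT coefficients above a normalized frequency cutoff."""
    rng = np.random.default_rng(seed)
    image = np.asarray(image, dtype=np.float64)
    height, width = image.shape
    ch, cw = dct_matrix(height), dct_matrix(width)
    coeffs = ch @ image @ cw.T  # forward 2-D DCT
    # Boolean mask selecting the higher-frequency band (assumed rule).
    fy = np.arange(height).reshape(-1, 1) / height
    fx = np.arange(width).reshape(1, -1) / width
    mask = (fy + fx) >= cutoff
    coeffs = coeffs + alpha * mask * rng.normal(0.0, 1.0, size=coeffs.shape)
    restored = ch.T @ coeffs @ cw  # inverse 2-D DCT
    return np.clip(restored, 0.0, max_value)


rng = np.random.default_rng(1)
image = rng.uniform(100.0, 150.0, size=(8, 8))
protected = frequency_perturbation(image, alpha=2.0, seed=0)
print(round(image.mean(), 6), round(protected.mean(), 6))
```

Because the lowest-frequency (DC) coefficient is excluded by the mask, the mean brightness is preserved as long as no pixel is clipped.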
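Let $P_1, \dots, P_k$ be individual perturbation functions with weights $w_1, \dots, w_k$.
Combined perturbation:
$$ P(I) = \sum_{i=1}^{k} w_i \, P_i(I) $$
Final image:
$$ I' = \text{clip}(I + \alpha \cdot P(I), 0, 255) $$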
This mixes multiple distortion types to avoid relying on a single transformation.
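The weighted combination can be sketched as follows. The two perturbation functions here are deliberately trivial placeholders chosen so the arithmetic is easy to check by hand; they and the name `ensemble_perturbation` are hypothetical.

```python
import numpy as np


def ensemble_perturbation(image, perturbations, weights, alpha, max_value=255.0):
    """Return clip(I + alpha * sum_i w_i * P_i(I), 0, max_value)."""
    if len(perturbations) != len(weights):
        raise ValueError("need exactly one weight per perturbation function")
    image = np.asarray(image, dtype=np.float64)
    combined = np.zeros_like(image)
    for weight, perturb in zip(weights, perturbations):
        combined += weight * perturb(image)
    return np.clip(image + alpha * combined, 0.0, max_value)


# Placeholder perturbation functions standing in for real methods.
def constant_offset(image):
    return np.ones_like(image)


def centered(image):
    return image - image.mean()


image = np.array([[100.0, 110.0], [120.0, 130.0]])
protected = ensemble_perturbation(
    image, [constant_offset, centered], weights=[0.5, 0.5], alpha=2.0
)
print(protected)  # values: [[86, 106], [126, 146]]
```

In practice each placeholder would be replaced by a noise, gradient, texture, or frequency perturbation, with the weights controlling how much each contributes.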
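Distortion between $I$ and $I'$ can be quantified with the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR):
$$ \text{MSE} = \frac{1}{HW} \sum_{i,j} \left( I(i,j) - I'(i,j) \right)^2 $$
$$ \text{PSNR} = 10 \log_{10} \left( \frac{\text{MAX}^2}{\text{MSE}} \right) $$
Where $\text{MAX}$ is the maximum possible pixel value (255 for 8-bit images).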
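The distortion metrics referenced in this document, MSE and PSNR, can be computed as in the sketch below. It uses the standard definitions and assumes a peak value of 255; the function names are illustrative.

```python
import numpy as np


def mse(original, protected):
    """Mean squared error between two images of the same shape."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(protected, dtype=np.float64)
    return float(np.mean((a - b) ** 2))


def psnr(original, protected, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, protected)
    if error == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / error))


original = np.full((4, 4), 100.0)
protected = original + 5.0
print(mse(original, protected))   # 25.0
print(psnr(original, protected))  # about 34.15 dB
```

Lower MSE and higher PSNR both indicate that the protected image is closer to the original at the pixel level.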
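Classic adversarial attacks solve:
$$ \max_{\delta} \; \mathcal{L}(f(x + \delta), y) $$
subject to:
$$ \|\delta\|_p \le \epsilon $$
where $f$ is the target model, $\mathcal{L}$ is its loss, $y$ is the true label, and $\epsilon$ bounds the perturbation size.
This project does not optimize against a specific model $f$.
Instead, it uses heuristic, model-agnostic perturbations:
$$ \Delta = \alpha \cdot P(I) $$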
All methods in this project reduce to:
$$ I' = \text{clip}(I + \Delta, 0, 255) $$
Where the core design problem is choosing a good $\Delta$ that balances:
- Visual quality for humans
- Disruption of automated processing
- No formal robustness guarantees
- No model-specific optimization
- Heuristic, signal-processing-based methods
- Stronger perturbations always trade off visual quality
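The trade-off in the last point can be made concrete with a small sweep over the strength parameter. This sketch assumes a fixed Gaussian noise field as the perturbation and PSNR as the quality measure; both choices are illustrative, and the test image is random data rather than a real photograph.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.uniform(64.0, 192.0, size=(32, 32))  # hypothetical test image
noise = rng.normal(0.0, 1.0, size=image.shape)   # one fixed noise field


def psnr(original, protected, max_value=255.0):
    """Peak signal-to-noise ratio in dB (standard definition)."""
    error = np.mean((original - protected) ** 2)
    return float("inf") if error == 0.0 else float(10.0 * np.log10(max_value ** 2 / error))


alphas = [1.0, 2.0, 4.0, 8.0]
scores = []
for alpha in alphas:
    protected = np.clip(image + alpha * noise, 0.0, 255.0)
    scores.append(psnr(image, protected))
    print(f"alpha={alpha:>4}: PSNR={scores[-1]:.2f} dB")
```

Each doubling of `alpha` lowers the PSNR, which is the pixel-level view of the visual quality cost of stronger protection.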
- Rafael C. Gonzalez and Richard E. Woods - Digital Image Processing
  A foundational textbook on image representation, transforms, filtering, noise, and metrics.
  https://books.google.com/books?id=4gZkQgAACAAJ
- Anil K. Jain - Fundamentals of Digital Image Processing
  Classic reference for gradients, sampling, transforms, and image statistics.
  https://books.google.com/books?id=6m1tQgAACAAJ
- Alan V. Oppenheim and Ronald W. Schafer - Discrete-Time Signal Processing
  Comprehensive treatment of noise, filtering, frequency-domain analysis, and transforms.
  https://books.google.com/books?id=akt7QgAACAAJ
- Stéphane Mallat - A Wavelet Tour of Signal Processing
  Deep discussion of frequency and multi-scale analysis for images.
  https://www.elsevier.com/books/a-wavelet-tour-of-signal-processing/mallat/978-0-12-374370-1
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville - Deep Learning (MIT Press)
  Standard text on gradients, optimization, and robustness in neural networks.
  https://www.deeplearningbook.org/
- Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy - Explaining and Harnessing Adversarial Examples
  Introduces the fast gradient sign method (FGSM) and a linear explanation of adversarial examples.
  https://arxiv.org/abs/1412.6572
- Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli - Image Quality Assessment: From Error Visibility to Structural Similarity (IEEE TIP)
  A highly cited paper that introduces the structural similarity (SSIM) index and discusses the limits of error-based metrics such as MSE and PSNR.
  https://ieeexplore.ieee.org/document/1292216
- Richard Szeliski - Computer Vision: Algorithms and Applications
  General reference for gradients, filters, edges, and transform concepts used in vision.
  https://szeliski.org/Book/
- Discrete Cosine Transform (DCT) - Wikipedia
  Explains frequency-domain representations used in texture and frequency perturbations.
  https://en.wikipedia.org/wiki/Discrete_cosine_transform
- Mean Squared Error (MSE) - Wikipedia
  Definition and context of MSE, which is used here as a distortion metric.
  https://en.wikipedia.org/wiki/Mean_squared_error
- Peak Signal-to-Noise Ratio (PSNR) - Wikipedia
  Visual quality metric commonly used for image comparison.
  https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio

The methods in Image Protector are inspired by standard techniques from:
- Digital image processing
- Signal processing
- Computer vision
- Adversarial robustness literature
However, this project uses heuristic, model-agnostic perturbations rather than solving formal optimization problems against specific models.
- Images are matrices
- Protection = controlled perturbation
- Different methods design different $\Delta$
- Strength $\alpha$ controls magnitude
- Ensemble = weighted sum of perturbations
- Metrics quantify distortion