72 changes: 51 additions & 21 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ thesis `Testing robustness of DeepFake recognition methods against non-malicious
* [How it works?](#how-it-works-)
* [TL;DR:](#tl-dr-)
* [Tell me more!](#tell-me-more-)
* [Favourite code snippet](#favourite-code-snippet)
* [Design](#design)
* [Installation](#installation)
* [GPU configuration](#gpu-configuration)
Expand All @@ -23,10 +24,9 @@ recognition method used to properly discriminate DeepFake from authentic video w
## What is this?

* CLI program allowing to:
* Preprocess dataset containing real and fake videos. Chosen part of negatives is modified by selected non-malicious
image modifications based on settings assigned by user.
* Preprocess dataset containing real and fake videos. Chosen part of negatives is modified by selected modifications based on settings assigned by user.
* Preprocessing can be done via single command or gradually in steps where single step represents activity such
as `extract faces from fake videos.` The letter method allows preprocessing large datasets[^2] on without the
as `extract faces from fake videos`. The latter method allows preprocessing large datasets[^2] without the
need of keeping program on for 24+ hours.
* Train detection model and evaluate it in assigned settings.

Expand Down Expand Up @@ -59,28 +59,57 @@ alter negatives are configured via YAML files. Here is a sample:
---
# Example modifications settings.
# In total half of negatives are altered.
modifications:
- name: RedEyesEffectModification
share: 0.125
options:
brightness_threshold: 50
- name: CLAHEModification
share: 0.125
options:
clip_limit: 2.0
grid_width: 8
grid_height: 8
- name: HistogramEqualizationModification
share: 0.125
- name: GaussianBlurModification
share: 0.125
options:
kernel_width: 9
kernel_height: 9
modifications_chains:
- share: 0.25
modifications:
- name: GammaCorrectionModification
options:
gamma_value: 0.75
- name: CLAHEModification
options:
clip_limit: 2.0
grid_width: 8
grid_height: 8
- share: 0.25
modifications:
- name: GammaCorrectionModification
options:
gamma_value: 0.75
- name: HistogramEqualizationModification
```

The names of supported modifications can be found in [this file](src/dfd/datasets/modifications/register.py).
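
As a rough illustration of how the `share` values in the sample above could translate into concrete frames, here is a minimal sketch. It assumes each chain receives a consecutive block of (shuffled) frame indices proportional to its share, and that the remaining frames stay unmodified. `ChainRange` and `shares_to_ranges` are hypothetical names used only for this example, not part of the package.

```python
from typing import Dict, List, NamedTuple


class ChainRange(NamedTuple):
    """Half-open range [lower_bound, upper_bound) of frame indices for one chain.

    Hypothetical helper type, used only in this sketch.
    """

    chain_name: str
    lower_bound: int
    upper_bound: int


def shares_to_ranges(shares: Dict[str, float], no_frames: int) -> List[ChainRange]:
    """Turn per-chain shares into consecutive, non-overlapping index ranges."""
    total_share = sum(shares.values())
    # Small tolerance so that float rounding does not reject a valid config.
    if total_share > 1.0 + 1e-9:
        raise ValueError(f"Shares sum to {total_share}, which exceeds 1.0.")
    ranges: List[ChainRange] = []
    lower_bound = 0
    for chain_name, share in shares.items():
        upper_bound = lower_bound + int(share * no_frames)
        ranges.append(ChainRange(chain_name, lower_bound, upper_bound))
        lower_bound = upper_bound
    return ranges


# Mirrors the sample settings: two chains, each applied to a quarter of negatives.
# The chain names are illustrative labels only.
sample_shares = {"gamma_then_clahe": 0.25, "gamma_then_hist_eq": 0.25}
ranges = shares_to_ranges(sample_shares, no_frames=1000)
# ranges[0] covers indices 0..249, ranges[1] covers 250..499;
# under this sketch's assumption, indices 500..999 are left unmodified.
```

With 1000 negatives, the two chains together cover 500 frames, which matches the "half of negatives are altered" comment in the sample.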

## Favourite code snippet

```python
@given(
given_specifications=st.lists(
st.builds(
ModificationStub,
name=st.text(min_size=1),
no_repeats=st.integers(min_value=1),
),
min_size=1,
max_size=5,
)
)
def test_combine_multiple_specifications(given_specifications):
# GIVEN
image_mock = Mock(spec_set=np.ndarray)
# WHEN specifications are combined
combined_specification = functools.reduce(operator.and_, given_specifications)
# THEN specification names are combined
expected_name = "__".join([spec.name for spec in given_specifications])
assert combined_specification.name == expected_name
# And specifications are performed in order
combined_specification.perform(image_mock)
expected_calls_in_order = [call.repeat(spec.no_repeats) for spec in given_specifications]
image_mock.assert_has_calls(expected_calls_in_order)
```

It uses two cool concepts: property-based testing and the specification pattern[^4].
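
To make the `&` combination in the test more concrete, here is a minimal sketch of what a composable specification might look like. It is not the code from this repository: `Specification`, `AndSpecification`, `AddOffset` and `Scale` are hypothetical names, and plain lists of integers stand in for images. The only details taken from the test are that `&` chains specifications, that names are joined with `__`, and that the steps run in order.

```python
import functools
import operator
from abc import ABC, abstractmethod
from typing import List


class Specification(ABC):
    """Hypothetical base class: a named, composable transformation."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Name identifying this specification."""

    @abstractmethod
    def perform(self, values: List[int]) -> List[int]:
        """Apply the transformation and return the result."""

    def __and__(self, other: "Specification") -> "Specification":
        # `first & second` means: perform `first`, then `second`.
        return AndSpecification(self, other)


class AndSpecification(Specification):
    """Runs two specifications one after another."""

    def __init__(self, first: Specification, second: Specification) -> None:
        self._first = first
        self._second = second

    @property
    def name(self) -> str:
        return f"{self._first.name}__{self._second.name}"

    def perform(self, values: List[int]) -> List[int]:
        return self._second.perform(self._first.perform(values))


class AddOffset(Specification):
    """Toy specification: add a constant to every value."""

    def __init__(self, offset: int) -> None:
        self._offset = offset

    @property
    def name(self) -> str:
        return f"add_{self._offset}"

    def perform(self, values: List[int]) -> List[int]:
        return [value + self._offset for value in values]


class Scale(Specification):
    """Toy specification: multiply every value by a constant."""

    def __init__(self, factor: int) -> None:
        self._factor = factor

    @property
    def name(self) -> str:
        return f"scale_{self._factor}"

    def perform(self, values: List[int]) -> List[int]:
        return [value * self._factor for value in values]


# Same reduction as in the test: fold the list with `&`.
combined = functools.reduce(operator.and_, [AddOffset(1), Scale(2)])
print(combined.name)             # add_1__scale_2
print(combined.perform([1, 2]))  # [4, 6]
```

Because `functools.reduce` folds from the left, a list of one specification is returned unchanged, and longer lists nest as `(a & b) & c`, which keeps both the joined name and the execution order aligned with the list order.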

## Design

The application design is loosely inspired
Expand Down Expand Up @@ -121,3 +150,4 @@ pip install git+https://github.com/cicheck/dfd.git
[^1]: Currently the only supported detection method is [Meso-4](https://arxiv.org/abs/1809.00888).
[^2]: Such as [Celeb-DF](https://github.com/yuezunli/celeb-deepfakeforensics).
[^3]: Half of negatives modified, 4 modifications used with naive parameters.
[^4]: Or rather, a design loosely inspired by the specification pattern :stuck_out_tongue:
Binary file modified docs/diagrams/app_architecture.png
6 changes: 3 additions & 3 deletions docs/diagrams/app_architecture.puml
Expand Up @@ -24,7 +24,7 @@ package "Business Logic (Core)" #skyblue {
}

package Modifications {
interface ModificationInterface {
abstract ModificationSpecification {
+ name
--
+ perform()
Expand Down Expand Up @@ -100,8 +100,8 @@ MesoNet -up-- ModelRegistry : register >

MesoNet -up--|> ModelInterface

GammaCorrectionModification -up--|> ModificationInterface
CLAHEModification -up--|> ModificationInterface
GammaCorrectionModification -up--|> ModificationSpecification
CLAHEModification -up--|> ModificationSpecification

GammaCorrectionModification -up-- ModificationRegistry : register >
CLAHEModification -up-- ModificationRegistry : register >
40 changes: 17 additions & 23 deletions example_settings.yaml
@@ -1,25 +1,19 @@
---
# Example ModificationGenerator settings file
modifications:
- name: RedEyesEffectModification
share: 0.125
options:
brightness_threshold: 50
face_landmarks_detector_path: "/media/cicheck/Extreme Pro/\
models/shape_predictor_68_face_landmarks.dat"
- name: CLAHEModification
share: 0.125
options:
clip_limit: 2.0
grid_width: 8
grid_height: 8
- name: HistogramEqualizationModification
share: 0.125
- name: GammaCorrectionModification
share: 0.0625
options:
gamma_value: 0.75
- name: GammaCorrectionModification
share: 0.0625
options:
gamma_value: 1.25
modifications_chains:
- share: 0.25
modifications:
- name: GammaCorrectionModification
options:
gamma_value: 0.75
- name: CLAHEModification
options:
clip_limit: 2.0
grid_width: 8
grid_height: 8
- share: 0.25
modifications:
- name: GammaCorrectionModification
options:
gamma_value: 0.75
- name: HistogramEqualizationModification
1 change: 1 addition & 0 deletions setup.cfg
Expand Up @@ -47,6 +47,7 @@ tests =
pytest-cov==2.12.1
pytest>=6.2.4
coverage[toml]
hypothesis==6.34.1
style =
darglint>=1.8.0
flake8>=3.9.2
41 changes: 24 additions & 17 deletions src/dfd/datasets/frames_generators/modification.py
@@ -1,23 +1,24 @@
"""Generate new frames after performing set non malicious modifications on original frames."""
import functools
import itertools
import operator
import pathlib
from typing import Generator, List, NamedTuple, Optional
from typing import Generator, List, NamedTuple, Optional, Sequence

import cv2 as cv
import numpy as np

from dfd.exceptions import DfdError
from dfd.datasets.modifications.definitions import IdentityModification
from dfd.datasets.modifications.interfaces import ModificationInterface
from dfd.datasets.modifications.register import ModificationRegister
from dfd.datasets.settings import GeneratorSettings
from dfd.datasets.modifications.specification import ModificationSpecification
from dfd.datasets.settings import GeneratorSettings, ModificationSettings
from dfd.exceptions import DfdError


class ModificationShare(NamedTuple):
"""Share of frames on which modification will be performed."""

modification: ModificationInterface
modification: ModificationSpecification
share: float


Expand All @@ -31,7 +32,7 @@ class ModificationRange(NamedTuple):

"""

modification: ModificationInterface
modification: ModificationSpecification
lower_bound: int
upper_bound: int

Expand Down Expand Up @@ -93,22 +94,17 @@ def from_directory(
input_frame = cv.imread(str(input_frame_path))
modified_frame = modification.perform(input_frame)
yield ModifiedFrame(
modification_used=str(modification),
modification_used=modification.name,
frame=modified_frame,
original_path=input_frame_path,
)

@functools.lru_cache(maxsize=1)
def _get_modifications_share(self) -> List[ModificationShare]:
modifications_share: List[ModificationShare] = []
for modification_settings in self._setting.modifications:
mame = modification_settings.name
options = modification_settings.options
share = modification_settings.share

modification_class = self._register.get_modification_class(mame)
# TODO: fix typing
modification = modification_class(**options) # type: ignore
for modification_chain_settings in self._setting.modifications_chains:
share = modification_chain_settings.share
modification = self._chain_modifications(modification_chain_settings.modifications)
modifications_share.append(ModificationShare(modification, share))

self._check_modifications_are_unique(
Expand All @@ -117,7 +113,7 @@ def _get_modifications_share(self) -> List[ModificationShare]:
return modifications_share

@staticmethod
def _check_modifications_are_unique(modifications: List[ModificationInterface]):
def _check_modifications_are_unique(modifications: List[ModificationSpecification]):
"""Check if modifications are unique.

Raises:
Expand Down Expand Up @@ -160,7 +156,7 @@ def _get_frames_permutation(self, no_frames: int) -> np.ndarray:

def _choose_modification(
self, frame_index: int, input_frame_path: pathlib.Path, no_frames: int
) -> ModificationInterface:
) -> ModificationSpecification:
frames_permutation = self._get_frames_permutation(no_frames)
modifications_range = self._get_modifications_range(no_frames)
permuted_index = frames_permutation[frame_index]
Expand All @@ -170,3 +166,14 @@ def _choose_modification(
# TODO: log error
# This should never happen
raise DfdError("Could not select modification.")

def _chain_modifications(
self, modifications_settings: Sequence[ModificationSettings]
) -> ModificationSpecification:
modifications: List[ModificationSpecification] = []
for modification_settings in modifications_settings:
modification_class = self._register.get_modification_class(modification_settings.name)
# TODO: fix typing
modification = modification_class(**modification_settings.options) # type: ignore
modifications.append(modification)
return functools.reduce(operator.and_, modifications)
18 changes: 14 additions & 4 deletions src/dfd/datasets/modifications/definitions/clahe.py
Expand Up @@ -3,10 +3,10 @@
import cv2 as cv
import numpy as np

from dfd.datasets.modifications.interfaces import ModificationInterface
from dfd.datasets.modifications.specification import ModificationSpecification


class CLAHEModification(ModificationInterface):
class CLAHEModification(ModificationSpecification):
"""Modification CLAHE (Contrast Limited Adaptive Histogram Equalization)"""

def __init__(self, clip_limit: float, grid_width: int, grid_height: int) -> None:
Expand All @@ -21,6 +21,17 @@ def __init__(self, clip_limit: float, grid_width: int, grid_height: int) -> None
self._clip_limit = clip_limit
self._title_grid_size = (grid_width, grid_height)

@property
def name(self) -> str:
"""Get specification name.

Returns:
The name of specification.

"""
width, height = self._title_grid_size
return f"clahe_{width}_{height}_{self._clip_limit}"

def perform(self, image: np.ndarray) -> np.ndarray:
"""Perform CLAHE on image.

Expand All @@ -44,5 +55,4 @@ def perform(self, image: np.ndarray) -> np.ndarray:
return cv.cvtColor(ycrcb_image, cv.COLOR_YCrCb2BGR)

def __str__(self) -> str:
width, height = self._title_grid_size
return f"clahe_{width}_{height}_{self._clip_limit}"
return self.name
16 changes: 13 additions & 3 deletions src/dfd/datasets/modifications/definitions/gamma_correction.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,10 @@
import cv2 as cv
import numpy as np

from dfd.datasets.modifications.interfaces import ModificationInterface
from dfd.datasets.modifications.specification import ModificationSpecification


class GammaCorrectionModification(ModificationInterface):
class GammaCorrectionModification(ModificationSpecification):
"""Modification Gamma Correction."""

def __init__(self, gamma_value: float) -> None:
Expand All @@ -17,6 +17,16 @@ def __init__(self, gamma_value: float) -> None:
"""
self._gamma_value = gamma_value

@property
def name(self) -> str:
"""Get specification name.

Returns:
The name of specification.

"""
return f"gamma_correction_{self._gamma_value}"

def perform(self, image: np.ndarray) -> np.ndarray:
"""Perform gamma correction on provided image.

Expand All @@ -38,4 +48,4 @@ def perform(self, image: np.ndarray) -> np.ndarray:
return cv.LUT(image, look_up_table)

def __str__(self) -> str:
return f"gamma_correction_{self._gamma_value}"
return self.name
14 changes: 12 additions & 2 deletions src/dfd/datasets/modifications/definitions/gaussian_blur.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,10 @@
import cv2 as cv
import numpy as np

from dfd.datasets.modifications.interfaces import ModificationInterface
from dfd.datasets.modifications.specification import ModificationSpecification


class GaussianBlurModification(ModificationInterface):
class GaussianBlurModification(ModificationSpecification):
"""Modification Gaussian blur (AKA Gaussian smoothing)."""

def __init__(
Expand All @@ -30,6 +30,16 @@ def __init__(
self._sigma_y = sigma_y

def __str__(self) -> str:
return self.name

@property
def name(self) -> str:
"""Get specification name.

Returns:
The name of specification.

"""
width, height = self._kernel_size
return f"gaussian_blur{width}_{height}_{self._sigma_x}_{self._sigma_y}"
