Background
A contributor reported an unexpected clustering inconsistency: the same two people's images produce two separate clusters when processed alongside a larger photo library, but collapse into a single (incorrect) cluster when processed in isolation. Same images, different result, purely depending on folder composition.
The root cause is a known DBSCAN limitation called the Global vs. Local Density Problem (Gan et al., 2022, Information Sciences). A static eps radius is only meaningful relative to the density of the embedding set it is applied to. In a sparse, isolated dataset, the same eps can span identity boundaries that it would not cross in a denser combined dataset. A follow-up audit of the codebase confirmed that several of the fixes this problem calls for are not yet implemented.
Current State
| Item | Status |
| --- | --- |
| min_samples=1 chaining problem | Fixed: now 2 in face_clusters.py |
| Face quality gate | Not implemented |
| Adaptive eps via k-NN estimation | Not implemented |
| Clustering params in config | All hardcoded as Python defaults |
| Clustering regression tests | None exist |
The five affected parameters - eps, min_samples, similarity_threshold, merge_threshold, conf_threshold - are scattered as hardcoded defaults across face_clusters.py and FaceDetector.__init__(). None are wired to settings.py or overridable without editing source code.
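As a rough illustration of what centralizing these parameters could look like, the sketch below groups them in one dataclass with environment-variable overrides. The class layout, the `PICTOPY_CLUSTER_` prefix, and the defaults for similarity_threshold, merge_threshold, and conf_threshold are assumptions made for the example; only eps = 0.75 and min_samples = 2 come from this issue.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ClusteringConfig:
    """Hypothetical single home for the five clustering parameters."""
    eps: float = 0.75                   # value quoted in this issue
    min_samples: int = 2                # value quoted in this issue
    similarity_threshold: float = 0.7   # placeholder, not the real default
    merge_threshold: float = 0.7        # placeholder, not the real default
    conf_threshold: float = 0.5         # placeholder, not the real default

    @classmethod
    def from_env(cls, prefix: str = "PICTOPY_CLUSTER_") -> "ClusteringConfig":
        """Build a config, letting e.g. PICTOPY_CLUSTER_EPS override eps."""
        overrides = {}
        for f in fields(cls):
            raw = os.environ.get(prefix + f.name.upper())
            if raw is not None:
                # Cast using the type of the field's default (int or float).
                overrides[f.name] = type(f.default)(raw)
        return cls(**overrides)
```

With this shape, `ClusteringConfig.from_env()` returns the defaults when no variables are set, so moving the parameters would not change behavior on its own.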
Proposed Solution
9.1 - Face quality gate
Low-quality "bridge point" embeddings are a direct contributor to incorrect cluster merges. This adds a pre-clustering quality gate that runs after YOLO detection and before the FaceNet embedding step. Three checks are applied per detected face:
- Blur: Laplacian variance on the cropped face region, rejected below a configurable threshold
- Size: Bounding box area, rejected below a configurable minimum
- Completeness: YOLO confidence score, rejected below conf_threshold (no new model dependency)
Rejected faces are counted and surfaced to the user via a non-blocking UI alert ("X faces skipped due to quality") rather than silently discarded.
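The three checks and the skipped-face count can be sketched as below. This is a dependency-free illustration, not the proposed implementation: the `DetectedFace` type, the threshold defaults, and the pure-Python Laplacian are assumptions, and a real pipeline would more likely compute the blur score with OpenCV on the cropped region.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import List, Sequence, Tuple

Gray = Sequence[Sequence[float]]  # grayscale crop as rows of pixel intensities


@dataclass
class DetectedFace:
    """Hypothetical container for one detector output."""
    crop: Gray          # grayscale pixels inside the bounding box
    box_width: int      # bounding box width in pixels
    box_height: int     # bounding box height in pixels
    confidence: float   # detector confidence score


def laplacian_variance(gray: Gray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels (higher = sharper)."""
    h = len(gray)
    if h < 3 or len(gray[0]) < 3:
        return 0.0  # too small to evaluate; treated as blurry
    w = len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def face_quality_gate(
    faces: List[DetectedFace],
    blur_threshold: float = 100.0,  # hypothetical default
    min_area: int = 40 * 40,        # hypothetical default
    conf_threshold: float = 0.5,    # hypothetical default
) -> Tuple[List[DetectedFace], int]:
    """Return the faces that pass all three checks and the number skipped."""
    kept = []
    for face in faces:
        too_blurry = laplacian_variance(face.crop) < blur_threshold
        too_small = face.box_width * face.box_height < min_area
        low_conf = face.confidence < conf_threshold
        if not (too_blurry or too_small or low_conf):
            kept.append(face)
    return kept, len(faces) - len(kept)
```

Returning the count alongside the surviving faces keeps the gate free of UI concerns while still letting callers pass the number up to whatever surfaces the alert.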
9.2 - Adaptive eps via k-NN distance estimation
Replaces the single hardcoded eps = 0.75 with a per-run estimation step. After the quality gate, k-NN distances are computed across the current embedding set using k = min_samples. A dataset-appropriate eps is derived from the distance distribution, making cluster quality a function of embedding geometry rather than folder composition. Falls back to the config default if the embedding set is too small for reliable estimation.
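A minimal sketch of the estimation step follows. The issue does not say how eps is derived from the distance distribution, so taking a high percentile of the k-th nearest-neighbour distances is an assumption here, as are the Euclidean metric, the `min_points` cutoff, and the parameter names. The brute-force loop is for clarity only.

```python
import math
from typing import Sequence


def estimate_eps(
    embeddings: Sequence[Sequence[float]],
    min_samples: int = 2,
    default_eps: float = 0.75,   # config default used as the fallback
    min_points: int = 10,        # hypothetical cutoff for "too small"
    percentile: float = 90.0,    # hypothetical choice of statistic
) -> float:
    """Estimate eps from the k-th nearest-neighbour distances, k = min_samples."""
    n = len(embeddings)
    # Fall back when there are too few points for a stable estimate.
    if n < max(min_points, min_samples + 1):
        return default_eps

    kth_distances = []
    for i, point in enumerate(embeddings):
        # Distances from this point to every other point, nearest first.
        dists = sorted(
            math.dist(point, other)
            for j, other in enumerate(embeddings)
            if j != i
        )
        kth_distances.append(dists[min_samples - 1])

    # Pick the requested percentile of the k-th neighbour distances.
    kth_distances.sort()
    index = min(n - 1, int(round(percentile / 100.0 * (n - 1))))
    return kth_distances[index]
```

Because the estimate depends only on the embeddings passed in, two tight groups yield a small eps regardless of how many unrelated faces share the folder, and a tiny input simply returns the configured default.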
An in-depth explanation of each of these implementations can be found at: https://www.notion.so/Google-Summer-of-Code-Project-Proposal-for-AOSSIE-PictoPy-31f0567ec53f80bebad2c92acfe5f429?source=copy_link
Implementation Plan
- ClusteringConfig block in settings.py with env-var overrides. No functional change, refactor only.
- face_quality_gate() in a new face_quality.py utility. Insert between YOLO output and FaceNet crop/embed. Return the faces_skipped count up the call stack instead of discarding it.
- estimate_eps() using the k-NN distance distribution on the post-gate embedding set. k = min_samples, read from config.
- GlobalReclusterData Pydantic model with faces_skipped: Optional[int]. Wire the count from utility functions through the FastAPI route into the response envelope.
- Add faces_skipped to the frontend TypeScript type. Trigger a ShadCN corner toast in ApplicationControlsCard.tsx when faces_skipped > 0.
References