Describe the feature
Currently PictoPy identifies images only by their file path. If the same image file is:
Copied to another folder
Downloaded multiple times
Renamed
Or exists in backups
…it is stored as a new independent image in the database.
This causes:
Duplicate thumbnails
Duplicate metadata
Duplicate face processing
Duplicate tagging
Wasted storage and processing time
No way to detect or manage duplicates
Add ScreenShots
Same images:
Harsh_Shah.jpg exists in two folders
🔍 Current Behavior:
Images are uniquely identified by path
Same image in different folders = separate DB rows
No content hash or duplicate detection exists
✅ Expected Behavior:
The system should compute a content hash (SHA-256 or similar) for each image
Store it in the DB as image_hash
Allow:
Finding all images with same content
Showing duplicate groups
Letting the user decide what to do with them (keep/delete/merge)
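A rough sketch of how the expected behavior could look, using only the Python standard library (hashlib and sqlite3). This is not PictoPy's actual schema or code: the table name `images`, the `path` column, and the helper names `compute_image_hash`, `index_images` and `find_duplicate_groups` are all made up for illustration; only `image_hash` comes from this proposal. Note that SHA-256 over the file bytes only catches byte-identical copies (copied, renamed, re-downloaded, backed up); a resized or re-encoded version of the same picture would get a different hash.

```python
import hashlib
import sqlite3


def compute_image_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_images(conn, paths):
    """Store (path, image_hash) rows. Table layout is hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        " path TEXT PRIMARY KEY,"
        " image_hash TEXT NOT NULL)"
    )
    # Index the hash column so duplicate lookups do not scan the whole table.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_images_hash ON images(image_hash)"
    )
    rows = [(str(p), compute_image_hash(p)) for p in paths]
    conn.executemany(
        "INSERT OR REPLACE INTO images (path, image_hash) VALUES (?, ?)", rows
    )
    conn.commit()


def find_duplicate_groups(conn):
    """Return {image_hash: [paths]} for every hash stored under 2+ paths."""
    query = (
        "SELECT image_hash, path FROM images "
        "WHERE image_hash IN ("
        "  SELECT image_hash FROM images"
        "  GROUP BY image_hash HAVING COUNT(*) > 1"
        ") ORDER BY image_hash, path"
    )
    groups = {}
    for image_hash, path in conn.execute(query):
        groups.setdefault(image_hash, []).append(path)
    return groups
```

With something like this, the Harsh_Shah.jpg case above would come back as one group containing both folder paths, and the UI could then offer keep/delete/merge on that group. Thumbnails, metadata, and face processing could also be keyed on image_hash instead of path so they run once per unique file, but that is a design choice for the maintainers.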