😎 Finding duplicate images made easy!
Updated Aug 15, 2025 - Python
Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree.
Image similarity in Golang. Version 4 (LATEST)
Tool to detect (and get rid of) similar images using perceptual hashing (pHash lib)
A utility for locating near duplicate photos irrespective of image resolution, compression settings or file format.
A Python tool to identify and remove similar-looking images from a dataset. Utilizes image preprocessing and hashing techniques for efficient comparison.
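A high-performance image deduplication tool based on deep learning features and cosine similarity, able to quickly and accurately find and remove duplicate or highly similar images from massive image collections.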
Downloader with custom wildcard system: cherry-picking internet with asterisks for HTML or right-carets for API, whether it's for time-critical website moments or just for laziness. Features directory listing and serve, alarm (essentially in-stock tracker), file sorter (organizer), image duplicate finder and tools for naked eyes.
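An efficient Python tool for finding duplicate images, supporting duplicate detection across millions of image files. Integrates multiple algorithms, including MD5 hashing, perceptual hashing (dHash/pHash/aHash), and a C++ acceleration library, and can identify duplicates that are byte-identical, resized, partially cropped, or altered by watermark changes.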
🏍️ A clustering tool providing exact and near de-duplication of images using vector embeddings.
A CLI tool for image analysis: checking image integrity, image deduplication, and image retrieval.
A Python command-line tool that identifies and groups similar images using average hashing. It supports single-level and recursive directory scanning, an adjustable similarity threshold, and JSON output. Ideal for image deduplication, organization, and content-based retrieval tasks.
The extended version of simhash supports fingerprint extraction of documents and images.
CLIP image deduplication toolkit
Sobel Gradient Image Deduplication
This Python script helps in identifying and moving duplicate images within a specified directory to a designated duplicates folder.
A utility for testing the performance of de-duplication algorithms by randomly generating “noisy” images in a dataset.
A Python program to detect duplicate images in a specified folder.
Client-side lossless image deduplication engine using block-level content-addressable storage
A Python notebook combining MD5 and perceptual hashing to detect exact-duplicate images
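Several of the tools listed above pair an exact-match hash (MD5) with a perceptual hash such as average hashing. The snippet below is a minimal sketch of that general idea and is not taken from any of the listed repositories. It assumes each image has already been decoded and downscaled to a small grayscale grid (for example 8x8, values 0-255). The function names and the `max_distance` threshold are hypothetical and chosen for illustration only.

```python
import hashlib


def md5_digest(data: bytes) -> str:
    """Hex MD5 of raw file bytes; equal digests indicate byte-identical files."""
    return hashlib.md5(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """Average hash of a small grayscale grid (assumed already downscaled).

    Each bit is 1 when the corresponding pixel is brighter than the grid mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: list[list[int]], b: list[list[int]],
                      max_distance: int = 5) -> bool:
    """Treat two grids as near-duplicates when their hashes differ in few bits.

    max_distance is an illustrative threshold, not a recommended value.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Illustrative data: a gradient grid, a slightly brighter copy, and an inverse.
grid = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
brighter = [[p + 3 for p in row] for row in grid]
inverted = [[252 - p for p in row] for row in grid]
```

In this sketch, MD5 only catches files whose bytes match exactly, while the average hash tolerates small brightness changes because every bit is computed relative to the grid mean. A real pipeline would also need image decoding and resizing, which are left out here.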