Open
Labels
enhancement (New feature or request)
Description
The current duplicate image detection function identifies only exact duplicates. The goal is to extend it to also find near-duplicates: images that are similar but have been slightly modified (e.g., cropped, resized, or color-altered).
Task
Update the duplicate detection function to incorporate Perceptual Hashing (pHash). This approach generates an image 'fingerprint' that changes little under minor image modifications. Two fingerprints can then be compared, typically by counting the bits in which they differ (the Hamming distance), to find near-duplicate images.
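To make the idea concrete, here is a minimal sketch of a DCT-based perceptual hash using only the standard library. It is an illustration under assumptions, not the project's implementation: the names `phash_from_gray` and `hamming_distance` are hypothetical, and the input is assumed to be a square grid of grayscale values (for example 32x32) that has already been loaded and resized by an imaging library.

```python
import math
from statistics import median


def phash_from_gray(pixels, hash_size=8):
    """Return a perceptual hash (hash_size**2 bits) of a square grayscale grid.

    `pixels` is a list of N rows, each holding N brightness values.
    Loading and resizing the image is assumed to happen elsewhere.
    """
    n = len(pixels)
    # DCT-II basis values: cos_table[u][x] = cos((2x + 1) * u * pi / (2n))
    cos_table = [
        [math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
        for u in range(hash_size)
    ]
    # Transform each row, keeping only the lowest `hash_size` frequencies.
    rows = [
        [sum(row[x] * cos_table[u][x] for x in range(n)) for u in range(hash_size)]
        for row in pixels
    ]
    # Transform the columns of that partial result to get the
    # low-frequency corner of the 2D DCT.
    coeffs = [
        sum(rows[y][u] * cos_table[v][y] for y in range(n))
        for v in range(hash_size)
        for u in range(hash_size)
    ]
    # One bit per coefficient: 1 if it lies above the median, else 0.
    mid = median(coeffs)
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | (1 if c > mid else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Illustrative synthetic image and a slightly brightened copy.
image = [[(x * 7 + y * 13) % 256 for x in range(32)] for y in range(32)]
brighter = [[min(255, p + 10) for p in row] for row in image]
print(hamming_distance(phash_from_gray(image), phash_from_gray(brighter)))
```

A distance of 0 means the fingerprints are identical; a small distance suggests a near-duplicate. The cut-off that separates "near" from "different" is a tuning choice that would need to be validated on real images.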
Steps
- Research Perceptual Hashing (pHash) to understand its implementation.
- Refactor the `get_hash` function to calculate the pHash of an image instead of the SHA256 hash.
- Validate the updated function to confirm its ability to detect near-duplicate images.
Acceptance Criteria
- The refactored function must be capable of detecting and moving near-duplicate images to the "duplicates" folder.
- The function should retain its ability to detect and move exact duplicate images to the "duplicates" folder.
- The function should not move images that are neither near-duplicates nor exact duplicates.
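The three criteria above can be sketched as a single distance threshold over precomputed hashes. This is a hedged illustration only: `find_near_duplicates`, `move_to_duplicates`, and the default `max_distance=5` are hypothetical names and values, and the hashes are assumed to be integers produced by the refactored `get_hash`.

```python
import shutil
from pathlib import Path


def find_near_duplicates(hashes, max_distance=5):
    """Return the paths whose hash is within `max_distance` bits of an
    image that was seen earlier and kept.

    `hashes` maps a file path to its integer perceptual hash. Exact
    duplicates have distance 0, so they are always included.
    """
    kept = []
    duplicates = []
    for path, value in hashes.items():
        if any(bin(value ^ k).count("1") <= max_distance for k in kept):
            duplicates.append(path)
        else:
            kept.append(value)
    return duplicates


def move_to_duplicates(paths, folder="duplicates"):
    """Move each listed file into `folder`, creating the folder if needed."""
    target = Path(folder)
    target.mkdir(exist_ok=True)
    for path in paths:
        shutil.move(str(path), str(target / Path(path).name))
```

With this shape, images farther than `max_distance` from every kept image are left in place, which corresponds to the third criterion. The first image of each group is the one that stays; a different policy (for example keeping the largest file) would need extra logic.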