It scans the chosen folder, identifies duplicate files by comparing checksums and full contents, and produces a CSV report.

arnavdutta/MirrorMatch

🔍 MirrorMatch - Duplicate File Finder

MirrorMatch is a cross-platform desktop utility (Python + Tkinter) to scan folders and find duplicate files.
It combines file size filtering, fast CRC32 checksums, and byte-level comparison to ensure accuracy.

Results are exported to a CSV report that you can review in Excel or LibreOffice.


✨ Features

✅ Accurate duplicate detection
Uses a 3-step process:

  1. Group files by size (fast pre-check).
  2. Within equal-sized groups, compute CRC32 checksum.
  3. For checksum matches, verify with byte-for-byte comparison.
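The three steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not MirrorMatch's actual code: the function names `crc32_of` and `find_duplicates`, the 1 MiB chunk size, and the choice to skip symlinks are all assumptions made for the example.

```python
import filecmp
import os
import zlib
from collections import defaultdict


def crc32_of(path, chunk_size=1 << 20):
    """Return the CRC32 of a file as 8 hex digits, reading it in chunks."""
    crc = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"


def find_duplicates(root):
    """Return a list of (checksum, [paths]) groups of identical files."""
    # Step 1: group by size; a file with a unique size has no duplicate.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skipping symlinks is an assumption of this sketch.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        # Step 2: compute CRC32 only within equal-sized groups.
        by_crc = defaultdict(list)
        for path in paths:
            by_crc[crc32_of(path)].append(path)
        # Step 3: confirm byte for byte, because CRC32 values can collide.
        for checksum, candidates in by_crc.items():
            remaining = candidates
            while len(remaining) > 1:
                first, rest = remaining[0], remaining[1:]
                same = [first] + [
                    p for p in rest if filecmp.cmp(first, p, shallow=False)
                ]
                if len(same) > 1:
                    groups.append((checksum, same))
                remaining = [p for p in rest if p not in same]
    return groups
```

Calling `find_duplicates("/some/folder")` returns one entry per set of identical files. The size pre-check keeps most files from being read at all, and the final `filecmp.cmp(..., shallow=False)` pass is what makes the result exact rather than probabilistic.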

✅ Intuitive interface

  • Browse folder easily.
  • Progress bar with real-time ETA.
  • Tooltips to guide users.
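One simple way to produce a real-time ETA for a progress bar is to extrapolate from the average rate seen so far. The sketch below is only an illustration of that idea; `EtaEstimator`, `format_eta`, and the `MM:SS` display format are hypothetical names and choices, and the application may compute its ETA differently.

```python
import time


class EtaEstimator:
    """Estimate remaining time from the average rate observed so far."""

    def __init__(self, total, clock=time.monotonic):
        self.total = total
        self.clock = clock          # injectable so the estimate is testable
        self.start = clock()

    def remaining_seconds(self, done):
        """Return estimated seconds left, or None before any progress."""
        if done <= 0:
            return None
        elapsed = self.clock() - self.start
        # Time per finished item, multiplied by the items still to do.
        return elapsed * (self.total - done) / done


def format_eta(seconds):
    """Format seconds as MM:SS, or a placeholder when unknown."""
    if seconds is None:
        return "--:--"
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes:02d}:{secs:02d}"
```

A scan loop would create `EtaEstimator(total_files)` once and call `format_eta(eta.remaining_seconds(files_done))` each time it updates the progress bar.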

✅ Control the scan

  • Pause / Resume whenever you want.
  • Cancel the scan and reset progress instantly.

✅ CSV export

  • CSV report with duplicate groups.
  • Opens automatically after scan (on Windows, macOS, and Linux).
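Opening a file in the default application differs by platform. The following is a sketch of one common approach using `os.startfile` on Windows and the `open` and `xdg-open` commands elsewhere. The helper names `open_command` and `open_report` are hypothetical, and `xdg-open` is assumed to be installed on the Linux system.

```python
import os
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the command used to open path, or None on Windows."""
    if platform.startswith("win"):
        return None  # Windows uses os.startfile instead of a command
    if platform == "darwin":
        return ["open", path]
    return ["xdg-open", path]  # assumed present on the Linux desktop


def open_report(path):
    """Open path in the default application for the current platform."""
    command = open_command(path)
    if command is None:
        os.startfile(path)  # this function only exists on Windows
    else:
        subprocess.run(command, check=False)
```

Keeping the platform decision in `open_command` makes it checkable without launching anything: `open_command("report.csv", "darwin")` returns `["open", "report.csv"]`.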

✅ Cross-platform
Works on Windows, macOS, and Linux with Python 3.x.


🚀 Installation & Usage

  1. Clone the repository:

  2. Install dependencies:

     (Only uses the standard library; no extra packages are needed by default!)

  3. Run the application:


📊 Output Example

CSV output file:
duplicate_files_<foldername>_<timestamp>.csv

| checksum | file_path | duplicate_count |
|----------|-----------|-----------------|
| a3f4c2e1 | /path/to/duplicate1.txt | 3 |
| a3f4c2e1 | /path/to/duplicate2.txt | 3 |
| a3f4c2e1 | /path/to/duplicate3.txt | 3 |
| b5e8d1f9 | /path/to/another_duplicate.docx | 2 |
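A report with these three columns could be written with the `csv` module along the following lines. This is a sketch, not the project's code: `report_name` and `write_report` are hypothetical helpers, the input is assumed to be a list of `(checksum, [paths])` groups, and the timestamp format is a guess, since only the `duplicate_files_<foldername>_<timestamp>.csv` pattern is documented.

```python
import csv
import os
from datetime import datetime


def report_name(folder, now=None):
    """Build duplicate_files_<foldername>_<timestamp>.csv for a folder."""
    now = now or datetime.now()
    foldername = os.path.basename(os.path.normpath(folder))
    # The timestamp format below is an assumption of this sketch.
    return f"duplicate_files_{foldername}_{now:%Y%m%d_%H%M%S}.csv"


def write_report(groups, out_path):
    """Write one row per file: checksum, file_path, duplicate_count."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["checksum", "file_path", "duplicate_count"])
        for checksum, paths in groups:
            for path in paths:
                # Every file in a group carries the size of that group.
                writer.writerow([checksum, path, len(paths)])
```

Opening the file with `newline=""` is what the `csv` module documentation recommends, so that rows are not separated by blank lines on Windows.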

⚙️ Options & Controls

  • Browse: Pick a folder to scan.
  • Find Duplicates: Starts scanning in a background thread (UI stays responsive).
  • Pause / Resume: Temporarily stop and continue scanning without losing progress.
  • Cancel: Abort scan and reset progress bar instantly.
  • About: Shows app version and credits.
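Pause, resume, and cancel for a background scan are often built on `threading.Event`. The class below is a minimal sketch of that pattern under stated assumptions: `ScanWorker` and its methods are hypothetical names, and the real application may structure its worker differently.

```python
import threading


class ScanWorker:
    """Run a per-item job in a background thread with pause and cancel."""

    def __init__(self, items, handle):
        self.items = list(items)
        self.handle = handle                 # called once per item
        self._running = threading.Event()    # set = running, clear = paused
        self._running.set()
        self._cancelled = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def pause(self):
        self._running.clear()

    def resume(self):
        self._running.set()

    def cancel(self):
        self._cancelled.set()
        self._running.set()  # wake a paused worker so it can exit

    def join(self, timeout=None):
        self._thread.join(timeout)

    def _run(self):
        for item in self.items:
            self._running.wait()             # blocks here while paused
            if self._cancelled.is_set():
                return
            self.handle(item)
```

Because the worker checks the events between items, a pause takes effect after the current file finishes, and no progress is lost on resume. In a Tkinter application, progress updates from the worker are usually handed back to the main thread (for example via `after`) rather than touching widgets directly.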

🛠️ Tech Stack

  • Language: Python 3
  • GUI: Tkinter (cross-platform native GUI)
  • Logic: os, zlib, threading, csv, subprocess

👨‍💻 Author

Arnav Dutta


📜 License

This project is licensed under the MIT License - see the LICENSE file for details.
