DupeFinder

Fast file deduplication and disk space analyzer for Windows, Linux, and macOS.

DupeFinder scans directories to identify duplicate files using cryptographic hashes. It supports multiple hash algorithms, configurable file filters, and various actions including reporting, deleting, hardlinking, or moving duplicate files.
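The general idea of hash-based duplicate detection can be sketched in a few lines. The Python below is an illustration only, not DupeFinder's actual (Perl) implementation: the function names `file_digest` and `find_duplicates` are hypothetical, and grouping by size before hashing is a common optimization assumed here, not something this README states the tool does.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots, algorithm="sha256"):
    """Return a list of groups; each group is a sorted list of identical files."""
    # Pass 1: bucket regular files by size (cheap, no file reads needed).
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path, algorithm)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates(["/home/user/photos", "/home/user/downloads"])` would return one list per set of identical files found across both trees.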

Prerequisites

Linux (Debian/Ubuntu)

sudo apt update
sudo apt install perl cpanminus build-essential

Linux (RHEL/CentOS/Fedora)

sudo dnf install perl perl-App-cpanminus gcc make

macOS

brew install perl cpanminus

Windows

Download and install Strawberry Perl, which includes cpanm.

Installation

git clone https://github.com/muhammad-fiaz/DupeFinder.git
cd DupeFinder
cpanm --installdeps .
chmod +x bin/dupefinder

Add to PATH (optional):

export PATH="$PWD/bin:$PATH"

Usage

Basic Scan

./bin/dupefinder /path/to/directory

Multiple Directories

./bin/dupefinder /home/user/photos /home/user/downloads

Output Formats

./bin/dupefinder -f json -o report.json /data
./bin/dupefinder -f yaml -o report.yaml /data
./bin/dupefinder -f csv -o report.csv /data

Filter by File Size

# Only consider files between 1024 bytes (1 KiB) and 10485760 bytes (10 MiB)
./bin/dupefinder -m 1024 -M 10485760 /data

Filter by Extension

./bin/dupefinder -e jpg,png,gif /media/photos
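The size and extension filters amount to a predicate applied to each file before hashing. The following is a minimal Python sketch of such a predicate, not the tool's own code. The names `parse_extensions` and `passes_filters` are hypothetical, and two behaviors are assumptions: the size bounds are treated as inclusive, and extensions are matched case-insensitively.

```python
import os


def parse_extensions(spec):
    """Turn 'jpg,png,gif' into {'.jpg', '.png', '.gif'} (lowercased)."""
    return {
        "." + ext.strip().lstrip(".").lower()
        for ext in spec.split(",")
        if ext.strip()
    }


def passes_filters(path, size, min_size=None, max_size=None, extensions=None):
    """Return True if a file of `size` bytes at `path` should be scanned."""
    if min_size is not None and size < min_size:
        return False  # assumption: the minimum bound is inclusive
    if max_size is not None and size > max_size:
        return False  # assumption: the maximum bound is inclusive
    if extensions:
        # assumption: 'photo.JPG' matches the extension 'jpg'
        if os.path.splitext(path)[1].lower() not in extensions:
            return False
    return True
```

For instance, `passes_filters("a.JPG", 2048, 1024, 10485760, parse_extensions("jpg,png,gif"))` accepts the file, while the same call with a 100-byte file rejects it.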

Delete Duplicates

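# Dry run (default): shows what would be deleted without removing anything
./bin/dupefinder -A delete -k first /tmp/downloads

# Execute: -N disables dry-run mode, so duplicates are actually deleted
./bin/dupefinder -A delete -N -k first /tmp/downloads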
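How a keep policy and a dry-run default interact can be shown with a short Python sketch. It is an illustration under stated assumptions, not DupeFinder's code: the function `delete_duplicates` is hypothetical, and "first" is assumed to mean first in sorted path order, which this README does not specify.

```python
import os


def delete_duplicates(group, keep="first", dry_run=True):
    """Keep one path from a duplicate group and delete the rest.

    Returns (kept_path, removed_paths). With dry_run=True nothing is deleted.
    """
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    ordered = sorted(group)  # assumption: ordering is by sorted path
    kept = ordered[0] if keep == "first" else ordered[-1]
    removed = [p for p in ordered if p != kept]
    for path in removed:
        if dry_run:
            print(f"[dry-run] would delete {path}")
        else:
            os.remove(path)
    return kept, removed
```

Defaulting `dry_run` to `True` mirrors the safety choice documented above: a destructive action has to be requested explicitly.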

Create Hardlinks

./bin/dupefinder -A hardlink -N /data/backups
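Replacing a duplicate with a hard link keeps every path in place while the data is stored only once. The Python sketch below shows one safe way to do this; it is not taken from DupeFinder. The function `replace_with_hardlink` and the `.dupefinder-tmp` suffix are hypothetical, and the link-then-rename sequence is an assumed design that avoids a window in which the duplicate path does not exist.

```python
import os


def replace_with_hardlink(original, duplicate):
    """Replace `duplicate` with a hard link pointing at `original`."""
    if os.path.samefile(original, duplicate):
        return  # already the same file on disk, nothing to do
    if os.stat(original).st_dev != os.stat(duplicate).st_dev:
        raise OSError("hard links cannot span filesystems")
    tmp = duplicate + ".dupefinder-tmp"  # hypothetical temporary name
    os.link(original, tmp)               # create the new link first
    os.replace(tmp, duplicate)           # then swap it in over the duplicate
```

Because hard links share one copy of the data, editing the file through either path afterwards changes both, so this action suits files that are not modified in place.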

Move Duplicates

./bin/dupefinder -A move -d /tmp/dupes -N /data
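Moving duplicates from several directories into one destination raises the question of name collisions. A minimal Python sketch of one way to handle that follows. The function `move_duplicate` and the `name.1.ext` numbering scheme are hypothetical; this README does not say how DupeFinder resolves collisions.

```python
import os
import shutil


def move_duplicate(path, dest_dir):
    """Move `path` into `dest_dir`, renaming it if the name is already taken.

    Returns the final location of the moved file.
    """
    os.makedirs(dest_dir, exist_ok=True)
    base = os.path.basename(path)
    stem, ext = os.path.splitext(base)
    target = os.path.join(dest_dir, base)
    counter = 1
    while os.path.exists(target):
        # assumption: collisions get a numeric suffix, e.g. photo.1.jpg
        target = os.path.join(dest_dir, f"{stem}.{counter}{ext}")
        counter += 1
    shutil.move(path, target)
    return target
```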

Use Config File

./bin/dupefinder -c config/dupefinder.yaml /data

Options

Option                  Description
-c, --config FILE       Config file (YAML)
-o, --output FILE       Output file for report
-f, --format FMT        Output format: text, json, yaml, csv
-v, --verbose           Verbose output
-q, --quiet             Suppress progress messages
-C, --no-color          Disable colored output
-n, --dry-run           Dry run mode (default)
-N, --no-dry-run        Execute file operations
-m, --min-size N        Minimum file size in bytes
-M, --max-size N        Maximum file size in bytes
-e, --extensions        Comma-separated extensions
-a, --algorithm         Hash: MD5, SHA1, SHA256, SHA512
-A, --action            Action: report, delete, hardlink, move
-k, --keep              Keep first or last file
-d, --move-dir DIR      Destination for moved duplicates
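To make the algorithm choice concrete, here is a small Python sketch that maps the four names accepted by -a onto Python's standard `hashlib`. It illustrates the trade-off only and says nothing about how DupeFinder implements hashing; `ALGORITHMS` and `hash_bytes` are hypothetical names.

```python
import hashlib

# Option value as listed above -> hashlib algorithm name.
ALGORITHMS = {
    "MD5": "md5",
    "SHA1": "sha1",
    "SHA256": "sha256",
    "SHA512": "sha512",
}


def hash_bytes(data, algorithm="SHA256"):
    """Return the hex digest of `data` using one of the listed algorithms."""
    try:
        name = ALGORITHMS[algorithm.upper()]
    except KeyError:
        raise ValueError(f"unsupported algorithm: {algorithm}") from None
    return hashlib.new(name, data).hexdigest()
```

The hex digests are 32, 40, 64, and 128 characters long for MD5, SHA1, SHA256, and SHA512 respectively. MD5 and SHA1 are generally faster but have known collision attacks, so SHA256 or SHA512 is the more cautious choice before a destructive action such as delete.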

Running Tests

prove -l t/

License

Apache License 2.0 - see LICENSE
