A comprehensive Python script designed to automatically organize multimedia files and documents based on their creation date.
It recursively scans subfolders, prevents duplicates using SHA256 hashing, handles files without metadata, and logs every execution into a SQLite database to track volumes and statistics over time.
✅ Automatic organization by Year/Month
✅ Support for Images, Videos, Audio, and Documents
✅ Duplicate prevention via SHA256 Hashing
✅ Automatic renaming for files with identical names (_Copy suffix)
✅ Files without dates → moved to NoDate/ folder
✅ Unsupported files → moved to Others/ folder
✅ Dedicated subfolders for Audio and Documents (/Year/Audio and /Year/Documents)
✅ Real-time progress bar with tqdm
✅ SQLite Logging with statistics on files, duplicates, and GB processed
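The duplicate check listed above can be illustrated with a minimal sketch. It is not the script's actual code: the function names `sha256_of_file` and `is_duplicate`, the chunk size, and the in-memory set of hashes are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_duplicate(path: Path, seen_hashes: set[str]) -> bool:
    """Record the file's hash; return True if identical content was already seen.

    Assumption: hashes are kept in a plain set for the duration of one run.
    """
    file_hash = sha256_of_file(path)
    if file_hash in seen_hashes:
        return True
    seen_hashes.add(file_hash)
    return False
```

Hashing content rather than comparing names means two files with different names but identical bytes count as duplicates, while two different files that share a name do not.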
- Clone or download the project:
git clone https://github.com/Jio-Jio99/media-organizer.git
cd media-organizer
- Install requirements:
pip install -r requirements.txt
Pillow>=10.0.0
tqdm>=4.66.0
ℹ️ `sqlite3` is included in the Python standard library.
python organize_media.py ./Source_Folder ./Archive_Destination
You can filter the organization process by using the --type flag:
- Images only:
python organize_media.py ./Media ./Archive --type image
- Videos only:
python organize_media.py ./Media ./Archive --type video
- Audio only:
python organize_media.py ./Media ./Archive --type audio
- Documents only:
python organize_media.py ./Media ./Archive --type document
Example of the ./Archive directory after processing:
Archive/
├── 2024/
│   ├── 01/
│   │   ├── IMG_1234.jpg
│   │   └── VID_0001.mp4
│   ├── Audio/
│   │   └── recording.m4a
│   └── Documents/
│       └── report.pdf
├── 2025/
│   ├── 03/
│   │   └── event_photo.png
│   ├── Audio/
│   │   └── lecture_voice.mp3
│   └── Documents/
│       └── notes.docx
├── Others/
│   └── script.sh
└── NoDate/
    └── file_without_metadata.jpg
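As a rough illustration of how a file could be mapped onto this layout, here is a small sketch. The extension sets, the function names (`classify`, `destination_dir`, `resolve_name_clash`), the order of the checks, and the choice to append `_Copy` repeatedly when a `_Copy` name is also taken are assumptions; the README does not spell out the script's exact rules.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

# Hypothetical extension sets; the script's real lists are not given in this README.
EXTENSIONS = {
    "image": {".jpg", ".jpeg", ".png"},
    "video": {".mp4", ".mov"},
    "audio": {".mp3", ".m4a"},
    "document": {".pdf", ".docx"},
}


def classify(path: Path) -> str:
    """Return 'image', 'video', 'audio', 'document', or 'other' from the extension."""
    suffix = path.suffix.lower()
    for file_type, extensions in EXTENSIONS.items():
        if suffix in extensions:
            return file_type
    return "other"


def destination_dir(archive: Path, path: Path, date: Optional[datetime]) -> Path:
    """Map a file to its folder following the layout shown above."""
    file_type = classify(path)
    if file_type == "other":
        return archive / "Others"
    if date is None:  # assumption: undated files of any supported type go to NoDate/
        return archive / "NoDate"
    year = f"{date.year:04d}"
    if file_type == "audio":
        return archive / year / "Audio"
    if file_type == "document":
        return archive / year / "Documents"
    return archive / year / f"{date.month:02d}"  # images and videos: Year/Month


def resolve_name_clash(target: Path) -> Path:
    """Append _Copy to the stem until the name is free (repeat behaviour is assumed)."""
    while target.exists():
        target = target.with_name(f"{target.stem}_Copy{target.suffix}")
    return target
```

For example, `destination_dir(Path("Archive"), Path("recording.m4a"), datetime(2024, 6, 1))` yields `Archive/2024/Audio` under these assumptions.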
Each execution generates or updates a SQLite database located at:
<DESTINATION>/Logs/organizer_log.db
| Field | Description |
|---|---|
| `id` | Unique log identifier |
| `timestamp` | Date and time of execution |
| `mode` | File type processed (image, video, audio, document, all) |
| `files_processed` | Total files scanned |
| `files_copied` | Files successfully moved/copied |
| `duplicates_skipped` | Number of duplicate files detected and skipped |
| `total_size_gb` | Total size of data processed in Gigabytes |
To view the last 5 executions:
sqlite3 Archive/Logs/organizer_log.db "SELECT * FROM logs ORDER BY id DESC LIMIT 5;"
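The same log can also be read from Python with the standard-library `sqlite3` module. The sketch below relies only on the `logs` table and the column names listed above; the helper names `last_runs` and `total_gb` are hypothetical and are not part of the script.

```python
import sqlite3
from pathlib import Path


def last_runs(db_path: Path, limit: int = 5) -> list[sqlite3.Row]:
    """Return the most recent log rows, newest first."""
    connection = sqlite3.connect(db_path)
    connection.row_factory = sqlite3.Row  # allows access by column name
    try:
        return connection.execute(
            "SELECT * FROM logs ORDER BY id DESC LIMIT ?", (limit,)
        ).fetchall()
    finally:
        connection.close()


def total_gb(db_path: Path) -> float:
    """Sum total_size_gb over every logged run (0.0 when the table is empty)."""
    connection = sqlite3.connect(db_path)
    try:
        (total,) = connection.execute(
            "SELECT COALESCE(SUM(total_size_gb), 0.0) FROM logs"
        ).fetchone()
        return float(total)
    finally:
        connection.close()
```

Calling `last_runs(Path("Archive/Logs/organizer_log.db"))` returns rows whose fields can be read by name, for example `row["mode"]` or `row["files_copied"]`.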
- Recursive Scan: Traverses the source directory and all subdirectories.
- Classification: Determines file type based on extension.
- Deduplication: Calculates a SHA256 hash for every file to ensure no identical content is copied twice.
- Date Extraction (Priority Order):
  - EXIF Metadata: Specifically the `DateTimeOriginal` tag.
  - Filename Parsing: Looks for date patterns (e.g., `2023-07-15`).
  - System Metadata: Fallback to the file's last modification date.
- Validation: If the date is missing or set in the future, the file is sent to `NoDate/`.
- Execution: Files are copied/moved to the organized destination.
- Database Update: Stats are written to the SQLite log.
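The date-extraction priority and the validation step described above might look roughly like the following sketch. The function names are hypothetical. The filename pattern covers only the `YYYY-MM-DD` form given as an example, so any other patterns the script recognises are not represented. The EXIF step assumes Pillow is installed and simply yields no date when it is not, or when the file is not a readable image.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional


def exif_date(path: Path) -> Optional[datetime]:
    """Read DateTimeOriginal (tag 0x9003 in the Exif IFD) via Pillow, if possible."""
    try:
        from PIL import Image  # Pillow, listed in requirements.txt

        with Image.open(path) as image:
            raw = image.getexif().get_ifd(0x8769).get(0x9003)
        return datetime.strptime(str(raw), "%Y:%m:%d %H:%M:%S") if raw else None
    except (ImportError, OSError, ValueError):
        return None  # Pillow missing, not an image, or malformed tag


def filename_date(path: Path) -> Optional[datetime]:
    """Look for a YYYY-MM-DD pattern in the file name (assumed pattern)."""
    match = re.search(r"(\d{4})-(\d{2})-(\d{2})", path.name)
    if not match:
        return None
    try:
        return datetime(*map(int, match.groups()))
    except ValueError:  # digits that do not form a real date, e.g. month 13
        return None


def modified_date(path: Path) -> Optional[datetime]:
    """Fall back to the file's last modification time."""
    try:
        return datetime.fromtimestamp(path.stat().st_mtime)
    except OSError:
        return None


def extract_date(path: Path, now: Optional[datetime] = None) -> Optional[datetime]:
    """Try each source in priority order; None means the file belongs in NoDate/."""
    now = now or datetime.now()
    for source in (exif_date, filename_date, modified_date):
        date = source(path)
        if date is not None:
            # Validation: a date in the future is treated as missing.
            return date if date <= now else None
    return None
```

In this sketch the first source that yields a date wins, and a future date is rejected without trying the later sources. Whether the script instead falls through to the next source in that case is not stated in this README.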
Gioele Zoccoli 💻 Computer Science Student @ Sapienza University of Rome
This project is distributed under the MIT License.
You are free to use, modify, and redistribute it as you wish.
Create a terminal alias to run the script quickly from anywhere:
alias organize="python /absolute/path/to/organize_media.py"
Then simply run:
organize ./Downloads ./MyArchive --type all