SmartExcelGuardian v1.1.0 is a powerful Python desktop application for professional Excel data cleanup, validation, and monitoring.
This repository contains the full source code, allowing you to customize heuristic scoring, formula cleanup, conditional formatting, export logic, and UI behavior for enterprise reporting, analytics, or auditing workflows.
- 📂 Excel File Input — Load .xlsx or .xls workbooks
- 🧹 Automatic Data Cleanup — Handles missing values, duplicates, and type inconsistencies
- 🧠 Heuristic Scoring Engine (0–100) — Flags high-risk columns based on data quality
- 📝 Column Name Normalization — snake_case rename suggestions
- 🔢 Type Normalization — Numeric and string coercion with validation
- 📊 Missing Value Imputation — Mean (numeric) and Mode (string)
- 📈 Duplicate Detection — Column-level duplicate analysis
- 🧮 Formula Cleanup — Removes invalid Excel formulas safely
- 🎨 Conditional Formatting — Highlights high-risk columns in Excel exports
- 📐 Auto Excel Formulas — Automatic SUM & AVERAGE for numeric columns
- 🧵 Multithreaded Execution — Responsive UI during large file processing
- 🖱️ Interactive Results Table — View column stats, scores, and suggestions
- 📄 Export Results — Excel, PDF, JSON, and TXT formats
- 📑 Professional PDF Reports — Pagination and color-coded heuristic scores
- 🎨 Modern Dark UI — Built with Tkinter + ttkbootstrap
- 📘 Built-In About / Guide — Usage instructions and feature overview
- 🔒 Local Processing Only — No internet access or data transmission
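
To make two of the features above concrete, here is a minimal standard-library sketch of snake_case rename suggestions and mean/mode imputation. It is an illustration, not the application's actual code: the function names `to_snake_case` and `impute` and the exact renaming rules are assumptions, and the application itself relies on pandas for this work.

```python
import re
from statistics import mean, mode


def to_snake_case(name: str) -> str:
    """Suggest a snake_case version of a column header (illustrative rules)."""
    # Split camelCase boundaries: "totalAmount" -> "total_Amount"
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Collapse any run of non-alphanumeric characters into one underscore
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)
    return name.strip("_").lower()


def impute(values: list) -> list:
    """Fill None entries with the mean (numeric column) or mode (text column)."""
    present = [v for v in values if v is not None]
    if not present:
        return values[:]  # nothing to derive a fill value from
    if all(isinstance(v, (int, float)) for v in present):
        fill = mean(present)
    else:
        fill = mode(present)
    return [fill if v is None else v for v in values]
```

For example, `to_snake_case("Order ID")` suggests `order_id`, and `impute(["a", None, "b", "a"])` fills the gap with the most frequent value, `"a"`.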
- Clone or download this repository:
git clone https://github.com/rogers-cyber/SmartExcelGuardian.git
cd SmartExcelGuardian
- Install required Python packages:
pip install pandas numpy ttkbootstrap openpyxl reportlab
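(Tkinter is included with most standard Python installations; on some Linux distributions it is packaged separately, for example as python3-tk. Reading legacy .xls files through pandas also requires the xlrd package.)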
- Run the application:
python SmartExcelGuardian.py
- Optional: Build a standalone executable using PyInstaller:
pyinstaller --onefile --windowed SmartExcelGuardian.py
- Select Excel File:
  - Click 📄 Excel File to choose your workbook.
- Start Cleanup:
  - Click 🛡 CLEAN DATA.
  - The tool analyzes each column and applies cleanup logic.
- Stop Cleanup:
  - Click 🛑 STOP to safely interrupt processing.
- Review Results:
  - Columns are displayed with:
    - Original type → Cleaned type
    - Missing values count
    - Duplicate count
    - Heuristic score (0–100)
    - Suggested rename (snake_case)
- Export Results:
  - 📄 Excel — Cleaned data, conditional formatting, formulas
  - 📄 PDF — Professional audit-style report
  - 📄 JSON — Structured results for automation
  - 📃 TXT — Plain-text summary
- About / Guide:
  - Click ℹ About for features, usage steps, and developer info
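
The per-column figures listed under Review Results (missing count, duplicate count, heuristic score) could be computed roughly as follows. This is a sketch under stated assumptions: the `ColumnReport` class, the `diagnose` function, the 70/30 weighting, and the convention that a higher score means cleaner data are all hypothetical, since the scoring formula lives in the source code and is not described here.

```python
from dataclasses import dataclass


@dataclass
class ColumnReport:
    name: str
    missing: int
    duplicates: int
    score: int  # 0-100; in this sketch, higher means cleaner data


def diagnose(name: str, values: list) -> ColumnReport:
    """Compute illustrative per-column statistics for one column."""
    total = len(values)
    if total == 0:
        return ColumnReport(name, 0, 0, 0)
    present = [v for v in values if v is not None and v != ""]
    missing = total - len(present)
    # Every repeat of an already-seen value counts as one duplicate
    duplicates = len(present) - len(set(present))
    # Hypothetical weighting: missing cells cost more than repeated ones
    penalty = 70 * missing / total + 30 * duplicates / total
    score = max(0, min(100, round(100 - penalty)))
    return ColumnReport(name, missing, duplicates, score)
```

With this weighting, a four-row column containing one empty cell and one repeated value scores 75. Adjust the weights to match how strongly your workflow penalizes gaps versus repeats.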
| Option | Description |
| --- | --- |
| Excel File | Load a workbook for cleanup |
| Start Cleanup | Begin heuristic analysis and data cleaning |
| Stop Cleanup | Safely halt processing |
| Results Table | Interactive column-level diagnostics |
| Export Excel | Cleaned data + formulas + formatting |
| Export PDF | Professional audit-style report |
| Export JSON | Structured cleanup metadata |
| Export TXT | Plain-text summary |
| About / Guide | Built-in usage documentation |
- Excel (.xlsx) — Cleaned data, highlighted risk columns, formulas
- PDF — Color-coded heuristic report with pagination
- JSON — Machine-readable cleanup results
- TXT — Human-readable text summary
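
As an illustration of the JSON output, the sketch below serializes per-column results into a machine-readable document. The key names (`tool`, `columns`, `score`, and so on) and the function names are assumptions made for this example; check the export logic in the source for the real schema.

```python
import json
from pathlib import Path


def results_to_json(results: list[dict]) -> str:
    """Serialize per-column results; the key names here are hypothetical."""
    payload = {"tool": "SmartExcelGuardian", "columns": results}
    # ensure_ascii=False keeps non-ASCII column names readable in the file
    return json.dumps(payload, indent=2, ensure_ascii=False)


def export_json(results: list[dict], path: str) -> None:
    """Write the serialized results to a UTF-8 encoded file."""
    Path(path).write_text(results_to_json(results), encoding="utf-8")
```

A downstream script can then load the file with `json.load` and filter columns by score without parsing the Excel or PDF output.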
- Python 3.10+
- pandas — Data processing and validation
- numpy — Numeric computation support
- ttkbootstrap — Modern themed UI
- openpyxl — Excel reading, writing, and formatting
- reportlab — PDF generation
- Tkinter — Standard Python GUI framework
- threading — Background cleanup execution
- os / sys — Platform-aware file handling (standard library modules)
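
The background execution and safe stop behavior provided by `threading` can be sketched as a worker thread that checks a `threading.Event` between columns and reports progress through a queue. The names `cleanup_worker` and `run_cleanup` and the message tuples are hypothetical; the application's own worker may be structured differently.

```python
import queue
import threading


def cleanup_worker(columns, stop_event, progress):
    """Process columns one by one, checking for a stop request between them."""
    for name in columns:
        if stop_event.is_set():
            progress.put(("stopped", name))
            return
        # Placeholder for the real per-column cleanup work
        progress.put(("done", name))
    progress.put(("finished", None))


def run_cleanup(columns, stop_event=None):
    """Start the worker in the background and return handles to control it."""
    if stop_event is None:
        stop_event = threading.Event()
    progress = queue.Queue()
    worker = threading.Thread(
        target=cleanup_worker,
        args=(columns, stop_event, progress),
        daemon=True,
    )
    worker.start()
    return worker, stop_event, progress
```

In a Tkinter application, the UI thread would typically poll `progress` with the widget `after` method rather than blocking on the queue, and a stop button handler would simply call `stop_event.set()`.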
- SmartExcelGuardian processes all files locally
- No data is transmitted or uploaded
- Heuristic scores help prioritize risky columns
- Conditional formatting visually highlights problem areas
- Numeric columns receive automatic SUM and AVERAGE formulas
- Column renaming suggestions enforce consistent formatting
- Error logs are written to excelguardian.log
- Suitable for auditors, analysts, and data engineers
- Fully portable when compiled as a standalone executable
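
The automatic SUM and AVERAGE formulas mentioned above come down to building range references for each numeric column. The following is a minimal sketch, assuming the data starts in row 2 below a header row; the helper names `column_letter` and `summary_formulas` are hypothetical.

```python
def column_letter(index: int) -> str:
    """Convert a 1-based column index to an Excel letter (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def summary_formulas(col_index: int, first_row: int, last_row: int) -> dict:
    """Build SUM and AVERAGE formulas over one column's data range."""
    col = column_letter(col_index)
    cell_range = f"{col}{first_row}:{col}{last_row}"
    return {
        "sum": f"=SUM({cell_range})",
        "average": f"=AVERAGE({cell_range})",
    }
```

For the second column with data in rows 2 to 101, this yields `=SUM(B2:B101)` and `=AVERAGE(B2:B101)`. With openpyxl, assigning such a string to a cell value stores it as a formula, and `openpyxl.utils.get_column_letter` can replace the hand-written `column_letter` helper.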
SmartExcelGuardian v1.1.0 is developed and maintained by Mate Technologies, delivering professional-grade Python productivity and data quality tools.
Website: https://matetools.gumroad.com
Distributed as commercial source code.
You may use it for personal or commercial projects.
Redistribution, resale, or rebranding as a competing product is not allowed.
