SecureAI Backup — Intelligent Backup & Recovery System

See Architecture diagram

🔹 Feature Overview

1. AI-Powered Data Leak Prevention (DLP) for Backups

Goal: Detect sensitive data before backing it up and ensure it is never stored without encryption.

Techniques:

Regex + NLP models to detect Aadhaar, PAN, credit card numbers, email IDs, etc.
Example: the regex \d{4}\s\d{4}\s\d{4} matches a 12-digit Aadhaar number written as three space-separated groups of four digits.
Use an ML/NLP model (spaCy NER or a small HuggingFace model) for robust sensitive-entity detection.
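
The regex half of the detection step could look roughly like the sketch below. Only the Aadhaar pattern comes from this README (tightened here with lookarounds so it does not fire inside a longer digit run); the PAN, email, and card patterns, the Luhn filter, and the names `PATTERNS`, `luhn_valid`, and `find_sensitive` are illustrative assumptions, not the project's actual code.

```python
import re

# Illustrative patterns only; real-world formats have more variants.
PATTERNS = {
    # README pattern, plus lookarounds so part of a 16-digit card is not matched.
    "aadhaar": re.compile(r"(?<!\d\s)\b\d{4}\s\d{4}\s\d{4}\b(?!\s\d)"),
    # Assumed PAN layout: five letters, four digits, one letter.
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # 13 to 19 digits, optionally separated by spaces or hyphens.
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
}


def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to discard digit runs that are not card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0


def find_sensitive(text: str) -> dict[str, list[str]]:
    """Return the matches per category; an empty dict means nothing was found."""
    hits = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if label == "credit_card":
            matches = [m for m in matches if luhn_valid(m)]
        if matches:
            hits[label] = matches
    return hits
```

A non-empty result from `find_sensitive` would be the signal to encrypt the file; an NER model would complement this for entities that have no fixed format.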

Workflow:

User selects folder.
The system scans files for sensitive information.
If sensitive data is found, the file is encrypted (AES-256) before backup.
User is notified: “Sensitive data found — encryption applied.”
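
As a rough sketch of the workflow above, the function below walks a folder, checks each file, and encrypts flagged files before anything reaches the backup location. The names `backup_folder`, `is_sensitive`, and `encrypt`, and the `.enc` suffix, are hypothetical; the detector and the cipher are passed in as callables so that an AES-256 implementation of your choice can be plugged in.

```python
from pathlib import Path
from typing import Callable


def backup_folder(
    source: Path,
    dest: Path,
    is_sensitive: Callable[[bytes], bool],
    encrypt: Callable[[bytes], bytes],
) -> list[str]:
    """Copy files from source to dest, encrypting the ones flagged as sensitive.

    Returns the notifications to show to the user. dest is assumed to lie
    outside source.
    """
    notifications = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        target = dest / path.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        if is_sensitive(data):
            # Only the encrypted form is written to the backup location.
            target.with_name(target.name + ".enc").write_bytes(encrypt(data))
            notifications.append(f"{path.name}: sensitive data found, encryption applied")
        else:
            target.write_bytes(data)
    return notifications
```

Keeping the cipher behind a callable also makes the scan logic testable without real keys.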

2. Ransomware Detection in Backups

Goal: If files being backed up are already infected or encrypted by ransomware, raise an alert and prevent compromised data from polluting the backup set.

Techniques:

File entropy calculation (entropy close to 8 bits per byte can indicate encrypted content, although compressed formats also score high, so entropy alone is not conclusive).
Detection of mass-renaming patterns (e.g., many .docx → .locked).
A machine learning classifier trained on samples of normal files vs. ransomware-encrypted files.
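
The entropy check can be sketched with the standard library alone. The function computes Shannon entropy in bits per byte; the 7.5 cutoff and the names `shannon_entropy` and `looks_encrypted` are assumptions chosen for illustration and would need tuning against real data.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 7.5  # assumed cutoff in bits per byte; the maximum is 8.0


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encrypted(data: bytes, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Flag content whose byte distribution is close to uniform."""
    return shannon_entropy(data) >= threshold
```

Plain text typically lands well below the cutoff, while encrypted and compressed data sit near 8.0, which is why this signal is best combined with the other two techniques.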

Workflow:

The backup engine analyzes files.
If suspicious patterns are detected, the backup batch is flagged as “Quarantine Backup.”
User is notified: “Potential ransomware detected in backup batch XYZ.”
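
One possible way to turn per-file signals into the batch decision above is sketched below. The `.locked` extension and the "Quarantine Backup" label come from this README; the other extensions, the ratio thresholds, and the name `classify_batch` are assumptions.

```python
from pathlib import PurePath

# ".locked" is the README's example; the others are assumed for illustration.
SUSPICIOUS_EXTENSIONS = {".locked", ".encrypted", ".crypt"}


def classify_batch(
    files: list[tuple[str, float]],
    rename_ratio: float = 0.3,
    entropy_cutoff: float = 7.5,
    entropy_ratio: float = 0.5,
) -> str:
    """Classify a batch of (filename, entropy) pairs.

    The batch is quarantined when too many files carry a suspicious
    extension (mass renaming) or too many have near-random content.
    """
    if not files:
        return "OK"
    total = len(files)
    renamed = sum(
        PurePath(name).suffix.lower() in SUSPICIOUS_EXTENSIONS for name, _ in files
    )
    high_entropy = sum(entropy >= entropy_cutoff for _, entropy in files)
    if renamed / total >= rename_ratio or high_entropy / total >= entropy_ratio:
        return "Quarantine Backup"
    return "OK"
```

Working on ratios rather than single files keeps one legitimately compressed archive from quarantining an otherwise normal batch.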

🔹 AI Monitoring & Internal Logs

While the system computes context scores and risk scores, it writes a detailed trace for debugging and auditing. The trace records each intermediate value, so a final score can be traced back to its inputs; file paths are redacted to preserve confidentiality.

Sample log trace (paths hidden):

INFO:     Started server process [5812]
INFO:     Waiting for application startup.
INFO:     Application startup complete.

scan_directory : Started

test1.txt
context_score : Started
load_ml_model : Started
h_score : 0.16666666666666666  + ai_score : 0.6886560320854187  + ml_score : 0.5652289159334941
0.4950310876142157
compute_risk_score : Started
confidence : medium
context_score : 0.4950310876142157
-------------------------------
base : 20
conf_boost : 10
raw  44.85093262842647
return  45
-------------------------------

test2.txt
context_score : Started
load_ml_model : Started
h_score : 0.3333333333333333  + ai_score : 0.6598167419433594  + ml_score : 0.5660652443068904
0.5337462700694109
compute_risk_score : Started
confidence : high
context_score : 0.5337462700694109
-------------------------------
base : 20
conf_boost : 20
raw  61.34985080277644
return  61
-------------------------------

test4.txt
context_score : Started
load_ml_model : Started
h_score : 0.6666666666666666  + ai_score : 0.6758469343185425  + ml_score : 0.5911254159459445
0.6476763985112003
compute_risk_score : Started
confidence : high
context_score : 0.6476763985112003
-------------------------------
base : 40
conf_boost : 20
raw  98.860583910672
return  99
-------------------------------

text3.txt
context_score : Started
load_ml_model : Started
h_score : 0.16666666666666666  + ai_score : 0.6647664904594421  + ml_score : 0.5424601270550792
0.4786446343003006
compute_risk_score : Started
confidence : medium
context_score : 0.4786446343003006
-------------------------------
base : 20
conf_boost : 10
raw  44.359339029009014
return  44
-------------------------------

text5.txt
context_score : Started
load_ml_model : Started
h_score : 0.6666666666666666  + ai_score : 0.520253598690033  + ml_score : 0.5328629812888575
0.5679603338626704
compute_risk_score : Started
confidence : low
context_score : 0.5679603338626704
-------------------------------
base : 0
conf_boost : 0
raw  0.0
return  0
-------------------------------
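
The arithmetic in the trace above can be reproduced with two small functions. The 0.3 / 0.4 / 0.3 weights and the formula (base + conf_boost) * (1 + context_score) are inferred from the five sample entries, not taken from the project's code. How `base` and `conf_boost` are chosen is not visible in the trace, so they are plain inputs here.

```python
def context_score(h_score: float, ai_score: float, ml_score: float) -> float:
    """Weighted blend of the three sub-scores.

    The weights are inferred from the sample trace and are an assumption.
    """
    return 0.3 * h_score + 0.4 * ai_score + 0.3 * ml_score


def risk_score(base: int, conf_boost: int, context: float) -> int:
    """Scale the combined base and confidence boost by the context score.

    The formula matches the "raw" and "return" lines of the sample trace;
    it is an inference, not the project's documented method.
    """
    raw = (base + conf_boost) * (1 + context)
    return round(raw)


# First entry of the trace (test1.txt): confidence "medium", base 20, boost 10.
ctx = context_score(0.16666666666666666, 0.6886560320854187, 0.5652289159334941)
print(ctx, risk_score(20, 10, ctx))
```

With a base and boost of 0, as in the low-confidence entry, the result is 0 regardless of the context score.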

3. Honey Backup System 🐝

Goal: Lure attackers into interacting with decoy backups so you can detect malicious access and gather forensic information.

Techniques & Workflow:

Create a duplicate “decoy backup” set with dummy files and honeytokens.
The honey backup mirrors the real backup’s structure but contains markers (for example, a hidden DO_NOT_OPEN.txt file).
If the decoy is accessed, an immediate alert is triggered and attacker access is logged.
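
A minimal sketch of the decoy idea, assuming the decoy lives in a separate directory outside the real backup. The DO_NOT_OPEN.txt marker comes from this README; the function names `build_decoy` and `is_decoy_access`, the token format, and the logger name are hypothetical.

```python
import logging
import secrets
from pathlib import Path

logger = logging.getLogger("honey_backup")
MARKER_NAME = "DO_NOT_OPEN.txt"


def build_decoy(real_root: Path, decoy_root: Path) -> dict[str, str]:
    """Mirror the real backup's layout with dummy files.

    Each dummy file embeds a unique honeytoken. Returns token -> decoy path
    so a leaked token can be traced back to the file it came from.
    """
    decoy_root.mkdir(parents=True, exist_ok=True)
    tokens = {}
    for path in sorted(real_root.rglob("*")):
        target = decoy_root / path.relative_to(real_root)
        if path.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            token = secrets.token_hex(8)
            # Real content is never copied; only the file name is mirrored.
            target.write_text(f"dummy-data honeytoken={token}\n")
            tokens[token] = str(target)
    (decoy_root / MARKER_NAME).write_text("decoy marker\n")
    return tokens


def is_decoy_access(requested: Path, decoy_root: Path) -> bool:
    """Return True and log an alert when a request touches the decoy tree."""
    try:
        requested.resolve().relative_to(decoy_root.resolve())
    except ValueError:
        return False
    logger.warning("ALERT: decoy backup accessed: %s", requested)
    return True
```

A restore endpoint could call `is_decoy_access` on every requested path, so that any hit on the decoy tree produces a log entry and can trigger a notification.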

🔹 Technology Stack

Frontend: React + Tailwind (UI for backup configuration, scan results, and alerts)
Backend: FastAPI or Flask (backup service and AI modules)
Database: MongoDB or SQLite (stores backup metadata, alerts, honeytrap logs)
ML / NLP: Regex + spaCy / HuggingFace for DLP; scikit-learn or PyTorch for ransomware detection
Security: AES-256 encryption for sensitive files
Monitoring & Alerts: Application logging + notifications (email/SMS via SendGrid/Twilio or similar)
Integration: Optional SIEM integration (Splunk, Elastic)

🔹 Demo Flow

Backup Start → System scans selected data → displays “Sensitive data detected — encrypting”. ✅
Backup Analysis → System flags files that appear to be ransomware-encrypted → displays alert. 🚨
Restore Attempt by Attacker → Decoy (honey) backup is accessed → alert triggered and attacker activity logged. 🔔

🔹 Bonus Features

Dashboard with an overall Backup Security Score (e.g., “Backup Security Score: 85/100”)
Time-series graphs showing ransomware detection trends and backup health
Easy integration options with SIEM tools (Splunk, Elastic) for enterprise deployments
