A modular tool for scanning Salesforce orgs to identify fields that may contain Protected Health Information (PHI) and require encryption under HIPAA compliance.
- SF CLI Integration: Uses existing Salesforce CLI authentication (no credentials to manage)
- Multiple Scan Modes: Full API scan, local metadata only, or hybrid
- PHI Risk Classification: Three-tier risk assessment (High/Medium/Low)
- Data Sampling: Samples actual data with automatic masking
- Encryption Gap Analysis: Identifies PHI fields not protected by Shield Platform Encryption
- Encryption Recommendations: Deterministic vs. probabilistic encryption guidance
- Multiple Output Formats: Excel, CSV, or JSON reports with professional styling
- Interactive Mode: Select objects from a menu before scanning
- Web UI Dashboard: Browser-based interface with real-time progress tracking
# Clone the repository
git clone https://github.com/YOUR_USERNAME/phi-scanner.git
cd phi-scanner
# Install core dependencies
pip install -r requirements.txt
# Install web UI dependencies (optional)
pip install -r requirements-web.txt
# Or install as an editable package
pip install -e .
# With web UI support
pip install -e ".[web]"
# With all optional dependencies
pip install -e ".[all]"
- Python 3.9+
- Salesforce CLI installed and authenticated
npm install -g @salesforce/cli
sf org login web --alias myorg
python main.py --list-orgs
python main.py --org myorg --interactive
python main.py --org myorg --objects Account,Contact --check-encryption
python main.py --web
# Open http://localhost:8000 in your browser
Open the generated PHI_Scan_Report_*.xlsx file in Excel.
# Scan default org
python main.py
# Scan specific org
python main.py --org myorg
# Scan specific objects only
python main.py --org myorg --objects Account,Contact,Lead,My_Custom__c
# Select objects from a menu
python main.py --org myorg --interactive
Available commands in interactive mode:
| Command | Action |
|---|---|
| 1-5,8,10 | Toggle specific objects |
| all / a | Select all objects |
| none / n | Clear all selections |
| custom / cu | Select only custom objects |
| std / s | Select only standard objects |
| done / d | Proceed with scan |
| quit / q | Cancel and exit |
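The toggle syntax in the table (ranges and single numbers separated by commas) can be expanded with a few lines of Python. This is a minimal sketch of one way to do it, not the code in scanner/interactive.py; the names `parse_selection` and `toggle` are hypothetical.

```python
def parse_selection(text):
    """Expand a selection such as '1-5,8,10' into a sorted list of integers."""
    numbers = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            if start > end:
                start, end = end, start  # accept reversed ranges like 5-1
            numbers.update(range(start, end + 1))
        else:
            numbers.add(int(part))
    return sorted(numbers)


def toggle(selected, text):
    """Flip the selection state of every index named in text (hypothetical helper)."""
    return selected ^ set(parse_selection(text))


print(parse_selection("1-5,8,10"))
print(sorted(toggle({1, 2, 20}, "1-3")))
```

Toggling with a symmetric difference means that entering the same range twice restores the previous selection, which matches the "toggle" wording in the table.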
# Check which PHI fields are not encrypted
python main.py --org myorg --check-encryption
# Full scan with encryption analysis
python main.py --org myorg --objects Account --check-encryption
# Launch on default port 8000
python main.py --web
# Launch on custom port
python main.py --web --port 3000
# Scan from local SFDX project (no API calls)
python main.py --mode metadata-only --source ./force-app
# Excel output (default)
python main.py --org myorg --format excel
# CSV output (single flat file)
python main.py --org myorg --format csv --output ./reports/phi-audit.csv
# JSON output
python main.py --org myorg --format json --output ./reports/phi-audit.json
| Option | Short | Description |
|---|---|---|
| --org | -o | SF CLI org alias |
| --mode | -m | Scan mode: full, metadata-only, hybrid |
| --source | -s | Path to force-app for local scanning |
| --objects | | Comma-separated list of objects to scan |
| --format | -f | Output format: excel, csv, json |
| --output | | Output file path |
| --interactive | -i | Interactive object selection mode |
| --check-encryption | | Check Shield Platform Encryption status |
| --web | | Launch web UI dashboard |
| --port | | Port for web UI (default: 8000) |
| --org-name | | Organization name for report header |
| --config | -c | Path to YAML config file |
| --patterns | | Path to custom PHI patterns JSON |
| --list-orgs | | List authenticated orgs and exit |
| --quiet | -q | Suppress progress output |
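When scans are scripted (for example, one report per org), it can help to assemble the argument list in Python instead of concatenating strings. The sketch below only uses flags from the table above; the helper name `build_scan_command` is hypothetical and not part of the scanner.

```python
import shlex
import sys


def build_scan_command(org, objects=None, fmt="excel", output=None,
                       check_encryption=False):
    """Build an argv list for main.py using only the documented options."""
    cmd = [sys.executable, "main.py", "--org", org, "--format", fmt]
    if objects:
        # --objects takes a single comma-separated value
        cmd += ["--objects", ",".join(objects)]
    if output:
        cmd += ["--output", output]
    if check_encryption:
        cmd.append("--check-encryption")
    return cmd


cmd = build_scan_command(
    "myorg",
    objects=["Account", "Contact"],
    fmt="json",
    output="./reports/phi-audit.json",
    check_encryption=True,
)
print(shlex.join(cmd))
# To run it from the repository root: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with object names or paths that contain spaces.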
High risk: fields matching patterns like:
- SSN, Social Security
- Birth Date, DOB
- Medical, Health, Clinical
- Diagnosis, Treatment, Medication
- Insurance, Policy, Claim
- Surgery, Procedure
Medium risk: fields matching patterns like:
- Phone, Mobile, Email
- Address, Street, City, Zip
- Name, First, Last
- Emergency Contact
- Account Number, Member ID
Low risk: fields matching patterns like:
- Description, Notes, Comments
- History, Record
- Payment, Amount, Balance
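Tiered matching of this kind can be sketched as a case-insensitive regex search that checks the highest tier first. The snippet below is an illustration only, not the logic in scanner/categorizer.py: the pattern lists are a small hypothetical subset written in the style of config/default_patterns.json, and `classify_field` is an assumed name.

```python
import re

# Hypothetical subset of patterns, keyed like the custom patterns file.
PATTERNS = {
    "tier1_high": ["SSN|Social.*Security", "Birth.*Date|DOB",
                   "Diagnosis|Treatment|Medication"],
    "tier2_medium": ["Phone|Mobile|Email", "Address|Street|City|Zip"],
    "tier3_low": ["Description|Notes|Comments", "Payment|Amount|Balance"],
}

TIER_ORDER = ("tier1_high", "tier2_medium", "tier3_low")


def classify_field(name, patterns=PATTERNS):
    """Return the highest-risk tier whose pattern matches name, else None."""
    for tier in TIER_ORDER:
        for pattern in patterns.get(tier, []):
            if re.search(pattern, name, re.IGNORECASE):
                return tier
    return None


for field in ("Patient_SSN__c", "MobilePhone", "CreatedById"):
    print(field, "->", classify_field(field))
```

Checking tiers in descending order means a field that matches several tiers is reported at its highest risk level. Substring matching is deliberately broad, so results from a sketch like this would still need manual review.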
| Recommendation | Use Case |
|---|---|
| Deterministic | Fields that need to be searchable/filterable (SSN, IDs, Phone) |
| Probabilistic | Sensitive text fields (Medical notes, Descriptions) |
| Review & Encrypt | Medium-risk fields requiring business decision |
| No Encryption | Non-PHI fields |
- Summary Sheet: Professional audit format with statistics, methodology, and risk definitions
- Encryption Gaps Sheet: PHI fields not protected by Shield encryption (when using --check-encryption)
- Per-Object Tabs: All fields with risk tier, assessment, encryption status, and sample data
- Single flat file with all fields across all objects
- Includes encryption status columns when using --check-encryption
- Structured data for programmatic processing
- Includes metadata, summary, encryption gap analysis, and per-object field details
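Conceptually, the encryption gap analysis included in these reports is a set difference: fields classified as PHI minus fields already encrypted. The sketch below illustrates that idea with made-up data. The function name, the field names, and the dictionary layout are all hypothetical and do not reflect the scanner's actual report schema.

```python
def find_encryption_gaps(phi_fields, encrypted_fields):
    """Return (field, tier) pairs for PHI fields that are not encrypted."""
    encrypted = set(encrypted_fields)
    return sorted(
        (name, tier) for name, tier in phi_fields.items()
        if name not in encrypted
    )


# Hypothetical inputs: classified fields and the fields already encrypted.
phi_fields = {
    "Contact.SSN__c": "tier1_high",
    "Contact.Phone": "tier2_medium",
    "Account.Description": "tier3_low",
}
encrypted_fields = ["Contact.SSN__c"]

for name, tier in find_encryption_gaps(phi_fields, encrypted_fields):
    print(f"{name}: {tier} (not encrypted)")
```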
phi-scanner/
├── main.py # CLI entry point
├── pyproject.toml # Package configuration
├── requirements.txt # Core dependencies
├── requirements-web.txt # Web UI dependencies
├── README.md # This file
├── LICENSE # MIT License
├── config/
│ ├── default_patterns.json # PHI detection patterns
│ └── sample_config.yaml # Example configuration
├── scanner/
│ ├── __init__.py # Package init (v1.1.1)
│ ├── config.py # Configuration management
│ ├── connection.py # SF CLI integration
│ ├── metadata.py # Metadata retrieval
│ ├── categorizer.py # PHI risk classification
│ ├── sampler.py # Data sampling with masking
│ ├── reporter.py # Report generation
│ ├── interactive.py # Interactive object selection
│ └── encryption.py # Encryption status checker
└── web/
    ├── __init__.py # Web package init
    ├── app.py # FastAPI application
    ├── models.py # Pydantic models + SQLite DB
    ├── templates/ # Jinja2 HTML templates
    └── static/ # CSS and JavaScript
Create a custom patterns file based on config/default_patterns.json:
{
"tier1_high": [
"SSN|Social.*Security",
"My_Custom_PHI_Pattern"
],
"tier2_medium": [...],
"tier3_low": [...]
}
Then run with:
python main.py --org myorg --patterns ./my-patterns.json
Copy config/sample_config.yaml and customize:
- Object selection
- Sampling settings
- Output preferences
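Before passing a custom patterns file to --patterns, it can be worth checking that every entry compiles as a regular expression. The following sketch assumes the three tier keys shown in the custom patterns example and assumes each entry is a regex string. Whether the scanner requires all three keys is not stated here, and `validate_patterns` is a hypothetical helper.

```python
import json
import re

EXPECTED_TIERS = ("tier1_high", "tier2_medium", "tier3_low")


def validate_patterns(data):
    """Return a list of human-readable problems found in a patterns dict."""
    problems = []
    for tier in EXPECTED_TIERS:
        entries = data.get(tier)
        if not isinstance(entries, list):
            problems.append(f"{tier}: expected a list of regex strings")
            continue
        for pattern in entries:
            if not isinstance(pattern, str):
                problems.append(f"{tier}: non-string entry {pattern!r}")
                continue
            try:
                re.compile(pattern)
            except re.error as exc:
                problems.append(f"{tier}: invalid regex {pattern!r}: {exc}")
    return problems


# Inline sample; for a real file, use json.load(open("my-patterns.json")).
sample = json.loads(
    '{"tier1_high": ["SSN|Social.*Security", "Unclosed(group"],'
    ' "tier2_medium": [], "tier3_low": []}'
)
for problem in validate_patterns(sample):
    print(problem)
```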
Install Salesforce CLI:
npm install -g @salesforce/cli
Authenticate to your org:
sf org login web --alias myorg
Re-authenticate:
sf org login web --alias myorg
Increase timeout in config or use --mode metadata-only for large orgs.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
MIT License - see LICENSE for details.
Cloud Beacon Consulting