|
| 1 | +# Changelog |
| 2 | + |
| 3 | +All notable changes to the Protegrity Developer Edition project will be documented in this file. |
| 4 | + |
| 5 | +## [Current Release] |
| 6 | + |
| 7 | +### 🎉 Major New Features |
| 8 | + |
| 9 | +#### Enhanced Data Protection Capabilities |
| 10 | +- **Protection (Tokenization-like)**: New protect & unprotect functionality for specific data elements |
| 11 | +- **Find and Protect**: Combined discovery and protection workflow via `sample-app-find-and-protect.py` |
| 12 | +- **Direct Protection CLI**: New `sample-app-protection.py` for command-line protect/unprotect operations |
| 13 | +- **PII Discovery**: Enhanced entity enumeration with confidence scores via `sample-app-find.py` |
| 14 | + |
| 15 | +#### Semantic Guardrail Integration |
| 16 | +- **GenAI Security**: Message & conversation level risk scoring for AI applications |
| 17 | +- **Multi-turn Conversation Support**: PII scanning across conversation history |
| 18 | +- **Dual Interface Support**: Both cURL and Python examples are provided in the `semantic-guardrail/` folder |
| 19 | +- **Risk Assessment**: Comprehensive risk scoring for GenAI flows |
| 20 | + |
| 21 | +### 🏗️ Architecture & Structure Changes |
| 22 | + |
| 23 | +#### Repository Structure Enhancements |
| 24 | +- **New Semantic Guardrail Module**: Added `semantic-guardrail/` directory |
| 25 | + - `sample-guardrail-command.sh` - cURL-based examples |
| 26 | + - `sample-guardrail-python.py` - Python integration examples |
| 27 | +- **Enhanced Sample Applications**: Expanded `samples/` directory structure |
| 28 | + - **NEW**: `sample-app-find.py` - Dedicated PII discovery (list entities only) |
| 29 | + - **ENHANCED**: `sample-app-find-and-redact.py` - Improved redaction/masking |
| 30 | + - **NEW**: `sample-app-find-and-protect.py` - Combined find and protect workflow |
| 31 | + - **NEW**: `sample-app-protection.py` - Direct protection CLI interface |
| 32 | +- **Enhanced Sample Data**: Expanded `sample-data/` structure |
| 33 | + - **RENAMED**: `sample-find-redact.txt` → `input.txt` |
| 34 | + - **NEW**: `output-redact.txt` - Produced by redact workflow |
| 35 | + - **NEW**: `output-protect.txt` - Produced by protect workflow |
| 36 | + - Dynamic file generation based on operation type |
| 37 | + |
| 38 | +#### Docker Compose Orchestration |
| 39 | +- **Multi-Service Architecture**: Enhanced `docker-compose.yml` with semantic guardrail services |
| 40 | +- **ML Provider Backends**: Added `presidio-provider-service` & `roberta-provider-service` |
| 41 | +- **Service Dependencies**: Proper orchestration between classification and semantic guardrail services |
| 42 | +- **Port Management**: Classification (8580) + Semantic Guardrail (8581) services |
| 43 | + |
| 44 | +### 🔧 Enhanced Configuration & Service Features |
| 45 | + |
| 46 | +#### New Service Endpoints & Health Checks |
| 47 | +- **Classification API**: `http://localhost:8580/pty/data-discovery/v1.0/classify` |
| 48 | +- **Semantic Guardrail API**: `http://localhost:8581/pty/semantic-guardrail/v1.0/conversations/messages/scan` |
| 49 | +- **Health Monitoring**: Built-in service health verification procedures |
| 50 | +- **Service Restart**: Comprehensive docker compose management commands |
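
As a rough illustration of a post-startup check, the sketch below tests whether the two ports listed above (8580 for classification, 8581 for semantic guardrail) accept TCP connections on localhost. It is not the project's built-in health verification procedure. The helper name `port_is_open` and the one-second timeout are assumptions made for this example, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket

# Ports taken from the endpoint URLs in this changelog.
SERVICE_PORTS = {"classification": 8580, "semantic-guardrail": 8581}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in SERVICE_PORTS.items():
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

If a port is reported as not reachable, restarting the stack with docker compose is the natural next step.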
| 51 | + |
| 52 | +### 🔐 Authentication & Registration Requirements |
| 53 | + |
| 54 | +#### Protection API Authentication (NEW) |
| 55 | +- **Registration Requirement**: Protection features now require user registration |
| 56 | +- **Credential Management**: Support for email, password, and API key authentication |
| 57 | +- **Environment Variables**: Secure credential handling for protection operations |
| 58 | +```bash |
| 59 | +export DEV_EDITION_EMAIL="<your_registered_email>" |
| 60 | +export DEV_EDITION_PASSWORD="<your_portal_password>" |
| 61 | +export DEV_EDITION_API_KEY="<your_api_key>" |
| 62 | +``` |
| 63 | +- **Credential Verification**: Built-in verification commands for environment setup |
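
A minimal sketch of how the three variables exported above could be checked before running a protection sample. It is not the built-in verification command mentioned in the list. The function name `missing_credentials` is hypothetical, and treating a whitespace-only value as unset is an assumption of this example.

```python
import os

# Variable names are taken from the export lines above.
REQUIRED_VARS = ("DEV_EDITION_EMAIL", "DEV_EDITION_PASSWORD", "DEV_EDITION_API_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing credentials: " + ", ".join(missing))
    else:
        print("All protection credentials are set.")
```

The check reports only variable names, never values, so its output is safe to paste into an issue report.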
| 64 | + |
| 65 | +### 📋 Sample Applications Evolution |
| 66 | + |
| 67 | +#### From Single to Multiple Application Suite |
| 68 | +**Previous:** |
| 69 | +- Single sample: `sample-app-find-and-redact.py` |
| 70 | +- Basic redaction workflow only |
| 71 | +- Single output file |
| 72 | + |
| 73 | +**Current (README.md):** |
| 74 | +1. **Discovery Only**: `sample-app-find.py` - Entity enumeration with JSON output |
| 75 | +2. **Find and Redact**: Enhanced redaction/masking with configurable output |
| 76 | +3. **Find and Protect**: Tokenization-like protection workflow (requires registration) |
| 77 | +4. **Direct Protection**: CLI-based protect/unprotect operations (requires registration) |
| 78 | +5. **Semantic Guardrail (cURL)**: Risk assessment via command line |
| 79 | +6. **Semantic Guardrail (Python)**: Multi-turn conversation security |
| 80 | + |
| 81 | +#### Enhanced Workflow Documentation |
| 82 | +- **Step-by-step Guides**: Detailed instructions for each sample application |
| 83 | +- **Prerequisites Separation**: Clear distinction between basic and registration-required features |
| 84 | +- **Output Documentation**: Detailed explanation of generated files and their purposes |
| 85 | + |
| 86 | +### 🎯 GenAI & AI Integration (NEW) |
| 87 | + |
| 88 | +#### Advanced AI Security Features |
| 89 | +- **Conversation Risk Scoring**: Real-time risk assessment for AI conversations |
| 90 | +- **Multi-turn PII Scanning**: Persistent PII detection across conversation history |
| 91 | +- **GenAI Application Support**: Dedicated features for securing AI applications |
| 92 | +- **Semantic Analysis**: Advanced semantic understanding for context-aware protection |
| 93 | + |
| 94 | +### 📚 Documentation & Developer Experience |
| 95 | + |
| 96 | +#### Enhanced Overview & Features |
| 97 | +**Previous**: Basic data discovery and redaction |
| 98 | +**Current**: Comprehensive platform with: |
| 99 | +- Unstructured text classification, PII discovery, redaction, masking, and tokenization-like protection |
| 100 | +- Semantic guardrails for GenAI applications |
| 101 | +- Message/conversation risk scoring + PII scanning |
| 102 | + |
| 103 | +#### Improved Developer Guidance |
| 104 | +- **Detailed Setup Instructions**: Step-by-step guides with verification steps |
| 105 | +- **Configuration Examples**: Comprehensive configuration documentation with examples |
| 106 | +- **Troubleshooting**: Enhanced error handling and debugging guidance |
| 107 | +- **Community Support**: Expanded issue reporting with sample script name & log snippet requirements |
| 108 | + |
| 109 | +### ⚙️ Infrastructure & Operations |
| 110 | + |
| 111 | +#### Docker Compose Evolution |
| 112 | +**Previous**: Simple data discovery service startup |
| 113 | +**Current**: Multi-service orchestration |
| 114 | +- **Service Description**: Detailed explanation of each container service |
| 115 | +- **Dependency Management**: Proper service startup order and dependencies |
| 116 | +- **Resource Management**: Optimized container download and deployment |
| 117 | +- **Port Configuration**: Flexible port management with environment variable support |
| 118 | + |
| 119 | +### 🐛 Configuration Breaking Changes |
| 120 | + |
| 121 | +#### Configuration Schema Updates |
| 122 | +**Previous config.json structure:** |
| 123 | +```json |
| 124 | +{ |
| 125 | + "api_endpoint": "http://localhost:8580/pty/data-discovery/v1.0/classify", |
| 126 | + "named_entity_map": { "CREDIT_CARD": "CCN", "DATE_TIME": "DATE" }, |
| 127 | + "redaction_method": "redact", |
| 128 | + "masking_character": "#", |
| 129 | + "classification_threshold": 0.6, |
| 130 | + "enable_logging": true |
| 131 | +} |
| 132 | +``` |
| 133 | + |
| 134 | +**Current config.json structure:** |
| 135 | +```json |
| 136 | +{ |
| 137 | + "masking_char": "#", |
| 138 | + "named_entity_map": { |
| 139 | + "USERNAME": "USERNAME", "STATE": "STATE", |
| 140 | + "PHONE_NUMBER": "PHONE", "SOCIAL_SECURITY_NUMBER": "SSN", |
| 141 | + "AGE": "AGE", "CITY": "CITY", "PERSON": "PERSON" |
| 142 | + }, |
| 143 | + "method": "redact" |
| 144 | +} |
| 145 | +``` |
| 146 | + |
| 147 | +**Key Changes:** |
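| 148 | +- Simplified structure with smart defaults: `classification_threshold` and `enable_logging` no longer appear in `config.json` |
| 149 | +- Expanded entity mapping: `named_entity_map` adds entity types such as `USERNAME`, `PHONE_NUMBER`, and `SOCIAL_SECURITY_NUMBER` |
| 150 | +- Streamlined configuration keys: `masking_character` → `masking_char`, `redaction_method` → `method` |
| 151 | +- Internal endpoint management: `api_endpoint` is no longer set in `config.json` |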
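
The difference between the two `config.json` examples above can be sketched as a small migration helper. The key mapping is read off those two examples only. The function name `migrate_config` is hypothetical, and dropping `api_endpoint`, `classification_threshold`, and `enable_logging` assumes the current release ignores or rejects them, which the changelog does not state explicitly.

```python
import json

# Old key -> new key, read off the two config.json examples above.
RENAMED_KEYS = {"masking_character": "masking_char", "redaction_method": "method"}
# Keys present in the previous example but absent from the current one (assumed obsolete).
DROPPED_KEYS = {"api_endpoint", "classification_threshold", "enable_logging"}


def migrate_config(old):
    """Return a new dict using the current key names; the input dict is not modified."""
    new = {}
    for key, value in old.items():
        if key in DROPPED_KEYS:
            continue
        new[RENAMED_KEYS.get(key, key)] = value
    return new


if __name__ == "__main__":
    previous = {
        "api_endpoint": "http://localhost:8580/pty/data-discovery/v1.0/classify",
        "named_entity_map": {"CREDIT_CARD": "CCN", "DATE_TIME": "DATE"},
        "redaction_method": "redact",
        "masking_character": "#",
        "classification_threshold": 0.6,
        "enable_logging": True,
    }
    print(json.dumps(migrate_config(previous), indent=2))
```

Keys that are neither renamed nor dropped, such as `named_entity_map`, pass through unchanged, so existing entity mappings are preserved.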
| 152 | + |
| 153 | +--- |
| 154 | + |
| 155 | +## [Previous Release] - README.md Baseline |
| 156 | + |
| 157 | +### Features (Baseline) |
| 158 | +- Basic unstructured text classification and PII redaction |
| 159 | +- Single Docker service for data discovery |
| 160 | +- Single sample application (`sample-app-find-and-redact.py`) |
| 161 | +- Basic configuration via `config.json` |
| 162 | +- Simple repository structure with `data-discovery/` and `samples/` folders |
| 163 | + |
| 164 | +### Limitations (Baseline) |
| 165 | +- No protection/unprotection capabilities |
| 166 | +- No semantic guardrail functionality |
| 167 | +- No authentication or registration system |
| 168 | +- Single output format (redacted text only) |
| 169 | +- Limited configuration options |
| 170 | +- Basic docker compose setup |
| 171 | + |
| 172 | +--- |
| 173 | + |
| 174 | +*Note: This release represents a major evolution from a simple data discovery and redaction tool to a comprehensive data protection and AI security platform with advanced semantic guardrail capabilities, authentication systems, and multiple workflow options.* |