Commit c68efeb

Merge pull request #17 from Protegrity-Developer-Edition/develop
Developer edition 1.0.0 release
2 parents 0a2aa43 + de021a6 commit c68efeb

13 files changed, +881 -65 lines changed

CHANGELOG

Whitespace-only changes.

CHANGELOG.md

Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
# Changelog

All notable changes to the Protegrity Developer Edition project will be documented in this file.

## [Current Release]

### 🎉 Major New Features

#### Enhanced Data Protection Capabilities
- **Protection (Tokenization-like)**: New protect & unprotect functionality for specific data elements
- **Find and Protect**: Combined discovery and protection workflow via `sample-app-find-and-protect.py`
- **Direct Protection CLI**: New `sample-app-protection.py` for command-line protect/unprotect operations
- **PII Discovery**: Enhanced entity enumeration with confidence scores via `sample-app-find.py`

#### Semantic Guardrail Integration
- **GenAI Security**: Message & conversation level risk scoring for AI applications
- **Multi-turn Conversation Support**: PII scanning across conversation history
- **Dual Interface Support**: Both cURL and Python examples provided in `semantic-guardrail/` folder
- **Risk Assessment**: Comprehensive risk scoring for GenAI flows

### 🏗️ Architecture & Structure Changes

#### Repository Structure Enhancements
- **New Semantic Guardrail Module**: Added `semantic-guardrail/` directory
  - `sample-guardrail-command.sh` - cURL-based examples
  - `sample-guardrail-python.py` - Python integration examples
- **Enhanced Sample Applications**: Expanded `samples/` directory structure
  - **NEW**: `sample-app-find.py` - Dedicated PII discovery (list entities only)
  - **ENHANCED**: `sample-app-find-and-redact.py` - Improved redaction/masking
  - **NEW**: `sample-app-find-and-protect.py` - Combined find and protect workflow
  - **NEW**: `sample-app-protection.py` - Direct protection CLI interface
- **Enhanced Sample Data**: Expanded `sample-data/` structure
  - **RENAMED**: `sample-find-redact.txt` → `input.txt`
  - **NEW**: `output-redact.txt` - Produced by redact workflow
  - **NEW**: `output-protect.txt` - Produced by protect workflow
  - Dynamic file generation based on operation type

#### Docker Compose Orchestration
- **Multi-Service Architecture**: Enhanced `docker-compose.yml` with semantic guardrail services
- **ML Provider Backends**: Added `presidio-provider-service` & `roberta-provider-service`
- **Service Dependencies**: Proper orchestration between classification and semantic guardrail services
- **Port Management**: Classification (8580) + Semantic Guardrail (8581) services
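
The port assignments listed here can be sanity-checked with a short script. The following is a minimal sketch, not part of the repository: it only tests whether something accepts TCP connections on the two ports named in this changelog, assuming the services are published on `localhost`. The names `SERVICE_PORTS`, `port_is_open`, and `check_services` are hypothetical, and an open port does not prove that the service behind it is healthy.

```python
import socket

# Ports taken from the changelog entry; the dictionary keys are illustrative labels.
SERVICE_PORTS = {"classification": 8580, "semantic-guardrail": 8581}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host="localhost", ports=SERVICE_PORTS):
    """Map each service label to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling `check_services()` after `docker compose up -d` would return a dictionary such as `{"classification": ..., "semantic-guardrail": ...}` with one boolean per service.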

### 🔧 Enhanced Configuration & Service Features

#### New Service Endpoints & Health Checks
- **Classification API**: `http://localhost:8580/pty/data-discovery/v1.0/classify`
- **Semantic Guardrail API**: `http://localhost:8581/pty/semantic-guardrail/v1.0/conversations/messages/scan`
- **Health Monitoring**: Built-in service health verification procedures
- **Service Restart**: Comprehensive docker compose management commands
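
As an illustration of how the Classification API endpoint might be called from Python, here is a hedged sketch using only the standard library. Only the URL comes from this changelog. The `text/plain` request body and the JSON response are assumptions, and `build_classify_request` and `classify` are hypothetical helper names, so check the README or the sample applications for the actual request format.

```python
import json
import urllib.request

# URL as listed in the changelog.
CLASSIFY_URL = "http://localhost:8580/pty/data-discovery/v1.0/classify"


def build_classify_request(text, url=CLASSIFY_URL):
    """Build (but do not send) a POST request carrying the text to classify."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        # Assumption: the service accepts a plain-text body.
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def classify(text, timeout=10.0):
    """Send the request and decode the response, assuming it is JSON."""
    request = build_classify_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the sketch easy to inspect without a running container: `build_classify_request("Jane lives in Boston")` can be examined on its own, while `classify(...)` needs the classification service to be up.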

### 🔐 Authentication & Registration Requirements

#### Protection API Authentication (NEW)
- **Registration Requirement**: Protection features now require user registration
- **Credential Management**: Support for email, password, and API key authentication
- **Environment Variables**: Secure credential handling for protection operations
  ```bash
  export DEV_EDITION_EMAIL="<your_registered_email>"
  export DEV_EDITION_PASSWORD="<your_portal_password>"
  export DEV_EDITION_API_KEY="<your_api_key>"
  ```
- **Credential Verification**: Built-in verification commands for environment setup
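
A script that depends on these credentials could fail early when one of them is missing. The sketch below is an assumption-labeled example, not the repository's built-in verification command: it checks only that the three environment variables named in this changelog are set and non-empty, and `REQUIRED_VARS` and `missing_credentials` are hypothetical names.

```python
import os

# Variable names as listed in the changelog.
REQUIRED_VARS = (
    "DEV_EDITION_EMAIL",
    "DEV_EDITION_PASSWORD",
    "DEV_EDITION_API_KEY",
)


def missing_credentials(env=None):
    """Return the required variable names that are unset or blank.

    `env` defaults to the process environment; passing a dict makes the
    function easy to exercise without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

For example, `missing_credentials()` returns an empty list when all three variables are exported, and the names of any that still need to be set otherwise. It never prints or logs the values themselves.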

### 📋 Sample Applications Evolution

#### From Single to Multiple Application Suite
**Previous:**
- Single sample: `sample-app-find-and-redact.py`
- Basic redaction workflow only
- Single output file

**Current (README.md):**
1. **Discovery Only**: `sample-app-find.py` - Entity enumeration with JSON output
2. **Find and Redact**: Enhanced redaction/masking with configurable output
3. **Find and Protect**: Tokenization-like protection workflow (requires registration)
4. **Direct Protection**: CLI-based protect/unprotect operations (requires registration)
5. **Semantic Guardrail (cURL)**: Risk assessment via command line
6. **Semantic Guardrail (Python)**: Multi-turn conversation security

#### Enhanced Workflow Documentation
- **Step-by-step Guides**: Detailed instructions for each sample application
- **Prerequisites Separation**: Clear distinction between basic and registration-required features
- **Output Documentation**: Detailed explanation of generated files and their purposes

### 🎯 GenAI & AI Integration (NEW)

#### Advanced AI Security Features
- **Conversation Risk Scoring**: Real-time risk assessment for AI conversations
- **Multi-turn PII Scanning**: Persistent PII detection across conversation history
- **GenAI Application Support**: Dedicated features for securing AI applications
- **Semantic Analysis**: Advanced semantic understanding for context-aware protection

### 📚 Documentation & Developer Experience

#### Enhanced Overview & Features
**Previous**: Basic data discovery and redaction
**Current**: Comprehensive platform with:
- Unstructured text classification, PII discovery, redaction, masking, and tokenization-like protection
- Semantic guardrails for GenAI applications
- Message/conversation risk scoring + PII scanning

#### Improved Developer Guidance
- **Detailed Setup Instructions**: Step-by-step guides with verification steps
- **Configuration Examples**: Comprehensive configuration documentation with examples
- **Troubleshooting**: Enhanced error handling and debugging guidance
- **Community Support**: Expanded issue reporting with sample script name & log snippet requirements

### ⚙️ Infrastructure & Operations

#### Docker Compose Evolution
**Previous**: Simple data discovery service startup
**Current**: Multi-service orchestration
- **Service Description**: Detailed explanation of each container service
- **Dependency Management**: Proper service startup order and dependencies
- **Resource Management**: Optimized container download and deployment
- **Port Configuration**: Flexible port management with environment variable support

### 🐛 Configuration Breaking Changes

#### Configuration Schema Updates
**Previous config.json structure:**
```json
{
  "api_endpoint": "http://localhost:8580/pty/data-discovery/v1.0/classify",
  "named_entity_map": { "CREDIT_CARD": "CCN", "DATE_TIME": "DATE" },
  "redaction_method": "redact",
  "masking_character": "#",
  "classification_threshold": 0.6,
  "enable_logging": true
}
```

**Current config.json structure:**
```json
{
  "masking_char": "#",
  "named_entity_map": {
    "USERNAME": "USERNAME", "STATE": "STATE",
    "PHONE_NUMBER": "PHONE", "SOCIAL_SECURITY_NUMBER": "SSN",
    "AGE": "AGE", "CITY": "CITY", "PERSON": "PERSON"
  },
  "method": "redact"
}
```

**Key Changes:**
- Simplified structure with smart defaults
- Expanded entity mapping
- Streamlined configuration keys
- Internal endpoint management
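
An existing `config.json` could be converted to the new key names with a small helper. This is a sketch under stated assumptions, not a migration tool shipped with the project. The key mapping (`masking_character` to `masking_char`, `redaction_method` to `method`) is inferred by comparing the two structures shown in this section. Dropping `api_endpoint`, `classification_threshold`, and `enable_logging` is also an assumption, since the changelog shows them absent from the new structure but does not say whether they are still accepted. `migrate_config` is a hypothetical name.

```python
import json

# Old key -> new key, inferred from the two config.json snippets in this changelog.
RENAMED_KEYS = {
    "masking_character": "masking_char",
    "redaction_method": "method",
}

# Keys present in the old structure but absent from the new one (assumed obsolete).
DROPPED_KEYS = {"api_endpoint", "classification_threshold", "enable_logging"}


def migrate_config(old):
    """Return a new dict using the current key names; the input is not modified."""
    new = {}
    for key, value in old.items():
        if key in DROPPED_KEYS:
            continue
        new[RENAMED_KEYS.get(key, key)] = value
    return new


def migrate_file(src_path, dst_path):
    """Read an old-style config file and write the migrated version."""
    with open(src_path, encoding="utf-8") as src:
        migrated = migrate_config(json.load(src))
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(migrated, dst, indent=2)
```

Unknown keys pass through unchanged, so the helper does not silently discard settings it does not recognize. The `named_entity_map` contents are copied as-is and would still need to be reviewed against the expanded entity list.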

---

## [Previous Release] - README.md Baseline

### Features (Baseline)
- Basic unstructured text classification and PII redaction
- Single Docker service for data discovery
- Single sample application (`sample-app-find-and-redact.py`)
- Basic configuration via `config.json`
- Simple repository structure with `data-discovery/` and `samples/` folders

### Limitations (Baseline)
- No protection/unprotection capabilities
- No semantic guardrail functionality
- No authentication or registration system
- Single output format (redacted text only)
- Limited configuration options
- Basic docker compose setup

---

*Note: This release represents a major evolution from a simple data discovery and redaction tool to a comprehensive data protection and AI security platform with advanced semantic guardrail capabilities, authentication systems, and multiple workflow options.*
