|
| 1 | +# Protegrity Developer Edition |
| 2 | + |
| 3 | +Welcome to the `protegrity-developer-edition` repository, part of the Protegrity Developer Edition suite. This repository provides a self-contained experimentation platform for discovering and protecting sensitive data using Protegrity’s Data Discovery and Protection APIs. |
| 4 | + |
| 5 | +## 🚀 Overview |
| 6 | + |
| 7 | +This repository enables developers to: |
| 8 | +- Rapidly set up a local environment using Docker Compose. |
| 9 | +- Experiment with unstructured text classification and PII redaction. |
| 10 | +- Integrate Protegrity APIs into GenAI and traditional applications. |
| 11 | +- Use sample applications and data to understand integration workflows. |
| 12 | + |
| 13 | +## 📦 Repository Structure |
| 14 | + |
| 15 | +```text |
| 16 | +. |
| 17 | +├── CHANGELOG |
| 18 | +├── CONTRIBUTIONS.md |
| 19 | +├── LICENSE |
| 20 | +├── README.md |
| 21 | +├── data-discovery |
| 22 | +│ ├── sample-classification-commands.sh |
| 23 | +│ └── sample-classification-python.py |
| 24 | +├── docker-compose.yml |
| 25 | +└── samples |
| 26 | + ├── config.json |
| 27 | + ├── requirements.txt |
| 28 | + ├── sample-app-find-and-redact.py |
| 29 | + ├── sample-app-find.py |
| 30 | + └── sample-data |
| 31 | + └── sample-find-redact.txt |
| 32 | +``` |
| 33 | + |
| 34 | +## 🧰 Features |
| 35 | + |
| 36 | +- **Data Discovery**: REST-based classification of unstructured text using Data Discovery. |
| 37 | +- **Data Protection**: Integration with a sample Python application for redaction or masking. |
| 38 | +- **Sample App**: Demonstrates how to find and redact PII. |
| 39 | +- **Cross-platform**: Works on Linux, Windows, and MacOS. |
| 40 | + |
| 41 | +## 🛠️ Getting Started |
| 42 | + |
| 43 | +### Prerequisites |
| 44 | +- [Python >= 3.9.23](https://www.python.org/downloads/) |
| 45 | +- [pip](https://pip.pypa.io/en/stable/installation/) |
| 46 | +- [Python Virtual Environment](https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/) |
| 47 | +- Container management software: |
| 48 | + - For Linux/Windows: [Docker](https://docs.docker.com/reference/cli/docker/) |
| 49 | + - For MacOS: [Docker Desktop](https://docs.docker.com/reference/cli/docker/) or Colima |
| 50 | +- [Docker Compose V2](https://docs.docker.com/compose/install/) |
| 51 | +- [Git](https://git-scm.com/downloads) |
| 52 | + |
| 53 | +Linux and Windows users can proceed to [Setup Instructions](#setup-instructions). |
| 54 | + |
| 55 | +**Additional settings for MacOS** |
| 56 | + |
| 57 | +MacOS requires additional steps for Docker and for systems with Apple Silicon chips. Complete the following steps before using Developer Edition. |
| 58 | + |
| 59 | +1. Complete one of the following options to apply the settings. |
| 60 | + - For Colima: |
| 61 | + 1. Open a command prompt. |
| 62 | + 2. Run the following command. |
| 63 | + ``` |
| 64 | + colima start --vm-type vz --vz-rosetta |
| 65 | + ``` |
| 66 | + - For Docker Desktop: |
| 67 | + 1. Open Docker Desktop. |
| 68 | + 2. Go to **Settings > General**. |
| 69 | + 3. Enable the following check boxes: |
| 70 | + - **Use Virtualization framework** |
| 71 | + - **Use Rosetta for x86_64/amd64 emulation on Apple Silicon** |
| 72 | + 4. Click **Apply & restart**. |
| 73 | +
|
| 74 | +2. Complete one of the following options to resolve certificate-related errors. |
| 75 | + - For Colima: |
| 76 | + 1. Open a command prompt. |
| 77 | + 2. Navigate to and open the following file. |
| 78 | + |
| 79 | + ``` |
| 80 | + ~/.colima/default/colima.yaml |
| 81 | + ``` |
| 82 | + 3. Update the following configuration in `colima.yaml` to add the path for obtaining the required images. |
| 83 | +
|
| 84 | + Before update: |
| 85 | + ``` |
| 86 | + docker: {} |
| 87 | + ``` |
| 88 | + |
| 89 | + After update: |
| 90 | + ``` |
| 91 | + docker: |
| 92 | + insecure-registries: |
| 93 | + - ghcr.io |
| 94 | + ``` |
| 95 | + 4. Save and close the file. |
| 96 | + 5. Stop colima. |
| 97 | + ``` |
| 98 | + colima stop |
| 99 | + ``` |
| 100 | + 6. Close and reopen the command prompt. |
| 101 | + 7. Start colima. |
| 102 | + ``` |
| 103 | + colima start --vm-type vz --vz-rosetta |
| 104 | + ``` |
| 105 | + - For Docker Desktop: |
| 106 | + 1. Open Docker Desktop. |
| 107 | + 2. Click the gear or settings icon. |
| 108 | + 3. Click **Docker Engine** from the sidebar. The editor with your current Docker daemon configuration `daemon.json` opens. |
| 109 | + 4. Add the `insecure-registries` key to the root JSON object. Ensure that you add a comma after the last value in the existing configuration. |
| 110 | +
|
| 111 | + After update: |
| 112 | + ``` |
| 113 | + { |
| 114 | + . |
| 115 | + . |
| 116 | + <existing configuration>, |
| 117 | + "insecure-registries": [ |
| 118 | + "ghcr.io", |
| 119 | + "githubusercontent.com" |
| 120 | + ] |
| 121 | + } |
| 122 | + ``` |
| 123 | +
|
| 124 | + 5. Click **Apply & Restart** to save the changes and restart Docker Desktop. |
| 125 | + 6. Verify: After Docker restarts, run `docker info` in your terminal and confirm that the required registry is listed under **Insecure Registries**. |
| 126 | +
|
| 127 | +3. Optional: If the *The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested* error is displayed, complete the following steps. |
| 128 | +
|
| 129 | + 1. Start a command prompt. |
| 130 | + 2. Navigate to and open the following file. |
| 131 | +
|
| 132 | + ``` |
| 133 | + ~/.docker/config.json |
| 134 | + ``` |
| 135 | + 3. Add the following parameter. |
| 136 | + ``` |
| 137 | + "default-platform": "linux/amd64" |
| 138 | + ``` |
| 139 | + 4. Save and close the file. |
| 140 | + 5. If the repository is already cloned, run `docker compose up -d` from the `protegrity-developer-edition` directory. Otherwise, proceed to [Setup Instructions](#setup-instructions). |
| 141 | +
|
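The JSON edits described above (for example, adding `default-platform` to `~/.docker/config.json`) can also be scripted so that commas and quoting are never typed by hand. The following is a minimal sketch, not part of the repository; the helper name `add_json_setting` is hypothetical, and it assumes the target file holds a single top-level JSON object.

```python
import json
from pathlib import Path


def add_json_setting(path, key, value):
    """Set one top-level key in a JSON file while keeping the existing keys.

    Hypothetical helper for illustration; it is not shipped with this repository.
    """
    path = Path(path)
    # Treat a missing or empty file as an empty JSON object.
    raw = path.read_text(encoding="utf-8") if path.exists() else ""
    settings = json.loads(raw) if raw.strip() else {}
    settings[key] = value
    # json.dumps emits the commas and quotes, so the file stays valid JSON.
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

For the optional step above, a call might look like `add_json_setting(Path.home() / ".docker" / "config.json", "default-platform", "linux/amd64")`. The parent directory must already exist, and it is worth backing up the file before running the helper.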
| 142 | +### Setup Instructions |
| 143 | +
|
| 144 | +Complete the following steps to clone the repository, start the Data Discovery services, and install the Python module. |
| 145 | +
|
| 146 | +1. Open a command prompt. |
| 147 | +2. Clone the git repository. |
| 148 | + ``` |
| 149 | + git clone https://github.com/Protegrity-Developer-Edition/protegrity-developer-edition.git |
| 150 | + ``` |
| 151 | +3. Navigate to the `protegrity-developer-edition` directory in the cloned location. |
| 152 | +4. Start the Data Discovery services in the background. The dependent containers are large, so depending on your network connection, they might take some time to download and deploy. |
| 153 | + ``` |
| 154 | + docker compose up -d |
| 155 | + ``` |
| 156 | + Depending on your configuration, you might need to use the `docker-compose up -d` command instead. |
| 157 | +5. Install the `protegrity-developer-python` module. It is recommended to create and activate a Python virtual environment before installing the module. |
| 158 | + ```bash |
| 159 | + pip install protegrity-developer-python |
| 160 | + ``` |
| 161 | + When the installation completes, a success message is displayed. |
| 162 | +
|
| 163 | +
|
| 164 | +### Run the Sample Application |
| 165 | +
|
| 166 | +Complete the following steps to run the sample application. The sample application reads the `sample-find-redact.txt` file, classifies and redacts the sensitive data, and saves the result as `output.txt` in the `samples/sample-data` folder. |
| 167 | +
|
| 168 | +1. Open a command prompt. |
| 169 | +2. Navigate to the `protegrity-developer-edition` directory in the cloned location. |
| 170 | +3. Run the sample application. |
| 171 | + ``` |
| 172 | + python samples/sample-app-find-and-redact.py |
| 173 | + ``` |
| 174 | +
|
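To experiment with classification outside the sample application, the endpoint listed in `samples/config.json` can be called directly. The sketch below uses only the Python standard library. It assumes the service accepts a plain-text POST body and returns JSON; neither detail is confirmed by this README, so check the `data-discovery` samples in the repository for the authoritative request format. The function names are hypothetical.

```python
import json
import urllib.request

# Endpoint as it appears in samples/config.json.
CLASSIFY_URL = "http://localhost:8580/pty/data-discovery/v1.0/classify"


def build_classify_request(text, url=CLASSIFY_URL):
    """Build (but do not send) a POST request that carries the text to classify.

    Assumption: the service accepts a UTF-8 plain-text request body.
    """
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def classify(text, url=CLASSIFY_URL, timeout=30):
    """Send the request and return the parsed JSON response.

    The response is returned as-is; its structure is not assumed here.
    """
    request = build_classify_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the containers running, `classify("Contact John Doe at john.doe@example.com.")` sends one request to the local service. Building the request separately from sending it makes the helper easy to inspect without a running container.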
| 175 | +## 📄 Configuration |
| 176 | +
|
| 177 | +Edit `samples/config.json` to customize the Python module: |
| 178 | +- API endpoint (Default: `localhost`) |
| 179 | +- Named entity mappings |
| 180 | +- Redaction method (`redact` or `mask`, Default: `redact`) |
| 181 | +- Masking character (Default: `#`) |
| 182 | +- Classification score threshold (Default: `0.6`) |
| 183 | +- Enable logging (Default: `true`) |
| 184 | +```json |
| 185 | +{ |
| 186 | + "api_endpoint": "http://localhost:8580/pty/data-discovery/v1.0/classify", |
| 187 | + "named_entity_map": { |
| 188 | + "CREDIT_CARD": "CCN", |
| 189 | + "DATE_TIME": "DATE" |
| 190 | + }, |
| 191 | + "redaction_method": "redact", |
| 192 | + "masking_character": "#", |
| 193 | + "classification_threshold": 0.6, |
| 194 | + "enable_logging": true |
| 195 | +} |
| 196 | +``` |
| 197 | + |
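To make the interaction between these settings concrete, here is a small illustrative sketch of how a threshold, an entity map, and a redaction method could be applied to classified text. It does not use or reproduce the `protegrity-developer-python` module. The `findings` structure, the bracketed `[LABEL]` replacement format, and the rule that scores equal to the threshold are kept are all assumptions made for this example.

```python
import json

# Same keys as the sample samples/config.json shown above.
CONFIG_TEXT = """
{
  "named_entity_map": {"CREDIT_CARD": "CCN", "DATE_TIME": "DATE"},
  "redaction_method": "redact",
  "masking_character": "#",
  "classification_threshold": 0.6
}
"""


def apply_config(text, findings, config):
    """Redact or mask every finding whose score meets the threshold.

    `findings` is a hypothetical list of dicts with the keys "entity",
    "start", "end", and "score"; it is not the real API response format.
    """
    threshold = config.get("classification_threshold", 0.6)
    method = config.get("redaction_method", "redact")
    mask_char = config.get("masking_character", "#")
    entity_map = config.get("named_entity_map", {})

    # Work right to left so earlier offsets stay valid after each replacement.
    for finding in sorted(findings, key=lambda f: f["start"], reverse=True):
        if finding["score"] < threshold:
            continue  # Low-confidence findings are left untouched.
        start, end = finding["start"], finding["end"]
        if method == "mask":
            replacement = mask_char * (end - start)
        else:
            # Assumed output format: the mapped label in square brackets.
            label = entity_map.get(finding["entity"], finding["entity"])
            replacement = f"[{label}]"
        text = text[:start] + replacement + text[end:]
    return text


config = json.loads(CONFIG_TEXT)
```

In this sketch, raising `classification_threshold` leaves more low-scoring findings in the output unchanged, and switching `redaction_method` to `mask` replaces each finding with `masking_character` repeated to the same length as the original value.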
| 198 | +## 📚 Documentation |
| 199 | + |
| 200 | +- The Protegrity Developer Edition documentation is available at [http://developer.docs.protegrity.com/](http://developer.docs.protegrity.com/). |
| 201 | +- For API reference and tutorials, visit the Developer Portal at [https://www.protegrity.com/developers](https://www.protegrity.com/developers). |
| 202 | + |
| 203 | +## 📢 Community & Support |
| 204 | + |
| 205 | +- Join the discussion on https://github.com/orgs/Protegrity-Developer-Edition/discussions. |
| 206 | +- Anonymous downloads supported; registration required for participation. |
| 207 | + |
| 208 | +## 📜 License |
| 209 | + |
| 210 | +See [LICENSE](https://github.com/Protegrity-Developer-Edition/protegrity-developer-edition/blob/main/LICENSE) for terms and conditions. |