Commit 482bb5b

feat: functional render pipeline ytt and helm
1 parent 7fee42c commit 482bb5b

12 files changed

+219
-21
lines changed

.github/workflows/build-images-manifests.yml

Lines changed: 4 additions & 4 deletions
@@ -1,8 +1,8 @@
 name: Presidio Docker Build
 
 on:
-  push:
-  # pull_request: # Future run on main pushes only.
+  push:
+  # pull_request: # Future run on main pushes only.
   # branches: [ main ]
   workflow_dispatch:
 
@@ -35,7 +35,7 @@ jobs:
         run: |
           tag=$(curl -s https://api.github.com/repos/microsoft/presidio/releases/latest | jq -r .tag_name)
           echo "tag=$tag" >> $GITHUB_OUTPUT
-
+
       # SDSC ADD-ON
       - name: Checkout Presidio (latest release)
         uses: actions/checkout@v5
@@ -87,7 +87,7 @@ jobs:
       - name: Create all multi-platform manifests
         run: |
           IMAGES=("presidio-anonymizer" "presidio-analyzer" "presidio-image-redactor")
-
+
           for image in "${IMAGES[@]}"; do
             echo "Creating manifest for $image"
             docker buildx imagetools create \

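The tag lookup in the workflow step above pipes the GitHub releases API response through `jq -r .tag_name` and appends `tag=<value>` to `$GITHUB_OUTPUT`. A minimal Python sketch of the same extraction follows, assuming the response body is a JSON object with a `tag_name` field. The helper name `github_output_line` and the sample payload are illustrative and not part of the commit.

```python
import json


def github_output_line(release_json: str) -> str:
    """Build the `tag=<value>` line that the workflow appends to $GITHUB_OUTPUT."""
    release = json.loads(release_json)
    tag = release.get("tag_name")
    if not tag:
        # Unlike `jq -r`, which would print "null", this sketch fails loudly.
        raise ValueError("release payload has no tag_name")
    return f"tag={tag}"


# Hypothetical payload, trimmed to the one field the workflow reads.
sample = '{"tag_name": "2.2.360", "name": "illustrative sample"}'
print(github_output_line(sample))
```

Failing on a missing `tag_name` is a design choice of the sketch; the shell step in the workflow would instead pass the literal string `null` along.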
docs/presidio-poc.md

Lines changed: 22 additions & 15 deletions
@@ -8,14 +8,17 @@
 - can be deployed as an API server using a compose stack
 
 ## API usage
+
 2-steps:
+
 - analyze: NER from raw text using models
 - anonymize: config (rule) based processing of pre-detected PII
 
 ### analyze
+
 - Minimal requirements: text + language. By default, all recognizers for that language are enabled.
 ```sh
-$ curl http://localhost:5002/analyze -s --header "Content-Type: application/json" --request POST --data '{"text": "John Smith drivers license is AC432223","language": "en"}' | jq
+$ curl http://localhost:5002/analyze -s --header "Content-Type: application/json" --request POST --data '{"text": "John Smith drivers license is AC432223","language": "en"}' | jq
 [
   {
     "analysis_explanation": null,
@@ -33,19 +36,22 @@
   }
 ]
 ```
-- analysis can be controlled by setting detection score, selecting entities, adding context words and adding a correlation id(?)
+- analysis can be controlled by setting detection score, selecting entities, adding context words and adding a correlation id(?)
 - ad-hoc pattern (regex) recognizers can be provided as json objects
 - a correlation-id (hash) can be given to append to logs for easier grouping of analyses in logs / traces.
 
 ### anonymize
+
 - By default, the anonymization replaces all detected identifies by their type (e.g. <PERSON>) in the input text.
 - An anonymizer dictionary can be provided to associate specific anonymization procedure to specific entity types.
 - Two inputs must be given to the endpoint:
   - the raw text
   - the response from the analyze step (detected entities and their positions)
 
 ### artificial sample
+
 Input:
+
 ```
 Prof. Gérard Waeber, Chef de service
 Tél: +41 21 314 68 85 / Fax: +41 21 314 08 95
@@ -77,8 +83,10 @@ jfldéijf
 Dr Médecin 00 Formateur
 Chef de clinique
 ```
+
 - ## initial tests
-Works with example artifical lettre de sortie.
+Works with example artifical lettre de sortie.
+
 ```python
 import json
 import requests
@@ -129,7 +137,9 @@ print(
 ## limitations
 
 ### potential improvements
+
 Model configuration
+
 ```yaml
 # config.yaml
 nlp_engine_name: spacy
@@ -157,30 +167,28 @@ ner_model_configuration:
 ```
 
 Recognizer configuration
+
 ```yaml
 # recognizers.yaml
 recognizers:
-  -
-    name: "Swiss Zip code Recognizer"
+  - name: "Swiss Zip code Recognizer"
     supported_languages:
       - language: fr
         context: [adresse, postal]
       - language: de
-        context: [ort,]
+        context: [ort]
       - language: it
         context: [...]
 
     patterns:
-      -
-        name: "zip code (weak)"
-        regex: "(\\b\\d{5}(?:\\-\\d{4})?\\b)"
-        score: 0.01
+      - name: "zip code (weak)"
+        regex: "(\\b\\d{5}(?:\\-\\d{4})?\\b)"
+        score: 0.01
     context:
-      - zip
-      - code
+      - zip
+      - code
     supported_entity: "ZIP"
-  -
-    name: "Titles recognizer"
+  - name: "Titles recognizer"
     supported_language: "en"
     supported_entity: "TITLE"
     deny_list:
@@ -190,5 +198,4 @@ recognizers:
       - Miss
       - Dr.
       - Prof.
-
 ```

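One point the recognizer configuration above leaves open is whether the "zip code (weak)" pattern fits Swiss addresses: the regex requires five digits (optionally followed by a hyphen and four more), whereas Swiss postal codes have four digits. The sketch below checks this with Python's `re` module, assuming the pattern is applied as an ordinary regex after YAML unescaping. `SWISS_POSTCODE` is a hypothetical alternative that is not part of the commit, and the sample strings are made up for illustration.

```python
import re

# Pattern copied from the "zip code (weak)" entry, after YAML unescaping.
ZIP_WEAK = re.compile(r"(\b\d{5}(?:\-\d{4})?\b)")

# Assumption, not in the commit: a four-digit alternative covering 1000-9999.
SWISS_POSTCODE = re.compile(r"\b[1-9]\d{3}\b")


def find_all(pattern: re.Pattern, text: str) -> list[str]:
    """Return every full match of `pattern` in `text`, in order."""
    return [m.group(0) for m in pattern.finditer(text)]


swiss_address = "Bahnhofstrasse 1, 8001 Zurich"  # illustrative sample
us_address = "Redmond, WA 98052-6399"            # illustrative sample

print(find_all(ZIP_WEAK, swiss_address))        # no five-digit run to match
print(find_all(ZIP_WEAK, us_address))
print(find_all(SWISS_POSTCODE, swiss_address))
```

A four-digit pattern is also weak on its own (it matches years and other short numbers), which is presumably why the configuration pairs a low score with context words.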
external/vendir.yaml

Lines changed: 10 additions & 0 deletions
@@ -8,3 +8,13 @@ directories:
           url: https://github.com/microsoft/presidio
           ref: refs/tags/2.2.360
         newRootPath: docs/samples/deployments/k8s/charts/presidio
+  # - path: helm/presidio
+  #   contents:
+  #     - path: .
+  #       helmChart:
+  #         name: presidio
+  #         version: 2.2.360
+  #       git:
+  #         url: https://github.com/microsoft/presidio
+  #         ref: refs/tags/2.2.360
+  #       subPath: docs/samples/deployments/k8s/charts/presidio

justfile

Lines changed: 8 additions & 1 deletion
@@ -33,11 +33,18 @@ render-ytt dir="src":
     fd '^ytt$' {{dir}} \
         -x sh -c 'ytt -f {}/values.yaml -f external/ytt/$(basename {//}) --output-files {}/out'
 
+# Render when the code was pulled in via ytt but is a helm template
+[private]
+render-ytt-extract-helm-template dir="src":
+    # render mixed ytt + helm templates with our values into src/<service>/mix/out
+    fd '^helm$' {{dir}} \
+        -x sh -c 'helm template $(basename {//}) external/ytt/$(basename {//}) -f {}/values.yaml --output-dir {}/out'
+
 # Render manifests
 render dir="src":
     just fetch && \
-    just render-helm {{dir}} && \
     just render-ytt {{dir}} && \
+    just render-ytt-extract-helm-template {{dir}} && \
     just format
 
 # Apply manifests in dir to the cluster.
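For each entry named `helm` that `fd` finds, the new `render-ytt-extract-helm-template` recipe substitutes `{}` with the matched path and `{//}` with its parent directory, so `$(basename {//})` yields the service name. The Python sketch below spells out the resulting `helm template` command line, assuming a layout of the form `src/<service>/helm`. The helper `helm_template_argv` and the path `src/presidio/helm` are illustrative and not part of the commit.

```python
from pathlib import PurePosixPath


def helm_template_argv(match_path: str) -> list[str]:
    """Expand the recipe's fd placeholders for one matched `helm` directory."""
    path = PurePosixPath(match_path)   # fd's `{}`
    service = path.parent.name         # `$(basename {//})`
    return [
        "helm", "template", service,   # release name = service name
        f"external/ytt/{service}",     # chart directory fetched by vendir
        "-f", str(path / "values.yaml"),
        "--output-dir", str(path / "out"),
    ]


# Hypothetical match, assuming the chart for "presidio" lives under src/.
print(" ".join(helm_template_argv("src/presidio/helm")))
```

Under that assumed layout the rendered files land in `src/<service>/helm/out`, whereas the comment inside the recipe mentions `src/<service>/mix/out`.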
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+---
+# Source: presidio/templates/analyzer-deployment.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: presidio-presidio-analyzer
+  labels:
+    app: presidio-presidio-analyzer
+    chart: "presidio-2"
+    release: "presidio"
+    heritage: "Helm"
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: presidio-presidio-analyzer
+  template:
+    metadata:
+      labels:
+        app: presidio-presidio-analyzer
+    spec:
+      containers:
+        - name: presidio
+          image: "ghcr.io/presidio-analyzer:latest"
+          imagePullPolicy: Always
+          ports:
+            - containerPort: 8080
+          resources:
+            requests:
+              memory: 1500Mi
+              cpu: 1500m
+            limits:
+              memory: 3000Mi
+              cpu: 2000m
+          env:
+            - name: PORT
+              value: "8080"
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+---
+# Source: presidio/templates/analyzer-service.yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: presidio-presidio-analyzer
+  labels:
+    app: presidio-presidio-analyzer
+    service: presidio-presidio-analyzer
+    chart: "presidio-2"
+    release: "presidio"
+    heritage: "Helm"
+spec:
+  type: ClusterIP
+  ports:
+    - port: 80
+      targetPort: 8080
+      protocol: TCP
+      name: http
+  selector:
+    app: presidio-presidio-analyzer
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+---
+# Source: presidio/templates/anonymizer-deployment.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: presidio-presidio-anonymizer
+  labels:
+    app: presidio-presidio-anonymizer
+    chart: "presidio-2"
+    release: "presidio"
+    heritage: "Helm"
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: presidio-presidio-anonymizer
+  template:
+    metadata:
+      labels:
+        app: presidio-presidio-anonymizer
+    spec:
+      containers:
+        - name: presidio
+          image: "ghcr.io/presidio-anonymizer:latest"
+          imagePullPolicy: Always
+          ports:
+            - containerPort: 8080
+          resources:
+            requests:
+              memory: 128Mi
+              cpu: 125m
+            limits:
+              memory: 512Mi
+              cpu: 500m
+          env:
+            - name: PORT
+              value: "8080"
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+---
+# Source: presidio/templates/anonymizer-image-deployment.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: presidio-presidio-image-redactor
+  labels:
+    app: presidio-presidio-image-redactor
+    chart: "presidio-2"
+    release: "presidio"
+    heritage: "Helm"
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: presidio-presidio-image-redactor
+  template:
+    metadata:
+      labels:
+        app: presidio-presidio-image-redactor
+    spec:
+      containers:
+        - name: presidio
+          image: "ghcr.io/presidio-image-redactor:latest"
+          imagePullPolicy: Always
+          ports:
+            - containerPort: 8080
+          resources:
+            requests:
+              memory: 1500Mi
+              cpu: 1500m
+            limits:
+              memory: 3000Mi
+              cpu: 2000m
+          env:
+            - name: PORT
+              value: "8080"
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+---
+# Source: presidio/templates/anonymizer-image-service.yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: presidio-presidio-image-redactor
+  labels:
+    app: presidio-presidio-image-redactor
+    service: presidio-presidio-image-redactor
+    chart: "presidio-2"
+    release: "presidio"
+    heritage: "Helm"
+spec:
+  type: ClusterIP
+  ports:
+    - port: 80
+      targetPort: 8080
+      protocol: TCP
+      name: http
+  selector:
+    app: presidio-presidio-image-redactor
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+---
+# Source: presidio/templates/anonymizer-service.yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: presidio-presidio-anonymizer
+  labels:
+    app: presidio-presidio-anonymizer
+    service: presidio-presidio-anonymizer
+    chart: "presidio-2"
+    release: "presidio"
+    heritage: "Helm"
+spec:
+  type: ClusterIP
+  ports:
+    - port: 80
+      targetPort: 8080
+      protocol: TCP
+      name: http
+  selector:
+    app: presidio-presidio-anonymizer
