Commit 3a47194

Update query docs to use the new "for query" field
This change updates the EQL query docs to use the new field `"q"` (for query). This field tells proxy to perform encryption for a specific query operation (instead of performing source encryption and encryption for all indexes).

This change also updates the JSON schema:
- add new "for query" field
- add missing STE vec index field
- remove ".1" suffix from m, u, and o fields
1 parent 8220c15 commit 3a47194

2 files changed

+134
-67
lines changed

README.md

Lines changed: 110 additions & 64 deletions
@@ -141,22 +141,52 @@ These statements must be run through the CipherStash Proxy in order to **decrypt
141141
**Example:**
142142

143143
```rb
144-
Users.findAll(&:encrypted_email)
144+
filter = EQL.for_match("users", "email_encrypted", "test")
145+
User.select(:email_encrypted).where("cs_match_v1(email_encrypted) @> cs_match_v1(?)", filter)
145146
```
146147

147148
Which will execute on the server as:
148149

149150
```sql
150-
SELECT encrypted_email FROM users;
151+
SELECT email_encrypted FROM users
152+
WHERE cs_match_v1(email_encrypted) @> cs_match_v1('{
153+
"v": 1,
154+
"k": "ct",
155+
"p": "test",
156+
"i": {
157+
"t": "users",
158+
"c": "email_encrypted"
159+
},
160+
"q": "match"
161+
}');
151162
```
152163

153-
And is the EQL equivalent of the following plaintext query.
164+
A similar plaintext query (without EQL) could look like:
154165

155166
```sql
156-
SELECT email FROM users;
167+
SELECT email FROM users WHERE email LIKE 'test%';
168+
```
169+
170+
Note that plaintext payloads for query operations should set the `"q"` (for query) property.
171+
This property tells CipherStash Proxy to only perform encryption necessary for a specific operation.
172+
Otherwise, Proxy will perform source encryption and encryption for all indexes configured for the given column.
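
To make the effect of `"q"` concrete, here is a minimal Ruby sketch of how a client-side helper might assemble the plaintext payload. `build_eql_payload` is a hypothetical name used only for illustration; it is not the `EQL.for_match` helper, whose implementation is not part of this change. The `"pt"` kind follows the payloads shown in the querying examples of this README.

```ruby
require "json"

# Hypothetical helper (an assumption, not part of EQL): builds a plaintext
# payload for the given table and column. When `for_query` is given, the
# "q" property is set so that only the encryption needed for that
# operation is requested; when omitted, "q" is left out entirely.
def build_eql_payload(table, column, plaintext, for_query: nil)
  payload = {
    "v" => 1,                                # configuration version
    "k" => "pt",                             # kind: plaintext
    "p" => plaintext,                        # value sent by the client
    "i" => { "t" => table, "c" => column }   # table and column identifiers
  }
  payload["q"] = for_query if for_query
  JSON.generate(payload)
end

# Usage: a payload intended only for a match query.
match_payload = build_eql_payload("users", "email_encrypted", "test", for_query: "match")
```

Leaving out `for_query` yields a payload without `"q"`, which per the note above results in source encryption plus encryption for every index configured on the column.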
173+
174+
For reference, the EQL payload is defined as a `jsonb` with a specific schema:
175+
176+
```json
177+
{
178+
"v": 1,
179+
"k": "ct",
180+
181+
"i": {
182+
"t": "users",
183+
"c": "email_encrypted"
184+
},
185+
"q": "match"
186+
}
157187
```
158188

159-
All the data returned from the database is fully decrypted and an audit trail is generated.
189+
All the data returned from the database is fully decrypted.
160190

161191
## Querying data with EQL
162192

@@ -170,15 +200,15 @@ Enables basic full-text search.
170200

171201
```rb
172202
# Create the EQL payload using helper functions
173-
payload = eqlPayload("users", "encrpyted_field", "plaintext value")
203+
payload = EQL.for_match("users", "encrypted_field", "plaintext value")
174204

175205
Users.where("cs_match_v1(field) @> cs_match_v1(?)", payload)
176206
```
177207

178208
Which will execute on the server as:
179209

180210
```sql
181-
SELECT * FROM users WHERE cs_match_v1(field) @> cs_match_v1('{"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrpyted_field"}}');
211+
SELECT * FROM users WHERE cs_match_v1(field) @> cs_match_v1('{"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrypted_field"},"q":"match"}');
182212
```
183213

184214
And is the EQL equivalent of the following plaintext query.
@@ -195,15 +225,15 @@ Retrieves the unique index for enforcing uniqueness.
195225

196226
```rb
197227
# Create the EQL payload using helper functions
198-
payload = eqlPayload("users", "encrpyted_field", "plaintext value")
228+
payload = EQL.for_unique("users", "encrypted_field", "plaintext value")
199229

200230
Users.where("cs_unique_v1(field) = cs_unique_v1(?)", payload)
201231
```
202232

203233
Which will execute on the server as:
204234

205235
```sql
206-
SELECT * FROM users WHERE cs_unique_v1(field) = cs_unique_v1('{"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrpyted_field"}}');
236+
SELECT * FROM users WHERE cs_unique_v1(field) = cs_unique_v1('{"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrypted_field"},"q":"unique"}');
207237
```
208238

209239
And is the EQL equivalent of the following plaintext query.
@@ -220,7 +250,7 @@ Retrieves the Order-Revealing Encryption index for range queries.
220250

221251
```rb
222252
# Create the EQL payload using helper functions
223-
eqlPayload("users", "encrypted_date", Time.now)
253+
date = EQL.for_ore("users", "encrypted_date", Time.now)
224254

225255
User.where("cs_ore_64_8_v1(encrypted_date) < cs_ore_64_8_v1(?)", date)
226256
```
@@ -246,7 +276,7 @@ User.order("cs_ore_64_8_v1(encrypted_field)").all().map(&:id)
246276
Which will execute on the server as:
247277

248278
```sql
249-
SELECT id FROM examples ORDER BY cs_ore_64_8_v1(feild) DESC;
279+
SELECT id FROM examples ORDER BY cs_ore_64_8_v1(encrypted_field) DESC;
250280
```
251281

252282
And is the EQL equivalent of the following plaintext query.
@@ -259,7 +289,7 @@ SELECT id FROM examples ORDER BY field DESC;
259289

260290
### `cs_ste_term_v1(val JSONB, epath TEXT)`
261291

262-
Retrieves the encrypted *term* associated with the encrypted JSON path, `epath`.
292+
Retrieves the encrypted _term_ associated with the encrypted JSON path, `epath`.
263293

264294
### `cs_ste_vec_v1(val JSONB)`
265295

@@ -269,7 +299,7 @@ Retrieves the Structured Encryption Vector for containment queries.
269299

270300
```rb
271301
# Serialize a JSONB value bound to the users table column
272-
term = User::ENCRYPTED_JSONB.serialize({field: "value"})
302+
term = EQL.for_ste_vec("users", "attrs", {field: "value"})
273303
User.where("cs_ste_vec_v1(attrs) @> cs_ste_vec_v1(?)", term)
274304
```
275305

@@ -295,8 +325,8 @@ This is useful for sorting or filtering on integers in encrypted JSON objects.
295325
296326
```rb
297327
# Serialize a JSONB value bound to the users table column
298-
path = EJSON_PATH.serialize("$.login_count")
299-
term = User::ENCRYPTED_INT.serialize(100)
328+
path = EQL.for_ejson_path("users", "attrs", "$.login_count")
329+
term = EQL.for_ore("users", "attrs", 100)
300330
User.where("cs_ste_term_v1(attrs, ?) > cs_ore_64_8_v1(?)", path, term)
301331
```
302332
@@ -309,18 +339,18 @@ SELECT * FROM users WHERE cs_ste_term_v1(attrs, 'DQ1rbhWJXmmqi/+niUG6qw') > 'QAJ
309339
And is the EQL equivalent of the following plaintext query.
310340
311341
```sql
312-
SELECT * FROM users WHERE attrs->'login_count' > 10;
342+
SELECT * FROM users WHERE attrs->'login_count' > 100;
313343
```
314344
315345
### `cs_ste_value_v1(val JSONB, epath TEXT)`
316346
317-
Retrieves the encrypted *value* associated with the encrypted JSON path, `epath`.
347+
Retrieves the encrypted _value_ associated with the encrypted JSON path, `epath`.
318348
319349
**Example:**
320350
321351
```rb
322352
# Serialize a JSONB value bound to the users table column
323-
path = EJSON_PATH.serialize("$.login_count")
353+
path = EQL.for_ejson_path("users", "attrs", "$.login_count")
324354
User.find_by_sql(["SELECT cs_ste_value_v1(attrs, ?) FROM users", path])
325355
```
326356
@@ -333,7 +363,7 @@ SELECT cs_ste_value_v1(attrs, 'DQ1rbhWJXmmqi/+niUG6qw') FROM users;
333363
And is the EQL equivalent of the following plaintext query.
334364
335365
```sql
336-
SELECT attrs->'login_count' FROM users;
366+
SELECT attrs->'login_count' FROM users;
337367
```
338368
339369
## Managing indexes with EQL
@@ -346,24 +376,25 @@ These functions expect a `jsonb` value that conforms to the storage schema.
346376
cs_add_index(table_name text, column_name text, index_name text, cast_as text, opts jsonb)
347377
```
348378
349-
| Parameter | Description | Notes
350-
| ------------- | -------------------------------------------------- | ------------------------------------
351-
| `table_name` | Name of target table | Required
352-
| `column_name` | Name of target column | Required
353-
| `index_name` | The index kind | Required.
354-
| `cast_as` | The PostgreSQL type decrypted data will be cast to | Optional. Defaults to `text`
355-
| `opts` | Index options | Optional for `match` indexes, required for `ste_vec` indexes (see below)
379+
| Parameter | Description | Notes |
380+
| ------------- | -------------------------------------------------- | ------------------------------------------------------------------------ |
381+
| `table_name` | Name of target table | Required |
382+
| `column_name` | Name of target column | Required |
383+
| `index_name` | The index kind | Required. |
384+
| `cast_as` | The PostgreSQL type decrypted data will be cast to | Optional. Defaults to `text` |
385+
| `opts` | Index options | Optional for `match` indexes, required for `ste_vec` indexes (see below) |
356386
357387
#### cast_as
358388
359389
Supported types:
360-
- `text`
361-
- `int`
362-
- `small_int`
363-
- `big_int`
364-
- `boolean`
365-
- `date`
366-
- `jsonb`
390+
391+
- `text`
392+
- `int`
393+
- `small_int`
394+
- `big_int`
395+
- `boolean`
396+
- `date`
397+
- `jsonb`
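
As a rough illustration of how the parameters fit together, the following Ruby sketch prepares a parameterized `cs_add_index` call. `add_index_call` is a hypothetical helper, and both the `$1`-style placeholders and the choice to send an empty `{}` when no options are given are assumptions; the checks only mirror the table and list above.

```ruby
require "json"

# Cast types listed under "cast_as" above.
SUPPORTED_CASTS = %w[text int small_int big_int boolean date jsonb].freeze

# Hypothetical helper (an assumption, not part of EQL): returns the SQL text
# and bind parameters for a cs_add_index call, after basic validation.
def add_index_call(table_name, column_name, index_name, cast_as: "text", opts: nil)
  unless SUPPORTED_CASTS.include?(cast_as)
    raise ArgumentError, "unsupported cast_as: #{cast_as}"
  end
  # The parameter table marks opts as required for ste_vec indexes.
  if index_name == "ste_vec" && opts.nil?
    raise ArgumentError, "ste_vec indexes require opts"
  end

  sql = "SELECT cs_add_index($1, $2, $3, $4, $5::jsonb)"
  params = [table_name, column_name, index_name, cast_as, JSON.generate(opts || {})]
  [sql, params]
end

# Usage: a match index with the default cast and no options.
sql, params = add_index_call("users", "email_encrypted", "match")
```

The returned pair can be handed to whichever database client is in use, which keeps quoting of the values out of application code.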
367398
368399
#### match opts
369400
@@ -428,13 +459,13 @@ An ste_vec index requires one piece of configuration: the `context` (a string) w
428459
This ensures that all of the encrypted values are unique to that context.
429460
It is generally recommended to use the table and column name as the context (e.g. `users/name`).
430461
431-
Within a dataset, encrypted columns indexed using an `ste_vec` that use different contexts cannot be compared.
432-
Containment queries that manage to mix index terms from multiple columns will never return a positive result.
462+
Within a dataset, encrypted columns indexed using an `ste_vec` that use different contexts cannot be compared.
463+
Containment queries that manage to mix index terms from multiple columns will never return a positive result.
433464
This is by design.
434465
435466
The index is generated from a JSONB document by first flattening the structure of the document such that a hash can be generated for each unique path prefix to a node.
436467
437-
The complete set of JSON types is supported by the indexer.
468+
The complete set of JSON types is supported by the indexer.
438469
Null values are ignored by the indexer.
439470
440471
- Object `{ ... }`
@@ -451,12 +482,9 @@ For a document like this:
451482
"email": "[email protected]",
452483
"name": {
453484
"first_name": "Alice",
454-
"last_name": "McCrypto",
485+
"last_name": "McCrypto"
455486
},
456-
"roles": [
457-
"admin",
458-
"owner",
459-
]
487+
"roles": ["admin", "owner"]
460488
}
461489
}
462490
```
@@ -466,17 +494,33 @@ Hashes would be produced from the following list of entries:
466494
```js
467495
[
468496
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")],
469-
[Obj, Key("account"), Obj, Key("name"), Obj, Key("first_name"), String("Alice")],
470-
[Obj, Key("account"), Obj, Key("name"), Obj, Key("last_name"), String("McCrypto")],
497+
[
498+
Obj,
499+
Key("account"),
500+
Obj,
501+
Key("name"),
502+
Obj,
503+
Key("first_name"),
504+
String("Alice"),
505+
],
506+
[
507+
Obj,
508+
Key("account"),
509+
Obj,
510+
Key("name"),
511+
Obj,
512+
Key("last_name"),
513+
String("McCrypto"),
514+
],
471515
[Obj, Key("account"), Obj, Key("roles"), Array, String("admin")],
472516
[Obj, Key("account"), Obj, Key("roles"), Array, String("owner")],
473-
]
517+
];
474518
```
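
The flattening step can be sketched in Ruby as a recursive walk that emits one entry per leaf value. This is only an illustration of the idea, not the indexer's implementation: `flatten_entries` and `leaf_tag` are hypothetical names, entries are modelled as plain arrays of symbols and `[tag, value]` pairs, and the email address is a placeholder.

```ruby
# Tags a leaf value with its JSON type.
def leaf_tag(value)
  case value
  when String      then :string
  when Numeric     then :number
  when true, false then :boolean
  else raise ArgumentError, "unsupported leaf: #{value.inspect}"
  end
end

# Hypothetical sketch: returns one entry (the full path) per leaf node.
# Objects contribute :obj plus a [:key, name] pair, arrays contribute
# :array, and null values are skipped, as described above.
def flatten_entries(node, path = [])
  case node
  when Hash
    node.flat_map { |key, value| flatten_entries(value, path + [:obj, [:key, key.to_s]]) }
  when Array
    node.flat_map { |value| flatten_entries(value, path + [:array]) }
  when nil
    []
  else
    [path + [[leaf_tag(node), node]]]
  end
end

document = {
  "account" => {
    "email" => "alice@example.com", # placeholder address
    "name"  => { "first_name" => "Alice", "last_name" => "McCrypto" },
    "roles" => ["admin", "owner"]
  }
}

entries = flatten_entries(document)
```

Each element of `entries` corresponds to one row of the list above, with `[:key, "account"]` standing in for `Key("account")` and so on.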
475519
476520
Using the first entry to illustrate how an entry is converted to hashes:
477521
478522
```js
479-
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")]
523+
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")];
480524
```
481525
482526
The hashes would be generated for all prefixes of the full path to the leaf node.
@@ -489,15 +533,15 @@ The hashes would be generated for all prefixes of the full path to the leaf node
489533
[Obj, Key("account"), Obj, Key("email")],
490534
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")],
491535
// (remaining leaf nodes omitted)
492-
]
536+
];
493537
```
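
Expanding an entry into its prefixes is a small operation. The Ruby sketch below uses the same array-based modelling of entries as an assumption; `prefixes` and `unique_prefixes` are hypothetical names, and the hashing applied to each prefix is deliberately left out because this document does not specify it.

```ruby
# Returns every non-empty prefix of a single entry, shortest first.
def prefixes(entry)
  (1..entry.length).map { |n| entry.first(n) }
end

# Collects the prefixes of several entries, keeping each distinct
# prefix once, since entries under the same parent share prefixes.
def unique_prefixes(entries)
  entries.flat_map { |entry| prefixes(entry) }.uniq
end

email_entry = [:obj, [:key, "account"], :obj, [:key, "email"], [:string, "alice@example.com"]]
roles_entry = [:obj, [:key, "account"], :obj, [:key, "roles"], :array, [:string, "admin"]]

all_prefixes = unique_prefixes([email_entry, roles_entry])
```

Deduplicating matters because sibling entries repeat their common ancestors, and only one term per unique path prefix is needed.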
494538
495539
Query terms are processed in the same manner as the input document.
496540
497541
A query prior to encrypting & indexing looks like a structurally similar subset of the encrypted document, for example:
498542
499543
```json
500-
{ "account": { "email": "[email protected]", "roles": "admin" }}
544+
{ "account": { "email": "[email protected]", "roles": "admin" } }
501545
```
502546
503547
The expression `cs_ste_vec_v1(encrypted_account) @> cs_ste_vec_v1($query)` would match all records where the `encrypted_account` column contains a JSONB object with an "account" key containing an object with an "email" key where the value is the string "[email protected]".
@@ -510,11 +554,12 @@ When reduced to a prefix list, it would look like this:
510554
[Obj, Key("account")],
511555
[Obj, Key("account"), Obj],
512556
[Obj, Key("account"), Obj, Key("email")],
513-
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")]
514-
[Obj, Key("account"), Obj, Key("roles")],
557+
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")],
558+
[Obj, Key("account"), Obj, Key("roles")],
515560
[Obj, Key("account"), Obj, Key("roles"), Array],
516-
[Obj, Key("account"), Obj, Key("roles"), Array, String("admin")]
517-
]
561+
[Obj, Key("account"), Obj, Key("roles"), Array, String("admin")],
562+
];
518563
```
519564
520565
Which is then turned into an ste_vec of hashes which can be directly queried against the index.
@@ -573,19 +618,20 @@ The format is defined as a [JSON Schema](src/cs_encrypted_v1.schema.json).
573618
It should never be necessary to directly interact with the stored `jsonb`.
574619
CipherStash Proxy handles the encoding, and EQL provides the functions.
575620
576-
| Field | Name | Description
577-
| -------- | ------------------ | ------------------------------------------------------------
578-
| s | Schema version | JSON Schema version of this json document.
579-
| v | Version | The configuration version that generated this stored value.
580-
| k | Kind | The kind of the data (plaintext/pt, ciphertext/ct, encrypting/et).
581-
| i.t | Table identifier | Name of the table containing encrypted column.
582-
| i.c | Column identifier | Name of the encrypted column.
583-
| p | Plaintext | Plaintext value sent by database client. Required if kind is plaintext/pt or encrypting/et.
584-
| c | Ciphertext | Ciphertext value. Encrypted by proxy. Required if kind is plaintext/pt or encrypting/et.
585-
| m.1 | Match index | Ciphertext index value. Encrypted by proxy.
586-
| o.1 | ORE index | Ciphertext index value. Encrypted by proxy.
587-
| u.1 | Unique index | Ciphertext index value. Encrypted by proxy.
588-
| sv.1 | STE vector index | Ciphertext index value. Encrypted by proxy.
621+
| Field | Name | Description |
622+
| ----- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
623+
| s | Schema version | JSON Schema version of this json document. |
624+
| v | Version | The configuration version that generated this stored value. |
625+
| k | Kind | The kind of the data (plaintext/pt, ciphertext/ct, encrypting/et). |
626+
| i.t | Table identifier | Name of the table containing encrypted column. |
627+
| i.c | Column identifier | Name of the encrypted column. |
628+
| p | Plaintext | Plaintext value sent by database client. Required if kind is plaintext/pt or encrypting/et. |
629+
| q | For query | Specifies that the plaintext should be encrypted for a specific query operation. If `null`, source encryption and encryption for all indexes will be performed. Valid values are `"match"`, `"ore"`, `"unique"`, `"ste_vec"`, `"ejson_path"`, and `"websearch_to_match"`. |
630+
| c | Ciphertext | Ciphertext value. Encrypted by proxy. Required if kind is ciphertext/ct. |
631+
| m | Match index | Ciphertext index value. Encrypted by proxy. |
632+
| o | ORE index | Ciphertext index value. Encrypted by proxy. |
633+
| u | Unique index | Ciphertext index value. Encrypted by proxy. |
634+
| sv | STE vector index | Ciphertext index value. Encrypted by proxy. |
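
A few of the rules in this table can be checked on the client before a payload is sent. The Ruby sketch below is an assumption-laden illustration, not a substitute for the JSON Schema: `payload_errors` is a hypothetical name, and it covers only the kind, the identifiers, the plaintext requirement, and the `"q"` values listed above.

```ruby
KINDS = %w[pt ct et].freeze
QUERY_OPS = %w[match ore unique ste_vec ejson_path websearch_to_match].freeze

# Hypothetical check: returns a list of human-readable problems found in
# a payload given as a Hash with string keys; empty when none are found.
def payload_errors(payload)
  errors = []
  errors << "k must be one of pt, ct, et" unless KINDS.include?(payload["k"])

  ident = payload["i"]
  unless ident.is_a?(Hash) && ident["t"].is_a?(String) && ident["c"].is_a?(String)
    errors << "i.t and i.c must name the table and column"
  end

  if %w[pt et].include?(payload["k"]) && !payload.key?("p")
    errors << "p is required when k is pt or et"
  end

  q = payload["q"]
  errors << "q is not a known query operation" unless q.nil? || QUERY_OPS.include?(q)
  errors
end

# Usage: a well-formed plaintext payload for a unique lookup.
ok = payload_errors(
  "v" => 1, "k" => "pt", "p" => "plaintext value",
  "i" => { "t" => "users", "c" => "encrypted_field" }, "q" => "unique"
)
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.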
589635
590636
## Helper packages
591637
