Commit 59a6982

Merge pull request #18 from cipherstash/add-docs-on-query-only-encryption
Update docs and JSON schema with "query-only" encryption changes
2 parents 8220c15 + 6122f41 commit 59a6982

3 files changed

+220
-126
lines changed

README.md

Lines changed: 82 additions & 66 deletions
@@ -40,14 +40,14 @@ Once the custom types and functions are installed, you can start using EQL in yo
4040

4141
1. Create a table with a column of type `cs_encrypted_v1` which will store your encrypted data.
4242
1. Use EQL functions to add indexes for the columns you want to encrypt.
43-
- Indexes are used by Cipherstash Proxy to understand what cryptography schemes are required for your use case.
43+
- Indexes are used by Cipherstash Proxy to understand what cryptography schemes are required for your use case.
4444
1. Initialize Cipherstash Proxy for cryptographic operations.
45-
- The Proxy will dynamically encrypt data on the way in and decrypt data on the way out based on the indexes you have defined.
45+
- The Proxy will dynamically encrypt data on the way in and decrypt data on the way out based on the indexes you have defined.
4646
1. Insert data into the defined columns using a specific payload format.
47-
- The payload format is defined in the [data format](#data-format) section.
47+
- The payload format is defined in the [data format](#data-format) section.
4848
1. Query the data using the EQL functions defined in the [querying data with EQL](#querying-data-with-eql) section.
49-
- No modifications are required to simply `SELECT` data from your encrypted columns.
50-
- In order to perform `WHERE` and `ORDER BY` queries, you must wrap the queries in the EQL functions defined in the [querying data with EQL](#querying-data-with-eql) section.
49+
- No modifications are required to simply `SELECT` data from your encrypted columns.
50+
- In order to perform `WHERE` and `ORDER BY` queries, you must wrap the queries in the EQL functions defined in the [querying data with EQL](#querying-data-with-eql) section.
5151
1. Integrate with your application via the [helper packages](#helper-packages) to interact with the encrypted data.
5252

5353
You can find a full getting started guide in the [GETTINGSTARTED.md](GETTINGSTARTED.md) file.
@@ -150,13 +150,13 @@ Which will execute on the server as:
150150
SELECT encrypted_email FROM users;
151151
```
152152

153-
And is the EQL equivalent of the following plaintext query.
153+
And is the EQL equivalent of the following plaintext query:
154154

155155
```sql
156156
SELECT email FROM users;
157157
```
158158

159-
All the data returned from the database is fully decrypted and an audit trail is generated.
159+
All the data returned from the database is fully decrypted.
160160

161161
## Querying data with EQL
162162

@@ -170,15 +170,15 @@ Enables basic full-text search.
170170

171171
```rb
172172
# Create the EQL payload using helper functions
173-
payload = eqlPayload("users", "encrpyted_field", "plaintext value")
173+
payload = EQL.for_match("users", "encrypted_field", "plaintext value")
174174

175175
Users.where("cs_match_v1(field) @> cs_match_v1(?)", payload)
176176
```
177177

178178
Which will execute on the server as:
179179

180180
```sql
181-
SELECT * FROM users WHERE cs_match_v1(field) @> cs_match_v1('{"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrpyted_field"}}');
181+
SELECT * FROM users WHERE cs_match_v1(field) @> cs_match_v1('{"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrypted_field"},"q":"match"}');
182182
```
183183

184184
And is the EQL equivalent of the following plaintext query.
@@ -195,15 +195,15 @@ Retrieves the unique index for enforcing uniqueness.
195195

196196
```rb
197197
# Create the EQL payload using helper functions
198-
payload = eqlPayload("users", "encrpyted_field", "plaintext value")
198+
payload = EQL.for_unique("users", "encrypted_field", "plaintext value")
199199

200200
Users.where("cs_unique_v1(field) = cs_unique_v1(?)", payload)
201201
```
202202

203203
Which will execute on the server as:
204204

205205
```sql
206-
SELECT * FROM users WHERE cs_unique_v1(field) = cs_unique_v1('{"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrpyted_field"}}');
206+
SELECT * FROM users WHERE cs_unique_v1(field) = cs_unique_v1('{"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrypted_field"},"q":"unique"}');
207207
```
208208

209209
And is the EQL equivalent of the following plaintext query.
@@ -220,7 +220,7 @@ Retrieves the Order-Revealing Encryption index for range queries.
220220

221221
```rb
222222
# Create the EQL payload using helper functions
223-
eqlPayload("users", "encrypted_date", Time.now)
223+
date = EQL.for_ore("users", "encrypted_date", Time.now)
224224

225225
User.where("cs_ore_64_8_v1(encrypted_date) < cs_ore_64_8_v1(?)", date)
226226
```
@@ -246,7 +246,7 @@ User.order("cs_ore_64_8_v1(encrypted_field)").all().map(&:id)
246246
Which will execute on the server as:
247247

248248
```sql
249-
SELECT id FROM examples ORDER BY cs_ore_64_8_v1(feild) DESC;
249+
SELECT id FROM examples ORDER BY cs_ore_64_8_v1(encrypted_field) DESC;
250250
```
251251

252252
And is the EQL equivalent of the following plaintext query.
@@ -259,7 +259,7 @@ SELECT id FROM examples ORDER BY field DESC;
259259

260260
### `cs_ste_term_v1(val JSONB, epath TEXT)`
261261

262-
Retrieves the encrypted *term* associated with the encrypted JSON path, `epath`.
262+
Retrieves the encrypted _term_ associated with the encrypted JSON path, `epath`.
263263

264264
### `cs_ste_vec_v1(val JSONB)`
265265

@@ -269,7 +269,7 @@ Retrieves the Structured Encryption Vector for containment queries.
269269

270270
```rb
271271
# Serialize a JSONB value bound to the users table column
272-
term = User::ENCRYPTED_JSONB.serialize({field: "value"})
272+
term = EQL.for_ste_vec("users", "attrs", {field: "value"})
273273
User.where("cs_ste_vec_v1(attrs) @> cs_ste_vec_v1(?)", term)
274274
```
275275

@@ -295,8 +295,8 @@ This is useful for sorting or filtering on integers in encrypted JSON objects.
295295
296296
```rb
297297
# Serialize a JSONB value bound to the users table column
298-
path = EJSON_PATH.serialize("$.login_count")
299-
term = User::ENCRYPTED_INT.serialize(100)
298+
path = EQL.for_ejson_path("users", "attrs", "$.login_count")
299+
term = EQL.for_ore("users", "attrs", 100)
300300
User.where("cs_ste_term_v1(attrs, ?) > cs_ore_64_8_v1(?)", path, term)
301301
```
302302
@@ -309,18 +309,18 @@ SELECT * FROM users WHERE cs_ste_term_v1(attrs, 'DQ1rbhWJXmmqi/+niUG6qw') > 'QAJ
309309
And is the EQL equivalent of the following plaintext query.
310310
311311
```sql
312-
SELECT * FROM users WHERE attrs->'login_count' > 10;
312+
SELECT * FROM users WHERE attrs->'login_count' > 10;
313313
```
314314
315315
### `cs_ste_value_v1(val JSONB, epath TEXT)`
316316
317-
Retrieves the encrypted *value* associated with the encrypted JSON path, `epath`.
317+
Retrieves the encrypted _value_ associated with the encrypted JSON path, `epath`.
318318
319319
**Example:**
320320
321321
```rb
322322
# Serialize a JSONB value bound to the users table column
323-
path = EJSON_PATH.serialize("$.login_count")
323+
path = EQL.for_ejson_path("users", "attrs", "$.login_count")
324324
User.find_by_sql(["SELECT cs_ste_value_v1(attrs, ?) FROM users", path])
325325
```
326326
@@ -333,7 +333,7 @@ SELECT cs_ste_value_v1(attrs, 'DQ1rbhWJXmmqi/+niUG6qw') FROM users;
333333
And is the EQL equivalent of the following plaintext query.
334334
335335
```sql
336-
SELECT attrs->'login_count' FROM users;
336+
SELECT attrs->'login_count' FROM users;
337337
```
338338
339339
## Managing indexes with EQL
@@ -346,24 +346,25 @@ These functions expect a `jsonb` value that conforms to the storage schema.
346346
cs_add_index(table_name text, column_name text, index_name text, cast_as text, opts jsonb)
347347
```
348348
349-
| Parameter | Description | Notes
350-
| ------------- | -------------------------------------------------- | ------------------------------------
351-
| `table_name` | Name of target table | Required
352-
| `column_name` | Name of target column | Required
353-
| `index_name` | The index kind | Required.
354-
| `cast_as` | The PostgreSQL type decrypted data will be cast to | Optional. Defaults to `text`
355-
| `opts` | Index options | Optional for `match` indexes, required for `ste_vec` indexes (see below)
349+
| Parameter | Description | Notes |
350+
| ------------- | -------------------------------------------------- | ------------------------------------------------------------------------ |
351+
| `table_name` | Name of target table | Required |
352+
| `column_name` | Name of target column | Required |
353+
| `index_name` | The index kind | Required. |
354+
| `cast_as` | The PostgreSQL type decrypted data will be cast to | Optional. Defaults to `text` |
355+
| `opts` | Index options | Optional for `match` indexes, required for `ste_vec` indexes (see below) |
356356
357357
#### cast_as
358358
359359
Supported types:
360-
- `text`
361-
- `int`
362-
- `small_int`
363-
- `big_int`
364-
- `boolean`
365-
- `date`
366-
- `jsonb`
360+
361+
- `text`
362+
- `int`
363+
- `small_int`
364+
- `big_int`
365+
- `boolean`
366+
- `date`
367+
- `jsonb`
367368
368369
#### match opts
369370
@@ -428,13 +429,13 @@ An ste_vec index requires one piece of configuration: the `context` (a string) w
428429
This ensures that all of the encrypted values are unique to that context.
429430
It is generally recommended to use the table and column name as the context (e.g. `users/name`).
430431
431-
Within a dataset, encrypted columns indexed using an `ste_vec` that use different contexts cannot be compared.
432-
Containment queries that manage to mix index terms from multiple columns will never return a positive result.
432+
Within a dataset, encrypted columns indexed using an `ste_vec` that use different contexts cannot be compared.
433+
Containment queries that manage to mix index terms from multiple columns will never return a positive result.
433434
This is by design.
434435
435436
The index is generated from a JSONB document by first flattening the structure of the document such that a hash can be generated for each unique path prefix to a node.
436437
437-
The complete set of JSON types is supported by the indexer.
438+
The complete set of JSON types is supported by the indexer.
438439
Null values are ignored by the indexer.
439440
440441
- Object `{ ... }`
@@ -451,12 +452,9 @@ For a document like this:
451452
"email": "[email protected]",
452453
"name": {
453454
"first_name": "Alice",
454-
"last_name": "McCrypto",
455+
"last_name": "McCrypto"
455456
},
456-
"roles": [
457-
"admin",
458-
"owner",
459-
]
457+
"roles": ["admin", "owner"]
460458
}
461459
}
462460
```
@@ -466,17 +464,33 @@ Hashes would be produced from the following list of entries:
466464
```js
467465
[
468466
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")],
469-
[Obj, Key("account"), Obj, Key("name"), Obj, Key("first_name"), String("Alice")],
470-
[Obj, Key("account"), Obj, Key("name"), Obj, Key("last_name"), String("McCrypto")],
467+
[
468+
Obj,
469+
Key("account"),
470+
Obj,
471+
Key("name"),
472+
Obj,
473+
Key("first_name"),
474+
String("Alice"),
475+
],
476+
[
477+
Obj,
478+
Key("account"),
479+
Obj,
480+
Key("name"),
481+
Obj,
482+
Key("last_name"),
483+
String("McCrypto"),
484+
],
471485
[Obj, Key("account"), Obj, Key("roles"), Array, String("admin")],
472486
[Obj, Key("account"), Obj, Key("roles"), Array, String("owner")],
473-
]
487+
];
474488
```
475489
476490
Using the first entry to illustrate how an entry is converted to hashes:
477491
478492
```js
479-
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")]
493+
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")];
480494
```
481495
482496
The hashes would be generated for all prefixes of the full path to the leaf node.
@@ -489,15 +503,15 @@ The hashes would be generated for all prefixes of the full path to the leaf node
489503
[Obj, Key("account"), Obj, Key("email")],
490504
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")],
491505
// (remaining leaf nodes omitted)
492-
]
506+
];
493507
```
494508
495509
Query terms are processed in the same manner as the input document.
496510
497511
A query prior to encrypting & indexing looks like a structurally similar subset of the encrypted document, for example:
498512
499513
```json
500-
{ "account": { "email": "[email protected]", "roles": "admin" }}
514+
{ "account": { "email": "[email protected]", "roles": "admin" } }
501515
```
502516
503517
The expression `cs_ste_vec_v1(encrypted_account) @> cs_ste_vec_v1($query)` would match all records where the `encrypted_account` column contains a JSONB object with an "account" key containing an object with an "email" key where the value is the string "[email protected]".
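The flattening and prefix steps described in this hunk can be sketched in plaintext as follows. This is only an illustration under stated assumptions: the helper names (`leaf_entries`, `leaf_label`, `prefixes`, `prefix_set`) and the `Number`/`Bool` labels are hypothetical, the email address is a placeholder, and the sketch compares unhashed, unencrypted prefixes, whereas the real index stores hashes bound to the configured `context`.

```rb
require "set"

# Hypothetical labels for leaf values; only String(...) appears in the README.
def leaf_label(value)
  case value
  when String then "String(#{value})"
  when Numeric then "Number(#{value})"
  when true, false then "Bool(#{value})"
  else raise ArgumentError, "unsupported JSON value: #{value.inspect}"
  end
end

# Flatten a parsed JSON value into one entry per leaf node.
def leaf_entries(node, path = [])
  case node
  when Hash
    node.flat_map { |key, value| leaf_entries(value, path + ["Obj", "Key(#{key})"]) }
  when Array
    node.flat_map { |value| leaf_entries(value, path + ["Array"]) }
  when nil
    [] # null values are ignored by the indexer
  else
    [path + [leaf_label(node)]]
  end
end

# Every prefix of the full path to a leaf, shortest first.
def prefixes(entry)
  (1..entry.length).map { |n| entry.first(n) }
end

# The set of all prefixes for a document (a plaintext stand-in for the ste_vec).
def prefix_set(document)
  leaf_entries(document).flat_map { |entry| prefixes(entry) }.to_set
end

document = {
  "account" => {
    "email" => "alice@example.com", # placeholder address
    "name" => { "first_name" => "Alice", "last_name" => "McCrypto" },
    "roles" => ["admin", "owner"]
  }
}

query = { "account" => { "email" => "alice@example.com", "roles" => ["admin"] } }
other = { "account" => { "email" => "bob@example.com" } }

# Containment holds when every query prefix also appears in the document.
puts prefix_set(query).subset?(prefix_set(document)) # prints "true"
puts prefix_set(other).subset?(prefix_set(document)) # prints "false"
```

Modelling containment as a subset check over prefixes is a simplification chosen for readability. It shows why a query is a "structurally similar subset" of the document, but says nothing about how the hashes themselves are derived.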
@@ -510,11 +524,12 @@ When reduced to a prefix list, it would look like this:
510524
[Obj, Key("account")],
511525
[Obj, Key("account"), Obj],
512526
[Obj, Key("account"), Obj, Key("email")],
513-
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")]
514-
[Obj, Key("account"), Obj, Key("roles")],
527+
[Obj, Key("account"), Obj, Key("email"), String("[email protected]")],
528+
[Obj, Key("account"), Obj, Key("roles")],
515530
[Obj, Key("account"), Obj, Key("roles"), Array],
516-
[Obj, Key("account"), Obj, Key("roles"), Array, String("admin")]
517-
]
531+
[Obj, Key("account"), Obj, Key("roles"), Array, String("admin")],
532+
];
518533
```
519534
520535
Which is then turned into an ste_vec of hashes which can be directly queried against the index.
@@ -573,19 +588,20 @@ The format is defined as a [JSON Schema](src/cs_encrypted_v1.schema.json).
573588
It should never be necessary to directly interact with the stored `jsonb`.
574589
Cipherstash proxy handles the encoding, and EQL provides the functions.
575590
576-
| Field | Name | Description
577-
| -------- | ------------------ | ------------------------------------------------------------
578-
| s | Schema version | JSON Schema version of this json document.
579-
| v | Version | The configuration version that generated this stored value.
580-
| k | Kind | The kind of the data (plaintext/pt, ciphertext/ct, encrypting/et).
581-
| i.t | Table identifier | Name of the table containing encrypted column.
582-
| i.c | Column identifier | Name of the encrypted column.
583-
| p | Plaintext | Plaintext value sent by database client. Required if kind is plaintext/pt or encrypting/et.
584-
| c | Ciphertext | Ciphertext value. Encrypted by proxy. Required if kind is plaintext/pt or encrypting/et.
585-
| m.1 | Match index | Ciphertext index value. Encrypted by proxy.
586-
| o.1 | ORE index | Ciphertext index value. Encrypted by proxy.
587-
| u.1 | Unique index | Ciphertext index value. Encrypted by proxy.
588-
| sv.1 | STE vector index | Ciphertext index value. Encrypted by proxy.
591+
| Field | Name | Description |
592+
| ----- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
593+
| s | Schema version | JSON Schema version of this json document. |
594+
| v | Version | The configuration version that generated this stored value. |
595+
| k | Kind | The kind of the data (plaintext/pt, ciphertext/ct, encrypting/et). |
596+
| i.t | Table identifier | Name of the table containing encrypted column. |
597+
| i.c | Column identifier | Name of the encrypted column. |
598+
| p | Plaintext | Plaintext value sent by database client. Required if kind is plaintext/pt or encrypting/et. |
599+
| q | For query | Specifies that the plaintext should be encrypted for a specific query operation. If `null`, source encryption and encryption for all indexes will be performed. Valid values are `"match"`, `"ore"`, `"unique"`, `"ste_vec"`, `"ejson_path"`, and `"websearch_to_match"`. |
600+
| c | Ciphertext | Ciphertext value. Encrypted by proxy. Required if kind is plaintext/pt or encrypting/et. |
601+
| m | Match index | Ciphertext index value. Encrypted by proxy. |
602+
| o | ORE index | Ciphertext index value. Encrypted by proxy. |
603+
| u | Unique index | Ciphertext index value. Encrypted by proxy. |
604+
| sv | STE vector index | Ciphertext index value. Encrypted by proxy. |
589605
590606
## Helper packages
591607

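As an illustration of the `q` ("For query") field this commit adds to the payload table, the sketch below builds a plaintext payload of the shape shown in the `cs_match_v1` example. The `eql_payload` helper and the `VALID_QUERY_KINDS` constant are hypothetical and are not the API of any published helper package. Omitting the `q` key when no query kind is given is also an assumption; the README only states what a `null` value means.

```rb
require "json"

# Query kinds listed in the README's description of the `q` field.
VALID_QUERY_KINDS = %w[match ore unique ste_vec ejson_path websearch_to_match].freeze

# Hypothetical helper: builds a plaintext ("pt") payload for a table column.
def eql_payload(table, column, plaintext, for_query: nil)
  unless for_query.nil? || VALID_QUERY_KINDS.include?(for_query)
    raise ArgumentError, "unsupported query kind: #{for_query.inspect}"
  end

  payload = {
    "v" => 1,
    "k" => "pt",
    "p" => plaintext,
    "i" => { "t" => table, "c" => column }
  }
  # Assumption: leave `q` out entirely when encrypting for storage.
  payload["q"] = for_query if for_query
  JSON.generate(payload)
end

puts eql_payload("users", "encrypted_field", "plaintext value", for_query: "match")
# prints {"v":1,"k":"pt","p":"plaintext value","i":{"t":"users","c":"encrypted_field"},"q":"match"}
```

The printed JSON matches the payload embedded in the `cs_match_v1` SQL example in the diff above.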