Commit 3e866d5

Copilot and joocer committed
Fix protocol permissions to use table namespaces instead of separate roles
Co-authored-by: joocer <[email protected]>
1 parent 3753c87 commit 3e866d5

3 files changed: +208 -125 lines changed

testdata/PERMISSIONS_README.md

Lines changed: 51 additions & 33 deletions
````diff
@@ -14,74 +14,92 @@ The `permissions.json` file contains one JSON object per line, each defining a p
 - **permission**: The type of permission (currently only "READ" is supported)
 - **table**: A pattern (supporting wildcards) that matches table names

-## Protocol Prefix Permissions
+## Protocol Prefixes as Table Namespaces

-Starting with the wildcard support for cloud storage paths, you can now control access to different storage protocols using permission patterns:
+Protocol prefixes (`file://`, `gs://`, `s3://`) are treated as table namespaces, just like dataset namespaces (e.g., `opteryx.*`). You can control access to these protocols by adding permission entries for specific roles.

-### File System Access
+### Example Configurations
+
+#### Restrict a Role to Only Dataset Access (No Cloud Storage)
 ```json
-{"role":"file_access", "permission": "READ", "table": "file://*"}
+{"role":"restricted", "permission": "READ", "table": "opteryx.*"}
 ```
-Grants read access to all local file system paths using the `file://` protocol.
+Users with the `restricted` role can only access tables in the `opteryx.*` namespace, but cannot access `file://`, `gs://`, or `s3://` paths.

-### Google Cloud Storage Access
+#### Grant a Role Access to Dataset and GCS
 ```json
-{"role":"gcs_access", "permission": "READ", "table": "gs://*"}
+{"role":"data_analyst", "permission": "READ", "table": "opteryx.*"}
+{"role":"data_analyst", "permission": "READ", "table": "gs://*"}
 ```
-Grants read access to all Google Cloud Storage paths using the `gs://` protocol.
+Users with the `data_analyst` role can access both `opteryx.*` tables and any `gs://` paths.

-### Amazon S3 Access
+#### Grant a Role Access to All Cloud Protocols
 ```json
-{"role":"s3_access", "permission": "READ", "table": "s3://*"}
+{"role":"data_engineer", "permission": "READ", "table": "opteryx.*"}
+{"role":"data_engineer", "permission": "READ", "table": "file://*"}
+{"role":"data_engineer", "permission": "READ", "table": "gs://*"}
+{"role":"data_engineer", "permission": "READ", "table": "s3://*"}
 ```
-Grants read access to all Amazon S3 paths using the `s3://` protocol.
-
-## Examples
-
-### Restrict Access to Specific Protocols
+Users with the `data_engineer` role can access all data sources.

-A user with only the `restricted` role can only access tables in the `opteryx.*` namespace:
+#### Grant a Role Access to Specific GCS Buckets
 ```json
-{"role":"restricted", "permission": "READ", "table": "opteryx.*"}
+{"role":"project_team", "permission": "READ", "table": "gs://project-bucket/*"}
 ```
+Users with the `project_team` role can only access paths in the `gs://project-bucket/` bucket.

-### Grant Multi-Protocol Access
-
-A user can have multiple roles to access different protocols:
-- Role `file_access` + role `gcs_access` → can access both `file://` and `gs://` paths
-- Role `restricted` + role `s3_access` → can access `opteryx.*` tables and `s3://` paths
+## Default Access

-### Default Access
-
-The system includes a default role `opteryx` that has access to everything:
+The system includes a default role `opteryx` with wildcard access to everything:
 ```json
 {"role":"opteryx", "permission": "READ", "table": "*"}
 ```
+This is added automatically and cannot be overridden by the permissions.json file.

 ## Usage in Queries

-When you query using protocol prefixes, the permission system checks the full table name:
+When you query using protocol prefixes, the permission system checks if your role has access to that table pattern:

 ```sql
--- Requires 'gcs_access' role or 'opteryx' role
+-- Requires a role with permission for "gs://*" pattern
 SELECT * FROM gs://my-bucket/data/*.parquet

--- Requires 's3_access' role or 'opteryx' role
+-- Requires a role with permission for "s3://*" pattern
 SELECT * FROM s3://my-bucket/logs/2024-01-??.csv

--- Requires 'file_access' role or 'opteryx' role
+-- Requires a role with permission for "file://*" pattern
 SELECT * FROM file://path/to/data/*.csv

--- Requires 'restricted' role or 'opteryx' role
+-- Requires a role with permission for "opteryx.*" pattern
+SELECT * FROM opteryx.space_missions
+```
+
+## Multiple Roles
+
+Users can have multiple roles. If any role grants access to a table pattern, the user can access it:
+
+```sql
+-- User with roles ["restricted", "cloud_user"] where:
+-- - "restricted" has permission for "opteryx.*"
+-- - "cloud_user" has permission for "gs://*"
+
+-- ✓ Allowed - restricted role grants access
 SELECT * FROM opteryx.space_missions
+
+-- ✓ Allowed - cloud_user role grants access
+SELECT * FROM gs://bucket/data/*.parquet
+
+-- ✗ Denied - no role grants access
+SELECT * FROM s3://bucket/data/*.parquet
 ```

 ## Security Best Practices

 1. **Least Privilege**: Only grant the minimum permissions needed for each role
-2. **Separate Roles**: Create separate roles for different data sources (file, GCS, S3, databases)
-3. **Monitor Access**: Log and review which roles access which data sources
-4. **Audit Regularly**: Review and update permissions as access requirements change
+2. **Namespace Separation**: Use table patterns to restrict access to specific namespaces or buckets
+3. **Protocol Control**: Explicitly grant or deny protocol access (file://, gs://, s3://) per role
+4. **Monitor Access**: Log and review which roles access which data sources
+5. **Audit Regularly**: Review and update permissions as access requirements change

 ## Testing

````
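The "Multiple Roles" rule in the README change above (access is granted if any of the user's roles has a matching table pattern) can be illustrated with a minimal Python sketch. This is not Opteryx's actual implementation: the `can_read` helper, the `PERMISSIONS` list, and the use of `fnmatch`-style wildcards are all assumptions made for illustration, and the real matcher may treat wildcards differently.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory copy of permission entries, using the same
# keys as the lines in permissions.json. The "opteryx" wildcard entry
# mirrors the default role described in the README.
PERMISSIONS = [
    {"role": "opteryx", "permission": "READ", "table": "*"},
    {"role": "restricted", "permission": "READ", "table": "opteryx.*"},
    {"role": "cloud_user", "permission": "READ", "table": "gs://*"},
]


def can_read(roles, table, permissions=PERMISSIONS):
    """Return True if any of the given roles has a READ entry whose
    pattern matches the table name (assumed fnmatch-style wildcards)."""
    return any(
        entry["permission"] == "READ"
        and entry["role"] in roles
        and fnmatchcase(table, entry["table"])
        for entry in permissions
    )


user_roles = ["restricted", "cloud_user"]
print(can_read(user_roles, "opteryx.space_missions"))      # dataset namespace
print(can_read(user_roles, "gs://bucket/data/*.parquet"))  # GCS namespace
print(can_read(user_roles, "s3://bucket/data/*.parquet"))  # no matching role
```

Under these assumptions the protocol prefix needs no special handling: `gs://*` is just another pattern matched against the full table name, which is the idea behind treating protocols as table namespaces.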

testdata/permissions.json

Lines changed: 0 additions & 3 deletions
```diff
@@ -1,4 +1 @@
 {"role":"restricted", "permission": "READ", "table": "opteryx.*"}
-{"role":"file_access", "permission": "READ", "table": "file://*"}
-{"role":"gcs_access", "permission": "READ", "table": "gs://*"}
-{"role":"s3_access", "permission": "READ", "table": "s3://*"}
```
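For context on the file format, the README describes `permissions.json` as one JSON object per line, with a default `opteryx` wildcard entry added automatically. A minimal sketch of reading such a file might look like the following. The `parse_permissions` helper and `DEFAULT_ENTRY` constant are hypothetical names, and placing the default entry first is an assumption about how "added automatically" could work, not a description of the project's loader.

```python
import json

# Assumed default entry, taken from the "Default Access" section of the README.
DEFAULT_ENTRY = {"role": "opteryx", "permission": "READ", "table": "*"}


def parse_permissions(text):
    """Parse JSON-lines permission text into a list of dicts.

    The default entry is always included, regardless of the file content.
    Blank lines are skipped.
    """
    entries = [DEFAULT_ENTRY.copy()]
    for line in text.splitlines():
        line = line.strip()
        if line:
            entries.append(json.loads(line))
    return entries


# The single line left in testdata/permissions.json after this commit.
sample = '{"role":"restricted", "permission": "READ", "table": "opteryx.*"}\n'
entries = parse_permissions(sample)
for entry in entries:
    print(entry["role"], entry["table"])
```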
