@@ -14,74 +14,92 @@ The `permissions.json` file contains one JSON object per line, each defining a p
 - **permission**: The type of permission (currently only "READ" is supported)
 - **table**: A pattern (supporting wildcards) that matches table names
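Because the file holds one JSON object per line, it can be read line by line with a standard JSON parser. A minimal Python sketch, assuming the `role`, `permission`, and `table` keys used in the examples in this document; the helper name `load_permissions` and the `SAMPLE` text are hypothetical and are not Opteryx's own loader:

```python
import json


def load_permissions(text):
    """Parse JSON-lines text into a list of permission entries (dicts)."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between entries
        entries.append(json.loads(line))
    return entries


# Hypothetical file contents, mirroring the entry format used in this document.
SAMPLE = """
{"role": "data_analyst", "permission": "READ", "table": "opteryx.*"}
{"role": "data_analyst", "permission": "READ", "table": "gs://*"}
"""

entries = load_permissions(SAMPLE)
```

Each element of `entries` is a plain dictionary, so `entries[1]["table"]` holds the pattern string of the second line.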
 
-## Protocol Prefix Permissions
+## Protocol Prefixes as Table Namespaces
 
-Starting with the wildcard support for cloud storage paths, you can now control access to different storage protocols using permission patterns:
+Protocol prefixes (`file://`, `gs://`, `s3://`) are treated as table namespaces, just like dataset namespaces (e.g., `opteryx.*`). You can control access to these protocols by adding permission entries for specific roles.
 
-### File System Access
+### Example Configurations
+
+#### Restrict a Role to Only Dataset Access (No Cloud Storage)
 ```json
-{"role": "file_access", "permission": "READ", "table": "file://*"}
+{"role": "restricted", "permission": "READ", "table": "opteryx.*"}
 ```
-Grants read access to all local file system paths using the `file://` protocol.
+Users with the `restricted` role can only access tables in the `opteryx.*` namespace, but cannot access `file://`, `gs://`, or `s3://` paths.
 
-### Google Cloud Storage Access
+#### Grant a Role Access to Dataset and GCS
 ```json
-{"role": "gcs_access", "permission": "READ", "table": "gs://*"}
+{"role": "data_analyst", "permission": "READ", "table": "opteryx.*"}
+{"role": "data_analyst", "permission": "READ", "table": "gs://*"}
 ```
-Grants read access to all Google Cloud Storage paths using the `gs://` protocol.
+Users with the `data_analyst` role can access both `opteryx.*` tables and any `gs://` paths.
 
-### Amazon S3 Access
+#### Grant a Role Access to All Cloud Protocols
 ```json
-{"role": "s3_access", "permission": "READ", "table": "s3://*"}
+{"role": "data_engineer", "permission": "READ", "table": "opteryx.*"}
+{"role": "data_engineer", "permission": "READ", "table": "file://*"}
+{"role": "data_engineer", "permission": "READ", "table": "gs://*"}
+{"role": "data_engineer", "permission": "READ", "table": "s3://*"}
 ```
-Grants read access to all Amazon S3 paths using the `s3://` protocol.
-
-## Examples
-
-### Restrict Access to Specific Protocols
+Users with the `data_engineer` role can access all data sources.
 
-A user with only the `restricted` role can only access tables in the `opteryx.*` namespace:
+#### Grant a Role Access to Specific GCS Buckets
 ```json
-{"role": "restricted", "permission": "READ", "table": "opteryx.*"}
+{"role": "project_team", "permission": "READ", "table": "gs://project-bucket/*"}
 ```
+Users with the `project_team` role can only access paths in the `gs://project-bucket/` bucket.
 
-### Grant Multi-Protocol Access
-
-A user can have multiple roles to access different protocols:
-- Role `file_access` + role `gcs_access` → can access both `file://` and `gs://` paths
-- Role `restricted` + role `s3_access` → can access `opteryx.*` tables and `s3://` paths
+## Default Access
 
-### Default Access
-
-The system includes a default role `opteryx` that has access to everything:
+The system includes a default role `opteryx` with wildcard access to everything:
 ```json
 {"role": "opteryx", "permission": "READ", "table": "*"}
 ```
+This is added automatically and cannot be overridden by the permissions.json file.
 
 ## Usage in Queries
 
-When you query using protocol prefixes, the permission system checks the full table name:
+When you query using protocol prefixes, the permission system checks if your role has access to that table pattern:
 
 ```sql
--- Requires 'gcs_access' role or 'opteryx' role
+-- Requires a role with permission for "gs://*" pattern
 SELECT * FROM gs://my-bucket/data/*.parquet
 
--- Requires 's3_access' role or 'opteryx' role
+-- Requires a role with permission for "s3://*" pattern
 SELECT * FROM s3://my-bucket/logs/2024-01-??.csv
 
--- Requires 'file_access' role or 'opteryx' role
+-- Requires a role with permission for "file://*" pattern
 SELECT * FROM file://path/to/data/*.csv
 
--- Requires 'restricted' role or 'opteryx' role
+-- Requires a role with permission for "opteryx.*" pattern
+SELECT * FROM opteryx.space_missions
+```
+
+## Multiple Roles
+
+Users can have multiple roles. If any role grants access to a table pattern, the user can access it:
+
+```sql
+-- User with roles ["restricted", "cloud_user"] where:
+-- - "restricted" has permission for "opteryx.*"
+-- - "cloud_user" has permission for "gs://*"
+
+-- ✓ Allowed - restricted role grants access
 SELECT * FROM opteryx.space_missions
+
+-- ✓ Allowed - cloud_user role grants access
+SELECT * FROM gs://bucket/data/*.parquet
+
+-- ✗ Denied - no role grants access
+SELECT * FROM s3://bucket/data/*.parquet
 ```
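The rule that any one role can grant access can be sketched in Python. This is an illustration only: it assumes shell-style wildcard matching through the standard library's `fnmatch.fnmatchcase`, and the `can_read` helper and in-memory `PERMISSIONS` list are hypothetical names, not necessarily how the permission system is implemented.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory copy of the entries described in the comments above.
PERMISSIONS = [
    {"role": "restricted", "permission": "READ", "table": "opteryx.*"},
    {"role": "cloud_user", "permission": "READ", "table": "gs://*"},
]


def can_read(roles, table, permissions=PERMISSIONS):
    """Return True if any of the user's roles has a READ entry matching `table`."""
    return any(
        entry["role"] in roles
        and entry["permission"] == "READ"
        and fnmatchcase(table, entry["table"])  # assumed matching rule
        for entry in permissions
    )


roles = ["restricted", "cloud_user"]
allowed_dataset = can_read(roles, "opteryx.space_missions")
allowed_gcs = can_read(roles, "gs://bucket/data/*.parquet")
allowed_s3 = can_read(roles, "s3://bucket/data/*.parquet")
```

Under these assumptions the first two checks succeed and the `s3://` check fails, matching the three queries above. A user holding only `restricted` would also be denied the `gs://` path.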
 
 ## Security Best Practices
 
 1. **Least Privilege**: Only grant the minimum permissions needed for each role
-2. **Separate Roles**: Create separate roles for different data sources (file, GCS, S3, databases)
-3. **Monitor Access**: Log and review which roles access which data sources
-4. **Audit Regularly**: Review and update permissions as access requirements change
+2. **Namespace Separation**: Use table patterns to restrict access to specific namespaces or buckets
+3. **Protocol Control**: Explicitly grant or deny protocol access (file://, gs://, s3://) per role
+4. **Monitor Access**: Log and review which roles access which data sources
+5. **Audit Regularly**: Review and update permissions as access requirements change
 
 ## Testing
 