|  | 1 | +# Protocol Prefix Permissions | 
|  | 2 | + | 
|  | 3 | +This directory contains example permission configurations for controlling access to different data sources in Opteryx. | 
|  | 4 | + | 
|  | 5 | +## permissions.json Format | 
|  | 6 | + | 
|  | 7 | +The `permissions.json` file contains one JSON object per line, each defining a permission rule: | 
|  | 8 | + | 
|  | 9 | +```json | 
|  | 10 | +{"role":"role_name", "permission": "READ", "table": "pattern"} | 
|  | 11 | +``` | 
|  | 12 | + | 
|  | 13 | +- **role**: The name of the role that has this permission | 
|  | 14 | +- **permission**: The type of permission (currently only "READ" is supported) | 
|  | 15 | +- **table**: A pattern (supporting wildcards) that matches table names | 
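
As a rough illustration of the format above, the following sketch reads a one-object-per-line file into a list of rules. The function name `load_permissions` and the validation it performs are assumptions made for this example; they are not taken from the Opteryx codebase.

```python
import json

REQUIRED_KEYS = {"role", "permission", "table"}


def load_permissions(text):
    """Parse JSON-lines permission rules (hypothetical helper), skipping blank lines."""
    rules = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        rule = json.loads(line)
        missing = REQUIRED_KEYS - rule.keys()
        if missing:
            raise ValueError(f"line {line_number}: missing keys {sorted(missing)}")
        rules.append(rule)
    return rules


example = """
{"role":"restricted", "permission": "READ", "table": "opteryx.*"}
{"role":"data_analyst", "permission": "READ", "table": "gs://*"}
"""

rules = load_permissions(example)
for rule in rules:
    print(rule["role"], rule["permission"], rule["table"])
```

Because each line is an independent JSON object, the file cannot be parsed with a single `json.loads` call over its whole contents.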
|  | 16 | + | 
|  | 17 | +## Protocol Prefixes as Table Namespaces | 
|  | 18 | + | 
|  | 19 | +Protocol prefixes (`file://`, `gs://`, `s3://`) are treated as table namespaces, just like dataset namespaces (e.g., `opteryx.*`). You can control access to these protocols by adding permission entries for specific roles. | 
|  | 20 | + | 
|  | 21 | +### Example Configurations | 
|  | 22 | + | 
|  | 23 | +#### Restrict a Role to Only Dataset Access (No Cloud Storage) | 
|  | 24 | +```json | 
|  | 25 | +{"role":"restricted", "permission": "READ", "table": "opteryx.*"} | 
|  | 26 | +``` | 
|  | 27 | +Users with the `restricted` role can access only tables in the `opteryx.*` namespace; they cannot access `file://`, `gs://`, or `s3://` paths. | 
|  | 28 | + | 
|  | 29 | +#### Grant a Role Access to Dataset and GCS | 
|  | 30 | +```json | 
|  | 31 | +{"role":"data_analyst", "permission": "READ", "table": "opteryx.*"} | 
|  | 32 | +{"role":"data_analyst", "permission": "READ", "table": "gs://*"} | 
|  | 33 | +``` | 
|  | 34 | +Users with the `data_analyst` role can access both `opteryx.*` tables and any `gs://` paths. | 
|  | 35 | + | 
|  | 36 | +#### Grant a Role Access to All Protocols | 
|  | 37 | +```json | 
|  | 38 | +{"role":"data_engineer", "permission": "READ", "table": "opteryx.*"} | 
|  | 39 | +{"role":"data_engineer", "permission": "READ", "table": "file://*"} | 
|  | 40 | +{"role":"data_engineer", "permission": "READ", "table": "gs://*"} | 
|  | 41 | +{"role":"data_engineer", "permission": "READ", "table": "s3://*"} | 
|  | 42 | +``` | 
|  | 43 | +Users with the `data_engineer` role can access all data sources. | 
|  | 44 | + | 
|  | 45 | +#### Grant a Role Access to a Specific GCS Bucket | 
|  | 46 | +```json | 
|  | 47 | +{"role":"project_team", "permission": "READ", "table": "gs://project-bucket/*"} | 
|  | 48 | +``` | 
|  | 49 | +Users with the `project_team` role can access only paths under `gs://project-bucket/`. | 
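
To make the effect of a bucket-scoped pattern concrete, here is a minimal sketch that assumes patterns follow shell-style wildcard rules, modelled with Python's standard `fnmatch` module. Whether Opteryx matches patterns exactly this way is an assumption, and `pattern_allows` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase


def pattern_allows(pattern, table):
    """Return True if `table` matches `pattern` (hypothetical helper).

    Assumption: "*" matches any run of characters, including "/" and ".".
    """
    return fnmatchcase(table, pattern)


pattern = "gs://project-bucket/*"
candidates = [
    "gs://project-bucket/2024/events.parquet",
    "gs://project-bucket-archive/events.parquet",
    "gs://other-bucket/events.parquet",
]
matches = {table: pattern_allows(pattern, table) for table in candidates}
for table, allowed in matches.items():
    print(table, "allowed" if allowed else "denied")
```

Under these matching rules, the trailing `/` before the `*` matters: a pattern written as `gs://project-bucket*` would also match `gs://project-bucket-archive/...`.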
|  | 50 | + | 
|  | 51 | +## Default Access | 
|  | 52 | + | 
|  | 53 | +The system includes a default role `opteryx` with wildcard access to everything: | 
|  | 54 | +```json | 
|  | 55 | +{"role":"opteryx", "permission": "READ", "table": "*"} | 
|  | 56 | +``` | 
|  | 57 | +This rule is added automatically and cannot be overridden by entries in `permissions.json`. | 
|  | 58 | + | 
|  | 59 | +## Usage in Queries | 
|  | 60 | + | 
|  | 61 | +When a query uses a protocol prefix, the permission system checks whether any of your roles has a permission whose pattern matches that table name: | 
|  | 62 | + | 
|  | 63 | +```sql | 
|  | 64 | +-- Requires a role with permission for "gs://*" pattern | 
|  | 65 | +SELECT * FROM gs://my-bucket/data/*.parquet | 
|  | 66 | + | 
|  | 67 | +-- Requires a role with permission for "s3://*" pattern | 
|  | 68 | +SELECT * FROM s3://my-bucket/logs/2024-01-??.csv | 
|  | 69 | + | 
|  | 70 | +-- Requires a role with permission for "file://*" pattern | 
|  | 71 | +SELECT * FROM file://path/to/data/*.csv | 
|  | 72 | + | 
|  | 73 | +-- Requires a role with permission for "opteryx.*" pattern | 
|  | 74 | +SELECT * FROM opteryx.space_missions | 
|  | 75 | +``` | 
|  | 76 | + | 
|  | 77 | +## Multiple Roles | 
|  | 78 | + | 
|  | 79 | +Users can have multiple roles. If any role grants access to a table pattern, the user can access it: | 
|  | 80 | + | 
|  | 81 | +```sql | 
|  | 82 | +-- User with roles ["restricted", "cloud_user"] where: | 
|  | 83 | +-- - "restricted" has permission for "opteryx.*" | 
|  | 84 | +-- - "cloud_user" has permission for "gs://*" | 
|  | 85 | + | 
|  | 86 | +-- ✓ Allowed - restricted role grants access | 
|  | 87 | +SELECT * FROM opteryx.space_missions | 
|  | 88 | + | 
|  | 89 | +-- ✓ Allowed - cloud_user role grants access | 
|  | 90 | +SELECT * FROM gs://bucket/data/*.parquet | 
|  | 91 | + | 
|  | 92 | +-- ✗ Denied - no role grants access | 
|  | 93 | +SELECT * FROM s3://bucket/data/*.parquet | 
|  | 94 | +``` | 
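
The any-role rule described above can be sketched as follows. This is an illustrative model only: the `can_read` function, the in-memory `RULES` list, and the use of `fnmatch` for pattern matching are assumptions for the example, not the Opteryx implementation.

```python
from fnmatch import fnmatchcase

# Rules mirroring the scenario in the SQL comments above.
RULES = [
    {"role": "restricted", "permission": "READ", "table": "opteryx.*"},
    {"role": "cloud_user", "permission": "READ", "table": "gs://*"},
]


def can_read(user_roles, table, rules=RULES):
    """Return True if any of the user's roles has a READ rule matching `table`."""
    return any(
        rule["role"] in user_roles
        and rule["permission"] == "READ"
        and fnmatchcase(table, rule["table"])  # assumed shell-style matching
        for rule in rules
    )


user_roles = {"restricted", "cloud_user"}
tables = [
    "opteryx.space_missions",
    "gs://bucket/data/*.parquet",
    "s3://bucket/data/*.parquet",
]
results = {table: can_read(user_roles, table) for table in tables}
for table, allowed in results.items():
    print(table, "allowed" if allowed else "denied")
```

In this model access is purely additive: adding a role can only widen what a user may read, and a table is denied only when no rule for any of the user's roles matches it.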
|  | 95 | + | 
|  | 96 | +## Security Best Practices | 
|  | 97 | + | 
|  | 98 | +1. **Least Privilege**: Grant only the minimum permissions needed for each role | 
|  | 99 | +2. **Namespace Separation**: Use table patterns to restrict access to specific namespaces or buckets | 
|  | 100 | +3. **Protocol Control**: Grant protocol access (`file://`, `gs://`, `s3://`) explicitly per role; a role with no matching entry for a protocol cannot read from it | 
|  | 101 | +4. **Monitor Access**: Log and review which roles access which data sources | 
|  | 102 | +5. **Audit Regularly**: Review and update permissions as access requirements change | 
|  | 103 | + | 
|  | 104 | +## Testing | 
|  | 105 | + | 
|  | 106 | +See `tests/unit/security/test_protocol_permissions.py` for comprehensive tests of the protocol prefix permission system. | 