Commit 3c6fc89

feat(athena-driver): export env variables for IAM assume role auth (#9882)
1 parent de537e1 commit 3c6fc89

5 files changed, +405 -37 lines changed
docs/pages/product/configuration/data-sources/aws-athena.mdx

Lines changed: 35 additions & 11 deletions
@@ -13,6 +13,8 @@
 
 Add the following to a `.env` file in your Cube project:
 
+#### Static Credentials
+
 ```dotenv
 CUBEJS_DB_TYPE=athena
 CUBEJS_AWS_KEY=AKIA************
@@ -24,6 +26,24 @@ CUBEJS_DB_NAME=my_non_default_athena_database
 CUBEJS_AWS_ATHENA_CATALOG=AwsDataCatalog
 ```
 
+#### IAM Role Assumption
+
+For enhanced security, you can configure Cube to assume an IAM role to access Athena:
+
+```dotenv
+CUBEJS_DB_TYPE=athena
+CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN=arn:aws:iam::123456789012:role/AthenaAccessRole
+CUBEJS_AWS_REGION=us-east-1
+CUBEJS_AWS_S3_OUTPUT_LOCATION=s3://my-athena-output-bucket
+CUBEJS_AWS_ATHENA_WORKGROUP=primary
+# Optional: if the role requires an external ID
+CUBEJS_AWS_ATHENA_ASSUME_ROLE_EXTERNAL_ID=unique-external-id
+```
+
+When using role assumption:
+- If running in AWS (EC2, ECS, EKS with IRSA), the driver will use the instance's IAM role or service account to assume the target role
+- You can also provide `CUBEJS_AWS_KEY` and `CUBEJS_AWS_SECRET` as master credentials for the role assumption
+
 ### Cube Cloud
 
 <InfoBox heading="Allowing connections from Cube Cloud IP">
@@ -52,17 +72,21 @@ if [dedicated infrastructure][ref-dedicated-infra] is used. Check out the
 
 ## Environment Variables
 
-| Environment Variable | Description | Possible Values | Required |
-| ------------------------------- | ---------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ | :------: |
-| `CUBEJS_AWS_KEY` | The AWS Access Key ID to use for database connections | A valid AWS Access Key ID ||
-| `CUBEJS_AWS_SECRET` | The AWS Secret Access Key to use for database connections | A valid AWS Secret Access Key ||
-| `CUBEJS_AWS_REGION` | The AWS region of the Cube deployment | [A valid AWS region][aws-docs-regions] ||
-| `CUBEJS_AWS_S3_OUTPUT_LOCATION` | The S3 path to store query results made by the Cube deployment | A valid S3 path ||
-| `CUBEJS_AWS_ATHENA_WORKGROUP` | The name of the workgroup in which the query is being started | [A valid Athena Workgroup][aws-athena-workgroup] ||
-| `CUBEJS_AWS_ATHENA_CATALOG` | The name of the catalog to use by default | [A valid Athena Catalog name][awsdatacatalog] ||
-| `CUBEJS_DB_NAME` | The name of the database to use by default | A valid Athena Database name ||
-| `CUBEJS_DB_SCHEMA` | The name of the schema to use as `information_schema` filter. Reduces count of tables loaded during schema generation. | A valid schema name ||
-| `CUBEJS_CONCURRENCY` | The number of [concurrent queries][ref-data-source-concurrency] to the data source | A valid number ||
+| Environment Variable | Description | Possible Values | Required |
+| ----------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ | :------: |
+| `CUBEJS_AWS_KEY` | The AWS Access Key ID to use for database connections | A valid AWS Access Key ID | ❌<sup>1</sup> |
+| `CUBEJS_AWS_SECRET` | The AWS Secret Access Key to use for database connections | A valid AWS Secret Access Key | ❌<sup>1</sup> |
+| `CUBEJS_AWS_REGION` | The AWS region of the Cube deployment | [A valid AWS region][aws-docs-regions] ||
+| `CUBEJS_AWS_S3_OUTPUT_LOCATION` | The S3 path to store query results made by the Cube deployment | A valid S3 path ||
+| `CUBEJS_AWS_ATHENA_WORKGROUP` | The name of the workgroup in which the query is being started | [A valid Athena Workgroup][aws-athena-workgroup] ||
+| `CUBEJS_AWS_ATHENA_CATALOG` | The name of the catalog to use by default | [A valid Athena Catalog name][awsdatacatalog] ||
+| `CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN` | The ARN of the IAM role to assume for Athena access | A valid IAM role ARN ||
+| `CUBEJS_AWS_ATHENA_ASSUME_ROLE_EXTERNAL_ID` | The external ID to use when assuming the IAM role (if required by the role's trust policy) | A string ||
+| `CUBEJS_DB_NAME` | The name of the database to use by default | A valid Athena Database name ||
+| `CUBEJS_DB_SCHEMA` | The name of the schema to use as `information_schema` filter. Reduces count of tables loaded during schema generation. | A valid schema name ||
+| `CUBEJS_CONCURRENCY` | The number of [concurrent queries][ref-data-source-concurrency] to the data source | A valid number ||
+
+<sup>1</sup> Either provide `CUBEJS_AWS_KEY` and `CUBEJS_AWS_SECRET` for static credentials, or use `CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN` for role-based authentication. When using role assumption without static credentials, the driver will use the AWS SDK's default credential chain (IAM instance profile, EKS IRSA, etc.).
 
 [ref-data-source-concurrency]: /product/configuration/concurrency#data-source-concurrency
 
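
The footnote's either/or rule leaves two combinations that the driver change in this commit appears to accept silently: a key without a secret (or the reverse), and an external ID without a role ARN. A minimal sketch of a pre-flight check for those cases follows. The function `checkAthenaAuthEnv` and the type `EnvLike` are hypothetical names used for illustration and are not part of Cube; only the environment variable names come from the docs above.

```typescript
// Hypothetical helper, not part of Cube: flags auth settings that the
// driver would accept but silently ignore.
type EnvLike = Record<string, string | undefined>;

function checkAthenaAuthEnv(env: EnvLike): string[] {
  const issues: string[] = [];
  const hasKey = Boolean(env['CUBEJS_AWS_KEY']);
  const hasSecret = Boolean(env['CUBEJS_AWS_SECRET']);
  const hasRole = Boolean(env['CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN']);
  const hasExternalId = Boolean(env['CUBEJS_AWS_ATHENA_ASSUME_ROLE_EXTERNAL_ID']);

  // The driver only uses static keys when both halves are present.
  if (hasKey !== hasSecret) {
    issues.push('CUBEJS_AWS_KEY and CUBEJS_AWS_SECRET must be set together; a lone value is ignored');
  }
  // The external ID is only read when a role ARN is configured.
  if (hasExternalId && !hasRole) {
    issues.push('CUBEJS_AWS_ATHENA_ASSUME_ROLE_EXTERNAL_ID has no effect without CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN');
  }
  return issues;
}

// Usage: a role ARN on its own is a valid configuration.
const roleOnlyIssues = checkAthenaAuthEnv({
  CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN: 'arn:aws:iam::123456789012:role/AthenaAccessRole',
}); // -> []
```

The check returns messages instead of throwing, so a caller can decide whether a misconfiguration should be a warning or a startup failure.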

packages/cubejs-athena-driver/package.json

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@
   "types": "dist/src/index.d.ts",
   "dependencies": {
     "@aws-sdk/client-athena": "^3.22.0",
+    "@aws-sdk/credential-providers": "^3.22.0",
     "@cubejs-backend/base-driver": "1.3.52",
     "@cubejs-backend/shared": "1.3.52",
     "sqlstring": "^2.3.1"

packages/cubejs-athena-driver/src/AthenaDriver.ts

Lines changed: 25 additions & 3 deletions
@@ -36,6 +36,7 @@ import {
 } from '@cubejs-backend/base-driver';
 import * as SqlString from 'sqlstring';
 import { AthenaClientConfig } from '@aws-sdk/client-athena/dist-types/AthenaClient';
+import { fromTemporaryCredentials } from '@aws-sdk/credential-providers';
 import { URL } from 'url';
 
 interface AthenaDriverOptions extends AthenaClientConfig {
@@ -124,16 +125,37 @@ export class AthenaDriver extends BaseDriver implements DriverInterface {
       config.secretAccessKey ||
       getEnv('athenaAwsSecret', { dataSource });
 
+    const assumeRoleArn = getEnv('athenaAwsAssumeRoleArn', { dataSource });
+    const assumeRoleExternalId = getEnv('athenaAwsAssumeRoleExternalId', { dataSource });
+
     const { schema, ...restConfig } = config;
 
     this.schema = schema ||
       getEnv('dbName', { dataSource }) ||
       getEnv('dbSchema', { dataSource });
 
+    // Configure credentials based on authentication method
+    let credentials;
+    if (assumeRoleArn) {
+      // Use assume role authentication
+      credentials = fromTemporaryCredentials({
+        params: {
+          RoleArn: assumeRoleArn,
+          ...(assumeRoleExternalId && { ExternalId: assumeRoleExternalId }),
+        },
+        ...(accessKeyId && secretAccessKey && {
+          masterCredentials: { accessKeyId, secretAccessKey },
+        }),
+      });
+    } else if (accessKeyId && secretAccessKey) {
+      // If access key and secret are provided, use them as master credentials
+      // Otherwise, let the SDK use the default credential chain (IRSA, instance profile, etc.)
+      credentials = { accessKeyId, secretAccessKey };
+    }
+
     this.config = {
-      credentials: accessKeyId && secretAccessKey
-        ? { accessKeyId, secretAccessKey }
-        : undefined,
+      // If no credentials are provided, the SDK will use the default chain
+      ...(credentials && { credentials }),
       ...restConfig,
       region:
         config.region ||
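
The branching in this hunk yields three outcomes: assume a role (optionally seeded with static keys as master credentials), use static keys directly, or fall through to the SDK's default credential chain. The sketch below restates that decision as a pure function so the precedence is easy to see and test without the AWS SDK. `planAthenaAuth`, `AuthPlan`, and the other type names are hypothetical and exist only for illustration; the real driver builds SDK credential objects instead of a plan.

```typescript
// Hypothetical names throughout: this mirrors the branching in the diff
// above without importing the AWS SDK.
interface StaticKeys {
  accessKeyId: string;
  secretAccessKey: string;
}

interface AssumeRolePlan {
  mode: 'assume-role';
  roleArn: string;
  externalId?: string;
  masterCredentials?: StaticKeys;
}

type AuthPlan =
  | AssumeRolePlan
  | { mode: 'static'; credentials: StaticKeys }
  | { mode: 'default-chain' };

interface AuthInputs {
  accessKeyId?: string;
  secretAccessKey?: string;
  assumeRoleArn?: string;
  assumeRoleExternalId?: string;
}

function planAthenaAuth(input: AuthInputs): AuthPlan {
  const { accessKeyId, secretAccessKey, assumeRoleArn, assumeRoleExternalId } = input;
  // Static keys only count when both halves are present.
  const keys: StaticKeys | undefined =
    accessKeyId && secretAccessKey ? { accessKeyId, secretAccessKey } : undefined;

  if (assumeRoleArn) {
    // A role ARN takes precedence; static keys become master credentials.
    const plan: AssumeRolePlan = { mode: 'assume-role', roleArn: assumeRoleArn };
    if (assumeRoleExternalId) plan.externalId = assumeRoleExternalId;
    if (keys) plan.masterCredentials = keys;
    return plan;
  }
  if (keys) {
    return { mode: 'static', credentials: keys };
  }
  // Nothing configured: the SDK resolves credentials on its own.
  return { mode: 'default-chain' };
}

// Usage: with no inputs at all, the plan defers to the default chain.
const emptyPlan = planAthenaAuth({}); // -> { mode: 'default-chain' }
```

Note that a role ARN wins even when static keys are also set, which matches the order of the `if` / `else if` in the hunk.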

packages/cubejs-backend-shared/src/env.ts

Lines changed: 26 additions & 0 deletions
@@ -1147,6 +1147,32 @@ const variables: Record<string, (...args: any) => any> = {
     ]
   ),
 
+  /**
+   * Athena AWS Assume Role ARN.
+   */
+  athenaAwsAssumeRoleArn: ({
+    dataSource
+  }: {
+    dataSource: string,
+  }) => (
+    process.env[
+      keyByDataSource('CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN', dataSource)
+    ]
+  ),
+
+  /**
+   * Athena AWS Assume Role External ID.
+   */
+  athenaAwsAssumeRoleExternalId: ({
+    dataSource
+  }: {
+    dataSource: string,
+  }) => (
+    process.env[
+      keyByDataSource('CUBEJS_AWS_ATHENA_ASSUME_ROLE_EXTERNAL_ID', dataSource)
+    ]
+  ),
+
   /** ****************************************************************
    * BigQuery Driver                                                  *
    ***************************************************************** */
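
Both new getters resolve their variable name through `keyByDataSource`, whose implementation is not shown in this commit. The sketch below illustrates the kind of mapping it is assumed to perform, based on the `CUBEJS_DS_<NAME>_` prefix convention that Cube documents for multiple data sources. `envKeyFor` is a hypothetical stand-in, and the real function may apply additional checks that are not reproduced here.

```typescript
// Hypothetical stand-in for keyByDataSource; the real implementation is not
// part of this diff. Assumed convention: the default data source keeps the
// variable name as-is, and any other data source gets a CUBEJS_DS_<NAME>_ prefix.
function envKeyFor(baseKey: string, dataSource?: string): string {
  if (!dataSource || dataSource === 'default') {
    return baseKey;
  }
  const prefix = 'CUBEJS_';
  if (baseKey.indexOf(prefix) !== 0) {
    throw new Error(`Expected a ${prefix}* variable name, got "${baseKey}"`);
  }
  return `CUBEJS_DS_${dataSource.toUpperCase()}_${baseKey.slice(prefix.length)}`;
}

// Usage: the role ARN variable for a data source named "analytics".
const analyticsRoleKey = envKeyFor('CUBEJS_AWS_ATHENA_ASSUME_ROLE_ARN', 'analytics');
// -> 'CUBEJS_DS_ANALYTICS_AWS_ATHENA_ASSUME_ROLE_ARN'
```

Under that assumption, each Athena data source could assume a different role by setting its own prefixed variable.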
