Commit cdd6377 (merge of parents 90cbf10 and c018680)

Resolve conflicts with main

30 files changed: +461 -259 lines

.bumpversion.cfg

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ first_value = 1
 
 [bumpversion:file:docs/source/layers.rst]
 
-[bumpversion:file:docs/source/what.rst]
+[bumpversion:file:docs/source/about.rst]
 
 [bumpversion:file:awswrangler/__metadata__.py]
 

.github/workflows/bandit.yml

Lines changed: 2 additions & 2 deletions
@@ -17,9 +17,9 @@ jobs:
   build:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v1
+        uses: actions/setup-python@v4
         with:
           python-version: ${{ matrix.python-version }}
       - name: Install

.github/workflows/cfn-nag.yml

Lines changed: 24 additions & 8 deletions
@@ -16,17 +16,21 @@ on:
 permissions:
   contents: read
 
+env:
+  CDK_DEFAULT_ACCOUNT: 111111111111
+  CDK_DEFAULT_REGION: us-east-1
+
 jobs:
   build:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
      - name: Use Node.js
-        uses: actions/setup-node@v1
+        uses: actions/setup-node@v3
        with:
-          node-version: '14.x'
+          node-version: 16
      - name: Cache Node.js modules
-        uses: actions/cache@v2
+        uses: actions/cache@v3
        with:
          path: ~/.npm
          key: ${{ runner.OS }}-node-${{ hashFiles('**/package-lock.json') }}

@@ -37,16 +41,28 @@ jobs:
         run: |
           npm install -g aws-cdk
           cdk --version
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v1
+        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install Requirements
        run: |
-          cd test_infra && rm -rf poetry.lock
+          cd test_infra
+          cat <<EOT >> cdk.context.json
+          {
+            "availability-zones:account=111111111111:region=us-east-1": [
+              "us-east-1a",
+              "us-east-1b",
+              "us-east-1c",
+              "us-east-1d",
+              "us-east-1e",
+              "us-east-1f"
+            ]
+          }
+          EOT
           python -m pip install --upgrade pip
-          python -m pip install poetry
+          python -m pip install poetry==1.1.15  # 1.2.0 breaking resolution of packages
           poetry config virtualenvs.create false --local
           poetry install -vvv
      - name: CDK Synth
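The new `Install Requirements` step pre-seeds CDK's context cache so that `cdk synth` can resolve availability zones for the dummy CI account without calling AWS. A standalone sketch of that heredoc (using `>` instead of the workflow's `>>` so repeated runs stay idempotent):

```shell
# Pre-cache the availability-zone lookup for the dummy CI account/region,
# so CDK reads cdk.context.json instead of calling EC2 DescribeAvailabilityZones.
cat <<EOT > cdk.context.json
{
  "availability-zones:account=111111111111:region=us-east-1": [
    "us-east-1a",
    "us-east-1b",
    "us-east-1c",
    "us-east-1d",
    "us-east-1e",
    "us-east-1f"
  ]
}
EOT

# Quick sanity check: the context key plus all six AZ entries mention the region.
grep -c "us-east-1" cdk.context.json  # prints 7
```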

.github/workflows/minimal-tests.yml

Lines changed: 2 additions & 2 deletions
@@ -24,9 +24,9 @@ jobs:
         python-version: [3.8]
 
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v1
+        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install Requirements

.github/workflows/static-checking.yml

Lines changed: 2 additions & 2 deletions
@@ -23,9 +23,9 @@ jobs:
         python-version: [3.8]
 
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v1
+        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install Requirements

CONTRIBUTING_COMMON_ERRORS.md

Lines changed: 0 additions & 19 deletions
@@ -111,22 +111,3 @@ brew install unixodbc
 ```
 
 -----
-
-## CloudFormation Deployment
-
-### Error Message
-
-During the deployment of `aws-sdk-pandas-databases`, the creation of the resource `CodeBuildTestRoleLFPermissions` fails with
-
-```
-Resource does not exist or requester is not authorized to access requested permissions. (Service: AWSLakeFormation; Status Code: 400; Error Code: AccessDeniedException; Request ID: 14a26718-ee4e-49f2-a7ca-d308e49485f8; Proxy: null)
-```
-
-### Solution
-
-The IAM role used to deploy the CloudFormation stack does not have permissions to assign permissions in AWS Lake Formation. The quickest solution is to find the IAM role and set it as an admin in Lake Formation.
-
-In order to find the role:
-1. Navigate to the CloudFormation console in your account
-1. Select the `aws-sdk-pandas-databases` stack which failed to deploy
-1. Under the "Stack info" tab, find the value for "IAM role". The name of the role should be in the following format: `arn:aws:iam::{ACCOUNT_ID}:role/cdk-{UUID}-cfn-exec-role-{ACCOUNT_ID}-{REGION}`

awswrangler/_data_types.py

Lines changed: 2 additions & 0 deletions
@@ -360,6 +360,8 @@ def athena2pandas(dtype: str) -> str:  # pylint: disable=too-many-branches,too-m
         return "decimal"
     if dtype in ("binary", "varbinary"):
         return "bytes"
+    if dtype in ("array", "row", "map"):
+        return "object"
     raise exceptions.UnsupportedType(f"Unsupported Athena type: {dtype}")
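The two added lines let `athena2pandas` hand back a generic `object` dtype for Athena's complex types instead of raising. A minimal standalone re-implementation of just the branches this hunk touches (not awswrangler's full function, which maps many more types):

```python
class UnsupportedType(Exception):
    """Stand-in for awswrangler.exceptions.UnsupportedType."""


def athena2pandas(dtype: str) -> str:
    # Existing branch: binary types map to raw bytes.
    if dtype in ("binary", "varbinary"):
        return "bytes"
    # New branch from the diff: complex types become generic Python objects
    # rather than raising UnsupportedType.
    if dtype in ("array", "row", "map"):
        return "object"
    raise UnsupportedType(f"Unsupported Athena type: {dtype}")


print(athena2pandas("array"))  # object
```

Anything outside the known mappings still raises, so genuinely unsupported types keep failing loudly.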

awswrangler/athena/_utils.py

Lines changed: 0 additions & 10 deletions
@@ -211,16 +211,6 @@ def _get_query_metadata(  # pylint: disable=too-many-statements
     col_name: str
     col_type: str
     for col_name, col_type in cols_types.items():
-        if col_type == "array":
-            raise exceptions.UnsupportedType(
-                "List data type is not supported with regular (non-CTAS and non-UNLOAD) queries. "
-                "Please use ctas_approach=True or unload_approach=True for List columns."
-            )
-        if col_type == "row":
-            raise exceptions.UnsupportedType(
-                "Struct data type is not supported with regular (non-CTAS and non-UNLOAD) queries. "
-                "Please use ctas_approach=True or unload_approach=True for Struct columns."
-            )
         pandas_type: str = _data_types.athena2pandas(dtype=col_type)
         if (categories is not None) and (col_name in categories):
             dtype[col_name] = "category"

awswrangler/data_api/redshift.py

Lines changed: 48 additions & 11 deletions
@@ -13,12 +13,19 @@
 class RedshiftDataApi(connector.DataApiConnector):
     """Provides access to a Redshift cluster via the Data API.
 
+    Note
+    ----
+    When connecting to a standard Redshift cluster, `cluster_id` is used.
+    When connecting to Redshift Serverless, `workgroup_name` is used. These two arguments are mutually exclusive.
+
     Parameters
     ----------
     cluster_id: str
-        Id for the target Redshift cluster.
+        Id for the target Redshift cluster - only required if `workgroup_name` not provided.
     database: str
         Target database name.
+    workgroup_name: str
+        Name for the target serverless Redshift workgroup - only required if `cluster_id` not provided.
     secret_arn: str
         The ARN for the secret to be used for authentication - only required if `db_user` not provided.
     db_user: str

@@ -35,8 +42,9 @@ class RedshiftDataApi(connector.DataApiConnector):
 
     def __init__(
         self,
-        cluster_id: str,
-        database: str,
+        cluster_id: str = "",
+        database: str = "",
+        workgroup_name: str = "",
         secret_arn: str = "",
         db_user: str = "",
         sleep: float = 0.25,

@@ -46,29 +54,44 @@ def __init__(
     ) -> None:
         self.cluster_id = cluster_id
         self.database = database
+        self.workgroup_name = workgroup_name
         self.secret_arn = secret_arn
         self.db_user = db_user
         self.client: boto3.client = _utils.client(service_name="redshift-data", session=boto3_session)
         self.waiter = RedshiftDataApiWaiter(self.client, sleep, backoff, retries)
         logger: logging.Logger = logging.getLogger(__name__)
         super().__init__(self.client, logger)
 
+    def _validate_redshift_target(self) -> None:
+        if self.database == "":
+            raise ValueError("`database` must be set for connection")
+        if self.cluster_id == "" and self.workgroup_name == "":
+            raise ValueError("Either `cluster_id` or `workgroup_name`(Redshift Serverless) must be set for connection")
+
     def _validate_auth_method(self) -> None:
-        if self.secret_arn == "" and self.db_user == "":
+        if self.workgroup_name == "" and self.secret_arn == "" and self.db_user == "":
             raise ValueError("Either `secret_arn` or `db_user` must be set for authentication")
 
     def _execute_statement(self, sql: str, database: Optional[str] = None) -> str:
+        self._validate_redshift_target()
         self._validate_auth_method()
-        credentials = {"SecretArn": self.secret_arn}
-        if self.db_user:
+        credentials = {}
+        if self.secret_arn:
+            credentials = {"SecretArn": self.secret_arn}
+        elif self.db_user:
             credentials = {"DbUser": self.db_user}
 
         if database is None:
             database = self.database
 
+        if self.cluster_id:
+            redshift_target = {"ClusterIdentifier": self.cluster_id}
+        elif self.workgroup_name:
+            redshift_target = {"WorkgroupName": self.workgroup_name}
+
         self.logger.debug("Executing %s", sql)
         response: Dict[str, Any] = self.client.execute_statement(
-            ClusterIdentifier=self.cluster_id,
+            **redshift_target,
             Database=database,
             Sql=sql,
             **credentials,

@@ -167,21 +190,29 @@ class RedshiftDataApiTimeoutException(Exception):
 
 
 def connect(
-    cluster_id: str,
-    database: str,
+    cluster_id: str = "",
+    database: str = "",
+    workgroup_name: str = "",
     secret_arn: str = "",
     db_user: str = "",
     boto3_session: Optional[boto3.Session] = None,
     **kwargs: Any,
 ) -> RedshiftDataApi:
     """Create a Redshift Data API connection.
 
+    Note
+    ----
+    When connecting to a standard Redshift cluster, `cluster_id` is used.
+    When connecting to Redshift Serverless, `workgroup_name` is used. These two arguments are mutually exclusive.
+
     Parameters
     ----------
     cluster_id: str
-        Id for the target Redshift cluster.
+        Id for the target Redshift cluster - only required if `workgroup_name` not provided.
     database: str
         Target database name.
+    workgroup_name: str
+        Name for the target serverless Redshift workgroup - only required if `cluster_id` not provided.
     secret_arn: str
         The ARN for the secret to be used for authentication - only required if `db_user` not provided.
     db_user: str

@@ -196,7 +227,13 @@ def connect(
         A RedshiftDataApi connection instance that can be used with `wr.redshift.data_api.read_sql_query`.
     """
     return RedshiftDataApi(
-        cluster_id, database, secret_arn=secret_arn, db_user=db_user, boto3_session=boto3_session, **kwargs
+        cluster_id=cluster_id,
+        database=database,
+        workgroup_name=workgroup_name,
+        secret_arn=secret_arn,
+        db_user=db_user,
+        boto3_session=boto3_session,
+        **kwargs,
     )
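Taken together, the new validation and kwarg assembly in `_execute_statement` reduce to a small pure function. The sketch below mirrors that logic under an assumed helper name (`build_execute_kwargs` is illustrative, not part of awswrangler):

```python
from typing import Any, Dict


def build_execute_kwargs(
    database: str,
    cluster_id: str = "",
    workgroup_name: str = "",
    secret_arn: str = "",
    db_user: str = "",
) -> Dict[str, Any]:
    """Mirror the target/credential selection added to _execute_statement."""
    if database == "":
        raise ValueError("`database` must be set for connection")
    if cluster_id == "" and workgroup_name == "":
        raise ValueError("Either `cluster_id` or `workgroup_name` (Redshift Serverless) must be set")
    # Serverless workgroups can authenticate via the caller's IAM identity,
    # so secret_arn/db_user are only mandatory for provisioned clusters.
    if workgroup_name == "" and secret_arn == "" and db_user == "":
        raise ValueError("Either `secret_arn` or `db_user` must be set for authentication")

    credentials: Dict[str, Any] = {}
    if secret_arn:
        credentials = {"SecretArn": secret_arn}
    elif db_user:
        credentials = {"DbUser": db_user}

    # cluster_id wins when both are given, matching the if/elif order in the diff.
    if cluster_id:
        target: Dict[str, Any] = {"ClusterIdentifier": cluster_id}
    else:
        target = {"WorkgroupName": workgroup_name}

    return {**target, "Database": database, **credentials}


print(build_execute_kwargs("dev", workgroup_name="my-wg"))
# {'WorkgroupName': 'my-wg', 'Database': 'dev'}
```

The returned dict is exactly what would be splatted into `execute_statement(**kwargs, Sql=sql)`; note that a serverless call can legitimately carry no credential keys at all.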

awswrangler/opensearch/_utils.py

Lines changed: 6 additions & 2 deletions
@@ -33,6 +33,10 @@ def _strip_endpoint(endpoint: str) -> str:
     return uri_schema.sub("", endpoint).strip().strip("/")
 
 
+def _is_https(port: int) -> bool:
+    return port == 443
+
+
 def connect(
     host: str,
     port: Optional[int] = 443,

@@ -95,8 +99,8 @@ def connect(
         host=_strip_endpoint(host),
         port=port,
         http_auth=http_auth,
-        use_ssl=True,
-        verify_certs=True,
+        use_ssl=_is_https(port),
+        verify_certs=_is_https(port),
         connection_class=RequestsHttpConnection,
         timeout=30,
         max_retries=10,
