Closed
Description
Describe the bug
`SELECT * FROM db.table LIMIT 3` works in both the Athena query editor and awswrangler.
`SELECT COUNT(*) FROM db.table LIMIT 3` works in the Athena query editor, but awswrangler throws an error (same with `COUNT(1)` or `COUNT(s)`):
File ~/py/awsdata/.venv/lib/python3.12/site-packages/awswrangler/athena/_utils.py:861, in create_ctas_table(sql, database, ctas_table, ctas_database, s3_output, storage_format, write_compression, partitioning_info, bucketing_info, field_delimiter, schema_only, workgroup, data_source, encryption, kms_key, categories, wait, athena_query_wait_polling_delay, execution_params, params, paramstyle, boto3_session)
857 raise exceptions.InvalidCtasApproachQuery(
858 f"Please, define distinct names for your columns. Root error message: {msg}"
859 )
860 if "Column name not specified" in msg:
--> 861 raise exceptions.InvalidArgumentValue(
862 "Please, define all columns names in your query. (E.g. 'SELECT MAX(col1) AS max_col1, ...')"
863 )
864 if "Column type is unknown" in msg:
865 raise exceptions.InvalidArgumentValue(
866 "Please, don't leave undefined columns types in your query. You can cast to ensure it. "
867 "(E.g. 'SELECT CAST(NULL AS INTEGER) AS MY_COL, ...')"
868 )
InvalidArgumentValue: Please, define all columns names in your query. (E.g. 'SELECT MAX(col1) AS max_col1, ...')
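The error text points at a missing column name rather than at the table schema, so giving the aggregate an explicit alias is the first thing to try. The sketch below assumes the failing call was `wr.athena.read_sql_query` with its default `ctas_approach=True` (the traceback goes through `create_ctas_table`, but the original call is not shown in this report). The alias `row_count` and both helper function names are made up for illustration, and `db.table` is the placeholder used above.

```python
# Sketch only: the original call is not shown in this report, so the use of
# read_sql_query and the database name "db" are assumptions.
ALIASED_SQL = "SELECT COUNT(*) AS row_count FROM db.table"
UNALIASED_SQL = "SELECT COUNT(*) FROM db.table"


def count_rows_with_alias(database: str = "db"):
    """Run the COUNT query with an explicit column alias (needs AWS access)."""
    import awswrangler as wr  # imported lazily so this file loads without it

    # With ctas_approach=True (the default) the query is wrapped in a CTAS
    # statement, so every output column needs a name.
    return wr.athena.read_sql_query(sql=ALIASED_SQL, database=database)


def count_rows_without_ctas(database: str = "db"):
    """Alternative: skip the CTAS wrapper and keep the query unchanged."""
    import awswrangler as wr

    # ctas_approach=False reads the regular query result instead of creating
    # a temporary table, so the CTAS column-name check should not apply.
    return wr.athena.read_sql_query(
        sql=UNALIASED_SQL,
        database=database,
        ctas_approach=False,
    )
```

Either variant may be enough on its own; the alias keeps the default CTAS path, while `ctas_approach=False` leaves the SQL untouched.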
This may have something to do with the vector (array) field in my table:
CREATE EXTERNAL TABLE `test1`(
`s` string,
`vec` array<smallint>,
`mbin` binary
)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3://...'
TBLPROPERTIES (
'compressionType'='gzip',
'classification'='parquet',
'projection.enabled'='false',
'typeOfData'='file')
How to Reproduce
not sure
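A possible reproduction, written as a sketch: it runs the two queries from the description through `wr.athena.read_sql_query` and records which one raises. The function name `try_queries`, the database name `db`, and the assumption that `read_sql_query` was the entry point are not confirmed by this report.

```python
# Hypothetical reproduction script; the table and database names are the
# placeholders from the description, not real resources.
QUERIES = [
    "SELECT * FROM db.table LIMIT 3",         # reported as working
    "SELECT COUNT(*) FROM db.table LIMIT 3",  # reported as failing
]


def try_queries(database: str = "db") -> dict:
    """Run each query and map it to 'ok' or the error text (needs AWS access)."""
    import awswrangler as wr  # imported lazily so this file loads without it

    results = {}
    for sql in QUERIES:
        try:
            wr.athena.read_sql_query(sql=sql, database=database)
            results[sql] = "ok"
        except wr.exceptions.InvalidArgumentValue as exc:
            # This is the exception type shown in the traceback above.
            results[sql] = f"failed: {exc}"
    return results
```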
Expected behavior
No response
Your project
No response
Screenshots
No response
OS
Linux
Python version
3.12
AWS SDK for pandas version
3.11.0
Additional context
No response