-
Notifications
You must be signed in to change notification settings - Fork 722
Description
Describe the bug
When querying a table in an Oracle database, the module runs into a TypeError when trying to determine whether the result is a decimal data type or not. It appears that additional None handling is required when iterating through an oracledb.Cursor's description field.
Error:
File c:\Users\Jimmy\miniconda3\envs\work\lib\site-packages\awswrangler\oracle.py:306, in read_sql_query(sql, con, index_col, params, chunksize, dtype, safe, timestamp_as_object, dtype_backend)
[259](file:///C:/Users/Jimmy/miniconda3/envs/work/lib/site-packages/awswrangler/oracle.py:259) """Return a DataFrame corresponding to the result set of the query string.
[260](file:///C:/Users/Jimmy/miniconda3/envs/work/lib/site-packages/awswrangler/oracle.py:260)
[261](file:///C:/Users/Jimmy/miniconda3/envs/work/lib/site-packages/awswrangler/oracle.py:261) Parameters
...
--> [610](file:///C:/Users/Jimmy/miniconda3/envs/work/lib/site-packages/awswrangler/oracle.py:610) if row[1] == oracledb.DB_TYPE_NUMBER and row[5] > 0:
[611](file:///C:/Users/Jimmy/miniconda3/envs/work/lib/site-packages/awswrangler/oracle.py:611) dtype[row[0]] = pa.decimal128(row[4], row[5])
[613](file:///C:/Users/Jimmy/miniconda3/envs/work/lib/site-packages/awswrangler/oracle.py:613) _logger.debug("decimal dtypes: %s", dtype)
TypeError: '>' not supported between instances of 'NoneType' and 'int'Link to the method producing the error: https://github.com/aws/aws-sdk-pandas/blob/main/awswrangler/oracle.py#L603
Links to various Cursor.description documentation:
https://python-oracledb.readthedocs.io/en/latest/api_manual/cursor.html#Cursor.description
https://peps.python.org/pep-0249/#description
How to Reproduce
Create the following table in an Oracle database
CREATE TABLE TESTDB.TESTTABLE1 (
VARCOL VARCHAR2(200),
INTCOL INTEGER,
NUMCOL NUMERIC(38, 10)
)Ran the following Python code:
import awswrangler
import oracledb
ora_conn = oracledb.connect(
dsn="localhost:5432/FREEPDB1", user="SYSTEM", password="oracle"
)
df = awswrangler.oracle.read_sql_query(
"SELECT * FROM all_tab_columns WHERE table_name = 'TESTTABLE1'", ora_conn
)
display(df)Expected behavior
The awswrangler.oracle.read_sql_query(...) call should return a DataFrame instead of raising a TypeError.
The DataFrame should have similar results to the below screenshot.
Your project
No response
Screenshots
No response
OS
Windows 11
Python version
3.9.21
AWS SDK for pandas version
3.11.0
Additional context
Changing https://github.com/aws/aws-sdk-pandas/blob/main/awswrangler/oracle.py#L610 to check for None allows the above query to run. However, I'm not sure if that is a robust enough solution.
Changed Code:
# if row[1] == oracledb.DB_TYPE_NUMBER and row[5] > 0:
if row[1] == oracledb.DB_TYPE_NUMBER and row[5] is not None and row[5] > 0:
