Skip to content

SNOW-1331404: fetch_pandas_all converts arrays into stringsΒ #1922

@lostkamp

Description

@lostkamp

Python version

Python 3.11.7 (main, Mar 12 2024, 09:54:34) [GCC 13.2.0]

Operating system and processor architecture

Linux-6.5.0-27-generic-x86_64-with-glibc2.38

Installed packages

annotated-types==0.6.0
anyio==4.3.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asn1crypto==1.5.1
asttokens==2.4.1
async-lru==2.0.4
attrs==23.2.0
Babel==2.14.0
beautifulsoup4==4.12.3
bleach==6.1.0
botocore==1.34.81
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.4
comm==0.2.1
contourpy==1.2.0
cryptography==42.0.5
cycler==0.12.1
debugpy==1.8.1
decorator==5.1.1
defusedxml==0.7.1
docutils==0.16
entrypoints==0.4
executing==2.0.1
fastapi==0.110.1
fastjsonschema==2.19.1
filelock==3.13.1
flake8==4.0.1
flake8-quotes==3.4.0
flakeheaven==3.3.0
fonttools==4.49.0
fqdn==1.5.1
h11==0.14.0
httpcore==1.0.4
httpx==0.27.0
idna==3.6
iniconfig==2.0.0
ipykernel==6.29.3
ipython==8.22.2
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.3
jmespath==1.0.1
joblib==1.3.2
json5==0.9.22
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter-events==0.9.0
jupyter-lsp==2.2.4
jupyter_client==8.6.0
jupyter_core==5.7.1
jupyter_server==2.13.0
jupyter_server_terminals==0.5.2
jupyterlab==4.1.2
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.4
kiwisolver==1.4.5
MarkupSafe==2.1.5
matplotlib==3.8.3
matplotlib-inline==0.1.6
mccabe==0.6.1
mistune==3.0.2
msal==1.28.0
nbclient==0.9.0
nbconvert==7.16.2
nbformat==5.9.2
nest-asyncio==1.6.0
notebook_shim==0.2.4
numpy==1.26.4
overrides==7.7.0
packaging==24.0
pandas==2.2.0
pandocfilters==1.5.1
parso==0.8.3
pexpect==4.9.0
pillow==10.2.0
platformdirs==3.11.0
pluggy==1.4.0
prometheus_client==0.20.0
prompt-toolkit==3.0.43
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==15.0.2
pyasn1==0.6.0
pycodestyle==2.8.0
pycparser==2.21
pydantic==2.6.3
pydantic_core==2.16.3
pyflakes==2.4.0
Pygments==2.17.2
PyJWT==2.8.0
pyOpenSSL==24.1.0
pyparsing==3.1.2
pytest==8.0.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==2.0.7
pytz==2024.1
PyYAML==6.0.1
pyzmq==25.1.2
referencing==0.33.0
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.18.0
rsa==4.7.2
s3transfer==0.10.1
scikit-learn==1.4.1.post1
scipy==1.12.0
seaborn==0.13.2
Send2Trash==1.8.2
six==1.16.0
sniffio==1.3.1
snowflake-connector-python==3.8.1
sortedcontainers==2.4.0
soupsieve==2.5
stack-data==0.6.3
starlette==0.37.2
terminado==0.18.0
threadpoolctl==3.3.0
tinycss2==1.2.1
toml==0.10.2
tomlkit==0.12.4
tornado==6.4
traitlets==5.14.1
types-python-dateutil==2.8.19.20240311
typing_extensions==4.10.0
tzdata==2024.1
uri-template==1.3.0
urllib3==2.2.1
uvicorn==0.28.0
wcwidth==0.2.13
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0

What did you do?

Fetching a column of type array turns the arrays into strings in the dataframe, which makes them difficult to parse.


>>> query = 'select array_construct(10, 20, 30) as col'
>>> df = cursor.execute(query).fetch_pandas_all()
>>> df
                        COL
0  [\n  10,\n  20,\n  30\n]
>>> type(df['COL'].iloc[0])
str

What did you expect to see?

I would expect arrays to be represented as a list or tuple of values in the dataframe cell.

Can you set logging to DEBUG and collect the logs?

import logging
import os

for logger_name in ('snowflake.connector',):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)

Metadata

Metadata

Assignees

Labels

enhancementThe issue is a request for improvement or a new featurestatus-triage_doneInitial triage done, will be further handled by the driver team

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions