Can't read google cloud storage data with pyspark in VSCode jupyter #11050
Unanswered
vikasd22 asked this question in Questions and Answers
Replies: 1 comment 2 replies
-
My guess is that some bash script needs to run first to set up the environment. Try closing all instances of VS Code and then launching it from a bash shell. That should make VS Code, and the Jupyter kernels it starts, inherit the environment of that shell.
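One way to check this guess is to print the same set of environment variables in a VS Code notebook cell and in the setup that works (JupyterLab or the plain Python script), then compare. The sketch below is only a diagnostic aid; the list of variable names is an assumption about what might matter for PySpark and Google Cloud Storage, and `env_snapshot` and `env_diff` are hypothetical helper names, not part of any library.

```python
import os

# Variables that commonly influence how PySpark locates Spark, Hadoop config,
# Java, and Google credentials. This list is an assumption; extend it with
# whatever your working shell actually exports.
KEYS = [
    "SPARK_HOME",
    "SPARK_CONF_DIR",
    "HADOOP_CONF_DIR",
    "PYSPARK_SUBMIT_ARGS",
    "PYSPARK_PYTHON",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "JAVA_HOME",
    "CLASSPATH",
]


def env_snapshot(keys=KEYS, environ=None):
    """Return {name: value or None} for the selected variables."""
    environ = os.environ if environ is None else environ
    return {key: environ.get(key) for key in keys}


def env_diff(a, b):
    """Return {name: (a_value, b_value)} for variables whose values differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


if __name__ == "__main__":
    # Run this in both environments and compare the printed output.
    for name, value in env_snapshot().items():
        print(f"{name}={value!r}")
```

If a variable is set in the working environment but missing in the VS Code kernel, that supports the idea that VS Code was started without the shell's environment.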
-
Hello,
I have run into a peculiar issue in VS Code Jupyter: I cannot read Google Cloud Storage Parquet files with PySpark. The same code works in JupyterLab in the browser with no problem. For example, if I do this in VS Code Jupyter:
I get the following error:
Note that the above code works fine as a Python script.
I have already experimented with the Spark config and with several versions of the VS Code Jupyter extension. Every version seems to have this issue. I am not sure whether it is a bug. It seems to be specific to PySpark: pandas has no problem reading the same files.
I have already tried this stack overflow solution as well:
https://stackoverflow.com/questions/55595263/how-to-fix-no-filesystem-for-scheme-gs-in-pyspark
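For readers who have not seen it, the usual remedy for a "No FileSystem for scheme: gs" error is to register the GCS connector with Spark when the session is created. The snippet below is a rough sketch of that idea, not a copy of the linked answer and not the exact code I ran; the jar path is a placeholder you must replace, and `gcs_connector_conf` and `build_session` are hypothetical helper names.

```python
def gcs_connector_conf(connector_jar):
    """Spark settings that register the GCS connector for gs:// paths.

    connector_jar is the local path to a gcs-connector jar (placeholder;
    the correct jar depends on your Hadoop version).
    """
    return {
        "spark.jars": connector_jar,
        "spark.hadoop.fs.gs.impl":
            "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
        "spark.hadoop.fs.AbstractFileSystem.gs.impl":
            "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS",
    }


def build_session(connector_jar, app_name="gcs-read-check"):
    """Create a SparkSession with the connector settings applied."""
    # Imported lazily so the config helper can be inspected without pyspark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in gcs_connector_conf(connector_jar).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Note that `getOrCreate()` returns an already running session if one exists, and `spark.jars` only takes effect when the JVM starts, so the kernel has to be restarted before trying a changed configuration.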
I would love it if anyone could shed some light on this.