-
Notifications
You must be signed in to change notification settings - Fork 138
Fix parquet io #1749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix parquet io #1749
Conversation
| pass | ||
| else: | ||
| # Fallback: try normal xarray open | ||
| self.result = xr.open_dataset(outfile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lines 2230-2239 seem a bit overly complicated. Could this not be simply:
if outfile is not None:
ext = os.path.splitext(outfile)[1].lower()
if ext == ".parquet":
pass # let the parquet exporter set self.result in its close()
else:
# reimport from file to get the complete simulation (lazily) in memory
self.result = xr.open_dataset(outfile)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I am note completely sure about replacing export_buffer_length is None with outfile is None,
as one could have an outfile but still set export_buffer_length to None to get just a single write at the end.
In this case it would be unnecessary to reimport from that file, as the complete simulation is already in memory.
On the other hand, it might perhaps be ok for consistency.
| # need to reimport from file to get all data in memory | ||
| del self.environment | ||
| if hasattr(self, 'environment_profiles'): | ||
| del self.environment_profiles |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deletion of self.environment and self.environment_profiles has here been removed. Tests are passing, so might be ok, but I seem to recall that this deletion was necessary for some reason. Was this removed deliberately, or was it just ChatGPT not seing the point?
|
Hi, However, I have never used the parquet writer, so perhaps @poplarShift or @mateuszmatu would be interested to have a look before merging this? |
|
I have not looked more into this, as I also prefer netCDF. Maybe @poplarShift can update this? |
|
@knutfrode Thanks for your comments. I can integrate them later today or Monday in case we hear from @poplarShift first. |
|
Hi all, sorry for the silence on this. Apart from the maintenance thing there's also the mysterious issue #1321 which I still haven't figured out. |
|
But then I guess that if Kristen can make it work again, then we could keep it, and it will still serve as inspiration for others for future development (or generalisation). |
Looks like the parquet exporter wasn't updated when output changes happened with OpenDrift. This is my attempt to fix it. As a disclaimer, I wrote these changes using ChatGPT because I was not familiar enough with parquet nor the details of the original code to get it right. I tested it with OpenDrift's tests and my own subsidiary package tests and it looks like it is working but I encourage an extra eye on this if possible.