Description
I have found that the COSIPY model has a tendency to need a very high amount of memory, particularly when simulating over large spatial and temporal extents. I believe this is because the current code structure doesn't allow the threads/workers to release memory until 'future.result()' is called and the results are written at the very end of the simulation.
For context, I am using a computing cluster with 28 threads/workers, each with an allocation of 4 GB of RAM, to simulate a spatial domain of 2,043 over 745,104 timesteps (hourly between 1939 and 2024). The model input 'DATA' xarray dataset is 48 GB, and the run quickly crashes with an out-of-memory error upon reaching line 246 of COSIPY.py:
futures.append(client.submit(cosipy_core, DATA.isel(lat=y, lon=x), y, x, ....
Even using fewer workers, each with an allocation of 12 GB of RAM, still caused this issue. I am not entirely sure, but I think this is because the model is trying to distribute a huge portion of the DATA dataset to each thread/worker, which must be held in RAM at this line.
I found that by restructuring the code to simulate and write the output results in groups of 28 (the number of threads/workers that can run simultaneously for me), I could reduce the RAM required to less than 500 MB. The drawback is a slight increase in writing time, but this wasn't particularly significant for me.
Attached is a screenshot of the rough solution I added to my heavily modified version of the COSIPY code.
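For reference, here is a minimal sketch of the kind of batching I mean. It is not the exact code in the screenshot: the `chunk_size` value, the `write_results` helper, and the simplified `cosipy_core` call signature are placeholders, and `DATA` is the usual COSIPY input dataset.

```python
# Sketch: submit and collect futures in batches so workers can free memory
# between batches, instead of holding every result until the end of the run.
from dask.distributed import Client

client = Client()            # or connect to an existing cluster
chunk_size = 28              # e.g. match the number of available workers

# All (lat, lon) grid points to simulate
points = [(y, x)
          for y in range(DATA.sizes["lat"])
          for x in range(DATA.sizes["lon"])]

for start in range(0, len(points), chunk_size):
    batch = points[start:start + chunk_size]

    # Submit only one batch of point simulations at a time
    futures = [client.submit(cosipy_core, DATA.isel(lat=y, lon=x), y, x)
               for (y, x) in batch]

    # Block on this batch, write its results, then drop the references
    # so Dask can release the memory before the next batch is submitted.
    for future, (y, x) in zip(futures, batch):
        result = future.result()
        write_results(result, y, x)   # hypothetical writer for the output file

    del futures
```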
I think if something like this could be incorporated into COSIPY, it would greatly benefit the community, especially those with limited computational resources. Perhaps a user-adjustable parameter could let the user choose how frequently results are written to the output file in order to conserve memory.
