RAM Requirement Reduction / Periodic Result File Writing #71

@MarcusGastaldello

I have found that the COSIPY model tends to need a very large amount of memory, particularly when simulating over a large spatial domain and a long time period. I believe this is because the current code structure doesn't allow the threads/workers to release memory until the 'future.result()' method is called, and the results are only written at the very end of the simulation.

For context, I am using a computing cluster with 28 threads/workers, each allocated 4 GB of RAM, to simulate a spatial domain of 2,043 over 745,104 timesteps (hourly between 1939 and 2024). The model input 'DATA' xarray dataset is 48 GB, and the run quickly crashes due to insufficient memory upon reaching line 246 of COSIPY.py:

futures.append(client.submit(cosipy_core, DATA.isel(lat=y, lon=x), y, x, ....

Even using fewer workers, each with an allocation of 12 GB of RAM, still causes this issue. I am not entirely sure, but I think this is because the model tries to distribute a huge portion of the DATA dataset to each thread/worker in this line, and that data must be held in RAM.

I found that by restructuring the code to simulate and write the output results in groups of 28 (the number of available threads/workers that can simulate simultaneously for me), I could reduce the RAM required to less than 500 MB. The drawback is a slight increase in writing time, but this wasn't particularly significant for me.
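
The restructuring described above can be sketched roughly as follows. This is only an illustration of the batching idea, not the code from my screenshot: it uses the standard library's `ThreadPoolExecutor` as a stand-in for the Dask client (both offer `submit` and `future.result()`), and `simulate_point`, `run_in_batches` and `write_results` are hypothetical names standing in for `cosipy_core` and the output-writing step.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_point(y, x):
    """Hypothetical stand-in for cosipy_core: returns one result per grid point."""
    return {"y": y, "x": x, "value": y * 10 + x}


def run_in_batches(executor, points, batch_size, write_results):
    """Submit at most `batch_size` grid points at a time and write each
    batch of results before the next batch is submitted."""
    for start in range(0, len(points), batch_size):
        batch = points[start:start + batch_size]
        futures = [executor.submit(simulate_point, y, x) for y, x in batch]
        results = [future.result() for future in futures]
        write_results(results)
        # Drop the references so this batch can be garbage collected
        # before the next one is created.
        del futures, results


written = []  # stands in for the output file
points = [(y, x) for y in range(3) for x in range(4)]  # 12 grid points

with ThreadPoolExecutor(max_workers=4) as executor:
    run_in_batches(executor, points, batch_size=4, write_results=written.extend)
```

With this layout only one batch of inputs and results is alive at any time, which is where the memory saving would come from. The cost is that the workers sit idle while each batch is being written.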

Attached is a screenshot of the rough solution I implemented in my heavily modified version of the COSIPY code.

image

I think that if something like this could be incorporated into COSIPY, it would greatly benefit the community, especially those with access to limited computational resources. Perhaps this could be a user-adjustable parameter that lets the user choose how frequently results are written to the output file in order to conserve memory.
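
One possible shape for such a parameter, as a sketch only: `WRITE_BATCH_SIZE` and `split_into_batches` are hypothetical names (they do not exist in COSIPY), and `None` is assumed to mean "keep the current behaviour and write everything at the end". The example also assumes that the 2,043 mentioned above refers to grid points.

```python
WRITE_BATCH_SIZE = 28  # hypothetical setting; None = write once at the end


def split_into_batches(points, batch_size=None):
    """Yield lists of grid points, each holding at most `batch_size` entries."""
    if batch_size is None:
        # A single batch containing every point (current behaviour).
        batch_size = max(len(points), 1)
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer or None")
    for start in range(0, len(points), batch_size):
        yield points[start:start + batch_size]


points = list(range(2043))  # placeholder indices for the grid points
batches = list(split_into_batches(points, WRITE_BATCH_SIZE))
```

Setting the value to the number of available workers would reproduce the grouping I used; larger values would trade memory for fewer write operations.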
