Commit ce3514b: Update README.md (1 parent 65c4388)
1 file changed: README.md (+183, -2 lines)

This commit replaces the original two-line README ("# dscim" / "Data-Driven Spatial Climate Impact Model core component code") with the following:

[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

# DSCIM: The Data-driven Spatial Climate Impact Model

This Python library enables the calculation of a sector-integrated social cost of carbon
(SCC) using a variety of valuation methods and assumptions. The main purpose of this
library is to parse the monetized spatial damages from different sectors and integrate them
using different menu options that encompass different decisions, such as
discount levels, discount strategies, and different considerations related to
economic and climate uncertainty.

## Structure and logic

The library is split into several components that implement the hierarchy
defined by the menu options. These are the main elements of the library and
serve as the main classes to call different menu options.

```mermaid
graph TD

SubGraph1Flow(Storage and I/O)
subgraph "Storage utilities"
SubGraph1Flow --> A[StackedDamages]
SubGraph1Flow -- Climate Data --> Climate
SubGraph1Flow -- Economic Data --> EconVars
end

subgraph "Recipe Book"
A[StackedDamages] --> B[MainMenu]
B[MainMenu] --> C[AddingUpRecipe];
B[MainMenu] --> D[RiskAversionRecipe];
B[MainMenu] --> E[EquityRecipe]
end
```

`StackedDamages` takes care of parsing all monetized damage data from several
sectors and reads the data using a `dask.distributed.Client`. At the same time,
this class takes care of ingesting the FaIR GMST and GMSL data needed to draw damage
functions and calculate FaIR marginal damages of an additional emission of
carbon. The data can be read using the following components:

| Class            | Function |
|------------------|----------|
| `Climate`        | Wrapper class to read all things climate, including GMST and GMSL. You can pass a `fair_path` pointing to a NetCDF file with FaIR control and pulse simulations and median FaIR runs, and a `gmst_path` pointing to a CSV file with model and year anomaly data for fitting the damage functions. |
| `EconVars`       | Class to ingest sector-path-related data, including GDP and population data. Some intermediate variables are also included in this class; check the documentation for more details. |
| `StackedDamages` | Damages wrapper class. This class contains all the elements above and additionally reads all the computed monetized damages. A single path is needed to read all damages, and sectors must be separated by folders. If necessary, the class will save data in `.zarr` format to make chunking operations more efficient. Check the documentation of the class for more details. |

These elements can then be used in the menu options:

- `AddingUpRecipe`: Adds up all damages and collapses them to calculate a general SCC without valuing uncertainty.
- `RiskAversionRecipe`: Adds a risk-aversion certainty equivalent to the consumption calculations, valuing uncertainty over econometric and climate draws.
- `EquityRecipe`: Adds risk aversion and equity to the consumption calculations. Equity includes taking a certainty equivalent over spatial impact regions.

## Requirements

The library runs on Python 3.8+ and expects all requirements to be
installed before running any code (see Installation). The integration
process stacks different damage outcomes from several sectors
at the impact region level, so you will need several tricks to deal with
the data I/O.

## Computing

### Computing introduction

One of the tricks we rely on is the extensive use of `Dask` and `xarray` to
read raw damage data in `nc4` or `zarr` format (the latter is how coastal damages are provided).
Hence, you will need a `Dask` `distributed.Client` to harness the power of distributed computing.
The computing requirements will vary depending on which menu options you execute
and the number of sectors you are aggregating. These are some general rules about
computational intensity:

1. For recipes, `EquityRecipe > RiskAversionRecipe > AddingUpRecipe`.
2. For discounting, `euler_gwr > euler_ramsey > naive_gwr > naive_ramsey > constant > constant_model_collapsed`.
3. More options (i.e., a greater number of SSPs or sectors) means more computing resources are required.
4. `Dask` does not perfectly release memory after each menu run. Thus, if you are running
several menu options, in loops or otherwise, you may need to execute a `client.restart()` partway through
to force `Dask` to empty its memory (see the sketch after this list).
5. Including the coastal sector increases memory usage dramatically (due to the 500 batches and 10 GMSL bins against which
other sectors' damages must be broadcast). Be careful and smart when running this option,
and don't be afraid to reconsider chunking for the files being read in.
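
A minimal sketch of the restart pattern from rule 4 (the `run_menu_option` helper and the run names below are hypothetical placeholders, not part of the dscim API):

```python
from dask.distributed import Client

def run_menu_option(name):
    """Hypothetical placeholder: substitute the actual menu call you are running."""
    ...

client = Client()  # local client; see "Setting up a Dask client" below

for name in ["adding_up", "risk_aversion", "equity"]:
    run_menu_option(name)  # one menu run
    client.restart()       # drop leftover tasks and worker memory before the next run
```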

### Setting up a Dask client

Ensure that the following packages are installed and updated:
[Dask](https://docs.dask.org/en/latest/install.html), [distributed](https://distributed.dask.org/en/latest/install.html), the [Jupyter Dask extension](https://github.com/dask/dask-labextension), and `dask_jobqueue`.

Ensure that Jupyter Lab extensions are enabled so that you can access Dask as an extension.

You have two options for setting up a Dask client.

#### Local client
<details><summary>Click to expand</summary>

If your local node has sufficient memory and computational power, you will only need to create a local Dask client.

_If you are operating on Midway3, you should be able to run the menu in its entirety.
Each `caslake` computing node on Midway3 has 193 GB of memory and 48 CPUs. This is sufficient for all options._

- Open the Dask tab on the left side of your Jupyter Lab page.
- Click `New +` and wait for a cluster to appear.
- Drag and drop the cluster into your notebook and execute the cell.
- You now have a new Dask client!
- Click on the `CPU`, `Worker Memory`, and `Progress` tabs to track progress. You can arrange them in a sidebar of your
Jupyter notebook to keep them all visible at the same time.
- Note that opening two or three local clients does _not_ give you two or three times the compute space. These clients will share
the same node, so computing may in fact be slower as they fight for resources. (_check this, it's a hypothesis_)

![](images/dask_example.png)
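
If you prefer to create the local client in code rather than through the Jupyter Lab extension, a `dask.distributed.Client` can also be started directly; the worker counts and memory limit below are illustrative and should be sized to your node:

```python
from dask.distributed import Client

# Starts a local cluster under the hood; tune these numbers to your node.
client = Client(
    n_workers=8,           # illustrative: number of worker processes
    threads_per_worker=4,  # illustrative: threads per worker
    memory_limit="20GB",   # illustrative: memory cap per worker
)
client  # in a notebook, this displays the dashboard link
```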

</details>

#### Distributed client
<details><summary>Click to expand</summary>

If your local node does not have sufficient computational power, you will need to manually request separate
nodes with `dask_jobqueue` and `dask.distributed`:

```python
from dask_jobqueue import SLURMCluster
from dask.distributed import Client

cluster = SLURMCluster()     # reads its defaults from ~/.config/dask/jobqueue.yaml
print(cluster.job_script())  # inspect the SLURM job script that will be submitted
cluster.scale(10)            # request 10 workers
client = Client(cluster)
client                       # in a notebook, this displays the dashboard link
```

You can adjust the number of workers by changing the integer inside `cluster.scale()`. You can adjust the CPUs
and memory per worker inside `~/.config/dask/jobqueue.yaml`.
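
Equivalently, the per-worker resources can be passed directly to `SLURMCluster` instead of relying on the YAML file; the queue name and resource values below are illustrative, not recommendations:

```python
from dask_jobqueue import SLURMCluster

# Illustrative resource requests; match them to your cluster and workload.
cluster = SLURMCluster(
    queue="caslake",      # illustrative SLURM partition
    cores=8,              # CPU threads per job
    processes=2,          # Dask worker processes per job
    memory="24GB",        # memory per job
    walltime="02:00:00",  # illustrative wall-clock limit
)
cluster.scale(10)  # submit 10 such jobs
```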

To track the progress of this client, copy the "Dashboard" IP address and open an SSH tunnel to it. Example:

```
ssh -N -f -L 8787:10.50.250.7:8510 [email protected]
```

Then go to `localhost:8787` in your browser to watch the magic.
</details>

### Dask troubleshooting

Most Dask issues in the menu come from one of two sources:
1. Requesting Dask to compute too _many_ tasks (your chunks are too small), which results in a sort of "hung state"
and an empty progress bar.
2. Requesting Dask to compute too _large_ tasks (your chunks are too big). In this case, you will see the memory in the
`Worker Memory` tab shoot off the charts, and then your kernel will likely be killed by SLURM.

How can you avoid these situations?
1. Start with `client.restart()`. Sometimes Dask does not properly release tasks from memory and this plugs up
the client. Doing a fresh restart (and perhaps a fresh restart of your notebook) will fix the problem.
2. Next, check your chunks! Ensure that any `xr.open_dataset()` or `xr.open_mfdataset()` calls have a `chunks`
argument passed (see the sketch after this list). If not, Dask's default is to load the entire file into memory before rechunking later. This
is very bad news for impact-region-level damages, which are 10 TB of data.
3. Start executing the menu object by object. Call an object, select a small slice of it, and add `.compute()`. If the object
computes successfully without overloading memory, it's not the source of the leak. Keep moving through the menu until you find the
source of the error. _Hot tip: it's usually the initial reading-in of files where nasty things happen._ Check each object in the menu to
ensure three things:
    - Chunks should be a reasonable size ("reasonable" is relative, but approximately 250-750 MB typically works
on a Midway3 `caslake` computing node).
    - Not too many chunks! Again, this is relative, but more than 10,000 likely means you should reconsider your chunk size.
    - Not too many tasks per chunk. Again, relative, but more than 300,000 tasks early in the menu is unusual and should be
checked to make sure no unnecessary rechunking operations are being forced upon the menu.
4. Consider rechunking your inputs. If your inputs are chunked in a manner that's orthogonal to your first few operations,
Dask will have a nasty time trying to rechunk all those files before executing things on them. Rechunking and resaving
usually takes a few minutes; rechunking in the middle of an operation can take hours.
5. If all of this has been done and you are still getting large memory errors, it's possible that Dask isn't correctly separating
and applying operations to chunks. If this is the case, consider using `map_blocks`, which explicitly
tells Dask to apply the operation chunk by chunk (see the sketch after this list).
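
As a rough illustration of points 2, 3, and 5 above, here is a minimal sketch; the path, dimension names, and the `damages` variable are illustrative placeholders rather than the actual menu inputs:

```python
import xarray as xr

# Point 2: open lazily with explicit chunks (path and chunk sizes are illustrative).
ds = xr.open_mfdataset(
    "/path/to/sector/damages/*.nc4",
    chunks={"year": 10, "region": 5000},
    parallel=True,
)

# Point 3: test a small slice first and watch the Worker Memory tab while it runs.
ds.isel(year=slice(0, 2)).compute()

# Point 5: apply a function chunk by chunk instead of over the whole array at once.
def to_billions(da):
    return da / 1e9

scaled = ds["damages"].map_blocks(to_billions)  # "damages" is an illustrative variable name
scaled.isel(year=slice(0, 2)).compute()
```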

For more information on running `Dask` and `dask_jobqueue` (in case you are on a computing
cluster), refer to the [Dask Distributed][3] and [job-queue][4] documentation.
You can find several use-case examples in the notebooks under `examples`.

### Priority

Maintaining priority is important when given tight deadlines to run menu options. To learn more about
priority, see the [RCC tips and tricks](https://rcc.uchicago.edu/docs/tutorials/rcc-tips-and-tricks.html#priority).

In general, following these hygiene rules will keep priority high:
1. Kill all notebooks/clusters when not in use.
2. Only request what you need (in terms of `WALLTIME`, `WORKERS`, and `WORKER MEMORY`).
3. Run things right the first time around. Your notebook text is worth an extra double check :)

[3]: https://distributed.dask.org/en/latest/
[4]: https://jobqueue.dask.org/en/latest/
[5]: https://sylabs.io/guides/3.5/user-guide/quick_start.html
[6]: https://sylabs.io/
[7]: https://pangeo.io/setup_guides/hpc.html
[8]: https://climateimpactlab.gitlab.io/Impacts/integration/
