Commit 14b9e78

Update gt4py: support for literal precision (#192)
* Update gt4py: support for literal precision
* Actually forwarding literal precision to gt4py
* Exposing type casts and new math functions
* Documentation update

---------

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
1 parent d414652 commit 14b9e78

7 files changed, +110 -97 lines changed

docs/index.md

Lines changed: 6 additions & 95 deletions
Original file line numberDiff line numberDiff line change
@@ -1,102 +1,13 @@
11
# NDSL Documentation
22

3-
NDSL allows atmospheric scientists to write focus on what matters in model development and hides away the complexities of coding for a super computer.
3+
NDSL is a middleware for climate and weather modelling developed jointly by NOAA and NASA. It allows atmospheric scientists to focus on what matters in model development and essentially decouples performance engineering from model development.
44

5-
## Quick Start
5+
## Portable performance
66

7-
Python `3.11.x` is required for NDSL and all its third party dependencies for installation.
7+
NDSL brings together [GT4Py](https://github.com/GridTools/gt4py/) and [DaCe](https://github.com/spcl/dace/), two libraries developed for high-performance and portability. On top of those pillars, NDSL deploys a series of optimized APIs for common operations, e.g. halo exchange or domain decomposition, and tools to port existing models.
88

9-
NDSL submodules `gt4py` and `dace` to point to vetted versions, use `git clone --recurse-submodule` to update the git submodules.
9+
## Batteries-included for FV-based models
1010

11-
NDSL is **NOT** available on `pypi`. Installation of the package has to be local, via `pip install ./NDSL` (`-e` supported). The packages have a few options:
11+
Historically, NDSL was developed to port the FV3 dynamical core on the cubed-sphere. Therefore, the middleware ships with ready-to-execute specialization for models based on cubed-sphere grids and FV-based models in particular.
1212

13-
- `ndsl[test]`: installs the test packages (based on `pytest`)
14-
- `ndsl[develop]`: installs tools for development and tests.
15-
16-
NDSL uses pytest for its unit tests, the tests are available via:
17-
18-
- `pytest -x test`: running CPU serial tests (GPU as well if `cupy` is installed)
19-
- `mpirun -np 6 pytest -x test/mpi`: running CPU parallel tests (GPU as well if `cupy` is installed)
20-
21-
## Requirements & supported compilers
22-
23-
For CPU backends:
24-
25-
- 3.11.x >= Python < 3.12.x
26-
- Compilers:
27-
- GNU 11.2+
28-
29-
For GPU backends (the above plus):
30-
31-
- CUDA 11.2+
32-
- Python package:
33-
- `cupy` (latest with proper driver support [see install notes](https://docs.cupy.dev/en/stable/install.html))
34-
- Libraries:
35-
- MPI compiled with cuda support
36-
37-
## NDSL installation and testing
38-
39-
NDSL is not available at `pypi`, it uses
40-
41-
```bash
42-
pip install NDSL
43-
```
44-
45-
to install NDSL locally.
46-
47-
NDSL has a few options:
48-
49-
- `ndsl[test]`: installs the test packages (based on `pytest`)
50-
- `ndsl[develop]`: installs tools for development and tests.
51-
52-
Tests are available via:
53-
54-
- `pytest -x test`: running CPU serial tests (GPU as well if `cupy` is installed)
55-
- `mpirun -np 6 pytest -x test/mpi`: running CPU parallel tests (GPU as well if `cupy` is installed)
56-
57-
## Configurations for Pace
58-
59-
Configurations for Pace to use NDSL with different backend:
60-
61-
- FV3_DACEMODE=Python[Build|BuildAndRun|Run] controls the full program optimizer behavior
62-
63-
- Python: default, use stencil only, no full program optimization
64-
65-
- Build: will build the program then exit. This _build no matter what_. (backend must be `dace:gpu` or `dace:cpu`)
66-
67-
- BuildAndRun: same as above but after build the program will keep executing (backend must be `dace:gpu` or `dace:cpu`)
68-
69-
- Run: load pre-compiled program and execute, fail if the .so is not present (_no hash check!_) (backend must be `dace:gpu` or `dace:cpu`)
70-
71-
- NDSL_LITERAL_PRECISION=64 controls the floating point precision throughout the program.
72-
73-
Install Pace with different NDSL backend:
74-
75-
- Shell scripts to install Pace using NDSL backend on specific machines such as Gaea can be found in `examples/build_scripts/`.
76-
- When cloning Pace you will need to update the repository's submodules as well:
77-
78-
```bash
79-
git clone --recursive https://github.com/ai2cm/pace.git
80-
```
81-
82-
or if you have already cloned the repository:
83-
84-
```bash
85-
git submodule update --init --recursive
86-
```
87-
88-
- Pace requires GCC > 9.2, MPI, and Python 3.8 on your system, and CUDA is required to run with a GPU backend.
89-
- We recommend creating a python `venv` or conda environment specifically for Pace.
90-
91-
```bash
92-
python3 -m venv venv_name
93-
source venv_name/bin/activate
94-
```
95-
96-
- Inside of your pace `venv` or conda environment pip install the Python requirements, GT4Py, and Pace:
97-
98-
```bash
99-
pip3 install -r requirements_dev.txt -c constraints.txt
100-
```
101-
102-
- There are also separate requirements files which can be installed for linting (`requirements_lint.txt`) and building documentation (`requirements_docs.txt`).
13+
Next: get [up and running](./quickstart.md).

docs/quickstart.md

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
# Quickstart
2+
3+
Alright, let's get you up and running!
4+
5+
NDSL requires Python version `3.11` and a GNU compiler. We strongly recommend using a conda or virtual environment.
6+
7+
```shell
8+
# We have submodules for GT4Py and DaCe. Don't forget to pull them
9+
git clone --recurse-submodules git@github.com:NOAA-GFDL/NDSL.git
10+
11+
cd NDSL/
12+
13+
# We strongly recommend using conda or a virtual environment
14+
python -m venv .venv/
15+
source .venv/bin/activate
16+
17+
# [optional] Install MPI if you don't have a system installation.
18+
pip install openmpi
19+
20+
# Finally, install NDSL
21+
pip install .[demos]
22+
```
23+
24+
Now you can run through the Jupyter notebooks in `examples/NDSL` :rocket:.
25+
26+
Read on in the [user manual](./user/index.md).
27+
28+
!!! note "Supported compilers"
29+
30+
NDSL currently only works with the GNU compiler. Using `clang` will result in errors related to undefined OpenMP flags.
31+
32+
For MacOS users, we know that `gcc` version 14 from homebrew works.
33+
34+
!!! question "Why clone the repository?"
35+
36+
We are cloning the repository because NDSL is not available on `pypi`.

docs/user/index.md

Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,51 @@
11
# Usage documentation
22

33
This part of the documentation is geared towards users of NDSL.
4+
5+
## Up and running
6+
7+
See our [quickstart guide](../quickstart.md) on how to get up and running.
8+
9+
## Configuration
10+
11+
NDSL tries to have sensible defaults. In case you want to tweak something, here are some pointers:
12+
13+
### Literal precision (float/int)
14+
15+
Unspecified integer and floating point literals (e.g. `42` and `3.1415`) default to 64-bit precision. This can be changed with the environment variable `PACE_FLOAT_PRECISION`.
16+
17+
For mixed precision code, you can specify the "hard coded" precision with type hints and casts, e.g.
18+
19+
```python
20+
with computation(PARALLEL), interval(...):
21+
# Either 32-bit or 64-bit depending on `PACE_FLOAT_PRECISION`
22+
my_int = 42
23+
my_float = 3.1415
24+
25+
# Always 32-bit
26+
my_int32: int32 = 42
27+
my_float32: float32 = 3.1415
28+
29+
# Explicit 64-bit cast within otherwise unspecified calculation
30+
factor = 0.5 * float64(3.1415 + 2.71828)
31+
```
32+
33+
### Full program optimizer
34+
35+
The behavior of the full program optimizer is controlled by `FV3_DACEMODE`. Valid values are:
36+
37+
`Python`
38+
39+
: The default. Disables full program optimization and only accelerates stencil code.
40+
41+
`Build`
42+
43+
: Build the program, then exit. This mode is only available for backends `dace:gpu` and `dace:cpu`.
44+
45+
`BuildAndRun`
46+
47+
: Build the program, then run it immediately. This mode is only available for backends `dace:gpu` and `dace:cpu`.
48+
49+
`Run`
50+
51+
: Load a pre-compiled program and run it. Fails if the pre-compiled program cannot be found. This mode is only available for backends `dace:gpu` and `dace:cpu`.
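
The mode and backend constraints documented above can be checked before launching a run. The following is a minimal sketch, not NDSL code: `check_dace_mode` is a hypothetical helper, and the only facts taken from the documentation are the four mode names, the `Python` default, and the restriction of the other modes to the `dace:cpu` and `dace:gpu` backends.

```python
import os

# Values documented for FV3_DACEMODE.
VALID_MODES = ("Python", "Build", "BuildAndRun", "Run")
# Backends documented as supporting full program optimization.
DACE_BACKENDS = ("dace:cpu", "dace:gpu")


def check_dace_mode(backend, env=None):
    """Hypothetical helper: validate FV3_DACEMODE against the chosen backend.

    Returns the effective mode, or raises ValueError for an unknown mode or
    an unsupported mode/backend combination.
    """
    env = os.environ if env is None else env
    mode = env.get("FV3_DACEMODE", "Python")  # "Python" is the documented default
    if mode not in VALID_MODES:
        raise ValueError(f"Unknown FV3_DACEMODE {mode!r}; expected one of {VALID_MODES}")
    if mode != "Python" and backend not in DACE_BACKENDS:
        raise ValueError(f"Mode {mode!r} requires one of {DACE_BACKENDS}, got {backend!r}")
    return mode
```

Passing a plain dictionary as `env` keeps the check testable without touching the real process environment.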

external/gt4py

Submodule gt4py updated 69 files

mkdocs.yml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ theme:
99

1010
nav:
1111
- Home: index.md
12+
- Quickstart: quickstart.md
1213
- User documentation: user/index.md
1314
- Porting:
1415
- General Concepts: porting/index.md
@@ -24,8 +25,12 @@ markdown_extensions:
2425
- abbr
2526
# support for colored notes / warnings / tips / examples
2627
- admonition
28+
# support for "definition lists" (<dl>)
29+
- def_list
2730
# support for footnotes
2831
- footnotes
32+
# support for emojis
33+
- pymdownx.emoji
2934
# support for syntax highlighting
3035
- pymdownx.highlight:
3136
anchor_linenums: true

ndsl/dsl/__init__.py

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,8 @@ def _get_literal_precision(default: Literal["32", "64"] = "64") -> Literal["32",
4141

4242

4343
NDSL_GLOBAL_PRECISION = int(_get_literal_precision())
44-
os.environ["GT4PY_LITERAL_PRECISION"] = str(NDSL_GLOBAL_PRECISION)
44+
os.environ["GT4PY_LITERAL_INT_PRECISION"] = str(NDSL_GLOBAL_PRECISION)
45+
os.environ["GT4PY_LITERAL_FLOAT_PRECISION"] = str(NDSL_GLOBAL_PRECISION)
4546

4647

4748
# Set cache names for default gt backends workflow
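
The hunk above replaces the single `GT4PY_LITERAL_PRECISION` variable with separate integer and floating point variables, both set to the same value. A rough sketch of that forwarding pattern follows. The body of `_get_literal_precision` is not part of this diff, so the sketch assumes it reads `PACE_FLOAT_PRECISION` (the variable named in `docs/user/index.md`) and accepts only `"32"` or `"64"`; `get_literal_precision` and `forward_to_gt4py` are illustrative names, not NDSL functions.

```python
import os


def get_literal_precision(env, default="64"):
    """Assumed behaviour: read the requested precision, allow only 32 or 64."""
    precision = env.get("PACE_FLOAT_PRECISION", default)
    if precision not in ("32", "64"):
        raise ValueError(f"Unsupported literal precision {precision!r}; use '32' or '64'")
    return precision


def forward_to_gt4py(env):
    """Set both GT4Py literal precision variables to the same value."""
    precision = get_literal_precision(env)
    env["GT4PY_LITERAL_INT_PRECISION"] = precision
    env["GT4PY_LITERAL_FLOAT_PRECISION"] = precision
    return int(precision)


if __name__ == "__main__":
    # Operates on a copy so the real environment is left untouched.
    env = dict(os.environ)
    print(forward_to_gt4py(env))
```

Because the variables are set at import time in the real module, `PACE_FLOAT_PRECISION` presumably has to be exported before `ndsl` is first imported.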

ndsl/dsl/gt4py/__init__.py

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,12 +26,18 @@
2626
computation,
2727
cos,
2828
cosh,
29+
erf,
30+
erfc,
2931
exp,
3032
externals,
33+
float32,
34+
float64,
3135
floor,
3236
function,
3337
gamma,
3438
horizontal,
39+
int32,
40+
int64,
3541
interval,
3642
isfinite,
3743
isinf,
@@ -80,12 +86,18 @@
8086
"computation",
8187
"cos",
8288
"cosh",
89+
"erf",
90+
"erfc",
8391
"exp",
8492
"externals",
93+
"float32",
94+
"float64",
8595
"floor",
8696
"function",
8797
"gamma",
8898
"horizontal",
99+
"int32",
100+
"int64",
89101
"interval",
90102
"isfinite",
91103
"isinf",
