Commit e8353f7

Update mfc-agent-rules.mdc
1 parent 2a84897 commit e8353f7

1 file changed: +31 -32 lines changed

.cursor/rules/mfc-agent-rules.mdc

Lines changed: 31 additions & 32 deletions
@@ -62,27 +62,7 @@ Written primarily for Fortran/Fypp; the GPU and style sections matter only when
 
 ---
 
-# 3 FYPP Macros for GPU acceleration Programming Guidelines (for GPU kernels)
-
-Do not directly use OpenACC or OpenMP directives directly.
-Instead, use the FYPP macros contained in src/common/include/parallel_macros.fpp
-
-Wrap tight loops with
-
-```fortran
-$:GPU_PARALLEL_FOR(private='[...]', copy='[...]')
-```
-* Add `collapse=n` to merge nested loops when safe.
-* Declare loop-local variables with `private='[...]'`.
-* Allocate large arrays with `managed` or move them into a persistent
-  `$:GPU_ENTER_DATA(...)` region at start-up.
-* **Do not** place `stop` / `error stop` inside device code.
-* Must compile with Cray `ftn` and NVIDIA `nvfortran` for GPU offloading; also build CPU-only with
-  GNU `gfortran` and Intel `ifx`/`ifort`.
-
----
-
-# 4 File & Module Structure
+# 3 File & Module Structure
 
 - **File Naming**:
   - `.fpp` files: Fypp preprocessed files that get translated to `.f90`
@@ -99,25 +79,44 @@ $:GPU_PARALLEL_FOR(private='[...]', copy='[...]')
   - `contains` section
   - Implementation of subroutines and functions
 
-# 5 Fypp Macros and GPU Acceleration
+---
+
+# 4 Fypp Macros
 
-## Use of Fypp
 - **Fypp Directives**:
   - Start with `#:` (e.g., `#:include`, `#:def`, `#:enddef`)
   - Macros defined in `include/*.fpp` files
   - Used for code generation, conditional compilation, and GPU offloading
 
-## Some examples
+---
+
+# 5 FYPP Macros for GPU acceleration Programming Guidelines (for GPU kernels)
+
+- Do not use OpenACC or OpenMP directives directly.
+- Instead, use the FYPP macros contained in `src/common/include/parallel_macros.fpp`
+- Documentation on how to use the Fypp macros for GPU offloading is available at https://mflowcode.github.io/documentation/md_gpuParallelization.html
+
+Wrap tight loops with
+```fortran
+$:GPU_PARALLEL_FOR(private='[...]', copy='[...]')
+```
+* Add `collapse=n` to merge nested loops when safe.
+* Declare loop-local variables with `private='[...]'`.
+* Allocate large arrays with `managed` or move them into a persistent
+  `$:GPU_ENTER_DATA(...)` region at start-up.
+* **Do not** place `stop` / `error stop` inside device code.
+* Must compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with
+  GNU `gfortran` and Intel `ifx`/`ifort`.
 
-Documentation on how to use the Fypp macros for GPU offloading is available at https://mflowcode.github.io/documentation/md_gpuParallelization.html
+- Example GPU macros include the below, among others:
+  - `$:GPU_ROUTINE(parallelism='[seq]')` - Marks GPU-callable routines
+  - `$:GPU_PARALLEL_LOOP(collapse=N)` - Parallelizes loops
+  - `$:GPU_LOOP(parallelism='[seq]')` - Marks sequential loops
+  - `$:GPU_UPDATE(device='[var1,var2]')` - Updates device data
+  - `$:GPU_ENTER_DATA(copyin='[var]')` - Copies data to device
+  - `$:GPU_EXIT_DATA(delete='[var]')` - Removes data from device
 
-Some examples include:
-- `$:GPU_ROUTINE(parallelism='[seq]')` - Marks GPU-callable routines
-- `$:GPU_PARALLEL_LOOP(collapse=N)` - Parallelizes loops
-- `$:GPU_LOOP(parallelism='[seq]')` - Marks sequential loops
-- `$:GPU_UPDATE(device='[var1,var2]')` - Updates device data
-- `$:GPU_ENTER_DATA(copyin='[var]')` - Copies data to device
-- `$:GPU_EXIT_DATA(delete='[var]')` - Removes data from device
+---
 
 # 6 Documentation Style
 
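
The rule in this diff to wrap loops with a Fypp macro, not a raw directive, can be illustrated with a minimal, self-contained sketch. `DEMO_PARALLEL_LOOP`, the `demo_scale` program, and the array names are hypothetical stand-ins invented for illustration. The real macros in `src/common/include/parallel_macros.fpp` are not reproduced here and likely differ in arguments and output. The sketch assumes the file is run through Fypp before compilation. Without OpenACC enabled, the generated `!$acc` line is an ordinary comment, so the same source should also build CPU-only.

```fortran
#! Hypothetical stand-in for a macro from parallel_macros.fpp.
#! It expands to a single OpenACC directive line; the real MFC macros
#! are assumed to be more elaborate.
#:def DEMO_PARALLEL_LOOP(collapse=1)
!$acc parallel loop collapse(${collapse}$)
#:enddef

program demo_scale
    implicit none
    integer, parameter :: m = 64, n = 32
    real :: a(m, n)
    integer :: j, k

    a = 1.0

    ! The macro call is replaced by the directive ahead of the loop nest,
    ! so the source itself contains no directive at the call site.
    $:DEMO_PARALLEL_LOOP(collapse=2)
    do k = 1, n
        do j = 1, m
            a(j, k) = 2.0*a(j, k)
        end do
    end do

    print *, sum(a)
end program demo_scale
```

Keeping the directive text inside one macro definition means a change of backend (for example, OpenMP instead of OpenACC) would touch the macro file only, which is presumably the motivation for the "do not use directives directly" rule.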
