Commit 5b40656

upd README
1 parent 51c07b6 commit 5b40656

1 file changed

+21
-46
lines changed

README.md

Lines changed: 21 additions & 46 deletions
@@ -1,16 +1,22 @@
1-
# Batched Advection kernels
1+
# Batched Kernels with Memory Allocations (BKMA)
22

3-
This code implements a 1D advection operator inside a multidimensional space. It implements a [semi-Lagrangian scheme](https://en.wikipedia.org/wiki/Semi-Lagrangian_scheme) using the [SYCL 2020](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html) programming model.
3+
![Advection process](docs/fig/AdvectionProcess.png)
44

5-
To reproduce the benchmark, follow the [benchmark README.md](benchmark/README.md) instructions.
5+
## 1D Convolution operator
6+
Implements a [1D convolution operator](https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html) in place using BKMA strategies.
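
To illustrate why an in-place convolution needs a temporary allocation, here is a minimal host-side sketch in plain C++ (no SYCL). It is not the repository's kernel: the function name `conv1d_inplace`, the single-channel layout, the zero padding, and the odd kernel length are all assumptions made for the example. Each batch row is copied to a scratch buffer before being overwritten, which mirrors the role that per-work-group local memory would play on a device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the repository): applies a zero-padded,
// same-size 1D cross-correlation to every row of a row-major batch, in place.
void conv1d_inplace(std::vector<double>& data, std::size_t n_batch,
                    std::size_t n, const std::vector<double>& weights) {
    assert(data.size() == n_batch * n);
    assert(weights.size() % 2 == 1);  // odd length keeps the output centred
    const std::size_t half = weights.size() / 2;

    // One scratch buffer, allocated once and reused for every batch row.
    std::vector<double> scratch(n);

    for (std::size_t b = 0; b < n_batch; ++b) {
        double* row = data.data() + b * n;
        // Copy the row first: writing results directly into `row` would
        // overwrite inputs that later outputs still need.
        std::copy(row, row + n, scratch.begin());
        for (std::size_t i = 0; i < n; ++i) {
            double acc = 0.0;
            for (std::size_t k = 0; k < weights.size(); ++k) {
                // Source index is i + k - half; indices outside [0, n)
                // contribute zero (zero padding).
                if (i + k < half) continue;
                const std::size_t src = i + k - half;
                if (src >= n) continue;
                acc += weights[k] * scratch[src];
            }
            row[i] = acc;
        }
    }
}
```

With two rows `{1, 2, 3, 4}` and `{10, 20, 30, 40}` and weights `{1, 1, 1}`, each output is the sum of an element and its neighbours, with missing neighbours at the edges treated as zero.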
67

7-
## General algorithm
8-
For one time step, the algorithm's structure is as follows:
8+
## Lagrangian Advection
99

10-
![Advection process](docs/fig/AdvectionProcess.png)
10+
This code implements a 1D advection operator inside a multidimensional space. It implements a [semi-Lagrangian scheme](https://en.wikipedia.org/wiki/Semi-Lagrangian_scheme) using the [SYCL 2020](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html) programming model.
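
As a rough illustration of what one semi-Lagrangian step does along the dimension of interest, the following plain C++ sketch advects a single 1D row at constant speed. It is not the repository's SYCL kernel: the name `advect_step`, the periodic boundary, the grid spacing `(x_max - x_min) / n`, and the use of linear interpolation are assumptions chosen to keep the example short.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not from the repository): one semi-Lagrangian step
// for df/dt + v * df/dx = 0 on a periodic, uniform grid of n points.
std::vector<double> advect_step(const std::vector<double>& f, double v,
                                double dt, double x_min, double x_max) {
    const std::size_t n = f.size();
    assert(n > 0 && x_max > x_min);
    const double nd = static_cast<double>(n);
    const double dx = (x_max - x_min) / nd;  // assumed periodic spacing
    const double shift = v * dt / dx;        // displacement in grid cells

    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Foot of the characteristic, in grid units: where the value
        // arriving at point i was located one time step earlier.
        double s = static_cast<double>(i) - shift;
        // Wrap into [0, n) for the periodic domain.
        s -= std::floor(s / nd) * nd;
        if (s < 0.0) s += nd;  // guard against rounding just below zero

        // Linear interpolation between the two neighbouring grid points.
        const double base = std::floor(s);
        const double w = s - base;
        const std::size_t i0 = static_cast<std::size_t>(base) % n;
        const std::size_t i1 = (i0 + 1) % n;
        out[i] = (1.0 - w) * f[i0] + w * f[i1];
    }
    return out;
}
```

This version writes to a separate output vector. An in-place variant would first copy the row into a scratch buffer, since every output reads neighbouring inputs.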
11+
12+
To reproduce the benchmark, follow the [benchmark README.md](benchmark/README.md) instructions.
1113

1214
### SYCL Implementations
13-
The algorithm is implemented in various ways using different SYCL constructs. It requires local memory allocation via the [local accessor](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:accessor.local). The implementations are in the `src/core/impl` directory.
15+
The algorithm is implemented in various ways using different SYCL constructs. It requires local memory allocation via the [local accessor](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:accessor.local). The implementations are in the `src/core` directory.
16+
17+
- BasicRange (out-of-place), no hierarchical parallelism involved
18+
- NDRange (in-place), work-groups and work-items, direct mapping of the problem dimensions
19+
- AdaptiveWg (in-place or out-of-place), optimized work-group sizes, streaming, optimal local memory usage
1420

1521
# Build the project:
1622
You can use the `compile.sh` script to compile for various hardware and SYCL implementations. For multi-device compilation flows, build the project manually.
@@ -19,13 +25,13 @@ Use the `./compile.sh --help` to see the options.
1925
Example usage:
2026
```sh
2127
#generate the advection executable and advection.ini file
22-
./compile.sh --hw x86_64 --sycl intel-llvm
28+
./compile.sh --hw cpu --sycl dpcpp
2329

24-
#create build_intel-llvm_a100 folder with benchmarks
25-
./compile.sh --hw a100 --sycl intel-llvm --benchmark_BUILD_DIR=/path/to/google/benchmark/build
30+
#create build_dpcpp_a100 folder with benchmarks
31+
./compile.sh --hw a100 --sycl dpcpp --benchmark_DIR=/path/to/google/benchmark/build
2632

27-
#create build_acpp_mi250 folder with tests and execute tests
28-
./compile.sh --hw mi250 --sycl acpp --build-tests --run-tests
33+
#create build_acpp_mi300 folder with tests and execute tests
34+
./compile.sh --hw mi300 --sycl acpp --build-tests --run-tests
2935
```
3036

3137
## Manually build the project
@@ -34,40 +40,9 @@ Flags varies on the SYCL implementation you are using.
3440
- For acpp, export the `ACPP_TARGETS` environment variable before compiling
3541

3642
# Run the executable
37-
1. Set the runtime parameters in `build/src/advection.ini`
38-
39-
```ini
40-
[run]
41-
# Total number of iterations
42-
maxIter = 100
43-
# Whether to run on the GPU or CPU
44-
gpu = true
45-
# The kernel type to use for advection
46-
kernelImpl = Hierarchical
47-
# Size of the work-groups used in the kernels
48-
workGroupSizeX = 128
49-
workGroupSizeY = 1
50-
# Outputs a solution.log file to be read with the python notebook
51-
outputSolution = false
52-
53-
[geometry]
54-
nb = 512 # nb of speed points (batch dimension)
55-
n1 = 1024 # nb of spatial points (dimension of interest)
56-
n2 = 2 # fictive dimension, is also stride for x-dim
57-
58-
[discretization]
59-
dt = 0.001
60-
61-
minRealX = 0
62-
maxRealX = 1
63-
minRealVx = -1
64-
maxRealVx = 1
65-
```
66-
67-
Deltas $d_{vx}$ and $d_x$ are deduced from the number of points and the min/max values.
68-
69-
2. Run the executable `build/src/advection`
43+
1. Set the runtime parameters in `build/src/<conv1d|advection>.ini`
44+
2. Run the executable `build/src/advection/<conv1d|advection>`
7045

7146

7247
### Credits
73-
This code is largely inspired by the [vlp4D](https://github.com/yasahi-hpc/vlp4d) code.
48+
The advection operator in this code is largely inspired by the [vlp4D](https://github.com/yasahi-hpc/vlp4d) code.
