Commit c6973cf

Update readme
1 parent 9afa1bb commit c6973cf

1 file changed

+53
-108
lines changed
  • DirectProgramming/C++SYCL_FPGA/Tutorials/DesignPatterns/stoppable_kernel

DirectProgramming/C++SYCL_FPGA/Tutorials/DesignPatterns/stoppable_kernel/README.md

Lines changed: 53 additions & 108 deletions
@@ -49,13 +49,13 @@ You can also find more information about [troubleshooting build errors](/DirectP
4949

5050
## Purpose
5151

52-
This tutorial demonstrates how to add a `stop` register to allow a host application to kill (or reset) your kernel at any point. This design pattern is useful in applications where you want your kernel to run for some indefinite number of iterations that can't be communicated ahead of time. For example, consider a situation where you want your kernel to periodically re-launch with new kernel arguments when something happens that only the host is aware of, such as an input device disconnecting, or some amount of time passing.
52+
This tutorial demonstrates how to add a `stop` register to allow a host application to kill (or reset) your kernel at any point. This design pattern is useful in applications where you want your kernel to run for some indefinite number of iterations that can't be communicated ahead of time. For example, consider a situation where you want your kernel to periodically re-launch with new kernel arguments when something happens that only the host is aware of, such as an input device disconnecting, or some amount of time passing.
5353

5454
## Key Implementation Details
5555

5656
The key to implementing this behavior is to create a `while()` loop that terminates when a 'stop' signal is seen on a pipe interface. Pipe interfaces (unlike kernel arguments) can be read during a kernel's execution. You can also use the `protocol::avalon_mm_uses_ready` property to allow the host code to interact with the pipe through your kernel's memory-map instead of a streaming interface. For details, see the [CSR Pipes](/DirectProgramming/C++SYCL_FPGA/Tutorials/Features/hls_flow_interfaces/component_interfaces_comparison/csr-pipes) sub-sample within the [Component Interfaces Comparison](/DirectProgramming/C++SYCL_FPGA/Tutorials/Features/hls_flow_interfaces/component_interfaces_comparison) code sample.
5757

58-
The `while()` loop continues iterating until the host application (or even a different kernel) writes a `true` into the `StopPipe`. We use **non-blocking** pipe operations to guarantee that the kernel checks *all* of its pipe interfaces every clock cycle. It is important to use non-blocking pipe reads and writes, because blocking pipe operations may take some time to respond. If the kernel is blocking on a different pipe operation, it will not respond to a write to the `StopPipe` interface.
58+
The `while()` loop continues iterating until the host application (or even a different kernel) writes a `true` into the `StopPipe`. We use **non-blocking** pipe operations to guarantee that the kernel checks _all_ of its pipe interfaces every clock cycle. It is important to use non-blocking pipe reads and writes, because blocking pipe operations may take some time to respond. If the kernel is blocking on a different pipe operation, it will not respond to a write to the `StopPipe` interface.
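To illustrate why non-blocking operations keep the kernel responsive, here is a minimal host-only sketch in standard C++. It is not the tutorial's kernel code and does not use the SYCL pipe API: `ModelPipe`, `RunStoppableCounter`, and the `max_cycles` bound are hypothetical names invented for this model, and each loop iteration stands in for one clock cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a pipe. read() and write() never block; they
// report success through a flag, mimicking non-blocking pipe semantics.
template <typename T>
class ModelPipe {
 public:
  explicit ModelPipe(std::size_t capacity) : capacity_(capacity) {}

  // Returns the oldest value and sets `success` if data was available;
  // otherwise returns a default value and clears `success`.
  T read(bool &success) {
    if (fifo_.empty()) {
      success = false;
      return T{};
    }
    T value = fifo_.front();
    fifo_.pop_front();
    success = true;
    return value;
  }

  // Stores `value` and sets `success` if there was room; otherwise the
  // value is not stored and `success` is cleared.
  void write(const T &value, bool &success) {
    success = fifo_.size() < capacity_;
    if (success) fifo_.push_back(value);
  }

 private:
  std::size_t capacity_;
  std::deque<T> fifo_;
};

// Models the kernel body: count upward from `init` until a `true` arrives
// on the stop pipe. `max_cycles` is a safety bound for this model only.
// Returns the number of loop iterations that ran.
int RunStoppableCounter(int init, ModelPipe<bool> &stop_pipe,
                        ModelPipe<int> &output_pipe, int max_cycles) {
  assert(max_cycles > 0);
  int counter = init;
  bool keep_going = true;
  int cycles = 0;
  while (keep_going && cycles < max_cycles) {
    // Non-blocking write: advance the counter only if the value was accepted.
    bool write_ok = false;
    output_pipe.write(counter, write_ok);
    if (write_ok) ++counter;

    // Non-blocking read: the stop pipe is polled on every iteration, even
    // when the output pipe is full and the write above failed.
    bool stop_valid = false;
    bool stop_value = stop_pipe.read(stop_valid);
    if (stop_valid && stop_value) keep_going = false;

    ++cycles;
  }
  return cycles;
}
```

In this model, with an output pipe of capacity 1 that nobody drains and a stop pipe preloaded with `false, false, true`, the loop still exits on its third iteration: the failed output writes never prevent the stop pipe from being polled. A blocking write would instead stall on the full output pipe and never reach the stop check.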
5959

6060
```c++
6161
[[intel::initiation_interval(1)]] // NO-FORMAT: Attribute
@@ -95,7 +95,7 @@ The testbench in `main.cpp` exercises the kernel in the following steps:
9595
3. Read 256 more outputs from the kernel, which should be a monotonically growing sequence starting at 263.
9696
4. Stop the kernel.
9797
5. Initialize the kernel with a new initialization value of 77.
98-
6. Read 256 more outputs from the kernel, which should be a monotonically growing sequence starting at 77.
98+
6. Read 256 more outputs from the kernel, which should be a monotonically growing sequence starting at 77.
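The expected start values in these steps follow from simple arithmetic: a counter that has produced 256 outputs starting at 7 continues at 7 + 256 = 263, and a restart at 77 begins a fresh sequence. The sketch below shows one way such a check could be written in standard C++. The helpers `ModelOutputs`, `IsCountingSequence`, and `NextStartAfter` are hypothetical and are not taken from `main.cpp`; the sketch also assumes that "monotonically growing" means each value is exactly one more than the previous one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for reading `count` outputs from a counter kernel
// that was started at `start`.
std::vector<int> ModelOutputs(int start, int count) {
  assert(count >= 0);
  std::vector<int> values;
  values.reserve(static_cast<std::size_t>(count));
  for (int i = 0; i < count; ++i) values.push_back(start + i);
  return values;
}

// Hypothetical checker: true if `values` counts upward by 1 from `start`.
// (Assumption: the sequence increments by exactly 1 per output.)
bool IsCountingSequence(const std::vector<int> &values, int start) {
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i] != start + static_cast<int>(i)) return false;
  }
  return true;
}

// First value expected after `count` outputs have been read from a counter
// that started at `start`.
int NextStartAfter(int start, int count) { return start + count; }
```

With these helpers, `NextStartAfter(7, 256)` gives 263, matching the start value in step 3, and a 256-value read after re-initializing at 77 should end at 332.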
9999

100100
## Building the `stoppable_kernel` Tutorial
101101

@@ -151,51 +151,25 @@ This design uses CMake to generate a build script for GNU/make.
151151
> ```
152152
> cmake .. -DFPGA_DEVICE=<FPGA device family or FPGA part number>
153153
> ```
154-
>
155-
> Alternatively, you can target an explicit FPGA board variant and BSP by using the following command:
156-
>
157-
> ```
158-
> cmake .. -DFPGA_DEVICE=<board-support-package>:<board-variant>
159-
> ```
160-
>
161-
> **Note**: You can poll your system for available BSPs using the `aoc -list-boards` command. The board list that is printed out will be of the form
162-
>
163-
> ```
164-
> $> aoc -list-boards
165-
> Board list:
166-
> <board-variant>
167-
> Board Package: <path/to/board/package>/board-support-package
168-
> <board-variant2>
169-
> Board Package: <path/to/board/package>/board-support-package
170-
> ```
171-
>
172-
> You will only be able to run an executable on the FPGA if you specified a BSP.
173-
174-
3. Compile the design with the generated `Makefile`. The following build targets are provided, matching the recommended development flow:
175-
176-
| Target | Expected Time | Output | Description |
177-
| :-------------- | :------------- | :--------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
178-
| `make fpga_emu` | Seconds | x86-64 binary | Compiles the FPGA device code to the CPU. Use the Intel® FPGA Emulation Platform for OpenCL™ software to verify your SYCL code’s functional correctness. |
179-
| `make report` | Minutes | RTL + FPGA reports | Compiles the FPGA device code to RTL and generates an optimization report that describes the structures generated on the FPGA, identifies performance bottlenecks, and estimates resource utilization. This report will include the interfaces defined in your selected Board Support Package. The generated RTL may be exported to Intel® Quartus Prime software. |
180-
| `make fpga_sim` | Minutes | RTL + FPGA reports + x86-64 binary | Compiles the FPGA device code to RTL and generates a simulation testbench. Use the Questa\*-Intel® FPGA Edition simulator to verify your design. |
181-
| `make fpga` | Multiple Hours | Quartus Place & Route (Full accelerator) + FPGA reports + x86-64 host binary | Compiles the FPGA device code to RTL and compiles the generated RTL using Intel® Quartus® Prime. If you specified a BSP with `FPGA_DEVICE`, this will generate an FPGA image that you can run on the corresponding accelerator board. |
182154
183-
The `fpga_emu`, `fpga_sim` and `fpga` targets produce binaries that you can run. The executables will be called `TARGET_NAME.fpga_emu`, `TARGET_NAME.fpga_sim`, and `TARGET_NAME.fpga`, where `TARGET_NAME` is the value you specify in `CMakeLists.txt`.
184-
185-
You can see a listing of the commands that are run:
186-
187-
```bash
188-
build $> make report
189-
[ 33%] To compile manually:
190-
/[ ... ]/linux64/bin/icpx -I../../../../include -fintelfpga -Wall -qactypes -DFPGA_HARDWARE -c ../src/stoppable_kernel.cpp -o CMakeFiles/report.dir/src/ stoppable_kernel.cpp.o
191-
192-
To link manually:
193-
/[ ... ]/linux64/bin/icpx -fintelfpga -Xshardware -Xstarget=Agilex7 -fsycl-link=early -o stoppable_kernel.report CMakeFiles/report.dir/src/ stoppable_kernel.cpp.o
194-
195-
[ ... ]
196-
197-
[100%] Built target report
198-
```
155+
3. Compile the design. (The provided targets match the recommended development flow.)
156+
157+
1. Compile for emulation (fast compile time, targets emulated FPGA device).
158+
```
159+
make fpga_emu
160+
```
161+
2. Generate the HTML optimization reports.
162+
```
163+
make report
164+
```
165+
3. Compile for simulation (fast compile time, targets simulated FPGA device).
166+
```
167+
make fpga_sim
168+
```
169+
4. Compile with Quartus place and route (longer compile time, produces an accurate area estimate).
170+
```
171+
make fpga
172+
```
199173
200174
### On a Windows\* System
201175
@@ -230,58 +204,25 @@ This design uses CMake to generate a build script for `nmake`.
230204
> ```
231205
> cmake -G "NMake Makefiles" .. -DFPGA_DEVICE=<FPGA device family or FPGA part number>
232206
> ```
233-
>
234-
> Alternatively, you can target an explicit FPGA board variant and BSP by using the following command:
235-
>
236-
> ```
237-
> cmake -G "NMake Makefiles" .. -DFPGA_DEVICE=<board-support-package>:<board-variant>
238-
> ```
239-
>
240-
> **Note**: You can poll your system for available BSPs using the `aoc -list-boards` command. The board list that is printed out will be of the form
241-
>
242-
> ```
243-
> $> aoc -list-boards
244-
> Board list:
245-
> <board-variant>
246-
> Board Package: <path/to/board/package>/board-support-package
247-
> <board-variant2>
248-
> Board Package: <path/to/board/package>/board-support-package
249-
> ```
250-
>
251-
> You will only be able to run an executable on the FPGA if you specified a BSP.
252-
253-
3. Compile the design with the generated `Makefile`. The following build targets are provided, matching the recommended development flow:
254-
255-
| Target | Expected Time | Output | Description |
256-
| :--------------- | :------------- | :--------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
257-
| `nmake fpga_emu` | Seconds | x86-64 binary | Compiles the FPGA device code to the CPU. Use the Intel® FPGA Emulation Platform for OpenCL™ software to verify your SYCL code’s functional correctness. |
258-
| `nmake report` | Minutes | RTL + FPGA reports | Compiles the FPGA device code to RTL and generates an optimization report that describes the structures generated on the FPGA, identifies performance bottlenecks, and estimates resource utilization. This report will include the interfaces defined in your selected Board Support Package. The generated RTL may be exported to Intel® Quartus Prime software. |
259-
| `nmake fpga_sim` | Minutes | RTL + FPGA reports + x86-64 binary | Compiles the FPGA device code to RTL and generates a simulation testbench. Use the Questa\*-Intel® FPGA Edition simulator to verify your design. |
260-
| `nmake fpga` | Multiple Hours | Quartus Place & Route (Full accelerator) + FPGA reports + x86-64 host binary | Compiles the FPGA device code to RTL and compiles the generated RTL using Intel® Quartus® Prime. If you specified a BSP with `FPGA_DEVICE`, this will generate an FPGA image that you can run on the corresponding accelerator board. |
261207
262-
The `fpga_emu`, `fpga_sim`, and `fpga` targets also produce binaries that you can run. The executables will be called `TARGET_NAME.fpga_emu.exe`, `TARGET_NAME.fpga_sim.exe`, and `TARGET_NAME.fpga.exe`, where `TARGET_NAME` is the value you specify in `CMakeLists.txt`.
263-
264-
> **Note**: If you encounter any issues with long paths when compiling under Windows\*, you may have to create your 'build' directory in a shorter path, for example c:\samples\build. You can then run cmake from that directory, and provide cmake with the full path to your sample directory, for example:
265-
>
266-
> ```
267-
> C:\samples\build> cmake -G "NMake Makefiles" C:\long\path\to\code\sample\CMakeLists.txt
268-
> ```
269-
>
270-
> You can see a listing of the commands that are run:
271-
272-
```bash
273-
build> nmake report
274-
275-
[ 33%] To compile manually:
276-
C:/Program Files (x86)/Intel/oneAPI/compiler/latest/windows/bin/icx-cl.exe -I../../../../include -fintelfpga -Wall /EHsc -Qactypes -DFPGA_HARDWARE -c ../src/stoppable_kernel.cpp -o CMakeFiles/report.dir/src/stoppable_kernel.cpp.obj
277-
278-
To link manually:
279-
C:/Program Files (x86)/Intel/oneAPI/compiler/latest/windows/bin/icx-cl.exe -fintelfpga -Xshardware -Xstarget=Agilex7 -fsycl-link=early -o stoppable_kernel.report.exe CMakeFiles/report.dir/src/stoppable_kernel.cpp.obj
280-
281-
[ ... ]
282-
283-
[100%] Built target report
284-
```
208+
3. Compile the design. (The provided targets match the recommended development flow.)
209+
210+
1. Compile for emulation (fast compile time, targets emulated FPGA device).
211+
```
212+
nmake fpga_emu
213+
```
214+
2. Generate the optimization report.
215+
```
216+
nmake report
217+
```
218+
3. Compile for simulation (fast compile time, targets simulated FPGA device).
219+
```
220+
nmake fpga_sim
221+
```
222+
4. Compile with Quartus place and route (longer compile time, produces an accurate area estimate).
223+
```
224+
nmake fpga
225+
```
285226
286227
## Run the `stoppable_kernel` Executable
287228
@@ -295,10 +236,6 @@ This design uses CMake to generate a build script for `nmake`.
295236
```
296237
CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1 ./stoppable.fpga_sim
297238
```
298-
3. Alternatively, run the sample on the FPGA device (only if you ran `cmake` with `-DFPGA_DEVICE=<board-support-package>:<board-variant>`).
299-
```
300-
./stoppable.fpga
301-
```
302239
303240
### On Windows
304241
@@ -312,16 +249,24 @@ This design uses CMake to generate a build script for `nmake`.
312249
stoppable.fpga_sim.exe
313250
set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=
314251
```
315-
3. Alternatively, run the sample on the FPGA device (only if you ran `cmake` with `-DFPGA_DEVICE=<board-support-package>:<board-variant>`).
316-
```
317-
stoppable.fpga.exe
318-
```
319252
320253
## Example Output
321254
322255
```
323-
Running on device: Intel(R) FPGA Emulation Device
324-
add two vectors of size 256
256+
Running on device: SimulatorDevice : Multi-process Simulator (aclmsim0)
257+
258+
Start kernel StoppableCounter at 7.
259+
Flush pipe until 'start of packet' is seen.
260+
Start counting from 7
261+
Flushed 0 beats.
262+
Start counting from 263
263+
Stop kernel StoppableCounter
264+
265+
Start StoppableCounter at 77.
266+
Flush pipe until 'start of packet' is seen.
267+
Start counting from 77
268+
Flushed 117 beats.
269+
Stop kernel StoppableCounter
325270
PASSED
326271
```
327272
