Skip to content

Commit 9047ea3

Browse files
committed
Update per Kevin's suggestions
1 parent 9afa1bb commit 9047ea3

File tree

6 files changed

+293
-356
lines changed

6 files changed

+293
-356
lines changed

DirectProgramming/C++SYCL_FPGA/Tutorials/DesignPatterns/stoppable_kernel/CMakeLists.txt renamed to DirectProgramming/C++SYCL_FPGA/Tutorials/DesignPatterns/restartable_kernel/CMakeLists.txt

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ endif()
1010

1111
cmake_minimum_required (VERSION 3.7.2)
1212

13-
project(stoppable CXX)
13+
project(restartable CXX)
1414

1515
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})
1616
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})
@@ -20,7 +20,7 @@ set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})
2020
### Customize these build variables
2121
###############################################################################
2222
set(SOURCE_FILES src/main.cpp)
23-
set(TARGET_NAME stoppable)
23+
set(TARGET_NAME restartable)
2424

2525
# Use cmake -DFPGA_DEVICE=<board-support-package>:<board-variant> to choose a
2626
# different device.
Lines changed: 270 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,270 @@
1+
# `Restartable Kernel` Sample
2+
3+
This tutorial demonstrates how to make a restartable kernel. The technique shown in this tutorial lets you dynamically terminate your kernel while it runs, allowing it to load a new set of kernel arguments.
4+
5+
| Optimized for | Description |
6+
| :------------------ | :------------------------------------------------------------------------------------------ |
7+
| OS | Ubuntu* 20.04 <br> RHEL*/CentOS* 8 <br> SUSE* 15 <br> Windows* 10 <br> Windows Server* 2019 |
8+
| Hardware | Intel® Agilex® 7, Agilex® 5, Arria® 10, Stratix® 10, and Cyclone® V FPGAs |
9+
| Software | Intel® oneAPI DPC++/C++ Compiler |
10+
| What you will learn | Best practices for creating and managing a oneAPI FPGA project |
11+
| Time to complete | 10 minutes |
12+
13+
> **Note**: Even though the Intel DPC++/C++ oneAPI compiler is enough to compile for emulation, generating reports and generating RTL, there are extra software requirements for the simulation flow and FPGA compiles.
14+
>
15+
> To use the simulator flow, Intel® Quartus® Prime Pro Edition (or Standard Edition when targeting Cyclone® V) and one of the following simulators must be installed and accessible through your PATH:
16+
>
17+
> - Questa\*-Intel® FPGA Edition
18+
> - Questa\*-Intel® FPGA Starter Edition
19+
> - ModelSim® SE
20+
>
21+
> When using the hardware compile flow, Intel® Quartus® Prime Pro Edition (or Standard Edition when targeting Cyclone® V) must be installed and accessible through your PATH.
22+
>
23+
> :warning: Make sure you add the device files associated with the FPGA that you are targeting to your Intel® Quartus® Prime installation.
24+
25+
> **Note**: In oneAPI full systems, kernels that use SYCL Unified Shared Memory (USM) host allocations or USM shared allocations (and therefore the code in this tutorial) are only supported by Board Support Packages (BSPs) with USM support. Kernels that use these types of allocations can always be used to generate standalone IPs.
26+
27+
## Prerequisites
28+
29+
This sample is part of the FPGA code samples.
30+
It is categorized as a Tier 3 sample that demonstrates a design pattern.
31+
32+
```mermaid
33+
flowchart LR
34+
tier1("Tier 1: Get Started")
35+
tier2("Tier 2: Explore the Fundamentals")
36+
tier3("Tier 3: Explore the Advanced Techniques")
37+
tier4("Tier 4: Explore the Reference Designs")
38+
39+
tier1 --> tier2 --> tier3 --> tier4
40+
41+
style tier1 fill:#f96,stroke:#0071c1,stroke-width:1px,color:#fff
42+
style tier2 fill:#0071c1,stroke:#0071c1,stroke-width:1px,color:#fff
43+
style tier3 fill:#0071c1,stroke:#333,stroke-width:1px,color:#fff
44+
style tier4 fill:#0071c1,stroke:#0071c1,stroke-width:1px,color:#fff
45+
```
46+
47+
Find more information about how to navigate this part of the code samples in the [FPGA top-level README.md](/DirectProgramming/C++SYCL_FPGA/README.md).
48+
You can also find more information about [troubleshooting build errors](/DirectProgramming/C++SYCL_FPGA/README.md#troubleshooting), [running the sample on the Intel® DevCloud](/DirectProgramming/C++SYCL_FPGA/README.md#build-and-run-the-samples-on-intel-devcloud-optional), [using Visual Studio Code with the code samples](/DirectProgramming/C++SYCL_FPGA/README.md#use-visual-studio-code-vs-code-optional), [links to selected documentation](/DirectProgramming/C++SYCL_FPGA/README.md#documentation), etc.
49+
50+
## Purpose
51+
52+
This tutorial demonstrates how to add a `stop` register to allow a host application to kill (or reset) your kernel at any point. This design pattern is useful in applications where you want your kernel to run for some indefinite number of iterations that can't be communicated ahead of time. For example, consider a situation where you want your kernel to periodically re-launch with new kernel arguments when something happens that only the host is aware of, such as an input device disconnecting, or some amount of time passing.
53+
54+
## Key Implementation Details
55+
56+
The key to implementing this behavior is to create a `while()` loop that terminates when a 'stop' signal is seen on a pipe interface. Pipe interfaces (unlike kernel arguments) can be read multiple times during a kernel's execution, so you can use them to send messages to your kernel while it executes. The `while()` loop continues iterating until the host application (or even a different kernel) writes a `true` into the `StopPipe`. We use **non-blocking** pipe operations to guarantee that the kernel checks *all* of its pipe interfaces every clock cycle. It is important to use non-blocking pipe reads and writes, because blocking pipe operations may take some time to respond. If the kernel is blocking on a different pipe operation, it will not respond to a write to the `StopPipe` interface.
57+
58+
```c++
59+
[[intel::initiation_interval(1)]] // NO-FORMAT: Attribute
60+
while (keep_going) {
61+
// Use non-blocking operations to ensure that the kernel can check all its
62+
// pipe interfaces every clock cycle, even if one or more data interfaces
63+
// are stalling (asserting valid = 0) or back-pressuring (asserting ready
64+
// = 0).
65+
bool did_write = false;
66+
PipePayloadType beat = <...>;
67+
OutputPipe::write(beat, did_write);
68+
69+
// Only adjust the state of the kernel if the pipe write succeeded.
70+
// This is logically equivalent to blocking.
71+
if (did_write) {
72+
counter++;
73+
}
74+
75+
// Use non-blocking operations to ensure that the kernel can check all its
76+
// pipe interfaces every clock cycle.
77+
bool did_read_keep_going = false;
78+
bool stop_result = StopPipe::read(did_read_keep_going);
79+
if (did_read_keep_going) {
80+
keep_going = !stop_result;
81+
}
82+
}
83+
```
84+
85+
In this sample, `StopPipe` has been assigned the `protocol::avalon_mm_uses_ready` property so it terminates in the kernel's control/status register (CSR) instead of in a streaming interface. Terminating in the CSR allows this kernel to be managed by a memory-mapped host (such as a Nios® V softcore processor), while terminating in a streaming interface is convenient if this kernel were to be managed by another SYCL kernel. For details about the `protocol::avalon_mm_uses_ready` property, see the [CSR Pipes](/DirectProgramming/C++SYCL_FPGA/Tutorials/Features/hls_flow_interfaces/component_interfaces_comparison/csr-pipes) sub-sample within the [Component Interfaces Comparison](/DirectProgramming/C++SYCL_FPGA/Tutorials/Features/hls_flow_interfaces/component_interfaces_comparison) code sample.
86+
87+
![](assets/stopcsr.png)
88+
89+
The testbench in `main.cpp` exercises the kernel in the following steps:
90+
91+
1. Initialize the counter kernel with an initial value of 7.
92+
2. Read a sequence of 256 outputs from the kernel, which should be a monotonically growing sequence starting at 7.
93+
3. Read 256 more outputs from the kernel, which should be a monotonically growing sequence starting at 263.
94+
4. Stop the kernel.
95+
5. Initialize the kernel with a new initialization value of 77.
96+
6. ead 256 more outputs from the kernel, which should be a monotonically growing sequence starting at 77.
97+
98+
## Building the `restartable_kernel` Tutorial
99+
100+
> **Note**: When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables.
101+
> Set up your CLI environment by sourcing the `setvars` script located in the root of your oneAPI installation every time you open a new terminal window.
102+
> This practice ensures that your compiler, libraries, and tools are ready for development.
103+
>
104+
> Linux\*:
105+
>
106+
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
107+
> - For private installations: ` . ~/intel/oneapi/setvars.sh`
108+
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
109+
>
110+
> Windows\*:
111+
>
112+
> - `C:\"Program Files (x86)"\Intel\oneAPI\setvars.bat`
113+
> - Windows PowerShell\*, use the following command: `cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'`
114+
>
115+
> For more information on configuring environment variables, see [Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html) or [Use the setvars Script with Windows\*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-windows.html).
116+
117+
Use these commands to run the design, depending on your OS.
118+
119+
### On a Linux\* System
120+
121+
This design uses CMake to generate a build script for GNU/make.
122+
123+
1. Change to the sample directory.
124+
125+
2. Configure the build system for the Agilex® 7 device family, which is the default.
126+
127+
```
128+
mkdir build
129+
cd build
130+
cmake ..
131+
```
132+
133+
> **Note**: You can change the default target by using the command:
134+
>
135+
> ```
136+
> cmake .. -DFPGA_DEVICE=<FPGA device family or FPGA part number>
137+
> ```
138+
139+
3. Compile the design. (The provided targets match the recommended development flow.)
140+
141+
1. Compile for emulation (fast compile time, targets emulates an FPGA device).
142+
```
143+
make fpga_emu
144+
```
145+
2. Generate the HTML optimization reports.
146+
```
147+
make report
148+
```
149+
3. Compile for simulation (fast compile time, targets simulator FPGA device).
150+
```
151+
make fpga_sim
152+
```
153+
4. Compile with Quartus place and route (To get accurate area estimate, longer compile time).
154+
```
155+
make fpga
156+
```
157+
158+
### On a Windows\* System
159+
160+
This design uses CMake to generate a build script for `nmake`.
161+
162+
1. Change to the sample directory.
163+
164+
2. Configure the build system for the Agilex® 7 device family, which is the default.
165+
166+
```
167+
mkdir build
168+
cd build
169+
cmake -G "NMake Makefiles" ..
170+
```
171+
172+
You can create a debuggable binary by setting `CMAKE_BUILD_TYPE` to `Debug`:
173+
174+
```
175+
mkdir build
176+
cd build
177+
cmake -G "NMake Makefiles" .. -DCMAKE_BUILD_TYPE=Debug
178+
```
179+
180+
If you want to use the `report`, `fpga_sim`, or `fpga` flows, you should switch the `CMAKE_BUILD_TYPE` back to `Release``:
181+
182+
```
183+
cmake -G "NMake Makefiles" .. -DCMAKE_BUILD_TYPE=Release
184+
```
185+
186+
> **Note**: You can change the default target by using the command:
187+
>
188+
> ```
189+
> cmake -G "NMake Makefiles" .. -DFPGA_DEVICE=<FPGA device family or FPGA part number>
190+
> ```
191+
192+
193+
194+
3. Compile the design. (The provided targets match the recommended development flow.)
195+
196+
1. Compile for emulation (fast compile time, targets emulated FPGA device).
197+
```
198+
nmake fpga_emu
199+
```
200+
2. Generate the optimization report.
201+
```
202+
nmake report
203+
```
204+
3. Compile for simulation (fast compile time, targets simulator FPGA device).
205+
```
206+
nmake fpga_sim
207+
```
208+
4. Compile with Quartus place and route (To get accurate area estimate, longer compile time).
209+
```
210+
nmake fpga
211+
```
212+
213+
> **Note**: If you encounter any issues with long paths when compiling under Windows\*, you may have to create your 'build' directory in a shorter path, for example `C:\samples\build`. You can then run cmake from that directory, and provide cmake with the full path to your sample directory, for example:
214+
>
215+
> ```
216+
> C:\samples\build> cmake -G "NMake Makefiles" C:\long\path\to\code\sample\CMakeLists.txt
217+
> ```
218+
219+
## Run the `restartable_kernel` Executable
220+
221+
### On Linux
222+
223+
1. Run the sample on the FPGA emulator (the kernel executes on the CPU).
224+
```
225+
./restartable.fpga_emu
226+
```
227+
2. Run the sample on the FPGA simulator device.
228+
```
229+
CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1 ./restartable.fpga_sim
230+
```
231+
232+
### On Windows
233+
234+
1. Run the sample on the FPGA emulator (the kernel executes on the CPU).
235+
```
236+
restartable.fpga_emu.exe
237+
```
238+
2. Run the sample on the FPGA simulator device.
239+
```
240+
set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1
241+
restartable.fpga_sim.exe
242+
set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=
243+
```
244+
245+
## Example Output
246+
247+
```
248+
Running on device: Intel(R) FPGA Emulation Device
249+
250+
Start kernel RestartableCounter at 7.
251+
Flush pipe until 'start of packet' is seen.
252+
Flushed 0 beats.
253+
Start counting from 7
254+
Start counting from 263
255+
Stop kernel RestartableCounter
256+
257+
Start RestartableCounter at 77.
258+
Flush pipe until 'start of packet' is seen.
259+
Flushed 239107 beats.
260+
Start counting from 77
261+
Stop kernel RestartableCounter
262+
PASSED
263+
```
264+
265+
## License
266+
267+
Code samples are licensed under the MIT license. See
268+
[License.txt](/License.txt) for details.
269+
270+
Third party program Licenses can be found here: [third-party-programs.txt](/third-party-programs.txt).

DirectProgramming/C++SYCL_FPGA/Tutorials/DesignPatterns/stoppable_kernel/src/main.cpp renamed to DirectProgramming/C++SYCL_FPGA/Tutorials/DesignPatterns/restartable_kernel/src/main.cpp

Lines changed: 18 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@
55
#include <sycl/sycl.hpp>
66

77
#include "exception_handler.hpp"
8-
#include "stoppable_counter_kernel.hpp"
8+
#include "restartable_counter_kernel.hpp"
99

1010
// Forward declare the kernel name in the global scope. This is an FPGA best
1111
// practice that reduces name mangling in the optimization reports.
@@ -24,19 +24,23 @@ bool CheckIncrements(sycl::queue q, int count_start, int iterations,
2424

2525
if (should_flush) {
2626
std::cout << "Flush pipe until 'start of packet' is seen." << std::endl;
27+
}
28+
else {
29+
std::cout << "Start counting from " << expected_count << std::endl;
2730
}
28-
std::cout << "Start counting from " << expected_count << std::endl;
2931

3032
int flushed_count = 0;
3133
for (int itr = 0; itr < iterations; itr++) {
32-
stoppable_counter::OutputBeat beat = PipeType::read(q);
34+
restartable_counter::OutputBeat beat = PipeType::read(q);
3335

3436
// Flush the pipe in case we are starting fresh.
3537
if (should_flush && itr == 0) {
3638
while (beat.sop != true) {
3739
beat = PipeType::read(q);
3840
flushed_count++;
3941
}
42+
std::cout << "\tFlushed " << flushed_count << " beats." << std::endl;
43+
std::cout << "Start counting from " << expected_count << std::endl;
4044
}
4145

4246
int calculated_count = beat.data;
@@ -48,10 +52,6 @@ bool CheckIncrements(sycl::queue q, int count_start, int iterations,
4852
expected_count++;
4953
}
5054

51-
if (should_flush) {
52-
std::cout << "\tFlushed " << flushed_count << " beats." << std::endl;
53-
}
54-
5555
return passed;
5656
}
5757

@@ -81,38 +81,38 @@ int main() {
8181
<< std::endl;
8282
{
8383
int count_start = 7;
84-
std::cout << "\nStart kernel StoppableCounter at " << count_start << ". "
84+
std::cout << "\nStart kernel RestartableCounter at " << count_start << ". "
8585
<< std::endl;
8686

8787
// Capture the event so that we can stop the kernel later on
8888
sycl::event e = q.single_task<CounterID>(
89-
stoppable_counter::StoppableCounter{count_start});
89+
restartable_counter::RestartableCounter{count_start});
9090

91-
passed = CheckIncrements<stoppable_counter::OutputPipe>(q, count_start,
91+
passed = CheckIncrements<restartable_counter::OutputPipe>(q, count_start,
9292
kIterations);
9393

9494
int new_start = count_start + kIterations;
9595
// continue reading more results
96-
passed &= CheckIncrements<stoppable_counter::OutputPipe>(
96+
passed &= CheckIncrements<restartable_counter::OutputPipe>(
9797
q, new_start, kIterations, false);
9898

99-
std::cout << "Stop kernel StoppableCounter" << std::endl;
99+
std::cout << "Stop kernel RestartableCounter" << std::endl;
100100
// Write a `true` into `StopPipe` to instruct the kernel to break out of
101101
// its main loop, then wait for the kernel to complete.
102-
stoppable_counter::StopPipe::write(q, true);
102+
restartable_counter::StopPipe::write(q, true);
103103
e.wait();
104104
}
105105
{
106106
int count_start = 77;
107-
std::cout << "\nStart StoppableCounter at " << count_start << "."
107+
std::cout << "\nStart RestartableCounter at " << count_start << "."
108108
<< std::endl;
109109
sycl::event e = q.single_task<CounterID>(
110-
stoppable_counter::StoppableCounter{count_start});
111-
passed &= CheckIncrements<stoppable_counter::OutputPipe>(q, count_start,
110+
restartable_counter::RestartableCounter{count_start});
111+
passed &= CheckIncrements<restartable_counter::OutputPipe>(q, count_start,
112112
kIterations);
113113

114-
std::cout << "Stop kernel StoppableCounter" << std::endl;
115-
stoppable_counter::StopPipe::write(q, true);
114+
std::cout << "Stop kernel RestartableCounter" << std::endl;
115+
restartable_counter::StopPipe::write(q, true);
116116
e.wait();
117117
}
118118

0 commit comments

Comments
 (0)