@@ -17,7 +17,7 @@ Sparse Conjugate Gradient uses oneMKL sparse linear algebra routines to solve a
This sample performs its computations on the default SYCL* device. You can set the `SYCL_DEVICE_TYPE` environment variable to `cpu` or `gpu` to select the device to use.
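As a quick illustration of the device selection described above, the following shell snippet sets the environment variable before launching the sample (the binary name `sparse_cg` is an assumption based on the source file names, not confirmed by this document):

```shell
# Select the CPU as the SYCL device; valid values include cpu and gpu.
export SYCL_DEVICE_TYPE=cpu
echo "Running on: $SYCL_DEVICE_TYPE"
# ./sparse_cg   # hypothetical binary name; run the built sample here
```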
## Key Implementation Details
- oneMKL sparse routines use a two-stage method where the sparse matrix is analyzed to prepare subsequent calculations (the _optimize_ step). Sparse matrix-vector multiplication and triangular solves (`gemv` and `trsv`) are used to implement the main loop, along with vector routines from BLAS.
+ oneMKL sparse routines use a two-stage method where the sparse matrix is analyzed to prepare subsequent calculations (the _optimize_ step). Sparse matrix-vector multiplication and triangular solves (`gemv` and `trsv`) are used to implement the main loop, along with vector routines from BLAS. Two implementations are provided. The first, in `sparse_cg.cpp`, initiates a device-to-host copy and wait in several places so that the alpha and beta coefficients can be passed to the BLAS vector routines as host scalars. The second, in `sparse_cg2.cpp`, keeps the alpha and beta coefficients on the device, which requires custom `axpby2` and `axpy3` functions that construct the coefficients on the fly on the device. This removes some of the synchronization points present in the first implementation.
## Using Visual Studio Code* (Optional)
You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations,
@@ -62,12 +62,13 @@ Run `nmake` to build and run the sample. `nmake clean` removes temporary files.
## Running the Sparse Conjugate Gradient Sample
### Example of Output
- If everything is working correctly, the example program will rapidly converge to a solution and display the solution vector's first few entries. The test will run in both single and double precision (if available on the selected device).
+ If everything is working correctly, the example programs will rapidly converge to a solution. Each test will run in both single and double precision (if available on the selected device).
+ The first PCG implementation, with host-side coefficients: