diff --git a/changelog b/changelog
Lines changed: 1 addition & 0 deletions
diff --git a/doc/user_guide/examples.rst b/doc/user_guide/examples.rst
Lines changed: 28 additions & 29 deletions
diff --git a/doc/user_guide/transformations.rst b/doc/user_guide/transformations.rst
Lines changed: 6 additions & 3 deletions
diff --git a/examples/lfric/README.md b/examples/lfric/README.md
Lines changed: 11 additions & 26 deletions
diff --git a/examples/lfric/code/dg_matrix_vector_kernel_mod.F90 b/examples/lfric/code/dg_matrix_vector_kernel_mod.F90
Lines changed: 1 addition & 1 deletion
diff --git a/examples/lfric/eg14/.gitignore b/examples/lfric/eg14/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/examples/lfric/eg14/Makefile b/examples/lfric/eg14/Makefile
Lines changed: 22 additions & 30 deletions
diff --git a/examples/lfric/eg14/README.md b/examples/lfric/eg14/README.md
Lines changed: 7 additions & 4 deletions
diff --git a/examples/lfric/eg14/acc_kernels.py b/examples/lfric/eg14/acc_kernels.py
Lines changed: 0 additions & 58 deletions
diff --git a/examples/lfric/eg14/acc_parallel.py b/examples/lfric/eg14/acc_parallel.py
Lines changed: 0 additions & 66 deletions
@@ -1,4 +1,5 @@
 	1) PR #1747 for #1720. Adds support for If blocks to PSyAD.
+	2) PR #1669 for #450. Remove set_dirty/clean from ACC regions
 
 release 2.3.0 9th June 2022
 
 
@@ -481,35 +481,34 @@ better job when optimising the code.
 Example 14: OpenACC
 ^^^^^^^^^^^^^^^^^^^
 
-Example of adding OpenACC directives in the dynamo0.3 API. This is a
-work in progress so the generated code may not work as
-expected. However it is never-the-less useful as a starting
-point. Three scripts are provided.
-
-The first script (``acc_kernels.py``) shows how to add OpenACC Kernels
-directives to the PSy-layer. This example only works with distributed
-memory switched off as the OpenACC Kernels transformation does not yet
-support halo exchanges within an OpenACC Kernels region.
-
-The second script (``acc_parallel.py``)shows how to add OpenACC Loop,
-Parallel and Enter Data directives to the PSy-layer. Again this
-example only works with distributed memory switched off as the OpenACC
-Parallel transformation does not support halo exchanges within an
-OpenACC Parallel region.
-
-The third script (``acc_parallel_dm.py``) is the same as the second
-except that it does support distributed memory being switched on by
-placing an OpenACC Parallel directive around each OpenACC Loop
-directive, rather than having one for the whole invoke. This approach
-avoids having halo exchanges within an OpenACC Parallel region.
-
-The generated code has a number of problems including 1) it does not
-modify the kernels to include the OpenACC Routine directive, 2) a
-loop's upper bound is computed via a derived type (this should be
-computed beforehand) 3) set_dirty and set_clean calls are placed
-within an OpenACC Parallel directive and 4) there are no checks on
-whether loops are parallel or not, it is just assumed they are -
-i.e. support for colouring or locking is not yet implemented.
+Example of adding OpenACC directives in the dynamo0.3 API.
+A single transformation script (``acc_parallel_dm.py``) is provided
+which demonstrates how to add OpenACC Loop, Parallel and Enter Data
+directives to the PSy-layer. It supports distributed memory being
+switched on by placing an OpenACC Parallel directive around each
+OpenACC Loop directive, rather than having one for the whole invoke.
+This approach avoids having halo exchanges within an OpenACC Parallel
+region. The script also uses :ref:`ACCRoutineTrans <available_kernel_trans>`
+to transform the one user-supplied kernel through
+the addition of an ``!$acc routine`` directive. This ensures that the
+compiler builds a version suitable for execution on the accelerator (GPU).
+
+The generated code has two problems:
+
+ 1. There are no checks on whether loops are safe to parallelise;
+    it is simply assumed that they are, i.e. support for colouring or
+    locking is not yet implemented.
+ 2. Although the user-supplied kernel is transformed so as to have the
+    necessary ``!$acc routine`` directive, the associated (but unnecessary)
+    ``use`` statement in the transformed Algorithm layer still uses the
+    name of the original, untransformed kernel (issue #1724).
+
+Since no colouring is required in this case, the generated Alg layer
+may be fixed by hand (by simply deleting the offending ``use`` statement)
+and the resulting code compiled and run on GPU. However, performance will
+be very poor as, with the limited optimisations and directives currently
+applied, the NVIDIA compiler refuses to run the user-supplied kernel in
+parallel.
 
 Example 15: CPU Optimisation of Matvec
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -522,6 +522,8 @@ variable that is available to it from the enclosing module scope.
 .. note:: these rules *only* apply to kernels that are the target of
       PSyclone kernel transformations.
 
+.. _available_kernel_trans:
+
 Available Kernel Transformations
 ++++++++++++++++++++++++++++++++
 
@@ -1011,9 +1013,10 @@ user-supplied kernel routines are called from within
 PSyclone-generated loops in the PSy layer. PSyclone therefore provides
 the ``ACCRoutineTrans`` transformation which, given a Kernel node in
 the PSyIR, creates a new version of that kernel with the ``routine``
-directive added. Again, please see PSyclone/examples/gocean/eg2 for an
-example. This transformation is currently not supported for kernels in
-the Dynamo0.3 API.
+directive added. See either PSyclone/examples/gocean/eg2 or
+PSyclone/examples/lfric/eg14 for an example (although please note that
+this transformation is not yet fully working for kernels in
+the LFRic (Dynamo0.3) API - see #1724).
 
 SIR
 ---
 
@@ -278,34 +278,19 @@ psyclone -s ./kernel_constants.py \
 ## Example 14: OpenACC
 
 This example shows how OpenACC directives can be added to the LFRic
-PSy-layer. This is work in progress so the resultant code is not
-expected to run correctly but it gives a starting point for
-evaluation.
+PSy-layer. It adds OpenACC enter data, parallel and loop directives in the
+presence of halo exchanges. It also transforms the (one) user-supplied
+kernel with the addition of an `ACC routine` directive.
 
-1. Adding OpenACC kernels directives. -nodm is used as an exception is
-   raised if Halo Exchange nodes are found within an OpenACC kernels
-   region.
-   ```sh
-   cd eg14/
-   psyclone -s ./acc_kernels.py -nodm main.x90
-   ```
-
-2. Adding OpenACC enter data, parallel and loop directives. -nodm is
-   used as an exception is raised if Halo Exchange nodes are found within
-   an OpenACC parallel region.
-   ```sh
-   cd eg14/
-   psyclone -s ./acc_parallel.py -nodm main.x90
-   ```
+```sh
+cd eg14/
+psyclone -s ./acc_parallel_dm.py main.x90
+```
 
-3. Adding OpenACC enter data, parallel and loop directives in the
-   presence of halo exchanges. This does not currently produce compilable code
-   because calls to set_clean()/dirty() end up within parallel regions - TODO
-   #450.
-   ```sh
-   cd eg14/
-   psyclone -s ./acc_parallel_dm.py main.x90
-   ```
+The supplied Makefile defines a `compile` target that will build the
+transformed code. Currently the compilation will fail because the
+generated Algorithm-layer code still contains a `use` statement naming
+the original, untransformed kernel module (issue #1724).
 
 ## Example 15: Optimise matvec Kernel for CPU
 
 
@@ -45,7 +45,7 @@ module dg_matrix_vector_kernel_mod
                                 GH_FIELD, GH_OPERATOR,      &
                                 GH_REAL, GH_READ, GH_WRITE, &
                                 ANY_DISCONTINUOUS_SPACE_1,  &
-                                ANY_SPACE_1,                &
+                                ANY_SPACE_1, GH_READWRITE,  &
                                 CELL_COLUMN
 
   use constants_mod,     only : r_def, i_def
 
@@ -1,3 +1,4 @@
 example_openacc
 main_alg.f90
 main_psy.f90
+testkern_w0_kernel_0_mod.f90
@@ -36,20 +36,22 @@
 
 # The compiler to use may be specified via the F90 and F90FLAGS
 # environment variables. To use the NVIDIA compiler and enable
-# openacc compilation, use:
+# openacc compilation with managed memory, use:
+#
 # export F90=nvfortran
-# export F90FLAGS="-acc"
+# export F90FLAGS="-acc=gpu -Minfo=all -gpu=managed"
 
 PSYROOT=../../..
 
 include $(PSYROOT)/examples/common.mk
 
-GENERATED_FILES = *.o *.mod  $(EXEC) main_alg.f90 main_psy.f90
+GENERATED_FILES = *.o *.mod $(EXEC) main_alg.f90 main_psy.f90 \
+                  testkern_w0_kernel_0_mod.f90
 
 F90 ?= gfortran
 F90FLAGS ?= -Wall -g
 
-OBJ = main_psy.o main_alg.o testkern_w0_kernel_mod.o
+OBJ = main_psy.o main_alg.o testkern_w0_kernel_0_mod.o
 
 EXEC = example_openacc
 
@@ -59,33 +61,27 @@ LFRIC_LIB=$(LFRIC_DIR)/lib$(LFRIC_NAME).a
 
 F90FLAGS += -I$(LFRIC_DIR)
 
-.PHONY: transformtransform_kernels transform_parallel transform_parallel_dm
-
-transform: transform_kernels transform_parallel transform_parallel_dm
-
-transform_kernels:
-	${PSYCLONE} -nodm -s ./acc_kernels.py \
-	-opsy main_psy.f90 -oalg main_alg.f90 main.x90
-
-transform_parallel:
-	${PSYCLONE} -nodm -s ./acc_parallel.py \
-	-opsy main_psy.f90 -oalg main_alg.f90 main.x90
+.PHONY: transform compile run
 
-transform_parallel_dm:
+# This makefile assumes that the transformed kernel will be named
+# 'testkern_w0_kernel_0_mod.f90'. However, if it already exists then PSyclone
+# will create 'testkern_..._1_mod.f90' so remove it first.
+transform:
+	rm -f testkern_w0_kernel_0_mod.f90
 	${PSYCLONE} -dm -s ./acc_parallel_dm.py \
 	-opsy main_psy.f90 -oalg main_alg.f90 main.x90
 
-
-%_psy.f90:	%.x90
+%_psy.f90: %.x90
 	${PSYCLONE} -s ./acc_parallel_dm.py \
 	-opsy $*_psy.f90 -oalg $*_alg.f90 $<
 
-#TODO #1669 - the code currently does not compile
-#             set_dirty calls inside openacc region
-#TODO #1694 - the code currently does not compile
-#             incorrect variable names and constants
-#             when using builtin
-compile: transform_parallel_dm $(EXEC)
+testkern_w0_kernel_0_mod.f90: main_psy.f90
+
+# TODO #1724 - compilation currently fails because module name in use
+# statement needs correcting following ACCRoutineTrans of Kernel.
+compile: transform
+	@echo "No compilation supported for lfric/eg14 due to #1724"
+
 
 run: compile
 	./$(EXEC)
@@ -97,8 +93,8 @@ $(LFRIC_LIB):
 	$(MAKE) -C $(LFRIC_DIR)
 
 # Dependencies
-main_psy.o:	testkern_w0_kernel_mod.o
-main_alg.o:	main_psy.o
+main_psy.o:	testkern_w0_kernel_0_mod.o
+main_alg.o:	main_psy.o testkern_w0_kernel_0_mod.o
 
 %.o: %.F90
 	$(F90) $(F90FLAGS) -c $<
@@ -111,9 +107,5 @@ main_alg.o:	main_psy.o
 
 main_alg.f90: main_psy.f90
 
-%_psy.f90:	%.x90
-	${PSYCLONE} -s ./acc_parallel_dm.py \
-	-opsy $*_psy.f90 -oalg $*_alg.f90 $<
-
 allclean: clean
 	$(MAKE) -C $(LFRIC_DIR) allclean
@@ -5,13 +5,16 @@ uses OpenACC. The framework for this stand-alone example is explained in
 more details in the directory
 ``<PSYCLONEHOME>/examples/lfric/eg17/full_example``.
 
-The script ``acc_parallel_dm.py`` applies the OpenACC transformation to all 
-kernels. See the [OpenACC](https://psyclone.readthedocs.io/en/stable/transformations.html#openacc)
-section of the PSyclone documentation for details about this transformation.
+The script ``acc_parallel_dm.py`` applies various OpenACC transformations
+to all kernels. See the PSyclone User Guide for [details](https://psyclone.readthedocs.io/en/stable/examples.html#example-14-openacc).
 
 ## Compilation
 
-A simple makefile is provided to compile the example. It needs:
+Note that due to #1724 compilation will currently fail. A temporary workaround
+is to edit the generated Alg file (``main_alg.f90``) and remove the
+``use testkern_w0_kernel_mod, only: ...`` line.
+
+A simple Makefile is provided to compile the example. It needs:
 - the infrastructure library ``liblfric.a`` provided in
   ``<PSYCLONEHOME>/src/psyclone/tests/test_files/dynamo0p3/infrastructure``
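The temporary workaround described in the eg14 README (deleting the stale `use testkern_w0_kernel_mod, only: ...` line from `main_alg.f90`) could be automated along the following lines. This is only a minimal sketch and not part of the PR: the helper names `strip_kernel_use` and `fix_alg_file` are hypothetical, and it assumes the generated `use` statement sits on a single line with no `&` continuation.

```python
import re
from pathlib import Path


def strip_kernel_use(source: str, module: str = "testkern_w0_kernel_mod") -> str:
    """Return `source` without any line that begins a `use <module>` statement.

    Assumption: the statement occupies a single line. A statement that is
    continued with '&' would need extra handling that is not shown here.
    """
    # Fortran is case-insensitive, so match 'use'/'USE' and any indentation.
    # The trailing \b stops 'testkern_w0_kernel_mod' from matching a longer
    # name such as 'testkern_w0_kernel_mod_extra'.
    pattern = re.compile(r"^\s*use\s+" + re.escape(module) + r"\b", re.IGNORECASE)
    kept = [line for line in source.splitlines(keepends=True)
            if not pattern.match(line)]
    return "".join(kept)


def fix_alg_file(path: str = "main_alg.f90") -> None:
    """Rewrite the generated Algorithm-layer file in place (hypothetical helper)."""
    alg = Path(path)
    alg.write_text(strip_kernel_use(alg.read_text()))
```

Calling `fix_alg_file()` after the `transform` target has produced `main_alg.f90` would apply the edit; the `use` statement for the transformed module (`testkern_w0_kernel_0_mod`) is left untouched because its name does not match.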