Commit eb8599b

Merge branch 'main' into dham/abstract_reduced_functional

2 parents 52a8957 + 232c520


File tree

13 files changed: +1892 -127 lines changed

Lines changed: 218 additions & 0 deletions

@@ -0,0 +1,218 @@

.. only:: html

   .. contents::

====================
Ensemble parallelism
====================

Ensemble parallelism means solving simultaneous copies of a model
with different coefficients, right-hand sides, or initial data, in
situations that require communication between the copies. Use cases
include ensemble data assimilation, uncertainty quantification, and
time parallelism. This manual section assumes some familiarity with
parallel programming with MPI.

The Ensemble communicator
=========================

In ensemble parallelism, we split the MPI communicator into a number
of spatial subcommunicators, each of which we refer to as an
ensemble member (shown in blue in the figure below). Within each
ensemble member, existing Firedrake functionality allows us to specify
finite element problems which use spatial parallelism across the
spatial subcommunicator in the usual way. Another set of
subcommunicators - the ensemble subcommunicators - then allows
communication between ensemble members (shown in grey in the figure
below). Together, the spatial and ensemble subcommunicators form a
Cartesian product over the original global communicator.

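This Cartesian-product structure can be sketched in plain Python. This
is only an illustration: it assumes a contiguous split of global ranks
into members, and the actual rank ordering is an implementation detail
of :class:`~.Ensemble`.

.. code-block:: python3

   def split_rank(global_rank, procs_per_member):
       # Map a global rank to (ensemble member index, spatial rank
       # within that member), assuming a contiguous block split.
       return divmod(global_rank, procs_per_member)

   # With 5 processes per member, global rank 7 corresponds to
   # spatial rank 2 on ensemble member 1.
   print(split_rank(7, 5))
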
.. figure:: images/ensemble.svg
   :align: center

   Spatial and ensemble parallelism for an ensemble with 5 members,
   each of which is executed in parallel over 5 processors.

The additional functionality required to support ensemble parallelism
is the ability to send instances of :class:`~.Function` from one
ensemble member to another. This is handled by the :class:`~.Ensemble`
class.

Each ensemble member must have the same spatial parallel domain
decomposition, so instantiating an :class:`~.Ensemble` requires a
communicator to split (usually, but not necessarily,
``MPI_COMM_WORLD``) plus the number of MPI processes to be used in
each member of the ensemble (5 in the figure above, and 2 in the
example code below). The number of ensemble members is implicitly
calculated by dividing the size of the original communicator by the
number of processes in each ensemble member. The total number of
processes launched by ``mpiexec`` must therefore be equal to the
product of the number of ensemble members with the number of
processes to be used for each ensemble member, and an exception will
be raised if this is not the case.

.. literalinclude:: ../../tests/firedrake/ensemble/test_ensemble_manual.py
   :language: python3
   :dedent:
   :start-after: [test_ensemble_manual_example 1 >]
   :end-before: [test_ensemble_manual_example 1 <]

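The sizing rule described above can be sketched in plain Python. This
is only an illustration of the arithmetic; the equivalent check
happens inside :class:`~.Ensemble` itself.

.. code-block:: python3

   def n_ensemble_members(global_size, procs_per_member):
       # Number of ensemble members implied by splitting global_size
       # ranks into members of procs_per_member ranks each.
       if global_size % procs_per_member != 0:
           raise ValueError("total rank count must be a multiple of "
                            "the number of processes per member")
       return global_size // procs_per_member

   # e.g. `mpiexec -n 10` with 2 processes per member gives 5 members
   print(n_ensemble_members(10, 2))
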
Then, the spatial sub-communicator ``Ensemble.comm`` must be passed
to :func:`~.mesh.Mesh` (possibly via the inbuilt mesh generators in
:mod:`~.utility_meshes`), so that it will be used by any
:func:`~.FunctionSpace` and :class:`~.Function` derived from the mesh.

.. literalinclude:: ../../tests/firedrake/ensemble/test_ensemble_manual.py
   :language: python3
   :dedent:
   :start-after: [test_ensemble_manual_example 2 >]
   :end-before: [test_ensemble_manual_example 2 <]

The ensemble sub-communicator is then available through the attribute
``Ensemble.ensemble_comm``.

.. literalinclude:: ../../tests/firedrake/ensemble/test_ensemble_manual.py
   :language: python3
   :dedent:
   :start-after: [test_ensemble_manual_example 3 >]
   :end-before: [test_ensemble_manual_example 3 <]

MPI communications across the spatial sub-communicator (i.e., within
an ensemble member) are handled automatically by Firedrake, whilst MPI
communications across the ensemble sub-communicator (i.e., between
ensemble members) are handled through methods of :class:`~.Ensemble`.
Currently send/recv, reductions, and broadcasts are supported, as well
as their non-blocking variants.

The rank of the ensemble member (``my_ensemble.ensemble_comm.rank``)
and the number of ensemble members (``my_ensemble.ensemble_comm.size``)
can be accessed via the ``ensemble_rank`` and ``ensemble_size``
attributes.

.. literalinclude:: ../../tests/firedrake/ensemble/test_ensemble_manual.py
   :language: python3
   :dedent:
   :start-after: [test_ensemble_manual_example 4 >]
   :end-before: [test_ensemble_manual_example 4 <]

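For reference, the blocking communication methods take the form below
(adapted from an earlier version of this manual; here ``u`` and
``usum`` are assumed to be :class:`~.Function` instances, and
``dest``, ``source``, and ``root`` are ranks in ``ensemble_comm``).

.. code-block:: python3

   # point-to-point communication between ensemble members
   my_ensemble.send(u, dest)
   my_ensemble.recv(u, source)

   # reductions over all ensemble members
   my_ensemble.reduce(u, usum, root)
   my_ensemble.allreduce(u, usum)

   # broadcast from one ensemble member to all others
   my_ensemble.bcast(u, root)
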
.. warning::

   In the ``Ensemble`` communication methods, each rank sends data
   only across the ``ensemble_comm`` that it is a part of. This
   assumes not only that the total mesh is identical on each ensemble
   member, but also that the ``ensemble_comm`` connects identical
   parts of the mesh on each ensemble member. Because of this, the
   spatial partitioning of the mesh on each ``Ensemble.comm`` must be
   identical.

EnsembleFunction and EnsembleFunctionSpace
==========================================

A :class:`~.Function` is logically collective over a single spatial
communicator ``Ensemble.comm``. However, for some applications we want
to treat multiple :class:`~.Function` instances on different ensemble
members as a single collective object over the entire global
communicator ``Ensemble.global_comm``. For example, in time-parallel
methods we may have a :class:`~.Function` for each timestep in a
timeseries, and each timestep may live on a separate ensemble member.
In this case we want to treat the entire timeseries as a single
object.

Firedrake implements this using :class:`~.EnsembleFunctionSpace`
and :class:`~.EnsembleFunction` (along with the dual objects
:class:`~.EnsembleDualSpace` and :class:`~.EnsembleCofunction`).
An :class:`~.EnsembleFunctionSpace` can be thought of as a mixed
function space which is parallelised across its *components*, as
opposed to just being parallelised in *space*, as would usually be
the case with a :func:`~.FunctionSpace`. Each component of an
:class:`~.EnsembleFunctionSpace` is a Firedrake :func:`~.FunctionSpace`
on a single spatial communicator.

To create an :class:`~.EnsembleFunctionSpace` you must provide an
:class:`~.Ensemble` and, on each spatial communicator, a list of
:func:`~.FunctionSpace` instances for the components on the local
``Ensemble.comm``. There can be a different number of local
:func:`~.FunctionSpace` instances on each ``Ensemble.comm``. In the
example below we create an :class:`~.EnsembleFunctionSpace` with two
components on the first ensemble member, and three components on
every other ensemble member. Note that, unlike a
:func:`~.FunctionSpace`, a component of an
:class:`~.EnsembleFunctionSpace` may itself be a
:func:`~.MixedFunctionSpace`.

.. literalinclude:: ../../tests/firedrake/ensemble/test_ensemble_manual.py
   :language: python3
   :dedent:
   :start-after: [test_ensemble_manual_example 5 >]
   :end-before: [test_ensemble_manual_example 5 <]

Analogously to accessing the components of a :func:`~.MixedFunctionSpace`
via ``subspaces``, the :func:`~.FunctionSpace` for each local component
of an :class:`~.EnsembleFunctionSpace` can be accessed via
``EnsembleFunctionSpace.local_spaces``. Various other methods and
properties are also available, such as ``dual`` (to create an
:class:`~.EnsembleDualSpace`) and ``nglobal_spaces`` (the total number
of components across all ensemble members).

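As a plain-Python illustration of this bookkeeping (not Firedrake
code): for the five-member layout used above, with two components on
the first member and three on each of the others, ``nglobal_spaces``
would be the sum of the local component counts.

.. code-block:: python3

   nlocal_spaces = [2, 3, 3, 3, 3]   # len(local_spaces) on each member
   nglobal_spaces = sum(nlocal_spaces)
   print(nglobal_spaces)
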
An :class:`~.EnsembleFunction` and :class:`~.EnsembleCofunction` can be
created from the :class:`~.EnsembleFunctionSpace`. These have a
``subfunctions`` property that can be used to access the components on
the local ensemble member. Each element in
``EnsembleFunction.subfunctions`` is itself just a normal Firedrake
:class:`~.Function`. If a component of the ``EnsembleFunctionSpace``
is a ``MixedFunctionSpace``, then the corresponding component in
``EnsembleFunction.subfunctions`` will be a mixed ``Function`` in
that ``MixedFunctionSpace``.

.. literalinclude:: ../../tests/firedrake/ensemble/test_ensemble_manual.py
   :language: python3
   :dedent:
   :start-after: [test_ensemble_manual_example 6 >]
   :end-before: [test_ensemble_manual_example 6 <]

:class:`~.EnsembleFunction` and :class:`~.EnsembleCofunction` have
a range of methods equivalent to those of :class:`~.Function` and
:class:`~.Cofunction`, such as ``assign``, ``zero``,
``riesz_representation``, and arithmetic operators such as ``+`` and
``+=``. These act component-wise on each local component.

Because the components in ``EnsembleFunction.subfunctions``
(``EnsembleCofunction.subfunctions``) are just :class:`~.Function`
(:class:`~.Cofunction`) instances, they can be used directly
with variational forms and solvers. In the example code below,
we create a :class:`~.LinearVariationalSolver` where the right
hand side is a component of an :class:`~.EnsembleCofunction`,
and the solution is written into a component of an
:class:`~.EnsembleFunction`. Using the ``subfunctions``
directly like this can simplify ensemble code and reduce
unnecessary copies.

Note that the ``options_prefix`` is set using both the local ensemble
rank and the index of the local space, which means that separate
PETSc parameters can be passed from the command line to the solver
on each ensemble member.

.. literalinclude:: ../../tests/firedrake/ensemble/test_ensemble_manual.py
   :language: python3
   :dedent:
   :start-after: [test_ensemble_manual_example 7 >]
   :end-before: [test_ensemble_manual_example 7 <]

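The per-solver prefix construction described above follows a pattern
like the one below. The names here are illustrative, not the exact
strings used in the example file.

.. code-block:: python3

   ensemble_rank = 0   # my_ensemble.ensemble_rank on the first member
   space_index = 1     # index into EnsembleFunctionSpace.local_spaces
   options_prefix = f"ensemble_{ensemble_rank}_space_{space_index}_"

   # PETSc options for just this solver can then be passed as e.g.
   #   -ensemble_0_space_1_ksp_type cg
   print(options_prefix)
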
.. warning::

   Although the ``Function`` (``Cofunction``) instances in
   ``EnsembleFunction.subfunctions`` (``EnsembleCofunction.subfunctions``)
   can be used in UFL expressions, ``EnsembleFunction`` and
   ``EnsembleCofunction`` themselves do not carry any symbolic
   information, so they cannot be used in UFL expressions.

Internally, the :class:`~.EnsembleFunction` creates a ``PETSc.Vec``
on the ``Ensemble.global_comm`` which contains the data for all
local components on all ensemble members. This ``Vec`` can be accessed
with a context manager, similarly to the ``Function.dat.vec`` context
managers used to access :class:`~.Function` data. There are also
analogous ``vec_ro`` and ``vec_wo`` context managers for read-only
and write-only access. Note, however, that unlike the
``Function.dat.vec`` context managers, the ``EnsembleFunction.vec``
context managers must be called, i.e. ``vec()`` not ``vec``.

.. literalinclude:: ../../tests/firedrake/ensemble/test_ensemble_manual.py
   :language: python3
   :dedent:
   :start-after: [test_ensemble_manual_example 8 >]
   :end-before: [test_ensemble_manual_example 8 <]

docs/source/manual.rst

Lines changed: 1 addition & 0 deletions

@@ -27,5 +27,6 @@ Manual
    preconditioning
    petsc-interface
    parallelism
+   ensemble_parallelism
    zenodo
    optimising

docs/source/parallelism.rst

Lines changed: 9 additions & 77 deletions

@@ -84,89 +84,21 @@ different simulations on the two halves we would write.

     ...

+.. note::
+
+   If you need to create Firedrake meshes on different communicators,
+   then usually the best approach is to use the :class:`~.Ensemble`,
+   which manages splitting MPI communicators and communicating
+   :class:`~.Function` objects between the split communicators. More
+   information on using the :class:`~.Ensemble` can be found
+   :doc:`here <ensemble_parallelism>`.
+
 To access the communicator a mesh was created on, we can use the
 ``mesh.comm`` property, or the function ``mesh.mpi_comm``.

 .. warning::
    Do not use the internal ``mesh._comm`` attribute for communication.
    This communicator is for internal Firedrake MPI communication only.

-
-Ensemble parallelism
-====================
-
-Ensemble parallelism means solving simultaneous copies of a model
-with different coefficients, RHS or initial data, in situations that
-require communication between the copies. Use cases include ensemble
-data assimilation, uncertainty quantification, and time parallelism.
-
-In ensemble parallelism, we split the MPI communicator into a number of
-subcommunicators, each of which we refer to as an ensemble
-member. Within each ensemble member, existing Firedrake functionality
-allows us to specify the FE problem solves which use spatial
-parallelism across the subcommunicator in the usual way. Another
-set of subcommunicators then allow communication between ensemble
-members.
-
-.. figure:: images/ensemble.svg
-   :align: center
-
-   Spatial and ensemble paralellism for an ensemble with 5 members,
-   each of which is executed in parallel over 5 processors.
-
-The additional functionality required to support ensemble parallelism
-is the ability to send instances of :class:`~.Function` from one
-ensemble to another. This is handled by the :class:`~.Ensemble`
-class. Instantiating an ensemble requires a communicator (usually
-``MPI_COMM_WORLD``) plus the number of MPI processes to be used in
-each member of the ensemble (5, in the case of the example
-below). Each ensemble member will have the same spatial parallelism
-with the number of ensemble members given by dividing the size of the
-original communicator by the number processes in each ensemble
-member. The total number of processes launched by ``mpiexec`` must
-therefore be equal to the product of number of ensemble members with
-the number of processes to be used for each ensemble member.
-
-.. code-block:: python3
-
-    from firedrake import *
-
-    my_ensemble = Ensemble(COMM_WORLD, 5)
-
-Then, the spatial sub-communicator must be passed to :func:`~.mesh.Mesh` (or via
-inbuilt mesh generators in :mod:`~.utility_meshes`), so that it will then be used by function spaces
-and functions derived from the mesh.
-
-.. code-block:: python3
-
-    mesh = UnitSquareMesh(20, 20, comm=my_ensemble.comm)
-    x, y = SpatialCoordinate(mesh)
-    V = FunctionSpace(mesh, "CG", 1)
-    u = Function(V)
-
-The ensemble sub-communicator is then available through the attribute ``Ensemble.ensemble_comm``.
-
-.. code-block:: python3
-
-    q = Constant(my_ensemble.ensemble_comm.rank + 1)
-    u.interpolate(sin(q*pi*x)*cos(q*pi*y))
-
-MPI communications across the spatial sub-communicator (i.e., within
-an ensemble member) are handled automatically by Firedrake, whilst MPI
-communications across the ensemble sub-communicator (i.e., between ensemble
-members) are handled through methods of :class:`~.Ensemble`. Currently
-send/recv, reductions and broadcasts are supported, as well as their
-non-blocking variants.
-
-.. code-block:: python3
-
-    my_ensemble.send(u, dest)
-    my_ensemble.recv(u, source)
-
-    my_ensemble.reduce(u, usum, root)
-    my_ensemble.allreduce(u, usum)
-
-    my_ensemble.bcast(u, root)
-
 .. _MPI: http://mpi-forum.org/
 .. _STREAMS: http://www.cs.virginia.edu/stream/

firedrake/adjoint_utils/__init__.py

Lines changed: 1 addition & 0 deletions

@@ -12,3 +12,4 @@
 from firedrake.adjoint_utils.solving import * # noqa: F401
 from firedrake.adjoint_utils.mesh import * # noqa: F401
 from firedrake.adjoint_utils.checkpointing import * # noqa: F401
+from firedrake.adjoint_utils.ensemble_function import * # noqa: F401
