Running Fluidity in parallel
Fluidity is parallelized using MPI and standard domain decomposition techniques.
Meshes can be generated with any program that outputs triangle-format meshes, or a mesh format that can be converted to triangle. In particular, there are conversion programs for:
- Gmsh - gmsh2triangle, in tools (see the sketch after this list)
- GiD - gid2triangle, part of fltools
- GEM - gem2triangle, part of fltools in the legacy branch
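For example, a Gmsh mesh might be converted as follows (a minimal sketch; the geometry file name mesh.geo is hypothetical, and the exact steps depend on your geometry and mesh dimension):
# Generate a 3D mesh from the Gmsh geometry file; gmsh writes mesh.msh by default
gmsh -3 mesh.geo
# Convert the Gmsh mesh to triangle format (mesh.node, mesh.ele, ...)
gmsh2triangle mesh.msh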
All surfaces on which boundary conditions are applied should have appropriate boundary IDs, and if multiple regions are used then mesh regions should be assigned appropriate region IDs. See respective mesh generator pages for instructions on how to do this.
To be able to use fldecomp, run the following inside your fluidity folder:
make fltools
The fldecomp binary will then be created in the bin/ directory.
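As a quick sanity check, and to make the tools available for the current session, something like the following can be run from the Fluidity source directory (a sketch):
# Confirm the binary was built
ls bin/fldecomp
# Optionally put the Fluidity tools on your PATH for this session
export PATH=$PWD/bin:$PATH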
To decompose the triangle mesh, run:
fldecomp -m triangle -n [PARTS] [BASENAME]
where BASENAME is the triangle mesh base name (excluding extensions). "-m triangle" instructs fldecomp to perform a triangle-to-triangle decomposition. This creates PARTS partitioned triangle meshes together with PARTS .halo files.
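For example, to decompose a mesh whose files are backward_facing_step_3d.node, .ele and .face into four parts (the mesh name is illustrative, and the partition file names in the comment are indicative of the usual per-partition naming):
# Writes partitioned mesh files such as backward_facing_step_3d_0.node/.ele
# plus halo files such as backward_facing_step_3d_0.halo, one set per partition
fldecomp -m triangle -n 4 backward_facing_step_3d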
In the options file, select "triangle" under /geometry/mesh/from_file/format for the from_file mesh. For the mesh filename, enter the triangle mesh base name excluding all file and process number extensions.
Also:
- Remember to select parallel-compatible preconditioners in the prognostic field solver options; eisenstat is not suitable for parallel simulations.
To launch a parallel simulation using the new options format, add the options file to the Fluidity command line, e.g.:
mpiexec fluidity -v2 -l [OPTIONS FILE]
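For instance, an eight-process run (matching the PBS example below) might look like the following; the process-count flag shown is the standard mpiexec form, but check your MPI implementation's documentation:
# Run Fluidity on 8 MPI processes, verbosity 2, logging to files
mpiexec -n 8 fluidity -v2 -l backward_facing_step_3d.flml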
To run as a batch job on cx1, use something like the following PBS script:
#!/bin/bash
#Job name
#PBS -N backward_step
# Time required in hh:mm:ss
#PBS -l walltime=48:00:00
# Resource requirements
# Always try to specify exactly what you need and the PBS scheduler
# will make sure to get your job running as quickly as possible. If
# you ask for too much you could be waiting a while for sufficient
# resources to become available. Experiment!
#PBS -l select=2:ncpus=4
# Files to contain standard output and standard error
##PBS -o stdout
##PBS -e stderr
PROJECT=backward_facing_step_3d.flml
echo Working directory is $PBS_O_WORKDIR
cd $PBS_O_WORKDIR
rm -f stdout* stderr* core*
module load intel-suite
module load mpi
module load vtk
module load cgns
module load petsc/2.3.3-p1-amcg
module load python/2.4-fake
# Launch Fluidity in parallel on the allocated cores
mpiexec $PWD/fluidity -v2 -l $PWD/$PROJECT
This will run on 8 processors (2 * 4, from the line #PBS -l select=2:ncpus=4).
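Submitting and monitoring the job might then look like this (the script file name run_fluidity.pbs is hypothetical):
# Submit the script to the PBS queue
qsub run_fluidity.pbs
# Check the status of your jobs
qstat -u $USER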
The output from a parallel run is a set of .vtu and .pvtu files. A .vtu file is output for each processor and each timestep, e.g. backward_facing_step_3d_191_0.vtu is the .vtu file for timestep 191 from processor 0. A .pvtu file is generated for each timestep, e.g. backward_facing_step_3d_191.pvtu is for timestep 191.
The best way to view the output is using ParaView. Simply open the .pvtu file.
On cx1, you will need to load the paraview module: module load paraview/3.4.0
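For example (whether your ParaView build accepts a data file directly on the command line may depend on the version; if not, open the file from the GUI):
module load paraview/3.4.0
# Open the parallel VTK file for timestep 191
paraview backward_facing_step_3d_191.pvtu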
=== Limitations / Known Issues ===
- Only one from_file mesh may be specified.
=== Running with gem meshes (legacy) ===
The first step is to run flgem/gem as usual. Second, partition the output into subdomains; these partitions are the input files for parallel Fluidity. You should usually do this interactively, since the partitioning is serial, generally only needs to be done once (while you may want to rerun the parallel problem many times), and is usually not too expensive anyway. If the input is very large, you can run the partitioning on another machine with sufficient memory and scp the result back to the cluster you want to run on. For example:
flgem annulus.gem
fldecomp -n 16 annulus
This gems and then partitions the project annulus into 16 subdomains. To run this in parallel you must first modify your batch queue script to request 16 cores. If you are using PBS on the Imperial College cluster, you could do this with:
#PBS -l select=4:ncpus=4:mem=4950mb:icib=true
This also selects InfiniBand on the IC cluster. If you are at IC, you should use this option when running in parallel; otherwise you may be given compute nodes that only have an Ethernet interconnect, which will be slow. This does not matter, of course, if your parallel problem is small enough to fit inside a single SMP node. Finally, add a line to actually run Fluidity in parallel:
mpiexec ./dfluidity annulus
When you want to visualize dump files, you have two options. Either use:
fl2vtu annulus 1
which will create a parallel VTK file which you can visualize using something like:
mayavi -d annulus_1.pvtu -m SurfaceMap
(note the .pvtu). Alternatively, you can merge the partitions to form a single file:
fl2vtu -m annulus 1
mayavi -d annulus_1.vtu -m SurfaceMap
Be warned, though: this is not a good idea once your dump files get very large, in which case you should think about using ParaView for parallel visualization.
=== Running and debugging in parallel on a workstation ===
To run in parallel on a single workstation, create a host file that lists the machine once per process:
gormo@rex:~$ cat host_file
rex
rex
rex
rex
then launch Fluidity with mpirun, pointing it at the host file:
mpirun -np 4 --hostfile host_file $PWD/dfluidity tank.flml
To debug in parallel, first allow the compute host to open windows on your X display and check the DISPLAY variable:
xhost +rex
gormo@rex:~$ echo $DISPLAY
:0.0
then start one xterm running gdb per process, exporting DISPLAY to each of them:
mpirun -np 4 -x DISPLAY=:0.0 xterm -e gdb $PWD/dfluidity-debug