Commit e7dd9a9

Merge pull request #39 from ohearnk/QUICK-25.03-docs
Documentation updates for QUICK-25.03 release
2 parents 84c68e2 + 2e249e6 commit e7dd9a9

12 files changed: 103 additions and 96 deletions

docs/source/about.rst

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
About QUICK QM Package
22
======================
33

4-
QUICK is a GPU enabled *ab initio* and density functional theory software
4+
QUICK is a GPU-enabled *ab initio* and density functional theory software
55
capable of performing electronic structure calculations on general
66
organic/biomolecular systems. It was initially developed by Ed Brothers. His
77
work included the development of the Hartree-Fock and density functional theory
@@ -17,8 +17,8 @@ aspects to improve the CUDA versions. Gina Sitaraman, Leopold Grinberg, Mahdieh
1717
Ghazimirsaeed, and Trinayan Baruah from AMD contributed by providing important
1818
suggestions to improve the performance of the HIP versions.
1919

20-
Madu Manathunga, Kurt A. O'Hearn, Akhil Shajan, Andy Götz, and Kennie Merz
21-
currently develop and maintain the code.
20+
Kurt A. O'Hearn, Vikrant Tripathy, Akhil Shajan, Madu Manathunga, Andy Götz,
21+
and Kennie Merz currently develop and maintain the code.
2222

2323
Contact: `quick.merzlab@gmail.com <quick.merzlab@gmail.com>`_
2424

docs/source/all-quick-documentations.rst

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@ All QUICK documentation versions
22
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
33

44
• `QUICK development version (active development, unreleased) <https://quick-docs.readthedocs.io/en/latest/>`_
5+
• `QUICK-25.03 <https://quick-docs.readthedocs.io/en/25.3.0/>`_ (also released with AmberTools25)
56
• `QUICK-24.03 <https://quick-docs.readthedocs.io/en/24.3.0/>`_ (also released with AmberTools24)
67
• `QUICK-23.08 <https://quick-docs.readthedocs.io/en/23.8.0/>`_
78
• `QUICK-22.03 <https://quick-docs.readthedocs.io/en/22.3.0/>`_ (also released with AmberTools22)

docs/source/basis-sets.rst

Lines changed: 10 additions & 4 deletions
@@ -31,9 +31,15 @@
3131
aug-PC-1
3232
aug-PC-2
3333
34-
Note 1: We follow the same basis set names reported at the `basis set exchange web page <https://www.basissetexchange.org/>`_.
34+
Note 1: We follow the same basis set names reported at the
35+
`basis set exchange web page <https://www.basissetexchange.org/>`_.
3536

36-
Note 2: The current version of the QUICK ERI engine only support basis functions up to *f*. Therefore, energy and gradient calculations with functions up to *f* are possible. By default, *f* functions are disabled in the GPU code. Open-shell gradient calculations with *f* functions are not yet available on GPU.
37-
38-
Note 3: ECPs are currently not supported by QUICK. Due to this reason, we have excluded elements that require ECPs from the above basis sets that are included with QUICK.
37+
Note 2: The current version of the QUICK two-electron repulsion integral (ERI)
38+
engine only supports basis functions up to *f*. Therefore, energy and gradient
39+
calculations with functions up to *f* are possible. By default, *f* functions
40+
are disabled in the GPU code. Open-shell gradient calculations with *f*
41+
functions are not yet available on GPU.
3942

43+
Note 3: Effective core potentials (ECPs) are currently not supported by QUICK.
44+
For this reason, we have excluded elements that require ECPs from the above
45+
basis sets that are included with QUICK.

docs/source/cmake-options.rst

Lines changed: 9 additions & 4 deletions
@@ -1,12 +1,15 @@
11
CMake Build System Options
22
^^^^^^^^^^^^^^^^^^^^^^^^^^
33

4-
This page gives a summary of CMake options that can be used with QUICK. Note that like all CMake options, these options are sticky. Once passed to CMake, they will remain set unless you set them to a different value (with -D), unset them (with -U), or delete the build directory.
4+
This page gives a summary of CMake options that can be used with QUICK. Note
5+
that like all CMake options, these options are sticky. Once passed to CMake,
6+
they will remain set unless you set them to a different value (with -D), unset
7+
them (with -U), or delete the build directory.
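The sticky behaviour described above can be sketched as a sequence of configure commands. This is a minimal dry-run illustration, not part of the QUICK documentation: the ``run`` helper is a hypothetical wrapper that only prints each command, and the choice of ``-DMPI`` as the example option is an assumption made for illustration.

```shell
# Hypothetical dry-run helper: prints each command instead of executing it.
# Replace the body with "$@" to run the commands for real from a build directory.
run() { printf '+ %s\n' "$*"; }

run cmake .. -DCOMPILER=GNU -DMPI=TRUE  # first configure: MPI is cached as TRUE
run cmake ..                            # reconfigure: the cached MPI=TRUE still applies
run cmake .. -DMPI=FALSE                # -D overrides the cached value
run cmake .. -UMPI                      # -U removes the cache entry, so the default applies again
```

Deleting the build directory discards the whole cache, which resets every option at once.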
58

69
General options
710
***************
811

9-
• *-DCOMPILER=<GNU|INTEL|AUTO>*: Allows selection of the compiler toolchain to use. *-DCOMPILER=AUTO* enables default CMake behaviour.
12+
• *-DCOMPILER=<GNU|CLANG|INTELLLVM|PGI|AUTO>*: Allows selection of the compiler toolchain to use. *-DCOMPILER=AUTO* enables default CMake behaviour. *NOTE:* the INTELLLVM and PGI options should be used for the Intel oneAPI and NVIDIA HPC SDK (NVHPC) compilers, respectively. For Clang, the Fortran compiler (flang) is incompatible with the QUICK code, so a mixed GNU/Clang build is performed (Clang for C/C++, GCC's gfortran for Fortran).
1013
• *-DENABLEF=TRUE*: Enables the compilation of time consuming F functions in the ERI code of the GPU versions. **NOTE**: The current version of the F function code takes very long to compile (hours) and requires a large amount of RAM. Work is planned to optimize this in future releases.
1114
• *-DCMAKE_BUILD_TYPE=<Debug|Release>*: Controls whether to build debug or release versions.
1215
• *-DOPTIMIZE=<TRUE|FALSE>*: Controls whether to enable compiler optimizations. On by default.
@@ -20,14 +23,16 @@ External library control
2023
• *-DFORCE_INTERNAL_LIBS=blas*: Forces use of the internal BLAS library even if a system one is available.
2124
• *-DFORCE_DISABLE_LIBS=mkl*: Disable use of system MKL to replace BLAS and LAPACK.
2225
• *-DCMAKE_PREFIX_PATH=<path>*: Use the given path as a prefix where dependencies are installed. Libraries and headers will be searched for in <path>/lib and <path>/include.
23-
• *-DMKL_HOME=<path>*: Look for Intel MKL in the given directory. The environment variable MKL_HOME is also searched.
26+
• *-DMKL_HOME=<path>*: Look for Intel MKL in the given directory. The environment variable MKL_HOME is also searched. *NOTE:* When using this flag, the additional flag *-DTRUST_SYSTEM_LIBS=TRUE* must also be appended.
27+
• *-DMKL_MULTI_THREADED=<TRUE|FALSE>*: Specify whether Intel MKL should be used in single-threaded or multi-threaded mode.
2428
• *-DMAGMA=TRUE*: Enable matrix diagonalization using Magma library in HIP/HIP-MPI version.
2529
• *-DMAGMA_PATH=<path>*: Look for Magma library in the given directory.
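As a hedged sketch of the MKL note in the list above, the function below assembles the flags that the note says must be passed together. The function name ``mkl_cmake_args`` and the default installation path are hypothetical, and the reading that ``-DMKL_MULTI_THREADED=FALSE`` selects single-threaded MKL is an assumption.

```shell
# Print the CMake arguments for an MKL-backed configure step.
# $1: MKL installation prefix (the default below is a hypothetical example path).
mkl_cmake_args() {
    mkl_home="${1:-/opt/intel/oneapi/mkl/latest}"
    # -DTRUST_SYSTEM_LIBS=TRUE accompanies -DMKL_HOME, as the note requires.
    printf '%s ' "-DMKL_HOME=$mkl_home" "-DTRUST_SYSTEM_LIBS=TRUE"
    # Assumption: FALSE requests single-threaded MKL.
    printf '%s\n' "-DMKL_MULTI_THREADED=FALSE"
}

# Dry run: show the configure command. Remove "echo" to execute it.
echo cmake .. $(mkl_cmake_args)
```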
2630

2731
Parallel versions
2832
*****************
2933

30-
By default QUICK will only build the serial version. This can be changed with these options:
34+
By default QUICK will only build the serial version. This can be changed with
35+
these options:
3136

3237
• *-DMPI=TRUE*: Also build MPI versions of all programs.
3338
• *-DCUDA=TRUE*: Also build CUDA versions of all programs. If both MPI and CUDA are active at the same time, an MPI+CUDA version will additionally be built.

docs/source/developer-guide.rst

Lines changed: 3 additions & 2 deletions
@@ -486,8 +486,9 @@ Note 1: Current version of QUICK ERI engine only support basis functions up to
486486
*d* (up to f support for CUDA/MPI+CUDA if enabled). Therefore, do not add high
487487
angular momentum basis sets and attempt to use f/g functions.
488488

489-
Note 2: ECPs are not supported by |QUICK_VERSION|. Therefore care must be taken
490-
not to add elements that require ECPs as this would lead to wrong results.
489+
Note 2: Effective core potentials (ECPs) are not supported by |QUICK_VERSION|.
490+
Therefore care must be taken not to add elements that require ECPs as this
491+
would lead to wrong results.
491492

492493
Adding new test cases into test suite
493494
-------------------------------------

docs/source/features-limitations.rst

Lines changed: 2 additions & 3 deletions
@@ -19,8 +19,8 @@ Features
1919
• Supports QM/MM calculations with Amber22 and later
2020
• Fortran API to use QUICK as QM energy and force engine
2121
• Message Passing Interface (MPI) distributed parallelization for CPU platforms
22-
• Massively parallel, single GPU implementation via CUDA and HIP for Nvidia and AMD GPUs (HIP available in QUICK-23.08, currently disabled)
23-
• Distributed, multi-GPU support via MPI+CUDA/MPI+HIP, also across multiple compute nodes
22+
• Massively parallel, single GPU implementation via CUDA and HIP for NVIDIA and AMD GPUs
23+
• Distributed, multi-node, multi-GPU support via MPI+CUDA/MPI+HIP codes
2424

2525
Limitations
2626
***********
@@ -32,6 +32,5 @@ Limitations
3232
• Effective core potentials (ECPs) are not supported
3333
• DFT calculations are performed exclusively using the SG1 grid system
3434
• No meta-GGA nor range-separated hybrid functionals are supported at present
35-
• HIP/MPI+HIP support disabled for this release due to GPU code rewrites (f basis function support), please use QUICK version 23.08b for HIP support
3635

3736
*Last updated by Andreas Goetz on 04/25/2024.*

docs/source/installation-guide.rst

Lines changed: 50 additions & 61 deletions
@@ -4,75 +4,64 @@ Installation Guide
44
==================
55

66
QUICK has been compiled and tested on x86 and ARM CPU architectures, and on
7-
Nvidia and AMD GPU architectures.
7+
NVIDIA and AMD GPU architectures.
88

9-
**NOTE:** For GPU builds, the compilation of the GPU enabled ERI code can take
10-
a significant amount of time (several minutes for default builds and several
11-
hours for f-function basis set support) - be patient, the compiler is working
12-
hard to generate lightning fast code for you.
13-
14-
**NOTE:** HIP/MPI+HIP support is disabled for this release. Please use QUICK
15-
version 23.08b for HIP support
9+
**NOTE:** For GPU builds, the compilation of the GPU-enabled two-electron
10+
repulsion integral (ERI) code can take a significant amount of time (several
11+
minutes for default builds and several hours for f-function basis set support).
12+
Please be patient while the compiler works to generate highly performant code.
1613

1714
Compatible Compilers and Hardware
1815
---------------------------------
1916

20-
In general QUICK works well with a range of compilers (GNU, Intel), math
21-
libraries (Intel MKL, reference BLAS/LAPACK, MAGMA), MPI implementations
22-
(OpenMPI, Intel MPI), and GPU SDK versions (CUDA and ROCm/HIP). We have
23-
specifically tested |QUICK_VERSION| with following compilers, libraries, and
24-
tools.
25-
26-
**Linux:**
27-
28-
1. GNU GCC v7.3.0; OpenMPI v3.1.1; CUDA v9.2.88; CMake v3.11.4
29-
2. GNU GCC v8.3.0; OpenMPI v3.1.4; CUDA v10.2.89; CMake v3.15.1
30-
3. GNU GCC v11.3.0; OpenMPI v4.1.4; CUDA v11.8; CMake v3.23.1
31-
4. GNU GCC v12.3.0; OpenMPI v5.0.0; CUDA v12.3; CMake v3.26.3
32-
5. Clang v14.0 / GNU GCC v11.3.0 (Fortran); OpenMPI v3.1.4; CUDA v10.2.89; CMake v3.15.3
33-
6. Intel v2021b; Intel MPI v2021b; CUDA v11.8; CMake v3.23.1
34-
7. Intel OneAPI/LLVM v2022.2.1; Intel MPI v2022.2.1; CMake v3.18.4
35-
36-
**NOTE:** QUICK GPU builds require at least CUDA v7.x and ROCm v5.1.x for CUDA
37-
and HIP versions, respectively. Please consult the Release Notes for the
38-
respective GPU SDKs on supported GPU devices.
39-
40-
|QUICK_VERSION| CUDA version has been tested on the following GPU cards: A100,
41-
RTX3080Ti, RTX2080Ti, RTX8000, RTX6000, RTX2080, T4, V100, Titan V, P100, M40,
42-
GTX1080, K80, and K40.
43-
44-
|QUICK_VERSION| HIP version is currently disabled. Please use QUICK-23.08b if you want to use AMD GPUs.
45-
46-
.. |QUICK_VERSION| HIP version has been tested on the following GPU cards: MI100,
47-
MI210, and MI250. As of QUICK-23.03, the performance on MI210 and MI250 cards is
48-
not optimized but the code runs properly.
17+
In general QUICK works well with a range of compilers (GNU, Clang, Intel, NVHPC
18+
SDK/PGI), math libraries (Intel MKL, reference BLAS/LAPACK, MAGMA), MPI
19+
implementations (OpenMPI, MPICH, Intel MPI), and GPU SDK versions (CUDA,
20+
ROCm/HIP). |QUICK_VERSION| is automatically tested on GitHub with the following
21+
combinations of OS versions, compilers, libraries, and tools:
22+
23+
- Ubuntu v22.04.05 (x86_64), GNU GCC v10.5.0; OpenMPI v4.1.2; CMake v3.31.6
24+
- Ubuntu v22.04.05 (x86_64), GNU GCC v11.4.0; OpenMPI v4.1.6; CMake v3.31.6
25+
- Ubuntu v24.04.2 (x86_64), GNU GCC v12.3.0; OpenMPI v4.1.6; CMake v3.31.6
26+
- Ubuntu v24.04.2 (x86_64), GNU GCC v13.3.0; OpenMPI v4.1.6; CMake v3.31.6
27+
- Ubuntu v24.04.2 (x86_64), GNU GCC v14.2.0; OpenMPI v4.1.6; CMake v3.31.6
28+
- Ubuntu v24.04.2 (x86_64), GNU GCC v14.2.0; MPICH v4.2.0; CMake v3.31.6
29+
- Ubuntu v24.04.2 (x86_64), Clang v17.0.6; OpenMPI v4.1.6; CMake v3.31.6
30+
- Ubuntu v24.04.2 (x86_64), Clang v18.1.3; OpenMPI v4.1.6; CMake v3.31.6
31+
- Ubuntu v24.04.2 (x86_64), Intel oneAPI v2024.2.1; Intel MPI (CCL) v2021.14; CMake v3.31.6
32+
- Ubuntu v24.04.2 (x86_64), Intel oneAPI v2025.0.1; Intel MPI (CCL) v2021.14; CMake v3.31.6
33+
- Ubuntu v24.04.2 (x86_64), NVIDIA HPC SDK v25.1 (PGI); OpenMPI v4.1.7rc1; CMake v3.31.6
34+
- Ubuntu v24.04.2 (ARM), GNU GCC v14.2.0; OpenMPI v4.1.6; CMake v3.31.6
35+
- Ubuntu v24.04.2 (ARM), GNU GCC v14.2.0; MPICH v4.2.0; CMake v3.31.6
36+
- macOS 13 (x86_64), GNU GCC v14.2.0_1; OpenMPI v; CMake v3.31.6 (Homebrew)
37+
- macOS 13 (x86_64), GNU GCC v15.0.7; OpenMPI v; CMake v3.31.6 (Homebrew)
38+
- macOS 14 (ARM), GNU GCC v14.2.0_1; OpenMPI v; CMake v3.31.6 (Homebrew)
39+
- macOS 14 (ARM), GNU GCC v15.0.7; OpenMPI v; CMake v3.31.6 (Homebrew)
40+
41+
**NOTE:** QUICK GPU builds require CUDA >= v7.x for the CUDA versions and ROCm
42+
<= v5.4.2 or >= v6.2.1 for the HIP versions. Please consult the Release Notes for
43+
the respective GPU SDKs on supported GPU devices and compatible software
44+
dependencies (compilers, etc.).
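The ROCm requirement in the note above excludes the versions between v5.4.2 and v6.2.1. A minimal sketch of that check follows, assuming the version string is supplied by the caller. The function names are hypothetical, and ``sort -V`` (version sort) is a GNU coreutils feature that may be missing from other ``sort`` implementations.

```shell
# version_le A B: succeed if version A <= version B.
# Relies on "sort -V", which orders dotted version strings numerically.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# rocm_supported VERSION: succeed if VERSION <= 5.4.2 or VERSION >= 6.2.1.
rocm_supported() {
    version_le "$1" "5.4.2" || version_le "6.2.1" "$1"
}

# Example usage with an illustrative version string.
if rocm_supported "6.3.0"; then
    echo "ROCm 6.3.0 is within the supported range"
else
    echo "ROCm 6.3.0 is outside the supported range"
fi
```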
45+
46+
|QUICK_VERSION| CUDA version has been tested on the following GPUs: H200, H100,
47+
A100, RTX3080Ti, RTX2080Ti, RTX8000, RTX6000, RTX2080, T4, V100, Titan V, P100,
48+
M40, GTX1080, K80, and K40.
49+
50+
|QUICK_VERSION| HIP version has been tested on the following GPUs: MI100,
51+
MI210, MI250, and MI300A.
4952

5053
**NOTE:** We recommend that the CUDA/MPI+CUDA and HIP/MPI+HIP versions be
5154
executed only on dedicated GPU cards where no other tasks are being run.
52-
Performance is better on datacenter GPUs than on consumer GPUs. For MPI+CUDA
53-
and MPI+HIP versions, we also recommend that only one CPU core (MPI task) is
54-
used per GPU; this can be done by setting the number of processes (*e.g.*, in
55-
the *mpirun* command) equal to the number of GPUs.
56-
57-
**Intel-based Macbooks:**
58-
59-
Software stack (compiler installed via Macports):
60-
61-
1. macOS 11.7.3; GNU/10.4.0, 11.3.0, 12.2.0; OpenMPI 4.1.4
62-
2. macOS 13.2; GNU 12.2.0, OpenMPI 4.1.4
63-
64-
**ARM-based Macbooks (M3 Pro CPU):**
65-
66-
Software stack (compiler installed via Macports):
67-
68-
1. macOS Sonoma 14.4.1; GNU GCC 12.3.0; OpenMPI 4.1.6
69-
2. macOS Sonoma 14.4.1; GNU GCC (Fortran); Clang 17.0.6; OpenMPI 4.1.6
55+
Performance is better on datacenter GPUs than on consumer GPUs. For the
56+
MPI+CUDA and MPI+HIP versions, we also recommend that only one CPU core (MPI
57+
process) is used per GPU; this can be done by setting the number of processes
58+
(*e.g.*, in the *mpirun* command) equal to the number of GPUs.
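One way to follow the one-process-per-GPU recommendation is to derive the process count from the list of visible GPUs. The sketch below makes several assumptions: that the GPUs are listed in ``CUDA_VISIBLE_DEVICES`` as a comma-separated list, that the MPI+CUDA executable is named ``quick.cuda.MPI``, and that ``water.in`` stands in for an input file. The helper ``count_gpus`` is hypothetical.

```shell
# count_gpus LIST: count the entries in a comma-separated GPU list such as "0,1,2,3".
count_gpus() {
    # Put one entry on each line, then count the non-empty lines.
    printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# Fall back to a single GPU (device 0) if the variable is unset.
gpus="${CUDA_VISIBLE_DEVICES:-0}"
nprocs=$(count_gpus "$gpus")

# Dry run: print the launch command. Remove "echo" to start the calculation.
# The executable and input file names are assumptions for illustration.
echo mpirun -np "$nprocs" quick.cuda.MPI water.in
```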
7059

7160

7261
Installation
7362
------------
7463

75-
Installation of QUICK requires that at least CMake/3.9.0 be installed in the
64+
Installation of QUICK requires that at least CMake v3.12.0 be installed on the
7665
target machine. To install QUICK using CMake, one must first create a build
7766
directory (separate from the source directory). After installation you can
7867
safely delete this build directory if you want to save disk space. Assuming the
@@ -89,7 +78,7 @@ CUDA version
8978

9079
Assuming you have created a directory named *builddir* in the ``QUICK_HOME``
9180
directory and you want to install QUICK into directory ``QUICK_INSTALL``, use
92-
GNU compiler tool chain, and want to compile for the Nvidia Volta
81+
GNU compiler tool chain, and want to compile for the NVIDIA Volta
9382
microarchitecture, all QUICK versions can be configured and built as follows:
9483

9584
.. code-block:: none
@@ -138,8 +127,9 @@ Path to ROCm installation can be specified using ``-DHIP_TOOLKIT_ROOT_DIR`` but
138127
this is optional. Flags ``-DMAGMA`` and ``-DMAGMA_ROOT`` are used to enable
139128
MAGMA library support for matrix diagonalization and specify the MAGMA
140129
installation directory, respectively. The use of MAGMA is optional but highly
141-
recommended since the diagonalization is performed on host (CPU) by default
142-
(which can be very slow).
130+
recommended for older ROCm versions (< v5.3.0) since matrix diagonalization is
131+
performed on host (CPU) in QUICK by default due to poor performance in the ROCm
132+
math libraries (rocSOLVER).
143133

144134
If the microarchitecture is not specified (i.e. absence of the
145135
``-DQUICK_USER_ARCH`` flag), QUICK will be compiled for gfx908 architecture. As of
@@ -194,7 +184,6 @@ here: `hands-on tutorials <hands-on-tutorials.html>`_.
194184
Uninstallation and Cleaning
195185
---------------------------
196186

197-
Simply delete contents inside build and install directories and/or delete the
198-
build and install directories.
187+
Delete the build and install directories and their contents.
199188

200189
*Last updated by Andreas Goetz on 04/25/2024.*

docs/source/known-issues.rst

Lines changed: 2 additions & 8 deletions
@@ -6,7 +6,8 @@ detected the issues listed below. If you find anything other than these, please
66
feel free to report any bugs or issues through our GitHub page:
77
`https://github.com/merzlab/QUICK/issues <https://github.com/merzlab/QUICK/issues>`_.
88

9-
Feel free to ask questions or start a discussion on the Discussions section of our GitHub page: `https://github.com/merzlab/QUICK/discussions <https://github.com/merzlab/QUICK/discussions>`_.
9+
Feel free to ask questions or start a discussion on the Discussions section of
10+
our GitHub page: `https://github.com/merzlab/QUICK/discussions <https://github.com/merzlab/QUICK/discussions>`_.
1011

1112
Compile time
1213
^^^^^^^^^^^^
@@ -26,13 +27,6 @@ Kepler targeted microarchitectures (<= v11.0 for sm_30, <= v11.8 for
2627
sm_35/sm_37). Please consult the Release Notes for your installed CUDA SDK
2728
version for further details on supported GPU microarchitectures.
2829

29-
2. Compiling HIP/MPI+HIP versions fails for this release (unsupported)
30-
**********************************************************************
31-
HIP/MPI+HIP support disabled for this release due to required GPU code rewrites
32-
(related to added f basis function support).
33-
34-
Solution: Use QUICK v23.08b for HIP/MPI+HIP support until support is restored.
35-
3630
Runtime
3731
^^^^^^^
3832

docs/source/performance.rst

Lines changed: 6 additions & 5 deletions
@@ -18,11 +18,12 @@ Accuracy of energies and gradients
1818
We have compared energies and gradients computed by QUICK with values computed
1919
by other quantum chemical packages. HF energies and gradients have displayed
2020
accuracies of 1.0E-6 Hartree and 1.0E-4 Hartree/Bohr or better,
21-
respectively, for test systems (see `https://github.com/merzlab/QUICK-tests
22-
<https://github.com/merzlab/QUICK-tests>`_ for test cases). DFT energies and
23-
gradients have shown similar accuracies in most cases, however, we have
24-
observed larger deviations for some molecular systems. Such deviations usually
25-
arise due to differences in the exchange correlation quadrature grid.
21+
respectively, for test systems (see
22+
`https://github.com/merzlab/QUICK-tests <https://github.com/merzlab/QUICK-tests>`_
23+
for test cases). DFT energies and gradients have shown similar accuracies in
24+
most cases; however, we have observed larger deviations for some molecular
25+
systems. Such deviations usually arise from differences in the
26+
exchange-correlation quadrature grid.
2627

2728

2829
Performance of QUICK CUDA single GPU and MPI parallel versions

docs/source/quick_docs_common.rst

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
1-
.. |QUICK_VERSION| replace:: QUICK-24.03
1+
.. |QUICK_VERSION| replace:: QUICK-25.03
