Commit ef38229

Add a for loop that is unrolled at compile time (#3674)
## Summary

The `constexpr_for` function is fully unrolled at compile time. This is useful for relatively short loops in which some of the functions called in the loop body are known to be evaluable at compile time and may be relatively expensive, so evaluating them at compile time rather than at runtime may improve performance.

## Additional background

This has been used successfully in AMReX-Astro/Microphysics to evaluate some nuclear reaction network quantities at compile time.

## Checklist

The proposed changes:

- [ ] fix a bug or incorrect behavior in AMReX
- [x] add new capabilities to AMReX
- [ ] changes answers in the test suite to more than roundoff level
- [ ] are likely to significantly affect the results of downstream AMReX users
- [ ] include documentation in the code and/or rst files, if appropriate

1 parent d1e55fb commit ef38229

1 file changed: +24 -0 lines changed

Src/Base/AMReX_Loop.H

Lines changed: 24 additions & 0 deletions
@@ -211,6 +211,30 @@ void LoopConcurrentOnCpu (Box const& bx, int ncomp, F&& f) noexcept
     }}}}
 }
 
+// Implementation of "constexpr for" based on
+// https://artificial-mind.net/blog/2020/10/31/constexpr-for
+//
+// Approximates what one would get from a compile-time
+// unrolling of the loop
+//     for (int i = 0; i < N; ++i) {
+//         f(i);
+//     }
+//
+// The mechanism is recursive: we evaluate f(i) at the current
+// i and then call the for loop at i+1. f() is a lambda function
+// that provides the body of the loop and takes only an integer
+// i as its argument.
+
+template<auto I, auto N, class F>
+AMREX_GPU_HOST_DEVICE AMREX_INLINE
+constexpr void constexpr_for (F&& f)
+{
+    if constexpr (I < N) {
+        f(std::integral_constant<decltype(I), I>());
+        constexpr_for<I+1, N>(f);
+    }
+}
+
 #include <AMReX_Loop.nolint.H>
 
 }
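
The added helper might be used as in the minimal sketch below. This is an illustration under stated assumptions, not code from the commit: the template is copied outside AMReX with the `AMREX_GPU_HOST_DEVICE` and `AMREX_INLINE` macros dropped so it builds as plain C++17, and `cube`, `sum_of_cubes`, `first_three_squares`, and `check_examples` are hypothetical names invented for the example. The point it tries to show is that the lambda receives a `std::integral_constant` rather than a plain `int`, so the index remains usable in constant expressions and as a template argument.

```cpp
#include <array>
#include <cassert>
#include <type_traits>

// Standalone copy of the helper from the diff, with the AMReX macros
// removed (an assumption made so this compiles without AMReX headers).
template <auto I, auto N, class F>
constexpr void constexpr_for (F&& f)
{
    if constexpr (I < N) {
        f(std::integral_constant<decltype(I), I>());
        constexpr_for<I+1, N>(f);
    }
}

// Hypothetical stand-in for a function that is worth evaluating at compile time.
constexpr int cube (int n) { return n * n * n; }

// Sums cube(i) for i = 0, 1, 2, 3.
constexpr int sum_of_cubes ()
{
    int total = 0;
    constexpr_for<0, 4>([&] (auto i) {
        // decltype(i)::value is a constant expression, so cube() can be
        // forced to evaluate at compile time for each unrolled iteration.
        constexpr int c = cube(decltype(i)::value);
        total += c;
    });
    return total;
}

// The whole loop can itself be evaluated by the compiler.
static_assert(sum_of_cubes() == 36, "0 + 1 + 8 + 27");

// Runtime use: the index is still a compile-time constant, so it can be
// passed as a template argument, here to std::get on a std::array.
inline std::array<int, 3> first_three_squares ()
{
    std::array<int, 3> a{};
    constexpr_for<0, 3>([&] (auto i) {
        constexpr int k = decltype(i)::value;
        std::get<k>(a) = k * k;
    });
    return a;
}

// Hypothetical runtime check of both examples.
inline void check_examples ()
{
    assert(sum_of_cubes() == 36);
    const auto a = first_three_squares();
    assert(a[0] == 0 && a[1] == 1 && a[2] == 4);
}
```

Because each call instantiates `constexpr_for` once per iteration plus once for the terminating case, this pattern seems best suited to the short loops the commit message describes; a large `N` would likely increase compile time and could hit the compiler's template recursion limit.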
