Commit 2fd316b

authored
Update CHANGES for 26.03 (#5102)
1 parent eb10d70 commit 2fd316b

1 file changed

+106
-0
lines changed

CHANGES.md

Lines changed: 106 additions & 0 deletions
@@ -1,3 +1,109 @@
1+
# 26.03
2+
3+
## Highlights:
4+
5+
* Fix a number of issues raised by AIs. Some are real bugs, and some are
6+
potential bugs, defects, and robustness issues.
7+
- 1D CurlCurl: Update variable beta smoother for consistency (#5082)
8+
- FFT: Early abort for unsupported path (#5083)
9+
- R2X: Assert zero-based domain indexing (#5084)
10+
- PoissonHybrid: Validate length of dz (#5085)
11+
- Fix implicit MRI configuration calls (#5093)
12+
- SUNDIALS Stale Flag (#5092)
13+
- SUNDIALS alloc-failure-output-pointer (#5094)
14+
- AllGatherBoxes: Fix MPI type (#5081)
15+
- Print/PrintToFile: Fix potential MPI_Comm issue (#5079)
16+
- Use size_t instead of int for buffer offset in CPU fillNeighbors (#5078)
17+
- assert that pureSoA containers have at least SpaceDim real components (#5076)
18+
- Fix sub-communicator in CPU fillNeighbors() (#5077)
19+
- Use Long instead of int for maxnextid in pre/post particle IO (#5073)
20+
- Improve clarity of AMR assign grid function (#5072)
21+
- AmrMesh: Fix invalid iterator range (#5068)
22+
- EB VTK writer: Fix attribute name (#5069)
23+
- BaseFab::indexFromValue: Fix index typo (#5067)
24+
- Fix SYCL device property init (#5066)
25+
- NonLocalBC: Fix Boolean condition (#5065)
26+
- [LinearSolvers][1D Overset] Fix missing component index (#4967)
27+
- [LinearSolvers] Fix bug in setBCoeffs(Vector) (#4968)
28+
- Fix host_idcpu to use finest_level_in_file when restarting [Particles] [IO] (#4971)
29+
- Fix (currently latent) bug with Particles + tiling + GPU (#4973) (#4978)
30+
- Fix particle communication bug when calling RedistributeCPU but with USE_GPU enabled. (#4970)
31+
- Fix (currently latent) bug with pure SoA particles if periodic_shift is not zero. (#4972)
32+
33+
* prevent auto-converting const char * to bool in ParmParse::add (#4969)
34+
35+
It used to be that `ParmParse::add("key", "value")` would result in
36+
adding a boolean value (1) because C++ converts the pointer to bool. This
37+
is almost certainly not intended. This is now fixed.
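
The conversion described above can be reproduced outside AMReX. The following is a minimal sketch, not the AMReX implementation: `MiniParams`, `MiniParamsFixed`, and the two helper functions are hypothetical names, and the extra `const char*` overload is only one possible way to avoid the problem. The actual change in #4969 may differ.

```C++
#include <string>

// Hypothetical stand-in for a key/value store; this is NOT amrex::ParmParse.
struct MiniParams {
    std::string last_type;   // records which overload was selected

    void add(const char* /*key*/, bool /*val*/)               { last_type = "bool"; }
    void add(const char* /*key*/, const std::string& /*val*/) { last_type = "string"; }
};

// Same store with one extra overload that catches C string arguments.
struct MiniParamsFixed {
    std::string last_type;

    void add(const char* /*key*/, bool /*val*/)               { last_type = "bool"; }
    void add(const char* /*key*/, const std::string& /*val*/) { last_type = "string"; }
    // Exact match for string literals, forwarded to the std::string overload.
    void add(const char* key, const char* val)                { add(key, std::string(val)); }
};

// A string literal decays to const char*. Pointer-to-bool is a standard
// conversion, while const char* to std::string is a user-defined conversion,
// so overload resolution prefers the bool overload.
std::string pick_overload_before() {
    MiniParams p;
    p.add("key", "value");
    return p.last_type;
}

// With the const char* overload present, the literal no longer becomes a bool.
std::string pick_overload_after() {
    MiniParamsFixed p;
    p.add("key", "value");
    return p.last_type;
}
```

Some compilers can warn about the string-literal-to-bool conversion in the first variant, but the call still compiles, which is why the problem is easy to miss.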
38+
39+
* SYCL: Add a new path for big kernels (#4952)
40+
41+
The SYCL kernel on Intel GPUs has a kernel parameter size limit of 2KB.
42+
If this limit is exceeded, a runtime error will occur if AOT is
43+
off, and a compile time error will occur if AOT is on. Compiling with
44+
AOT is very time consuming. Thus we usually compile with AOT off, and
45+
this often results in run time errors.
46+
47+
We have implemented a workaround for this limitation. When the kernel
48+
parameter is too large, we explicitly copy the kernel function object to
49+
device memory.
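
The size-based choice between the two paths can be illustrated with a host-only analogy. This is a sketch under stated assumptions and contains no SYCL: `launch`, `LaunchPath`, `kParamLimit`, `SmallKernel`, and `BigKernel` are hypothetical names, a heap allocation stands in for device memory, and the way AMReX actually detects and handles large kernels in #4952 may differ.

```C++
#include <cstddef>
#include <memory>

// 2KB parameter limit quoted in the text; the constant name is hypothetical.
constexpr std::size_t kParamLimit = 2048;

enum class LaunchPath { ByValue, ViaBuffer };

// Hypothetical launcher: runs f(i) for i in [0, n) and reports which path
// was taken. The path is selected at compile time from sizeof(F).
template <typename F>
LaunchPath launch(F const& f, int n) {
    if constexpr (sizeof(F) <= kParamLimit) {
        // Small function object: use it directly, as a kernel argument would be.
        for (int i = 0; i < n; ++i) { f(i); }
        return LaunchPath::ByValue;
    } else {
        // Large function object: copy it into separately allocated storage
        // (heap here, standing in for device memory) and call through a pointer.
        auto copy = std::make_unique<F>(f);
        F const* p = copy.get();
        for (int i = 0; i < n; ++i) { (*p)(i); }
        return LaunchPath::ViaBuffer;
    }
}

// Small functor: a single pointer.
struct SmallKernel {
    int* out;
    void operator()(int i) const { out[i] = i; }
};

// Large functor: 512 doubles of captured data exceed the 2KB limit.
struct BigKernel {
    int* out;
    double coeff[512];
    void operator()(int i) const { out[i] = i + static_cast<int>(coeff[0]); }
};
```

Because the branch is taken at compile time, function objects below the limit pay no extra cost for the fallback path.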
50+
51+
* Generalize SIMD Single Source Design (#4924)
52+
53+
This adds another template overload to `ParallelForSIMD`.
54+
55+
A typical user pattern for maximum control so far is:
56+
```C++
#ifdef AMREX_USE_SIMD
    if constexpr (amrex::simd::is_vectorized<T>) {
        amrex::ParallelForSIMD<T::simd_width>(np, pushSingleParticle);
    } else
#endif
    {
        amrex::ParallelFor(np, pushSingleParticle); // GPU & non-SIMD CPU
    }
```
66+
67+
This simplifies it to:
68+
```C++
amrex::ParallelForSIMD<T>(np, pushSingleParticle);
```
71+
indicating there _might_ be a SIMD path if `T` (e.g., a functor)
72+
implements it.
73+
74+
One can still call `ParallelForSIMD` with an explicit SIMD width (int),
75+
as before.
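
The single-source idea, where one call site picks a SIMD or scalar path from the type `T`, can be sketched in plain C++17. Everything in the `sketch` namespace is hypothetical and is not the AMReX API: the detection of a `simd_width` member, the `(start, count)` callback signature, and the return value are assumptions made only for illustration.

```C++
#include <type_traits>

namespace sketch {

// Hypothetical trait: T counts as vectorized if it exposes T::simd_width.
template <typename T, typename = void>
struct is_vectorized : std::false_type {};

template <typename T>
struct is_vectorized<T, std::void_t<decltype(T::simd_width)>> : std::true_type {};

// Hypothetical single entry point. Calls f(start, count) over [0, n) in
// chunks of T::simd_width when T is vectorized, otherwise one item at a time.
// Returns the chunk width that was used.
template <typename T, typename F>
int parallel_for_simd(int n, F&& f) {
    if constexpr (is_vectorized<T>::value) {
        constexpr int w = T::simd_width;
        int i = 0;
        for (; i + w <= n; i += w) { f(i, w); }   // full-width chunks
        if (i < n) { f(i, n - i); }               // remainder chunk
        return w;
    } else {
        for (int i = 0; i < n; ++i) { f(i, 1); }  // scalar fallback
        return 1;
    }
}

} // namespace sketch

// Two hypothetical functor tags: one without and one with a SIMD width.
struct ScalarPush {};
struct VectorPush { static constexpr int simd_width = 4; };
```

The caller writes the same line for both tags, and the `if constexpr` branch discards the path that does not apply to `T`.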
76+
77+
## Other major changes:
78+
79+
* add ReduceToPlaneMF2Patchy (#4958) (#5086)
80+
81+
* Add Gpu::SyncAtExit and Gpu::streamSyncActive (#4956)
82+
83+
* Respect NoSync region in FabArray::Copy and setBndry (#4955)
84+
85+
* Guard against potential OOB error in ParticleContainer OK() (#4951)
86+
87+
* SENSEI: Allocator (#4949)
88+
89+
* Fix: AMReX w/ OpenMP and Ignore FFTW OMP (#4941)
90+
91+
* ParticleContainerToBlueprint: Allocator (2) (#4948)
92+
93+
* Protect from using MarchingCubes if not 3D (#4946)
94+
95+
* FFTW CMake: Hint More Lib Paths (Windows) (#4940)
96+
97+
* `FindAMReXFFTW.cmake` Support Native Threading (#4934)
98+
99+
* Reducer: New wrapper class for ReduceOps and ReduceData (#4933)
100+
101+
* Added multi-array version of `EBData` (#4929)
102+
103+
* ParmParse: Add support for AMREX_USE_GPU macro (#4927)
104+
105+
* Fix access specifier with HDF5 enabled (#4921)
106+
1107
# 26.02
2108

3109
## Highlights:
