|
| 1 | +# 26.03 |
| 2 | + |
| 3 | + ## Highlights: |
| 4 | + |
| 5 | + * Fix a number of issues raised by AIs. Some are real bugs, and some |
| 6 | + are potential bugs, defects, and robustness issues. |
| 7 | + - 1D CurlCurl: Update variable beta smoother for consistency (#5082) |
| 8 | + - FFT: Early abort for unsupported path (#5083) |
| 9 | + - R2X: Assert zero-based domain indexing (#5084) |
| 10 | + - PoissonHybrid: Validate length of dz (#5085) |
| 11 | + - Fix implicit MRI configuration calls (#5093) |
| 12 | + - SUNDIALS Stale Flag (#5092) |
| 13 | + - SUNDIALS alloc-failure-output-pointer (#5094) |
| 14 | + - AllGatherBoxes: Fix MPI type (#5081) |
| 15 | + - Print/PrintToFile: Fix potential MPI_Comm issue (#5079) |
| 16 | + - Use size_t instead of int for buffer offset in CPU fillNeighbors (#5078) |
| 17 | + - assert that pureSoA containers have at least SpaceDim real components (#5076) |
| 18 | + - Fix sub-communicator in CPU fillNeighbors() (#5077) |
| 19 | + - Use Long instead of int for maxnextid in pre/post particle IO (#5073) |
| 20 | + - Improve clarity of AMR assign grid function (#5072) |
| 21 | + - AmrMesh: Fix invalid iterator range (#5068) |
| 22 | + - EB VTK writer: Fix attribute name (#5069) |
| 23 | + - BaseFab::indexFromValue: Fix index typo (#5067) |
| 24 | + - Fix SYCL device property init (#5066) |
| 25 | + - NonLocalBC: Fix Boolean condition (#5065) |
| 26 | + - [LinearSolvers][1D Overset] Fix missing component index (#4967) |
| 27 | + - [LinearSolvers] Fix bug in setBCoeffs(Vector) (#4968) |
| 28 | + - Fix host_idcpu to use finest_level_in_file when restarting [Particles] [IO] (#4971) |
| 29 | + - Fix (currently latent) bug with Particles + tiling + GPU (#4973) (#4978) |
| 30 | + - Fix particle communication bug when calling RedistributeCPU but with USE_GPU enabled. (#4970) |
| 31 | + - Fix (currently latent) bug with pure SoA particles if periodic_shift is not zero. (#4972) |
| 32 | + |
| 33 | + * prevent auto-converting const char * to bool in ParmParse::add (#4969) |
| 34 | + |
| 35 | + Previously, `ParmParse::add("key", "value")` added a boolean value |
| 36 | + (1), because C++ implicitly converts a pointer to bool. This is |
| 37 | + almost certainly not intended and is now fixed. |
| 38 | + |
| 39 | + * SYCL: Add a new path for big kernels (#4952) |
| 40 | + |
| 41 | + SYCL kernels on Intel GPUs have a kernel parameter size limit of 2KB. |
| 42 | + If this limit is exceeded, a runtime error occurs if AOT is off, and |
| 43 | + a compile-time error occurs if AOT is on. Compiling with AOT is very |
| 44 | + time consuming, so we usually compile with AOT off, which often |
| 45 | + results in runtime errors. |
| 46 | + |
| 47 | + We have implemented a workaround for this limitation. When the kernel |
| 48 | + parameter is too large, we explicitly copy the kernel function object to |
| 49 | + device memory. |
| 50 | + |
| 51 | + * Generalize SIMD Single Source Design (#4924) |
| 52 | + |
| 53 | + This adds another template overload to `ParallelForSIMD`. |
| 54 | + |
| 55 | + A typical user pattern for maximum control so far is: |
| 56 | + ```C++ |
| 57 | + #ifdef AMREX_USE_SIMD |
| 58 | + if constexpr (amrex::simd::is_vectorized<T>) { |
| 59 | + amrex::ParallelForSIMD<T::simd_width>(np, pushSingleParticle); |
| 60 | + } else |
| 61 | + #endif |
| 62 | + { |
| 63 | + amrex::ParallelFor(np, pushSingleParticle); // GPU & non-SIMD CPU |
| 64 | + } |
| 65 | + ``` |
| 66 | + |
| 67 | + This simplifies it to: |
| 68 | + ```C++ |
| 69 | + amrex::ParallelForSIMD<T>(np, pushSingleParticle); |
| 70 | + ``` |
| 71 | + indicating there _might_ be a SIMD path if `T` (e.g., a functor) |
| 72 | + implements it. |
| 73 | + |
| 74 | + One can still call `ParallelForSIMD` with an explicit SIMD width (int), |
| 75 | + as before. |
| 76 | + |
| 77 | + ## Other major changes: |
| 78 | + |
| 79 | + * add ReduceToPlaneMF2Patchy (#4958) (#5086) |
| 80 | + |
| 81 | + * Add Gpu::SyncAtExit and Gpu::streamSyncActive (#4956) |
| 82 | + |
| 83 | + * Respect NoSync region in FabArray::Copy and setBndry (#4955) |
| 84 | + |
| 85 | + * Guard against potential OOB error in ParticleContainer OK() (#4951) |
| 86 | + |
| 87 | + * SENSEI: Allocator (#4949) |
| 88 | + |
| 89 | + * Fix: AMReX w/ OpenMP and Ignore FFTW OMP (#4941) |
| 90 | + |
| 91 | + * ParticleContainerToBlueprint: Allocator (2) (#4948) |
| 92 | + |
| 93 | + * Protect from using MarchingCubes if not 3D (#4946) |
| 94 | + |
| 95 | + * FFTW CMake: Hint More Lib Paths (Windows) (#4940) |
| 96 | + |
| 97 | + * `FindAMReXFFTW.cmake` Support Native Threading (#4934) |
| 98 | + |
| 99 | + * Reducer: New wrapper class for ReduceOps and ReduceData (#4933) |
| 100 | + |
| 101 | + * Added multi-array version of `EBData` (#4929) |
| 102 | + |
| 103 | + * ParmParse: Add support for AMREX_USE_GPU macro (#4927) |
| 104 | + |
| 105 | + * Fix access specifier with HDF5 enabled (#4921) |
| 106 | + |
1 | 107 | # 26.02 |
2 | 108 |
|
3 | 109 | ## Highlights: |
|
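The `ParmParse::add` entry in 26.03 (#4969) rests on a C++ overload-resolution rule that is easy to overlook: a string literal decays to `const char*`, and the pointer-to-bool conversion is a standard conversion, so it outranks the user-defined conversion to `std::string`. The sketch below reproduces that behavior with two hypothetical stand-in classes, `NaiveTable` and `GuardedTable`; neither is the AMReX `ParmParse` class. The extra `const char*` overload shown is one common way to avoid the surprise and is not necessarily how the PR fixes it.

```cpp
#include <string>

// Hypothetical stand-in with only bool and std::string overloads.
// It records which overload was selected by the last call.
struct NaiveTable {
    std::string last_kind;
    void add(const std::string& /*key*/, bool /*value*/) { last_kind = "bool"; }
    void add(const std::string& /*key*/, const std::string& /*value*/) { last_kind = "string"; }
};

// Hypothetical stand-in with an additional const char* overload.
// A string literal now matches this overload exactly (array-to-pointer),
// so the pointer-to-bool conversion is never considered the best match.
struct GuardedTable {
    std::string last_kind;
    void add(const std::string& /*key*/, bool /*value*/) { last_kind = "bool"; }
    void add(const std::string& /*key*/, const std::string& /*value*/) { last_kind = "string"; }
    void add(const std::string& key, const char* value) { add(key, std::string(value)); }
};

// Calls add("key", "value") on the naive class and reports the chosen overload.
std::string naive_kind() {
    NaiveTable t;
    t.add("key", "value");  // selects add(..., bool): pointer-to-bool wins
    return t.last_kind;
}

// Same call on the guarded class.
std::string guarded_kind() {
    GuardedTable t;
    t.add("key", "value");  // selects add(..., const char*), forwards to string
    return t.last_kind;
}
```

Under these assumptions, `naive_kind()` reports the bool overload and `guarded_kind()` reports the string overload, which mirrors the before and after behavior described in the changelog entry.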