v0.14.0

w1th0utnam3 released this 15 Sep 15:56
· 9 commits to main since this release

This release adds SIMD acceleration on CPUs that support either AVX2 together with FMA, or NEON. It is currently implemented in the particle-to-grid levelset evaluation of the subdomain-based reconstruction, and only for single-precision (f32) reconstructions. If the compact support of each particle is large relative to the marching cubes grid resolution (e.g. a smoothing length of 2 and a cube size of 0.5), speedups of ~5x with AVX and ~3x with NEON can be expected for the reconstruction itself.
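As a rough illustration of what "supported by the CPU" can mean in practice, the following sketch selects a code path at runtime using the feature-detection macros from the Rust standard library. The `SimdBackend` enum and the `detect_simd_backend` function are hypothetical names made up for this example; they are not taken from the library, and the actual dispatch logic may differ.

```rust
/// Hypothetical enum for this sketch: the two SIMD instruction sets named
/// in the release notes, plus a scalar fallback.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SimdBackend {
    Avx2Fma,
    Neon,
    Scalar,
}

/// Hypothetical helper: picks a back end at runtime.
/// `simd_enabled` plays the role of a user-facing on/off switch.
pub fn detect_simd_backend(simd_enabled: bool) -> SimdBackend {
    if !simd_enabled {
        return SimdBackend::Scalar;
    }

    // On x86-64, require both AVX2 and FMA before using the SIMD path.
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx2") && std::is_x86_feature_detected!("fma") {
            return SimdBackend::Avx2Fma;
        }
    }

    // On AArch64, check for NEON.
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return SimdBackend::Neon;
        }
    }

    // Any other CPU, or a CPU lacking the required features.
    SimdBackend::Scalar
}
```

Calling `detect_simd_backend(true)` returns whichever variant the executing machine supports, while `detect_simd_backend(false)` always returns `SimdBackend::Scalar`.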

  • Lib: Implement AVX and NEON variants of f32 cubic spline kernel
  • Lib: Implement AVX and NEON variants of particle-to-grid levelset evaluation in subdomain-based reconstruction
  • CLI: Add --simd=on/off CLI flag to enable/disable use of SIMD in kernels and levelset evaluation if supported by the CPU
  • Py: Add simd argument to reconstruction_pipeline and reconstruct_surface to enable/disable use of SIMD in kernels and levelset evaluation if supported by the CPU
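
For readers unfamiliar with the kernel mentioned above, here is a minimal scalar f32 sketch of one common formulation of the 3D cubic spline kernel, parameterized by its compact support radius `h` and normalized with 8/(π h³). This is an assumption for illustration: the function names are hypothetical, and the exact formulation and normalization used by the library are not stated in these notes. A SIMD variant would evaluate the same polynomial for several distances per instruction instead of one at a time.

```rust
/// Hypothetical scalar reference: cubic spline kernel in 3D.
/// `r` is the distance to the particle, `h` the compact support radius.
/// The kernel is zero for r > h.
pub fn cubic_spline_kernel_f32(r: f32, h: f32) -> f32 {
    let q = r / h;
    // Normalization so that the kernel integrates to 1 over 3D space.
    let sigma = 8.0 / (std::f32::consts::PI * h * h * h);
    let shape = if q <= 0.5 {
        // Inner piece: 6(q^3 - q^2) + 1
        6.0 * (q * q * q - q * q) + 1.0
    } else if q <= 1.0 {
        // Outer piece: 2(1 - q)^3
        let t = 1.0 - q;
        2.0 * t * t * t
    } else {
        // Outside the compact support.
        0.0
    };
    sigma * shape
}

/// Hypothetical helper: evaluates the kernel for a batch of distances,
/// one value at a time (the part a SIMD variant would vectorize).
pub fn evaluate_kernel(distances: &[f32], h: f32) -> Vec<f32> {
    distances
        .iter()
        .map(|&r| cubic_spline_kernel_f32(r, h))
        .collect()
}
```

For example, `evaluate_kernel(&[0.0, 1.0, 2.0], 2.0)` evaluates the kernel at the particle center, at half the support radius, and at the support boundary, where it vanishes.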