Releases: ermig1979/Simd

Simd v6.2.158

03 Feb 12:07

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MidpointFilterSquare3x3.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MidpointFilterSquare5x5.
  • Base implementation of class SynetConvolution16bNhwcSpecV2.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MinFilterSquare3x3.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MinFilterSquare5x5.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MaxFilterSquare3x3.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MaxFilterSquare5x5.
Improving
  • AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1.
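
For orientation, the square min, max and midpoint filters named above can be sketched in plain Python. This is only a reference sketch, not the library's implementation: the helper names, the border handling (coordinates clamped to the image edge) and the rounding of the midpoint (half rounded up) are assumptions, since the release notes do not specify them.

```python
def square_filter(image, radius, reduce_fn):
    """Apply reduce_fn to every (2*radius+1) x (2*radius+1) window.

    Assumption: out-of-range coordinates are clamped to the nearest edge pixel.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = reduce_fn(window)
    return out


def min_filter_3x3(image):
    return square_filter(image, 1, min)


def max_filter_3x3(image):
    return square_filter(image, 1, max)


def midpoint_filter_3x3(image):
    # Midpoint = average of the window minimum and maximum.
    # Assumption: integer result with half rounded up.
    return square_filter(image, 1, lambda w: (min(w) + max(w) + 1) // 2)


image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
center_min = min_filter_3x3(image)[1][1]
center_max = max_filter_3x3(image)[1][1]
center_mid = midpoint_filter_3x3(image)[1][1]
```

A 5x5 variant follows from the same helper with `radius=2`.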

Test framework

New features
  • Tests for verifying functionality of function MidpointFilterSquare3x3.
  • Tests for verifying functionality of function MidpointFilterSquare5x5.
  • Tests for verifying functionality of function MinFilterSquare3x3.
  • Tests for verifying functionality of function MinFilterSquare5x5.
  • Tests for verifying functionality of function MaxFilterSquare3x3.
  • Tests for verifying functionality of function MaxFilterSquare5x5.

Simd v6.2.157

02 Jan 14:23

Algorithms

New features
  • Function Simd::Resize for Simd::Frame.
  • Base implementation of function DrawLine.
  • Base implementation of function DrawRectangle.
  • Base implementation of function FontInit.
  • Base implementation of function FontResize.
  • Base implementation of function FontHeight.
  • Base implementation of function FontMeasure.
  • Base implementation of function FontDraw.
Improving
  • Base implementation, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1.
  • AVX-512BW optimizations of function SynetPoolingMax32f (case of SynetPoolingMax32f2DNhwcSolid2x2).
  • AVX-512BW optimizations of function SynetMergedConvolution32f (InputConvolution1x1).
  • AVX-512BW optimizations of function SynetMergedConvolution32f (DepthwiseConvolution_k3p1d1s1w6).
  • Simd::DrawLine uses SimdDrawLine instead of its own implementation.
  • Simd::DrawRectangle uses SimdDrawRectangle instead of its own implementation.
  • Simd::Font uses functions SimdFontInit, SimdFontResize, SimdFontHeight, SimdFontMeasure, SimdFontDraw instead of its own implementation.

Python wrapper

New features
  • Function Simd.ResizeFrame.
  • Function Simd.ResizedFrame.
  • Yuv444p member to Simd.FrameFormat enumeration.
  • Method Simd.ImageFrame.Save.
  • Method Simd.ImageFrame.Load.
  • Function Simd.Lib.StretchGray2x2.
  • Function Simd.StretchGray2x2.
  • Function Simd.Lib.Yuv444pToRgb.
  • Function Simd.Lib.ReduceGray2x2.
  • Function Simd.ReduceGray2x2.
  • Function Simd.Lib.BgrToYuv444p.
  • Function Simd.Lib.BgraToYuv444p.
  • Function Simd.Lib.Yuv444pToBgr.
  • Function Simd.Lib.Yuv444pToRgba.
  • Function Simd.Lib.DrawLine.
  • Method Simd.Image.DrawLine.
  • Function Simd.Lib.DrawRectangle.
  • Method Simd.Image.DrawRectangle.
  • Function Simd.Lib.FontInit.
  • Function Simd.Lib.FontResize.
  • Function Simd.Lib.FontHeight.
  • Function Simd.Lib.FontMeasure.
  • Function Simd.Lib.FontDraw.
  • Class Simd.TextFont.
  • Method Simd.Image.DrawFilledRectangle.
Improving
  • Support of Simd.FrameFormat.Yuv444p in method Simd.ImageFrame.Recreate.
  • Support of Simd.FrameFormat.Yuv444p in method Simd.ImageFrame.Convert.
Bug fixing
  • Error in method Simd.Frame.Convert.
Renaming
  • Function Simd.Resize to Simd.ResizeImage.
  • Function Simd.Resized to Simd.ResizedImage.
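
Scripts that must run against wrapper versions from both before and after this renaming could look the function up by name. The sketch below is only an illustration: `resolve_renamed` is a hypothetical helper, not part of the wrapper, and the stand-in namespaces replace the real `Simd` module so that the example runs on its own.

```python
from types import SimpleNamespace


def resolve_renamed(module, new_name, old_name):
    """Return module.<new_name> if it exists, otherwise module.<old_name>."""
    func = getattr(module, new_name, None)
    if func is None:
        func = getattr(module, old_name)
    return func


# Stand-ins for an older and a newer wrapper module (hypothetical, for illustration only).
old_wrapper = SimpleNamespace(Resize=lambda: "old name")
new_wrapper = SimpleNamespace(ResizeImage=lambda: "new name")

resize_old = resolve_renamed(old_wrapper, "ResizeImage", "Resize")
resize_new = resolve_renamed(new_wrapper, "ResizeImage", "Resize")
```

With the real wrapper imported, the call would take the form `resolve_renamed(Simd, "ResizeImage", "Resize")`. The same pattern applies to `Simd.Resized` and `Simd.ResizedImage`.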

Test framework

New features
  • Tests for verifying functionality of function DrawLine.
  • Tests for verifying functionality of function DrawRectangle.
Bug fixing
  • Error in method Test::PerformanceMeasurerStorage::Clear.

Simd v6.2.156

01 Dec 06:38

Algorithms

New features
  • Enumeration SimdShiftDetectorTextureType (C API of Simd::ShiftDetector).
  • Enumeration SimdShiftDetectorDifferenceType (C API of Simd::ShiftDetector).
  • Base implementation of function SimdShiftDetectorInitBuffers (C API of Simd::ShiftDetector).
  • Base implementation of function SimdShiftDetectorSetBackground (C API of Simd::ShiftDetector).
  • Base implementation of function SimdShiftDetectorEstimate (C API of Simd::ShiftDetector).
  • Base implementation of function SimdShiftDetectorGetShift (C API of Simd::ShiftDetector).
  • Base implementation, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1.
Improve
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VBMI, NEON optimizations of function DeinterleaveUv (some outputs can be NULL).
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VBMI, NEON optimizations of function DeinterleaveBgr (some outputs can be NULL).
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VBMI, NEON optimizations of function DeinterleaveBgra (some outputs can be NULL).
  • C++ wrapper Simd::DeinterleaveUv (support of empty outputs).
  • C++ wrapper Simd::DeinterleaveBgr (support of empty outputs).
  • C++ wrapper Simd::DeinterleaveBgra (support of empty outputs).
  • C++ wrapper Simd::DeinterleaveRgb (support of empty outputs).
  • C++ wrapper Simd::DeinterleaveRgba (support of empty outputs).
  • Parallelization in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class ResizerNearest.
Removing
  • C++ wrapper Simd::DeinterleaveBgra with 4 arguments.
  • C++ wrapper Simd::DeinterleaveRgba with 4 arguments.
Renaming
  • Class SynetConvolution16bNhwcGemm to SynetConvolution16bNhwcGemmV0.
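
The "some outputs can be NULL" behaviour of the Deinterleave functions above can be pictured with a small plain-Python sketch. It does not call the library: the `deinterleave` helper and its `wanted` parameter are hypothetical, and `None` stands in for a NULL output pointer.

```python
def deinterleave(src, channels, wanted):
    """Split interleaved bytes into per-channel planes.

    wanted[i] == False plays the role of a NULL output: plane i is skipped
    and None is returned in its place.
    """
    if len(src) % channels != 0:
        raise ValueError("source length must be a multiple of the channel count")
    return [bytes(src[i::channels]) if wanted[i] else None for i in range(channels)]


# Two BGR pixels: (1, 2, 3) and (4, 5, 6). Only the blue and red planes are requested.
bgr = bytes([1, 2, 3, 4, 5, 6])
blue, green, red = deinterleave(bgr, 3, [True, False, True])
```

Skipping an unwanted plane avoids both the allocation and the write for a channel the caller never reads.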

Test framework

Improve
  • Tests for verifying functionality of function DeinterleaveUv (some outputs can be NULL).
  • Tests for verifying functionality of function DeinterleaveBgr (some outputs can be NULL).
  • Tests for verifying functionality of function DeinterleaveBgra (some outputs can be NULL).

Python wrapper

New features
  • CurrentFrequency member to Simd.CpuInfo enumeration.
  • Bf16 member to Simd.ResizeChannel enumeration.
  • Function Simd.ShiftBilinear.
  • Enumeration Simd.ShiftDetectorTexture.
  • Enumeration Simd.ShiftDetectorDifference.
  • Function Simd.Lib.ShiftDetectorInitBuffers.
  • Function Simd.Lib.ShiftDetectorSetBackground.
  • Function Simd.Lib.ShiftDetectorEstimate.
  • Function Simd.Lib.ShiftDetectorGetShift.
  • Function Simd.Lib.ShiftDetectorGetRefinedShift.
  • Function Simd.Lib.ShiftDetectorGetStability.
  • Function Simd.Lib.ShiftDetectorGetCorrelation.
  • Class Simd.ShiftingDetector.
Improve
  • Function Simd.Lib.SysInfo.

Simd v6.2.155

10 Nov 08:24

Algorithms

New features
  • SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedScaleLayerForward.
  • SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedPreluLayerForward.
  • Arbitrary activation function in Base implementation of class SynetQuantizedConvolutionGemm.
  • Arbitrary activation function in Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm.
  • Arbitrary activation function in Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcSpecV0.
  • Arbitrary activation function in Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV2.
  • Arbitrary activation function in Base implementation, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV3.
Improve
  • AMX-BF16 optimizations of class SynetConvolution16bNhwcGemm (case of small srcC).
Bug fixing
  • Performance bug in AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm.
  • Error in SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedInnerProductGemmNN.
  • Error in SSE4.1 optimizations of class SynetQuantizedConvolutionNhwcSpecV0.
  • Error in Base implementation of class SynetQuantizedConvolutionNhwcSpecV0.
  • Error in Base implementation of class SynetQuantizedConvolutionNhwcGemm.

Simd v6.2.154

01 Oct 08:30

Algorithms

New features
  • SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedMergedConvolutionCdc.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedMergedConvolutionCd.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedMergedConvolutionDc.
  • Base implementation of function SynetQuantizedScaleLayerForward.
  • Base implementation of function SynetQuantizedPreluLayerForward.
Improve
  • AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV3.
  • Performance of AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm (case of batch > 1).
  • Performance of AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcSpecV0 (case of batch > 1).
  • Performance of AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm (case of small srcC).
  • Performance of AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcSpecV0 (case of small srcC).
Bug fixing
  • Error in AVX-512BW optimizations of function SynetQuantizedConcatLayerForward.
  • Error in function Base::CpuModel (Windows Server 2025).
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetQuantizedAddUniform.
  • Error in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function QuantizedMergedConvolutionAddInputToOutput.
  • Error in AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm (case of batch > 1).

Test framework

New features
  • Tests for verifying functionality of function SynetQuantizedScaleLayerForward.
  • Tests for verifying functionality of function SynetQuantizedPreluLayerForward.

Infrastructure

Bug fixing
  • Error in step 'Host Properties' in GitHub Actions script for MSBuild.
  • Error in step 'Host Properties' in GitHub Actions script for CMake.
Removing
  • Support of Microsoft Visual Studio 2019.

Simd v6.2.153

01 Sep 12:05

Algorithms

New features
  • AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV0.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV1.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV2.
  • Base implementation, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV3.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedShuffleLayerForward.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedConcatLayerForward.
  • Base implementation of class SynetQuantizedMergedConvolutionRef.
  • Base implementation of class SynetQuantizedMergedConvolutionCdc.
Improve
  • SSE4.1, AVX2 optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV0.
Bug fixing
  • Error in Base implementation of class SynetQuantizedConvolutionNhwcGemm.

Test framework

New features
  • Tests for verifying functionality of function SynetQuantizedShuffleLayerForward.
  • Tests for verifying functionality of function SynetQuantizedConcatLayerForward.
  • Tests for verifying functionality of function SynetQuantizedMergedConvolutionForward.

Infrastructure

Improve
  • Performance of Test step in GitHub Actions script for MSBuild.

Simd v6.2.152

01 Aug 08:24

Algorithms

New features
  • AVX2, AVX-512BW optimizations of class SynetQuantizedAddUniform.
  • Base implementation of class SynetQuantizedInnerProductRef.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedInnerProductGemmNN.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcSpecV0.
  • Base implementation, SSE4.1, AVX2 optimizations of class SynetQuantizedConvolutionNhwcDepthwise.
Improve
  • AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm.
Bug fixing
  • Error in NEON optimization of function Float32ToBFloat16.
  • Error in Base implementation of class SynetQuantizedConvolutionNhwcGemm.
  • Error in Base implementation of class SynetQuantizedConvolutionGemm.

Test framework

New features
  • Tests for verifying functionality of SynetQuantizedInnerProduct framework.

Simd v6.2.151

07 Jul 10:36

Algorithms

New features
  • Support of OpenCV compatibility in Simd::Resize (SimdResizeMethodBilinearOpenCv).
  • AVX-512BW optimizations of class ResizerByteBilinearOpenCv.
  • Base implementation of class SynetQuantizedConvolutionGemm.
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetDequantizeLinear.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizeLinear.
  • Base implementation, SSE4.1 optimizations of class SynetQuantizedAddUniform.
Improve
  • AVX2 optimizations of class ResizerByteBilinearOpenCv.
Bug fixing
  • Linker error in ResizeOpenCvSpecialTest.
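
As background for SynetQuantizeLinear and SynetDequantizeLinear, linear quantization is commonly defined as q = clamp(round(x / scale) + zero, lo, hi) and x = (q - zero) * scale. The sketch below assumes that common (ONNX-style) definition with an unsigned 8-bit range. The release notes do not state the library's exact parameters or rounding mode, so the function names and defaults here are illustrative only.

```python
def quantize_linear(values, scale, zero_point, lo=0, hi=255):
    """Map floats to integers: q = clamp(round(x / scale) + zero_point, lo, hi).

    Python's round() rounds halves to the nearest even integer.
    """
    return [min(max(round(v / scale) + zero_point, lo), hi) for v in values]


def dequantize_linear(quantized, scale, zero_point):
    """Map integers back to floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in quantized]


scale, zero_point = 0.25, 128
quantized = quantize_linear([0.0, 0.5, -0.5, 100.0], scale, zero_point)
restored = dequantize_linear(quantized[:3], scale, zero_point)
```

Values outside the representable range saturate: here 100.0 maps to the upper bound 255 and does not survive the round trip.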

Test framework

New features
  • Tests for verifying functionality of SynetQuantizedConvolution framework.
  • Tests for verifying functionality of function SynetDequantizeLinear.
  • Tests for verifying functionality of function SynetQuantizeLinear.
  • Tests for verifying functionality of SynetQuantizedAdd framework.

Python wrapper

New features
  • BilinearOpenCv in Simd.ResizeMethod enumeration.

Infrastructure

Removing
  • Support of Microsoft Visual Studio 2015.
  • Support of Microsoft Visual Studio 2017.

Simd v6.1.150

02 Jun 08:24

Algorithms

New features
  • Base implementation, SSE4.1, AVX2 optimizations of class ResizerByteBilinearOpenCv.
Improve
  • Base implementation, SSE4.1, AVX2 optimizations of function SynetPoolingAverage.
  • Base implementation, SSE4.1, AVX2 optimizations of class SynetGridSample2d32fBlZ.

Test framework

New features
  • Special tests to compare Simd and OpenCV resize.

Simd v6.1.149

05 May 07:59

Algorithms

New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNhwcSpecV1.
  • AMX tile config changes caching.
  • Function SimdSetAmxFull.
Improve
  • Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNhwcSpecV0.
Bug fixing
  • Error in function Simd::SynetSetInput.
Renaming
  • Class SynetConvolution16bNhwcDirect to SynetConvolution16bNhwcSpecV0.

Infrastructure

Bug fixing
  • CMake warning (required minimum version of CMake must be greater than or equal to 3.10).