Releases: ermig1979/Simd
Simd v6.2.158
Algorithms
New features
- Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MidpointFilterSquare3x3.
- Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MidpointFilterSquare5x5.
- Base implementation of class SynetConvolution16bNhwcSpecV2.
- Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MinFilterSquare3x3.
- Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MinFilterSquare5x5.
- Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MaxFilterSquare3x3.
- Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function MaxFilterSquare5x5.
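For readers unfamiliar with the square filters listed above, here is a minimal scalar sketch of what a 3x3 min, max, or midpoint filter computes on an 8-bit gray image. It is an illustration only, not the library's code: `FilterSquare3x3` and `SquareFilter` are hypothetical names, and both the edge handling (clamping to the nearest pixel) and the midpoint rounding (`(min + max + 1) / 2`) are assumptions that may differ from what Simd does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names for illustration; these are not part of the Simd API.
enum class SquareFilter { Min, Max, Midpoint };

// Scalar 3x3 square filter over an 8-bit gray image stored row by row without padding.
std::vector<uint8_t> FilterSquare3x3(const std::vector<uint8_t>& src,
    size_t width, size_t height, SquareFilter type)
{
    assert(width > 0 && height > 0 && src.size() == width * height);
    std::vector<uint8_t> dst(src.size());
    for (size_t y = 0; y < height; ++y)
    {
        // Assumption: the window is clamped to the image, which for min/max
        // gives the same result as replicating edge pixels.
        const size_t y0 = y > 0 ? y - 1 : 0, y1 = std::min(y + 1, height - 1);
        for (size_t x = 0; x < width; ++x)
        {
            const size_t x0 = x > 0 ? x - 1 : 0, x1 = std::min(x + 1, width - 1);
            uint8_t lo = 255, hi = 0;
            for (size_t ny = y0; ny <= y1; ++ny)
            {
                for (size_t nx = x0; nx <= x1; ++nx)
                {
                    const uint8_t v = src[ny * width + nx];
                    lo = std::min(lo, v);
                    hi = std::max(hi, v);
                }
            }
            switch (type)
            {
            case SquareFilter::Min: dst[y * width + x] = lo; break;
            case SquareFilter::Max: dst[y * width + x] = hi; break;
            // Assumption: the midpoint of min and max is rounded up.
            case SquareFilter::Midpoint: dst[y * width + x] = uint8_t((lo + hi + 1) / 2); break;
            }
        }
    }
    return dst;
}
```

A 5x5 variant would only widen the window to two pixels on each side; the per-pixel logic stays the same.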
Improving
- AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1.
Test framework
New features
- Tests for verifying functionality of function MidpointFilterSquare3x3.
- Tests for verifying functionality of function MidpointFilterSquare5x5.
- Tests for verifying functionality of function MinFilterSquare3x3.
- Tests for verifying functionality of function MinFilterSquare5x5.
- Tests for verifying functionality of function MaxFilterSquare3x3.
- Tests for verifying functionality of function MaxFilterSquare5x5.
Simd v6.2.157
Algorithms
New features
- Function Simd::Resize for Simd::Frame.
- Base implementation of function DrawLine.
- Base implementation of function DrawRectangle.
- Base implementation of function FontInit.
- Base implementation of function FontResize.
- Base implementation of function FontHeight.
- Base implementation of function FontMeasure.
- Base implementation of function FontDraw.
Improving
- Base implementation, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1.
- AVX-512BW optimizations of function SynetPoolingMax32f (case of SynetPoolingMax32f2DNhwcSolid2x2).
- AVX-512BW optimizations of function SynetMergedConvolution32f (InputConvolution1x1).
- AVX-512BW optimizations of function SynetMergedConvolution32f (DepthwiseConvolution_k3p1d1s1w6).
- Simd::DrawLine uses SimdDrawLine instead of its own implementation.
- Simd::DrawRectangle uses SimdDrawRectangle instead of its own implementation.
- Simd::Font uses functions SimdFontInit, SimdFontResize, SimdFontHeight, SimdFontMeasure, SimdFontDraw instead of its own implementation.
Python wrapper
New features
- Function Simd.ResizeFrame.
- Function Simd.ResizedFrame.
- Yuv444p member to Simd.FrameFormat enumeration.
- Method Simd.ImageFrame.Save.
- Method Simd.ImageFrame.Load.
- Function Simd.Lib.StretchGray2x2.
- Function Simd.StretchGray2x2.
- Function Simd.Lib.BgraToYuv444p.
- Function Simd.Lib.Yuv444pToRgb.
- Function Simd.Lib.ReduceGray2x2.
- Function Simd.ReduceGray2x2.
- Function Simd.Lib.BgrToYuv444p.
- Function Simd.Lib.BgraToYuv444p.
- Function Simd.Lib.Yuv444pToBgr.
- Function Simd.Lib.Yuv444pToRgba.
- Function Simd.Lib.DrawLine.
- Method Simd.Image.DrawLine.
- Function Simd.Lib.DrawRectangle.
- Method Simd.Image.DrawRectangle.
- Function Simd.Lib.FontInit.
- Function Simd.Lib.FontResize.
- Function Simd.Lib.FontHeight.
- Function Simd.Lib.FontMeasure.
- Function Simd.Lib.FontDraw.
- Class Simd.TextFont.
- Method Simd.Image.DrawFilledRectangle.
Improving
- Support of Simd.FrameFormat.Yuv444p in method Simd.ImageFrame.Recreate.
- Support of Simd.FrameFormat.Yuv444p in method Simd.ImageFrame.Convert.
Bug fixing
- Error in method Simd.Frame.Convert.
Renaming
- Function Simd.Resize to Simd.ResizeImage.
- Function Simd.Resized to Simd.ResizedImage.
Test framework
New features
- Tests for verifying functionality of function DrawLine.
- Tests for verifying functionality of function DrawRectangle.
Bug fixing
- Error in method Test::PerformanceMeasurerStorage::Clear.
Simd v6.2.156
Algorithms
New features
- Enumeration SimdShiftDetectorTextureType (C API of Simd::ShiftDetector).
- Enumeration SimdShiftDetectorDifferenceType (C API of Simd::ShiftDetector).
- Base implementation of function SimdShiftDetectorInitBuffers (C API of Simd::ShiftDetector).
- Base implementation of function SimdShiftDetectorSetBackground (C API of Simd::ShiftDetector).
- Base implementation of function SimdShiftDetectorEstimate (C API of Simd::ShiftDetector).
- Base implementation of function SimdShiftDetectorGetShift (C API of Simd::ShiftDetector).
- Base implementation, AMX-BF16 optimizations of class SynetConvolution16bNhwcGemmV1.
Improve
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VBMI, NEON optimizations of function DeinterleaveUv (some outputs can be NULL).
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VBMI, NEON optimizations of function DeinterleaveBgr (some outputs can be NULL).
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VBMI, NEON optimizations of function DeinterleaveBgra (some outputs can be NULL).
- C++ wrapper Simd::DeinterleaveUv (support of empty outputs).
- C++ wrapper Simd::DeinterleaveBgr (support of empty outputs).
- C++ wrapper Simd::DeinterleaveBgra (support of empty outputs).
- C++ wrapper Simd::DeinterleaveRgb (support of empty outputs).
- C++ wrapper Simd::DeinterleaveRgba (support of empty outputs).
- Parallelization in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class ResizerNearest.
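The "some outputs can be NULL" change to the Deinterleave functions above can be pictured with a small scalar sketch: a plane whose output pointer is null is simply skipped. `DeinterleaveBgrSketch` is a hypothetical helper written for illustration; its flat pixel-count signature is an assumption and does not match the real Simd function, which works with image strides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not the Simd signature: splits packed BGR pixels
// into separate planes, skipping any plane whose pointer is null.
void DeinterleaveBgrSketch(const uint8_t* bgr, size_t pixels,
    uint8_t* b, uint8_t* g, uint8_t* r)
{
    assert(bgr != nullptr || pixels == 0);
    for (size_t i = 0; i < pixels; ++i)
    {
        if (b) b[i] = bgr[3 * i + 0];
        if (g) g[i] = bgr[3 * i + 1];
        if (r) r[i] = bgr[3 * i + 2];
    }
}
```

A caller that needs only the green plane would pass `nullptr` for the other two outputs and avoid allocating buffers it never reads.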
Removing
- C++ wrapper Simd::DeinterleaveBgra with 4 arguments.
- C++ wrapper Simd::DeinterleaveRgba with 4 arguments.
Renaming
- Class SynetConvolution16bNhwcGemm to SynetConvolution16bNhwcGemmV0.
Test framework
Improve
- Tests for verifying functionality of function DeinterleaveUv (some outputs can be NULL).
- Tests for verifying functionality of function DeinterleaveBgr (some outputs can be NULL).
- Tests for verifying functionality of function DeinterleaveBgra (some outputs can be NULL).
Python wrapper
New features
- CurrentFrequency member to Simd.CpuInfo enumeration.
- Bf16 member to Simd.ResizeChannel enumeration.
- Function Simd.ShiftBilinear.
- Enumeration Simd.ShiftDetectorTexture.
- Enumeration Simd.ShiftDetectorDifference.
- Function Simd.Lib.ShiftDetectorInitBuffers.
- Function Simd.Lib.ShiftDetectorSetBackground.
- Function Simd.Lib.ShiftDetectorEstimate.
- Function Simd.Lib.ShiftDetectorGetShift.
- Function Simd.Lib.ShiftDetectorGetRefinedShift.
- Function Simd.Lib.ShiftDetectorGetStability.
- Function Simd.Lib.ShiftDetectorGetCorrelation.
- Class Simd.ShiftingDetector.
Improve
- Function Simd.Lib.SysInfo.
Simd v6.2.155
Algorithms
New features
- SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedScaleLayerForward.
- SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedPreluLayerForward.
- Arbitrary activation function in Base implementation of class SynetQuantizedConvolutionGemm.
- Arbitrary activation function in Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm.
- Arbitrary activation function in Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcSpecV0.
- Arbitrary activation function in Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV2.
- Arbitrary activation function in Base implementation, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV3.
Improve
- AMX-BF16 optimizations of class SynetConvolution16bNhwcGemm (case of small srcC).
Bug fixing
- Performance bug in AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm.
- Error in SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedInnerProductGemmNN.
- Error in SSE4.1 optimizations of class SynetQuantizedConvolutionNhwcSpecV0.
- Error in Base implementation of class SynetQuantizedConvolutionNhwcSpecV0.
- Error in Base implementation of class SynetQuantizedConvolutionNhwcGemm.
Simd v6.2.154
Algorithms
New features
- SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedMergedConvolutionCdc.
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedMergedConvolutionCd.
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedMergedConvolutionDc.
- Base implementation of function SynetQuantizedScaleLayerForward.
- Base implementation of function SynetQuantizedPreluLayerForward.
Improve
- AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV3.
- Performance of AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm (case of batch > 1).
- Performance of AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcSpecV0 (case of batch > 1).
- Performance of AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm (case of small srcC).
- Performance of AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcSpecV0 (case of small srcC).
Bug fixing
- Error in AVX-512BW optimizations of function SynetQuantizedConcatLayerForward.
- Error in function Base::CpuModel (Windows Server 2025).
- Error in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of class SynetQuantizedAddUniform.
- Error in Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function QuantizedMergedConvolutionAddInputToOutput.
- Error in AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm (case of batch > 1).
Test framework
New features
- Tests for verifying functionality of function SynetQuantizedScaleLayerForward.
- Tests for verifying functionality of function SynetQuantizedPreluLayerForward.
Infrastructure
Bug fixing
- Fix bug in step 'Host Properties' in GitHub Actions script for MSBuild.
- Fix bug in step 'Host Properties' in GitHub Actions script for CMake.
Removing
- Support of Microsoft Visual Studio 2019.
Simd v6.2.153
Algorithms
New features
- AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV0.
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV1.
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV2.
- Base implementation, AVX-512VNNI optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV3.
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedShuffleLayerForward.
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizedConcatLayerForward.
- Base implementation of class SynetQuantizedMergedConvolutionRef.
- Base implementation of class SynetQuantizedMergedConvolutionCdc.
Improve
- SSE4.1, AVX2 optimizations of class SynetQuantizedConvolutionNhwcDepthwiseV0.
Bug fixing
- Error in Base implementation of class SynetQuantizedConvolutionNhwcGemm.
Test framework
New features
- Tests for verifying functionality of function SynetQuantizedShuffleLayerForward.
- Tests for verifying functionality of function SynetQuantizedConcatLayerForward.
- Tests for verifying functionality of function SynetQuantizedMergedConvolutionForward.
Infrastructure
Improve
- Performance of Test step in GitHub Actions script for MSBuild.
Simd v6.2.152
Algorithms
New features
- AVX2, AVX-512BW optimizations of class SynetQuantizedAddUniform.
- Base implementation of class SynetQuantizedInnerProductRef.
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedInnerProductGemmNN.
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcSpecV0.
- Base implementation, SSE4.1, AVX2 optimizations of class SynetQuantizedConvolutionNhwcDepthwise.
Improve
- AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm.
Bug fixing
- Error in NEON optimization of function Float32ToBFloat16.
- Error in Base implementation of class SynetQuantizedConvolutionNhwcGemm.
- Error in Base implementation of class SynetQuantizedConvolutionGemm.
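As background for the Float32ToBFloat16 fix above, the following sketch shows one common way to convert between 32-bit floats and BFloat16: keep the upper 16 bits of the float after adding half of the discarded range. The function names are hypothetical, and the rounding scheme is an assumption; the library's actual rounding and its handling of NaN or very large values may differ.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: rounds by adding 0x8000 (half of the discarded low 16 bits)
// and then truncating. NaN and values near the float maximum get no special treatment.
uint16_t Float32ToBFloat16Sketch(float value)
{
    static_assert(sizeof(float) == sizeof(uint32_t), "32-bit float expected");
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return uint16_t((bits + 0x8000u) >> 16);
}

// Hypothetical helper: the inverse conversion is exact, the low 16 bits are zero-filled.
float BFloat16ToFloat32Sketch(uint16_t value)
{
    const uint32_t bits = uint32_t(value) << 16;
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}
```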
Test framework
New features
- Tests for verifying functionality of SynetQuantizedInnerProduct framework.
Simd v6.2.151
Algorithms
New features
- Support for OpenCV compatibility in Simd::Resize (SimdResizeMethodBilinearOpenCv).
- AVX-512BW optimizations of class ResizerByteBilinearOpenCv.
- Base implementation of class SynetQuantizedConvolutionGemm.
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, AMX-INT8 optimizations of class SynetQuantizedConvolutionNhwcGemm.
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetDequantizeLinear.
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetQuantizeLinear.
- Base implementation, SSE4.1 optimizations of class SynetQuantizedAddUniform.
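The SynetQuantizeLinear and SynetDequantizeLinear entries above refer to linear (affine) quantization. A minimal per-value sketch, assuming the widely used convention q = saturate(round(x / scale) + zero) with unsigned 8-bit output, is shown below. The helper names and parameter order are hypothetical and do not reflect the Simd signatures, which operate on whole arrays.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: maps a float to uint8 with a scale and zero point.
// Assumption: rounding follows the current floating-point mode (nearest-even by default).
uint8_t QuantizeLinearSketch(float value, float scale, int zero)
{
    assert(scale > 0.0f);
    const float q = std::nearbyint(value / scale) + float(zero);
    // Saturate to the representable range before narrowing.
    return uint8_t(std::min(std::max(q, 0.0f), 255.0f));
}

// Hypothetical helper: the inverse mapping, exact up to the quantization step.
float DequantizeLinearSketch(uint8_t value, float scale, int zero)
{
    return float(int(value) - zero) * scale;
}
```

With scale 0.5 and zero point 128, the value 1.0 maps to 130 and back to 1.0, while inputs far outside the range saturate at 0 or 255.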
Improve
- AVX2 optimizations of class ResizerByteBilinearOpenCv.
Bug fixing
- Linker error in ResizeOpenCvSpecialTest.
Test framework
New features
- Tests for verifying functionality of the SynetQuantizedConvolution framework.
- Tests for verifying functionality of function SynetDequantizeLinear.
- Tests for verifying functionality of function SynetQuantizeLinear.
- Tests for verifying functionality of the SynetQuantizedAdd framework.
Python wrapper
New features
- BilinearOpenCv in Simd.ResizeMethod enumeration.
Infrastructure
Removing
- Support of Microsoft Visual Studio 2015.
- Support of Microsoft Visual Studio 2017.
Simd v6.1.150
Algorithms
New features
- Base implementation, SSE4.1, AVX2 optimizations of class ResizerByteBilinearOpenCv.
Improve
- Base implementation, SSE4.1, AVX2 optimizations of function SynetPoolingAverage.
- Base implementation, SSE4.1, AVX2 optimizations of class SynetGridSample2d32fBlZ.
Test framework
New features
- Special tests to compare Simd and OpenCV resize.
Simd v6.1.149
Algorithms
New features
- Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNhwcSpecV1.
- AMX tile config changes caching.
- Function SimdSetAmxFull.
Improve
- Base implementation, SSE4.1, AVX2, AVX-512BW, AMX-BF16 optimizations of class SynetConvolution16bNhwcSpecV0.
Bug fixing
- Error in function Simd::SynetSetInput.
Renaming
- Class SynetConvolution16bNhwcDirect to SynetConvolution16bNhwcSpecV0.
Infrastructure
Bug fixing
- CMake warning (the minimum required version of CMake must be greater than or equal to 3.10).