Commit 215a0f5

Merge branch 'avx512' into devel
* Implemented partial AVX-512 support
2 parents 813e6db + e737d7a commit 215a0f5

40 files changed: +3225 -258 lines changed

.cproject

Lines changed: 12 additions & 0 deletions
@@ -87,6 +87,18 @@
         <outputType id="org.eclipse.cdt.managedbuilder.ui.rcbs.outputtype.218466097" name="Resource Custom Build Step Output Type"/>
     </tool>
 </fileInfo>
+<fileInfo id="cdt.managedbuild.config.gnu.exe.debug.1330341797.281573242" name="avx512.cpp" rcbsApplicability="disable" resourcePath="src/main/x86/avx512.cpp" toolsToInvoke="cdt.managedbuild.tool.gnu.cpp.compiler.exe.debug.1715496680.752248820">
+    <tool id="cdt.managedbuild.tool.gnu.cpp.compiler.exe.debug.1715496680.752248820" name="GCC C++ Compiler" superClass="cdt.managedbuild.tool.gnu.cpp.compiler.exe.debug.1715496680">
+        <option id="gnu.cpp.compiler.option.other.other.2124416907" name="Other flags" superClass="gnu.cpp.compiler.option.other.other" useByScannerDiscovery="false" value="-c -fmessage-length=0 -mavx512f -mavx512vl" valueType="string"/>
+        <inputType id="cdt.managedbuild.tool.gnu.cpp.compiler.input.1847585465" superClass="cdt.managedbuild.tool.gnu.cpp.compiler.input"/>
+    </tool>
+    <tool customBuildStep="true" id="org.eclipse.cdt.managedbuilder.ui.rcbs.874817721" name="Resource Custom Build Step">
+        <inputType id="org.eclipse.cdt.managedbuilder.ui.rcbs.inputtype.22498286" name="Resource Custom Build Step Input Type">
+            <additionalInput kind="additionalinputdependency" paths=""/>
+        </inputType>
+        <outputType id="org.eclipse.cdt.managedbuilder.ui.rcbs.outputtype.1153584601" name="Resource Custom Build Step Output Type"/>
+    </tool>
+</fileInfo>
 <fileInfo id="cdt.managedbuild.config.gnu.exe.debug.1330341797.356003736" name="sse4.cpp" rcbsApplicability="disable" resourcePath="src/main/x86/sse4.cpp" toolsToInvoke="cdt.managedbuild.tool.gnu.cpp.compiler.exe.debug.1715496680.73118722">
     <tool id="cdt.managedbuild.tool.gnu.cpp.compiler.exe.debug.1715496680.73118722" name="GCC C++ Compiler" superClass="cdt.managedbuild.tool.gnu.cpp.compiler.exe.debug.1715496680">
         <option id="gnu.cpp.compiler.option.other.other.824484852" name="Other flags" superClass="gnu.cpp.compiler.option.other.other" useByScannerDiscovery="false" value="-c -fmessage-length=0 -msse4 -msse4a" valueType="string"/>
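
Note on the hunk above: the new `fileInfo` entry enables `-mavx512f -mavx512vl` for `src/main/x86/avx512.cpp` only, much as the existing entry enables `-msse4 -msse4a` for `sse4.cpp`. Code compiled with per-file flags like these is usually guarded by a runtime CPU check, so that it never executes on hardware lacking the extension. The following is a minimal sketch of such a check using the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`. The function name `has_avx512f_vl` is hypothetical, and this is not the library's own detection code, which this diff does not show.

```cpp
// Hypothetical helper (not part of the commit): reports whether the running
// CPU advertises the two feature sets enabled by -mavx512f -mavx512vl.
bool has_avx512f_vl()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    // Initialise the compiler's CPU model data before querying it.
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vl");
#else
    // Other compilers or non-x86 targets: assume AVX-512 is unavailable.
    return false;
#endif
}
```

A dispatcher would call this once at start-up and select the AVX-512 routines only when it returns `true`, falling back to the AVX2 or SSE variants otherwise.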

CHANGELOG

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@
 *******************************************************************************

 === 1.0.15 ===
+* Several functions optimized for AVX-512 support.
 * Fixed several issues reported by PVS Studio static analyzer.
 * Fixed syntax error in C interface, covered with tests.
 * Bugfix in horizontal summing functions (invalid register clobber list).

README.md

Lines changed: 5 additions & 5 deletions
@@ -6,8 +6,8 @@ This library provides set of functions that perform SIMD-optimized
 computing on several hardware architectures.

 Currently supported set of SIMD extensions:
-* i586 architecture (32-bit): SSE, SSE2, SSE3, AVX, AVX2 and FMA3;
-* x86_64 architecture (64-bit): SSE, SSE2, SSE3, AVX, AVX2 and FMA3;
+* i586 architecture (32-bit): SSE, SSE2, SSE3, AVX, AVX2, FMA3 and AVX512;
+* x86_64 architecture (64-bit): SSE, SSE2, SSE3, AVX, AVX2, FMA3 and AVX512;
 * armv7 architecture (32-bit): NEON;
 * AArch64 architecture (64-bit): ASIMD.

@@ -44,11 +44,11 @@ The build and correct unit test execution has been confirmed for following platf
 ## Supported architectures

 The support of following list of hardware architectures has been implemented:
-* i386 (32-bit) - full support.
-* x86_64 (64-bit) - full support.
+* i386 (32-bit) - full support (AVX-512 on the way).
+* x86_64 (64-bit) - full support (AVX-512 on the way).
 * ARMv6A - full support.
 * ARMv7A - full support.
-* AArch64 - most functions.
+* AArch64 - full support.

 For all other architectures the generic implementation of algorithms is used, without any
 architecture-specific optimizations.

include/private/dsp/arch/x86/avx/complex.h

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 /*
- * Copyright (C) 2020 Linux Studio Plugins Project <https://lsp-plug.in/>
- * (C) 2020 Vladimir Sadovnikov <[email protected]>
+ * Copyright (C) 2023 Linux Studio Plugins Project <https://lsp-plug.in/>
+ * (C) 2023 Vladimir Sadovnikov <[email protected]>
  *
  * This file is part of lsp-dsp-lib
  * Created on: 31 мар. 2020 г.
@@ -507,7 +507,7 @@ namespace lsp
                   [CC] "o" (complex_div_const)
                 : "cc", "memory",
                   "%xmm0", "%xmm1", "%xmm2", "%xmm3",
-                  "%xmm4", "%xmm5"
+                  "%xmm4", "%xmm5", "%xmm6", "%xmm7"
             );
         }
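
Note on the hunk above: it extends the clobber list of an inline assembly block with `%xmm6` and `%xmm7`, presumably because the block's body uses those registers. In GCC-style extended asm, every register the template modifies that is not an output operand must be listed as a clobber. Otherwise the compiler may keep a live value in that register across the statement. The sketch below illustrates the rule with a deliberately tiny block. The helper name `copy4_via_xmm6` is hypothetical, and the code is unrelated to the library's `complex.h` routines.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the commit): copies four floats through
// %xmm6 so that the clobber list has something to declare.
void copy4_via_xmm6(float *dst, const float *src)
{
#if defined(__GNUC__) && defined(__SSE__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ __volatile__
    (
        "movups (%[src]), %%xmm6 \n\t"   // unaligned load of 4 floats
        "movups %%xmm6, (%[dst]) \n\t"   // unaligned store of 4 floats
        : // no register outputs; the store is covered by the "memory" clobber
        : [src] "r" (src), [dst] "r" (dst)
        : "memory",
          "%xmm6"   // the template overwrites %xmm6, so it must be declared here
    );
#else
    // Portable fallback for compilers or targets without GCC-style SSE asm.
    for (std::size_t i = 0; i < 4; ++i)
        dst[i] = src[i];
#endif
}
```

Dropping `"%xmm6"` from the list would still compile and might even appear to work, which is why this class of bug tends to surface only under specific optimization levels or inlining decisions.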

include/private/dsp/arch/x86/avx/copy.h

Lines changed: 6 additions & 0 deletions
@@ -327,6 +327,12 @@ namespace lsp

         void reverse2(float *dst, const float *src, size_t count)
         {
+            if (dst == src)
+            {
+                reverse1(dst, count);
+                return;
+            }
+
             ARCH_X86_ASM
             (
                 __ASM_EMIT("lea (%[dst], %[count], 4), %[dst]")
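
Note on the hunk above: the added guard sends the aliased case (`dst == src`) to the in-place routine `reverse1`. An out-of-place reversal that reads from one end while writing to the other would overwrite source elements before reading them when both pointers refer to the same buffer. The following is a portable scalar sketch of that idea. The function names mirror the diff, but the bodies are hypothetical stand-ins and not the library's AVX assembly.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical scalar stand-in: reverses the buffer in place by swapping
// elements from both ends toward the middle.
void reverse1(float *dst, std::size_t count)
{
    if (count < 2)
        return;
    for (std::size_t i = 0, j = count - 1; i < j; ++i, --j)
        std::swap(dst[i], dst[j]);
}

// Hypothetical scalar stand-in: writes src to dst in reverse order.
void reverse2(float *dst, const float *src, std::size_t count)
{
    // Same guard as in the diff: an aliased call is handled in place.
    if (dst == src)
    {
        reverse1(dst, count);
        return;
    }
    // This loop assumes dst and src do not overlap; partial overlap is
    // still not handled, just as the guard in the diff checks equality only.
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = src[count - 1 - i];
}
```

With the guard in place, `reverse2(buf, buf, n)` gives the same result as `reverse1(buf, n)`. Without it, the copy loop would mirror the second half of the buffer onto the first half and leave the second half unchanged.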
