Commit 087f134

Only disable SLP autovectorization of _PyEval_EvalFrameDefault on newer
GCCs, as the optimization bug seems to exist only on GCC 12 and later, and before GCC 9 disabling the optimization has a dramatic performance impact.
1 parent 1f5682f commit 087f134

1 file changed: +7 -4 lines changed
Python/ceval.c

Lines changed: 7 additions & 4 deletions
@@ -948,11 +948,14 @@ _PyObjectArray_Free(PyObject **array, PyObject **scratch)
 #include "generated_cases.c.h"
 #endif
 
-#if (defined(__GNUC__) && !defined(__clang__)) && defined(__x86_64__)
+#if (defined(__GNUC__) && __GNUC__ >= 10 && !defined(__clang__)) && defined(__x86_64__)
 /*
- * gh-129987: The SLP autovectorizer can cause poor code generation for opcode
- * dispatch, negating any benefit we get from vectorization elsewhere in the
- * interpreter loop.
+ * gh-129987: The SLP autovectorizer can cause poor code generation for
+ * opcode dispatch in some GCC versions (observed in GCCs 12 through 15),
+ * negating any benefit we get from vectorization elsewhere in the
+ * interpreter loop. Disabling it significantly affected older GCC versions
+ * (prior to GCC 9, 40% performance drop), so we have to selectively disable
+ * it.
  */
 #define DONT_SLP_VECTORIZE __attribute__((optimize ("no-tree-slp-vectorize")))
 #else
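
For illustration, here is a minimal standalone sketch of how a version-gated attribute macro like this can be applied to a function. The guard and the `DONT_SLP_VECTORIZE` definition are copied from the diff. The empty fallback in the `#else` branch, the `SLP_GUARD_ACTIVE` macro, and the functions `sum4` and `slp_guard_active` are hypothetical and are not taken from `Python/ceval.c`, since the hunk ends at `#else`.

```c
/* Same guard as in the diff: GCC 10 or newer (not clang) on x86-64. */
#if (defined(__GNUC__) && __GNUC__ >= 10 && !defined(__clang__)) && defined(__x86_64__)
#  define DONT_SLP_VECTORIZE __attribute__((optimize ("no-tree-slp-vectorize")))
#  define SLP_GUARD_ACTIVE 1   /* hypothetical helper macro, for demonstration only */
#else
/* Assumption: on other compilers the macro expands to nothing.
 * The diff does not show this branch. */
#  define DONT_SLP_VECTORIZE
#  define SLP_GUARD_ACTIVE 0
#endif

/* Hypothetical stand-in for the interpreter loop: when the guard is active,
 * GCC compiles this one function with -fno-tree-slp-vectorize while the rest
 * of the translation unit keeps its normal optimization flags. */
int DONT_SLP_VECTORIZE
sum4(const int *v)
{
    return v[0] + v[1] + v[2] + v[3];
}

/* Reports whether the guard selected the attribute on this compiler. */
int
slp_guard_active(void)
{
    return SLP_GUARD_ACTIVE;
}
```

Because the attribute is attached per function, the effect stays local to the function it annotates. Comparing the assembly of `sum4` with and without the attribute (for example with `gcc -O3 -S`) is one way to check what the guard does on a given GCC version.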
