Open
Description
When building on x86-64 (AMD Ryzen 7) with gcc 9.4.0-1ubuntu1~20.04.2, the _mm256_loadu2_m128i intrinsic is not declared, so the AVX build of oblas fails:
cc -D_DEFAULT_SOURCE -O3 -g -std=c99 -Wall -march=native -funroll-loops -ftree-vectorize -fno-inline -DOBLAS_AVX -DOCTMAT_ALIGN=32 -c -o oblas.o oblas.c
In file included from oblas.c:18:
oblas_avx.c: In function ‘oaxpy’:
oblas_avx.c:56:7: warning: implicit declaration of function ‘_mm256_loadu2_m128i’; did you mean ‘_mm256_loadu_si256’? [-Wimplicit-function-declaration]
56 | _mm256_loadu2_m128i((__m128i *)OCT_MUL_HI[u], (__m128i *)OCT_MUL_HI[u]);
| ^~~~~~~~~~~~~~~~~~~
| _mm256_loadu_si256
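One possible workaround, sketched here as an assumption rather than as the project's fix, is to build the same 256-bit value from two unaligned 128-bit loads using intrinsics that older GCC headers do provide. The name compat_mm256_loadu2_m128i and the helper compat_load_pair are hypothetical and are not part of oblas or of any compiler header. The target("avx") attribute is only there so the snippet compiles without -mavx; a build that already uses -march=native on an AVX machine would not need it.

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _mm256_loadu2_m128i(hiaddr, loaddr):
 * the low 128 bits come from loaddr, the high 128 bits from hiaddr.
 * Both loads are unaligned. */
__attribute__((target("avx")))
static inline __m256i compat_mm256_loadu2_m128i(const __m128i *hiaddr,
                                                const __m128i *loaddr)
{
    __m128i lo = _mm_loadu_si128(loaddr);
    __m128i hi = _mm_loadu_si128(hiaddr);
    /* Widen lo to 256 bits, then place hi in the upper lane. */
    return _mm256_insertf128_si256(_mm256_castsi128_si256(lo), hi, 1);
}

/* Hypothetical helper for checking the lane order: loads two 16-byte
 * arrays and stores the combined 32 bytes to out. */
__attribute__((target("avx")))
static void compat_load_pair(const uint8_t hi[16], const uint8_t lo[16],
                             uint8_t out[32])
{
    __m256i v = compat_mm256_loadu2_m128i((const __m128i *)hi,
                                          (const __m128i *)lo);
    _mm256_storeu_si256((__m256i *)out, v);
}
```

After compat_load_pair(hi, lo, out), bytes 0 to 15 of out should equal lo and bytes 16 to 31 should equal hi, which matches the argument order of the missing intrinsic (high address first, low address second).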
Compiling oblas with SSE instead (after cleaning out the partial AVX build) completes:
deps/oblas/liboblas.a:
- $(MAKE) -C deps/oblas CPPFLAGS+="-DOBLAS_AVX -DOCTMAT_ALIGN=32"
+ $(MAKE) -C deps/oblas CPPFLAGS+="-DOBLAS_SSE -DOCTMAT_ALIGN=32"
However, decoding then produces corrupt output: the decoded peace_and_war.txt file is wrong from offset 0x280 to 0x4ff, from 0x780 to 0x9ff, and so on. That is, the corrupt ranges are 0x280 bytes long and repeat every 0x500 bytes.
Compiling in classic mode works fine:
deps/oblas/liboblas.a:
- $(MAKE) -C deps/oblas CPPFLAGS+="-DOBLAS_AVX -DOCTMAT_ALIGN=32"
+ $(MAKE) -C deps/oblas CPPFLAGS+="-DOCTMAT_ALIGN=32"
Diffing the data.rq files produced by the classic and SSE builds shows that the corruption most likely happens during encoding, since the corrupted data is already present in the first block.
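To pin down the exact corrupt ranges when comparing output from two builds, a small helper along these lines could be used. This is only a sketch: report_diff_ranges is a hypothetical name, and it assumes both files have already been read into equally sized buffers.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: scans two buffers of equal length and reports each
 * contiguous run of differing bytes as an inclusive offset range.
 * Returns the number of runs found. Pass out = NULL to only count. */
size_t report_diff_ranges(const unsigned char *a, const unsigned char *b,
                          size_t len, FILE *out)
{
    size_t runs = 0;
    size_t i = 0;

    while (i < len) {
        if (a[i] == b[i]) {
            i++;
            continue;
        }
        /* Start of a differing run; extend it until the bytes match again. */
        size_t start = i;
        while (i < len && a[i] != b[i])
            i++;
        if (out)
            fprintf(out, "0x%zx-0x%zx (%zu bytes)\n", start, i - 1, i - start);
        runs++;
    }
    return runs;
}
```

Note that a corrupt region can be reported as several shorter runs if some of its bytes happen to match the reference by chance, so adjacent runs may need to be merged by eye.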