SSE/AVX code optimization by troosh · Pull Request #2 · sirzooro/RakeSearch

troosh · 2018-01-12T02:17:01Z

I tried to optimize the SSE/AVX version of MovePairSearch::MovePairSearch().
However, I'm not sure the result is correct (why does the test WU do some work but not find anything?).


sirzooro · 2018-01-12T10:14:14Z

Thanks for your contribution!

The mask uses 9 bits, so after packing vector elements to int8 you are losing one bit. You can fix this by calling _mm_packs_epi16 on the result of _mm_cmpeq_epi16.
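To illustrate the point about ordering, here is a minimal sketch. It is not the code from this PR: the function names are hypothetical, and it assumes the 9-bit masks sit in 16-bit lanes and are tested for equality against a target value. Comparing at 16-bit width first keeps all 9 bits in play, and the resulting 0xFFFF/0x0000 lanes pack to 0xFF/0x00 without loss. Packing first saturates every value above 127 to 127, so distinct masks can collide.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: bit i of the result is set when lanes[i] == target.
// The comparison runs on 16-bit lanes, so all 9 mask bits take part in it.
inline int equal_lanes_compare_then_pack(const std::uint16_t lanes[8],
                                         std::uint16_t target) {
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(lanes));
    const __m128i t = _mm_set1_epi16(static_cast<short>(target));
    const __m128i eq16 = _mm_cmpeq_epi16(v, t);  // 0xFFFF or 0x0000 per lane
    // Signed saturation maps -1 to 0xFF and 0 to 0x00, so nothing is lost.
    const __m128i eq8 = _mm_packs_epi16(eq16, _mm_setzero_si128());
    return _mm_movemask_epi8(eq8) & 0xFF;
}

// For contrast only: packing the raw 9-bit values before comparing.
// Every value above 127 saturates to 127, so different masks compare equal.
inline int equal_lanes_pack_then_compare(const std::uint16_t lanes[8],
                                         std::uint16_t target) {
    const __m128i zero = _mm_setzero_si128();
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(lanes));
    const __m128i t = _mm_set1_epi16(static_cast<short>(target));
    const __m128i v8 = _mm_packs_epi16(v, zero);
    const __m128i t8 = _mm_packs_epi16(t, zero);
    // Only the low 8 bytes carry data; the upper bytes are masked off.
    return _mm_movemask_epi8(_mm_cmpeq_epi8(v8, t8)) & 0xFF;
}
```

With lanes {0x100, 0x180, 0x001, 0x1FF, 0x000, 0x080, 0x07F, 0x100} and target 0x100, the first function flags only lanes 0 and 7, while the second also flags every other lane whose value is 127 or more.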

BTW, I have a few other optimizations waiting on my PC that are not pushed here yet. One of them changes the element type in squareA_MaskT to uint16_t, so one SSE vector can hold a whole row, like AVX2 does now. Looking at your changes, I realized that the code can be optimized further by using the packs instruction. Thanks again!

troosh · 2018-01-12T12:51:28Z

It's a pity that you did not enable Issues in the repository, so I'll write here, sorry.

It would be useful to use the WUs from https://github.com/sirzooro/RakeSearch/releases/download/v1.0/test.tgz as the default workunit, together with a script to check the results. Or ask @CrystalFrost about this.

sirzooro · 2018-01-17T22:29:43Z

I have enabled issues; they were disabled by default (probably inherited when I forked this repo).

I also pushed all my new changes to the branch optimizations2. Please take a look, to avoid duplicate work.

troosh added 2 commits January 12, 2018 03:39

SSE2: two options for not using the mask4to1bits array

bf00c82

SSE2/AVX: Lifting up code of compression words to bytes (not the fact that this is better, at least on AMD A10-5800)

6b35349

troosh added 3 commits January 17, 2018 00:21

Fix build under ARM32

4b8ddb1

Reduce size of squareA_MaskT[][]

e0d8a83

Unification of 32 and 64 bit codes for ARM processor

346608c

troosh wants to merge 5 commits into sirzooro:boinc from troosh:boinc
