Commit 945c694
committed
avx512f: implement slide_left and slide_right.
With a fast path for N = 4*i and a split version else (inspired from avx512bw).
As the vpermd actually has the lower latency on some intel cpus compared to vpermw / vpermb, also use it instead of the avx512bw and avx512vbmi implementations if trivially possible.
Also reverse the order of the slow path for lower latency. It used to be:
```
SLR -> PERM ->
OR -> PERM
SLL ->
```
Now the latency is reduced to:
```
SLR -> PERM ->
OR
SLL -> PERM ->
```
And it should now generate even better code on avx512bw with N=63 with only one PERM (as already done for N=1).
For N=16,32,48, it prefers vshufi32x4 over vpermd for lower latency on Zen4 and decreased register usage.1 parent 45cbf4b commit 945c694
File tree
4 files changed
+113
-147
lines changed- include/xsimd/arch
- test
4 files changed
+113
-147
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
484 | 484 | | |
485 | 485 | | |
486 | 486 | | |
487 | | - | |
488 | | - | |
489 | | - | |
490 | | - | |
491 | | - | |
492 | | - | |
493 | | - | |
494 | | - | |
495 | | - | |
496 | | - | |
497 | | - | |
498 | | - | |
499 | | - | |
500 | | - | |
501 | | - | |
502 | | - | |
503 | | - | |
504 | | - | |
505 | | - | |
506 | | - | |
507 | | - | |
| 487 | + | |
508 | 488 | | |
509 | 489 | | |
510 | | - | |
511 | | - | |
512 | | - | |
513 | | - | |
514 | | - | |
515 | | - | |
516 | | - | |
517 | | - | |
518 | | - | |
519 | | - | |
520 | | - | |
521 | | - | |
522 | | - | |
523 | | - | |
524 | | - | |
525 | | - | |
526 | | - | |
| 490 | + | |
527 | 491 | | |
528 | | - | |
529 | | - | |
530 | | - | |
531 | | - | |
532 | | - | |
533 | | - | |
534 | | - | |
535 | | - | |
536 | | - | |
537 | | - | |
538 | | - | |
539 | | - | |
540 | 492 | | |
541 | | - | |
542 | | - | |
| 493 | + | |
| 494 | + | |
543 | 495 | | |
544 | 496 | | |
545 | 497 | | |
546 | | - | |
547 | | - | |
548 | | - | |
549 | | - | |
550 | | - | |
551 | | - | |
552 | | - | |
553 | | - | |
554 | | - | |
555 | | - | |
556 | | - | |
557 | | - | |
558 | | - | |
559 | | - | |
560 | | - | |
561 | | - | |
562 | | - | |
563 | | - | |
564 | | - | |
565 | | - | |
| 498 | + | |
566 | 499 | | |
567 | 500 | | |
568 | | - | |
569 | | - | |
570 | | - | |
571 | | - | |
572 | | - | |
573 | | - | |
574 | | - | |
575 | | - | |
576 | | - | |
577 | | - | |
578 | | - | |
579 | | - | |
580 | | - | |
581 | | - | |
582 | | - | |
583 | | - | |
584 | | - | |
585 | | - | |
586 | | - | |
587 | | - | |
588 | | - | |
589 | | - | |
590 | | - | |
| 501 | + | |
| 502 | + | |
591 | 503 | | |
592 | | - | |
593 | | - | |
| 504 | + | |
| 505 | + | |
594 | 506 | | |
595 | 507 | | |
596 | 508 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1859 | 1859 | | |
1860 | 1860 | | |
1861 | 1861 | | |
| 1862 | + | |
| 1863 | + | |
| 1864 | + | |
| 1865 | + | |
| 1866 | + | |
| 1867 | + | |
| 1868 | + | |
| 1869 | + | |
| 1870 | + | |
| 1871 | + | |
| 1872 | + | |
| 1873 | + | |
| 1874 | + | |
| 1875 | + | |
| 1876 | + | |
| 1877 | + | |
| 1878 | + | |
| 1879 | + | |
| 1880 | + | |
| 1881 | + | |
| 1882 | + | |
| 1883 | + | |
| 1884 | + | |
| 1885 | + | |
| 1886 | + | |
| 1887 | + | |
| 1888 | + | |
| 1889 | + | |
| 1890 | + | |
| 1891 | + | |
| 1892 | + | |
| 1893 | + | |
| 1894 | + | |
| 1895 | + | |
| 1896 | + | |
| 1897 | + | |
| 1898 | + | |
| 1899 | + | |
1862 | 1900 | | |
1863 | | - | |
| 1901 | + | |
1864 | 1902 | | |
1865 | | - | |
1866 | | - | |
| 1903 | + | |
| 1904 | + | |
| 1905 | + | |
| 1906 | + | |
| 1907 | + | |
| 1908 | + | |
| 1909 | + | |
| 1910 | + | |
| 1911 | + | |
| 1912 | + | |
1867 | 1913 | | |
1868 | 1914 | | |
1869 | 1915 | | |
| 1916 | + | |
| 1917 | + | |
| 1918 | + | |
| 1919 | + | |
| 1920 | + | |
| 1921 | + | |
| 1922 | + | |
| 1923 | + | |
| 1924 | + | |
| 1925 | + | |
| 1926 | + | |
| 1927 | + | |
| 1928 | + | |
| 1929 | + | |
| 1930 | + | |
| 1931 | + | |
| 1932 | + | |
| 1933 | + | |
| 1934 | + | |
| 1935 | + | |
| 1936 | + | |
| 1937 | + | |
| 1938 | + | |
| 1939 | + | |
| 1940 | + | |
| 1941 | + | |
| 1942 | + | |
| 1943 | + | |
| 1944 | + | |
| 1945 | + | |
| 1946 | + | |
| 1947 | + | |
| 1948 | + | |
| 1949 | + | |
| 1950 | + | |
| 1951 | + | |
| 1952 | + | |
1870 | 1953 | | |
1871 | | - | |
| 1954 | + | |
1872 | 1955 | | |
1873 | | - | |
1874 | | - | |
| 1956 | + | |
| 1957 | + | |
| 1958 | + | |
| 1959 | + | |
| 1960 | + | |
| 1961 | + | |
| 1962 | + | |
| 1963 | + | |
| 1964 | + | |
| 1965 | + | |
1875 | 1966 | | |
1876 | 1967 | | |
1877 | 1968 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
24 | 24 | | |
25 | 25 | | |
26 | 26 | | |
27 | | - | |
28 | | - | |
29 | | - | |
30 | | - | |
31 | | - | |
32 | | - | |
33 | | - | |
34 | | - | |
35 | | - | |
36 | | - | |
37 | | - | |
38 | | - | |
39 | | - | |
40 | | - | |
41 | | - | |
42 | | - | |
43 | | - | |
44 | | - | |
45 | | - | |
46 | | - | |
47 | | - | |
48 | 27 | | |
49 | | - | |
| 28 | + | |
50 | 29 | | |
51 | 30 | | |
52 | | - | |
53 | | - | |
54 | | - | |
55 | | - | |
56 | | - | |
57 | | - | |
58 | | - | |
59 | | - | |
| 31 | + | |
60 | 32 | | |
61 | 33 | | |
62 | | - | |
| 34 | + | |
63 | 35 | | |
64 | 36 | | |
65 | 37 | | |
66 | 38 | | |
67 | | - | |
| 39 | + | |
68 | 40 | | |
69 | 41 | | |
70 | | - | |
71 | | - | |
72 | | - | |
73 | | - | |
74 | | - | |
75 | | - | |
76 | | - | |
77 | | - | |
| 42 | + | |
| 43 | + | |
78 | 44 | | |
79 | | - | |
| 45 | + | |
80 | 46 | | |
81 | 47 | | |
82 | 48 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
166 | 166 | | |
167 | 167 | | |
168 | 168 | | |
169 | | - | |
170 | 169 | | |
171 | 170 | | |
172 | 171 | | |
| |||
270 | 269 | | |
271 | 270 | | |
272 | 271 | | |
273 | | - | |
274 | | - | |
275 | 272 | | |
276 | 273 | | |
277 | 274 | | |
| |||
0 commit comments