Commit 1cdf795

rstdoc: Update Expr and change log
1 parent c7b6c05 commit 1cdf795

3 files changed: 34 additions, 3 deletions

distrib/Readme/readme_history.txt

Lines changed: 27 additions & 0 deletions
@@ -11,6 +11,33 @@ https://avisynthplus.readthedocs.io/en/latest/avisynthdoc/changelist374.html
 and
 https://avisynthplus.readthedocs.io/en/latest/avisynthdoc/FilterSDK/FilterSDK.html#what-s-new-in-the-api-v11
 
+20250309 3.7.3 r----
+--------------------
+Expr: implement tanf in JitASM
+
+Benchmark script
+
+BlankClip(1000,1280,720,pixel_type="Y8")
+# sweeps the range from -Pi to +Pi
+s = "sxr 2 * 1 - 3.14159254 * 1 * tan 10 * 128 +"
+# sweeps the range from -5Pi to +5Pi
+# s = "sxr 2 * 1 - 3.14159254 * 5 * tan 10 * 128 +"
+a= Expr(s, optSSE2 = false, optAVX2=false, OptVectorC=False)
+b= Expr(s, optSSE2 = false, optAVX2=false)
+c= Expr(s, optSSE2 = True, optAVX2=false)
+d= Expr(s, optSSE2 = True, optAVX2=True)
+
+a # or b or c or d
+
+Results:
+
+                    MSVC   Intel ICX LLVM
+SinglePixel C    :    48               66   [fps]
+Vector friendly C:   122              175
+JitASM SSE       :   345   (same for both)
+JitASM AVX       :   727   (same for both)
+
+
 20250306 3.7.3 r4612
 --------------------
 Expr: Rewrite the C (non-Intel-JIT) path to support vectorization, if the compiler is capable.
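
To make the benchmark expression in this hunk easier to read, here is a minimal Python sketch that evaluates the same RPN string for a single pixel column. It is not AviSynth code and not how Expr is implemented. The helper name `eval_rpn` is hypothetical, only the tokens used by the benchmark are handled, and `sxr` is assumed to be the relative horizontal position `x / (width - 1)` in the 0..1 range. The final clamping and rounding to 8-bit that a Y8 clip would need is left out.

```python
import math

BENCH = "sxr 2 * 1 - 3.14159254 * 1 * tan 10 * 128 +"

def eval_rpn(expr: str, x: int, width: int) -> float:
    """Evaluate a tiny subset of Expr-style RPN for one pixel column.

    Hypothetical helper for illustration only; supports numbers,
    'sxr', 'tan' and the binary operators + - *.
    """
    binary = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack = []
    for tok in expr.split():
        if tok == "sxr":
            # Assumption: relative x coordinate, 0.0 at the left edge, 1.0 at the right.
            stack.append(x / (width - 1))
        elif tok == "tan":
            stack.append(math.tan(stack.pop()))
        elif tok in binary:
            b = stack.pop()
            a = stack.pop()
            stack.append(binary[tok](a, b))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed RPN: %d values left on the stack" % len(stack))
    return stack[0]

if __name__ == "__main__":
    width = 9
    for x in range(width):
        print(x, round(eval_rpn(BENCH, x, width), 3))
```

Under these assumptions the argument of `tan` runs from about -Pi at the left edge to about +Pi at the right edge, and the result is centered on 128 with an amplitude factor of 10.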

distrib/docs/english/source/avisynthdoc/changelist374.rst

Lines changed: 1 addition & 0 deletions
@@ -164,6 +164,7 @@ Optimizations
 - Expr: rewritten the C (non-Intel-JIT) path to support vectorization, if the compiler is capable.
   Useful for non-Intel platforms where the (Intel SSE2-AVX2) JIT compiler does not work.
   Expect 3-20x speedup compared to the old method.
+- Expr: implement ``tan`` in JitASM. Expect ~6-15x speedup for an expression like ``sxr 2 * 1 - 3.14159254 * 1 * tan 10 * 128 +``.
 
 Documentation
 ~~~~~~~~~~~~~
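
The "~6-15x" figure can be cross-checked against the fps numbers recorded in the `readme_history.txt` hunk of this commit. The Python sketch below only does that arithmetic. The dictionary layout and the function name `speedups` are illustrative, and reading the claim as "JitASM relative to the single-pixel C path" is an assumption, not something the commit states.

```python
# fps values copied from the Results table in readme_history.txt
C_PATHS = {
    "SinglePixel C": {"MSVC": 48, "Intel ICX LLVM": 66},
    "Vector friendly C": {"MSVC": 122, "Intel ICX LLVM": 175},
}
# The table lists the JitASM results as the same for both compilers.
JIT_PATHS = {"JitASM SSE": 345, "JitASM AVX": 727}

def speedups(baseline: str) -> dict:
    """Return JitASM fps divided by the chosen C baseline, rounded to 0.1."""
    result = {}
    for jit_name, jit_fps in JIT_PATHS.items():
        for compiler, base_fps in C_PATHS[baseline].items():
            result[(jit_name, compiler)] = round(jit_fps / base_fps, 1)
    return result

if __name__ == "__main__":
    for baseline in C_PATHS:
        for key, ratio in speedups(baseline).items():
            print(baseline, key, ratio)
```

Against the single-pixel C path the ratios range from roughly 5x (SSE versus the ICX build) to roughly 15x (AVX versus the MSVC build). Against the vector-friendly C path they are noticeably smaller.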

distrib/docs/english/source/avisynthdoc/corefilters/expr.rst

Lines changed: 6 additions & 3 deletions
@@ -188,7 +188,8 @@ Syntax and Parameters
 C code is run when
 
 - non x86/x64 systems (architectures which are not supported by JIT compiler)
-- expression contains tan, atan, asin, acos, which are not implemented in JIT.
+- expression contains atan, asin, acos (and, before 3.7.4, ``tan``),
+  which are not implemented in JIT.
 - JIT is intentionally disabled with optSSE2=False
 
 Enables or disables a compiler friendly (more easily vectorizable) C code.
@@ -254,8 +255,9 @@ Expr language/RPN elements
 * Function: ``clip`` three operand function for clipping. Example: ``x 16 240
   clip`` means min((max(x,16),240)
 * Functions: ``sin cos atan2 tan asin acos atan`` |br| On Intel x86/x64 the
-  functions ``sin``, ``cos`` and ``atan2`` have SSE2/AVX2 optimization, the others
-  have not (they make the whole expression to evaluate without SIMD optimization).
+  functions ``sin``, ``cos``, ``tan`` and ``atan2`` have SSE2/AVX2 optimization;
+  the others do not (using e.g. ``acos`` makes the whole expression evaluate
+  without JitASM optimization).
 * Functions: ``round, floor, ceil, trunc`` operators (nearest integer - banker's
   rounding, round down, round up, round to zero). |br| On Intel builds acceleration
   requires at least SSE4.1 capable processor or else the whole expression is
@@ -552,6 +554,7 @@ Changelog
 +=================+==========================================================+
 | 3.7.4           || Enhancement: vectorizable C implementation helps nonJIT |
 |                 || New parameter: optVectorC                               |
+|                 || Implement ``tan`` for JitASM                            |
 +-----------------+----------------------------------------------------------+
 | AviSynth+ 3.7.2 || Expr: ``scale_inputs`` to case insensitive and add      |
 |                 | floatUV to error message as an allowed value.            |
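
The fallback rules listed in the `expr.rst` hunk ("C code is run when ...") can be summarized as a small predicate. This is a sketch of the documented conditions only, not Expr's actual dispatch code. The function name `falls_back_to_c` and its keyword parameters are hypothetical, tokens are assumed to be whitespace-separated, and the separate SSE4.1 requirement for `round`/`floor`/`ceil`/`trunc` is ignored.

```python
# Functions the documentation lists as not implemented in the JIT.
NON_JIT_FUNCS = frozenset({"atan", "asin", "acos"})

def falls_back_to_c(expr: str, *, is_x86: bool = True,
                    opt_sse2: bool = True, jit_has_tan: bool = True) -> bool:
    """Guess whether an expression would take the C path.

    Hypothetical helper mirroring the three documented conditions:
    non-x86/x64 system, a non-JIT function in the expression,
    or JIT disabled via optSSE2=False.
    """
    if not is_x86 or not opt_sse2:
        return True
    blocked = NON_JIT_FUNCS if jit_has_tan else NON_JIT_FUNCS | {"tan"}
    # Compare whole tokens so that 'atan2' (JIT-accelerated) is not
    # mistaken for 'atan' (not accelerated).
    return any(tok in blocked for tok in expr.split())

if __name__ == "__main__":
    bench = "sxr 2 * 1 - 3.14159254 * 1 * tan 10 * 128 +"
    print(falls_back_to_c(bench))                     # with tan in the JIT
    print(falls_back_to_c(bench, jit_has_tan=False))  # before this commit
```

Matching whole tokens rather than substrings matters here because `atan2` is accelerated while `atan` is not.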
