Commit 03a49f4

Some updates
1 parent 84de1a6 commit 03a49f4

11 files changed: +131 -132 lines changed

Project.toml

Lines changed: 5 additions & 5 deletions
@@ -5,21 +5,21 @@ version = "0.8.26"
 
 [deps]
 DocStringExtensions = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
+IfElse = "615f187c-cbe4-4ef1-ba3b-2fcf58d6d173"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 OffsetArrays = "6fe1bfb0-de20-5000-8ca7-80f57d26f881"
-SIMDPirates = "21efa798-c60a-11e8-04d3-e1a92915a26a"
 SLEEFPirates = "476501e8-09a2-5ece-8869-fb82de89a1fa"
 UnPack = "3a884ed6-31ef-47d7-9d2a-63182c4928ed"
 VectorizationBase = "3d5dd08c-fd9d-11e8-17fa-ed2836048c2f"
 
 [compat]
 DocStringExtensions = "0.8"
+IfElse = "0"
 OffsetArrays = "1"
-SIMDPirates = "0.8.25"
-SLEEFPirates = "0.5.4"
+SLEEFPirates = "0.6"
 UnPack = "0,1"
-VectorizationBase = "0.12.31"
-julia = "1.1"
+VectorizationBase = "0.13"
+julia = "1.3"
 
 [extras]
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
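
Note on the `[compat]` changes: a minimal sketch of which versions the new bounds admit, assuming the default caret-style semantics that Pkg documents for plain compat entries. The helpers `caret_range` and `allows` are hypothetical and written only for illustration; they are not part of Pkg, and they do not handle comma-separated entries such as `UnPack = "0,1"`.

```julia
# Hypothetical helper (not part of Pkg): returns the half-open range [lo, hi)
# that a caret-style entry such as "0.6" or "1.3" is assumed to admit.
function caret_range(spec::AbstractString)
    parts = parse.(Int, split(spec, '.'))
    lo = VersionNumber(parts...)
    # The upper bound bumps the leftmost nonzero component; if every listed
    # component is zero, it bumps the last one listed.
    k = something(findfirst(!iszero, parts), length(parts))
    hi_parts = zeros(Int, 3)
    hi_parts[k] = parts[k] + 1
    hi = VersionNumber(hi_parts...)
    return lo, hi
end

# Hypothetical helper: does version `v` fall inside the range for `spec`?
function allows(spec::AbstractString, v::VersionNumber)
    lo, hi = caret_range(spec)
    return lo <= v < hi
end

caret_range("0.6")           # range assumed for SLEEFPirates = "0.6"
allows("1.3", v"1.1.0")      # checks whether Julia 1.1 still satisfies julia = "1.3"
```

Under these assumptions, raising `julia` from `"1.1"` to `"1.3"` drops Julia 1.1 and 1.2, and `VectorizationBase = "0.13"` no longer admits the previously required `0.12.31`.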

docs/src/devdocs/loopset_structure.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ julia> LoopVectorization.operations(lsAmulB)
 var"##reduction#260" = LoopVectorization.vfmadd_fast(var"##tempload#258", var"##tempload#259", var"##reduction#260")
 var"##RHS#256" = LoopVectorization.reduce_to_add(var"##reduction#260", var"##RHS#256")
 ```
-The act of performing a "reduction" across a loop introduces a few extra operations that manage creating a "zero" with respect to the reduction, and then combining with the specified value using `reduce_to_add`, which performs any necessary type conversions, such as from an `SVec` vector-type to a scalar, if necessary. This simplifies code generation, by making the functions agnostic with respect to the actual vectorization decisions the library makes.
+The act of performing a "reduction" across a loop introduces a few extra operations that manage creating a "zero" with respect to the reduction, and then combining with the specified value using `reduce_to_add`, which performs any necessary type conversions, such as from an `Vec` vector-type to a scalar, if necessary. This simplifies code generation, by making the functions agnostic with respect to the actual vectorization decisions the library makes.
 
 Each operation is listed as depending on a set of loop iteration symbols:
 ```julia

docs/src/devdocs/lowering.md

Lines changed: 1 addition & 1 deletion
@@ -5,6 +5,6 @@ This task is made simpler via multiple dispatch making the lowering of the compo
 ```julia
 vload(vptr_A, (i,j,k))
 ```
-with the behavior of this load determined by the types of the arguments. Vectorization is expressed by making an index a `_MM{W}` type, rather than an integer, and operations with it will either produce another `_MM{W}` when it will still correspond to contiguous loads, or an `SVec{W,<:Integer}` if the resulting loads will be discontiguous, so that a `gather` or `scatter!` will be used. If all indexes are simply integers, then this produces a scalar load or store.
+with the behavior of this load determined by the types of the arguments. Vectorization is expressed by making an index a `_MM{W}` type, rather than an integer, and operations with it will either produce another `_MM{W}` when it will still correspond to contiguous loads, or an `Vec{W,<:Integer}` if the resulting loads will be discontiguous, so that a `gather` or `scatter!` will be used. If all indexes are simply integers, then this produces a scalar load or store.
 

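
Note on the paragraph changed in `lowering.md`: the dispatch idea it describes can be sketched in plain Julia without the library. Everything below is a hypothetical stand-in: `MMSketch{W}` plays the role of `_MM{W}`, a tuple of integers plays the role of `Vec{W,<:Integer}`, and `load_sketch` is not the library's `vload`. The sketch uses ordinary 1-based array indexing rather than pointers.

```julia
# Hypothetical stand-in for `_MM{W}`: W contiguous indices i, i+1, ..., i+W-1.
struct MMSketch{W}
    i::Int
end

# Adding a scalar offset keeps the lanes contiguous, so the result is still an MMSketch.
Base.:+(m::MMSketch{W}, k::Integer) where {W} = MMSketch{W}(m.i + k)

# Scaling by a stride makes the lanes discontiguous, so the result is an explicit
# tuple of indices (the stand-in for a vector of integers).
Base.:*(m::MMSketch{W}, s::Integer) where {W} = ntuple(l -> (m.i + l - 1) * s, Val(W))

# The kind of load is chosen purely by the type of the index argument.
load_sketch(A, i::Integer) = A[i]                                               # scalar load
load_sketch(A, m::MMSketch{W}) where {W} = ntuple(l -> A[m.i + l - 1], Val(W))  # contiguous load
load_sketch(A, idx::NTuple{W,Int}) where {W} = map(j -> A[j], idx)              # gather

A = collect(10:10:200)
load_sketch(A, 3)                   # scalar path
load_sketch(A, MMSketch{4}(2) + 1)  # contiguous path
load_sketch(A, MMSketch{4}(1) * 3)  # gather path
```

The calling code is identical in all three cases; only the index type differs, which is the property the paragraph relies on.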

src/LoopVectorization.jl

Lines changed: 22 additions & 19 deletions
@@ -1,22 +1,25 @@
 module LoopVectorization
 
-if (!isnothing(get(ENV, "TRAVIS_BRANCH", nothing)) || !isnothing(get(ENV, "APPVEYOR", nothing))) && isdefined(Base, :Experimental) && isdefined(Base.Experimental, Symbol("@optlevel"))
-@eval Base.Experimental.@optlevel 1
-end
+# if (!isnothing(get(ENV, "TRAVIS_BRANCH", nothing)) || !isnothing(get(ENV, "APPVEYOR", nothing))) && isdefined(Base, :Experimental) && isdefined(Base.Experimental, Symbol("@optlevel"))
+# @eval Base.Experimental.@optlevel 1
+# end
 
-using VectorizationBase, SIMDPirates, SLEEFPirates, UnPack, OffsetArrays
-using VectorizationBase: REGISTER_SIZE, extract_data, num_vector_load_expr,
-mask, masktable, pick_vector_width_val, valmul, valrem, valmuladd, valmulsub, valadd, valsub, _MM,
-maybestaticlength, maybestaticsize, staticm1, staticp1, staticmul, subsetview, vzero, stridedpointer_for_broadcast,
-Static, Zero, StaticUnitRange, StaticLowerUnitRange, StaticUpperUnitRange, unwrap, maybestaticrange,
-AbstractColumnMajorStridedPointer, AbstractRowMajorStridedPointer, AbstractSparseStridedPointer, AbstractStaticStridedPointer,
-PackedStridedPointer, SparseStridedPointer, RowMajorStridedPointer, StaticStridedPointer, StaticStridedStruct, offsetprecalc,
-maybestaticfirst, maybestaticlast, scalar_less, scalar_greater, noalias!, gesp, gepbyte, pointerforcomparison, NativeTypes, staticmul, staticmuladd
-using SIMDPirates: VECTOR_SYMBOLS, evadd, evsub, evmul, evfdiv, vrange,
-reduced_add, reduced_prod, reduce_to_add, reduced_max, reduced_min, vsum, vprod, vmaximum, vminimum,
-sizeequivalentfloat, sizeequivalentint, vadd!, vsub!, vmul!, vfdiv!, vfmadd!, vfnmadd!, vfmsub!, vfnmsub!,
-vfmadd231, vfmsub231, vfnmadd231, vfnmsub231, sizeequivalentfloat, sizeequivalentint, #prefetch,
-vmullog2, vmullog10, vdivlog2, vdivlog10, vmullog2add!, vmullog10add!, vdivlog2add!, vdivlog10add!, vfmaddaddone, vadd1, relu
+using VectorizationBase, SLEEFPirates, UnPack, OffsetArrays
+using VectorizationBase: REGISTER_SIZE, data,
+mask, pick_vector_width_val, MM,
+maybestaticlength, maybestaticsize, staticm1, staticp1, staticmul, vzero,
+Zero, maybestaticrange, offsetprecalc,
+maybestaticfirst, maybestaticlast, scalar_less, gesp, pointerforcomparison, NativeTypes, staticmul,
+relu
+using IfElse: ifelse
+
+const Static = StaticInt
+# missing: subsetview, stridedpointer_for_broadcast, unwrap, StaticUnitRange, stridedpointers, noalias!, gepbyte,
+# using SIMDPirates: VECTOR_SYMBOLS, evadd, evsub, evmul, evfdiv, vrange,
+# reduced_add, reduced_prod, reduce_to_add, reduced_max, reduced_min, vsum, vprod, vmaximum, vminimum,
+# sizeequivalentfloat, sizeequivalentint, vadd!, vsub!, vmul!, vfdiv!, vfmadd!, vfnmadd!, vfmsub!, vfnmsub!,
+# vfmadd231, vfmsub231, vfnmadd231, vfnmsub231, sizeequivalentfloat, sizeequivalentint, #prefetch,
+# vmullog2, vmullog10, vdivlog2, vdivlog10, vmullog2add!, vmullog10add!, vdivlog2add!, vdivlog10add!, vfmaddaddone, vadd1, relu
 using SLEEFPirates: pow
 using Base.Broadcast: Broadcasted, DefaultArrayStyle
 using LinearAlgebra: Adjoint, Transpose
@@ -46,7 +49,7 @@ If you want good performance, DO NOT use a 32-bit build of Julia if you don't ha
 const REGISTER_COUNT = Sys.ARCH === :i686 ? 8 : VectorizationBase.REGISTER_COUNT
 
 include("getconstindexes.jl")
-include("vectorizationbase_extensions.jl")
+# include("vectorizationbase_extensions.jl")
 include("predicates.jl")
 include("map.jl")
 include("filter.jl")
@@ -89,7 +92,7 @@ loop-reordering so as to improve performance:
 """
 LoopVectorization
 
-include("precompile.jl")
-_precompile_()
+# include("precompile.jl")
+# _precompile_()
 
 end # module

src/condense_loopset.jl

Lines changed: 3 additions & 3 deletions
@@ -161,9 +161,9 @@ function loopset_return_value(ls::LoopSet, ::Val{extract}) where {extract}
 op = getop(ls, ls.outer_reductions[1])
 if extract
 if (isu₁unrolled(op) | isu₂unrolled(op))
-Expr(:call, :extract_data, Symbol(mangledvar(op), 0))
+Expr(:call, :data, Symbol(mangledvar(op), 0))
 else
-Expr(:call, :extract_data, mangledvar(op))
+Expr(:call, :data, mangledvar(op))
 end
 else
 Symbol(mangledvar(op), 0)
@@ -174,7 +174,7 @@ function loopset_return_value(ls::LoopSet, ::Val{extract}) where {extract}
 for or ∈ ls.outer_reductions
 op = ops[or]
 if extract
-push!(ret.args, Expr(:call, :extract_data, Symbol(mangledvar(op), 0)))
+push!(ret.args, Expr(:call, :data, Symbol(mangledvar(op), 0)))
 else
 push!(ret.args, Symbol(mangledvar(ops[or]), 0))
 end
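
Note on the `extract_data` to `data` rename: the changed lines only build expressions, so the effect can be checked with Base Julia alone. The sketch below assumes nothing about VectorizationBase; `return_expr_sketch` is a hypothetical helper that mirrors the single-reduction branch of the hunk, with `name` standing in for the result of `mangledvar(op)`.

```julia
# Hypothetical helper mirroring the branch structure of the single-reduction case.
function return_expr_sketch(name::Symbol, unrolled::Bool, extract::Bool)
    if extract
        # Unrolled accumulators carry a numeric suffix: `Symbol(:acc, 0)` is `:acc0`.
        target = unrolled ? Symbol(name, 0) : name
        # Builds the call expression `data(target)` without evaluating it.
        Expr(:call, :data, target)
    else
        Symbol(name, 0)
    end
end

return_expr_sketch(:acc, true, true)    # a call expression targeting `acc0`
return_expr_sketch(:acc, false, false)  # just the suffixed symbol
```

Because the function name is passed as the bare symbol `:data`, the generated code resolves it wherever the expression is later evaluated, which is why the rename has to match the name imported in `src/LoopVectorization.jl`.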
