wsmoses commented Oct 4, 2024

Trying to fix the CI deps, and also have everyone co-dev each other.

wsmoses requested a review from vchuravy on October 4, 2024, 02:47
github-actions bot commented Oct 4, 2024

Benchmark Results

| Benchmark | main | 40732b2... | main/40732b2db20fd3... |
|---|---|---|---|
| saxpy/default/Float16/1024 | 2.77 ± 0.2 μs | 2.78 ± 0.2 μs | 0.998 |
| saxpy/default/Float16/1048576 | 2.08 ± 0.0062 ms | 2.08 ± 0.0064 ms | 1 |
| saxpy/default/Float16/16384 | 0.0328 ± 0.00014 ms | 0.0328 ± 0.00014 ms | 1 |
| saxpy/default/Float16/2048 | 5.2 ± 0.057 μs | 5.21 ± 0.043 μs | 0.999 |
| saxpy/default/Float16/256 | 0.991 ± 0.14 μs | 0.968 ± 0.11 μs | 1.02 |
| saxpy/default/Float16/262144 | 0.524 ± 0.0095 ms | 0.524 ± 0.0093 ms | 1 |
| saxpy/default/Float16/32768 | 0.065 ± 0.00018 ms | 0.065 ± 0.00017 ms | 0.999 |
| saxpy/default/Float16/4096 | 10.1 ± 0.05 μs | 10.1 ± 0.051 μs | 0.998 |
| saxpy/default/Float16/512 | 1.56 ± 0.16 μs | 1.56 ± 0.043 μs | 0.997 |
| saxpy/default/Float16/64 | 0.642 ± 0.016 μs | 0.605 ± 0.016 μs | 1.06 |
| saxpy/default/Float16/65536 | 0.129 ± 0.00033 ms | 0.129 ± 0.00043 ms | 1 |
| saxpy/default/Float32/1024 | 1.03 ± 0.033 μs | 1.01 ± 0.013 μs | 1.02 |
| saxpy/default/Float32/1048576 | 0.964 ± 0.008 ms | 0.963 ± 0.0078 ms | 1 |
| saxpy/default/Float32/16384 | 15.5 ± 0.12 μs | 15.4 ± 0.12 μs | 1 |
| saxpy/default/Float32/2048 | 1.74 ± 0.024 μs | 1.71 ± 0.018 μs | 1.01 |
| saxpy/default/Float32/256 | 0.574 ± 0.12 μs | 0.53 ± 0.13 μs | 1.08 |
| saxpy/default/Float32/262144 | 0.238 ± 0.0094 ms | 0.238 ± 0.0094 ms | 0.999 |
| saxpy/default/Float32/32768 | 30.4 ± 0.17 μs | 30.4 ± 0.19 μs | 1 |
| saxpy/default/Float32/4096 | 3.03 ± 0.025 μs | 3.01 ± 0.021 μs | 1 |
| saxpy/default/Float32/512 | 0.788 ± 0.12 μs | 0.683 ± 0.12 μs | 1.15 |
| saxpy/default/Float32/64 | 0.416 ± 0.0065 μs | 0.395 ± 0.0063 μs | 1.05 |
| saxpy/default/Float32/65536 | 0.06 ± 0.00026 ms | 0.06 ± 0.00023 ms | 1 |
| saxpy/default/Float64/1024 | 1.07 ± 0.02 μs | 1.08 ± 0.021 μs | 0.988 |
| saxpy/default/Float64/1048576 | 1.02 ± 0.027 ms | 1.03 ± 0.022 ms | 0.99 |
| saxpy/default/Float64/16384 | 15.7 ± 0.13 μs | 16.2 ± 0.54 μs | 0.969 |
| saxpy/default/Float64/2048 | 1.75 ± 0.025 μs | 1.74 ± 0.021 μs | 1 |
| saxpy/default/Float64/256 | 0.527 ± 0.014 μs | 0.518 ± 0.012 μs | 1.02 |
| saxpy/default/Float64/262144 | 0.243 ± 0.0096 ms | 0.242 ± 0.0098 ms | 1 |
| saxpy/default/Float64/32768 | 31.1 ± 0.73 μs | 0.0322 ± 0.0017 ms | 0.968 |
| saxpy/default/Float64/4096 | 3.05 ± 0.04 μs | 3.04 ± 0.038 μs | 1 |
| saxpy/default/Float64/512 | 0.716 ± 0.11 μs | 0.698 ± 0.12 μs | 1.03 |
| saxpy/default/Float64/64 | 0.397 ± 0.0067 μs | 0.383 ± 0.009 μs | 1.04 |
| saxpy/default/Float64/65536 | 0.0612 ± 0.00067 ms | 0.0611 ± 0.0011 ms | 1 |
| saxpy/static workgroup=(1024,)/Float16/1024 | 2.12 ± 0.21 μs | 2.08 ± 0.21 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float16/1048576 | 0.161 ± 0.01 ms | 0.171 ± 0.019 ms | 0.94 |
| saxpy/static workgroup=(1024,)/Float16/16384 | 4.39 ± 0.22 μs | 4.3 ± 0.21 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float16/2048 | 2.17 ± 0.22 μs | 2.11 ± 0.22 μs | 1.03 |
| saxpy/static workgroup=(1024,)/Float16/256 | 2.65 ± 0.045 μs | 2.63 ± 0.047 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float16/262144 | 0.0432 ± 0.0028 ms | 0.043 ± 0.0023 ms | 1 |
| saxpy/static workgroup=(1024,)/Float16/32768 | 6.57 ± 0.26 μs | 6.69 ± 0.23 μs | 0.981 |
| saxpy/static workgroup=(1024,)/Float16/4096 | 2.46 ± 0.038 μs | 2.41 ± 0.039 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float16/512 | 3.16 ± 0.086 μs | 3.15 ± 0.22 μs | 1 |
| saxpy/static workgroup=(1024,)/Float16/64 | 2.28 ± 0.026 μs | 2.26 ± 0.024 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float16/65536 | 12.5 ± 0.54 μs | 12.2 ± 0.32 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float32/1024 | 1.96 ± 0.027 μs | 1.95 ± 0.029 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/1048576 | 0.232 ± 0.027 ms | 0.262 ± 0.038 ms | 0.883 |
| saxpy/static workgroup=(1024,)/Float32/16384 | 4.09 ± 0.91 μs | 4.68 ± 0.82 μs | 0.874 |
| saxpy/static workgroup=(1024,)/Float32/2048 | 2.3 ± 0.22 μs | 2.28 ± 0.23 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/256 | 2.83 ± 1.6 μs | 2.73 ± 1.6 μs | 1.04 |
| saxpy/static workgroup=(1024,)/Float32/262144 | 0.0578 ± 0.0063 ms | 0.0583 ± 0.0065 ms | 0.991 |
| saxpy/static workgroup=(1024,)/Float32/32768 | 6.9 ± 0.38 μs | 7.05 ± 0.4 μs | 0.979 |
| saxpy/static workgroup=(1024,)/Float32/4096 | 2.59 ± 0.19 μs | 2.52 ± 0.23 μs | 1.03 |
| saxpy/static workgroup=(1024,)/Float32/512 | 2.51 ± 0.23 μs | 2.47 ± 0.22 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float32/64 | 2.46 ± 0.069 μs | 2.44 ± 0.075 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/65536 | 16.5 ± 1.8 μs | 16.2 ± 2.2 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float64/1024 | 2.07 ± 0.029 μs | 2.06 ± 0.042 μs | 1 |
| saxpy/static workgroup=(1024,)/Float64/1048576 | 0.516 ± 0.06 ms | 0.56 ± 0.063 ms | 0.922 |
| saxpy/static workgroup=(1024,)/Float64/16384 | 6.98 ± 1.4 μs | 7.06 ± 1 μs | 0.99 |
| saxpy/static workgroup=(1024,)/Float64/2048 | 2.56 ± 0.34 μs | 2.54 ± 0.25 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float64/256 | 2.42 ± 0.056 μs | 5.17 ± 7.1 μs | 0.468 |
| saxpy/static workgroup=(1024,)/Float64/262144 | 0.107 ± 0.016 ms | 0.111 ± 0.013 ms | 0.963 |
| saxpy/static workgroup=(1024,)/Float64/32768 | 17.4 ± 2.1 μs | 15.9 ± 2.1 μs | 1.09 |
| saxpy/static workgroup=(1024,)/Float64/4096 | 3.12 ± 0.35 μs | 2.88 ± 0.17 μs | 1.08 |
| saxpy/static workgroup=(1024,)/Float64/512 | 2.4 ± 0.042 μs | 2.4 ± 0.049 μs | 1 |
| saxpy/static workgroup=(1024,)/Float64/64 | 2.39 ± 0.12 μs | 2.39 ± 0.12 μs | 1 |
| saxpy/static workgroup=(1024,)/Float64/65536 | 31.1 ± 3.9 μs | 30.4 ± 4.3 μs | 1.02 |
| time_to_load | 0.31 ± 0.00063 s | 0.312 ± 0.0015 s | 0.993 |
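
Going by its header, the last column is the first timing divided by the second. The Python sketch below recomputes that ratio for two rows from the rounded values shown: one where the two timings use different units, and the 0.468 outlier. It is only an illustration. The helper names (`to_seconds`, `ratio`, `spreads_overlap`) are hypothetical, and reading the ± figure as a symmetric spread around the mean is an assumption, since the bot does not say what it reports.

```python
# Multipliers to seconds for the units that appear in the table.
UNIT_TO_SECONDS = {"μs": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(value, unit):
    """Convert a reading in the given unit to seconds."""
    return value * UNIT_TO_SECONDS[unit]


def ratio(main, other):
    """Ratio main/other for two (mean, spread, unit) readings."""
    return to_seconds(main[0], main[2]) / to_seconds(other[0], other[2])


def spreads_overlap(a, b):
    """True if the intervals mean ± spread of the two readings overlap.

    Assumption: the ± figure is a symmetric spread around the mean.
    """
    a_lo, a_hi = to_seconds(a[0] - a[1], a[2]), to_seconds(a[0] + a[1], a[2])
    b_lo, b_hi = to_seconds(b[0] - b[1], b[2]), to_seconds(b[0] + b[1], b[2])
    return a_lo <= b_hi and b_lo <= a_hi


# saxpy/default/Float64/32768: the two columns use different units.
mixed_main, mixed_other = (31.1, 0.73, "μs"), (0.0322, 0.0017, "ms")
# saxpy/static workgroup=(1024,)/Float64/256: the 0.468 outlier.
outlier_main, outlier_other = (2.42, 0.056, "μs"), (5.17, 7.1, "μs")

print(round(ratio(mixed_main, mixed_other), 3))
print(round(ratio(outlier_main, outlier_other), 3))
print(spreads_overlap(outlier_main, outlier_other))
```

Recomputing from the rounded table values gives about 0.966 for the mixed-unit row against the reported 0.968, presumably because the bot divides unrounded timings. The outlier row reproduces 0.468, and under the spread assumption its second interval is wide enough to contain the first, so that row is more likely noise than a real change.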

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

vchuravy commented Oct 4, 2024

CI still says "no"

ERROR: LoadError: Unsatisfiable requirements detected for package EnzymeCore [f151be2c]:
 EnzymeCore [f151be2c] log:
 ├─possible versions are: 0.1.0-0.8.4 or uninstalled
 ├─restricted to versions 0.8.1-0.8 by KernelAbstractions [63c18a36], leaving only versions 0.8.1-0.8.4
 │ └─KernelAbstractions [63c18a36] log:
 │   ├─possible versions are: 0.9.27 or uninstalled
 │   └─KernelAbstractions [63c18a36] is fixed to version 0.9.27
 └─restricted by julia compatibility requirements to versions: 0.1.0-0.7.8 or uninstalled — no versions left
Stacktrace:
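
The log reports two constraints on EnzymeCore that cannot both hold: KernelAbstractions 0.9.27 leaves only 0.8.1-0.8.4, while Julia compatibility on that job allows only 0.1.0-0.7.8. The Python sketch below shows why nothing is left, by treating each constraint as a closed version interval and intersecting them. It is not how Pkg's resolver works internally, and the `parse` and `intersect` helpers are hypothetical names used only for illustration.

```python
def parse(version):
    """Turn '0.8.1' into a tuple (0, 8, 1) that compares component-wise."""
    return tuple(int(part) for part in version.split("."))


def intersect(a, b):
    """Intersect two closed (low, high) version intervals; None if empty."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low <= high else None


# Bounds copied from the resolver log above.
available = (parse("0.1.0"), parse("0.8.4"))
allowed_by_kernelabstractions = (parse("0.8.1"), parse("0.8.4"))
allowed_by_julia_compat = (parse("0.1.0"), parse("0.7.8"))

after_ka = intersect(available, allowed_by_kernelabstractions)
print(after_ka)
print(intersect(after_ka, allowed_by_julia_compat))
```

The first intersection leaves 0.8.1 through 0.8.4, matching the log. The second is empty, because every version KernelAbstractions accepts is newer than the newest one the job's Julia version can install.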

wsmoses commented Oct 4, 2024 via email

vchuravy merged commit dd3044a into main on Oct 4, 2024
20 of 36 checks passed
vchuravy deleted the wmci branch on October 4, 2024, 14:32
3 participants