test: add jaxley benchmark #1896
Draft: avik-pal wants to merge 3 commits into main from ap/neuro_benchmark
+670 −61
94c8aa8 to 9f60d0f
EnzymeJAX Benchmarks
Details
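For readers scanning the table that follows: the Ratio column is consistent with the current timing divided by the previous timing, so values above 1 mean the current commit ran slower. A minimal sketch of that arithmetic is below. The helper names and the 1.5 threshold are assumptions made for illustration, not part of the benchmark tooling; the three sample rows are copied from the table.

```python
from typing import List, Tuple

Row = Tuple[str, float, float]  # (benchmark name, current seconds, previous seconds)


def ratio(current_s: float, previous_s: float) -> float:
    """Current / previous; a value above 1 means the current run was slower."""
    return current_s / previous_s


def flag_slowdowns(rows: List[Row], threshold: float) -> List[str]:
    """Return the names of rows whose ratio exceeds the (hypothetical) threshold."""
    return [name for name, cur, prev in rows if ratio(cur, prev) > threshold]


# Sample rows copied from the table.
rows: List[Row] = [
    ("actmtch / JaXPipe / cuda / Primal", 0.0000024, 0.000002015),
    ("actmtch / JaXPipe / cpu / Primal", 0.000013289, 0.000006552539998665452),
    ("add_two / DefOpt / cuda / PreRev", 0.000032416, 0.000036671),
]

THRESHOLD = 1.5  # assumed cut-off for illustration only

for name, cur, prev in rows:
    print(f"{name}: {ratio(cur, prev):.2f}")

print("flagged:", flag_slowdowns(rows, THRESHOLD))
```

With this cut-off only the cpu row is flagged. Timings in the microsecond range are noisy, so a single ratio is a weak signal on its own.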
| Benchmark suite | Current: c50080b | Previous: 7a7d4f5 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cuda / Primal | 0.0000024 s | 0.000002015 s | 1.19 |
| actmtch / Jax / cuda / Primal | 0.0000024 s | 0.000002015 s | 1.19 |
| actmtch / HLOOpt / cuda / Primal | 0.0000024 s | 0.000001984 s | 1.21 |
| actmtch / PartOpt / cuda / Primal | 0.000002399 s | 0.000002015 s | 1.19 |
| actmtch / IPartOpt / cuda / Primal | 0.0000024 s | 0.000002015 s | 1.19 |
| actmtch / DefOpt / cuda / Primal | 0.0000024 s | 0.000002015 s | 1.19 |
| actmtch / IDefOpt / cuda / Primal | 0.0000024 s | 0.000002015 s | 1.19 |
| actmtch / JaXPipe / cuda / Forward | 0.000010528 s | 0.00000944 s | 1.12 |
| actmtch / Jax / cuda / Forward | 0.000010272 s | 0.0000096 s | 1.07 |
| actmtch / HLOOpt / cuda / Forward | 0.000010688 s | 0.000009664 s | 1.11 |
| actmtch / PartOpt / cuda / Forward | 0.000010495 s | 0.000009568 s | 1.10 |
| actmtch / IPartOpt / cuda / Forward | 0.000010656 s | 0.000009888 s | 1.08 |
| actmtch / DefOpt / cuda / Forward | 0.000010431 s | 0.000009888 s | 1.05 |
| actmtch / IDefOpt / cuda / Forward | 0.000010272 s | 0.000009824 s | 1.05 |
| actmtch / JaXPipe / cuda / PreRev | 0.000010432 s | 0.000009247 s | 1.13 |
| actmtch / JaXPipe / cuda / PostRev | 0.00001088 s | 0.0000096 s | 1.13 |
| actmtch / JaXPipe / cuda / BothRev | 0.000010592 s | 0.000009664 s | 1.10 |
| actmtch / Jax / cuda / BothRev | 0.00001072 s | 0.000009888 s | 1.08 |
| actmtch / HLOOpt / cuda / PreRev | 0.000010368 s | 0.000010208 s | 1.02 |
| actmtch / HLOOpt / cuda / PostRev | 0.000010432 s | 0.000009984 s | 1.04 |
| actmtch / HLOOpt / cuda / BothRev | 0.000010687 s | 0.000010144 s | 1.05 |
| actmtch / PartOpt / cuda / PreRev | 0.0000104 s | 0.000009823 s | 1.06 |
| actmtch / PartOpt / cuda / PostRev | 0.000010656 s | 0.000009952 s | 1.07 |
| actmtch / PartOpt / cuda / BothRev | 0.000010784 s | 0.00000976 s | 1.10 |
| actmtch / IPartOpt / cuda / PreRev | 0.000010688 s | 0.000010176 s | 1.05 |
| actmtch / IPartOpt / cuda / PostRev | 0.000012928 s | 0.000009696 s | 1.33 |
| actmtch / IPartOpt / cuda / BothRev | 0.000010783 s | 0.000009568 s | 1.13 |
| actmtch / DefOpt / cuda / PreRev | 0.000010528 s | 0.000009792 s | 1.08 |
| actmtch / DefOpt / cuda / PostRev | 0.00001072 s | 0.00000992 s | 1.08 |
| actmtch / DefOpt / cuda / BothRev | 0.000010944 s | 0.000010016 s | 1.09 |
| actmtch / IDefOpt / cuda / PreRev | 0.00001072 s | 0.000010112 s | 1.06 |
| actmtch / IDefOpt / cuda / PostRev | 0.000010464 s | 0.000009696 s | 1.08 |
| actmtch / IDefOpt / cuda / BothRev | 0.000010304 s | 0.000009824 s | 1.05 |
| actmtch / JaXPipe / tpu / Primal | 5.825e-7 s | 5.64025e-7 s | 1.03 |
| actmtch / Jax / tpu / Primal | 5.63075e-7 s | 5.96725e-7 s | 0.94 |
| actmtch / HLOOpt / tpu / Primal | 0.0000021651 s | 0.00000209535 s | 1.03 |
| actmtch / PartOpt / tpu / Primal | 5.63225e-7 s | 5.96875e-7 s | 0.94 |
| actmtch / IPartOpt / tpu / Primal | 5.752999999999999e-7 s | 5.527750000000001e-7 s | 1.04 |
| actmtch / DefOpt / tpu / Primal | 0.000002062275 s | 0.0000021667 s | 0.95 |
| actmtch / IDefOpt / tpu / Primal | 0.00000217415 s | 0.000002095325 s | 1.04 |
| actmtch / JaXPipe / tpu / Forward | 0.000003857625 s | 0.000003823775 s | 1.01 |
| actmtch / Jax / tpu / Forward | 0.0000012321749999999998 s | 0.000001206075 s | 1.02 |
| actmtch / HLOOpt / tpu / Forward | 0.000003682525 s | 0.000003932450000000001 s | 0.94 |
| actmtch / PartOpt / tpu / Forward | 0.000003892375 s | 0.000003922824999999999 s | 0.99 |
| actmtch / IPartOpt / tpu / Forward | 0.0000036728 s | 0.00000393125 s | 0.93 |
| actmtch / DefOpt / tpu / Forward | 0.000003901075 s | 0.00000391195 s | 1.00 |
| actmtch / IDefOpt / tpu / Forward | 0.00000367745 s | 0.00000394 s | 0.93 |
| actmtch / JaXPipe / tpu / PreRev | 0.000003750075 s | 0.000003474025 s | 1.08 |
| actmtch / JaXPipe / tpu / PostRev | 0.0000016241749999999998 s | 0.000001645425 s | 0.99 |
| actmtch / JaXPipe / tpu / BothRev | 0.000003748425 s | 0.000003475325 s | 1.08 |
| actmtch / Jax / tpu / BothRev | 0.000001617025 s | 0.000001642275 s | 0.98 |
| actmtch / HLOOpt / tpu / PreRev | 0.000003748775 s | 0.000003480675 s | 1.08 |
| actmtch / HLOOpt / tpu / PostRev | 0.000003434325 s | 0.00000340625 s | 1.01 |
| actmtch / HLOOpt / tpu / BothRev | 0.00000374765 s | 0.000003476675 s | 1.08 |
| actmtch / PartOpt / tpu / PreRev | 0.0000034422750000000004 s | 0.000003408575 s | 1.01 |
| actmtch / PartOpt / tpu / PostRev | 0.0000016636 s | 0.00000159245 s | 1.04 |
| actmtch / PartOpt / tpu / BothRev | 0.00000343615 s | 0.0000034163500000000003 s | 1.01 |
| actmtch / IPartOpt / tpu / PreRev | 0.0000037365 s | 0.000003466875 s | 1.08 |
| actmtch / IPartOpt / tpu / PostRev | 0.00000162335 s | 0.0000016414 s | 0.99 |
| actmtch / IPartOpt / tpu / BothRev | 0.00000374855 s | 0.00000348575 s | 1.08 |
| actmtch / DefOpt / tpu / PreRev | 0.000003451725 s | 0.00000341775 s | 1.01 |
| actmtch / DefOpt / tpu / PostRev | 0.0000036698 s | 0.000003414525 s | 1.07 |
| actmtch / DefOpt / tpu / BothRev | 0.000003453625 s | 0.00000341335 s | 1.01 |
| actmtch / IDefOpt / tpu / PreRev | 0.000003749975 s | 0.000003470725 s | 1.08 |
| actmtch / IDefOpt / tpu / PostRev | 0.0000034603 s | 0.0000034141 s | 1.01 |
| actmtch / IDefOpt / tpu / BothRev | 0.0000037466 s | 0.000003471675 s | 1.08 |
| actmtch / JaXPipe / cpu / Primal | 0.000013289 s | 0.000006552539998665452 s | 2.03 |
| actmtch / Jax / cpu / Primal | 0.000013356 s | 0.000006462679994001519 s | 2.07 |
| actmtch / HLOOpt / cpu / Primal | 0.000014082 s | 0.000007385220033029327 s | 1.91 |
| actmtch / PartOpt / cpu / Primal | 0.000013368 s | 0.000006657979920419166 s | 2.01 |
| actmtch / IPartOpt / cpu / Primal | 0.000013363 s | 0.0000066309799694863614 s | 2.02 |
| actmtch / DefOpt / cpu / Primal | 0.000014032 s | 0.000007643220014870166 s | 1.84 |
| actmtch / IDefOpt / cpu / Primal | 0.000013959 s | 0.000006935199999134056 s | 2.01 |
| actmtch / JaXPipe / cpu / Forward | 0.000019163 s | 0.000010832880034286064 s | 1.77 |
| actmtch / Jax / cpu / Forward | 0.000018009 s | 0.00000932088001718512 s | 1.93 |
| actmtch / HLOOpt / cpu / Forward | 0.000018973 s | 0.000010819079961947865 s | 1.75 |
| actmtch / PartOpt / cpu / Forward | 0.000019031 s | 0.000010567340013949434 s | 1.80 |
| actmtch / IPartOpt / cpu / Forward | 0.000018797 s | 0.000010711440027080245 s | 1.75 |
| actmtch / DefOpt / cpu / Forward | 0.000019135 s | 0.000011212999943381874 s | 1.71 |
| actmtch / IDefOpt / cpu / Forward | 0.000019065 s | 0.000010483019959792727 s | 1.82 |
| actmtch / JaXPipe / cpu / PreRev | 0.000019475 s | 0.00001079392011888558 s | 1.80 |
| actmtch / JaXPipe / cpu / PostRev | 0.000017492 s | 0.00001015964004182024 s | 1.72 |
| actmtch / JaXPipe / cpu / BothRev | 0.000018777 s | 0.000011096060025010956 s | 1.69 |
| actmtch / Jax / cpu / BothRev | 0.000017638 s | 0.000009553080035402672 s | 1.85 |
| actmtch / HLOOpt / cpu / PreRev | 0.0000193 s | 0.000011104160057584522 s | 1.74 |
| actmtch / HLOOpt / cpu / PostRev | 0.000019129 s | 0.000013026439937675604 s | 1.47 |
| actmtch / HLOOpt / cpu / BothRev | 0.000019211 s | 0.000010503799985599472 s | 1.83 |
| actmtch / PartOpt / cpu / PreRev | 0.000019094 s | 0.00001085412002794328 s | 1.76 |
| actmtch / PartOpt / cpu / PostRev | 0.000017708 s | 0.000009770079923328013 s | 1.81 |
| actmtch / PartOpt / cpu / BothRev | 0.000019192 s | 0.00001131148008425953 s | 1.70 |
| actmtch / IPartOpt / cpu / PreRev | 0.000019211 s | 0.000010512220051168697 s | 1.83 |
| actmtch / IPartOpt / cpu / PostRev | 0.000017601 s | 0.000009670899926277345 s | 1.82 |
| actmtch / IPartOpt / cpu / BothRev | 0.000019384 s | 0.00001107394004066009 s | 1.75 |
| actmtch / DefOpt / cpu / PreRev | 0.000019259 s | 0.000010589979974611196 s | 1.82 |
| actmtch / DefOpt / cpu / PostRev | 0.000018828 s | 0.000011556600020412588 s | 1.63 |
| actmtch / DefOpt / cpu / BothRev | 0.000019082 s | 0.000010426920071040512 s | 1.83 |
| actmtch / IDefOpt / cpu / PreRev | 0.000019419 s | 0.000010652239907358308 s | 1.82 |
| actmtch / IDefOpt / cpu / PostRev | 0.000018921 s | 0.00001113797998186783 s | 1.70 |
| actmtch / IDefOpt / cpu / BothRev | 0.00001903 s | 0.00001050803997713956 s | 1.81 |
| add_one / JaXPipe / cuda / Primal | 0.000002335 s | 0.000001919 s | 1.22 |
| add_one / Jax / cuda / Primal | 0.000002335 s | 0.0000019200000000000003 s | 1.22 |
| add_one / HLOOpt / cuda / Primal | 0.000002304 s | 0.0000019200000000000003 s | 1.20 |
| add_one / PartOpt / cuda / Primal | 0.000002335 s | 0.000001919 s | 1.22 |
| add_one / IPartOpt / cuda / Primal | 0.000002304 s | 0.0000019200000000000003 s | 1.20 |
| add_one / DefOpt / cuda / Primal | 0.000002335 s | 0.000001919 s | 1.22 |
| add_one / IDefOpt / cuda / Primal | 0.000002335 s | 0.0000019200000000000003 s | 1.22 |
| add_one / JaXPipe / cuda / Forward | 0.000010784 s | 0.0000096 s | 1.12 |
| add_one / Jax / cuda / Forward | 0.000010592 s | 0.000010303 s | 1.03 |
| add_one / HLOOpt / cuda / Forward | 0.00001056 s | 0.000010016 s | 1.05 |
| add_one / PartOpt / cuda / Forward | 0.000010592 s | 0.000009248 s | 1.15 |
| add_one / IPartOpt / cuda / Forward | 0.000010656 s | 0.000009856 s | 1.08 |
| add_one / DefOpt / cuda / Forward | 0.000010656 s | 0.00000944 s | 1.13 |
| add_one / IDefOpt / cuda / Forward | 0.000010529 s | 0.000009536 s | 1.10 |
| add_one / JaXPipe / cuda / PreRev | 0.000024511000000000003 s | 0.000024704 s | 0.99 |
| add_one / JaXPipe / cuda / PostRev | 0.0000248 s | 0.000024415 s | 1.02 |
| add_one / JaXPipe / cuda / BothRev | 0.000024672 s | 0.000024576 s | 1.00 |
| add_one / Jax / cuda / BothRev | 0.000025407 s | 0.000024255 s | 1.05 |
| add_one / HLOOpt / cuda / PreRev | 0.000025664 s | 0.000024896 s | 1.03 |
| add_one / HLOOpt / cuda / PostRev | 0.000025151 s | 0.000023936 s | 1.05 |
| add_one / HLOOpt / cuda / BothRev | 0.00002512 s | 0.00002448 s | 1.03 |
| add_one / PartOpt / cuda / PreRev | 0.000025248 s | 0.000024159 s | 1.05 |
| add_one / PartOpt / cuda / PostRev | 0.000025472000000000003 s | 0.000024448 s | 1.04 |
| add_one / PartOpt / cuda / BothRev | 0.000029152 s | 0.000024479 s | 1.19 |
| add_one / IPartOpt / cuda / PreRev | 0.000025567 s | 0.000024864 s | 1.03 |
| add_one / IPartOpt / cuda / PostRev | 0.000025312 s | 0.00002464 s | 1.03 |
| add_one / IPartOpt / cuda / BothRev | 0.000028608 s | 0.000025312 s | 1.13 |
| add_one / DefOpt / cuda / PreRev | 0.000025247 s | 0.000024768 s | 1.02 |
| add_one / DefOpt / cuda / PostRev | 0.000024832 s | 0.000024704 s | 1.01 |
| add_one / DefOpt / cuda / BothRev | 0.000025345 s | 0.000024575 s | 1.03 |
| add_one / IDefOpt / cuda / PreRev | 0.000024608 s | 0.000025024 s | 0.98 |
| add_one / IDefOpt / cuda / PostRev | 0.000025152 s | 0.000024352 s | 1.03 |
| add_one / IDefOpt / cuda / BothRev | 0.000025056 s | 0.000024416 s | 1.03 |
| add_one / JaXPipe / tpu / Primal | 0.0000014445250000000002 s | 0.0000014285000000000002 s | 1.01 |
| add_one / Jax / tpu / Primal | 0.000001445025 s | 0.00000140035 s | 1.03 |
| add_one / HLOOpt / tpu / Primal | 0.0000014499 s | 0.0000014297499999999995 s | 1.01 |
| add_one / PartOpt / tpu / Primal | 0.0000014558499999999998 s | 0.0000014069999999999998 s | 1.03 |
| add_one / IPartOpt / tpu / Primal | 0.0000014522999999999998 s | 0.00000142595 s | 1.02 |
| add_one / DefOpt / tpu / Primal | 0.0000014532 s | 0.000001403875 s | 1.04 |
| add_one / IDefOpt / tpu / Primal | 0.000001451075 s | 0.00000142315 s | 1.02 |
| add_one / JaXPipe / tpu / Forward | 0.0000019067 s | 0.000001847025 s | 1.03 |
| add_one / Jax / tpu / Forward | 0.00000186305 s | 0.000001833725 s | 1.02 |
| add_one / HLOOpt / tpu / Forward | 0.000001909625 s | 0.000001849325 s | 1.03 |
| add_one / PartOpt / tpu / Forward | 0.000001876975 s | 0.00000184025 s | 1.02 |
| add_one / IPartOpt / tpu / Forward | 0.000001903575 s | 0.0000018542 s | 1.03 |
| add_one / DefOpt / tpu / Forward | 0.000001864725 s | 0.000001840125 s | 1.01 |
| add_one / IDefOpt / tpu / Forward | 0.000001904225 s | 0.0000018431 s | 1.03 |
| add_one / JaXPipe / tpu / PreRev | 0.0000022687 s | 0.000002231 s | 1.02 |
| add_one / JaXPipe / tpu / PostRev | 0.0000022959 s | 0.0000022341000000000003 s | 1.03 |
| add_one / JaXPipe / tpu / BothRev | 0.000002259 s | 0.00000223525 s | 1.01 |
| add_one / Jax / tpu / BothRev | 0.00000230265 s | 0.000002241375 s | 1.03 |
| add_one / HLOOpt / tpu / PreRev | 0.0000022548 s | 0.000002234475 s | 1.01 |
| add_one / HLOOpt / tpu / PostRev | 0.0000022925 s | 0.0000022342 s | 1.03 |
| add_one / HLOOpt / tpu / BothRev | 0.0000022592000000000003 s | 0.000002238875 s | 1.01 |
| add_one / PartOpt / tpu / PreRev | 0.000002293675 s | 0.000002242175 s | 1.02 |
| add_one / PartOpt / tpu / PostRev | 0.000002279675 s | 0.00000223735 s | 1.02 |
| add_one / PartOpt / tpu / BothRev | 0.000002292 s | 0.000002239825 s | 1.02 |
| add_one / IPartOpt / tpu / PreRev | 0.000002259675 s | 0.0000022381750000000004 s | 1.01 |
| add_one / IPartOpt / tpu / PostRev | 0.00000228755 s | 0.0000022416 s | 1.02 |
| add_one / IPartOpt / tpu / BothRev | 0.00000225615 s | 0.000002238225 s | 1.01 |
| add_one / DefOpt / tpu / PreRev | 0.000002300375 s | 0.000002236725 s | 1.03 |
| add_one / DefOpt / tpu / PostRev | 0.0000022613 s | 0.0000022327 s | 1.01 |
| add_one / DefOpt / tpu / BothRev | 0.0000022898 s | 0.00000224765 s | 1.02 |
| add_one / IDefOpt / tpu / PreRev | 0.0000022613 s | 0.00000223625 s | 1.01 |
| add_one / IDefOpt / tpu / PostRev | 0.000002292575 s | 0.0000022397 s | 1.02 |
| add_one / IDefOpt / tpu / BothRev | 0.00000226415 s | 0.000002248125 s | 1.01 |
| add_one / JaXPipe / cpu / Primal | 0.000013037 s | 0.000006873860002087895 s | 1.90 |
| add_one / Jax / cpu / Primal | 0.000013089 s | 0.000006641540094278752 s | 1.97 |
| add_one / HLOOpt / cpu / Primal | 0.000012676 s | 0.000006734419948770665 s | 1.88 |
| add_one / PartOpt / cpu / Primal | 0.000012891 s | 0.000006703060043946607 s | 1.92 |
| add_one / IPartOpt / cpu / Primal | 0.000012876 s | 0.000007102279996615835 s | 1.81 |
| add_one / DefOpt / cpu / Primal | 0.000012823 s | 0.000006567140026163543 s | 1.95 |
| add_one / IDefOpt / cpu / Primal | 0.000012738 s | 0.000006652760075667174 s | 1.91 |
| add_one / JaXPipe / cpu / Forward | 0.000017662 s | 0.000009910600037983386 s | 1.78 |
| add_one / Jax / cpu / Forward | 0.000017507 s | 0.0000099407600282575 s | 1.76 |
| add_one / HLOOpt / cpu / Forward | 0.000017684 s | 0.000010102879969053902 s | 1.75 |
| add_one / PartOpt / cpu / Forward | 0.000017622 s | 0.000010000159909395734 s | 1.76 |
| add_one / IPartOpt / cpu / Forward | 0.000017911 s | 0.000010069060044770594 s | 1.78 |
| add_one / DefOpt / cpu / Forward | 0.000017597 s | 0.000010091080039273947 s | 1.74 |
| add_one / IDefOpt / cpu / Forward | 0.000017406999999999998 s | 0.000010010940022766593 s | 1.74 |
| add_one / JaXPipe / cpu / PreRev | 0.000020017 s | 0.00001154700001279707 s | 1.73 |
| add_one / JaXPipe / cpu / PostRev | 0.00001979 s | 0.000010986779925588052 s | 1.80 |
| add_one / JaXPipe / cpu / BothRev | 0.000019905 s | 0.00001186152005175245 s | 1.68 |
| add_one / Jax / cpu / BothRev | 0.000019558 s | 0.000011301159975118936 s | 1.73 |
| add_one / HLOOpt / cpu / PreRev | 0.000019616 s | 0.000011737559998437065 s | 1.67 |
| add_one / HLOOpt / cpu / PostRev | 0.000019794 s | 0.000013497480031219311 s | 1.47 |
| add_one / HLOOpt / cpu / BothRev | 0.000019735 s | 0.000011520519910845904 s | 1.71 |
| add_one / PartOpt / cpu / PreRev | 0.000019721 s | 0.000011383839992049615 s | 1.73 |
| add_one / PartOpt / cpu / PostRev | 0.000019936 s | 0.0000113441400208103 s | 1.76 |
| add_one / PartOpt / cpu / BothRev | 0.000019643 s | 0.000011659819992928531 s | 1.68 |
| add_one / IPartOpt / cpu / PreRev | 0.000019797 s | 0.00001109857996198116 s | 1.78 |
| add_one / IPartOpt / cpu / PostRev | 0.000019677 s | 0.000011195859951840247 s | 1.76 |
| add_one / IPartOpt / cpu / BothRev | 0.000019518 s | 0.000011110880095657192 s | 1.76 |
| add_one / DefOpt / cpu / PreRev | 0.000019648 s | 0.00001151215998106636 s | 1.71 |
| add_one / DefOpt / cpu / PostRev | 0.000019787 s | 0.00001111472003685776 s | 1.78 |
| add_one / DefOpt / cpu / BothRev | 0.000019901 s | 0.00001123659993027104 s | 1.77 |
| add_one / IDefOpt / cpu / PreRev | 0.000019775 s | 0.000011117619960714364 s | 1.78 |
| add_one / IDefOpt / cpu / PostRev | 0.000019846 s | 0.000012493279918999178 s | 1.59 |
| add_one / IDefOpt / cpu / BothRev | 0.000019596 s | 0.00001183342010335764 s | 1.66 |
| add_two / JaXPipe / cuda / Primal | 0.000002431 s | 0.000001887 s | 1.29 |
| add_two / Jax / cuda / Primal | 0.000002432 s | 0.000001887 s | 1.29 |
| add_two / HLOOpt / cuda / Primal | 0.000002431 s | 0.000001887 s | 1.29 |
| add_two / PartOpt / cuda / Primal | 0.000002431 s | 0.000001887 s | 1.29 |
| add_two / IPartOpt / cuda / Primal | 0.000002432 s | 0.000001887 s | 1.29 |
| add_two / DefOpt / cuda / Primal | 0.000002431 s | 0.000001887 s | 1.29 |
| add_two / IDefOpt / cuda / Primal | 0.000002431 s | 0.000001887 s | 1.29 |
| add_two / JaXPipe / cuda / Forward | 0.000011329 s | 0.00000992 s | 1.14 |
| add_two / Jax / cuda / Forward | 0.000011104 s | 0.000009472 s | 1.17 |
| add_two / HLOOpt / cuda / Forward | 0.000011328 s | 0.000009696 s | 1.17 |
| add_two / PartOpt / cuda / Forward | 0.000010592 s | 0.00000976 s | 1.09 |
| add_two / IPartOpt / cuda / Forward | 0.000010272 s | 0.00000944 s | 1.09 |
| add_two / DefOpt / cuda / Forward | 0.000010528 s | 0.000009791 s | 1.08 |
| add_two / IDefOpt / cuda / Forward | 0.000012864 s | 0.000009408 s | 1.37 |
| add_two / JaXPipe / cuda / PreRev | 0.000031743 s | 0.000032063 s | 0.99 |
| add_two / JaXPipe / cuda / PostRev | 0.000031999 s | 0.000031231 s | 1.02 |
| add_two / JaXPipe / cuda / BothRev | 0.000031904000000000005 s | 0.000031903 s | 1.00 |
| add_two / Jax / cuda / BothRev | 0.000032192 s | 0.00003184 s | 1.01 |
| add_two / HLOOpt / cuda / PreRev | 0.000032672 s | 0.000032352 s | 1.01 |
| add_two / HLOOpt / cuda / PostRev | 0.000032160000000000004 s | 0.000031456 s | 1.02 |
| add_two / HLOOpt / cuda / BothRev | 0.000031967 s | 0.000032928 s | 0.97 |
| add_two / PartOpt / cuda / PreRev | 0.000032704 s | 0.000032864 s | 1.00 |
| add_two / PartOpt / cuda / PostRev | 0.000031487 s | 0.000032544 s | 0.97 |
| add_two / PartOpt / cuda / BothRev | 0.000032095 s | 0.000031743 s | 1.01 |
| add_two / IPartOpt / cuda / PreRev | 0.000033087 s | 0.000031968 s | 1.04 |
| add_two / IPartOpt / cuda / PostRev | 0.000031936 s | 0.000032672 s | 0.98 |
| add_two / IPartOpt / cuda / BothRev | 0.000032384 s | 0.00003648 s | 0.89 |
| add_two / DefOpt / cuda / PreRev | 0.000032416 s | 0.000036671 s | 0.88 |
| add_two / DefOpt / cuda / PostRev | 0.000031776 s | 0.000035839 s | 0.89 |
| add_two / DefOpt / cuda / BothRev | 0.000031839 s | 0.000035776000000000004 s | 0.89 |
| add_two / IDefOpt / cuda / PreRev | 0.000032672 s | 0.000036064 s | 0.91 |
| add_two / IDefOpt / cuda / PostRev | 0.00003184 s | 0.000036415 s | 0.87 |
| add_two / IDefOpt / cuda / BothRev | 0.000032351 s | 0.000032096 s | 1.01 |
| add_two / JaXPipe / tpu / Primal | 0.0000013981 s | 0.00000143495 s | 0.97 |
| add_two / Jax / tpu / Primal | 0.000001456275 s | 0.000001478325 s | 0.99 |
| add_two / HLOOpt / tpu / Primal | 0.00000139505 s | 0.000001433025 s | 0.97 |
| add_two / PartOpt / tpu / Primal | 0.0000014438999999999998 s | 0.00000147025 s | 0.98 |
| add_two / IPartOpt / tpu / Primal | 0.000001387125 s | 0.0000014277500000000002 s | 0.97 |
| add_two / DefOpt / tpu / Primal | 0.0000014413 s | 0.00000147415 s | 0.98 |
| add_two / IDefOpt / tpu / Primal | 0.0000013894249999999998 s | 0.0000014370250000000002 s | 0.97 |
| add_two / JaXPipe / tpu / Forward | 0.00000180235 s | 0.00000183055 s | 0.98 |
| add_two / Jax / tpu / Forward | 0.0000017926 s | 0.000001831125 s | 0.98 |
| add_two / HLOOpt / tpu / Forward | 0.000001808325 s | 0.00000182945 s | 0.99 |
| add_two / PartOpt / tpu / Forward | 0.0000017868249999999998 s | 0.0000018309 s | 0.98 |
| add_two / IPartOpt / tpu / Forward | 0.000001808875 s | 0.000001826875 s | 0.99 |
| add_two / DefOpt / tpu / Forward | 0.00000179785 s | 0.00000182775 s | 0.98 |
| add_two / IDefOpt / tpu / Forward | 0.00000181895 s | 0.000001824025 s | 1.00 |
| add_two / JaXPipe / tpu / PreRev | 0.000002802125 s | 0.00000284975 s | 0.98 |
| add_two / JaXPipe / tpu / PostRev | 0.00000272 s | 0.0000027693000000000003 s | 0.98 |
| add_two / JaXPipe / tpu / BothRev | 0.000002796175 s | 0.0000028433000000000004 s | 0.98 |
| add_two / Jax / tpu / BothRev | 0.00000273065 s | 0.0000027424 s | 1.00 |
| add_two / HLOOpt / tpu / PreRev | 0.000002800275 s | 0.000002847575 s | 0.98 |
| add_two / HLOOpt / tpu / PostRev | 0.000002728175 s | 0.0000027530500000000003 s | 0.99 |
| add_two / HLOOpt / tpu / BothRev | 0.0000027923500000000005 s | 0.000002843875 s | 0.98 |
| add_two / PartOpt / tpu / PreRev | 0.000002728075 s | 0.000002747275 s | 0.99 |
| add_two / PartOpt / tpu / PostRev | 0.000002802025 s | 0.0000028402500000000005 s | 0.99 |
| add_two / PartOpt / tpu / BothRev | 0.000002723475 s | 0.00000274815 s | 0.99 |
| add_two / IPartOpt / tpu / PreRev | 0.0000028019 s | 0.00000284545 s | 0.98 |
| add_two / IPartOpt / tpu / PostRev | 0.0000027226 s | 0.000002744725 s | 0.99 |
| add_two / IPartOpt / tpu / BothRev | 0.000002793775 s | 0.000002844925 s | 0.98 |
| add_two / DefOpt / tpu / PreRev | 0.0000027136500000000003 s | 0.000002748175 s | 0.99 |
| add_two / DefOpt / tpu / PostRev | 0.00000281775 s | 0.000002833925 s | 0.99 |
| add_two / DefOpt / tpu / BothRev | 0.000002731175 s | 0.000002753225 s | 0.99 |
| add_two / IDefOpt / tpu / PreRev | 0.0000027999 s | 0.00000283975 s | 0.99 |
| add_two / IDefOpt / tpu / PostRev | 0.0000027187 s | 0.000002749175 s | 0.99 |
| add_two / IDefOpt / tpu / BothRev | 0.000002792825 s | 0.0000028428750000000003 s | 0.98 |
| add_two / JaXPipe / cpu / Primal | 0.000013538 s | 0.000006639200055360561 s | 2.04 |
| add_two / Jax / cpu / Primal | 0.000013323 s | 0.000006919120005477453 s | 1.93 |
| add_two / HLOOpt / cpu / Primal | 0.000013342 s | 0.000007400899976346409 s | 1.80 |
| add_two / PartOpt / cpu / Primal | 0.000013227 s | 0.000007223580032587052 s | 1.83 |
| add_two / IPartOpt / cpu / Primal | 0.000013258 s | 0.000007824360091035487 s | 1.69 |
| add_two / DefOpt / cpu / Primal | 0.000013431 s | 0.000006683000028715469 s | 2.01 |
| add_two / IDefOpt / cpu / Primal | 0.000013081 s | 0.00000732392003556015 s | 1.79 |
| add_two / JaXPipe / cpu / Forward | 0.000017773 s | 0.00001016425998386694 s | 1.75 |
| add_two / Jax / cpu / Forward | 0.000017926 s | 0.000010257819976686733 s | 1.75 |
| add_two / HLOOpt / cpu / Forward | 0.000017763000000000003 s | 0.000010419079990242608 s | 1.70 |
| add_two / PartOpt / cpu / Forward | 0.000017887 s | 0.00001018144001136534 s | 1.76 |
| add_two / IPartOpt / cpu / Forward | 0.000017976 s | 0.000010250659961457132 s | 1.75 |
| add_two / DefOpt / cpu / Forward | 0.000018078 s | 0.00001075334002962336 s | 1.68 |
| add_two / IDefOpt / cpu / Forward | 0.000017896 s | 0.000010367140021116938 s | 1.73 |
| add_two / JaXPipe / cpu / PreRev | 0.000023586 s | 0.000013726959950872695 s | 1.72 |
| add_two / JaXPipe / cpu / PostRev | 0.000023318 s | 0.000013312379996932575 s | 1.75 |
| add_two / JaXPipe / cpu / BothRev | 0.000022992 s | 0.000013630859866680112 s | 1.69 |
| add_two / Jax / cpu / BothRev | 0.000022975 s | 0.00001391522002450074 s | 1.65 |
| add_two / HLOOpt / cpu / PreRev | 0.000023049 s | 0.000014176519925968025 s | 1.63 |
| add_two / HLOOpt / cpu / PostRev | 0.000022574 s | 0.00001600007994056796 s | 1.41 |
| add_two / HLOOpt / cpu / BothRev | 0.000023493 s | 0.000013237259991001335 s | 1.77 |
| add_two / PartOpt / cpu / PreRev | 0.000022773 s | 0.000013702699980058242 s | 1.66 |
| add_two / PartOpt / cpu / PostRev | 0.000023463 s | 0.000013647719970322216 s | 1.72 |
| add_two / PartOpt / cpu / BothRev | 0.000023389 s | 0.000013649460015585646 s | 1.71 |
| add_two / IPartOpt / cpu / PreRev | 0.000023527 s | 0.00001414162001310615 s | 1.66 |
| add_two / IPartOpt / cpu / PostRev | 0.00002308 s | 0.000013527139999496284 s | 1.71 |
| add_two / IPartOpt / cpu / BothRev | 0.000023363 s | 0.000013481519945344189 s | 1.73 |
| add_two / DefOpt / cpu / PreRev | 0.000023199 s | 0.000013913919956394238 s | 1.67 |
| add_two / DefOpt / cpu / PostRev | 0.000023264 s | 0.00001358887988317292 s | 1.71 |
| add_two / DefOpt / cpu / BothRev | 0.000023274 s | 0.000013515859882318182 s | 1.72 |
| add_two / IDefOpt / cpu / PreRev | 0.000023478 s | 0.000013658319894602754 s | 1.72 |
| add_two / IDefOpt / cpu / PostRev | 0.000023484 s | 0.000013398179926298323 s | 1.75 |
| add_two / IDefOpt / cpu / BothRev | 0.000023215 s | 0.000013483259972417726 s | 1.72 |
| cache / JaXPipe / cuda / Primal | 0.000002335 s | 0.000002335 s | 1 |
| cache / Jax / cuda / Primal | 0.000002336 s | 0.000002336 s | 1 |
| cache / HLOOpt / cuda / Primal | 0.000002335 s | 0.00000224 s | 1.04 |
| cache / PartOpt / cuda / Primal | 0.000002335 s | 0.000002304 s | 1.01 |
| cache / IPartOpt / cuda / Primal | 0.000002335 s | 0.000002335 s | 1 |
| cache / DefOpt / cuda / Primal | 0.000002335 s | 0.000002273 s | 1.03 |
| cache / IDefOpt / cuda / Primal | 0.000002335 s | 0.000002272 s | 1.03 |
| cache / JaXPipe / cuda / Forward | 0.0000023670000000000004 s | 0.000002336 s | 1.01 |
| cache / Jax / cuda / Forward | 0.000002336 s | 0.0000023670000000000004 s | 0.99 |
| cache / HLOOpt / cuda / Forward | 0.0000023670000000000004 s | 0.0000023670000000000004 s | 1 |
| cache / PartOpt / cuda / Forward | 0.000002336 s | 0.0000023670000000000004 s | 0.99 |
| cache / IPartOpt / cuda / Forward | 0.0000023670000000000004 s | 0.0000023670000000000004 s | 1 |
| cache / DefOpt / cuda / Forward | 0.0000023670000000000004 s | 0.000002272 s | 1.04 |
| cache / IDefOpt / cuda / Forward | 0.0000023670000000000004 s | 0.0000023670000000000004 s | 1 |
| cache / JaXPipe / cuda / PreRev | 0.000010816 s | 0.000010144 s | 1.07 |
| cache / JaXPipe / cuda / PostRev | 0.0000104 s | 0.000010272 s | 1.01 |
| cache / JaXPipe / cuda / BothRev | 0.0000112 s | 0.00000992 s | 1.13 |
| cache / Jax / cuda / BothRev | 0.00001072 s | 0.000010176 s | 1.05 |
| cache / HLOOpt / cuda / PreRev | 0.00001376 s | 0.000013183 s | 1.04 |
| cache / HLOOpt / cuda / PostRev | 0.000013696 s | 0.00001312 s | 1.04 |
| cache / HLOOpt / cuda / BothRev | 0.000013728 s | 0.000013184 s | 1.04 |
| cache / PartOpt / cuda / PreRev | 0.000010816 s | 0.000010623 s | 1.02 |
| cache / PartOpt / cuda / PostRev | 0.000010688 s | 0.00001024 s | 1.04 |
| cache / PartOpt / cuda / BothRev | 0.000010656 s | 0.000011647 s | 0.91 |
| cache / IPartOpt / cuda / PreRev | 0.000010752 s | 0.00001072 s | 1.00 |
| cache / IPartOpt / cuda / PostRev | 0.000011008 s | 0.000011936 s | 0.92 |
| cache / IPartOpt / cuda / BothRev | 0.0000112 s | 0.000010464 s | 1.07 |
| cache / DefOpt / cuda / PreRev | 0.000010784 s | 0.000011775 s | 0.92 |
| cache / DefOpt / cuda / PostRev | 0.000010752 s | 0.000011327 s | 0.95 |
| cache / DefOpt / cuda / BothRev | 0.000010623 s | 0.000010527 s | 1.01 |
| cache / IDefOpt / cuda / PreRev | 0.000010816 s | 0.000010112 s | 1.07 |
| cache / IDefOpt / cuda / PostRev | 0.000010593 s | 0.000010304 s | 1.03 |
| cache / IDefOpt / cuda / BothRev | 0.000011520000000000002 s | 0.00001056 s | 1.09 |
| cache / JaXPipe / tpu / Primal | 0.0000024534 s | 0.000002465925 s | 0.99 |
| cache / Jax / tpu / Primal | 0.000002475825 s | 0.000002454575 s | 1.01 |
| cache / HLOOpt / tpu / Primal | 0.000002467125 s | 0.00000246235 s | 1.00 |
| cache / PartOpt / tpu / Primal | 0.0000024798 s | 0.000002465875 s | 1.01 |
| cache / IPartOpt / tpu / Primal | 0.000002467675 s | 0.0000024584 s | 1.00 |
| cache / DefOpt / tpu / Primal | 0.0000024774750000000004 s | 0.0000024616250000000003 s | 1.01 |
| cache / IDefOpt / tpu / Primal | 0.000002471725 s | 0.000002461 s | 1.00 |
| cache / JaXPipe / tpu / Forward | 0.000003537575 s | 0.000003532375 s | 1.00 |
| cache / Jax / tpu / Forward | 0.000003540275 s | 0.0000035432750000000004 s | 1.00 |
| cache / HLOOpt / tpu / Forward | 0.0000035617749999999995 s | 0.000003569075 s | 1.00 |
| cache / PartOpt / tpu / Forward | 0.000003529175 s | 0.00000353285 s | 1.00 |
| cache / IPartOpt / tpu / Forward | 0.00000355885 s | 0.0000035633 s | 1.00 |
| cache / DefOpt / tpu / Forward | 0.000003523725 s | 0.000003526175 s | 1.00 |
| cache / IDefOpt / tpu / Forward | 0.0000035503750000000003 s | 0.000003552625 s | 1.00 |
| cache / JaXPipe / tpu / PreRev | 0.000004943874999999999 s | 0.00000494295 s | 1.00 |
| cache / JaXPipe / tpu / PostRev | 0.0000050378 s | 0.000004941975 s | 1.02 |
| cache / JaXPipe / tpu / BothRev | 0.00000498595 s | 0.000004965924999999999 s | 1.00 |
| cache / Jax / tpu / BothRev | 0.000005028175 s | 0.000004960275 s | 1.01 |
| cache / HLOOpt / tpu / PreRev | 0.0000041347250000000005 s | 0.00000392735 s | 1.05 |
| cache / HLOOpt / tpu / PostRev | 0.000004146525 s | 0.000004120599999999999 s | 1.01 |
| cache / HLOOpt / tpu / BothRev | 0.000004134025000000001 s | 0.000003920175000000001 s | 1.05 |
| cache / PartOpt / tpu / PreRev | 0.000005009475 s | 0.000004993325 s | 1.00 |
| cache / PartOpt / tpu / PostRev | 0.000004989075 s | 0.00000495125 s | 1.01 |
| cache / PartOpt / tpu / BothRev | 0.000005021025 s | 0.000004992050000000001 s | 1.01 |
| cache / IPartOpt / tpu / PreRev | 0.00000501345 s | 0.00000496495 s | 1.01 |
| cache / IPartOpt / tpu / PostRev | 0.0000049997 s | 0.000004982175 s | 1.00 |
| cache / IPartOpt / tpu / BothRev | 0.000005003675 s | 0.000004973575 s | 1.01 |
| cache / DefOpt / tpu / PreRev | 0.0000050145 s | 0.00000498995 s | 1.00 |
| cache / DefOpt / tpu / PostRev | 0.000004980425 s | 0.000004962525 s | 1.00 |
| cache / DefOpt / tpu / BothRev | 0.000005020274999999999 s | 0.000004980075 s | 1.01 |
| cache / IDefOpt / tpu / PreRev | 0.000004991575 s | 0.0000049623 s | 1.01 |
| cache / IDefOpt / tpu / PostRev | 0.00000499815 s | 0.00000495285 s | 1.01 |
| cache / IDefOpt / tpu / BothRev | 0.000004972 s | 0.00000495055 s | 1.00 |
| cache / JaXPipe / cpu / Primal | 0.000012717 s | 0.000006346800018945942 s | 2.00 |
| cache / Jax / cpu / Primal | 0.000012586 s | 0.0000064703600401117 s | 1.95 |
| cache / HLOOpt / cpu / Primal | 0.000012348 s | 0.000006257859986362746 s | 1.97 |
| cache / PartOpt / cpu / Primal | 0.00001285 s | 0.000006558360055350931 s | 1.96 |
| cache / IPartOpt / cpu / Primal | 0.00001273 s | 0.0000060225199558772144 s | 2.11 |
| cache / DefOpt / cpu / Primal | 0.000012586 s | 0.0000061135000214562754 s | 2.06 |
| cache / IDefOpt / cpu / Primal | 0.000012676 s | 0.000005915639994782396 s | 2.14 |
| cache / JaXPipe / cpu / Forward | 0.000017131 s | 0.000014798280026298015 s | 1.16 |
| cache / Jax / cpu / Forward | 0.000017222000000000002 s | 0.000014563039967470103 s | 1.18 |
| cache / HLOOpt / cpu / Forward | 0.000016896999999999998 s | 0.00001559456002723891 s | 1.08 |
| cache / PartOpt / cpu / Forward | 0.000017187 s | 0.000015037659995869034 s | 1.14 |
| cache / IPartOpt / cpu / Forward | 0.000017068 s | 0.000015018099966255249 s | 1.14 |
| cache / DefOpt / cpu / Forward | 0.000016887 s | 0.000015332019993365977 s | 1.10 |
| cache / IDefOpt / cpu / Forward | 0.000017208000000000002 s | 0.000014673120003863004 s | 1.17 |
| cache / JaXPipe / cpu / PreRev | 0.000018005 s | 0.000016887860037968494 s | 1.07 |
| cache / JaXPipe / cpu / PostRev | 0.000020005 s | 0.0000211999999373802 s | 0.94 |
| cache / JaXPipe / cpu / BothRev | 0.000017589 s | 0.000016905980064620963 s | 1.04 |
| cache / Jax / cpu / BothRev | 0.000019737 s | 0.000021262679965730057 s | 0.93 |
| cache / HLOOpt / cpu / PreRev | 0.00001749 s | 0.000016826700029923813 s | 1.04 |
| cache / HLOOpt / cpu / PostRev | 0.000017494 s | 0.000019113939961243885 s | 0.92 |
| cache / HLOOpt / cpu / BothRev | 0.000017955 s | 0.000016683340036252047 s | 1.08 |
| cache / PartOpt / cpu / PreRev | 0.000017956 s | 0.00001720952001051046 s | 1.04 |
| cache / PartOpt / cpu / PostRev | 0.000018847 s | 0.00002065606000542175 s | 0.91 |
| cache / PartOpt / cpu / BothRev | 0.000017697 s | 0.000016254839956673094 s | 1.09 |
| cache / IPartOpt / cpu / PreRev | 0.000017590000000000003 s | 0.000016843620032886973 s | 1.04 |
| cache / IPartOpt / cpu / PostRev | 0.00001905 s | 0.000021748199960711643 s | 0.88 |
| cache / IPartOpt / cpu / BothRev | 0.000017755 s | 0.00001659785995798302 s | 1.07 |
| cache / DefOpt / cpu / PreRev | 0.000017371 s | 0.000016926499993132892 s | 1.03 |
| cache / DefOpt / cpu / PostRev | 0.000017179 s | 0.00001647445998969488 s | 1.04 |
| cache / DefOpt / cpu / BothRev | 0.000017275 s | 0.000016036959987104638 s | 1.08 |
| cache / IDefOpt / cpu / PreRev | 0.000017738000000000002 s | 0.000016723539974918822 s | 1.06 |
| cache / IDefOpt / cpu / PostRev | 0.000017343 s | 0.000016724459928809665 s | 1.04 |
| cache / IDefOpt / cpu / BothRev | 0.000017718000000000002 s | 0.00001608033997399616 s | 1.10 |
| Concat / JaXPipe / cuda / Primal | 0.000002463 s | 0.000001919 s | 1.28 |
| Concat / Jax / cuda / Primal | 0.000002463 s | 0.0000019200000000000003 s | 1.28 |
| Concat / HLOOpt / cuda / Primal | 0.000002463 s | 0.000001919 s | 1.28 |
| Concat / PartOpt / cuda / Primal | 0.000002463 s | 0.0000019200000000000003 s | 1.28 |
| Concat / IPartOpt / cuda / Primal | 0.000002463 s | 0.000001919 s | 1.28 |
| Concat / DefOpt / cuda / Primal | 0.000002463 s | 0.0000019200000000000003 s | 1.28 |
| Concat / IDefOpt / cuda / Primal | 0.000002463 s | 0.000001919 s | 1.28 |
| Concat / JaXPipe / cuda / Forward | 0.000010655 s | 0.000009696 s | 1.10 |
| Concat / Jax / cuda / Forward | 0.000010592 s | 0.00000992 s | 1.07 |
| Concat / HLOOpt / cuda / Forward | 0.000010656 s | 0.000009664 s | 1.10 |
| Concat / PartOpt / cuda / Forward | 0.000010464 s | 0.000009536 s | 1.10 |
| Concat / IPartOpt / cuda / Forward | 0.000010592 s | 0.000009824 s | 1.08 |
| Concat / DefOpt / cuda / Forward | 0.00001072 s | 0.000009951 s | 1.08 |
| Concat / IDefOpt / cuda / Forward | 0.000010464 s | 0.0000096 s | 1.09 |
| Concat / JaXPipe / cuda / PreRev | 0.000017119 s | 0.000015487 s | 1.11 |
| Concat / JaXPipe / cuda / PostRev | 0.00001696 s | 0.00001632 s | 1.04 |
| Concat / JaXPipe / cuda / BothRev | 0.000016864 s | 0.000016576000000000002 s | 1.02 |
| Concat / Jax / cuda / BothRev | 0.000016895 s | 0.000015968 s | 1.06 |
| Concat / HLOOpt / cuda / PreRev | 0.000016832 s | 0.000016128 s | 1.04 |
| Concat / HLOOpt / cuda / PostRev | 0.000017184 s | 0.000015744 s | 1.09 |
| Concat / HLOOpt / cuda / BothRev | 0.000016832 s | 0.000016416 s | 1.03 |
| Concat / PartOpt / cuda / PreRev | 0.000017216 s | 0.000016288 s | 1.06 |
| Concat / PartOpt / cuda / PostRev | 0.00001696 s | 0.000015935999999999998 s | 1.06 |
| Concat / PartOpt / cuda / BothRev | 0.000016864 s | 0.000016063999999999997 s | 1.05 |
| Concat / IPartOpt / cuda / PreRev | 0.000017184 s | 0.000016768000000000003 s | 1.02 |
| Concat / IPartOpt / cuda / PostRev | 0.000017024 s | 0.00001616 s | 1.05 |
| Concat / IPartOpt / cuda / BothRev | 0.000016864 s | 0.000015711 s | 1.07 |
| Concat / DefOpt / cuda / PreRev | 0.000016512 s | 0.000015649 s | 1.06 |
| Concat / DefOpt / cuda / PostRev | 0.000016544 s | 0.000016383999999999998 s | 1.01 |
| Concat / DefOpt / cuda / BothRev | 0.000016864 s | 0.00001616 s | 1.04 |
| Concat / IDefOpt / cuda / PreRev | 0.000016705 s | 0.000015999 s | 1.04 |
| Concat / IDefOpt / cuda / PostRev | 0.000016704 s | 0.000016448000000000002 s | 1.02 |
| Concat / IDefOpt / cuda / BothRev | 0.000018176 s | 0.000016288 s | 1.12 |
| Concat / JaXPipe / tpu / Primal | 0.0000015104 s | 0.000001534825 s | 0.98 |
| Concat / Jax / tpu / Primal | 0.000001526125 s | 0.000001540325 s | 0.99 |
| Concat / HLOOpt / tpu / Primal | 0.0000015064500000000002 s | 0.000001529325 s | 0.99 |
| Concat / PartOpt / tpu / Primal | 0.0000015173 s | 0.000001534825 s | 0.99 |
| Concat / IPartOpt / tpu / Primal | 0.00000151935 s | 0.000001545625 s | 0.98 |
| Concat / DefOpt / tpu / Primal | 0.0000015237500000000002 s | 0.0000015349250000000002 s | 0.99 |
| Concat / IDefOpt / tpu / Primal | 0.0000015082 s | 0.00000153315 s | 0.98 |
| Concat / JaXPipe / tpu / Forward | 0.000001560725 s | 0.000001576825 s | 0.99 |
| Concat / Jax / tpu / Forward | 0.0000015666749999999997 s | 0.00000155415 s | 1.01 |
| Concat / HLOOpt / tpu / Forward | 0.00000153905 s | 0.0000015822 s | 0.97 |
| Concat / PartOpt / tpu / Forward | 0.00000155415 s | 0.00000154945 s | 1.00 |
| Concat / IPartOpt / tpu / Forward | 0.000001567825 s | 0.0000015807 s | 0.99 |
| Concat / DefOpt / tpu / Forward | 0.000001556375 s | 0.0000015530250000000005 s | 1.00 |
| Concat / IDefOpt / tpu / Forward | 0.00000156495 s | 0.000001589775 s | 0.98 |
| Concat / JaXPipe / tpu / PreRev | 0.0000020222 s | 0.0000020067 s | 1.01 |
| Concat / JaXPipe / tpu / PostRev | 0.000001998875 s | 0.000002110625 s | 0.95 |
| Concat / JaXPipe / tpu / BothRev | 0.000002041 s | 0.00000201985 s | 1.01 |
| Concat / Jax / tpu / BothRev | 0.0000020119500000000003 s | 0.0000020861 s | 0.96 |
| Concat / HLOOpt / tpu / PreRev | 0.00000201785 s | 0.0000020079250000000003 s | 1.00 |
| Concat / HLOOpt / tpu / PostRev | 0.00000199085 s | 0.0000020871000000000003 s | 0.95 |
| Concat / HLOOpt / tpu / BothRev | 0.000002012625 s | 0.00000200515 s | 1.00 |
| Concat / PartOpt / tpu / PreRev | 0.00000199345 s | 0.0000020804 s | 0.96 |
| Concat / PartOpt / tpu / PostRev | 0.000002009125 s | 0.00000200155 s | 1.00 |
| Concat / PartOpt / tpu / BothRev | 0.000001997625 s | 0.0000020847 s | 0.96 |
| Concat / IPartOpt / tpu / PreRev | 0.0000020133499999999995 s | 0.000002007975 s | 1.00 |
| Concat / IPartOpt / tpu / PostRev | 0.000001996425 s | 0.0000020867250000000003 s | 0.96 |
| Concat / IPartOpt / tpu / BothRev | 0.0000020108000000000003 s | 0.0000020111 s | 1.00 |
| Concat / DefOpt / tpu / PreRev | 0.0000019943 s | 0.0000020867 s | 0.96 |
| Concat / DefOpt / tpu / PostRev | 0.000002015175 s | 0.000002003125 s | 1.01 |
| Concat / DefOpt / tpu / BothRev | 0.0000019901250000000003 s | 0.00000209595 s | 0.95 |
| Concat / IDefOpt / tpu / PreRev | 0.0000020090500000000003 s | 0.000002009925 s | 1.00 |
| Concat / IDefOpt / tpu / PostRev | 0.000001993125 s | 0.0000020858 s | 0.96 |
| Concat / IDefOpt / tpu / BothRev | 0.0000020124 s | 0.000002015175 s | 1.00 |
| Concat / JaXPipe / cpu / Primal | 0.000013034 s | 0.000006445700018957723 s | 2.02 |
| Concat / Jax / cpu / Primal | 0.000012749 s | 0.0000068173200452292805 s | 1.87 |
| Concat / HLOOpt / cpu / Primal | 0.000012753 s | 0.000006353679873427609 s | 2.01 |
| Concat / PartOpt / cpu / Primal | 0.000012671 s | 0.00000617892001173459 s | 2.05 |
| Concat / IPartOpt / cpu / Primal | 0.000012678 s | 0.00000728348004486179 s | 1.74 |
| Concat / DefOpt / cpu / Primal | 0.000012904 s | 0.000006313840094662737 s | 2.04 |
| Concat / IDefOpt / cpu / Primal | 0.000012828000000000002 s | 0.000006743359972460894 s | 1.90 |
| Concat / JaXPipe / cpu / Forward | 0.000017791 s | 0.000009786760056158528 s | 1.82 |
| Concat / Jax / cpu / Forward | 0.000017551000000000002 s | 0.000009296239986724688 s | 1.89 |
| Concat / HLOOpt / cpu / Forward | 0.000017335 s | 0.000009474220005358804 s | 1.83 |
| Concat / PartOpt / cpu / Forward | 0.000017281999999999998 s | 0.000009963219999917785 s | 1.73 |
| Concat / IPartOpt / cpu / Forward | 0.000017193 s | 0.000009927740084094694 s | 1.73 |
| Concat / DefOpt / cpu / Forward | 0.000017605 s | 0.000010052360012196003 s | 1.75 |
| Concat / IDefOpt / cpu / Forward | 0.000017515 s | 0.000009728680070111296 s | 1.80 |
| Concat / JaXPipe / cpu / PreRev | 0.000020419 s | 0.000011505700040288504 s | 1.77 |
| Concat / JaXPipe / cpu / PostRev | 0.00001972 s | 0.000011536420024640392 s | 1.71 |
| Concat / JaXPipe / cpu / BothRev | 0.00001972 s | 0.000011523979974299437 s | 1.71 |
| Concat / Jax / cpu / BothRev | 0.000020046 s | 0.00001130902002842049 s | 1.77 |
| Concat / HLOOpt / cpu / PreRev | 0.000019767 s | 0.000011424400054238505 s | 1.73 |
| Concat / HLOOpt / cpu / PostRev | 0.000019802 s | 0.000013188979974074756 s | 1.50 |
| Concat / HLOOpt / cpu / BothRev | 0.000020002 s | 0.000011646559942164458 s | 1.72 |
| Concat / PartOpt / cpu / PreRev | 0.000019935 s | 0.000011037480016966582 s | 1.81 |
| Concat / PartOpt / cpu / PostRev | 0.000019641 s | 0.000011795680056820855 s | 1.67 |
| Concat / PartOpt / cpu / BothRev | 0.000019886 s | 0.000011574739964999024 s | 1.72 |
| Concat / IPartOpt / cpu / PreRev | 0.00001985 s | 0.000011574820055102464 s | 1.71 |
| Concat / IPartOpt / cpu / PostRev | 0.000019982 s | 0.000011695620014506855 s | 1.71 |
| Concat / IPartOpt / cpu / BothRev | 0.000020201 s | 0.00001153209996118676 s | 1.75 |
| Concat / DefOpt / cpu / PreRev | 0.000020038 s | 0.000011297619967081118 s | 1.77 |
| Concat / DefOpt / cpu / PostRev | 0.00002008 s | 0.000011271200000919635 s | 1.78 |
| Concat / DefOpt / cpu / BothRev | 0.000019707 s | 0.0000117719400623173 s | 1.67 |
| Concat / IDefOpt / cpu / PreRev | 0.000020147 s | 0.00001126107988966396 s | 1.79 |
| Concat / IDefOpt / cpu / PostRev | 0.000019981 s | 0.00001109713999539963 s | 1.80 |
| Concat / IDefOpt / cpu / BothRev | 0.000019595 s | 0.000011643640009424416 s | 1.68 |
| const_scatter / JaXPipe / cuda / Primal | 0.000002464 s | 0.000001887 s | 1.31 |
| const_scatter / Jax / cuda / Primal | 0.000002463 s | 0.000001887 s | 1.31 |
| const_scatter / HLOOpt / cuda / Primal | 0.000002463 s | 0.000001887 s | 1.31 |
| const_scatter / PartOpt / cuda / Primal | 0.000002463 s | 0.000001887 s | 1.31 |
| const_scatter / IPartOpt / cuda / Primal | 0.000002464 s | 0.000001888 s | 1.31 |
| const_scatter / DefOpt / cuda / Primal | 0.000002464 s | 0.000001888 s | 1.31 |
| const_scatter / IDefOpt / cuda / Primal | 0.000002463 s | 0.000001887 s | 1.31 |
| const_scatter / JaXPipe / cuda / Forward | 0.000010913 s | 0.000010112 s | 1.08 |
| const_scatter / Jax / cuda / Forward | 0.000010368 s | 0.000009856 s | 1.05 |
| const_scatter / HLOOpt / cuda / Forward | 0.000010592 s | 0.000009696 s | 1.09 |
| const_scatter / PartOpt / cuda / Forward | 0.00001072 s | 0.000009919 s | 1.08 |
| const_scatter / IPartOpt / cuda / Forward | 0.000011328 s | 0.000009568 s | 1.18 |
| const_scatter / DefOpt / cuda / Forward | 0.000011776 s | 0.000009824 s | 1.20 |
| const_scatter / IDefOpt / cuda / Forward | 0.000010623 s | 0.000009984 s | 1.06 |
| const_scatter / JaXPipe / cuda / PreRev | 0.000017568000000000002 s | 0.000016416 s | 1.07 |
| const_scatter / JaXPipe / cuda / PostRev | 0.0000192 s | 0.000016063999999999997 s | 1.20 |
| const_scatter / JaXPipe / cuda / BothRev | 0.000016704 s | 0.000015871 s | 1.05 |
| const_scatter / Jax / cuda / BothRev | 0.000017024 s | 0.000016576000000000002 s | 1.03 |
| const_scatter / HLOOpt / cuda / PreRev | 0.000018176 s | 0.000015872 s | 1.15 |
| const_scatter / HLOOpt / cuda / PostRev | 0.000018816 s | 0.000016352 s | 1.15 |
| const_scatter / HLOOpt / cuda / BothRev | 0.000018528 s | 0.000015648 s | 1.18 |
| const_scatter / PartOpt / cuda / PreRev | 0.000018752000000000003 s | 0.000015518999999999998 s | 1.21 |
| const_scatter / PartOpt / cuda / PostRev | 0.000016799000000000003 s | 0.00001568 s | 1.07 |
| const_scatter / PartOpt / cuda / BothRev | 0.000017344 s | 0.000015553 s | 1.12 |
| const_scatter / IPartOpt / cuda / PreRev | 0.000016576000000000002 s | 0.000016031 s | 1.03 |
| const_scatter / IPartOpt / cuda / PostRev | 0.000016831 s | 0.000016544 s | 1.02 |
| const_scatter / IPartOpt / cuda / BothRev | 0.0000168 s | 0.000015968 s | 1.05 |
| const_scatter / DefOpt / cuda / PreRev | 0.000017088 s | 0.000016383999999999998 s | 1.04 |
| const_scatter / DefOpt / cuda / PostRev | 0.000016670999999999997 s | 0.000015904000000000002 s | 1.05 |
| const_scatter / DefOpt / cuda / BothRev | 0.000016736 s | 0.00001632 s | 1.03 |
| const_scatter / IDefOpt / cuda / PreRev | 0.000016832 s | 0.000016192 s | 1.04 |
| const_scatter / IDefOpt / cuda / PostRev | 0.000017247999999999998 s | 0.000015743 s | 1.10 |
| const_scatter / IDefOpt / cuda / BothRev | 0.000017024 s | 0.000015711 s | 1.08 |
| const_scatter / JaXPipe / tpu / Primal | 0.000003827 s | 0.0000038184 s | 1.00 |
| const_scatter / Jax / tpu / Primal | 0.000003841125 s | 0.000003835175 s | 1.00 |
| const_scatter / HLOOpt / tpu / Primal | 0.000003819125 s | 0.000003798625 s | 1.01 |
| const_scatter / PartOpt / tpu / Primal | 0.000003849175 s | 0.000003819275 s | 1.01 |
| const_scatter / IPartOpt / tpu / Primal | 0.00000379835 s | 0.000003795925 s | 1.00 |
| const_scatter / DefOpt / tpu / Primal | 0.00000381755 s | 0.000003832675 s | 1.00 |
| const_scatter / IDefOpt / tpu / Primal | 0.000003792775 s | 0.00000380515 s | 1.00 |
| const_scatter / JaXPipe / tpu / Forward | 0.000006495 s | 0.000006451825 s | 1.01 |
| const_scatter / Jax / tpu / Forward | 0.00000645745 s | 0.000006502525 s | 0.99 |
| const_scatter / HLOOpt / tpu / Forward | 0.000006493 s | 0.00000645655 s | 1.01 |
| const_scatter / PartOpt / tpu / Forward | 0.000006487825 s | 0.0000065193 s | 1.00 |
| const_scatter / IPartOpt / tpu / Forward | 0.000006495925000000001 s | 0.0000064571 s | 1.01 |
| const_scatter / DefOpt / tpu / Forward | 0.0000064614000000000005 s | 0.000006503475 s | 0.99 |
| const_scatter / IDefOpt / tpu / Forward | 0.000006489525 s | 0.000006458125 s | 1.00 |
| const_scatter / JaXPipe / tpu / PreRev | 0.000006703474999999999 s | 0.000006637475 s | 1.01 |
| const_scatter / JaXPipe / tpu / PostRev | 0.000006703799999999999 s | 0.000006622749999999999 s | 1.01 |
| const_scatter / JaXPipe / tpu / BothRev | 0.000006667825 s | 0.0000066325250000000005 s | 1.01 |
| const_scatter / Jax / tpu / BothRev | 0.000006706475 s | 0.000006621949999999999 s | 1.01 |
| const_scatter / HLOOpt / tpu / PreRev | 0.000006680475 s | 0.00000664045 s | 1.01 |
| const_scatter / HLOOpt / tpu / PostRev | 0.000006698575 s | 0.000006628874999999999 s | 1.01 |
| const_scatter / HLOOpt / tpu / BothRev | 0.000006672225 s | 0.000006645124999999999 s | 1.00 |
| const_scatter / PartOpt / tpu / PreRev | 0.00000668665 s | 0.000006635125 s | 1.01 |
| const_scatter / PartOpt / tpu / PostRev | 0.0000066813 s | 0.000006629475 s | 1.01 |
| const_scatter / PartOpt / tpu / BothRev | 0.0000066946 s | 0.000006631924999999999 s | 1.01 |
| const_scatter / IPartOpt / tpu / PreRev | 0.000006676225 s | 0.0000066217000000000005 s | 1.01 |
| const_scatter / IPartOpt / tpu / PostRev | 0.000006686025 s | 0.000006610425 s | 1.01 |
| const_scatter / IPartOpt / tpu / BothRev | 0.0000066669 s | 0.000006637600000000001 s | 1.00 |
| const_scatter / DefOpt / tpu / PreRev | 0.000006677125000000001 s | 0.0000066379 s | 1.01 |
| const_scatter / DefOpt / tpu / PostRev | 0.000006686675 s | 0.000006617724999999999 s | 1.01 |
| const_scatter / DefOpt / tpu / BothRev | 0.000006690199999999999 s | 0.00000663095 s | 1.01 |
| const_scatter / IDefOpt / tpu / PreRev | 0.000006698225 s | 0.000006621325 s | 1.01 |
| const_scatter / IDefOpt / tpu / PostRev | 0.000006685175 s | 0.000006636525 s | 1.01 |
| const_scatter / IDefOpt / tpu / BothRev | 0.000006666450000000001 s | 0.00000660165 s | 1.01 |
| const_scatter / JaXPipe / cpu / Primal | 0.000012946 s | 0.000006296179999480955 s | 2.06 |
| const_scatter / Jax / cpu / Primal | 0.000012615 s | 0.000006478699924628018 s | 1.95 |
| const_scatter / HLOOpt / cpu / Primal | 0.000013249 s | 0.000007021259953035042 s | 1.89 |
| const_scatter / PartOpt / cpu / Primal | 0.000012846 s | 0.00000653694001812255 s | 1.97 |
| const_scatter / IPartOpt / cpu / Primal | 0.000012794 s | 0.000006497859976661857 s | 1.97 |
| const_scatter / DefOpt / cpu / Primal | 0.000013355 s | 0.000006806539895478636 s | 1.96 |
| const_scatter / IDefOpt / cpu / Primal | 0.000013262 s | 0.000006674100004602223 s | 1.99 |
| const_scatter / JaXPipe / cpu / Forward | 0.000018088 s | 0.000010450200006744126 s | 1.73 |
| const_scatter / Jax / cpu / Forward | 0.000016955000000000003 s | 0.000009259560010832502 s | 1.83 |
| const_scatter / HLOOpt / cpu / Forward | 0.000017845 s | 0.000010598320095596136 s | 1.68 |
| const_scatter / PartOpt / cpu / Forward | 0.000017654 s | 0.000010302739956387088 s | 1.71 |
| const_scatter / IPartOpt / cpu / Forward | 0.000017745 s | 0.000010669560033420568 s | 1.66 |
| const_scatter / DefOpt / cpu / Forward | 0.000017854 s | 0.000010220859985565769 s | 1.75 |
| const_scatter / IDefOpt / cpu / Forward | 0.000017997 s | 0.0000100798799576296 s | 1.79 |
| const_scatter / JaXPipe / cpu / PreRev | 0.0004868759999999 s | 0.0002859397599968 s | 1.70 |
| const_scatter / JaXPipe / cpu / PostRev | 0.00050926 s | 0.0002769920399259 s | 1.84 |
| const_scatter / JaXPipe / cpu / BothRev | 0.000507781 s | 0.0002800957999716 s | 1.81 |
| const_scatter / Jax / cpu / BothRev | 0.00049152 s | 0.0002796387400485 s | 1.76 |
| const_scatter / HLOOpt / cpu / PreRev | 0.000487235 s | 0.0002858592999837 s | 1.70 |
| const_scatter / HLOOpt / cpu / PostRev | 0.000512209 s | 0.0002851575600652 s | 1.80 |
| const_scatter / HLOOpt / cpu / BothRev | 0.000508304 s | 0.0002833786999508 s | 1.79 |
| const_scatter / PartOpt / cpu / PreRev | 0.000502916 s | 0.0002815988000656 s | 1.79 |
| const_scatter / PartOpt / cpu / PostRev | 0.000501018 s | 0.0002802312200219 s | 1.79 |
| const_scatter / PartOpt / cpu / BothRev | 0.000495407 s | 0.0002805860599619 s | 1.77 |
| const_scatter / IPartOpt / cpu / PreRev | 0.000509577 s | 0.0002813582600538 s | 1.81 |
| const_scatter / IPartOpt / cpu / PostRev | 0.000501245 s | 0.0002807098400262 s | 1.79 |
| const_scatter / IPartOpt / cpu / BothRev | 0.000493155 s | 0.0002834572800384 s | 1.74 |
| const_scatter / DefOpt / cpu / PreRev | 0.000491804 s | 0.0002803923399551 s | 1.75 |
| const_scatter / DefOpt / cpu / PostRev | 0.000510077 s | 0.0002814791199307 s | 1.81 |
| const_scatter / DefOpt / cpu / BothRev | 0.000493616 s | 0.000281565200039 s | 1.75 |
| const_scatter / IDefOpt / cpu / PreRev | 0.000511866 s | 0.0002804958800152 s | 1.82 |
| const_scatter / IDefOpt / cpu / PostRev | 0.000490943 s | 0.0002823369399993 s | 1.74 |
| const_scatter / IDefOpt / cpu / BothRev | 0.000511223 s | 0.0002790884600472 s | 1.83 |
| GenDot / JaXPipe / cuda / Primal | 0.000002528 s | 0.000002016 s | 1.25 |
| GenDot / Jax / cuda / Primal | 0.000002528 s | 0.000002016 s | 1.25 |
| GenDot / HLOOpt / cuda / Primal | 0.000002527 s | 0.000002016 s | 1.25 |
| GenDot / PartOpt / cuda / Primal | 0.00000256 s | 0.000002016 s | 1.27 |
| GenDot / IPartOpt / cuda / Primal | 0.000002559 s | 0.000002015 s | 1.27 |
| GenDot / DefOpt / cuda / Primal | 0.000002527 s | 0.000002015 s | 1.25 |
| GenDot / IDefOpt / cuda / Primal | 0.00000256 s | 0.000002015 s | 1.27 |
| GenDot / JaXPipe / cuda / Forward | 0.000010464 s | 0.000009824 s | 1.07 |
| GenDot / Jax / cuda / Forward | 0.00001056 s | 0.000009344 s | 1.13 |
| GenDot / HLOOpt / cuda / Forward | 0.000010688 s | 0.00000976 s | 1.10 |
| GenDot / PartOpt / cuda / Forward | 0.000010752 s | 0.000009695 s | 1.11 |
| GenDot / IPartOpt / cuda / Forward | 0.000010625 s | 0.00000976 s | 1.09 |
| GenDot / DefOpt / cuda / Forward | 0.000010752 s | 0.00000992 s | 1.08 |
| GenDot / IDefOpt / cuda / Forward | 0.00001072 s | 0.0000096 s | 1.12 |
| GenDot / JaXPipe / cuda / PreRev | 0.000010272 s | 0.000009408 s | 1.09 |
| GenDot / JaXPipe / cuda / PostRev | 0.000010656 s | 0.000009569 s | 1.11 |
| GenDot / JaXPipe / cuda / BothRev | 0.00001056 s | 0.000009696 s | 1.09 |
| GenDot / Jax / cuda / BothRev | 0.000010591 s | 0.000009568 s | 1.11 |
| GenDot / HLOOpt / cuda / PreRev | 0.00001056 s | 0.000009632 s | 1.10 |
| GenDot / HLOOpt / cuda / PostRev | 0.000010528 s | 0.000010016 s | 1.05 |
| GenDot / HLOOpt / cuda / BothRev | 0.000010527 s | 0.000009567 s | 1.10 |
| GenDot / PartOpt / cuda / PreRev | 0.000010592 s | 0.000009344 s | 1.13 |
| GenDot / PartOpt / cuda / PostRev | 0.000010528 s | 0.000009504 s | 1.11 |
| GenDot / PartOpt / cuda / BothRev | 0.000010624 s | 0.000010592 s | 1.00 |
| GenDot / IPartOpt / cuda / PreRev | 0.000010464 s | 0.000010112 s | 1.03 |
| GenDot / IPartOpt / cuda / PostRev | 0.000010592 s | 0.000009727 s | 1.09 |
| GenDot / IPartOpt / cuda / BothRev | 0.000010304 s | 0.000009696 s | 1.06 |
| GenDot / DefOpt / cuda / PreRev | 0.00001056 s | 0.000009696 s | 1.09 |
| GenDot / DefOpt / cuda / PostRev | 0.000010496 s | 0.000009696 s | 1.08 |
| GenDot / DefOpt / cuda / BothRev | 0.000010752 s | 0.000009632 s | 1.12 |
| GenDot / IDefOpt / cuda / PreRev | 0.000010624 s | 0.000010048 s | 1.06 |
| GenDot / IDefOpt / cuda / PostRev | 0.000010816 s | 0.0000096 s | 1.13 |
| GenDot / IDefOpt / cuda / BothRev | 0.000010592 s | 0.00000944 s | 1.12 |
| GenDot / JaXPipe / tpu / Primal | 9.435e-7 s | 9.3015e-7 s | 1.01 |
| GenDot / Jax / tpu / Primal | 9.30275e-7 s | 9.25925e-7 s | 1.00 |
| GenDot / HLOOpt / tpu / Primal | 0.0000016002 s | 0.0000015801499999999998 s | 1.01 |
| GenDot / PartOpt / tpu / Primal | 9.298e-7 s | 9.25925e-7 s | 1.00 |
| GenDot / IPartOpt / tpu / Primal | 9.434e-7 s | 9.30225e-7 s | 1.01 |
| GenDot / DefOpt / tpu / Primal | 0.000001502175 s | 0.000001514625 s | 0.99 |
| GenDot / IDefOpt / tpu / Primal | 0.00000159905 s | 0.000001579175 s | 1.01 |
| GenDot / JaXPipe / tpu / Forward | 0.000003048925 s | 0.000003185025 s | 0.96 |
| GenDot / Jax / tpu / Forward | 0.00000227975 s | 0.00000232695 s | 0.98 |
| GenDot / HLOOpt / tpu / Forward | 0.000003114175 s | 0.000003133375 s | 0.99 |
| GenDot / PartOpt / tpu / Forward | 0.000003135425 s | 0.000003229475 s | 0.97 |
| GenDot / IPartOpt / tpu / Forward | 0.0000031227500000000003 s | 0.0000031208000000000003 s | 1.00 |
| GenDot / DefOpt / tpu / Forward | 0.0000031314 s | 0.000003228025 s | 0.97 |
| GenDot / IDefOpt / tpu / Forward | 0.0000031149 s | 0.0000031296 s | 1.00 |
| GenDot / JaXPipe / tpu / PreRev | 0.000003027025 s | 0.00000298095 s | 1.02 |
| GenDot / JaXPipe / tpu / PostRev | 0.00000237295 s | 0.000002402975 s | 0.99 |
| GenDot / JaXPipe / tpu / BothRev | 0.000003012025 s | 0.000002991025 s | 1.01 |
| GenDot / Jax / tpu / BothRev | 0.000002379625 s | 0.000002403 s | 0.99 |
| GenDot / HLOOpt / tpu / PreRev | 0.0000030072 s | 0.000002973925 s | 1.01 |
| GenDot / HLOOpt / tpu / PostRev | 0.0000029346 s | 0.000002938625 s | 1.00 |
| GenDot / HLOOpt / tpu / BothRev | 0.000003007175000000001 s | 0.000002979075 s | 1.01 |
| GenDot / PartOpt / tpu / PreRev | 0.000002934175 s | 0.0000029275 s | 1.00 |
| GenDot / PartOpt / tpu / PostRev | 0.000002416 s | 0.0000023966 s | 1.01 |
| GenDot / PartOpt / tpu / BothRev | 0.000002935625 s | 0.0000029288500000000003 s | 1.00 |
| GenDot / IPartOpt / tpu / PreRev | 0.00000301045 s | 0.000002976525 s | 1.01 |
| GenDot / IPartOpt / tpu / PostRev | 0.00000237535 s | 0.00000240625 s | 0.99 |
| GenDot / IPartOpt / tpu / BothRev | 0.0000030095 s | 0.0000029808 s | 1.01 |
| GenDot / DefOpt / tpu / PreRev | 0.000002938475 s | 0.0000029376000000000005 s | 1.00 |
| GenDot / DefOpt / tpu / PostRev | 0.00000301595 s | 0.000002976 s | 1.01 |
| GenDot / DefOpt / tpu / BothRev | 0.0000029443000000000003 s | 0.000002935325 s | 1.00 |
| GenDot / IDefOpt / tpu / PreRev | 0.000003014825 s | 0.000002978425 s | 1.01 |
| GenDot / IDefOpt / tpu / PostRev | 0.0000029408 s | 0.00000292695 s | 1.00 |
| GenDot / IDefOpt / tpu / BothRev | 0.00000300985 s | 0.0000029819250000000003 s | 1.01 |
| GenDot / JaXPipe / cpu / Primal | 0.000014426 s | 0.000007302779995370656 s | 1.98 |
| GenDot / Jax / cpu / Primal | 0.000014649 s | 0.000006674979922536295 s | 2.19 |
| GenDot / HLOOpt / cpu / Primal | 0.000013994 s | 0.000007559800014860229 s | 1.85 |
| GenDot / PartOpt / cpu / Primal | 0.000014818 s | 0.000006457319959736196 s | 2.29 |
| GenDot / IPartOpt / cpu / Primal | 0.000014749 s | 0.0000068904400359315336 s | 2.14 |
| GenDot / DefOpt / cpu / Primal | 0.000014132 s | 0.0000070318200778274335 s | 2.01 |
| GenDot / IDefOpt / cpu / Primal | 0.000014077 s | 0.000007177160059654853 s | 1.96 |
| GenDot / JaXPipe / cpu / Forward | 0.000019369 s | 0.00001050371996825561 s | 1.84 |
| GenDot / Jax / cpu / Forward | 0.000020224 s | 0.000010276840002916288 s | 1.97 |
| GenDot / HLOOpt / cpu / Forward | 0.000019254 s | 0.000010662299937393982 s | 1.81 |
| GenDot / PartOpt / cpu / Forward | 0.000019312 s | 0.000010708300105761735 s | 1.80 |
| GenDot / IPartOpt / cpu / Forward | 0.000019625 s | 0.00001087337996068527 s | 1.80 |
| GenDot / DefOpt / cpu / Forward | 0.000019033 s | 0.0000105288599661435 s | 1.81 |
| GenDot / IDefOpt / cpu / Forward | 0.000018991 s | 0.000010731519978435244 s | 1.77 |
| GenDot / JaXPipe / cpu / PreRev | 0.000019282 s | 0.000010836680030479328 s | 1.78 |
| GenDot / JaXPipe / cpu / PostRev | 0.00002047 s | 0.000009931940003298224 s | 2.06 |
| GenDot / JaXPipe / cpu / BothRev | 0.000019124 s | 0.000011299920006422324 s | 1.69 |
| GenDot / Jax / cpu / BothRev | 0.000019995 s | 0.000010131740000360878 s | 1.97 |
| GenDot / HLOOpt / cpu / PreRev | 0.000019051 s | 0.000010890920038946206 s | 1.75 |
| GenDot / HLOOpt / cpu / PostRev | 0.000019553 s | 0.000012772999980370514 s | 1.53 |
| GenDot / HLOOpt / cpu / BothRev | 0.000019489 s | 0.000010551199993642513 s | 1.85 |
| GenDot / PartOpt / cpu / PreRev | 0.000019056 s | 0.00001108557997213211 s | 1.72 |
| GenDot / PartOpt / cpu / PostRev | 0.000020461 s | 0.00000987782004813198 s | 2.07 |
| GenDot / PartOpt / cpu / BothRev | 0.000019325 s | 0.000011596880067372697 s | 1.67 |
| GenDot / IPartOpt / cpu / PreRev | 0.000019075000000000003 s | 0.000010663179964467415 s | 1.79 |
| GenDot / IPartOpt / cpu / PostRev | 0.000020707 s | 0.00000973392006926588 s | 2.13 |
| GenDot / IPartOpt / cpu / BothRev | 0.000019336 s | 0.000010919259948423132 s | 1.77 |
| GenDot / DefOpt / cpu / PreRev | 0.000019382 s | 0.000010918499992840224 s | 1.78 |
| GenDot / DefOpt / cpu / PostRev | 0.00001962 s | 0.00001102416010326124 s | 1.78 |
| GenDot / DefOpt / cpu / BothRev | 0.000019734 s | 0.000010676539968699217 s | 1.85 |
| GenDot / IDefOpt / cpu / PreRev | 0.000018944 s | 0.000010241140007565263 s | 1.85 |
| GenDot / IDefOpt / cpu / PostRev | 0.000019565 s | 0.000010610380031721434 s | 1.84 |
| GenDot / IDefOpt / cpu / BothRev | 0.000019067 s | 0.000010912239995377604 s | 1.75 |
| hlo_ffi / JaXPipe / cuda / Primal | 0.0000023670000000000004 s | 0.000001983 s | 1.19 |
| hlo_ffi / Jax / cuda / Primal | 0.0000023670000000000004 s | 0.000001983 s | 1.19 |
| hlo_ffi / HLOOpt / cuda / Primal | 0.000002368 s | 0.000001952 s | 1.21 |
| hlo_ffi / PartOpt / cuda / Primal | 0.0000023670000000000004 s | 0.000001952 s | 1.21 |
| hlo_ffi / IPartOpt / cuda / Primal | 0.0000023670000000000004 s | 0.000001983 s | 1.19 |
| hlo_ffi / DefOpt / cuda / Primal | 0.0000023670000000000004 s | 0.000001952 s | 1.21 |
| hlo_ffi / IDefOpt / cuda / Primal | 0.0000023670000000000004 s | 0.000001952 s | 1.21 |
| hlo_ffi / JaXPipe / cuda / Forward | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / Jax / cuda / Forward | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / HLOOpt / cuda / Forward | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / PartOpt / cuda / Forward | 0.000002464 s | 0.000002047 s | 1.20 |
| hlo_ffi / IPartOpt / cuda / Forward | 0.000002464 s | 0.000002047 s | 1.20 |
| hlo_ffi / DefOpt / cuda / Forward | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / IDefOpt / cuda / Forward | 0.000002464 s | 0.000002047 s | 1.20 |
| hlo_ffi / JaXPipe / cuda / PreRev | 0.000002432 s | 0.000002047 s | 1.19 |
| hlo_ffi / JaXPipe / cuda / PostRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / JaXPipe / cuda / BothRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / Jax / cuda / BothRev | 0.000002433 s | 0.000002047 s | 1.19 |
| hlo_ffi / HLOOpt / cuda / PreRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / HLOOpt / cuda / PostRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / HLOOpt / cuda / BothRev | 0.000002432 s | 0.000002047 s | 1.19 |
| hlo_ffi / PartOpt / cuda / PreRev | 0.000002463 s | 0.000002016 s | 1.22 |
| hlo_ffi / PartOpt / cuda / PostRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / PartOpt / cuda / BothRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / IPartOpt / cuda / PreRev | 0.000002432 s | 0.000002047 s | 1.19 |
| hlo_ffi / IPartOpt / cuda / PostRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / IPartOpt / cuda / BothRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / DefOpt / cuda / PreRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / DefOpt / cuda / PostRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / DefOpt / cuda / BothRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / IDefOpt / cuda / PreRev | 0.000002432 s | 0.000002047 s | 1.19 |
| hlo_ffi / IDefOpt / cuda / PostRev | 0.000002463 s | 0.000002047 s | 1.20 |
| hlo_ffi / IDefOpt / cuda / BothRev | 0.000002432 s | 0.000002047 s | 1.19 |
| hlo_ffi / JaXPipe / tpu / Primal | 9.198e-7 s | 9.28675e-7 s | 0.99 |
| hlo_ffi / Jax / tpu / Primal | 9.5005e-7 s | 9.53925e-7 s | 1.00 |
| hlo_ffi / HLOOpt / tpu / Primal | 8.958249999999999e-7 s | 9.07475e-7 s | 0.99 |
| hlo_ffi / PartOpt / tpu / Primal | 9.50575e-7 s | 9.4965e-7 s | 1.00 |
| hlo_ffi / IPartOpt / tpu / Primal | 9.00825e-7 s | 9.09325e-7 s | 0.99 |
| hlo_ffi / DefOpt / tpu / Primal | 9.49675e-7 s | 9.52175e-7 s | 1.00 |
| hlo_ffi / IDefOpt / tpu / Primal | 8.94775e-7 s | 9.059e-7 s | 0.99 |
| hlo_ffi / JaXPipe / tpu / Forward | 9.49475e-7 s | 9.49325e-7 s | 1.00 |
| hlo_ffi / Jax / tpu / Forward | 9.81825e-7 s | 9.81775e-7 s | 1.00 |
| hlo_ffi / HLOOpt / tpu / Forward | 9.74e-7 s | 9.7415e-7 s | 1.00 |
| hlo_ffi / PartOpt / tpu / Forward | 9.345e-7 s | 9.33875e-7 s | 1.00 |
| hlo_ffi / IPartOpt / tpu / Forward | 9.74675e-7 s | 9.739499999999998e-7 s | 1.00 |
| hlo_ffi / DefOpt / tpu / Forward | 9.34125e-7 s | 9.33825e-7 s | 1.00 |
| hlo_ffi / IDefOpt / tpu / Forward | 9.74375e-7 s | 9.74475e-7 s | 1.00 |
| hlo_ffi / JaXPipe / tpu / PreRev | 9.3225e-7 s | 9.3735e-7 s | 0.99 |
| hlo_ffi / JaXPipe / tpu / PostRev | 9.648e-7 s | 9.65e-7 s | 1.00 |
| hlo_ffi / JaXPipe / tpu / BothRev | 9.6195e-7 s | 9.61925e-7 s | 1.00 |
| hlo_ffi / Jax / tpu / BothRev | 9.653e-7 s | 9.643e-7 s | 1.00 |
| hlo_ffi / HLOOpt / tpu / PreRev | 9.62175e-7 s | 9.616e-7 s | 1.00 |
| hlo_ffi / HLOOpt / tpu / PostRev | 9.65125e-7 s | 9.6495e-7 s | 1.00 |
| hlo_ffi / HLOOpt / tpu / BothRev | 9.61375e-7 s | 9.6135e-7 s | 1.00 |
| hlo_ffi / PartOpt / tpu / PreRev | 9.65225e-7 s | 9.64825e-7 s | 1.00 |
| hlo_ffi / PartOpt / tpu / PostRev | 9.61625e-7 s | 9.6165e-7 s | 1.00 |
| hlo_ffi / PartOpt / tpu / BothRev | 9.647e-7 s | 9.64875e-7 s | 1.00 |
| hlo_ffi / IPartOpt / tpu / PreRev | 9.61675e-7 s | 9.619e-7 s | 1.00 |
| hlo_ffi / IPartOpt / tpu / PostRev | 9.65275e-7 s | 9.646e-7 s | 1.00 |
| hlo_ffi / IPartOpt / tpu / BothRev | 9.6175e-7 s | 9.61775e-7 s | 1.00 |
| hlo_ffi / DefOpt / tpu / PreRev | 9.6505e-7 s | 9.647e-7 s | 1.00 |
| hlo_ffi / DefOpt / tpu / PostRev | 9.624e-7 s | 9.62225e-7 s | 1.00 |
| hlo_ffi / DefOpt / tpu / BothRev | 9.6505e-7 s | 9.645e-7 s | 1.00 |
| hlo_ffi / IDefOpt / tpu / PreRev | 9.61175e-7 s | 9.617e-7 s | 1.00 |
| hlo_ffi / IDefOpt / tpu / PostRev | 9.6525e-7 s | 9.6485e-7 s | 1.00 |
| hlo_ffi / IDefOpt / tpu / BothRev | 9.62125e-7 s | 9.62e-7 s | 1.00 |
| hlo_ffi / JaXPipe / cpu / Primal | 0.000016992 s | 0.000011233720051677663 s | 1.51 |
| hlo_ffi / Jax / cpu / Primal | 0.000017114 s | 0.000010917700001300543 s | 1.57 |
| hlo_ffi / HLOOpt / cpu / Primal | 0.000016933 s | 0.00001101467994885752 s | 1.54 |
| hlo_ffi / PartOpt / cpu / Primal | 0.000017257 s | 0.000010988760004693175 s | 1.57 |
| hlo_ffi / IPartOpt / cpu / Primal | 0.000017173 s | 0.000011737320073734736 s | 1.46 |
| hlo_ffi / DefOpt / cpu / Primal | 0.000017196 s | 0.000010840160084626403 s | 1.59 |
| hlo_ffi / IDefOpt / cpu / Primal | 0.000016784 s | 0.00001093057993784896 s | 1.54 |
| hlo_ffi / JaXPipe / cpu / Forward | 0.000024274 s | 0.000015760040041641333 s | 1.54 |
| hlo_ffi / Jax / cpu / Forward | 0.000023215 s | 0.00001529287999801454 s | 1.52 |
| hlo_ffi / HLOOpt / cpu / Forward | 0.000023543 s | 0.00001593963992490899 s | 1.48 |
| hlo_ffi / PartOpt / cpu / Forward | 0.000023531 s | 0.000015218719981930916 s | 1.55 |
| hlo_ffi / IPartOpt / cpu / Forward | 0.000023047 s | 0.00001521984002465615 s | 1.51 |
| hlo_ffi / DefOpt / cpu / Forward | 0.000023179000000000003 s | 0.000015428400020027765 s | 1.50 |
| hlo_ffi / IDefOpt / cpu / Forward | 0.00002318 s | 0.000015707859984104288 s | 1.48 |
| hlo_ffi / JaXPipe / cpu / PreRev | 0.000023691 s | 0.000015456880028068554 s | 1.53 |
| hlo_ffi / JaXPipe / cpu / PostRev | 0.000023497 s | 0.00001708776000668877 s | 1.38 |
| hlo_ffi / JaXPipe / cpu / BothRev | 0.000022992 s | 0.000014944239974283844 s | 1.54 |
| hlo_ffi / Jax / cpu / BothRev | 0.000023301 s | 0.000015678040035709273 s | 1.49 |
| hlo_ffi / HLOOpt / cpu / PreRev | 0.000023554 s | 0.000016419940020568902 s | 1.43 |
| hlo_ffi / HLOOpt / cpu / PostRev | 0.00002294 s | 0.00001703794001514325 s | 1.35 |
| hlo_ffi / HLOOpt / cpu / BothRev | 0.000023581 s | 0.000014703899996675318 s | 1.60 |
| hlo_ffi / PartOpt / cpu / PreRev | 0.000023766 s | 0.000015609500005666634 s | 1.52 |
| hlo_ffi / PartOpt / cpu / PostRev | 0.000023075 s | 0.000015246840011968744 s | 1.51 |
| hlo_ffi / PartOpt / cpu / BothRev | 0.00002294 s | 0.000015681820041208994 s | 1.46 |
| hlo_ffi / IPartOpt / cpu / PreRev | 0.000023703 s | 0.000015859059949434595 s | 1.49 |
| hlo_ffi / IPartOpt / cpu / PostRev | 0.000023015 s | 0.00001532091999251861 s | 1.50 |
| hlo_ffi / IPartOpt / cpu / BothRev | 0.000023025 s | 0.000014792100046179256 s | 1.56 |
| hlo_ffi / DefOpt / cpu / PreRev | 0.000023265 s | 0.00001618226009668433 s | 1.44 |
| hlo_ffi / DefOpt / cpu / PostRev | 0.000023557 s | 0.000015356060011981753 s | 1.53 |
| hlo_ffi / DefOpt / cpu / BothRev | 0.000023214 s | 0.000015247140036080964 s | 1.52 |
| hlo_ffi / IDefOpt / cpu / PreRev | 0.000023432 s | 0.00001594363999174675 s | 1.47 |
| hlo_ffi / IDefOpt / cpu / PostRev | 0.000023152 s | 0.000015076339986990204 s | 1.54 |
| hlo_ffi / IDefOpt / cpu / BothRev | 0.000023514 s | 0.000015262999950209634 s | 1.54 |
jaxmd20 / JaXPipe / cuda / Primal |
0.0014987099999999 s |
0.001487797 s |
1.01 |
jaxmd20 / Jax / cuda / Primal |
0.001515256 s |
0.001502389 s |
1.01 |
jaxmd20 / HLOOpt / cuda / Primal |
0.001406967 s |
0.001337142 s |
1.05 |
jaxmd20 / PartOpt / cuda / Primal |
0.001371705 s |
0.001336182 s |
1.03 |
jaxmd20 / IPartOpt / cuda / Primal |
0.001437721 s |
0.001332982 s |
1.08 |
jaxmd20 / DefOpt / cuda / Primal |
0.00095033 s |
0.000924473 s |
1.03 |
jaxmd20 / IDefOpt / cuda / Primal |
0.000964795 s |
0.000986297 s |
0.98 |
jaxmd20 / JaXPipe / cuda / Forward |
0.001645912 s |
0.001555061 s |
1.06 |
jaxmd20 / Jax / cuda / Forward |
0.00189711 s |
0.0017823869999999 s |
1.06 |
jaxmd20 / HLOOpt / cuda / Forward |
0.001745366 s |
0.0016205 s |
1.08 |
jaxmd20 / PartOpt / cuda / Forward |
0.001725943 s |
0.001677841 s |
1.03 |
jaxmd20 / IPartOpt / cuda / Forward |
0.001724663 s |
0.001613684 s |
1.07 |
jaxmd20 / DefOpt / cuda / Forward |
0.001739703 s |
0.001646644 s |
1.06 |
jaxmd20 / IDefOpt / cuda / Forward |
0.001743126 s |
0.001617236 s |
1.08 |
jaxmd20 / JaXPipe / cuda / PreRev |
0.002767599 s |
0.002683692 s |
1.03 |
jaxmd20 / JaXPipe / cuda / PostRev |
0.005485442 s |
0.005353332 s |
1.02 |
jaxmd20 / JaXPipe / cuda / BothRev |
0.002759665 s |
0.0026781229999999 s |
1.03 |
jaxmd20 / Jax / cuda / BothRev |
0.0054966749999999 s |
0.005319223 s |
1.03 |
jaxmd20 / HLOOpt / cuda / PreRev |
0.0028658719999999 s |
0.002741707 s |
1.05 |
jaxmd20 / HLOOpt / cuda / PostRev |
0.005489508 s |
0.00534098 s |
1.03 |
jaxmd20 / HLOOpt / cuda / BothRev |
0.002834129 s |
0.0027194669999999 s |
1.04 |
jaxmd20 / PartOpt / cuda / PreRev |
0.002903921 s |
0.002839114 s |
1.02 |
jaxmd20 / PartOpt / cuda / PostRev |
0.005585346 s |
0.005380503 s |
1.04 |
jaxmd20 / PartOpt / cuda / BothRev |
0.002809936 s |
0.002758475 s |
1.02 |
jaxmd20 / IPartOpt / cuda / PreRev |
0.0029408159999999 s |
0.002806923 s |
1.05 |
jaxmd20 / IPartOpt / cuda / PostRev |
0.005604355 s |
0.005431319 s |
1.03 |
jaxmd20 / IPartOpt / cuda / BothRev |
0.002861937 s |
0.002746092 s |
1.04 |
jaxmd20 / DefOpt / cuda / PreRev |
0.002919311 s |
0.002836201 s |
1.03 |
jaxmd20 / DefOpt / cuda / PostRev |
0.002856753 s |
0.002780043 s |
1.03 |
jaxmd20 / DefOpt / cuda / BothRev |
0.00298264 s |
0.002767851 s |
1.08 |
jaxmd20 / IDefOpt / cuda / PreRev |
0.002914193 s |
0.002789162 s |
1.04 |
jaxmd20 / IDefOpt / cuda / PostRev |
0.002356563 s |
0.002326062 s |
1.01 |
jaxmd20 / IDefOpt / cuda / BothRev |
0.0028880159999999 s |
0.002757003 s |
1.05 |
jaxmd20 / JaXPipe / tpu / Primal |
0.0092835 s |
0.009271920625 s |
1.00 |
jaxmd20 / Jax / tpu / Primal |
0.0092628912499999 s |
0.00926478375 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Primal |
0.009153585625 s |
0.009154375 s |
1.00 |
jaxmd20 / PartOpt / tpu / Primal |
0.00919604125 s |
0.0091968425 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Primal |
0.009199874375 s |
0.00920241 s |
1.00 |
jaxmd20 / DefOpt / tpu / Primal |
0.008794388125 s |
0.00879217375 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Primal |
0.00870145125 s |
0.008697745625 s |
1.00 |
jaxmd20 / JaXPipe / tpu / Forward |
0.017430190625 s |
0.01741725375 s |
1.00 |
jaxmd20 / Jax / tpu / Forward |
0.01873101125 s |
0.01872633625 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Forward |
0.017404078125 s |
0.017394088125 s |
1.00 |
jaxmd20 / PartOpt / tpu / Forward |
0.017422699375 s |
0.01740757375 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Forward |
0.0174257087499999 s |
0.0174110075 s |
1.00 |
jaxmd20 / DefOpt / tpu / Forward |
0.01742289375 s |
0.01741526125 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Forward |
0.01742419125 s |
0.017414086875 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PreRev |
0.02544791 s |
0.0254551675 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PostRev |
0.021853091875 s |
0.021894850625 s |
1.00 |
jaxmd20 / JaXPipe / tpu / BothRev |
0.0254696 s |
0.02547269 s |
1.00 |
jaxmd20 / Jax / tpu / BothRev |
0.0218608162499999 s |
0.021891873125 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PreRev |
0.0255818937499999 s |
0.02558601875 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PostRev |
0.02070597125 s |
0.020728129375 s |
1.00 |
jaxmd20 / HLOOpt / tpu / BothRev |
0.025678291875 s |
0.02567900375 s |
1.00 |
jaxmd20 / PartOpt / tpu / PreRev |
0.02547572625 s |
0.025504339375 s |
1.00 |
jaxmd20 / PartOpt / tpu / PostRev |
0.02151574875 s |
0.021511336875 s |
1.00 |
jaxmd20 / PartOpt / tpu / BothRev |
0.025555076875 s |
0.025595189375 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PreRev |
0.0254716393749999 s |
0.025476863125 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PostRev |
0.0215189925 s |
0.021535506875 s |
1.00 |
jaxmd20 / IPartOpt / tpu / BothRev |
0.025553548125 s |
0.025550848125 s |
1.00 |
jaxmd20 / DefOpt / tpu / PreRev |
0.025477053125 s |
0.0255030593749999 s |
1.00 |
jaxmd20 / DefOpt / tpu / PostRev |
0.018808345 s |
0.01880534625 s |
1.00 |
jaxmd20 / DefOpt / tpu / BothRev |
0.025561075625 s |
0.025599089375 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PreRev |
0.025476626875 s |
0.02547722375 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PostRev |
0.01833314125 s |
0.018347999375 s |
1.00 |
jaxmd20 / IDefOpt / tpu / BothRev |
0.025554355625 s |
0.025558546875 s |
1.00 |
jaxmd40 / JaXPipe / cpu / Primal |
0.069864041 s |
0.070455611 s |
0.99 |
jaxmd40 / Jax / cpu / Primal |
0.070985658 s |
0.070977207 s |
1.00 |
jaxmd40 / HLOOpt / cpu / Primal |
0.090561247 s |
0.090696345 s |
1.00 |
jaxmd40 / PartOpt / cpu / Primal |
0.072999064 s |
0.071462105 s |
1.02 |
jaxmd40 / IPartOpt / cpu / Primal |
0.071307804 s |
0.0697529569999999 s |
1.02 |
jaxmd40 / DefOpt / cpu / Primal |
0.085994243 s |
0.090211465 s |
0.95 |
jaxmd40 / IDefOpt / cpu / Primal |
0.089946447 s |
0.091006028 s |
0.99 |
jaxmd40 / JaXPipe / cpu / Forward |
0.157225393 s |
0.160325353 s |
0.98 |
jaxmd40 / Jax / cpu / Forward |
0.090850408 s |
0.086059653 s |
1.06 |
jaxmd40 / HLOOpt / cpu / Forward |
0.163614447 s |
0.163041726 s |
1.00 |
jaxmd40 / PartOpt / cpu / Forward |
0.157494823 s |
0.158902308 s |
0.99 |
jaxmd40 / IPartOpt / cpu / Forward |
0.156424391 s |
0.156841914 s |
1.00 |
jaxmd40 / DefOpt / cpu / Forward |
0.159538319 s |
0.150396513 s |
1.06 |
jaxmd40 / IDefOpt / cpu / Forward |
0.158084739 s |
0.159234883 s |
0.99 |
jaxmd40 / JaXPipe / cpu / PreRev |
0.220397539 s |
0.2329717379999999 s |
0.95 |
jaxmd40 / JaXPipe / cpu / PostRev |
0.133936695 s |
0.136318554 s |
0.98 |
jaxmd40 / JaXPipe / cpu / BothRev |
0.220050767 s |
0.2388221069999999 s |
0.92 |
jaxmd40 / Jax / cpu / BothRev |
0.123031299 s |
0.139153379 s |
0.88 |
jaxmd40 / HLOOpt / cpu / PreRev |
0.223328987 s |
0.21605981 s |
1.03 |
jaxmd40 / HLOOpt / cpu / PostRev |
0.176871989 s |
0.174469936 s |
1.01 |
jaxmd40 / HLOOpt / cpu / BothRev |
0.256407197 s |
0.246942098 s |
1.04 |
jaxmd40 / PartOpt / cpu / PreRev |
0.210923072 s |
0.2335476779999999 s |
0.90 |
jaxmd40 / PartOpt / cpu / PostRev |
0.1298896269999999 s |
0.1233247329999999 s |
1.05 |
jaxmd40 / PartOpt / cpu / BothRev |
0.249815782 s |
0.265456492 s |
0.94 |
jaxmd40 / IPartOpt / cpu / PreRev |
0.216901774 s |
0.219451872 s |
0.99 |
jaxmd40 / IPartOpt / cpu / PostRev |
0.125544352 s |
0.128084786 s |
0.98 |
jaxmd40 / IPartOpt / cpu / BothRev |
0.240492003 s |
0.243523231 s |
0.99 |
jaxmd40 / DefOpt / cpu / PreRev |
0.227645213 s |
0.217161572 s |
1.05 |
jaxmd40 / DefOpt / cpu / PostRev |
0.166643279 s |
0.191683424 s |
0.87 |
jaxmd40 / DefOpt / cpu / BothRev |
0.232830793 s |
0.229506267 s |
1.01 |
jaxmd40 / IDefOpt / cpu / PreRev |
0.221408361 s |
0.204105072 s |
1.08 |
jaxmd40 / IDefOpt / cpu / PostRev |
0.169468681 s |
0.167389919 s |
1.01 |
jaxmd40 / IDefOpt / cpu / BothRev |
0.236044521 s |
0.263771102 s |
0.89 |
jaxley_l5pc / JaXPipe / cuda / Primal |
3.0925719955048407 s |
||
jaxley_l5pc / Jax / cuda / Primal |
3.091246781499649 s |
||
jaxley_l5pc / HLOOpt / cuda / Primal |
3.641045484997449 s |
||
jaxley_l5pc / PartOpt / cuda / Primal |
3.4375655574986013 s |
||
jaxley_l5pc / IPartOpt / cuda / Primal |
3.4381210490028025 s |
||
jaxley_l5pc / DefOpt / cuda / Primal |
3.1555231930033187 s |
||
jaxley_l5pc / IDefOpt / cuda / Primal |
3.3513392664972343 s |
||
jaxley_l5pc / JaXPipe / cuda / Forward |
6.33260042549955 s |
||
jaxley_l5pc / Jax / cuda / Forward |
5.855309637503524 s |
||
jaxley_l5pc / HLOOpt / cuda / Forward |
6.116221485499409 s |
||
jaxley_l5pc / PartOpt / cuda / Forward |
6.331197531995713 s |
||
jaxley_l5pc / IPartOpt / cuda / Forward |
6.331824204004079 s |
||
jaxley_l5pc / DefOpt / cuda / Forward |
6.331567016495683 s |
||
jaxley_l5pc / IDefOpt / cuda / Forward |
6.331120531001943 s |
||
jaxley_l5pc / JaXPipe / cpu / Primal |
1.0277289004998238 s |
||
jaxley_l5pc / Jax / cpu / Primal |
1.0022906139997758 s |
||
jaxley_l5pc / HLOOpt / cpu / Primal |
1.052065173500523 s |
||
jaxley_l5pc / PartOpt / cpu / Primal |
0.8358459615001266 s |
||
jaxley_l5pc / IPartOpt / cpu / Primal |
0.987979125000038 s |
||
jaxley_l5pc / DefOpt / cpu / Primal |
0.9394538745000318 s |
||
jaxley_l5pc / IDefOpt / cpu / Primal |
0.980509082000026 s |
||
jaxley_l5pc / JaXPipe / cpu / Forward |
21.198061596499883 s |
||
jaxley_l5pc / Jax / cpu / Forward |
26.150097345000177 s |
||
jaxley_l5pc / HLOOpt / cpu / Forward |
21.25610011299977 s |
||
jaxley_l5pc / PartOpt / cpu / Forward |
21.31876329099987 s |
||
jaxley_l5pc / IPartOpt / cpu / Forward |
21.608647604 s |
||
jaxley_l5pc / DefOpt / cpu / Forward |
21.42443357550019 s |
||
jaxley_l5pc / IDefOpt / cpu / Forward |
21.19515658599994 s |
||
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal |
0.000295582 s |
0.000283262 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal |
0.000295518 s |
0.000282558 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal |
0.000302238 s |
0.000290237 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal |
0.000295359 s |
0.000282238 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal |
0.000295263 s |
0.000283134 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal |
0.000303422 s |
0.00029011 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal |
0.0003018859999999 s |
0.000290237 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward |
0.000582461 s |
0.000559131 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward |
0.000566589 s |
0.0005419159999999 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward |
0.000583261 s |
0.000559963 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward |
0.00058278 s |
0.000558619 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward |
0.000582365 s |
0.0005594829999999 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward |
0.000582109 s |
0.000558172 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward |
0.0005830039999999 s |
0.0005583949999999 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev |
0.001054362 s |
0.001028888 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev |
0.001011802 s |
0.000987672 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev |
0.001048538 s |
0.001027992 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev |
0.001004026 s |
0.000991256 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev |
0.001035002 s |
0.00101852 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev |
0.001060699 s |
0.001043415 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev |
0.001033626 s |
0.001014904 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev |
0.0010476749999999 s |
0.001031736 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev |
0.000998715 s |
0.000982233 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev |
0.001048186 s |
0.001030904 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev |
0.001050107 s |
0.00103148 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev |
0.0009986819999999 s |
0.000977753 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev |
0.00105081 s |
0.001031 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev |
0.0010502979999999 s |
0.00102556 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev |
0.000984539 s |
0.000965176 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev |
0.0010487299999999 s |
0.001027641 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev |
0.001048762 s |
0.0010238 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev |
0.001051802 s |
0.001025368 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev |
0.001051995 s |
0.001026904 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal |
0.00012932075 s |
0.00012361825 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal |
0.00012362925 s |
0.00012670775 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal |
0.000158851 s |
0.00015277825 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal |
0.00013093625 s |
0.00013410125 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal |
0.00013707525 s |
0.00013109675 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal |
0.00014527725 s |
0.00014807925 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal |
0.00015678025 s |
0.00015087025 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward |
0.000213655 s |
0.0002119264999999 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward |
0.00026234575 s |
0.00026120825 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward |
0.00021999375 s |
0.00021193525 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward |
0.0002138494999999 s |
0.000218387 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward |
0.00021629575 s |
0.00021170925 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward |
0.0002179435 s |
0.0002185024999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward |
0.00021621525 s |
0.0002119279999999 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev |
0.00035764825 s |
0.0003552015 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev |
0.0002559782499999 s |
0.00025893075 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev |
0.0003571572499999 s |
0.00035548025 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev |
0.00025769525 s |
0.00025925425 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev |
0.00035793225 s |
0.0003552615 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev |
0.0002910579999999 s |
0.00029143275 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev |
0.00035788775 s |
0.000355109 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev |
0.00035559275 s |
0.000356625 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev |
0.00027471125 s |
0.0002719432499999 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev |
0.0003556765 s |
0.000356626 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev |
0.00035784925 s |
0.0003552165 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev |
0.0002721227499999 s |
0.00027480525 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev |
0.0003575155 s |
0.000355203 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev |
0.0003577875 s |
0.0003589354999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev |
0.00028508275 s |
0.000283826 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev |
0.0003575419999999 s |
0.000359149 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev |
0.00035971475 s |
0.000357337 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev |
0.000301159 s |
0.0003017189999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev |
0.000360002 s |
0.0003577595 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.00139711 s |
0.0009588552000423 s |
1.46 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.001357834 s |
0.0009516709998933 s |
1.43 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.001523989 s |
0.0010057729999971 s |
1.52 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0014595629999999 s |
0.0009291818001656 s |
1.57 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0013188599999999 s |
0.0009374562001539 s |
1.41 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.001504152 s |
0.0009786819997316 s |
1.54 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.001549951 s |
0.0009964631999537 s |
1.56 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.0042980219999999 s |
0.0022665190001134 s |
1.90 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.004218632 s |
0.002419118599937 s |
1.74 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.004227222 s |
0.0022202353999091 s |
1.90 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.003932014 s |
0.002213902999938 s |
1.78 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.00384483 s |
0.0022291920000498 s |
1.72 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.004096659 s |
0.0022840097997686 s |
1.79 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.004299596 s |
0.0023154484002589 s |
1.86 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.0075002969999999 s |
0.0059770792000563 s |
1.25 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.00869369 s |
0.006564533799974 s |
1.32 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.007583365 s |
0.0062062592000074 s |
1.22 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.006975437 s |
0.0070674812001016 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.007455173 s |
0.0052356914000483 s |
1.42 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.005981815 s |
0.0062103693997414 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.007810482 s |
0.006651248199887 s |
1.17 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.007870404 s |
0.0060678497999106 s |
1.30 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.008260557 s |
0.0062865141995644 s |
1.31 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.007379295 s |
0.005630325400125 s |
1.31 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.008052593 s |
0.005550784800107 s |
1.45 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.008406844 s |
0.0056538740001997 s |
1.49 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.00782498 s |
0.0065768796001066 s |
1.19 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.008337991 s |
0.0056750671999907 s |
1.47 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.00687821 s |
0.0068473344001176 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.008554965 s |
0.0057122905998767 s |
1.50 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.007913134 s |
0.005576557400127 s |
1.42 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.008546929 s |
0.0068063960001381 s |
1.26 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.008348635 s |
0.0055609175997233 s |
1.50 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal |
1.702040499 s |
1.702951013 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal |
1.705016895 s |
1.705186629 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal |
1.71504903 s |
1.717234453 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal |
1.696890089 s |
1.696901907 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal |
1.694657655 s |
1.694897228 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal |
1.66591526 s |
1.66528891 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal |
1.911819131 s |
1.914906271 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal |
3.03812712875 s |
3.038658345625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal |
3.038645011875 s |
3.03917762375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal |
3.12088705875 s |
3.12155321875 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal |
3.0594905100000003 s |
3.0599546075 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal |
3.0596811125 s |
3.060167825 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal |
2.102192178125 s |
2.10236119125 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal |
2.94652327875 s |
2.948083710625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal |
5.7309540740000005 s |
5.872614181 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal |
5.873826306 s |
6.02566513 s |
0.97 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal |
5.945230124 s |
6.010850407 s |
0.99 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal |
5.813805289 s |
5.953050324 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal |
5.916455158 s |
6.100518878 s |
0.97 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal |
2.1836341860000004 s |
2.279712861 s |
0.96 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal |
6.294406859 s |
6.451860342 s |
0.98 |
| scatter_sum / JaXPipe / cuda / Primal | 0.000010688 s | 0.000010016 s | 1.07 |
| scatter_sum / Jax / cuda / Primal | 0.000010432 s | 0.000009568 s | 1.09 |
| scatter_sum / HLOOpt / cuda / Primal | 0.000010912 s | 0.0000096 s | 1.14 |
| scatter_sum / PartOpt / cuda / Primal | 0.0000104 s | 0.000009856 s | 1.06 |
| scatter_sum / IPartOpt / cuda / Primal | 0.00001072 s | 0.00000992 s | 1.08 |
| scatter_sum / DefOpt / cuda / Primal | 0.000010752 s | 0.000009504 s | 1.13 |
| scatter_sum / IDefOpt / cuda / Primal | 0.0000104 s | 0.000009696 s | 1.07 |
| scatter_sum / JaXPipe / cuda / Forward | 0.000017343 s | 0.00001664 s | 1.04 |
| scatter_sum / Jax / cuda / Forward | 0.000017311 s | 0.000016609 s | 1.04 |
| scatter_sum / HLOOpt / cuda / Forward | 0.000017760000000000003 s | 0.000016607 s | 1.07 |
| scatter_sum / PartOpt / cuda / Forward | 0.000017472 s | 0.00001728 s | 1.01 |
| scatter_sum / IPartOpt / cuda / Forward | 0.000017439 s | 0.000016385 s | 1.06 |
| scatter_sum / DefOpt / cuda / Forward | 0.0000176 s | 0.000017056 s | 1.03 |
| scatter_sum / IDefOpt / cuda / Forward | 0.000017568000000000002 s | 0.000016927999999999998 s | 1.04 |
| scatter_sum / JaXPipe / cuda / PreRev | 0.000017534999999999997 s | 0.000016768000000000003 s | 1.05 |
| scatter_sum / JaXPipe / cuda / PostRev | 0.000017408 s | 0.000016223 s | 1.07 |
| scatter_sum / JaXPipe / cuda / BothRev | 0.000017152 s | 0.000016383999999999998 s | 1.05 |
| scatter_sum / Jax / cuda / BothRev | 0.000017406999999999998 s | 0.000016288 s | 1.07 |
| scatter_sum / HLOOpt / cuda / PreRev | 0.00001744 s | 0.00001696 s | 1.03 |
| scatter_sum / HLOOpt / cuda / PostRev | 0.000017215 s | 0.000016128 s | 1.07 |
| scatter_sum / HLOOpt / cuda / BothRev | 0.000017536 s | 0.000016576000000000002 s | 1.06 |
| scatter_sum / PartOpt / cuda / PreRev | 0.000017919999999999998 s | 0.000017184 s | 1.04 |
| scatter_sum / PartOpt / cuda / PostRev | 0.000016992 s | 0.000016832 s | 1.01 |
| scatter_sum / PartOpt / cuda / BothRev | 0.0000176 s | 0.000016896000000000002 s | 1.04 |
| scatter_sum / IPartOpt / cuda / PreRev | 0.000017056 s | 0.000017152 s | 0.99 |
| scatter_sum / IPartOpt / cuda / PostRev | 0.00001728 s | 0.000016704 s | 1.03 |
| scatter_sum / IPartOpt / cuda / BothRev | 0.000017856 s | 0.00001648 s | 1.08 |
| scatter_sum / DefOpt / cuda / PreRev | 0.000017887 s | 0.000017088 s | 1.05 |
| scatter_sum / DefOpt / cuda / PostRev | 0.00001632 s | 0.000016768000000000003 s | 0.97 |
| scatter_sum / DefOpt / cuda / BothRev | 0.00001712 s | 0.000016768000000000003 s | 1.02 |
| scatter_sum / IDefOpt / cuda / PreRev | 0.00001744 s | 0.000016767 s | 1.04 |
| scatter_sum / IDefOpt / cuda / PostRev | 0.000017247999999999998 s | 0.000016991 s | 1.02 |
| scatter_sum / IDefOpt / cuda / BothRev | 0.00001728 s | 0.00001664 s | 1.04 |
| scatter_sum / JaXPipe / tpu / Primal | 0.000001343125 s | 0.0000013433999999999995 s | 1.00 |
| scatter_sum / Jax / tpu / Primal | 0.000001357475 s | 0.000001403875 s | 0.97 |
| scatter_sum / HLOOpt / tpu / Primal | 0.00000134345 s | 0.00000134345 s | 1 |
| scatter_sum / PartOpt / tpu / Primal | 0.00000135735 s | 0.000001404025 s | 0.97 |
| scatter_sum / IPartOpt / tpu / Primal | 0.0000013437 s | 0.0000013430500000000002 s | 1.00 |
| scatter_sum / DefOpt / tpu / Primal | 0.0000013573250000000002 s | 0.000001404625 s | 0.97 |
| scatter_sum / IDefOpt / tpu / Primal | 0.000001343975 s | 0.000001343275 s | 1.00 |
| scatter_sum / JaXPipe / tpu / Forward | 0.0000027404 s | 0.0000027061 s | 1.01 |
| scatter_sum / Jax / tpu / Forward | 0.00000276875 s | 0.0000027175250000000003 s | 1.02 |
| scatter_sum / HLOOpt / tpu / Forward | 0.0000027389999999999995 s | 0.000002700425 s | 1.01 |
| scatter_sum / PartOpt / tpu / Forward | 0.00000272055 s | 0.000002684175 s | 1.01 |
| scatter_sum / IPartOpt / tpu / Forward | 0.000002738325 s | 0.0000027013 s | 1.01 |
| scatter_sum / DefOpt / tpu / Forward | 0.000002717875 s | 0.000002689025 s | 1.01 |
| scatter_sum / IDefOpt / tpu / Forward | 0.0000027376000000000003 s | 0.00000271025 s | 1.01 |
| scatter_sum / JaXPipe / tpu / PreRev | 0.000002712225 s | 0.000002679725 s | 1.01 |
| scatter_sum / JaXPipe / tpu / PostRev | 0.000002740725 s | 0.00000268195 s | 1.02 |
| scatter_sum / JaXPipe / tpu / BothRev | 0.000002730375 s | 0.00000269985 s | 1.01 |
| scatter_sum / Jax / tpu / BothRev | 0.000002790875 s | 0.0000027357 s | 1.02 |
| scatter_sum / HLOOpt / tpu / PreRev | 0.000002729475 s | 0.00000269875 s | 1.01 |
| scatter_sum / HLOOpt / tpu / PostRev | 0.00000278675 s | 0.000002735775 s | 1.02 |
| scatter_sum / HLOOpt / tpu / BothRev | 0.0000027249 s | 0.0000027005249999999994 s | 1.01 |
| scatter_sum / PartOpt / tpu / PreRev | 0.00000278905 s | 0.00000274035 s | 1.02 |
| scatter_sum / PartOpt / tpu / PostRev | 0.0000027276000000000004 s | 0.0000026911 s | 1.01 |
| scatter_sum / PartOpt / tpu / BothRev | 0.00000279165 s | 0.000002741575 s | 1.02 |
| scatter_sum / IPartOpt / tpu / PreRev | 0.00000273165 s | 0.0000026922 s | 1.01 |
| scatter_sum / IPartOpt / tpu / PostRev | 0.00000278895 s | 0.0000027361 s | 1.02 |
| scatter_sum / IPartOpt / tpu / BothRev | 0.000002732925 s | 0.000002695175 s | 1.01 |
| scatter_sum / DefOpt / tpu / PreRev | 0.0000027963500000000003 s | 0.000002741475 s | 1.02 |
| scatter_sum / DefOpt / tpu / PostRev | 0.00000273155 s | 0.000002697025 s | 1.01 |
| scatter_sum / DefOpt / tpu / BothRev | 0.000002787375000000001 s | 0.00000274035 s | 1.02 |
| scatter_sum / IDefOpt / tpu / PreRev | 0.0000027260250000000003 s | 0.000002692925 s | 1.01 |
| scatter_sum / IDefOpt / tpu / PostRev | 0.000002785325 s | 0.00000273545 s | 1.02 |
| scatter_sum / IDefOpt / tpu / BothRev | 0.0000027261000000000005 s | 0.0000026958 s | 1.01 |
| scatter_sum / JaXPipe / cpu / Primal | 0.000015418 s | 0.000007860860059736296 s | 1.96 |
| scatter_sum / Jax / cpu / Primal | 0.000015472 s | 0.000007609080003021518 s | 2.03 |
| scatter_sum / HLOOpt / cpu / Primal | 0.000015353 s | 0.000007795760029694066 s | 1.97 |
| scatter_sum / PartOpt / cpu / Primal | 0.000015461 s | 0.000007391420022031525 s | 2.09 |
| scatter_sum / IPartOpt / cpu / Primal | 0.000015388 s | 0.000007779839907016139 s | 1.98 |
| scatter_sum / DefOpt / cpu / Primal | 0.0000155 s | 0.000007508239996241171 s | 2.06 |
| scatter_sum / IDefOpt / cpu / Primal | 0.000015673 s | 0.000007569679928565165 s | 2.07 |
| scatter_sum / JaXPipe / cpu / Forward | 0.000022455 s | 0.000011678260034386766 s | 1.92 |
| scatter_sum / Jax / cpu / Forward | 0.000022273 s | 0.000011107439968327526 s | 2.01 |
| scatter_sum / HLOOpt / cpu / Forward | 0.000022358 s | 0.000011391500029276358 s | 1.96 |
| scatter_sum / PartOpt / cpu / Forward | 0.000022228 s | 0.000011152620045322692 s | 1.99 |
| scatter_sum / IPartOpt / cpu / Forward | 0.000022412 s | 0.000012152039926149882 s | 1.84 |
| scatter_sum / DefOpt / cpu / Forward | 0.000022413 s | 0.000011344800022925482 s | 1.98 |
| scatter_sum / IDefOpt / cpu / Forward | 0.000022692 s | 0.000011099660077888985 s | 2.04 |
| scatter_sum / JaXPipe / cpu / PreRev | 0.000022773 s | 0.000011580820009839954 s | 1.97 |
| scatter_sum / JaXPipe / cpu / PostRev | 0.000022632 s | 0.000011251139967498605 s | 2.01 |
| scatter_sum / JaXPipe / cpu / BothRev | 0.000022338 s | 0.000011530240008141846 s | 1.94 |
| scatter_sum / Jax / cpu / BothRev | 0.000022398 s | 0.000011377560058463132 s | 1.97 |
| scatter_sum / HLOOpt / cpu / PreRev | 0.000022701 s | 0.000011670840012811823 s | 1.95 |
| scatter_sum / HLOOpt / cpu / PostRev | 0.000022489 s | 0.000013719059916184051 s | 1.64 |
| scatter_sum / HLOOpt / cpu / BothRev | 0.000022015 s | 0.000011614680006459822 s | 1.90 |
| scatter_sum / PartOpt / cpu / PreRev | 0.000022609 s | 0.000011115040051663528 s | 2.03 |
| scatter_sum / PartOpt / cpu / PostRev | 0.000022607 s | 0.000011269699989497894 s | 2.01 |
| scatter_sum / PartOpt / cpu / BothRev | 0.000022302 s | 0.0000121184199815616 s | 1.84 |
| scatter_sum / IPartOpt / cpu / PreRev | 0.000022283 s | 0.000011329860008117976 s | 1.97 |
| scatter_sum / IPartOpt / cpu / PostRev | 0.000022752 s | 0.000011477400021249197 s | 1.98 |
| scatter_sum / IPartOpt / cpu / BothRev | 0.000022606 s | 0.00001145995996921556 s | 1.97 |
| scatter_sum / DefOpt / cpu / PreRev | 0.000022718 s | 0.00001141110002208734 s | 1.99 |
| scatter_sum / DefOpt / cpu / PostRev | 0.000022317 s | 0.000011709120044542944 s | 1.91 |
| scatter_sum / DefOpt / cpu / BothRev | 0.000022342 s | 0.000011378699946362758 s | 1.96 |
| scatter_sum / IDefOpt / cpu / PreRev | 0.000022687 s | 0.000011255140034336364 s | 2.02 |
| scatter_sum / IDefOpt / cpu / PostRev | 0.000021917 s | 0.000011675160167214926 s | 1.88 |
| scatter_sum / IDefOpt / cpu / BothRev | 0.000022432 s | 0.00001117430001613684 s | 2.01 |
| slicing / JaXPipe / cuda / Primal | 0.000002303 s | 0.000001887 s | 1.22 |
| slicing / Jax / cuda / Primal | 0.000002303 s | 0.000001887 s | 1.22 |
| slicing / HLOOpt / cuda / Primal | 0.000002303 s | 0.000001887 s | 1.22 |
| slicing / PartOpt / cuda / Primal | 0.000002304 s | 0.000001887 s | 1.22 |
| slicing / IPartOpt / cuda / Primal | 0.000002303 s | 0.000001887 s | 1.22 |
| slicing / DefOpt / cuda / Primal | 0.000002303 s | 0.000001887 s | 1.22 |
| slicing / IDefOpt / cuda / Primal | 0.000002304 s | 0.000001887 s | 1.22 |
| slicing / JaXPipe / cuda / Forward | 0.00001008 s | 0.0000112 s | 0.90 |
| slicing / Jax / cuda / Forward | 0.00001072 s | 0.000010784 s | 0.99 |
| slicing / HLOOpt / cuda / Forward | 0.000010559 s | 0.000009696 s | 1.09 |
| slicing / PartOpt / cuda / Forward | 0.000011136 s | 0.0000096 s | 1.16 |
| slicing / IPartOpt / cuda / Forward | 0.000011231 s | 0.0000096 s | 1.17 |
| slicing / DefOpt / cuda / Forward | 0.000011712 s | 0.000009344 s | 1.25 |
| slicing / IDefOpt / cuda / Forward | 0.0000104 s | 0.000009888 s | 1.05 |
| slicing / JaXPipe / cuda / PreRev | 0.00001024 s | 0.00000976 s | 1.05 |
| slicing / JaXPipe / cuda / PostRev | 0.000010336 s | 0.000009248 s | 1.12 |
| slicing / JaXPipe / cuda / BothRev | 0.000010816 s | 0.000009408 s | 1.15 |
| slicing / Jax / cuda / BothRev | 0.00001024 s | 0.00000976 s | 1.05 |
| slicing / HLOOpt / cuda / PreRev | 0.0000104 s | 0.000009311 s | 1.12 |
| slicing / HLOOpt / cuda / PostRev | 0.000010208 s | 0.000009184 s | 1.11 |
| slicing / HLOOpt / cuda / BothRev | 0.00001008 s | 0.000009407 s | 1.07 |
| slicing / PartOpt / cuda / PreRev | 0.000010337 s | 0.000009632 s | 1.07 |
| slicing / PartOpt / cuda / PostRev | 0.000010209 s | 0.000009536 s | 1.07 |
| slicing / PartOpt / cuda / BothRev | 0.0000104 s | 0.000009696 s | 1.07 |
| slicing / IPartOpt / cuda / PreRev | 0.000010528 s | 0.000010176 s | 1.03 |
| slicing / IPartOpt / cuda / PostRev | 0.000009856 s | 0.000009152 s | 1.08 |
| slicing / IPartOpt / cuda / BothRev | 0.000010304 s | 0.000009376 s | 1.10 |
| slicing / DefOpt / cuda / PreRev | 0.000010048 s | 0.000009792 s | 1.03 |
| slicing / DefOpt / cuda / PostRev | 0.000010335 s | 0.000009536 s | 1.08 |
| slicing / DefOpt / cuda / BothRev | 0.00001024 s | 0.000009824 s | 1.04 |
| slicing / IDefOpt / cuda / PreRev | 0.000010336 s | 0.000009664 s | 1.07 |
| slicing / IDefOpt / cuda / PostRev | 0.000010176 s | 0.00000928 s | 1.10 |
| slicing / IDefOpt / cuda / BothRev | 0.00001024 s | 0.000009407 s | 1.09 |
| slicing / JaXPipe / tpu / Primal | 9.46775e-7 s | 0.0000010327749999999998 s | 0.92 |
| slicing / Jax / tpu / Primal | 9.6475e-7 s | 9.6835e-7 s | 1.00 |
| slicing / HLOOpt / tpu / Primal | 9.49225e-7 s | 0.000001028125 s | 0.92 |
| slicing / PartOpt / tpu / Primal | 9.62925e-7 s | 9.7225e-7 s | 0.99 |
| slicing / IPartOpt / tpu / Primal | 9.53675e-7 s | 0.00000102785 s | 0.93 |
| slicing / DefOpt / tpu / Primal | 9.6465e-7 s | 9.73525e-7 s | 0.99 |
| slicing / IDefOpt / tpu / Primal | 9.52575e-7 s | 0.00000102885 s | 0.93 |
| slicing / JaXPipe / tpu / Forward | 0.00000140235 s | 0.000001412225 s | 0.99 |
| slicing / Jax / tpu / Forward | 0.00000139985 s | 0.0000014785 s | 0.95 |
| slicing / HLOOpt / tpu / Forward | 0.0000015031 s | 0.00000152365 s | 0.99 |
| slicing / PartOpt / tpu / Forward | 0.00000141785 s | 0.0000014948749999999998 s | 0.95 |
| slicing / IPartOpt / tpu / Forward | 0.000001504775 s | 0.000001518125 s | 0.99 |
| slicing / DefOpt / tpu / Forward | 0.00000142385 s | 0.00000149345 s | 0.95 |
| slicing / IDefOpt / tpu / Forward | 0.0000015093 s | 0.000001518 s | 0.99 |
| slicing / JaXPipe / tpu / PreRev | 0.000002337975 s | 0.000002574175 s | 0.91 |
| slicing / JaXPipe / tpu / PostRev | 0.0000025166500000000003 s | 0.00000252945 s | 0.99 |
| slicing / JaXPipe / tpu / BothRev | 0.0000023508 s | 0.00000258505 s | 0.91 |
| slicing / Jax / tpu / BothRev | 0.00000252575 s | 0.000002541075 s | 0.99 |
| slicing / HLOOpt / tpu / PreRev | 0.000002357475 s | 0.000002582225 s | 0.91 |
| slicing / HLOOpt / tpu / PostRev | 0.0000025261250000000003 s | 0.000002545775 s | 0.99 |
| slicing / HLOOpt / tpu / BothRev | 0.00000236105 s | 0.000002588025 s | 0.91 |
| slicing / PartOpt / tpu / PreRev | 0.0000025314 s | 0.000002544625 s | 0.99 |
| slicing / PartOpt / tpu / PostRev | 0.000002341825 s | 0.000002578375 s | 0.91 |
| slicing / PartOpt / tpu / BothRev | 0.0000025297 s | 0.00000254555 s | 0.99 |
| slicing / IPartOpt / tpu / PreRev | 0.00000234325 s | 0.0000025965 s | 0.90 |
| slicing / IPartOpt / tpu / PostRev | 0.0000025286 s | 0.0000025475999999999995 s | 0.99 |
| slicing / IPartOpt / tpu / BothRev | 0.0000023469000000000003 s | 0.000002588025 s | 0.91 |
| slicing / DefOpt / tpu / PreRev | 0.00000253315 s | 0.0000025353750000000003 s | 1.00 |
| slicing / DefOpt / tpu / PostRev | 0.00000234965 s | 0.000002575875 s | 0.91 |
| slicing / DefOpt / tpu / BothRev | 0.0000025242249999999995 s | 0.0000025388 s | 0.99 |
| slicing / IDefOpt / tpu / PreRev | 0.00000234065 s | 0.00000258995 s | 0.90 |
| slicing / IDefOpt / tpu / PostRev | 0.000002533925 s | 0.000002540325 s | 1.00 |
| slicing / IDefOpt / tpu / BothRev | 0.0000023498250000000004 s | 0.0000025786 s | 0.91 |
| slicing / JaXPipe / cpu / Primal | 0.000012667 s | 0.000006055780013412004 s | 2.09 |
| slicing / Jax / cpu / Primal | 0.00001259 s | 0.000006401219998224406 s | 1.97 |
| slicing / HLOOpt / cpu / Primal | 0.000012482 s | 0.000006648099879384972 s | 1.88 |
| slicing / PartOpt / cpu / Primal | 0.000012418 s | 0.00000627523999355617 s | 1.98 |
| slicing / IPartOpt / cpu / Primal | 0.000012505 s | 0.000006483800007117679 s | 1.93 |
| slicing / DefOpt / cpu / Primal | 0.000012631 s | 0.000006234299999050563 s | 2.03 |
| slicing / IDefOpt / cpu / Primal | 0.000012385 s | 0.00000625420003416366 s | 1.98 |
| slicing / JaXPipe / cpu / Forward | 0.000017035999999999997 s | 0.00000923432007766678 s | 1.84 |
| slicing / Jax / cpu / Forward | 0.000016667 s | 0.00000918981999348034 s | 1.81 |
| slicing / HLOOpt / cpu / Forward | 0.000016545 s | 0.000009896999945340212 s | 1.67 |
| slicing / PartOpt / cpu / Forward | 0.000016739 s | 0.000009109320035349811 s | 1.84 |
| slicing / IPartOpt / cpu / Forward | 0.000016833 s | 0.000009506240094196982 s | 1.77 |
| slicing / DefOpt / cpu / Forward | 0.000016561 s | 0.000009059960066224448 s | 1.83 |
| slicing / IDefOpt / cpu / Forward | 0.000016723 s | 0.000009192240067932287 s | 1.82 |
| slicing / JaXPipe / cpu / PreRev | 0.000017347000000000002 s | 0.000010620739940350176 s | 1.63 |
| slicing / JaXPipe / cpu / PostRev | 0.000017295000000000003 s | 0.000010175260104006155 s | 1.70 |
| slicing / JaXPipe / cpu / BothRev | 0.000017391 s | 0.00001053578003848088 s | 1.65 |
| slicing / Jax / cpu / BothRev | 0.00001717 s | 0.000010063680038001622 s | 1.71 |
| slicing / HLOOpt / cpu / PreRev | 0.000017249 s | 0.000010681059993657982 s | 1.61 |
| slicing / HLOOpt / cpu / PostRev | 0.00001737 s | 0.000011751160000130767 s | 1.48 |
| slicing / HLOOpt / cpu / BothRev | 0.000017427 s | 0.000009757120024005417 s | 1.79 |
| slicing / PartOpt / cpu / PreRev | 0.000017458 s | 0.000009965299923351268 s | 1.75 |
| slicing / PartOpt / cpu / PostRev | 0.000017454999999999998 s | 0.000009642119948694017 s | 1.81 |
| slicing / PartOpt / cpu / BothRev | 0.000017281000000000003 s | 0.00001038146005157614 s | 1.66 |
| slicing / IPartOpt / cpu / PreRev | 0.000017149 s | 0.00001028291997499764 s | 1.67 |
| slicing / IPartOpt / cpu / PostRev | 0.000017128999999999998 s | 0.000009681240007921588 s | 1.77 |
| slicing / IPartOpt / cpu / BothRev | 0.000017454 s | 0.000009983480067603525 s | 1.75 |
| slicing / DefOpt / cpu / PreRev | 0.000017446 s | 0.000009870160047285026 s | 1.77 |
| slicing / DefOpt / cpu / PostRev | 0.000017372 s | 0.000009663760047260438 s | 1.80 |
| slicing / DefOpt / cpu / BothRev | 0.000017346 s | 0.00000990344004094368 s | 1.75 |
| slicing / IDefOpt / cpu / PreRev | 0.000017514 s | 0.00000993309995465097 s | 1.76 |
| slicing / IDefOpt / cpu / PostRev | 0.000017101 s | 0.000010138960005861008 s | 1.69 |
| slicing / IDefOpt / cpu / BothRev | 0.000017193 s | 0.000009660899995651564 s | 1.78 |
| sum / JaXPipe / cuda / Primal | 0.000002496 s | 0.00000208 s | 1.20 |
| sum / Jax / cuda / Primal | 0.000002496 s | 0.00000208 s | 1.20 |
| sum / HLOOpt / cuda / Primal | 0.000002496 s | 0.00000208 s | 1.20 |
| sum / PartOpt / cuda / Primal | 0.000002496 s | 0.00000208 s | 1.20 |
| sum / IPartOpt / cuda / Primal | 0.000002496 s | 0.000002079 s | 1.20 |
| sum / DefOpt / cuda / Primal | 0.000002496 s | 0.00000208 s | 1.20 |
| sum / IDefOpt / cuda / Primal | 0.000002496 s | 0.00000208 s | 1.20 |
| sum / JaXPipe / cuda / Forward | 0.000010943 s | 0.00001024 s | 1.07 |
| sum / Jax / cuda / Forward | 0.00001072 s | 0.000010015 s | 1.07 |
| sum / HLOOpt / cuda / Forward | 0.000010592 s | 0.000009791 s | 1.08 |
| sum / PartOpt / cuda / Forward | 0.0000104 s | 0.00001024 s | 1.02 |
| sum / IPartOpt / cuda / Forward | 0.00001056 s | 0.000010144 s | 1.04 |
| sum / DefOpt / cuda / Forward | 0.00001072 s | 0.000010144 s | 1.06 |
| sum / IDefOpt / cuda / Forward | 0.000010815 s | 0.000009952 s | 1.09 |
| sum / JaXPipe / cuda / PreRev | 0.000010112 s | 0.000009504 s | 1.06 |
| sum / JaXPipe / cuda / PostRev | 0.000010272 s | 0.000009312000000000002 s | 1.10 |
| sum / JaXPipe / cuda / BothRev | 0.0000104 s | 0.000009824 s | 1.06 |
| sum / Jax / cuda / BothRev | 0.000010304 s | 0.000009504 s | 1.08 |
| sum / HLOOpt / cuda / PreRev | 0.000010752 s | 0.000009472 s | 1.14 |
| sum / HLOOpt / cuda / PostRev | 0.000010272 s | 0.00000944 s | 1.09 |
| sum / HLOOpt / cuda / BothRev | 0.000010207 s | 0.000009472 s | 1.08 |
| sum / PartOpt / cuda / PreRev | 0.000008959999999999999 s | 0.000009824 s | 0.91 |
| sum / PartOpt / cuda / PostRev | 0.000010336 s | 0.0000096 s | 1.08 |
| sum / PartOpt / cuda / BothRev | 0.000010463 s | 0.00000976 s | 1.07 |
| sum / IPartOpt / cuda / PreRev | 0.000010399 s | 0.000010016 s | 1.04 |
| sum / IPartOpt / cuda / PostRev | 0.000009984 s | 0.000009376 s | 1.06 |
| sum / IPartOpt / cuda / BothRev | 0.00001008 s | 0.000009408 s | 1.07 |
| sum / DefOpt / cuda / PreRev | 0.000010144 s | 0.000010048 s | 1.01 |
| sum / DefOpt / cuda / PostRev | 0.0000104 s | 0.000009248 s | 1.12 |
| sum / DefOpt / cuda / BothRev | 0.000010367 s | 0.000009472 s | 1.09 |
| sum / IDefOpt / cuda / PreRev | 0.000010432 s | 0.000009729 s | 1.07 |
| sum / IDefOpt / cuda / PostRev | 0.000010336 s | 0.00000976 s | 1.06 |
| sum / IDefOpt / cuda / BothRev | 0.00001056 s | 0.000009376 s | 1.13 |
| sum / JaXPipe / tpu / Primal | 5.170999999999999e-7 s | 5.104e-7 s | 1.01 |
| sum / Jax / tpu / Primal | 5.468e-7 s | 5.4715e-7 s | 1.00 |
| sum / HLOOpt / tpu / Primal | 5.1685e-7 s | 5.106499999999999e-7 s | 1.01 |
| sum / PartOpt / tpu / Primal | 5.47125e-7 s | 5.47325e-7 s | 1.00 |
| sum / IPartOpt / tpu / Primal | 5.1715e-7 s | 5.10375e-7 s | 1.01 |
| sum / DefOpt / tpu / Primal | 5.4755e-7 s | 5.4725e-7 s | 1.00 |
| sum / IDefOpt / tpu / Primal | 5.1685e-7 s | 5.10875e-7 s | 1.01 |
| sum / JaXPipe / tpu / Forward | 0.00000155165 s | 0.00000154625 s | 1.00 |
| sum / Jax / tpu / Forward | 0.0000015031 s | 0.000001501025 s | 1.00 |
| sum / HLOOpt / tpu / Forward | 0.000001527925 s | 0.000001532 s | 1.00 |
| sum / PartOpt / tpu / Forward | 0.000001499825 s | 0.000001497225 s | 1.00 |
| sum / IPartOpt / tpu / Forward | 0.000001528325 s | 0.000001537975 s | 0.99 |
| sum / DefOpt / tpu / Forward | 0.000001504075 s | 0.00000150195 s | 1.00 |
| sum / IDefOpt / tpu / Forward | 0.0000015416750000000002 s | 0.000001539375 s | 1.00 |
| sum / JaXPipe / tpu / PreRev | 0.0000010075749999999998 s | 0.000001047525 s | 0.96 |
| sum / JaXPipe / tpu / PostRev | 0.0000010301749999999998 s | 0.000001085425 s | 0.95 |
| sum / JaXPipe / tpu / BothRev | 0.000001004125 s | 0.0000010469749999999998 s | 0.96 |
| sum / Jax / tpu / BothRev | 0.000001030775 s | 0.00000108705 s | 0.95 |
| sum / HLOOpt / tpu / PreRev | 0.00000100665 s | 0.0000010459 s | 0.96 |
| sum / HLOOpt / tpu / PostRev | 0.000001034025 s | 0.000001083625 s | 0.95 |
| sum / HLOOpt / tpu / BothRev | 0.0000010108250000000002 s | 0.000001051 s | 0.96 |
| sum / PartOpt / tpu / PreRev | 0.000001037825 s | 0.000001084875 s | 0.96 |
| sum / PartOpt / tpu / PostRev | 0.00000100115 s | 0.00000105225 s | 0.95 |
| sum / PartOpt / tpu / BothRev | 0.000001029925 s | 0.000001105025 s | 0.93 |
| sum / IPartOpt / tpu / PreRev | 0.0000010099 s | 0.000001056775 s | 0.96 |
| sum / IPartOpt / tpu / PostRev | 0.000001037725 s | 0.0000011002 s | 0.94 |
| sum / IPartOpt / tpu / BothRev | 9.9985e-7 s | 0.000001069775 s | 0.93 |
| sum / DefOpt / tpu / PreRev | 0.0000010390499999999998 s | 0.000001087875 s | 0.96 |
| sum / DefOpt / tpu / PostRev | 9.996249999999998e-7 s | 0.000001063325 s | 0.94 |
| sum / DefOpt / tpu / BothRev | 0.0000010315 s | 0.0000010876 s | 0.95 |
| sum / IDefOpt / tpu / PreRev | 0.000001011875 s | 0.00000105495 s | 0.96 |
| sum / IDefOpt / tpu / PostRev | 0.000001029875 s | 0.0000010968 s | 0.94 |
| sum / IDefOpt / tpu / BothRev | 0.0000010097 s | 0.000001060625 s | 0.95 |
| sum / JaXPipe / cpu / Primal | 0.000014758 s | 0.000007892679968790617 s | 1.87 |
| sum / Jax / cpu / Primal | 0.000014287 s | 0.00000730996003767359 s | 1.95 |
| sum / HLOOpt / cpu / Primal | 0.000014439 s | 0.000008103620002657408 s | 1.78 |
| sum / PartOpt / cpu / Primal | 0.00001451 s | 0.000007371400042757159 s | 1.97 |
| sum / IPartOpt / cpu / Primal | 0.00001448 s | 0.000007986760138010141 s | 1.81 |
| sum / DefOpt / cpu / Primal | 0.000014525 s | 0.00000800075993538485 s | 1.82 |
| sum / IDefOpt / cpu / Primal | 0.000014306 s | 0.000007718659908277914 s | 1.85 |
| sum / JaXPipe / cpu / Forward | 0.000020017 s | 0.000011231939952267568 s | 1.78 |
| sum / Jax / cpu / Forward | 0.000020133 s | 0.000011011899896402613 s | 1.83 |
| sum / HLOOpt / cpu / Forward | 0.000019511 s | 0.000011633680114755407 s | 1.68 |
| sum / PartOpt / cpu / Forward | 0.000019969 s | 0.000010882500027946662 s | 1.83 |
| sum / IPartOpt / cpu / Forward | 0.00001986 s | 0.000011480140037747334 s | 1.73 |
| sum / DefOpt / cpu / Forward | 0.000019588000000000003 s | 0.00001102809999792953 s | 1.78 |
| sum / IDefOpt / cpu / Forward | 0.000019379 s | 0.000010877019994950388 s | 1.78 |
| sum / JaXPipe / cpu / PreRev | 0.000019235 s | 0.000011482460013212404 s | 1.68 |
| sum / JaXPipe / cpu / PostRev | 0.000018746 s | 0.000010741920032160124 s | 1.75 |
| sum / JaXPipe / cpu / BothRev | 0.000018625 s | 0.000010569679961918154 s | 1.76 |
| sum / Jax / cpu / BothRev | 0.000018579 s | 0.000010753660044429123 s | 1.73 |
| sum / HLOOpt / cpu / PreRev | 0.000019088 s | 0.00001103539996620384 s | 1.73 |
| sum / HLOOpt / cpu / PostRev | 0.000018509 s | 0.000012518980001914317 s | 1.48 |
| sum / HLOOpt / cpu / BothRev | 0.000018686 s | 0.00001066204007656779 s | 1.75 |
| sum / PartOpt / cpu / PreRev | 0.000018904 s | 0.00001123383997764904 s | 1.68 |
| sum / PartOpt / cpu / PostRev | 0.000018763 s | 0.000010826800044014816 s | 1.73 |
| sum / PartOpt / cpu / BothRev | 0.000018801 s | 0.000010874180097744102 s | 1.73 |
| sum / IPartOpt / cpu / PreRev | 0.000018903 s | 0.00001092020002033678 s | 1.73 |
| sum / IPartOpt / cpu / PostRev | 0.000018672 s | 0.000010615059964038664 s | 1.76 |
| sum / IPartOpt / cpu / BothRev | 0.000018671 s | 0.000010592939961497903 s | 1.76 |
| sum / DefOpt / cpu / PreRev | 0.000018645 s | 0.00001073278008334455 s | 1.74 |
| sum / DefOpt / cpu / PostRev | 0.00001915 s | 0.000010550599963607963 s | 1.82 |
| sum / DefOpt / cpu / BothRev | 0.000018634 s | 0.00001068278010279755 s | 1.74 |
| sum / IDefOpt / cpu / PreRev | 0.000018907 s | 0.00001057109999237582 s | 1.79 |
| sum / IDefOpt / cpu / PostRev | 0.00001902 s | 0.000010405599969089964 s | 1.83 |
| sum / IDefOpt / cpu / BothRev | 0.000018892 s | 0.000010484559952601558 s | 1.80 |
| value_and_grad / JaXPipe / cuda / Primal | 0.000032896000000000005 s | 0.000033792000000000004 s | 0.97 |
| value_and_grad / Jax / cuda / Primal | 0.000032 s | 0.000033184 s | 0.96 |
| value_and_grad / HLOOpt / cuda / Primal | 0.000032096 s | 0.000032767999999999995 s | 0.98 |
| value_and_grad / PartOpt / cuda / Primal | 0.000032096 s | 0.00003232 s | 0.99 |
| value_and_grad / IPartOpt / cuda / Primal | 0.000032063 s | 0.000033152000000000004 s | 0.97 |
| value_and_grad / DefOpt / cuda / Primal | 0.000032992 s | 0.000033343 s | 0.99 |
| value_and_grad / IDefOpt / cuda / Primal | 0.000032672 s | 0.000032992 s | 0.99 |
| value_and_grad / JaXPipe / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / Jax / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / HLOOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / PartOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / IPartOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / DefOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / IDefOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / JaXPipe / cpu / Primal | 0.000023551 s | 0.000013518900032067904 s | 1.74 |
| value_and_grad / Jax / cpu / Primal | 0.000022913 s | 0.000013619000092148782 s | 1.68 |
| value_and_grad / HLOOpt / cpu / Primal | 0.000023013 s | 0.00001315354005782865 s | 1.75 |
| value_and_grad / PartOpt / cpu / Primal | 0.000023037 s | 0.000013153380095900502 s | 1.75 |
| value_and_grad / IPartOpt / cpu / Primal | 0.000023052 s | 0.00001324175998888677 s | 1.74 |
| value_and_grad / DefOpt / cpu / Primal | 0.000023083 s | 0.000013427060039248318 s | 1.72 |
| value_and_grad / IDefOpt / cpu / Primal | 0.000022897 s | 0.00001322780008194968 s | 1.73 |
This comment was automatically generated by workflow using github-action-benchmark.
c50080b to
37aa1e7
Compare