Add RecognizeMultiSlice #2023
    // Propagate sharding if present
    if (auto shard = sdy::getShardingPerValue(op)) {
      sdy::setShardings(newOp, shard);
@avik-pal can you help here? I think this (and also recognizemultirotate, in retrospect) may be wrong. Specifically, the original slice has only one result, whereas this op has multiple, so we need a sharding that replicates the original one across all result values.
Yes, we need to set it for each of the results: https://openxla.org/shardy/sdy_dialect#tensorshardingpervalueattr
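To make the suggestion concrete, here is a minimal standalone sketch of the replication step, under stated assumptions: `Sharding` is a hypothetical string stand-in for `sdy::TensorShardingAttr`, and `replicateSharding` is an illustrative helper name, not a function from Shardy or from this PR. It only models the idea of producing one sharding entry per result of the new op.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-tensor sharding attribute. The real pass
// would work with sdy::TensorShardingAttr rather than a string.
using Sharding = std::string;

// Builds one sharding entry per result of the new multi-result op by copying
// the single sharding that was attached to the original one-result op.
std::vector<Sharding> replicateSharding(const Sharding &single,
                                        std::size_t totalResults) {
  return std::vector<Sharding>(totalResults, single);
}
```

In the actual pass, the resulting list would presumably be wrapped in a `TensorShardingPerValueAttr` (the attribute described at the link above) before being attached to the new op.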
Thanks, will try this in a bit!
I tried here, but I'm unsure whether this makes sense: a02f53d (#2023)
6cce4ac to a02f53d
    sdy::TensorShardingAttr singleShard = shardings[0];
    SmallVector<sdy::TensorShardingAttr> newShardings(totalResults,
                                                      singleShard);
    sdy::setShardings(newOp, sdy::TensorShardingPerValueAttr::get(
Looks good. @jumerckx, can you also port this to recognizemultirotate?
done
EnzymeJAX Benchmarks
| Benchmark suite | Current: a02f53d | Previous: 3a9f897 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cpu / Primal | 0.00000756341999704091 s | 0.000007733859993095393 s | 0.98 |
| actmtch / Jax / cpu / Primal | 0.000007082039994656952 s | 0.000007217959982881439 s | 0.98 |
| actmtch / HLOOpt / cpu / Primal | 0.000010064340012831962 s | 0.00001069129997631535 s | 0.94 |
| actmtch / PartOpt / cpu / Primal | 0.00000649998000199048 s | 0.000007246099958138075 s | 0.90 |
| actmtch / IPartOpt / cpu / Primal | 0.000006439779992888361 s | 0.000007042160050332314 s | 0.91 |
| actmtch / DefOpt / cpu / Primal | 0.000007414620010877116 s | 0.000011662419983622386 s | 0.64 |
| actmtch / IDefOpt / cpu / Primal | 0.000007664599997951882 s | 0.000007531500014010817 s | 1.02 |
| actmtch / JaXPipe / cpu / Forward | 0.00001123953999240257 s | 0.0000115438599823392 s | 0.97 |
| actmtch / Jax / cpu / Forward | 0.00000972533999629377 s | 0.000010759360029624076 s | 0.90 |
| actmtch / HLOOpt / cpu / Forward | 0.00001574148001282083 s | 0.00001625256000806985 s | 0.97 |
| actmtch / PartOpt / cpu / Forward | 0.0000153223200118191 s | 0.000016039159936553916 s | 0.96 |
| actmtch / IPartOpt / cpu / Forward | 0.000010404920003566075 s | 0.000010876940004891366 s | 0.96 |
| actmtch / DefOpt / cpu / Forward | 0.000014776780012653037 s | 0.000016272039974865038 s | 0.91 |
| actmtch / IDefOpt / cpu / Forward | 0.0000107448200037652 s | 0.000011126260033051948 s | 0.97 |
| actmtch / JaXPipe / cpu / PreRev | 0.000011881699992954964 s | 0.000012882199980595033 s | 0.92 |
| actmtch / JaXPipe / cpu / PostRev | 0.000011111459998573992 s | 0.000011677599968606956 s | 0.95 |
| actmtch / JaXPipe / cpu / BothRev | 0.00001541293999252957 s | 0.000014915620004103403 s | 1.03 |
| actmtch / Jax / cpu / BothRev | 0.00000990509999837741 s | 0.000011236720019951465 s | 0.88 |
| actmtch / HLOOpt / cpu / PreRev | 0.000011278160006895631 s | 0.000012836999976570953 s | 0.88 |
| actmtch / HLOOpt / cpu / PostRev | 0.000011307839997698466 s | 0.000012310039946896725 s | 0.92 |
| actmtch / HLOOpt / cpu / BothRev | 0.000013429919995360251 s | 0.000015116180020413594 s | 0.89 |
| actmtch / PartOpt / cpu / PreRev | 0.000011267820002558438 s | 0.00001279665995753021 s | 0.88 |
| actmtch / PartOpt / cpu / PostRev | 0.000010799159999805852 s | 0.00001154188003965828 s | 0.94 |
| actmtch / PartOpt / cpu / BothRev | 0.000011575799992442628 s | 0.000012728359988614102 s | 0.91 |
| actmtch / IPartOpt / cpu / PreRev | 0.000014993680013049016 s | 0.000013585119968411164 s | 1.10 |
| actmtch / IPartOpt / cpu / PostRev | 0.000010603139994600497 s | 0.000011491159984871046 s | 0.92 |
| actmtch / IPartOpt / cpu / BothRev | 0.000011769900008857804 s | 0.000012637340014407529 s | 0.93 |
| actmtch / DefOpt / cpu / PreRev | 0.000011963880001530923 s | 0.00001295805999689037 s | 0.92 |
| actmtch / DefOpt / cpu / PostRev | 0.000011982520013589238 s | 0.000012994319995414116 s | 0.92 |
| actmtch / DefOpt / cpu / BothRev | 0.000011685580000175831 s | 0.00001293110000915476 s | 0.90 |
| actmtch / IDefOpt / cpu / PreRev | 0.00001180694000368021 s | 0.000013025639982515712 s | 0.91 |
| actmtch / IDefOpt / cpu / PostRev | 0.000012207980005314313 s | 0.000012815459986086352 s | 0.95 |
| actmtch / IDefOpt / cpu / BothRev | 0.000011668719996578148 s | 0.000012133060017731625 s | 0.96 |
| actmtch / JaXPipe / cuda / Primal | 0.000002016 s | 0.000002047 s | 0.98 |
| actmtch / Jax / cuda / Primal | 0.000002047 s | 0.000002015 s | 1.02 |
| actmtch / HLOOpt / cuda / Primal | 0.000002047 s | 0.000002016 s | 1.02 |
| actmtch / PartOpt / cuda / Primal | 0.000002047 s | 0.000002047 s | 1 |
| actmtch / IPartOpt / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / DefOpt / cuda / Primal | 0.000002047 s | 0.000002016 s | 1.02 |
| actmtch / IDefOpt / cuda / Primal | 0.000002047 s | 0.000002047 s | 1 |
| actmtch / JaXPipe / cuda / Forward | 0.000011776 s | 0.000010369 s | 1.14 |
| actmtch / Jax / cuda / Forward | 0.000011264 s | 0.000010496 s | 1.07 |
| actmtch / HLOOpt / cuda / Forward | 0.000010304 s | 0.0000104 s | 0.99 |
| actmtch / PartOpt / cuda / Forward | 0.000010464 s | 0.000010176 s | 1.03 |
| actmtch / IPartOpt / cuda / Forward | 0.00001024 s | 0.000010496 s | 0.98 |
| actmtch / DefOpt / cuda / Forward | 0.000010336 s | 0.000010112 s | 1.02 |
| actmtch / IDefOpt / cuda / Forward | 0.000010113 s | 0.000009888 s | 1.02 |
| actmtch / JaXPipe / cuda / PreRev | 0.00001024 s | 0.000010527 s | 0.97 |
| actmtch / JaXPipe / cuda / PostRev | 0.00001024 s | 0.000009888 s | 1.04 |
| actmtch / JaXPipe / cuda / BothRev | 0.000009248 s | 0.000010112 s | 0.91 |
| actmtch / Jax / cuda / BothRev | 0.00001008 s | 0.000010304 s | 0.98 |
| actmtch / HLOOpt / cuda / PreRev | 0.000009696 s | 0.000010176 s | 0.95 |
| actmtch / HLOOpt / cuda / PostRev | 0.00001024 s | 0.000010144 s | 1.01 |
| actmtch / HLOOpt / cuda / BothRev | 0.000010336 s | 0.000009984 s | 1.04 |
| actmtch / PartOpt / cuda / PreRev | 0.000010112 s | 0.000010336 s | 0.98 |
| actmtch / PartOpt / cuda / PostRev | 0.000010368 s | 0.000010112 s | 1.03 |
| actmtch / PartOpt / cuda / BothRev | 0.000010016 s | 0.000010208 s | 0.98 |
| actmtch / IPartOpt / cuda / PreRev | 0.000010336 s | 0.000010176 s | 1.02 |
| actmtch / IPartOpt / cuda / PostRev | 0.000010016 s | 0.000010209 s | 0.98 |
| actmtch / IPartOpt / cuda / BothRev | 0.000009888 s | 0.000010143 s | 0.97 |
| actmtch / DefOpt / cuda / PreRev | 0.000010176 s | 0.00001024 s | 0.99 |
| actmtch / DefOpt / cuda / PostRev | 0.000010336 s | 0.00001056 s | 0.98 |
| actmtch / DefOpt / cuda / BothRev | 0.000010208 s | 0.000010144 s | 1.01 |
| actmtch / IDefOpt / cuda / PreRev | 0.000010271 s | 0.00001056 s | 0.97 |
| actmtch / IDefOpt / cuda / PostRev | 0.000010176 s | 0.000010112 s | 1.01 |
| actmtch / IDefOpt / cuda / BothRev | 0.000010432 s | 0.000010272 s | 1.02 |
| actmtch / JaXPipe / tpu / Primal | 5.82925e-7 s | 5.632000000000001e-7 s | 1.04 |
| actmtch / Jax / tpu / Primal | 5.735e-7 s | 6.06825e-7 s | 0.95 |
| actmtch / HLOOpt / tpu / Primal | 0.000002169775 s | 0.0000021014 s | 1.03 |
| actmtch / PartOpt / tpu / Primal | 5.7325e-7 s | 6.06525e-7 s | 0.95 |
| actmtch / IPartOpt / tpu / Primal | 5.854e-7 s | 5.623750000000001e-7 s | 1.04 |
| actmtch / DefOpt / tpu / Primal | 0.00000205475 s | 0.0000021606 s | 0.95 |
| actmtch / IDefOpt / tpu / Primal | 0.0000021649 s | 0.000002102475 s | 1.03 |
| actmtch / JaXPipe / tpu / Forward | 0.000003862975 s | 0.0000038357 s | 1.01 |
| actmtch / Jax / tpu / Forward | 0.000001241925 s | 0.000001215625 s | 1.02 |
| actmtch / HLOOpt / tpu / Forward | 0.000003674625 s | 0.0000039359 s | 0.93 |
| actmtch / PartOpt / tpu / Forward | 0.000003893825 s | 0.0000039167 s | 0.99 |
| actmtch / IPartOpt / tpu / Forward | 0.0000036619 s | 0.00000393865 s | 0.93 |
| actmtch / DefOpt / tpu / Forward | 0.000003897375 s | 0.00000391775 s | 0.99 |
| actmtch / IDefOpt / tpu / Forward | 0.000003675925 s | 0.000003933225 s | 0.93 |
| actmtch / JaXPipe / tpu / PreRev | 0.0000037605 s | 0.00000347695 s | 1.08 |
| actmtch / JaXPipe / tpu / PostRev | 0.00000162295 s | 0.0000016438 s | 0.99 |
| actmtch / JaXPipe / tpu / BothRev | 0.0000037561 s | 0.0000034815 s | 1.08 |
| actmtch / Jax / tpu / BothRev | 0.000001616925 s | 0.0000016392 s | 0.99 |
| actmtch / HLOOpt / tpu / PreRev | 0.00000374915 s | 0.00000348105 s | 1.08 |
| actmtch / HLOOpt / tpu / PostRev | 0.000003448125 s | 0.00000340075 s | 1.01 |
| actmtch / HLOOpt / tpu / BothRev | 0.0000037474 s | 0.0000034776 s | 1.08 |
| actmtch / PartOpt / tpu / PreRev | 0.0000034447 s | 0.0000034090500000000004 s | 1.01 |
| actmtch / PartOpt / tpu / PostRev | 0.000001675975 s | 0.0000015899 s | 1.05 |
| actmtch / PartOpt / tpu / BothRev | 0.0000034446 s | 0.000003414025 s | 1.01 |
| actmtch / IPartOpt / tpu / PreRev | 0.000003744925 s | 0.0000034937750000000003 s | 1.07 |
| actmtch / IPartOpt / tpu / PostRev | 0.0000016282749999999997 s | 0.00000164185 s | 0.99 |
| actmtch / IPartOpt / tpu / BothRev | 0.00000374005 s | 0.0000034793750000000005 s | 1.07 |
| actmtch / DefOpt / tpu / PreRev | 0.000003454525 s | 0.0000033998 s | 1.02 |
| actmtch / DefOpt / tpu / PostRev | 0.000003668975 s | 0.000003417725 s | 1.07 |
| actmtch / DefOpt / tpu / BothRev | 0.0000034479 s | 0.0000034063500000000004 s | 1.01 |
| actmtch / IDefOpt / tpu / PreRev | 0.000003745925 s | 0.000003474375 s | 1.08 |
| actmtch / IDefOpt / tpu / PostRev | 0.000003440325 s | 0.000003406225 s | 1.01 |
| actmtch / IDefOpt / tpu / BothRev | 0.000003752575 s | 0.00000346605 s | 1.08 |
| actmtch / JaXPipe / cpu / Primal | 0.000012929 s | 0.000007733859993095393 s | 1.67 |
| actmtch / Jax / cpu / Primal | 0.000013449 s | 0.000007217959982881439 s | 1.86 |
| actmtch / HLOOpt / cpu / Primal | 0.000013907 s | 0.00001069129997631535 s | 1.30 |
| actmtch / PartOpt / cpu / Primal | 0.000013117 s | 0.000007246099958138075 s | 1.81 |
| actmtch / IPartOpt / cpu / Primal | 0.000013181999999999998 s | 0.000007042160050332314 s | 1.87 |
| actmtch / DefOpt / cpu / Primal | 0.000013717 s | 0.000011662419983622386 s | 1.18 |
| actmtch / IDefOpt / cpu / Primal | 0.000013907 s | 0.000007531500014010817 s | 1.85 |
| actmtch / JaXPipe / cpu / Forward | 0.000019495 s | 0.0000115438599823392 s | 1.69 |
| actmtch / Jax / cpu / Forward | 0.000017757 s | 0.000010759360029624076 s | 1.65 |
| actmtch / HLOOpt / cpu / Forward | 0.000018749 s | 0.00001625256000806985 s | 1.15 |
| actmtch / PartOpt / cpu / Forward | 0.000019507 s | 0.000016039159936553916 s | 1.22 |
| actmtch / IPartOpt / cpu / Forward | 0.00001963 s | 0.000010876940004891366 s | 1.80 |
| actmtch / DefOpt / cpu / Forward | 0.000019148 s | 0.000016272039974865038 s | 1.18 |
| actmtch / IDefOpt / cpu / Forward | 0.000019428 s | 0.000011126260033051948 s | 1.75 |
| actmtch / JaXPipe / cpu / PreRev | 0.000019074 s | 0.000012882199980595033 s | 1.48 |
| actmtch / JaXPipe / cpu / PostRev | 0.000018121 s | 0.000011677599968606956 s | 1.55 |
| actmtch / JaXPipe / cpu / BothRev | 0.000019900000000000003 s | 0.000014915620004103403 s | 1.33 |
| actmtch / Jax / cpu / BothRev | 0.000017576000000000002 s | 0.000011236720019951465 s | 1.56 |
| actmtch / HLOOpt / cpu / PreRev | 0.000018922000000000003 s | 0.000012836999976570953 s | 1.47 |
| actmtch / HLOOpt / cpu / PostRev | 0.000019004 s | 0.000012310039946896725 s | 1.54 |
| actmtch / HLOOpt / cpu / BothRev | 0.000019718 s | 0.000015116180020413594 s | 1.30 |
| actmtch / PartOpt / cpu / PreRev | 0.000018567 s | 0.00001279665995753021 s | 1.45 |
| actmtch / PartOpt / cpu / PostRev | 0.000017434000000000003 s | 0.00001154188003965828 s | 1.51 |
| actmtch / PartOpt / cpu / BothRev | 0.000019821 s | 0.000012728359988614102 s | 1.56 |
| actmtch / IPartOpt / cpu / PreRev | 0.000019461 s | 0.000013585119968411164 s | 1.43 |
| actmtch / IPartOpt / cpu / PostRev | 0.000018733 s | 0.000011491159984871046 s | 1.63 |
| actmtch / IPartOpt / cpu / BothRev | 0.000021168 s | 0.000012637340014407529 s | 1.68 |
| actmtch / DefOpt / cpu / PreRev | 0.000020537 s | 0.00001295805999689037 s | 1.58 |
| actmtch / DefOpt / cpu / PostRev | 0.000020174 s | 0.000012994319995414116 s | 1.55 |
| actmtch / DefOpt / cpu / BothRev | 0.000020395 s | 0.00001293110000915476 s | 1.58 |
| actmtch / IDefOpt / cpu / PreRev | 0.000019011 s | 0.000013025639982515712 s | 1.46 |
| actmtch / IDefOpt / cpu / PostRev | 0.000019125 s | 0.000012815459986086352 s | 1.49 |
| actmtch / IDefOpt / cpu / BothRev | 0.000019777 s | 0.000012133060017731625 s | 1.63 |
| add_one / JaXPipe / cpu / Primal | 0.000007761200001823455 s | 0.000007868599977882695 s | 0.99 |
| add_one / Jax / cpu / Primal | 0.000007347579999077424 s | 0.00000741951998861623 s | 0.99 |
| add_one / HLOOpt / cpu / Primal | 0.000010474620009972569 s | 0.000010513959987292765 s | 1.00 |
| add_one / PartOpt / cpu / Primal | 0.000006494919989563641 s | 0.000007595719998789718 s | 0.86 |
| add_one / IPartOpt / cpu / Primal | 0.000007159340004818659 s | 0.000007777159989927895 s | 0.92 |
| add_one / DefOpt / cpu / Primal | 0.00001166422000324019 s | 0.0000112859800356091 s | 1.03 |
| add_one / IDefOpt / cpu / Primal | 0.000006871599991882249 s | 0.000007331239994528005 s | 0.94 |
| add_one / JaXPipe / cpu / Forward | 0.000010595259993806394 s | 0.000012304339988986613 s | 0.86 |
| add_one / Jax / cpu / Forward | 0.000010391159996743226 s | 0.00001202503997774329 s | 0.86 |
| add_one / HLOOpt / cpu / Forward | 0.000014891980001721094 s | 0.000013736400032939855 s | 1.08 |
| add_one / PartOpt / cpu / Forward | 0.000015648139990389608 s | 0.000016401239990955217 s | 0.95 |
| add_one / IPartOpt / cpu / Forward | 0.000010140320005120883 s | 0.000011726520024240016 s | 0.86 |
| add_one / DefOpt / cpu / Forward | 0.000015035219983019487 s | 0.000016018619999158545 s | 0.94 |
| add_one / IDefOpt / cpu / Forward | 0.000010401800004729011 s | 0.000011999660018773284 s | 0.87 |
| add_one / JaXPipe / cpu / PreRev | 0.000012283580006169358 s | 0.000013027700024395015 s | 0.94 |
| add_one / JaXPipe / cpu / PostRev | 0.000012171880000551029 s | 0.000012822980006603756 s | 0.95 |
| add_one / JaXPipe / cpu / BothRev | 0.000016786479986876656 s | 0.000016642000000501868 s | 1.01 |
| add_one / Jax / cpu / BothRev | 0.000012075240008471156 s | 0.000012804839980162796 s | 0.94 |
| add_one / HLOOpt / cpu / PreRev | 0.00001234626000496064 s | 0.000014124400022410555 s | 0.87 |
| add_one / HLOOpt / cpu / PostRev | 0.00001614499999959662 s | 0.00001704923999568564 s | 0.95 |
| add_one / HLOOpt / cpu / BothRev | 0.000014181719993757724 s | 0.00001847005998570239 s | 0.77 |
| add_one / PartOpt / cpu / PreRev | 0.000012468140000692076 s | 0.000012570299977596731 s | 0.99 |
| add_one / PartOpt / cpu / PostRev | 0.000012127179991239243 s | 0.00001297972001339076 s | 0.93 |
| add_one / PartOpt / cpu / BothRev | 0.000012058080010319829 s | 0.000012936780049130902 s | 0.93 |
| add_one / IPartOpt / cpu / PreRev | 0.000012238579993208988 s | 0.000017668959999355137 s | 0.69 |
| add_one / IPartOpt / cpu / PostRev | 0.00001201559999572055 s | 0.000012812400054826868 s | 0.94 |
| add_one / IPartOpt / cpu / BothRev | 0.0000123378800071805 s | 0.000012878280012955656 s | 0.96 |
| add_one / DefOpt / cpu / PreRev | 0.000011873019989252498 s | 0.000012653959993258468 s | 0.94 |
| add_one / DefOpt / cpu / PostRev | 0.000012468199997783811 s | 0.000013062259986327264 s | 0.95 |
| add_one / DefOpt / cpu / BothRev | 0.000011786079994635656 s | 0.000012414299990268771 s | 0.95 |
| add_one / IDefOpt / cpu / PreRev | 0.00001146060000792204 s | 0.000013297360037540785 s | 0.86 |
| add_one / IDefOpt / cpu / PostRev | 0.0000126086599948394 s | 0.00001240428003256966 s | 1.02 |
| add_one / IDefOpt / cpu / BothRev | 0.000012307079996389805 s | 0.0000127792000057525 s | 0.96 |
| add_one / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.000001951 s | 0.98 |
| add_one / Jax / cuda / Primal | 0.000001951 s | 0.000001951 s | 1 |
| add_one / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000001951 s | 0.98 |
| add_one / PartOpt / cuda / Primal | 0.000001951 s | 0.000001951 s | 1 |
| add_one / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000001951 s | 0.98 |
| add_one / DefOpt / cuda / Primal | 0.000001951 s | 0.000001951 s | 1 |
| add_one / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000001919 s | 1.00 |
| add_one / JaXPipe / cuda / Forward | 0.000010432 s | 0.000010176 s | 1.03 |
| add_one / Jax / cuda / Forward | 0.000010688 s | 0.00000992 s | 1.08 |
| add_one / HLOOpt / cuda / Forward | 0.00001056 s | 0.0000112 s | 0.94 |
| add_one / PartOpt / cuda / Forward | 0.000010336 s | 0.000011424 s | 0.90 |
| add_one / IPartOpt / cuda / Forward | 0.000010432 s | 0.00001136 s | 0.92 |
| add_one / DefOpt / cuda / Forward | 0.000010432 s | 0.000011455999999999998 s | 0.91 |
| add_one / IDefOpt / cuda / Forward | 0.0000104 s | 0.000011679 s | 0.89 |
| add_one / JaXPipe / cuda / PreRev | 0.000027328 s | 0.000026848 s | 1.02 |
| add_one / JaXPipe / cuda / PostRev | 0.000025984 s | 0.00002496 s | 1.04 |
| add_one / JaXPipe / cuda / BothRev | 0.000025664 s | 0.000028895 s | 0.89 |
| add_one / Jax / cuda / BothRev | 0.000025376 s | 0.000029056 s | 0.87 |
| add_one / HLOOpt / cuda / PreRev | 0.000025984 s | 0.000025568 s | 1.02 |
| add_one / HLOOpt / cuda / PostRev | 0.000025087 s | 0.000033024 s | 0.76 |
| add_one / HLOOpt / cuda / BothRev | 0.000025088 s | 0.000025856 s | 0.97 |
| add_one / PartOpt / cuda / PreRev | 0.000025408 s | 0.0000264 s | 0.96 |
| add_one / PartOpt / cuda / PostRev | 0.000025856 s | 0.00002624 s | 0.99 |
| add_one / PartOpt / cuda / BothRev | 0.000026207 s | 0.000025568 s | 1.02 |
| add_one / IPartOpt / cuda / PreRev | 0.000025952 s | 0.000025888 s | 1.00 |
| add_one / IPartOpt / cuda / PostRev | 0.000026048 s | 0.000025856 s | 1.01 |
| add_one / IPartOpt / cuda / BothRev | 0.000036479 s | 0.000026112 s | 1.40 |
| add_one / DefOpt / cuda / PreRev | 0.00002544 s | 0.00002576 s | 0.99 |
| add_one / DefOpt / cuda / PostRev | 0.000025727 s | 0.000025568 s | 1.01 |
| add_one / DefOpt / cuda / BothRev | 0.000025759 s | 0.0000264 s | 0.98 |
| add_one / IDefOpt / cuda / PreRev | 0.000025888 s | 0.000026016 s | 1.00 |
| add_one / IDefOpt / cuda / PostRev | 0.000025504 s | 0.000026209 s | 0.97 |
| add_one / IDefOpt / cuda / BothRev | 0.000025792 s | 0.000025471 s | 1.01 |
| add_one / JaXPipe / tpu / Primal | 0.000001452625 s | 0.000001421075 s | 1.02 |
| add_one / Jax / tpu / Primal | 0.0000014521999999999998 s | 0.000001401925 s | 1.04 |
| add_one / HLOOpt / tpu / Primal | 0.00000144645 s | 0.000001429425 s | 1.01 |
| add_one / PartOpt / tpu / Primal | 0.000001457825 s | 0.000001411375 s | 1.03 |
| add_one / IPartOpt / tpu / Primal | 0.0000014502250000000002 s | 0.000001426825 s | 1.02 |
| add_one / DefOpt / tpu / Primal | 0.0000014610500000000002 s | 0.000001401575 s | 1.04 |
| add_one / IDefOpt / tpu / Primal | 0.000001454475 s | 0.00000142725 s | 1.02 |
| add_one / JaXPipe / tpu / Forward | 0.0000019132 s | 0.000001856925 s | 1.03 |
| add_one / Jax / tpu / Forward | 0.000001870525 s | 0.000001847075 s | 1.01 |
| add_one / HLOOpt / tpu / Forward | 0.000001900175 s | 0.0000018547250000000005 s | 1.02 |
| add_one / PartOpt / tpu / Forward | 0.00000186445 s | 0.000001840325 s | 1.01 |
| add_one / IPartOpt / tpu / Forward | 0.000001903525 s | 0.0000018468 s | 1.03 |
| add_one / DefOpt / tpu / Forward | 0.0000018628 s | 0.00000183415 s | 1.02 |
| add_one / IDefOpt / tpu / Forward | 0.00000191455 s | 0.000001852925 s | 1.03 |
| add_one / JaXPipe / tpu / PreRev | 0.000002253275 s | 0.000002232875 s | 1.01 |
| add_one / JaXPipe / tpu / PostRev | 0.0000022977 s | 0.0000022385 s | 1.03 |
| add_one / JaXPipe / tpu / BothRev | 0.000002258575 s | 0.000002232425 s | 1.01 |
| add_one / Jax / tpu / BothRev | 0.000002304225 s | 0.000002235125 s | 1.03 |
| add_one / HLOOpt / tpu / PreRev | 0.000002264075 s | 0.00000223105 s | 1.01 |
| add_one / HLOOpt / tpu / PostRev | 0.00000229775 s | 0.00000224865 s | 1.02 |
| add_one / HLOOpt / tpu / BothRev | 0.0000022589 s | 0.00000223125 s | 1.01 |
| add_one / PartOpt / tpu / PreRev | 0.0000023003 s | 0.00000225265 s | 1.02 |
| add_one / PartOpt / tpu / PostRev | 0.000002264575 s | 0.0000022428 s | 1.01 |
| add_one / PartOpt / tpu / BothRev | 0.00000230155 s | 0.0000022354750000000004 s | 1.03 |
| add_one / IPartOpt / tpu / PreRev | 0.0000022529 s | 0.0000022359000000000005 s | 1.01 |
| add_one / IPartOpt / tpu / PostRev | 0.000002290775 s | 0.00000225115 s | 1.02 |
| add_one / IPartOpt / tpu / BothRev | 0.00000226445 s | 0.0000022322 s | 1.01 |
| add_one / DefOpt / tpu / PreRev | 0.00000230435 s | 0.000002243475 s | 1.03 |
| add_one / DefOpt / tpu / PostRev | 0.000002252625 s | 0.0000022352 s | 1.01 |
| add_one / DefOpt / tpu / BothRev | 0.000002293075 s | 0.0000022375 s | 1.02 |
| add_one / IDefOpt / tpu / PreRev | 0.0000022683 s | 0.000002236975 s | 1.01 |
| add_one / IDefOpt / tpu / PostRev | 0.000002294125 s | 0.0000022343 s | 1.03 |
| add_one / IDefOpt / tpu / BothRev | 0.000002264325 s | 0.0000022451 s | 1.01 |
| add_one / JaXPipe / cpu / Primal | 0.000013666 s | 0.000007868599977882695 s | 1.74 |
| add_one / Jax / cpu / Primal | 0.000013327 s | 0.00000741951998861623 s | 1.80 |
| add_one / HLOOpt / cpu / Primal | 0.000012997 s | 0.000010513959987292765 s | 1.24 |
| add_one / PartOpt / cpu / Primal | 0.000013167 s | 0.000007595719998789718 s | 1.73 |
| add_one / IPartOpt / cpu / Primal | 0.000012796 s | 0.000007777159989927895 s | 1.65 |
| add_one / DefOpt / cpu / Primal | 0.0000127 s | 0.0000112859800356091 s | 1.13 |
| add_one / IDefOpt / cpu / Primal | 0.000012699 s | 0.000007331239994528005 s | 1.73 |
| add_one / JaXPipe / cpu / Forward | 0.000017386999999999998 s | 0.000012304339988986613 s | 1.41 |
| add_one / Jax / cpu / Forward | 0.000017539 s | 0.00001202503997774329 s | 1.46 |
| add_one / HLOOpt / cpu / Forward | 0.000017621000000000003 s | 0.000013736400032939855 s | 1.28 |
| add_one / PartOpt / cpu / Forward | 0.000017927 s | 0.000016401239990955217 s | 1.09 |
| add_one / IPartOpt / cpu / Forward | 0.000017805 s | 0.000011726520024240016 s | 1.52 |
| add_one / DefOpt / cpu / Forward | 0.000018138 s | 0.000016018619999158545 s | 1.13 |
| add_one / IDefOpt / cpu / Forward | 0.000017542 s | 0.000011999660018773284 s | 1.46 |
| add_one / JaXPipe / cpu / PreRev | 0.0000203 s | 0.000013027700024395015 s | 1.56 |
| add_one / JaXPipe / cpu / PostRev | 0.000020275 s | 0.000012822980006603756 s | 1.58 |
| add_one / JaXPipe / cpu / BothRev | 0.000020403 s | 0.000016642000000501868 s | 1.23 |
| add_one / Jax / cpu / BothRev | 0.00002 s | 0.000012804839980162796 s | 1.56 |
| add_one / HLOOpt / cpu / PreRev | 0.000020867 s | 0.000014124400022410555 s | 1.48 |
| add_one / HLOOpt / cpu / PostRev | 0.00002049 s | 0.00001704923999568564 s | 1.20 |
| add_one / HLOOpt / cpu / BothRev | 0.000021379 s | 0.00001847005998570239 s | 1.16 |
| add_one / PartOpt / cpu / PreRev | 0.000020639 s | 0.000012570299977596731 s | 1.64 |
| add_one / PartOpt / cpu / PostRev | 0.000021531 s | 0.00001297972001339076 s | 1.66 |
| add_one / PartOpt / cpu / BothRev | 0.000020029 s | 0.000012936780049130902 s | 1.55 |
| add_one / IPartOpt / cpu / PreRev | 0.000019992 s | 0.000017668959999355137 s | 1.13 |
| add_one / IPartOpt / cpu / PostRev | 0.00001998 s | 0.000012812400054826868 s | 1.56 |
| add_one / IPartOpt / cpu / BothRev | 0.000020275 s | 0.000012878280012955656 s | 1.57 |
| add_one / DefOpt / cpu / PreRev | 0.000019697 s | 0.000012653959993258468 s | 1.56 |
| add_one / DefOpt / cpu / PostRev | 0.000020108 s | 0.000013062259986327264 s | 1.54 |
| add_one / DefOpt / cpu / BothRev | 0.000020102 s | 0.000012414299990268771 s | 1.62 |
| add_one / IDefOpt / cpu / PreRev | 0.000019429 s | 0.000013297360037540785 s | 1.46 |
| add_one / IDefOpt / cpu / PostRev | 0.000019685 s | 0.00001240428003256966 s | 1.59 |
| add_one / IDefOpt / cpu / BothRev | 0.00001947 s | 0.0000127792000057525 s | 1.52 |
| add_two / JaXPipe / cpu / Primal | 0.000007928140000785788 s | 0.000007723359985902789 s | 1.03 |
| add_two / Jax / cpu / Primal | 0.000006638699996983632 s | 0.00000696983996022027 s | 0.95 |
| add_two / HLOOpt / cpu / Primal | 0.000011250580000705667 s | 0.000010501580027266754 s | 1.07 |
| add_two / PartOpt / cpu / Primal | 0.000006794200005515449 s | 0.000007528760006607626 s | 0.90 |
| add_two / IPartOpt / cpu / Primal | 0.000007167560002017126 s | 0.000007299680046344292 s | 0.98 |
| add_two / DefOpt / cpu / Primal | 0.000011758559996906116 s | 0.00001088045998585585 s | 1.08 |
| add_two / IDefOpt / cpu / Primal | 0.000007010839995018614 s | 0.000007032020039332565 s | 1.00 |
| add_two / JaXPipe / cpu / Forward | 0.000010698520000005374 s | 0.00001198598002702056 s | 0.89 |
| add_two / Jax / cpu / Forward | 0.000010640840005180509 s | 0.000011226719998376213 s | 0.95 |
| add_two / HLOOpt / cpu / Forward | 0.000015000320006492984 s | 0.000016235859975495258 s | 0.92 |
| add_two / PartOpt / cpu / Forward | 0.00001506551999000294 s | 0.00001550988001326914 s | 0.97 |
| add_two / IPartOpt / cpu / Forward | 0.000011008760006916418 s | 0.00001129524000134552 s | 0.97 |
| add_two / DefOpt / cpu / Forward | 0.000014892639999288804 s | 0.00001601973995093431 s | 0.93 |
| add_two / IDefOpt / cpu / Forward | 0.000010668099996564706 s | 0.000011547620006240325 s | 0.92 |
| add_two / JaXPipe / cpu / PreRev | 0.000015182940007889556 s | 0.00001604412000233424 s | 0.95 |
| add_two / JaXPipe / cpu / PostRev | 0.000014385699989816204 s | 0.000014845979967503808 s | 0.97 |
| add_two / JaXPipe / cpu / BothRev | 0.000014681379989269771 s | 0.000015111439970496575 s | 0.97 |
| add_two / Jax / cpu / BothRev | 0.000014747539994459658 s | 0.000015581239968014415 s | 0.95 |
| add_two / HLOOpt / cpu / PreRev | 0.000014457459990353529 s | 0.000015555060026599677 s | 0.93 |
| add_two / HLOOpt / cpu / PostRev | 0.00001485273998696357 s | 0.000015376660021502174 s | 0.97 |
| add_two / HLOOpt / cpu / BothRev | 0.000016687999996065627 s | 0.000016302340027323227 s | 1.02 |
| add_two / PartOpt / cpu / PreRev | 0.000014322499998797866 s | 0.000015145600027608452 s | 0.95 |
| add_two / PartOpt / cpu / PostRev | 0.000014512039999772242 s | 0.000014814679998380598 s | 0.98 |
| add_two / PartOpt / cpu / BothRev | 0.00001473678000138534 s | 0.0000159910200000013 s | 0.92 |
| add_two / IPartOpt / cpu / PreRev | 0.000014869079993786729 s | 0.000016129919977174723 s | 0.92 |
| add_two / IPartOpt / cpu / PostRev | 0.000014934120003999852 s | 0.000015059539964568103 s | 0.99 |
| add_two / IPartOpt / cpu / BothRev | 0.000014514140004848742 s | 0.000015380740060209063 s | 0.94 |
| add_two / DefOpt / cpu / PreRev | 0.000015040559992485214 s | 0.00001517411998065654 s | 0.99 |
| add_two / DefOpt / cpu / PostRev | 0.00001465556000084689 s | 0.000015974059970176313 s | 0.92 |
| add_two / DefOpt / cpu / BothRev | 0.000015040280018183694 s | 0.000015469319996554987 s | 0.97 |
| add_two / IDefOpt / cpu / PreRev | 0.000014672060001430508 s | 0.00001592369992977183 s | 0.92 |
| add_two / IDefOpt / cpu / PostRev | 0.000014792000006309537 s | 0.000015961200042511336 s | 0.93 |
| add_two / IDefOpt / cpu / BothRev | 0.000014957959995172133 s | 0.000015459220012417063 s | 0.97 |
| add_two / JaXPipe / cuda / Primal | 0.000001951 s | 0.000001951 s | 1 |
| add_two / Jax / cuda / Primal | 0.000001951 s | 0.000001951 s | 1 |
| add_two / HLOOpt / cuda / Primal | 0.000001951 s | 0.000001951 s | 1 |
| add_two / PartOpt / cuda / Primal | 0.000001951 s | 0.0000019200000000000003 s | 1.02 |
| add_two / IPartOpt / cuda / Primal | 0.000001951 s | 0.000001951 s | 1 |
| add_two / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000001951 s | 0.98 |
| add_two / IDefOpt / cuda / Primal | 0.000001951 s | 0.000001951 s | 1 |
| add_two / JaXPipe / cuda / Forward | 0.000009952 s | 0.000009984 s | 1.00 |
| add_two / Jax / cuda / Forward | 0.00001024 s | 0.000009984 s | 1.03 |
| add_two / HLOOpt / cuda / Forward | 0.00001008 s | 0.000009984 s | 1.01 |
| add_two / PartOpt / cuda / Forward | 0.000010016 s | 0.0000104 s | 0.96 |
| add_two / IPartOpt / cuda / Forward | 0.000010016 s | 0.000010368 s | 0.97 |
| add_two / DefOpt / cuda / Forward | 0.000009856 s | 0.000009824 s | 1.00 |
| add_two / IDefOpt / cuda / Forward | 0.000009792 s | 0.000010015 s | 0.98 |
| add_two / JaXPipe / cuda / PreRev | 0.00003264 s | 0.000032928 s | 0.99 |
| add_two / JaXPipe / cuda / PostRev | 0.000031936 s | 0.000032832 s | 0.97 |
| add_two / JaXPipe / cuda / BothRev | 0.000031776 s | 0.000032767999999999995 s | 0.97 |
| add_two / Jax / cuda / BothRev | 0.000031424 s | 0.000033376 s | 0.94 |
| add_two / HLOOpt / cuda / PreRev | 0.00003248 s | 0.000032992 s | 0.98 |
| add_two / HLOOpt / cuda / PostRev | 0.000033248 s | 0.000033056 s | 1.01 |
| add_two / HLOOpt / cuda / BothRev | 0.000032864 s | 0.000034848 s | 0.94 |
| add_two / PartOpt / cuda / PreRev | 0.000032992 s | 0.000033343 s | 0.99 |
| add_two / PartOpt / cuda / PostRev | 0.000032416 s | 0.000032576 s | 1.00 |
| add_two / PartOpt / cuda / BothRev | 0.000032896000000000005 s | 0.000032672 s | 1.01 |
| add_two / IPartOpt / cuda / PreRev | 0.000033312 s | 0.000032897 s | 1.01 |
| add_two / IPartOpt / cuda / PostRev | 0.000033184 s | 0.000032832 s | 1.01 |
| add_two / IPartOpt / cuda / BothRev | 0.000043456 s | 0.000033055 s | 1.31 |
| add_two / DefOpt / cuda / PreRev | 0.000032544 s | 0.000032991 s | 0.99 |
| add_two / DefOpt / cuda / PostRev | 0.000034112 s | 0.00003328 s | 1.03 |
| add_two / DefOpt / cuda / BothRev | 0.000033152000000000004 s | 0.000038144 s | 0.87 |
| add_two / IDefOpt / cuda / PreRev | 0.000034656 s | 0.000033536000000000006 s | 1.03 |
| add_two / IDefOpt / cuda / PostRev | 0.00003408 s | 0.0000376 s | 0.91 |
| add_two / IDefOpt / cuda / BothRev | 0.000033056 s | 0.000033311 s | 0.99 |
| add_two / JaXPipe / tpu / Primal | 0.000001395075 s | 0.0000014412 s | 0.97 |
| add_two / Jax / tpu / Primal | 0.0000014105500000000002 s | 0.000001480425 s | 0.95 |
| add_two / HLOOpt / tpu / Primal | 0.0000013944500000000005 s | 0.00000142565 s | 0.98 |
| add_two / PartOpt / tpu / Primal | 0.000001401025 s | 0.000001473 s | 0.95 |
| add_two / IPartOpt / tpu / Primal | 0.0000013995 s | 0.000001433925 s | 0.98 |
| add_two / DefOpt / tpu / Primal | 0.0000014139 s | 0.000001476275 s | 0.96 |
| add_two / IDefOpt / tpu / Primal | 0.00000139005 s | 0.000001433925 s | 0.97 |
| add_two / JaXPipe / tpu / Forward | 0.0000017845749999999998 s | 0.000001823425 s | 0.98 |
| add_two / Jax / tpu / Forward | 0.00000182525 s | 0.000001831975 s | 1.00 |
| add_two / HLOOpt / tpu / Forward | 0.000001793225 s | 0.00000183065 s | 0.98 |
| add_two / PartOpt / tpu / Forward | 0.000001832425 s | 0.00000182505 s | 1.00 |
| add_two / IPartOpt / tpu / Forward | 0.000001804175 s | 0.00000183225 s | 0.98 |
| add_two / DefOpt / tpu / Forward | 0.000001823575 s | 0.000001836275 s | 0.99 |
| add_two / IDefOpt / tpu / Forward | 0.0000017948249999999998 s | 0.000001826925 s | 0.98 |
| add_two / JaXPipe / tpu / PreRev | 0.0000028249 s | 0.00000283795 s | 1.00 |
| add_two / JaXPipe / tpu / PostRev | 0.000002735375 s | 0.00000275535 s | 0.99 |
| add_two / JaXPipe / tpu / BothRev | 0.0000028071 s | 0.0000028325 s | 0.99 |
| add_two / Jax / tpu / BothRev | 0.00000273835 s | 0.000002751725 s | 1.00 |
| add_two / HLOOpt / tpu / PreRev | 0.0000028137000000000003 s | 0.0000028398250000000004 s | 0.99 |
| add_two / HLOOpt / tpu / PostRev | 0.000002724225 s | 0.0000027475 s | 0.99 |
| add_two / HLOOpt / tpu / BothRev | 0.000002818825 s | 0.00000284185 s | 0.99 |
| add_two / PartOpt / tpu / PreRev | 0.000002728675 s | 0.000002757825 s | 0.99 |
| add_two / PartOpt / tpu / PostRev | 0.000002817875 s | 0.000002824325 s | 1.00 |
| add_two / PartOpt / tpu / BothRev | 0.0000027294000000000005 s | 0.0000027582250000000006 s | 0.99 |
| add_two / IPartOpt / tpu / PreRev | 0.000002814325 s | 0.00000283555 s | 0.99 |
| add_two / IPartOpt / tpu / PostRev | 0.000002733575 s | 0.000002745575 s | 1.00 |
| add_two / IPartOpt / tpu / BothRev | 0.000002818025 s | 0.000002840825 s | 0.99 |
| add_two / DefOpt / tpu / PreRev | 0.00000273515 s | 0.000002755975 s | 0.99 |
| add_two / DefOpt / tpu / PostRev | 0.000002819775 s | 0.0000028389 s | 0.99 |
| add_two / DefOpt / tpu / BothRev | 0.000002727725 s | 0.0000027441 s | 0.99 |
| add_two / IDefOpt / tpu / PreRev | 0.0000028079 s | 0.0000028370250000000006 s | 0.99 |
| add_two / IDefOpt / tpu / PostRev | 0.00000272425 s | 0.0000027601 s | 0.99 |
| add_two / IDefOpt / tpu / BothRev | 0.00000280715 s | 0.000002846225 s | 0.99 |
add_two / JaXPipe / cpu / Primal |
0.000013476 s |
0.000007723359985902789 s |
1.74 |
add_two / Jax / cpu / Primal |
0.000013404 s |
0.00000696983996022027 s |
1.92 |
add_two / HLOOpt / cpu / Primal |
0.000013378 s |
0.000010501580027266754 s |
1.27 |
add_two / PartOpt / cpu / Primal |
0.000012966 s |
0.000007528760006607626 s |
1.72 |
add_two / IPartOpt / cpu / Primal |
0.000013526 s |
0.000007299680046344292 s |
1.85 |
add_two / DefOpt / cpu / Primal |
0.000013365 s |
0.00001088045998585585 s |
1.23 |
add_two / IDefOpt / cpu / Primal |
0.000013258 s |
0.000007032020039332565 s |
1.89 |
add_two / JaXPipe / cpu / Forward |
0.000018375 s |
0.00001198598002702056 s |
1.53 |
add_two / Jax / cpu / Forward |
0.000017758 s |
0.000011226719998376213 s |
1.58 |
add_two / HLOOpt / cpu / Forward |
0.000017978 s |
0.000016235859975495258 s |
1.11 |
add_two / PartOpt / cpu / Forward |
0.000018255 s |
0.00001550988001326914 s |
1.18 |
add_two / IPartOpt / cpu / Forward |
0.000018343 s |
0.00001129524000134552 s |
1.62 |
add_two / DefOpt / cpu / Forward |
0.000018222 s |
0.00001601973995093431 s |
1.14 |
add_two / IDefOpt / cpu / Forward |
0.000018044 s |
0.000011547620006240325 s |
1.56 |
add_two / JaXPipe / cpu / PreRev |
0.000025639 s |
0.00001604412000233424 s |
1.60 |
add_two / JaXPipe / cpu / PostRev |
0.000025921 s |
0.000014845979967503808 s |
1.75 |
add_two / JaXPipe / cpu / BothRev |
0.000026127 s |
0.000015111439970496575 s |
1.73 |
add_two / Jax / cpu / BothRev |
0.000024385 s |
0.000015581239968014415 s |
1.57 |
add_two / HLOOpt / cpu / PreRev |
0.000024822 s |
0.000015555060026599677 s |
1.60 |
add_two / HLOOpt / cpu / PostRev |
0.000025743 s |
0.000015376660021502174 s |
1.67 |
add_two / HLOOpt / cpu / BothRev |
0.000025137 s |
0.000016302340027323227 s |
1.54 |
add_two / PartOpt / cpu / PreRev |
0.000024957 s |
0.000015145600027608452 s |
1.65 |
add_two / PartOpt / cpu / PostRev |
0.000025996 s |
0.000014814679998380598 s |
1.75 |
add_two / PartOpt / cpu / BothRev |
0.000025281 s |
0.0000159910200000013 s |
1.58 |
add_two / IPartOpt / cpu / PreRev |
0.000025158 s |
0.000016129919977174723 s |
1.56 |
add_two / IPartOpt / cpu / PostRev |
0.00002554 s |
0.000015059539964568103 s |
1.70 |
add_two / IPartOpt / cpu / BothRev |
0.00002555 s |
0.000015380740060209063 s |
1.66 |
add_two / DefOpt / cpu / PreRev |
0.000025359 s |
0.00001517411998065654 s |
1.67 |
add_two / DefOpt / cpu / PostRev |
0.000024429 s |
0.000015974059970176313 s |
1.53 |
add_two / DefOpt / cpu / BothRev |
0.00002445 s |
0.000015469319996554987 s |
1.58 |
add_two / IDefOpt / cpu / PreRev |
0.000024202 s |
0.00001592369992977183 s |
1.52 |
add_two / IDefOpt / cpu / PostRev |
0.000025936 s |
0.000015961200042511336 s |
1.62 |
add_two / IDefOpt / cpu / BothRev |
0.000026333 s |
0.000015459220012417063 s |
1.70 |
cache / JaXPipe / cpu / Primal |
0.000006704759996409848 s |
0.000006956840015845956 s |
0.96 |
cache / Jax / cpu / Primal |
0.000007148000001961918 s |
0.000007311219933399116 s |
0.98 |
cache / HLOOpt / cpu / Primal |
0.0000067763000106424445 s |
0.000007593460013595177 s |
0.89 |
cache / PartOpt / cpu / Primal |
0.000007507680008984608 s |
0.000007535879994975403 s |
1.00 |
cache / IPartOpt / cpu / Primal |
0.000006921660001353303 s |
0.000007371540013991762 s |
0.94 |
cache / DefOpt / cpu / Primal |
0.000006444840003041463 s |
0.000007683940011702362 s |
0.84 |
cache / IDefOpt / cpu / Primal |
0.0000065485799996167775 s |
0.0000073235999934695425 s |
0.89 |
cache / JaXPipe / cpu / Forward |
0.00001400896000177454 s |
0.000014550459982274333 s |
0.96 |
cache / Jax / cpu / Forward |
0.000014287819994933673 s |
0.000014931959995010404 s |
0.96 |
cache / HLOOpt / cpu / Forward |
0.00001953021999952398 s |
0.00001945838004758116 s |
1.00 |
cache / PartOpt / cpu / Forward |
0.000020570959986798697 s |
0.00001901271997667209 s |
1.08 |
cache / IPartOpt / cpu / Forward |
0.000015427460000410064 s |
0.0000151107200235856 s |
1.02 |
cache / DefOpt / cpu / Forward |
0.00002085963999661544 s |
0.00001980758002900984 s |
1.05 |
cache / IDefOpt / cpu / Forward |
0.000014792300007684389 s |
0.000014846979993308196 s |
1.00 |
cache / JaXPipe / cpu / PreRev |
0.000017505580012766585 s |
0.000015902700033620933 s |
1.10 |
cache / JaXPipe / cpu / PostRev |
0.000021765780002169776 s |
0.000021121619975019717 s |
1.03 |
cache / JaXPipe / cpu / BothRev |
0.000022876399989399945 s |
0.000017116039998654742 s |
1.34 |
cache / Jax / cpu / BothRev |
0.00002155626000103439 s |
0.00002173714004129579 s |
0.99 |
cache / HLOOpt / cpu / PreRev |
0.00001658504000488392 s |
0.000017123339985118948 s |
0.97 |
cache / HLOOpt / cpu / PostRev |
0.000020419739998942533 s |
0.00001864448002379504 s |
1.10 |
cache / HLOOpt / cpu / BothRev |
0.0000212658799887322 s |
0.000018996239978150696 s |
1.12 |
cache / PartOpt / cpu / PreRev |
0.00001629732000083095 s |
0.00001642888004425913 s |
0.99 |
cache / PartOpt / cpu / PostRev |
0.00002127841999708835 s |
0.000020945439973729663 s |
1.02 |
cache / PartOpt / cpu / BothRev |
0.00001688822000460277 s |
0.000016367300049751065 s |
1.03 |
cache / IPartOpt / cpu / PreRev |
0.00002261333999058479 s |
0.000017679780003163616 s |
1.28 |
cache / IPartOpt / cpu / PostRev |
0.00002229039999292581 s |
0.000022093920060797247 s |
1.01 |
cache / IPartOpt / cpu / BothRev |
0.000016651019996061224 s |
0.000016750399972806917 s |
0.99 |
cache / DefOpt / cpu / PreRev |
0.00001773136000792874 s |
0.000016780159985501086 s |
1.06 |
cache / DefOpt / cpu / PostRev |
0.00002216581999391565 s |
0.00001688983998974436 s |
1.31 |
cache / DefOpt / cpu / BothRev |
0.000016958400003659335 s |
0.000016799259938125033 s |
1.01 |
cache / IDefOpt / cpu / PreRev |
0.000016649199992571084 s |
0.000017757959994924022 s |
0.94 |
cache / IDefOpt / cpu / PostRev |
0.000016108600002553432 s |
0.00001715016000161995 s |
0.94 |
cache / IDefOpt / cpu / BothRev |
0.000015865459997712604 s |
0.0000166604799869674 s |
0.95 |
cache / JaXPipe / cuda / Primal |
0.000002304 s |
0.000002304 s |
1 |
cache / Jax / cuda / Primal |
0.000002272 s |
0.000002272 s |
1 |
cache / HLOOpt / cuda / Primal |
0.000002272 s |
0.000002272 s |
1 |
cache / PartOpt / cuda / Primal |
0.00000224 s |
0.000002272 s |
0.99 |
cache / IPartOpt / cuda / Primal |
0.00000224 s |
0.000002272 s |
0.99 |
cache / DefOpt / cuda / Primal |
0.00000224 s |
0.00000224 s |
1 |
cache / IDefOpt / cuda / Primal |
0.000002304 s |
0.000002304 s |
1 |
cache / JaXPipe / cuda / Forward |
0.000002336 s |
0.000002335 s |
1.00 |
cache / Jax / cuda / Forward |
0.000002272 s |
0.000002272 s |
1 |
cache / HLOOpt / cuda / Forward |
0.000002335 s |
0.000002335 s |
1 |
cache / PartOpt / cuda / Forward |
0.000002303 s |
0.000002304 s |
1.00 |
cache / IPartOpt / cuda / Forward |
0.000002303 s |
0.000002304 s |
1.00 |
cache / DefOpt / cuda / Forward |
0.00000224 s |
0.00000224 s |
1 |
cache / IDefOpt / cuda / Forward |
0.000002303 s |
0.000002272 s |
1.01 |
cache / JaXPipe / cuda / PreRev |
0.000013184 s |
0.000013184 s |
1 |
cache / JaXPipe / cuda / PostRev |
0.000011616 s |
0.00001136 s |
1.02 |
cache / JaXPipe / cuda / BothRev |
0.000013215 s |
0.000013216 s |
1.00 |
cache / Jax / cuda / BothRev |
0.0000112 s |
0.000011488 s |
0.97 |
cache / HLOOpt / cuda / PreRev |
0.000013216 s |
0.000013215 s |
1.00 |
cache / HLOOpt / cuda / PostRev |
0.000013216 s |
0.000013183 s |
1.00 |
cache / HLOOpt / cuda / BothRev |
0.000013216 s |
0.000013216 s |
1 |
cache / PartOpt / cuda / PreRev |
0.000013248 s |
0.000013216 s |
1.00 |
cache / PartOpt / cuda / PostRev |
0.000011872 s |
0.000011488 s |
1.03 |
cache / PartOpt / cuda / BothRev |
0.000013215 s |
0.000013216 s |
1.00 |
cache / IPartOpt / cuda / PreRev |
0.000013215 s |
0.000013216 s |
1.00 |
cache / IPartOpt / cuda / PostRev |
0.000011488 s |
0.00001184 s |
0.97 |
cache / IPartOpt / cuda / BothRev |
0.000013215 s |
0.000013184 s |
1.00 |
cache / DefOpt / cuda / PreRev |
0.000013248 s |
0.000013216 s |
1.00 |
cache / DefOpt / cuda / PostRev |
0.000013151 s |
0.000013152 s |
1.00 |
cache / DefOpt / cuda / BothRev |
0.000013248 s |
0.000013248 s |
1 |
cache / IDefOpt / cuda / PreRev |
0.000013216 s |
0.000013183 s |
1.00 |
cache / IDefOpt / cuda / PostRev |
0.000013184 s |
0.000013152 s |
1.00 |
cache / IDefOpt / cuda / BothRev |
0.000013216 s |
0.000013215 s |
1.00 |
cache / JaXPipe / tpu / Primal |
0.00000245955 s |
0.000002456125 s |
1.00 |
cache / Jax / tpu / Primal |
0.00000245175 s |
0.000002448675 s |
1.00 |
cache / HLOOpt / tpu / Primal |
0.000002459325 s |
0.0000024729 s |
0.99 |
cache / PartOpt / tpu / Primal |
0.000002456225 s |
0.0000024683 s |
1.00 |
cache / IPartOpt / tpu / Primal |
0.0000024477 s |
0.00000246125 s |
0.99 |
cache / DefOpt / tpu / Primal |
0.000002479225 s |
0.00000244685 s |
1.01 |
cache / IDefOpt / tpu / Primal |
0.0000024449 s |
0.000002470775 s |
0.99 |
cache / JaXPipe / tpu / Forward |
0.0000035318 s |
0.0000035407250000000003 s |
1.00 |
cache / Jax / tpu / Forward |
0.0000035532750000000003 s |
0.00000353345 s |
1.01 |
cache / HLOOpt / tpu / Forward |
0.0000035856 s |
0.0000035537000000000004 s |
1.01 |
cache / PartOpt / tpu / Forward |
0.00000355915 s |
0.0000035360750000000003 s |
1.01 |
cache / IPartOpt / tpu / Forward |
0.00000356035 s |
0.0000035455750000000004 s |
1.00 |
cache / DefOpt / tpu / Forward |
0.0000035548000000000003 s |
0.00000352575 s |
1.01 |
cache / IDefOpt / tpu / Forward |
0.0000035819000000000004 s |
0.0000035553 s |
1.01 |
cache / JaXPipe / tpu / PreRev |
0.0000039224 s |
0.000003930274999999999 s |
1.00 |
cache / JaXPipe / tpu / PostRev |
0.000004900875 s |
0.00000496115 s |
0.99 |
cache / JaXPipe / tpu / BothRev |
0.000003972925 s |
0.00000393385 s |
1.01 |
cache / Jax / tpu / BothRev |
0.000004954275 s |
0.000004990125000000001 s |
0.99 |
cache / HLOOpt / tpu / PreRev |
0.000003975025 s |
0.000003932450000000001 s |
1.01 |
cache / HLOOpt / tpu / PostRev |
0.000004081975 s |
0.0000041239 s |
0.99 |
cache / HLOOpt / tpu / BothRev |
0.000003976225 s |
0.0000039407 s |
1.01 |
cache / PartOpt / tpu / PreRev |
0.0000041101 s |
0.000004137224999999999 s |
0.99 |
cache / PartOpt / tpu / PostRev |
0.0000049157 s |
0.000004995249999999999 s |
0.98 |
cache / PartOpt / tpu / BothRev |
0.00000408545 s |
0.0000041481 s |
0.98 |
cache / IPartOpt / tpu / PreRev |
0.00000397885 s |
0.00000394215 s |
1.01 |
cache / IPartOpt / tpu / PostRev |
0.000004943449999999999 s |
0.00000498895 s |
0.99 |
cache / IPartOpt / tpu / BothRev |
0.0000039803 s |
0.000003957575 s |
1.01 |
cache / DefOpt / tpu / PreRev |
0.00000410195 s |
0.000004119575 s |
1.00 |
cache / DefOpt / tpu / PostRev |
0.000003984275 s |
0.000003933349999999999 s |
1.01 |
cache / DefOpt / tpu / BothRev |
0.0000040772000000000005 s |
0.0000041150500000000005 s |
0.99 |
cache / IDefOpt / tpu / PreRev |
0.0000039691 s |
0.000003921825 s |
1.01 |
cache / IDefOpt / tpu / PostRev |
0.000004084225000000001 s |
0.0000041251 s |
0.99 |
cache / IDefOpt / tpu / BothRev |
0.000003978174999999999 s |
0.0000039568 s |
1.01 |
cache / JaXPipe / cpu / Primal |
0.000012655 s |
0.000006956840015845956 s |
1.82 |
cache / Jax / cpu / Primal |
0.000012979 s |
0.000007311219933399116 s |
1.78 |
cache / HLOOpt / cpu / Primal |
0.000013026 s |
0.000007593460013595177 s |
1.72 |
cache / PartOpt / cpu / Primal |
0.000013354 s |
0.000007535879994975403 s |
1.77 |
cache / IPartOpt / cpu / Primal |
0.000012584 s |
0.000007371540013991762 s |
1.71 |
cache / DefOpt / cpu / Primal |
0.000012949 s |
0.000007683940011702362 s |
1.69 |
cache / IDefOpt / cpu / Primal |
0.00001242 s |
0.0000073235999934695425 s |
1.70 |
cache / JaXPipe / cpu / Forward |
0.000018539 s |
0.000014550459982274333 s |
1.27 |
cache / Jax / cpu / Forward |
0.00001799 s |
0.000014931959995010404 s |
1.20 |
cache / HLOOpt / cpu / Forward |
0.000018149 s |
0.00001945838004758116 s |
0.93 |
cache / PartOpt / cpu / Forward |
0.000017330000000000002 s |
0.00001901271997667209 s |
0.91 |
cache / IPartOpt / cpu / Forward |
0.000017509 s |
0.0000151107200235856 s |
1.16 |
cache / DefOpt / cpu / Forward |
0.000017406999999999998 s |
0.00001980758002900984 s |
0.88 |
cache / IDefOpt / cpu / Forward |
0.000017398000000000002 s |
0.000014846979993308196 s |
1.17 |
cache / JaXPipe / cpu / PreRev |
0.000018572 s |
0.000015902700033620933 s |
1.17 |
cache / JaXPipe / cpu / PostRev |
0.000019719 s |
0.000021121619975019717 s |
0.93 |
cache / JaXPipe / cpu / BothRev |
0.000017916999999999998 s |
0.000017116039998654742 s |
1.05 |
cache / Jax / cpu / BothRev |
0.000031298 s |
0.00002173714004129579 s |
1.44 |
cache / HLOOpt / cpu / PreRev |
0.000018112 s |
0.000017123339985118948 s |
1.06 |
cache / HLOOpt / cpu / PostRev |
0.000017624 s |
0.00001864448002379504 s |
0.95 |
cache / HLOOpt / cpu / BothRev |
0.000035535 s |
0.000018996239978150696 s |
1.87 |
cache / PartOpt / cpu / PreRev |
0.00003759 s |
0.00001642888004425913 s |
2.29 |
cache / PartOpt / cpu / PostRev |
0.00002901 s |
0.000020945439973729663 s |
1.39 |
cache / PartOpt / cpu / BothRev |
0.000027251 s |
0.000016367300049751065 s |
1.66 |
cache / IPartOpt / cpu / PreRev |
0.000024226 s |
0.000017679780003163616 s |
1.37 |
cache / IPartOpt / cpu / PostRev |
0.00002642 s |
0.000022093920060797247 s |
1.20 |
cache / IPartOpt / cpu / BothRev |
0.000024396 s |
0.000016750399972806917 s |
1.46 |
cache / DefOpt / cpu / PreRev |
0.000024562 s |
0.000016780159985501086 s |
1.46 |
cache / DefOpt / cpu / PostRev |
0.000019534 s |
0.00001688983998974436 s |
1.16 |
cache / DefOpt / cpu / BothRev |
0.000022103 s |
0.000016799259938125033 s |
1.32 |
cache / IDefOpt / cpu / PreRev |
0.000024436 s |
0.000017757959994924022 s |
1.38 |
cache / IDefOpt / cpu / PostRev |
0.00002701 s |
0.00001715016000161995 s |
1.57 |
cache / IDefOpt / cpu / BothRev |
0.000024496 s |
0.0000166604799869674 s |
1.47 |
Concat / JaXPipe / cpu / Primal |
0.000007837380001092243 s |
0.00000811730000350508 s |
0.97 |
Concat / Jax / cpu / Primal |
0.00000711994000539562 s |
0.000007682900013605832 s |
0.93 |
Concat / HLOOpt / cpu / Primal |
0.00001024896000217268 s |
0.00001060113999301393 s |
0.97 |
Concat / PartOpt / cpu / Primal |
0.000006625159996929142 s |
0.000007756900004096678 s |
0.85 |
Concat / IPartOpt / cpu / Primal |
0.000006695799997942231 s |
0.000007213979988591745 s |
0.93 |
Concat / DefOpt / cpu / Primal |
0.0000106093799968221 s |
0.000011252420026721666 s |
0.94 |
Concat / IDefOpt / cpu / Primal |
0.00000657244000421997 s |
0.000007434619956256938 s |
0.88 |
Concat / JaXPipe / cpu / Forward |
0.000011313520003568556 s |
0.00001102167998396908 s |
1.03 |
Concat / Jax / cpu / Forward |
0.000010954759991363972 s |
0.000011529920011525971 s |
0.95 |
Concat / HLOOpt / cpu / Forward |
0.000014874840005631996 s |
0.000015114119942154502 s |
0.98 |
Concat / PartOpt / cpu / Forward |
0.0000151896799889073 s |
0.000015529919983237052 s |
0.98 |
Concat / IPartOpt / cpu / Forward |
0.00001100988000189318 s |
0.00001101205997656507 s |
1.00 |
Concat / DefOpt / cpu / Forward |
0.000015059980000842188 s |
0.00001623475995984336 s |
0.93 |
Concat / IDefOpt / cpu / Forward |
0.000010687780002172075 s |
0.000011439519985287916 s |
0.93 |
Concat / JaXPipe / cpu / PreRev |
0.000011720900001819244 s |
0.00001323297995440953 s |
0.89 |
Concat / JaXPipe / cpu / PostRev |
0.000012682240005688072 s |
0.000013689120060007552 s |
0.93 |
Concat / JaXPipe / cpu / BothRev |
0.00001273031999289742 s |
0.00001329070005340327 s |
0.96 |
Concat / Jax / cpu / BothRev |
0.000012601979994997236 s |
0.000013045700006841798 s |
0.97 |
Concat / HLOOpt / cpu / PreRev |
0.00001237235999724362 s |
0.000013706500003536347 s |
0.90 |
Concat / HLOOpt / cpu / PostRev |
0.000016496280006776943 s |
0.000017045479953594622 s |
0.97 |
Concat / HLOOpt / cpu / BothRev |
0.00001413272000490906 s |
0.00001466194001295662 s |
0.96 |
Concat / PartOpt / cpu / PreRev |
0.000012438120002116192 s |
0.000013567740024882369 s |
0.92 |
Concat / PartOpt / cpu / PostRev |
0.00001232705998518213 s |
0.000013383999985308037 s |
0.92 |
Concat / PartOpt / cpu / BothRev |
0.000012387320002744674 s |
0.000013322119930307964 s |
0.93 |
Concat / IPartOpt / cpu / PreRev |
0.000017036279996318626 s |
0.000014049620031073571 s |
1.21 |
Concat / IPartOpt / cpu / PostRev |
0.000012929979993714369 s |
0.000013057799988018814 s |
0.99 |
Concat / IPartOpt / cpu / BothRev |
0.00001213414000631019 s |
0.00001312568005232606 s |
0.92 |
Concat / DefOpt / cpu / PreRev |
0.000011710739997852216 s |
0.000012955279962625355 s |
0.90 |
Concat / DefOpt / cpu / PostRev |
0.00001182920000019294 s |
0.000013988820019221748 s |
0.85 |
Concat / DefOpt / cpu / BothRev |
0.000012492359996940647 s |
0.000013455380039886224 s |
0.93 |
Concat / IDefOpt / cpu / PreRev |
0.000012313399995491636 s |
0.000013155779970475124 s |
0.94 |
Concat / IDefOpt / cpu / PostRev |
0.000012558000007629743 s |
0.000012839719993280596 s |
0.98 |
Concat / IDefOpt / cpu / BothRev |
0.000011483460014005686 s |
0.000014180980006130994 s |
0.81 |
Concat / JaXPipe / cuda / Primal |
0.000001951 s |
0.000001951 s |
1 |
Concat / Jax / cuda / Primal |
0.000001951 s |
0.000001951 s |
1 |
Concat / HLOOpt / cuda / Primal |
0.000001951 s |
0.000001952 s |
1.00 |
Concat / PartOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1 |
Concat / IPartOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1 |
Concat / DefOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1 |
Concat / IDefOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1 |
Concat / JaXPipe / cuda / Forward |
0.000010208 s |
0.00001072 s |
0.95 |
Concat / Jax / cuda / Forward |
0.000010433 s |
0.000010624 s |
0.98 |
Concat / HLOOpt / cuda / Forward |
0.000010752 s |
0.000010432 s |
1.03 |
Concat / PartOpt / cuda / Forward |
0.000010208 s |
0.000010591 s |
0.96 |
Concat / IPartOpt / cuda / Forward |
0.000010112 s |
0.000010432 s |
0.97 |
Concat / DefOpt / cuda / Forward |
0.000010176 s |
0.0000112 s |
0.91 |
Concat / IDefOpt / cuda / Forward |
0.00001008 s |
0.000010112 s |
1.00 |
Concat / JaXPipe / cuda / PreRev |
0.000016448000000000002 s |
0.000016352 s |
1.01 |
Concat / JaXPipe / cuda / PostRev |
0.000016704 s |
0.00001696 s |
0.98 |
Concat / JaXPipe / cuda / BothRev |
0.000014977 s |
0.000016768000000000003 s |
0.89 |
Concat / Jax / cuda / BothRev |
0.000016416 s |
0.000016736 s |
0.98 |
Concat / HLOOpt / cuda / PreRev |
0.000016736 s |
0.000017184 s |
0.97 |
Concat / HLOOpt / cuda / PostRev |
0.00001616 s |
0.000017056 s |
0.95 |
Concat / HLOOpt / cuda / BothRev |
0.000016385 s |
0.000017184 s |
0.95 |
Concat / PartOpt / cuda / PreRev |
0.000016608 s |
0.000017726999999999998 s |
0.94 |
Concat / PartOpt / cuda / PostRev |
0.00001664 s |
0.000016768000000000003 s |
0.99 |
Concat / PartOpt / cuda / BothRev |
0.00001648 s |
0.000017056 s |
0.97 |
Concat / IPartOpt / cuda / PreRev |
0.00001696 s |
0.000017088 s |
0.99 |
Concat / IPartOpt / cuda / PostRev |
0.000016864 s |
0.000017153 s |
0.98 |
Concat / IPartOpt / cuda / BothRev |
0.000016672 s |
0.000017024 s |
0.98 |
Concat / DefOpt / cuda / PreRev |
0.000017055000000000002 s |
0.000016704 s |
1.02 |
Concat / DefOpt / cuda / PostRev |
0.000016672 s |
0.000016448000000000002 s |
1.01 |
Concat / DefOpt / cuda / BothRev |
0.000016831 s |
0.000016704 s |
1.01 |
Concat / IDefOpt / cuda / PreRev |
0.000016864 s |
0.000019008 s |
0.89 |
Concat / IDefOpt / cuda / PostRev |
0.0000168 s |
0.000017056 s |
0.98 |
Concat / IDefOpt / cuda / BothRev |
0.000016991 s |
0.000016352 s |
1.04 |
Concat / JaXPipe / tpu / Primal |
0.000001479775 s |
0.0000015218999999999998 s |
0.97 |
Concat / Jax / tpu / Primal |
0.00000149715 s |
0.000001528325 s |
0.98 |
Concat / HLOOpt / tpu / Primal |
0.000001477775 s |
0.000001530275 s |
0.97 |
Concat / PartOpt / tpu / Primal |
0.0000014964 s |
0.000001534625 s |
0.98 |
Concat / IPartOpt / tpu / Primal |
0.0000014830750000000002 s |
0.00000152225 s |
0.97 |
Concat / DefOpt / tpu / Primal |
0.000001485525 s |
0.000001525125 s |
0.97 |
Concat / IDefOpt / tpu / Primal |
0.000001478 s |
0.000001522775 s |
0.97 |
Concat / JaXPipe / tpu / Forward |
0.00000151395 s |
0.000001564575 s |
0.97 |
Concat / Jax / tpu / Forward |
0.0000015247 s |
0.000001564375 s |
0.97 |
Concat / HLOOpt / tpu / Forward |
0.00000151305 s |
0.000001579 s |
0.96 |
Concat / PartOpt / tpu / Forward |
0.00000152975 s |
0.000001571225 s |
0.97 |
Concat / IPartOpt / tpu / Forward |
0.0000015135750000000002 s |
0.000001578375 s |
0.96 |
Concat / DefOpt / tpu / Forward |
0.000001501675 s |
0.0000015573749999999998 s |
0.96 |
Concat / IDefOpt / tpu / Forward |
0.000001515525 s |
0.00000157315 s |
0.96 |
Concat / JaXPipe / tpu / PreRev |
0.000001999225 s |
0.000002009 s |
1.00 |
Concat / JaXPipe / tpu / PostRev |
0.000002028625 s |
0.00000208995 s |
0.97 |
Concat / JaXPipe / tpu / BothRev |
0.00000197715 s |
0.000001990725 s |
0.99 |
Concat / Jax / tpu / BothRev |
0.0000020058000000000003 s |
0.00000207115 s |
0.97 |
Concat / HLOOpt / tpu / PreRev |
0.000001950575 s |
0.000001998025 s |
0.98 |
Concat / HLOOpt / tpu / PostRev |
0.000002002075 s |
0.000002079825 s |
0.96 |
Concat / HLOOpt / tpu / BothRev |
0.00000195435 s |
0.000001992025 s |
0.98 |
Concat / PartOpt / tpu / PreRev |
0.0000020054 s |
0.000002079425 s |
0.96 |
Concat / PartOpt / tpu / PostRev |
0.00000194965 s |
0.0000020019500000000004 s |
0.97 |
Concat / PartOpt / tpu / BothRev |
0.0000020062 s |
0.000002076725 s |
0.97 |
Concat / IPartOpt / tpu / PreRev |
0.0000019557 s |
0.00000199835 s |
0.98 |
Concat / IPartOpt / tpu / PostRev |
0.0000020022 s |
0.000002073975 s |
0.97 |
Concat / IPartOpt / tpu / BothRev |
0.0000019602 s |
0.00000199655 s |
0.98 |
Concat / DefOpt / tpu / PreRev |
0.00000201435 s |
0.000002067625 s |
0.97 |
Concat / DefOpt / tpu / PostRev |
0.0000019586 s |
0.00000199465 s |
0.98 |
Concat / DefOpt / tpu / BothRev |
0.00000200585 s |
0.000002069275 s |
0.97 |
Concat / IDefOpt / tpu / PreRev |
0.00000195665 s |
0.00000200825 s |
0.97 |
Concat / IDefOpt / tpu / PostRev |
0.000002001275 s |
0.000002074575 s |
0.96 |
Concat / IDefOpt / tpu / BothRev |
0.00000196175 s |
0.000002000675 s |
0.98 |
Concat / JaXPipe / cpu / Primal |
0.000012889 s |
0.00000811730000350508 s |
1.59 |
Concat / Jax / cpu / Primal |
0.000013204 s |
0.000007682900013605832 s |
1.72 |
Concat / HLOOpt / cpu / Primal |
0.000012756 s |
0.00001060113999301393 s |
1.20 |
Concat / PartOpt / cpu / Primal |
0.000012848 s |
0.000007756900004096678 s |
1.66 |
Concat / IPartOpt / cpu / Primal |
0.000012849 s |
0.000007213979988591745 s |
1.78 |
Concat / DefOpt / cpu / Primal |
0.000013003 s |
0.000011252420026721666 s |
1.16 |
Concat / IDefOpt / cpu / Primal |
0.00001262 s |
0.000007434619956256938 s |
1.70 |
Concat / JaXPipe / cpu / Forward |
0.000017874000000000002 s |
0.00001102167998396908 s |
1.62 |
Concat / Jax / cpu / Forward |
0.000017447 s |
0.000011529920011525971 s |
1.51 |
Concat / HLOOpt / cpu / Forward |
0.000017823 s |
0.000015114119942154502 s |
1.18 |
Concat / PartOpt / cpu / Forward |
0.000017641 s |
0.000015529919983237052 s |
1.14 |
Concat / IPartOpt / cpu / Forward |
0.000018216 s |
0.00001101205997656507 s |
1.65 |
Concat / DefOpt / cpu / Forward |
0.000017828 s |
0.00001623475995984336 s |
1.10 |
Concat / IDefOpt / cpu / Forward |
0.000018044 s |
0.000011439519985287916 s |
1.58 |
Concat / JaXPipe / cpu / PreRev |
0.000020415 s |
0.00001323297995440953 s |
1.54 |
Concat / JaXPipe / cpu / PostRev |
0.000019953 s |
0.000013689120060007552 s |
1.46 |
Concat / JaXPipe / cpu / BothRev |
0.000019657 s |
0.00001329070005340327 s |
1.48 |
Concat / Jax / cpu / BothRev |
0.000020352 s |
0.000013045700006841798 s |
1.56 |
Concat / HLOOpt / cpu / PreRev |
0.00001984 s |
0.000013706500003536347 s |
1.45 |
Concat / HLOOpt / cpu / PostRev |
0.000020157 s |
0.000017045479953594622 s |
1.18 |
Concat / HLOOpt / cpu / BothRev |
0.000019415000000000003 s |
0.00001466194001295662 s |
1.32 |
Concat / PartOpt / cpu / PreRev |
0.000019998 s |
0.000013567740024882369 s |
1.47 |
Concat / PartOpt / cpu / PostRev |
0.0000207 s |
0.000013383999985308037 s |
1.55 |
Concat / PartOpt / cpu / BothRev |
0.000021109 s |
0.000013322119930307964 s |
1.58 |
Concat / IPartOpt / cpu / PreRev |
0.000020448 s |
0.000014049620031073571 s |
1.46 |
Concat / IPartOpt / cpu / PostRev |
0.000021175 s |
0.000013057799988018814 s |
1.62 |
Concat / IPartOpt / cpu / BothRev |
0.000021279 s |
0.00001312568005232606 s |
1.62 |
Concat / DefOpt / cpu / PreRev |
0.000020186 s |
0.000012955279962625355 s |
1.56 |
Concat / DefOpt / cpu / PostRev |
0.00002026 s |
0.000013988820019221748 s |
1.45 |
Concat / DefOpt / cpu / BothRev |
0.000019948 s |
0.000013455380039886224 s |
1.48 |
Concat / IDefOpt / cpu / PreRev |
0.000019963 s |
0.000013155779970475124 s |
1.52 |
Concat / IDefOpt / cpu / PostRev |
0.000021359 s |
0.000012839719993280596 s |
1.66 |
Concat / IDefOpt / cpu / BothRev |
0.000021194 s |
0.000014180980006130994 s |
1.49 |
const_scatter / JaXPipe / cpu / Primal |
0.000007045879999623139 s |
0.000007684319989493815 s |
0.92 |
const_scatter / Jax / cpu / Primal |
0.000006954280013360403 s |
0.000007795560031809146 s |
0.89 |
const_scatter / HLOOpt / cpu / Primal |
0.00000660258000380054 s |
0.000008156360036082332 s |
0.81 |
const_scatter / PartOpt / cpu / Primal |
0.000007046820003324683 s |
0.000007077780019244528 s |
1.00 |
const_scatter / IPartOpt / cpu / Primal |
0.000007281340010649729 s |
0.000007127839962777216 s |
1.02 |
const_scatter / DefOpt / cpu / Primal |
0.000007038079993435531 s |
0.00001182172003609594 s |
0.60 |
const_scatter / IDefOpt / cpu / Primal |
0.000006507680000140681 s |
0.000007085820006977883 s |
0.92 |
const_scatter / JaXPipe / cpu / Forward |
0.00001025107998657404 s |
0.00001117912002882804 s |
0.92 |
const_scatter / Jax / cpu / Forward |
0.000010728259994721156 s |
0.000012405520028551109 s |
0.86 |
const_scatter / HLOOpt / cpu / Forward |
0.000014583319998564548 s |
0.000016166460000022198 s |
0.90 |
const_scatter / PartOpt / cpu / Forward |
0.0000151262999975188 s |
0.000015849359979256405 s |
0.95 |
const_scatter / IPartOpt / cpu / Forward |
0.000010512740007015963 s |
0.000010997060026056716 s |
0.96 |
const_scatter / DefOpt / cpu / Forward |
0.000015138839999053745 s |
0.00001573457998347294 s |
0.96 |
const_scatter / IDefOpt / cpu / Forward |
0.000009518999988813447 s |
0.000010618399992381455 s |
0.90 |
const_scatter / JaXPipe / cpu / PreRev |
0.0003046743200138 s |
0.0003040449800209 s |
1.00 |
const_scatter / JaXPipe / cpu / PostRev |
0.0002929268200091 s |
0.0002915093400042 s |
1.00 |
const_scatter / JaXPipe / cpu / BothRev |
0.0002854891600031 s |
0.0002868209999996 s |
1.00 |
const_scatter / Jax / cpu / BothRev |
0.0002860530200018 s |
0.0002858066800308 s |
1.00 |
const_scatter / HLOOpt / cpu / PreRev |
0.0002875375000098 s |
0.0002873173999796 s |
1.00 |
const_scatter / HLOOpt / cpu / PostRev |
0.0002893032800034 s |
0.0002849928999967 s |
1.02 |
const_scatter / HLOOpt / cpu / BothRev |
0.0002882812199959 s |
0.0002872712200041 s |
1.00 |
const_scatter / PartOpt / cpu / PreRev |
0.0002913089600042 s |
0.0002921233799952 s |
1.00 |
const_scatter / PartOpt / cpu / PostRev |
0.0002861352599938 s |
0.000287245599984 s |
1.00 |
const_scatter / PartOpt / cpu / BothRev |
0.0002918820600007 s |
0.0002904328599561 s |
1.00 |
const_scatter / IPartOpt / cpu / PreRev |
0.0002940502000024 s |
0.000291726020041 s |
1.01 |
const_scatter / IPartOpt / cpu / PostRev |
0.0002869377000024 s |
0.0002852017199893 s |
1.01 |
const_scatter / IPartOpt / cpu / BothRev |
0.0002926235800146 s |
0.000291159999997 s |
1.01 |
const_scatter / DefOpt / cpu / PreRev |
0.0002915703600024 s |
0.0002904347400271 s |
1.00 |
const_scatter / DefOpt / cpu / PostRev |
0.0002863039400062 s |
0.0002867657000024 s |
1.00 |
const_scatter / DefOpt / cpu / BothRev |
0.0002901643400059 s |
0.0002920737600197 s |
0.99 |
const_scatter / IDefOpt / cpu / PreRev |
0.0002911594799957 s |
0.0002910701800101 s |
1.00 |
const_scatter / IDefOpt / cpu / PostRev |
0.0002891495400081 s |
0.000286926560002 s |
1.01 |
const_scatter / IDefOpt / cpu / BothRev |
0.0002927561200021 s |
0.0003165261199501 s |
0.92 |
const_scatter / JaXPipe / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
const_scatter / Jax / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
const_scatter / HLOOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
const_scatter / PartOpt / cuda / Primal |
0.000001919 s |
0.0000019200000000000003 s |
1.00 |
const_scatter / IPartOpt / cuda / Primal |
0.000001919 s |
0.000001887 s |
1.02 |
const_scatter / DefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
const_scatter / IDefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.000001919 s |
1.00 |
const_scatter / JaXPipe / cuda / Forward |
0.000009824 s |
0.000011103 s |
0.88 |
const_scatter / Jax / cuda / Forward |
0.000009888 s |
0.000009856 s |
1.00 |
const_scatter / HLOOpt / cuda / Forward |
0.000009472 s |
0.00001008 s |
0.94 |
const_scatter / PartOpt / cuda / Forward |
0.000009952 s |
0.00001008 s |
0.99 |
const_scatter / IPartOpt / cuda / Forward |
0.000009952 s |
0.000009472 s |
1.05 |
const_scatter / DefOpt / cuda / Forward |
0.000009856 s |
0.000010176 s |
0.97 |
const_scatter / IDefOpt / cuda / Forward |
0.000009695 s |
0.000010015 s |
0.97 |
const_scatter / JaXPipe / cuda / PreRev |
0.000012736 s |
0.000012863 s |
0.99 |
const_scatter / JaXPipe / cuda / PostRev |
0.000016670999999999997 s |
0.000016927999999999998 s |
0.98 |
const_scatter / JaXPipe / cuda / BothRev |
0.000017760000000000003 s |
0.00001296 s |
1.37 |
const_scatter / Jax / cuda / BothRev |
0.000016672 s |
0.000016896000000000002 s |
0.99 |
const_scatter / HLOOpt / cuda / PreRev |
0.000012513 s |
0.0000128 s |
0.98 |
const_scatter / HLOOpt / cuda / PostRev |
0.00001328 s |
0.000013152 s |
1.01 |
const_scatter / HLOOpt / cuda / BothRev |
0.000012544 s |
0.000012832 s |
0.98 |
const_scatter / PartOpt / cuda / PreRev |
0.000012864 s |
0.000013056 s |
0.99 |
const_scatter / PartOpt / cuda / PostRev |
0.000016255999999999998 s |
0.000017536 s |
0.93 |
const_scatter / PartOpt / cuda / BothRev |
0.000012416 s |
0.0000128 s |
0.97 |
const_scatter / IPartOpt / cuda / PreRev |
0.000012865 s |
0.000012704 s |
1.01 |
const_scatter / IPartOpt / cuda / PostRev |
0.000016512 s |
0.000016929 s |
0.98 |
const_scatter / IPartOpt / cuda / BothRev |
0.000012768 s |
0.000012736 s |
1.00 |
const_scatter / DefOpt / cuda / PreRev |
0.000013472 s |
0.00001312 s |
1.03 |
const_scatter / DefOpt / cuda / PostRev |
0.000012768 s |
0.00001296 s |
0.99 |
const_scatter / DefOpt / cuda / BothRev |
0.000012896 s |
0.000012832 s |
1.00 |
const_scatter / IDefOpt / cuda / PreRev |
0.000013024 s |
0.000012832 s |
1.01 |
const_scatter / IDefOpt / cuda / PostRev |
0.000012288 s |
0.000013408 s |
0.92 |
const_scatter / IDefOpt / cuda / BothRev |
0.0000128 s |
0.00001312 s |
0.98 |
const_scatter / JaXPipe / tpu / Primal |
0.000003806775 s |
0.00000382305 s |
1.00 |
const_scatter / Jax / tpu / Primal |
0.0000038291750000000005 s |
0.000003831675 s |
1.00 |
const_scatter / HLOOpt / tpu / Primal |
9.30625e-7 s |
9.5125e-7 s |
0.98 |
const_scatter / PartOpt / tpu / Primal |
0.000003834125 s |
0.000003822175 s |
1.00 |
const_scatter / IPartOpt / tpu / Primal |
0.000003829475 s |
0.00000379315 s |
1.01 |
const_scatter / DefOpt / tpu / Primal |
9.71025e-7 s |
9.58825e-7 s |
1.01 |
const_scatter / IDefOpt / tpu / Primal |
9.30375e-7 s |
9.43525e-7 s |
0.99 |
const_scatter / JaXPipe / tpu / Forward |
0.00000192845 s |
0.00000192225 s |
1.00 |
const_scatter / Jax / tpu / Forward |
0.00000648755 s |
0.000006482375 s |
1.00 |
const_scatter / HLOOpt / tpu / Forward |
0.0000019194 s |
0.00000191365 s |
1.00 |
const_scatter / PartOpt / tpu / Forward |
0.000001911225 s |
0.000001931925 s |
0.99 |
const_scatter / IPartOpt / tpu / Forward |
0.000001921225 s |
0.000001927125 s |
1.00 |
const_scatter / DefOpt / tpu / Forward |
0.0000019204 s |
0.000001929975 s |
1.00 |
const_scatter / IDefOpt / tpu / Forward |
0.00000190595 s |
0.00000191915 s |
0.99 |
const_scatter / JaXPipe / tpu / PreRev |
0.000004299725 s |
0.000004329525 s |
0.99 |
const_scatter / JaXPipe / tpu / PostRev |
0.000006683200000000001 s |
0.000006649024999999999 s |
1.01 |
const_scatter / JaXPipe / tpu / BothRev |
0.0000042966000000000005 s |
0.000004307675 s |
1.00 |
const_scatter / Jax / tpu / BothRev |
0.0000066769750000000006 s |
0.0000066769 s |
1.00 |
const_scatter / HLOOpt / tpu / PreRev |
0.000004304425 s |
0.000004301700000000001 s |
1.00 |
const_scatter / HLOOpt / tpu / PostRev |
0.000004307125 s |
0.000004335525 s |
0.99 |
const_scatter / HLOOpt / tpu / BothRev |
0.0000042995 s |
0.0000043162 s |
1.00 |
const_scatter / PartOpt / tpu / PreRev |
0.000004320025 s |
0.000004306499999999999 s |
1.00 |
const_scatter / PartOpt / tpu / PostRev |
0.000006662924999999999 s |
0.000006660475 s |
1.00 |
const_scatter / PartOpt / tpu / BothRev |
0.000004295625 s |
0.00000432115 s |
0.99 |
const_scatter / IPartOpt / tpu / PreRev |
0.0000043193 s |
0.000004300775 s |
1.00 |
const_scatter / IPartOpt / tpu / PostRev |
0.0000066737 s |
0.0000066653 s |
1.00 |
const_scatter / IPartOpt / tpu / BothRev |
0.000004296725000000001 s |
0.00000431405 s |
1.00 |
const_scatter / DefOpt / tpu / PreRev |
0.000004318 s |
0.0000043047 s |
1.00 |
const_scatter / DefOpt / tpu / PostRev |
0.00000430735 s |
0.00000429905 s |
1.00 |
const_scatter / DefOpt / tpu / BothRev |
0.0000043085 s |
0.000004318675 s |
1.00 |
const_scatter / IDefOpt / tpu / PreRev |
0.000004308975 s |
0.0000043148 s |
1.00 |
const_scatter / IDefOpt / tpu / PostRev |
0.00000433655 s |
0.0000043003 s |
1.01 |
const_scatter / IDefOpt / tpu / BothRev |
0.0000043062500000000005 s |
0.0000043025 s |
1.00 |
const_scatter / JaXPipe / cpu / Primal |
0.000012664 s |
0.000007684319989493815 s |
1.65 |
const_scatter / Jax / cpu / Primal |
0.000013181999999999998 s |
0.000007795560031809146 s |
1.69 |
const_scatter / HLOOpt / cpu / Primal |
0.000013185 s |
0.000008156360036082332 s |
1.62 |
const_scatter / PartOpt / cpu / Primal |
0.000012721 s |
0.000007077780019244528 s |
1.80 |
const_scatter / IPartOpt / cpu / Primal |
0.000012458 s |
0.000007127839962777216 s |
1.75 |
const_scatter / DefOpt / cpu / Primal |
0.000012613 s |
0.00001182172003609594 s |
1.07 |
const_scatter / IDefOpt / cpu / Primal |
0.000012549 s |
0.000007085820006977883 s |
1.77 |
const_scatter / JaXPipe / cpu / Forward |
0.000017135 s |
0.00001117912002882804 s |
1.53 |
const_scatter / Jax / cpu / Forward |
0.00001685 s |
0.000012405520028551109 s |
1.36 |
const_scatter / HLOOpt / cpu / Forward |
0.000016930000000000002 s |
0.000016166460000022198 s |
1.05 |
const_scatter / PartOpt / cpu / Forward |
0.000016667 s |
0.000015849359979256405 s |
1.05 |
const_scatter / IPartOpt / cpu / Forward |
0.000017159 s |
0.000010997060026056716 s |
1.56 |
const_scatter / DefOpt / cpu / Forward |
0.000017054 s |
0.00001573457998347294 s |
1.08 |
const_scatter / IDefOpt / cpu / Forward |
0.000017233999999999998 s |
0.000010618399992381455 s |
1.62 |
const_scatter / JaXPipe / cpu / PreRev |
0.00051675 s |
0.0003040449800209 s |
1.70 |
const_scatter / JaXPipe / cpu / PostRev |
0.000530804 s |
0.0002915093400042 s |
1.82 |
const_scatter / JaXPipe / cpu / BothRev |
0.00052207 s |
0.0002868209999996 s |
1.82 |
const_scatter / Jax / cpu / BothRev |
0.000520012 s |
0.0002858066800308 s |
1.82 |
const_scatter / HLOOpt / cpu / PreRev |
0.000502599 s |
0.0002873173999796 s |
1.75 |
const_scatter / HLOOpt / cpu / PostRev |
0.000531446 s |
0.0002849928999967 s |
1.86 |
const_scatter / HLOOpt / cpu / BothRev |
0.000509983 s |
0.0002872712200041 s |
1.78 |
const_scatter / PartOpt / cpu / PreRev |
0.0005188739999999 s |
0.0002921233799952 s |
1.78 |
const_scatter / PartOpt / cpu / PostRev |
0.000525949 s |
0.000287245599984 s |
1.83 |
const_scatter / PartOpt / cpu / BothRev |
0.0005083069999999 s |
0.0002904328599561 s |
1.75 |
const_scatter / IPartOpt / cpu / PreRev |
0.000507237 s |
0.000291726020041 s |
1.74 |
const_scatter / IPartOpt / cpu / PostRev |
0.000522571 s |
0.0002852017199893 s |
1.83 |
const_scatter / IPartOpt / cpu / BothRev |
0.000520849 s |
0.000291159999997 s |
1.79 |
const_scatter / DefOpt / cpu / PreRev |
0.0005385149999999 s |
0.0002904347400271 s |
1.85 |
const_scatter / DefOpt / cpu / PostRev |
0.0005356779999999 s |
0.0002867657000024 s |
1.87 |
const_scatter / DefOpt / cpu / BothRev |
0.000520829 s |
0.0002920737600197 s |
1.78 |
const_scatter / IDefOpt / cpu / PreRev |
0.000524497 s |
0.0002910701800101 s |
1.80 |
const_scatter / IDefOpt / cpu / PostRev |
0.000536962 s |
0.000286926560002 s |
1.87 |
const_scatter / IDefOpt / cpu / BothRev |
0.000523844 s |
0.0003165261199501 s |
1.65 |
GenDot / JaXPipe / cpu / Primal |
0.000007764539996060193 s |
0.000008797720001894049 s |
0.88 |
GenDot / Jax / cpu / Primal |
0.000006740919995991135 s |
0.000008127900009640144 s |
0.83 |
GenDot / HLOOpt / cpu / Primal |
0.000012684460000400576 s |
0.000012815599984605796 s |
0.99 |
GenDot / PartOpt / cpu / Primal |
0.000007511799999520008 s |
0.000008449520009889966 s |
0.89 |
GenDot / IPartOpt / cpu / Primal |
0.000007805919999555044 s |
0.000008597039959568065 s |
0.91 |
GenDot / DefOpt / cpu / Primal |
0.000007267240000601305 s |
0.000009416619950570747 s |
0.77 |
GenDot / IDefOpt / cpu / Primal |
0.00000814504001482419 s |
0.000009196099981636509 s |
0.89 |
GenDot / JaXPipe / cpu / Forward |
0.000011676700005409657 s |
0.000012595799989867374 s |
0.93 |
GenDot / Jax / cpu / Forward |
0.000011092660001850162 s |
0.000011937699982809135 s |
0.93 |
GenDot / HLOOpt / cpu / Forward |
0.000016936820004502805 s |
0.000016989840050882777 s |
1.00 |
GenDot / PartOpt / cpu / Forward |
0.00001217976000816634 s |
0.0000176848600312951 s |
0.69 |
GenDot / IPartOpt / cpu / Forward |
0.00001097131999586054 s |
0.000012526099990282091 s |
0.88 |
GenDot / DefOpt / cpu / Forward |
0.000017099699987284113 s |
0.000017279039984714472 s |
0.99 |
GenDot / IDefOpt / cpu / Forward |
0.000010955619995911548 s |
0.000012846839963458478 s |
0.85 |
GenDot / JaXPipe / cpu / PreRev |
0.000012257100008810084 s |
0.000012676279966399309 s |
0.97 |
GenDot / JaXPipe / cpu / PostRev |
0.000010664799997357476 s |
0.000011538660028236335 s |
0.92 |
GenDot / JaXPipe / cpu / BothRev |
0.000011642160015981064 s |
0.000016753639984017353 s |
0.69 |
GenDot / Jax / cpu / BothRev |
0.000010723319996941428 s |
0.000011219420048291796 s |
0.96 |
GenDot / HLOOpt / cpu / PreRev |
0.000012119660011649104 s |
0.000012427280007614171 s |
0.98 |
GenDot / HLOOpt / cpu / PostRev |
0.00001509053999598109 s |
0.000012331320021985448 s |
1.22 |
GenDot / HLOOpt / cpu / BothRev |
0.000013554020003994085 s |
0.000013908260016251006 s |
0.97 |
GenDot / PartOpt / cpu / PreRev |
0.000011333700003888225 s |
0.000011689480024870137 s |
0.97 |
GenDot / PartOpt / cpu / PostRev |
0.000010589520011308195 s |
0.000011357859930285486 s |
0.93 |
GenDot / PartOpt / cpu / BothRev |
0.000010748720007995871 s |
0.0000118991400449886 s |
0.90 |
GenDot / IPartOpt / cpu / PreRev |
0.000017186180009503005 s |
0.000011938199986616382 s |
1.44 |
GenDot / IPartOpt / cpu / PostRev |
0.000010466019994055386 s |
0.000011882080034411048 s |
0.88 |
GenDot / IPartOpt / cpu / BothRev |
0.000011465680001947476 s |
0.000011801500022556866 s |
0.97 |
GenDot / DefOpt / cpu / PreRev |
0.000011654240004190798 s |
0.000012377880011626983 s |
0.94 |
GenDot / DefOpt / cpu / PostRev |
0.000010896979993049173 s |
0.00001185160002023622 s |
0.92 |
GenDot / DefOpt / cpu / BothRev |
0.000011346240003149432 s |
0.000011738000039258622 s |
0.97 |
GenDot / IDefOpt / cpu / PreRev |
0.000011781420009810973 s |
0.000012479819988584496 s |
0.94 |
GenDot / IDefOpt / cpu / PostRev |
0.00001154707999603488 s |
0.00001180088001092372 s |
0.98 |
GenDot / IDefOpt / cpu / BothRev |
0.000010908919996381885 s |
0.000012278259991944652 s |
0.89 |
GenDot / JaXPipe / cuda / Primal |
0.000002016 s |
0.000002047 s |
0.98 |
GenDot / Jax / cuda / Primal |
0.000002016 s |
0.000002047 s |
0.98 |
GenDot / HLOOpt / cuda / Primal |
0.000002015 s |
0.000002015 s |
1 |
GenDot / PartOpt / cuda / Primal |
0.000002016 s |
0.000002016 s |
1 |
GenDot / IPartOpt / cuda / Primal |
0.000002016 s |
0.000002047 s |
0.98 |
GenDot / DefOpt / cuda / Primal |
0.000002015 s |
0.000002015 s |
1 |
GenDot / IDefOpt / cuda / Primal |
0.000002016 s |
0.000002016 s |
1 |
GenDot / JaXPipe / cuda / Forward |
0.000010304 s |
0.000010144 s |
1.02 |
GenDot / Jax / cuda / Forward |
0.000009856 s |
0.000010336 s |
0.95 |
GenDot / HLOOpt / cuda / Forward |
0.000010208 s |
0.000010016 s |
1.02 |
GenDot / PartOpt / cuda / Forward |
0.00000992 s |
0.000010272 s |
0.97 |
GenDot / IPartOpt / cuda / Forward |
0.00001024 s |
0.000009984 s |
1.03 |
GenDot / DefOpt / cuda / Forward |
0.00001008 s |
0.000010592 s |
0.95 |
GenDot / IDefOpt / cuda / Forward |
0.000010176 s |
0.000010111 s |
1.01 |
GenDot / JaXPipe / cuda / PreRev |
0.000010113 s |
0.000010048 s |
1.01 |
GenDot / JaXPipe / cuda / PostRev |
0.000010336 s |
0.000010592 s |
0.98 |
GenDot / JaXPipe / cuda / BothRev |
0.000010112 s |
0.000010176 s |
0.99 |
GenDot / Jax / cuda / BothRev |
0.000009984 s |
0.000009824 s |
1.02 |
GenDot / HLOOpt / cuda / PreRev |
0.000009952 s |
0.000010111 s |
0.98 |
GenDot / HLOOpt / cuda / PostRev |
0.00000944 s |
0.000010048 s |
0.94 |
GenDot / HLOOpt / cuda / BothRev |
0.000009503 s |
0.000010369 s |
0.92 |
GenDot / PartOpt / cuda / PreRev |
0.00001024 s |
0.00001008 s |
1.02 |
GenDot / PartOpt / cuda / PostRev |
0.000010144 s |
0.000010464 s |
0.97 |
GenDot / PartOpt / cuda / BothRev |
0.000010048 s |
0.000010176 s |
0.99 |
GenDot / IPartOpt / cuda / PreRev |
0.000010048 s |
0.000010271 s |
0.98 |
GenDot / IPartOpt / cuda / PostRev |
0.000010176 s |
0.000010336 s |
0.98 |
GenDot / IPartOpt / cuda / BothRev |
0.000010688 s |
0.000010143 s |
1.05 |
GenDot / DefOpt / cuda / PreRev |
0.000010528 s |
0.0000104 s |
1.01 |
GenDot / DefOpt / cuda / PostRev |
0.000010048 s |
0.000010144 s |
0.99 |
GenDot / DefOpt / cuda / BothRev |
0.000009759 s |
0.00000976 s |
1.00 |
GenDot / IDefOpt / cuda / PreRev |
0.000010112 s |
0.000010208 s |
0.99 |
GenDot / IDefOpt / cuda / PostRev |
0.000010049 s |
0.000010176 s |
0.99 |
GenDot / IDefOpt / cuda / BothRev |
0.000009985 s |
0.000009984 s |
1.00 |
GenDot / JaXPipe / tpu / Primal |
9.21075e-7 s |
9.30075e-7 s |
0.99 |
GenDot / Jax / tpu / Primal |
9.403e-7 s |
9.3635e-7 s |
1.00 |
GenDot / HLOOpt / tpu / Primal |
0.00000161355 s |
0.00000158145 s |
1.02 |
GenDot / PartOpt / tpu / Primal |
9.40025e-7 s |
9.3585e-7 s |
1.00 |
GenDot / IPartOpt / tpu / Primal |
9.8745e-7 s |
9.3965e-7 s |
1.05 |
GenDot / DefOpt / tpu / Primal |
0.0000015063499999999998 s |
0.0000014923500000000002 s |
1.01 |
GenDot / IDefOpt / tpu / Primal |
0.0000016167500000000002 s |
0.000001577 s |
1.03 |
GenDot / JaXPipe / tpu / Forward |
0.000003060425 s |
0.000003169925 s |
0.97 |
GenDot / Jax / tpu / Forward |
0.0000023204 s |
0.0000023474 s |
0.99 |
GenDot / HLOOpt / tpu / Forward |
0.00000311285 s |
0.0000031217 s |
1.00 |
GenDot / PartOpt / tpu / Forward |
0.000003116075 s |
0.000003223 s |
0.97 |
GenDot / IPartOpt / tpu / Forward |
0.000003120175 s |
0.00000312165 s |
1.00 |
GenDot / DefOpt / tpu / Forward |
0.0000031089 s |
0.0000032237 s |
0.96 |
GenDot / IDefOpt / tpu / Forward |
0.00000311095 s |
0.000003127325 s |
0.99 |
GenDot / JaXPipe / tpu / PreRev |
0.000002941875 s |
0.000002970675 s |
0.99 |
GenDot / JaXPipe / tpu / PostRev |
0.000002382775 s |
0.000002402475 s |
0.99 |
GenDot / JaXPipe / tpu / BothRev |
0.000002959025 s |
0.000002967475 s |
1.00 |
GenDot / Jax / tpu / BothRev |
0.0000023764250000000003 s |
0.000002400975 s |
0.99 |
GenDot / HLOOpt / tpu / PreRev |
0.000002957475 s |
0.0000029706 s |
1.00 |
GenDot / HLOOpt / tpu / PostRev |
0.0000029856 s |
0.000002942875 s |
1.01 |
GenDot / HLOOpt / tpu / BothRev |
0.000002950825 s |
0.000002962125 s |
1.00 |
GenDot / PartOpt / tpu / PreRev |
0.0000029891000000000003 s |
0.00000294405 s |
1.02 |
GenDot / PartOpt / tpu / PostRev |
0.0000024054000000000003 s |
0.0000023922750000000003 s |
1.01 |
GenDot / PartOpt / tpu / BothRev |
0.000002982175 s |
0.0000029394500000000003 s |
1.01 |
GenDot / IPartOpt / tpu / PreRev |
0.000002947325 s |
0.000002957075 s |
1.00 |
GenDot / IPartOpt / tpu / PostRev |
0.0000023784 s |
0.000002408875 s |
0.99 |
GenDot / IPartOpt / tpu / BothRev |
0.000002957175 s |
0.000002964075 s |
1.00 |
GenDot / DefOpt / tpu / PreRev |
0.000002976075 s |
0.00000294815 s |
1.01 |
GenDot / DefOpt / tpu / PostRev |
0.000002950525 s |
0.0000029699500000000003 s |
0.99 |
GenDot / DefOpt / tpu / BothRev |
0.000002979625 s |
0.000002940925 s |
1.01 |
GenDot / IDefOpt / tpu / PreRev |
0.000002950825 s |
0.000002966275 s |
0.99 |
GenDot / IDefOpt / tpu / PostRev |
0.000002976075 s |
0.00000293485 s |
1.01 |
GenDot / IDefOpt / tpu / BothRev |
0.0000029458250000000003 s |
0.000002959825 s |
1.00 |
GenDot / JaXPipe / cpu / Primal |
0.000014801 s |
0.000008797720001894049 s |
1.68 |
GenDot / Jax / cpu / Primal |
0.000014984 s |
0.000008127900009640144 s |
1.84 |
GenDot / HLOOpt / cpu / Primal |
0.00002134 s |
0.000012815599984605796 s |
1.67 |
GenDot / PartOpt / cpu / Primal |
0.000014863 s |
0.000008449520009889966 s |
1.76 |
GenDot / IPartOpt / cpu / Primal |
0.00001493 s |
0.000008597039959568065 s |
1.74 |
GenDot / DefOpt / cpu / Primal |
0.000013718 s |
0.000009416619950570747 s |
1.46 |
GenDot / IDefOpt / cpu / Primal |
0.000014323 s |
0.000009196099981636509 s |
1.56 |
GenDot / JaXPipe / cpu / Forward |
0.000019645 s |
0.000012595799989867374 s |
1.56 |
GenDot / Jax / cpu / Forward |
0.00002098 s |
0.000011937699982809135 s |
1.76 |
GenDot / HLOOpt / cpu / Forward |
0.000019521 s |
0.000016989840050882777 s |
1.15 |
GenDot / PartOpt / cpu / Forward |
0.000019441 s |
0.0000176848600312951 s |
1.10 |
GenDot / IPartOpt / cpu / Forward |
0.000019649 s |
0.000012526099990282091 s |
1.57 |
GenDot / DefOpt / cpu / Forward |
0.000019512 s |
0.000017279039984714472 s |
1.13 |
GenDot / IDefOpt / cpu / Forward |
0.000019137 s |
0.000012846839963458478 s |
1.49 |
GenDot / JaXPipe / cpu / PreRev |
0.000019489 s |
0.000012676279966399309 s |
1.54 |
GenDot / JaXPipe / cpu / PostRev |
0.000021278 s |
0.000011538660028236335 s |
1.84 |
GenDot / JaXPipe / cpu / BothRev |
0.000019813 s |
0.000016753639984017353 s |
1.18 |
GenDot / Jax / cpu / BothRev |
0.00002093 s |
0.000011219420048291796 s |
1.87 |
GenDot / HLOOpt / cpu / PreRev |
0.000018535 s |
0.000012427280007614171 s |
1.49 |
GenDot / HLOOpt / cpu / PostRev |
0.000019883000000000003 s |
0.000012331320021985448 s |
1.61 |
GenDot / HLOOpt / cpu / BothRev |
0.000019256 s |
0.000013908260016251006 s |
1.38 |
GenDot / PartOpt / cpu / PreRev |
0.000019207 s |
0.000011689480024870137 s |
1.64 |
GenDot / PartOpt / cpu / PostRev |
0.000021351 s |
0.000011357859930285486 s |
1.88 |
GenDot / PartOpt / cpu / BothRev |
0.000019636 s |
0.0000118991400449886 s |
1.65 |
GenDot / IPartOpt / cpu / PreRev |
0.000019319 s |
0.000011938199986616382 s |
1.62 |
GenDot / IPartOpt / cpu / PostRev |
0.000021175 s |
0.000011882080034411048 s |
1.78 |
GenDot / IPartOpt / cpu / BothRev |
0.000019304 s |
0.000011801500022556866 s |
1.64 |
GenDot / DefOpt / cpu / PreRev |
0.000019788 s |
0.000012377880011626983 s |
1.60 |
GenDot / DefOpt / cpu / PostRev |
0.000020156 s |
0.00001185160002023622 s |
1.70 |
GenDot / DefOpt / cpu / BothRev |
0.000019505 s |
0.000011738000039258622 s |
1.66 |
GenDot / IDefOpt / cpu / PreRev |
0.000019741000000000003 s |
0.000012479819988584496 s |
1.58 |
GenDot / IDefOpt / cpu / PostRev |
0.000020126 s |
0.00001180088001092372 s |
1.71 |
GenDot / IDefOpt / cpu / BothRev |
0.000019763 s |
0.000012278259991944652 s |
1.61 |
hlo_ffi / JaXPipe / cpu / Primal |
0.00001108872000258998 s |
0.000010822440017363988 s |
1.02 |
hlo_ffi / Jax / cpu / Primal |
0.000010399159993994544 s |
0.000010285600001225249 s |
1.01 |
hlo_ffi / HLOOpt / cpu / Primal |
0.00001432895999187167 s |
0.000013866080007574056 s |
1.03 |
hlo_ffi / PartOpt / cpu / Primal |
0.000009614979999241767 s |
0.000009819259985306416 s |
0.98 |
hlo_ffi / IPartOpt / cpu / Primal |
0.00001039095999658457 s |
0.000010091600024679792 s |
1.03 |
hlo_ffi / DefOpt / cpu / Primal |
0.000013756320008724287 s |
0.000010069340023619588 s |
1.37 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000010297940004875272 s |
0.000009854680001808449 s |
1.04 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000014509259999613278 s |
0.000014822959992670804 s |
0.98 |
hlo_ffi / Jax / cpu / Forward |
0.000014524600003369414 s |
0.000015186000036919722 s |
0.96 |
hlo_ffi / HLOOpt / cpu / Forward |
0.00001482992000546801 s |
0.00001533519999611599 s |
0.97 |
hlo_ffi / PartOpt / cpu / Forward |
0.000014599199998883705 s |
0.000015111639986571393 s |
0.97 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000014593340006285871 s |
0.000014500600018436671 s |
1.01 |
hlo_ffi / DefOpt / cpu / Forward |
0.000014727720008522738 s |
0.000014748540043001412 s |
1.00 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000014917699993475252 s |
0.000014836160007689612 s |
1.01 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000014350100004776322 s |
0.0000182156799655786 s |
0.79 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000014715439999690715 s |
0.000014539180056090117 s |
1.01 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.00001442673999690669 s |
0.000014412620002985931 s |
1.00 |
hlo_ffi / Jax / cpu / BothRev |
0.00001419042000634363 s |
0.000015165359964157688 s |
0.94 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000014442780000081256 s |
0.000014253160006774124 s |
1.01 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000014407880003091125 s |
0.000014273299966589549 s |
1.01 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000016216299998177418 s |
0.00001619226000912022 s |
1.00 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000014471740000772116 s |
0.000014413839962799104 s |
1.00 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000014559339992956666 s |
0.000014372299983733682 s |
1.01 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000014645680005287432 s |
0.000014556479991369995 s |
1.01 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.00001413834000004499 s |
0.00001478366000810638 s |
0.96 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000014425780011606547 s |
0.000014571639958376182 s |
0.99 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000014231559994186682 s |
0.00001460164001400699 s |
0.97 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000014671740007088376 s |
0.000014733459993294671 s |
1.00 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000013581799996700284 s |
0.000014538120021825309 s |
0.93 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000014569300003586247 s |
0.000014464760015471256 s |
1.01 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000014410839994525305 s |
0.000014284580047387862 s |
1.01 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000014940240002943029 s |
0.000014173599975038087 s |
1.05 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000013810539994665306 s |
0.00001527166002233571 s |
0.90 |
hlo_ffi / JaXPipe / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / Jax / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / HLOOpt / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / PartOpt / cuda / Primal |
0.000002015 s |
0.000002015 s |
1 |
hlo_ffi / IPartOpt / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / DefOpt / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / IDefOpt / cuda / Primal |
0.000001983 s |
0.000002015 s |
0.98 |
hlo_ffi / JaXPipe / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / Jax / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / HLOOpt / cuda / Forward |
0.000002079 s |
0.00000208 s |
1.00 |
hlo_ffi / PartOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / IPartOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / DefOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / IDefOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / JaXPipe / cuda / PreRev |
0.00000208 s |
0.000002047 s |
1.02 |
hlo_ffi / JaXPipe / cuda / PostRev |
0.000002079 s |
0.000002047 s |
1.02 |
hlo_ffi / JaXPipe / cuda / BothRev |
0.00000208 s |
0.000002079 s |
1.00 |
hlo_ffi / Jax / cuda / BothRev |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / HLOOpt / cuda / PreRev |
0.000002079 s |
0.00000208 s |
1.00 |
hlo_ffi / HLOOpt / cuda / PostRev |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / HLOOpt / cuda / BothRev |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / PartOpt / cuda / PreRev |
0.00000208 s |
0.000002079 s |
1.00 |
hlo_ffi / PartOpt / cuda / PostRev |
0.000002048 s |
0.00000208 s |
0.98 |
hlo_ffi / PartOpt / cuda / BothRev |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / IPartOpt / cuda / PreRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / IPartOpt / cuda / PostRev |
0.000002048 s |
0.00000208 s |
0.98 |
hlo_ffi / IPartOpt / cuda / BothRev |
0.00000208 s |
0.000002048 s |
1.02 |
hlo_ffi / DefOpt / cuda / PreRev |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / DefOpt / cuda / PostRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / DefOpt / cuda / BothRev |
0.000002079 s |
0.00000208 s |
1.00 |
hlo_ffi / IDefOpt / cuda / PreRev |
0.00000208 s |
0.000002048 s |
1.02 |
hlo_ffi / IDefOpt / cuda / PostRev |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / IDefOpt / cuda / BothRev |
0.000002048 s |
0.00000208 s |
0.98 |
hlo_ffi / JaXPipe / tpu / Primal |
9.1995e-7 s |
9.2325e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Primal |
9.49425e-7 s |
9.5055e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Primal |
8.966250000000001e-7 s |
8.98425e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Primal |
9.5055e-7 s |
9.60775e-7 s |
0.99 |
hlo_ffi / IPartOpt / tpu / Primal |
9.029e-7 s |
9.03925e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Primal |
9.53025e-7 s |
9.5335e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Primal |
9.004e-7 s |
8.985250000000001e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / Forward |
9.4885e-7 s |
9.49575e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Forward |
9.81925e-7 s |
9.8175e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Forward |
9.739e-7 s |
9.74075e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Forward |
9.343e-7 s |
9.33825e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / Forward |
9.73775e-7 s |
9.73925e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Forward |
9.33725e-7 s |
9.339e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Forward |
9.742e-7 s |
9.7405e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PreRev |
9.32325e-7 s |
9.32325e-7 s |
1 |
hlo_ffi / JaXPipe / tpu / PostRev |
9.64325e-7 s |
9.6485e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / BothRev |
9.5975e-7 s |
9.59925e-7 s |
1.00 |
hlo_ffi / Jax / tpu / BothRev |
9.65e-7 s |
9.65e-7 s |
1 |
hlo_ffi / HLOOpt / tpu / PreRev |
9.603e-7 s |
9.60075e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PostRev |
9.649e-7 s |
9.6455e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / BothRev |
9.6025e-7 s |
9.60425e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PreRev |
9.64575e-7 s |
9.65525e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PostRev |
9.59775e-7 s |
9.6045e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / BothRev |
9.64625e-7 s |
9.6435e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PreRev |
9.596250000000002e-7 s |
9.6025e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PostRev |
9.64675e-7 s |
9.650749999999998e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / BothRev |
9.6025e-7 s |
9.602e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PreRev |
9.65e-7 s |
9.64625e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PostRev |
9.59975e-7 s |
9.6075e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / BothRev |
9.647e-7 s |
9.65325e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PreRev |
9.596250000000002e-7 s |
9.606e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PostRev |
9.65025e-7 s |
9.65225e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / BothRev |
9.59925e-7 s |
9.600250000000002e-7 s |
1.00 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000017979 s |
0.000010822440017363988 s |
1.66 |
hlo_ffi / Jax / cpu / Primal |
0.000017888000000000002 s |
0.000010285600001225249 s |
1.74 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000017664 s |
0.000013866080007574056 s |
1.27 |
hlo_ffi / PartOpt / cpu / Primal |
0.000017627 s |
0.000009819259985306416 s |
1.80 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000017806 s |
0.000010091600024679792 s |
1.76 |
hlo_ffi / DefOpt / cpu / Primal |
0.000017433 s |
0.000010069340023619588 s |
1.73 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000017537 s |
0.000009854680001808449 s |
1.78 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000025136 s |
0.000014822959992670804 s |
1.70 |
hlo_ffi / Jax / cpu / Forward |
0.000025237 s |
0.000015186000036919722 s |
1.66 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000024677 s |
0.00001533519999611599 s |
1.61 |
hlo_ffi / PartOpt / cpu / Forward |
0.00002493 s |
0.000015111639986571393 s |
1.65 |
hlo_ffi / IPartOpt / cpu / Forward |
0.00002467 s |
0.000014500600018436671 s |
1.70 |
hlo_ffi / DefOpt / cpu / Forward |
0.000024377 s |
0.000014748540043001412 s |
1.65 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000024297 s |
0.000014836160007689612 s |
1.64 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000024492 s |
0.0000182156799655786 s |
1.34 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000025612 s |
0.000014539180056090117 s |
1.76 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000025452 s |
0.000014412620002985931 s |
1.77 |
hlo_ffi / Jax / cpu / BothRev |
0.000025201 s |
0.000015165359964157688 s |
1.66 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.0000248 s |
0.000014253160006774124 s |
1.74 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000025613 s |
0.000014273299966589549 s |
1.79 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000024749 s |
0.00001619226000912022 s |
1.53 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000025257 s |
0.000014413839962799104 s |
1.75 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000024803 s |
0.000014372299983733682 s |
1.73 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000024518 s |
0.000014556479991369995 s |
1.68 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000025363 s |
0.00001478366000810638 s |
1.72 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000026213 s |
0.000014571639958376182 s |
1.80 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000024919 s |
0.00001460164001400699 s |
1.71 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000024997 s |
0.000014733459993294671 s |
1.70 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000026489 s |
0.000014538120021825309 s |
1.82 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000026393 s |
0.000014464760015471256 s |
1.82 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000024252 s |
0.000014284580047387862 s |
1.70 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000026493 s |
0.000014173599975038087 s |
1.87 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000025773 s |
0.00001527166002233571 s |
1.69 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.001113750200011 s |
0.0011677415998747 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.0009457276000148 s |
0.0010169414000301 s |
0.93 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.0010015292000161 s |
0.0009394241999871 s |
1.07 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0009697868000102 s |
0.000897186199927 s |
1.08 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0009363741999777 s |
0.0008884472000318 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.0009786775999828 s |
0.0009461017998546 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.0010293989999809 s |
0.0009835020000537 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.0027043279999816 s |
0.0027426424001532 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.0025499381999907 s |
0.0022887613999046 s |
1.11 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.0024127479999606 s |
0.0021659468000507 s |
1.11 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.0022893057999681 s |
0.002168072799941 s |
1.06 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.0024831801999653 s |
0.0023066482000103 s |
1.08 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.0024889160000384 s |
0.0021572545998424 s |
1.15 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.0024799177999966 s |
0.002206446800119 s |
1.12 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.0058833122000123 s |
0.0067595389999041 s |
0.87 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.0067876473999831 s |
0.0058482147999711 s |
1.16 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.0067923801999768 s |
0.0057265669998741 s |
1.19 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.0062905406000254 s |
0.005685240800085 s |
1.11 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.0063517330000195 s |
0.0056048894001833 s |
1.13 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.0056577950000018 s |
0.005353641199963 s |
1.06 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.0057180417999688 s |
0.0055430060001526 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.0053945812000165 s |
0.0056180495999797 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.0056010369999967 s |
0.0057529201998477 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.0059857673999886 s |
0.0055169804000797 s |
1.08 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.0051625880000074 s |
0.0053986708001502 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.0038596079999479 s |
0.0054266727999674 s |
0.71 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.0050896610000108 s |
0.0053957413999341 s |
0.94 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.003423768000016 s |
0.0055037852001078 s |
0.62 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.004659365200041 s |
0.0054062096000052 s |
0.86 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.0040709921999905 s |
0.0058279300000322 s |
0.70 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.005069977800008 s |
0.0056282245999682 s |
0.90 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.0035011601999713 s |
0.0055000705999191 s |
0.64 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.0050842067999838 s |
0.0055692312000246 s |
0.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal |
0.0002756469999999 s |
0.000273632 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal |
0.000274496 s |
0.000273696 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal |
0.000288767 s |
0.000288319 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal |
0.000274175 s |
0.000273696 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal |
0.0002757749999999 s |
0.000275327 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal |
0.0002892159999999 s |
0.0002884479999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal |
0.000288831 s |
0.000287967 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward |
0.000559807 s |
0.000559071 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward |
0.000540255 s |
0.000539583 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward |
0.000558175 s |
0.000558623 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward |
0.000558943 s |
0.000558911 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward |
0.000559295 s |
0.000559583 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward |
0.000559775 s |
0.000559839 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward |
0.000559774 s |
0.000559518 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev |
0.001017342 s |
0.001017982 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev |
0.000987358 s |
0.000986878 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev |
0.0010095019999999 s |
0.001008765 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev |
0.000982333 s |
0.00098259 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev |
0.001007774 s |
0.001008734 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev |
0.001034429 s |
0.001032286 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev |
0.001006782 s |
0.001008798 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev |
0.00101011 s |
0.0010106219999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev |
0.000974366 s |
0.000973885 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev |
0.001009022 s |
0.001009054 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev |
0.001007934 s |
0.001010174 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev |
0.000975646 s |
0.000973214 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev |
0.001008734 s |
0.001008542 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev |
0.001021822 s |
0.001020029 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev |
0.000957246 s |
0.0009578529999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev |
0.001020702 s |
0.001020446 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev |
0.001020669 s |
0.001020511 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev |
0.0010169889999999 s |
0.001016861 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev |
0.00102179 s |
0.001020797 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal |
0.0001306885 s |
0.0001266495 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal |
0.00012426825 s |
0.0001280524999999 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal |
0.00016020525 s |
0.0001555445 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal |
0.00013078475 s |
0.000135421 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal |
0.0001381985 s |
0.000133678 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal |
0.000145043 s |
0.00014911425 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal |
0.0001580225 s |
0.0001540294999999 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward |
0.0002136905 s |
0.0002146765 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward |
0.0002626389999999 s |
0.0002613837499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward |
0.00022041575 s |
0.0002143195 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward |
0.00021502175 s |
0.00021728225 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward |
0.0002163485 s |
0.0002145055 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward |
0.000217868 s |
0.0002170342499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward |
0.00021611425 s |
0.00021452425 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev |
0.00035566175 s |
0.00035627475 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev |
0.0002566625 s |
0.0002561875 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev |
0.0003557385 s |
0.000357608 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev |
0.00025661375 s |
0.00025765525 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev |
0.000355499 s |
0.00035707325 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev |
0.00029069625 s |
0.000291246 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev |
0.00035612575 s |
0.0003571442499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev |
0.0003554795 s |
0.0003553715 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev |
0.000271586 s |
0.00027444725 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev |
0.0003552262499999 s |
0.000355547 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev |
0.00035594175 s |
0.0003567862499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev |
0.00027160375 s |
0.00027269325 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev |
0.000355519 s |
0.0003570915 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev |
0.00035796775 s |
0.000357857 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev |
0.0002829089999999 s |
0.00028378575 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev |
0.000357477 s |
0.0003581902499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev |
0.000358083 s |
0.0003591905 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev |
0.00029808275 s |
0.00029874275 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev |
0.0003582819999999 s |
0.00035939375 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.002595555 s |
0.0011677415998747 s |
2.22 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.0025726589999999 s |
0.0010169414000301 s |
2.53 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.002552586 s |
0.0009394241999871 s |
2.72 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.002322997 s |
0.000897186199927 s |
2.59 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.002666251 s |
0.0008884472000318 s |
3.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.002523841 s |
0.0009461017998546 s |
2.67 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.002878963 s |
0.0009835020000537 s |
2.93 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.0067617979999999 s |
0.0027426424001532 s |
2.47 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.006420085 s |
0.0022887613999046 s |
2.81 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.006253628 s |
0.0021659468000507 s |
2.89 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.006462881 s |
0.002168072799941 s |
2.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.006731484 s |
0.0023066482000103 s |
2.92 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.006081068 s |
0.0021572545998424 s |
2.82 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.006648748 s |
0.002206446800119 s |
3.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.01113773 s |
0.0067595389999041 s |
1.65 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.010208568 s |
0.0058482147999711 s |
1.75 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.010151555 s |
0.0057265669998741 s |
1.77 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.0099830059999999 s |
0.005685240800085 s |
1.76 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.00846717 s |
0.0056048894001833 s |
1.51 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.009074556 s |
0.005353641199963 s |
1.70 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.008663143 s |
0.0055430060001526 s |
1.56 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.009717012 s |
0.0056180495999797 s |
1.73 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.009632612 s |
0.0057529201998477 s |
1.67 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.0102081269999999 s |
0.0055169804000797 s |
1.85 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.008793828 s |
0.0053986708001502 s |
1.63 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.010030523 s |
0.0054266727999674 s |
1.85 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.009925397 s |
0.0053957413999341 s |
1.84 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.010525616 s |
0.0055037852001078 s |
1.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.008936678 s |
0.0054062096000052 s |
1.65 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.010292888 s |
0.0058279300000322 s |
1.77 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.009095053 s |
0.0056282245999682 s |
1.62 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.010640675 s |
0.0055000705999191 s |
1.93 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.009895349 s |
0.0055692312000246 s |
1.78 |
scatter_sum / JaXPipe / cpu / Primal |
0.00000919253999427383 s |
0.000009560899970892933 s |
0.96 |
scatter_sum / Jax / cpu / Primal |
0.00000856111999382847 s |
0.000008968580013970495 s |
0.95 |
scatter_sum / HLOOpt / cpu / Primal |
0.000011400040000353328 s |
0.000012460140051189227 s |
0.91 |
scatter_sum / PartOpt / cpu / Primal |
0.00000780156000473653 s |
0.000009277959989049123 s |
0.84 |
scatter_sum / IPartOpt / cpu / Primal |
0.000008322920011778479 s |
0.000009176700023090236 s |
0.91 |
scatter_sum / DefOpt / cpu / Primal |
0.000007944719989154692 s |
0.000009108180020120926 s |
0.87 |
scatter_sum / IDefOpt / cpu / Primal |
0.000008278279995010963 s |
0.000009239719984179829 s |
0.90 |
scatter_sum / JaXPipe / cpu / Forward |
0.00001229448000458433 s |
0.000013944200027253828 s |
0.88 |
scatter_sum / Jax / cpu / Forward |
0.00001210159999800453 s |
0.000013279800023155985 s |
0.91 |
scatter_sum / HLOOpt / cpu / Forward |
0.000012145440000495 s |
0.0000191325799733022 s |
0.63 |
scatter_sum / PartOpt / cpu / Forward |
0.00001185415998861572 s |
0.00001404371999342402 s |
0.84 |
scatter_sum / IPartOpt / cpu / Forward |
0.000012066780006989577 s |
0.000013532699949792004 s |
0.89 |
scatter_sum / DefOpt / cpu / Forward |
0.000017448179994516978 s |
0.000019775379987549967 s |
0.88 |
scatter_sum / IDefOpt / cpu / Forward |
0.000011824080008864257 s |
0.000013536360002035509 s |
0.87 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000012807599998723165 s |
0.000013416699976005476 s |
0.95 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000012629379993995826 s |
0.000013588860037998528 s |
0.93 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000017336919993340415 s |
0.000018541219978942537 s |
0.94 |
scatter_sum / Jax / cpu / BothRev |
0.000012039100004130888 s |
0.000013493200040102238 s |
0.89 |
scatter_sum / HLOOpt / cpu / PreRev |
0.00001279300000078365 s |
0.000013985340037834249 s |
0.91 |
scatter_sum / HLOOpt / cpu / PostRev |
0.00001717212000357904 s |
0.00001821657999244053 s |
0.94 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000014252419996410026 s |
0.000020798099985768203 s |
0.69 |
scatter_sum / PartOpt / cpu / PreRev |
0.000012034159999529947 s |
0.000014508680023936905 s |
0.83 |
scatter_sum / PartOpt / cpu / PostRev |
0.000012169960000392164 s |
0.000013988859991513892 s |
0.87 |
scatter_sum / PartOpt / cpu / BothRev |
0.000011965279991272836 s |
0.000013867720035705131 s |
0.86 |
scatter_sum / IPartOpt / cpu / PreRev |
0.0000172866600019006 s |
0.000019941060008932256 s |
0.87 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000011879779992796105 s |
0.000014055500041649794 s |
0.85 |
scatter_sum / IPartOpt / cpu / BothRev |
0.00001231825999866487 s |
0.000013829800009261817 s |
0.89 |
scatter_sum / DefOpt / cpu / PreRev |
0.000012029579991121864 s |
0.000014402780007003455 s |
0.84 |
scatter_sum / DefOpt / cpu / PostRev |
0.00001255811999726575 s |
0.00001407524003298022 s |
0.89 |
scatter_sum / DefOpt / cpu / BothRev |
0.00001213786000107575 s |
0.000013597080014733363 s |
0.89 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000012423539994870224 s |
0.000013511479946828331 s |
0.92 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000012375340008929924 s |
0.000013842139969710842 s |
0.89 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000012727219996122583 s |
0.00001488863998929446 s |
0.85 |
scatter_sum / JaXPipe / cuda / Primal |
0.000010336 s |
0.00001008 s |
1.03 |
scatter_sum / Jax / cuda / Primal |
0.000009951 s |
0.000010368 s |
0.96 |
scatter_sum / HLOOpt / cuda / Primal |
0.000010272 s |
0.000010304 s |
1.00 |
scatter_sum / PartOpt / cuda / Primal |
0.00001008 s |
0.000010016 s |
1.01 |
scatter_sum / IPartOpt / cuda / Primal |
0.000010144 s |
0.000010048 s |
1.01 |
scatter_sum / DefOpt / cuda / Primal |
0.000010048 s |
0.000010207 s |
0.98 |
scatter_sum / IDefOpt / cuda / Primal |
0.000010176 s |
0.000010113 s |
1.01 |
scatter_sum / JaXPipe / cuda / Forward |
0.000017312 s |
0.000017408 s |
0.99 |
scatter_sum / Jax / cuda / Forward |
0.000017056 s |
0.000017344 s |
0.98 |
scatter_sum / HLOOpt / cuda / Forward |
0.000017152 s |
0.000017919999999999998 s |
0.96 |
scatter_sum / PartOpt / cuda / Forward |
0.00001728 s |
0.000017503999999999997 s |
0.99 |
scatter_sum / IPartOpt / cuda / Forward |
0.00001696 s |
0.000017664 s |
0.96 |
scatter_sum / DefOpt / cuda / Forward |
0.000017375999999999998 s |
0.000016864 s |
1.03 |
scatter_sum / IDefOpt / cuda / Forward |
0.000017087 s |
0.000017503999999999997 s |
0.98 |
scatter_sum / JaXPipe / cuda / PreRev |
0.00001728 s |
0.00001728 s |
1.00 |
scatter_sum / JaXPipe / cuda / PostRev |
0.000018112 s |
0.000016863 s |
1.07 |
scatter_sum / JaXPipe / cuda / BothRev |
0.000016927999999999998 s |
0.000017472 s |
0.97 |
scatter_sum / Jax / cuda / BothRev |
0.000017055000000000002 s |
0.00001744 s |
0.98 |
scatter_sum / HLOOpt / cuda / PreRev |
0.000016896000000000002 s |
0.000016864 s |
1.00 |
scatter_sum / HLOOpt / cuda / PostRev |
0.000016864 s |
0.000017312 s |
0.97 |
scatter_sum / HLOOpt / cuda / BothRev |
0.000017184 s |
0.000017984 s |
0.96 |
scatter_sum / PartOpt / cuda / PreRev |
0.000016927000000000002 s |
0.000017406999999999998 s |
0.97 |
scatter_sum / PartOpt / cuda / PostRev |
0.000017089 s |
0.000017088 s |
1.00 |
scatter_sum / PartOpt / cuda / BothRev |
0.00001904 s |
0.000016896000000000002 s |
1.13 |
scatter_sum / IPartOpt / cuda / PreRev |
0.000017216 s |
0.000016927999999999998 s |
1.02 |
scatter_sum / IPartOpt / cuda / PostRev |
0.000017216 s |
0.000017247999999999998 s |
1.00 |
scatter_sum / IPartOpt / cuda / BothRev |
0.000017152 s |
0.000017056 s |
1.01 |
scatter_sum / DefOpt / cuda / PreRev |
0.000016704 s |
0.00001824 s |
0.92 |
scatter_sum / DefOpt / cuda / PostRev |
0.000016576000000000002 s |
0.000017184 s |
0.96 |
scatter_sum / DefOpt / cuda / BothRev |
0.00001696 s |
0.00001696 s |
1.00 |
scatter_sum / IDefOpt / cuda / PreRev |
0.000017312 s |
0.000016544 s |
1.05 |
scatter_sum / IDefOpt / cuda / PostRev |
0.000017152 s |
0.000016992 s |
1.01 |
scatter_sum / IDefOpt / cuda / BothRev |
0.000017185 s |
0.000017505 s |
0.98 |
scatter_sum / JaXPipe / tpu / Primal |
0.000001349 s |
0.0000013517 s |
1.00 |
scatter_sum / Jax / tpu / Primal |
0.0000013553 s |
0.00000141465 s |
0.96 |
scatter_sum / HLOOpt / tpu / Primal |
0.000001359475 s |
0.000001360975 s |
1.00 |
scatter_sum / PartOpt / tpu / Primal |
0.0000013549249999999998 s |
0.00000141505 s |
0.96 |
scatter_sum / IPartOpt / tpu / Primal |
0.000001359525 s |
0.0000013612 s |
1.00 |
scatter_sum / DefOpt / tpu / Primal |
0.0000013545250000000002 s |
0.0000014150999999999998 s |
0.96 |
scatter_sum / IDefOpt / tpu / Primal |
0.000001358925 s |
0.00000136135 s |
1.00 |
scatter_sum / JaXPipe / tpu / Forward |
0.000002751075 s |
0.00000271685 s |
1.01 |
scatter_sum / Jax / tpu / Forward |
0.000002780425000000001 s |
0.00000274255 s |
1.01 |
scatter_sum / HLOOpt / tpu / Forward |
0.00000275745 s |
0.0000027148499999999995 s |
1.02 |
scatter_sum / PartOpt / tpu / Forward |
0.00000274695 s |
0.00000270915 s |
1.01 |
scatter_sum / IPartOpt / tpu / Forward |
0.000002753925 s |
0.00000271675 s |
1.01 |
scatter_sum / DefOpt / tpu / Forward |
0.0000027563500000000003 s |
0.00000271685 s |
1.01 |
scatter_sum / IDefOpt / tpu / Forward |
0.000002752325 s |
0.000002714 s |
1.01 |
scatter_sum / JaXPipe / tpu / PreRev |
0.0000027489000000000005 s |
0.000002698175 s |
1.02 |
scatter_sum / JaXPipe / tpu / PostRev |
0.00000275025 s |
0.0000026941500000000003 s |
1.02 |
scatter_sum / JaXPipe / tpu / BothRev |
0.0000027641 s |
0.000002710675 s |
1.02 |
scatter_sum / Jax / tpu / BothRev |
0.0000027978 s |
0.000002750575 s |
1.02 |
scatter_sum / HLOOpt / tpu / PreRev |
0.00000276045 s |
0.0000027181 s |
1.02 |
scatter_sum / HLOOpt / tpu / PostRev |
0.0000028073 s |
0.00000275205 s |
1.02 |
scatter_sum / HLOOpt / tpu / BothRev |
0.000002757625 s |
0.000002715975 s |
1.02 |
scatter_sum / PartOpt / tpu / PreRev |
0.000002802075 s |
0.000002754725 s |
1.02 |
scatter_sum / PartOpt / tpu / PostRev |
0.000002759725 s |
0.000002710875 s |
1.02 |
scatter_sum / PartOpt / tpu / BothRev |
0.0000027971249999999995 s |
0.000002753625 s |
1.02 |
scatter_sum / IPartOpt / tpu / PreRev |
0.00000276655 s |
0.000002717375 s |
1.02 |
scatter_sum / IPartOpt / tpu / PostRev |
0.000002799725 s |
0.00000275435 s |
1.02 |
scatter_sum / IPartOpt / tpu / BothRev |
0.00000276145 s |
0.000002715725 s |
1.02 |
scatter_sum / DefOpt / tpu / PreRev |
0.0000028019250000000003 s |
0.0000027562 s |
1.02 |
scatter_sum / DefOpt / tpu / PostRev |
0.000002757775 s |
0.000002710525 s |
1.02 |
scatter_sum / DefOpt / tpu / BothRev |
0.00000279815 s |
0.000002755825 s |
1.02 |
scatter_sum / IDefOpt / tpu / PreRev |
0.000002762975 s |
0.0000027162 s |
1.02 |
scatter_sum / IDefOpt / tpu / PostRev |
0.0000027965250000000003 s |
0.00000276105 s |
1.01 |
scatter_sum / IDefOpt / tpu / BothRev |
0.00000276455 s |
0.000002711775 s |
1.02 |
scatter_sum / JaXPipe / cpu / Primal |
0.000022573 s |
0.000009560899970892933 s |
2.36 |
scatter_sum / Jax / cpu / Primal |
0.00001539 s |
0.000008968580013970495 s |
1.72 |
scatter_sum / HLOOpt / cpu / Primal |
0.000015858 s |
0.000012460140051189227 s |
1.27 |
scatter_sum / PartOpt / cpu / Primal |
0.000015678 s |
0.000009277959989049123 s |
1.69 |
scatter_sum / IPartOpt / cpu / Primal |
0.000015918000000000002 s |
0.000009176700023090236 s |
1.73 |
scatter_sum / DefOpt / cpu / Primal |
0.000016055 s |
0.000009108180020120926 s |
1.76 |
scatter_sum / IDefOpt / cpu / Primal |
0.000016115 s |
0.000009239719984179829 s |
1.74 |
scatter_sum / JaXPipe / cpu / Forward |
0.000023129 s |
0.000013944200027253828 s |
1.66 |
scatter_sum / Jax / cpu / Forward |
0.000023847 s |
0.000013279800023155985 s |
1.80 |
scatter_sum / HLOOpt / cpu / Forward |
0.000023139 s |
0.0000191325799733022 s |
1.21 |
scatter_sum / PartOpt / cpu / Forward |
0.000022775 s |
0.00001404371999342402 s |
1.62 |
scatter_sum / IPartOpt / cpu / Forward |
0.000022911 s |
0.000013532699949792004 s |
1.69 |
scatter_sum / DefOpt / cpu / Forward |
0.000022869 s |
0.000019775379987549967 s |
1.16 |
scatter_sum / IDefOpt / cpu / Forward |
0.000024039 s |
0.000013536360002035509 s |
1.78 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000022918 s |
0.000013416699976005476 s |
1.71 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000023396 s |
0.000013588860037998528 s |
1.72 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000023566 s |
0.000018541219978942537 s |
1.27 |
scatter_sum / Jax / cpu / BothRev |
0.000021972 s |
0.000013493200040102238 s |
1.63 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000021782 s |
0.000013985340037834249 s |
1.56 |
scatter_sum / HLOOpt / cpu / PostRev |
0.000024372 s |
0.00001821657999244053 s |
1.34 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000024246 s |
0.000020798099985768203 s |
1.17 |
scatter_sum / PartOpt / cpu / PreRev |
0.000022705 s |
0.000014508680023936905 s |
1.56 |
scatter_sum / PartOpt / cpu / PostRev |
0.00002316 s |
0.000013988859991513892 s |
1.66 |
scatter_sum / PartOpt / cpu / BothRev |
0.000024354 s |
0.000013867720035705131 s |
1.76 |
scatter_sum / IPartOpt / cpu / PreRev |
0.000022455 s |
0.000019941060008932256 s |
1.13 |
scatter_sum / IPartOpt / cpu / PostRev |
0.00002283 s |
0.000014055500041649794 s |
1.62 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000023628 s |
0.000013829800009261817 s |
1.71 |
scatter_sum / DefOpt / cpu / PreRev |
0.000022702 s |
0.000014402780007003455 s |
1.58 |
scatter_sum / DefOpt / cpu / PostRev |
0.000024276 s |
0.00001407524003298022 s |
1.72 |
scatter_sum / DefOpt / cpu / BothRev |
0.000023963 s |
0.000013597080014733363 s |
1.76 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000023456 s |
0.000013511479946828331 s |
1.74 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000022836 s |
0.000013842139969710842 s |
1.65 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000023729 s |
0.00001488863998929446 s |
1.59 |
slicing / JaXPipe / cpu / Primal |
0.000007520179997300147 s |
0.000008040859984248528 s |
0.94 |
slicing / Jax / cpu / Primal |
0.000006730639991019416 s |
0.000007323159998122719 s |
0.92 |
slicing / HLOOpt / cpu / Primal |
0.00001012661999766351 s |
0.000011351140028637018 s |
0.89 |
slicing / PartOpt / cpu / Primal |
0.000006404899993412983 s |
0.0000070472600782522935 s |
0.91 |
slicing / IPartOpt / cpu / Primal |
0.000006238440003016876 s |
0.000006619239975407254 s |
0.94 |
slicing / DefOpt / cpu / Primal |
0.000011079560010784917 s |
0.000012038580007356358 s |
0.92 |
slicing / IDefOpt / cpu / Primal |
0.000006226900013643899 s |
0.0000068220599860069344 s |
0.91 |
slicing / JaXPipe / cpu / Forward |
0.00001081058000181656 s |
0.000011048219948861516 s |
0.98 |
slicing / Jax / cpu / Forward |
0.000010313619984572142 s |
0.000011258000013185663 s |
0.92 |
slicing / HLOOpt / cpu / Forward |
0.000015031039999939822 s |
0.000015424760022142436 s |
0.97 |
slicing / PartOpt / cpu / Forward |
0.000014974399998664012 s |
0.000015206179996312133 s |
0.98 |
slicing / IPartOpt / cpu / Forward |
0.000009607760002836583 s |
0.00001020377998429467 s |
0.94 |
slicing / DefOpt / cpu / Forward |
0.000014747959994565464 s |
0.000015816619988981983 s |
0.93 |
slicing / IDefOpt / cpu / Forward |
0.000009787320004761569 s |
0.000010371740027039776 s |
0.94 |
slicing / JaXPipe / cpu / PreRev |
0.000010868859997117395 s |
0.000011388220018488935 s |
0.95 |
slicing / JaXPipe / cpu / PostRev |
0.000011041899988413206 s |
0.000012067180014128098 s |
0.92 |
slicing / JaXPipe / cpu / BothRev |
0.000013056340001185164 s |
0.000011459180004749214 s |
1.14 |
slicing / Jax / cpu / BothRev |
0.000011301960000764666 s |
0.000011226620008528698 s |
1.01 |
slicing / HLOOpt / cpu / PreRev |
0.000010789840009692853 s |
0.00001112885999646096 s |
0.97 |
slicing / HLOOpt / cpu / PostRev |
0.000010995919999459148 s |
0.000011124339989692087 s |
0.99 |
slicing / HLOOpt / cpu / BothRev |
0.00001235998000311156 s |
0.000012854619972131332 s |
0.96 |
slicing / PartOpt / cpu / PreRev |
0.000011128959999950894 s |
0.000011010340012944652 s |
1.01 |
slicing / PartOpt / cpu / PostRev |
0.000011097220001374808 s |
0.000011743560035029076 s |
0.94 |
slicing / PartOpt / cpu / BothRev |
0.0000107541600004879 s |
0.000010728519982876606 s |
1.00 |
slicing / IPartOpt / cpu / PreRev |
0.000015612580002652977 s |
0.000013666479962921586 s |
1.14 |
slicing / IPartOpt / cpu / PostRev |
0.00001113192000957497 s |
0.00001228439999067632 s |
0.91 |
slicing / IPartOpt / cpu / BothRev |
0.000010803460006627577 s |
0.00001136875999691256 s |
0.95 |
slicing / DefOpt / cpu / PreRev |
0.000010420919998068713 s |
0.000011448680015746505 s |
0.91 |
slicing / DefOpt / cpu / PostRev |
0.000010505100003683764 s |
0.000011924800000997493 s |
0.88 |
slicing / DefOpt / cpu / BothRev |
0.000010664220003491207 s |
0.000011419660004321488 s |
0.93 |
slicing / IDefOpt / cpu / PreRev |
0.00000982672000191087 s |
0.000011360259995853997 s |
0.87 |
slicing / IDefOpt / cpu / PostRev |
0.000011185039995780244 s |
0.000011660339960144484 s |
0.96 |
slicing / IDefOpt / cpu / BothRev |
0.00001056673999528357 s |
0.00001080735999494209 s |
0.98 |
slicing / JaXPipe / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
slicing / Jax / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
slicing / HLOOpt / cuda / Primal |
0.000001888 s |
0.0000019200000000000003 s |
0.98 |
slicing / PartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.000001919 s |
1.00 |
slicing / IPartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.000001888 s |
1.02 |
slicing / DefOpt / cuda / Primal |
0.000001887 s |
0.0000019200000000000003 s |
0.98 |
slicing / IDefOpt / cuda / Primal |
0.000001919 s |
0.0000019200000000000003 s |
1.00 |
slicing / JaXPipe / cuda / Forward |
0.000010143 s |
0.00001008 s |
1.01 |
slicing / Jax / cuda / Forward |
0.00000992 s |
0.00001008 s |
0.98 |
slicing / HLOOpt / cuda / Forward |
0.000010049 s |
0.000009887 s |
1.02 |
slicing / PartOpt / cuda / Forward |
0.00001008 s |
0.000010464 s |
0.96 |
slicing / IPartOpt / cuda / Forward |
0.000009952 s |
0.00001008 s |
0.99 |
slicing / DefOpt / cuda / Forward |
0.00001248 s |
0.000010016 s |
1.25 |
slicing / IDefOpt / cuda / Forward |
0.00000976 s |
0.000009632 s |
1.01 |
slicing / JaXPipe / cuda / PreRev |
0.00001232 s |
0.000010111 s |
1.22 |
slicing / JaXPipe / cuda / PostRev |
0.000009984 s |
0.0000104 s |
0.96 |
slicing / JaXPipe / cuda / BothRev |
0.000010432 s |
0.000011392 s |
0.92 |
slicing / Jax / cuda / BothRev |
0.000009696 s |
0.000010688 s |
0.91 |
slicing / HLOOpt / cuda / PreRev |
0.000010112 s |
0.000010593 s |
0.95 |
slicing / HLOOpt / cuda / PostRev |
0.000009984 s |
0.000009888 s |
1.01 |
slicing / HLOOpt / cuda / BothRev |
0.000010176 s |
0.000011296 s |
0.90 |
slicing / PartOpt / cuda / PreRev |
0.000010432 s |
0.000011455999999999998 s |
0.91 |
slicing / PartOpt / cuda / PostRev |
0.000010304 s |
0.000012032 s |
0.86 |
slicing / PartOpt / cuda / BothRev |
0.000010112 s |
0.000010144 s |
1.00 |
slicing / IPartOpt / cuda / PreRev |
0.000010048 s |
0.000010304 s |
0.98 |
slicing / IPartOpt / cuda / PostRev |
0.000009984 s |
0.000010559 s |
0.95 |
slicing / IPartOpt / cuda / BothRev |
0.000010048 s |
0.000010752 s |
0.93 |
slicing / DefOpt / cuda / PreRev |
0.000010336 s |
0.000010656 s |
0.97 |
slicing / DefOpt / cuda / PostRev |
0.000010112 s |
0.000010112 s |
1 |
slicing / DefOpt / cuda / BothRev |
0.000010175 s |
0.000010208 s |
1.00 |
slicing / IDefOpt / cuda / PreRev |
0.000012672 s |
0.000009952 s |
1.27 |
slicing / IDefOpt / cuda / PostRev |
0.00001024 s |
0.000010368 s |
0.99 |
slicing / IDefOpt / cuda / BothRev |
0.00001024 s |
0.000010049 s |
1.02 |
slicing / JaXPipe / tpu / Primal |
9.582e-7 s |
0.0000010259 s |
0.93 |
slicing / Jax / tpu / Primal |
9.86025e-7 s |
9.6695e-7 s |
1.02 |
slicing / HLOOpt / tpu / Primal |
9.60925e-7 s |
0.000001033725 s |
0.93 |
slicing / PartOpt / tpu / Primal |
9.7325e-7 s |
9.66175e-7 s |
1.01 |
slicing / IPartOpt / tpu / Primal |
9.58575e-7 s |
0.0000010305 s |
0.93 |
slicing / DefOpt / tpu / Primal |
9.732e-7 s |
9.74725e-7 s |
1.00 |
slicing / IDefOpt / tpu / Primal |
9.59075e-7 s |
0.00000102215 s |
0.94 |
slicing / JaXPipe / tpu / Forward |
0.000001403275 s |
0.0000014117 s |
0.99 |
slicing / Jax / tpu / Forward |
0.00000140805 s |
0.0000014743999999999998 s |
0.95 |
slicing / HLOOpt / tpu / Forward |
0.00000151105 s |
0.0000015146 s |
1.00 |
slicing / PartOpt / tpu / Forward |
0.0000014276 s |
0.000001492575 s |
0.96 |
slicing / IPartOpt / tpu / Forward |
0.00000151725 s |
0.0000015134249999999998 s |
1.00 |
slicing / DefOpt / tpu / Forward |
0.00000143505 s |
0.0000014924 s |
0.96 |
slicing / IDefOpt / tpu / Forward |
0.0000015118500000000002 s |
0.000001516525 s |
1.00 |
slicing / JaXPipe / tpu / PreRev |
0.000002336375 s |
0.0000025597 s |
0.91 |
slicing / JaXPipe / tpu / PostRev |
0.0000025186 s |
0.00000251315 s |
1.00 |
slicing / JaXPipe / tpu / BothRev |
0.0000023565 s |
0.000002578425 s |
0.91 |
slicing / Jax / tpu / BothRev |
0.0000025368 s |
0.0000025517250000000005 s |
0.99 |
slicing / HLOOpt / tpu / PreRev |
0.000002350125 s |
0.00000258015 s |
0.91 |
slicing / HLOOpt / tpu / PostRev |
0.0000025418000000000004 s |
0.00000254145 s |
1.00 |
slicing / HLOOpt / tpu / BothRev |
0.00000234275 s |
0.000002590375 s |
0.90 |
slicing / PartOpt / tpu / PreRev |
0.000002536875 s |
0.00000254725 s |
1.00 |
slicing / PartOpt / tpu / PostRev |
0.000002343375 s |
0.00000257835 s |
0.91 |
slicing / PartOpt / tpu / BothRev |
0.0000025353 s |
0.000002541075 s |
1.00 |
slicing / IPartOpt / tpu / PreRev |
0.000002349525 s |
0.0000025848 s |
0.91 |
slicing / IPartOpt / tpu / PostRev |
0.0000025238500000000004 s |
0.000002543075 s |
0.99 |
slicing / IPartOpt / tpu / BothRev |
0.0000023561750000000004 s |
0.0000025757 s |
0.91 |
slicing / DefOpt / tpu / PreRev |
0.0000025337 s |
0.000002541775 s |
1.00 |
slicing / DefOpt / tpu / PostRev |
0.0000023544499999999995 s |
0.000002586525 s |
0.91 |
slicing / DefOpt / tpu / BothRev |
0.00000253525 s |
0.00000254005 s |
1.00 |
slicing / IDefOpt / tpu / PreRev |
0.0000023499 s |
0.00000257795 s |
0.91 |
slicing / IDefOpt / tpu / PostRev |
0.000002529925 s |
0.000002539375 s |
1.00 |
slicing / IDefOpt / tpu / BothRev |
0.0000023558 s |
0.0000025849 s |
0.91 |
slicing / JaXPipe / cpu / Primal |
0.000012754 s |
0.000008040859984248528 s |
1.59 |
slicing / Jax / cpu / Primal |
0.000012391 s |
0.000007323159998122719 s |
1.69 |
slicing / HLOOpt / cpu / Primal |
0.000012459 s |
0.000011351140028637018 s |
1.10 |
slicing / PartOpt / cpu / Primal |
0.000012657 s |
0.0000070472600782522935 s |
1.80 |
slicing / IPartOpt / cpu / Primal |
0.000012264 s |
0.000006619239975407254 s |
1.85 |
slicing / DefOpt / cpu / Primal |
0.000012303 s |
0.000012038580007356358 s |
1.02 |
slicing / IDefOpt / cpu / Primal |
0.000012472 s |
0.0000068220599860069344 s |
1.83 |
slicing / JaXPipe / cpu / Forward |
0.000017137 s |
0.000011048219948861516 s |
1.55 |
slicing / Jax / cpu / Forward |
0.000016856 s |
0.000011258000013185663 s |
1.50 |
slicing / HLOOpt / cpu / Forward |
0.000016376 s |
0.000015424760022142436 s |
1.06 |
slicing / PartOpt / cpu / Forward |
0.00001665 s |
0.000015206179996312133 s |
1.09 |
slicing / IPartOpt / cpu / Forward |
0.000016751 s |
0.00001020377998429467 s |
1.64 |
slicing / DefOpt / cpu / Forward |
0.000017323 s |
0.000015816619988981983 s |
1.10 |
slicing / IDefOpt / cpu / Forward |
0.000017163 s |
0.000010371740027039776 s |
1.65 |
slicing / JaXPipe / cpu / PreRev |
0.000019151 s |
0.000011388220018488935 s |
1.68 |
slicing / JaXPipe / cpu / PostRev |
0.000017515 s |
0.000012067180014128098 s |
1.45 |
slicing / JaXPipe / cpu / BothRev |
0.000017788000000000003 s |
0.000011459180004749214 s |
1.55 |
slicing / Jax / cpu / BothRev |
0.000017992 s |
0.000011226620008528698 s |
1.60 |
slicing / HLOOpt / cpu / PreRev |
0.000017305 s |
0.00001112885999646096 s |
1.55 |
slicing / HLOOpt / cpu / PostRev |
0.000018201 s |
0.000011124339989692087 s |
1.64 |
slicing / HLOOpt / cpu / BothRev |
0.000018343 s |
0.000012854619972131332 s |
1.43 |
slicing / PartOpt / cpu / PreRev |
0.000017597 s |
0.000011010340012944652 s |
1.60 |
slicing / PartOpt / cpu / PostRev |
0.000018233 s |
0.000011743560035029076 s |
1.55 |
slicing / PartOpt / cpu / BothRev |
0.000017501 s |
0.000010728519982876606 s |
1.63 |
slicing / IPartOpt / cpu / PreRev |
0.000017315 s |
0.000013666479962921586 s |
1.27 |
slicing / IPartOpt / cpu / PostRev |
0.000017864 s |
0.00001228439999067632 s |
1.45 |
slicing / IPartOpt / cpu / BothRev |
0.000018466 s |
0.00001136875999691256 s |
1.62 |
slicing / DefOpt / cpu / PreRev |
0.000017344 s |
0.000011448680015746505 s |
1.51 |
slicing / DefOpt / cpu / PostRev |
0.000017502 s |
0.000011924800000997493 s |
1.47 |
slicing / DefOpt / cpu / BothRev |
0.000018635 s |
0.000011419660004321488 s |
1.63 |
slicing / IDefOpt / cpu / PreRev |
0.000017860999999999997 s |
0.000011360259995853997 s |
1.57 |
slicing / IDefOpt / cpu / PostRev |
0.000018038 s |
0.000011660339960144484 s |
1.55 |
slicing / IDefOpt / cpu / BothRev |
0.000018764 s |
0.00001080735999494209 s |
1.74 |
sum / JaXPipe / cpu / Primal |
0.000009326759993655289 s |
0.000009742220008774891 s |
0.96 |
sum / Jax / cpu / Primal |
0.000007966159998886723 s |
0.00000903806000678742 s |
0.88 |
sum / HLOOpt / cpu / Primal |
0.000012604039998223016 s |
0.000012662220033234917 s |
1.00 |
sum / PartOpt / cpu / Primal |
0.000008903919997464982 s |
0.000008491360022162553 s |
1.05 |
sum / IPartOpt / cpu / Primal |
0.00000861625999732496 s |
0.00000859147996379761 s |
1.00 |
sum / DefOpt / cpu / Primal |
0.000012823760009723628 s |
0.000013167799979783012 s |
0.97 |
sum / IDefOpt / cpu / Primal |
0.000008508179998898413 s |
0.000008690520016898518 s |
0.98 |
sum / JaXPipe / cpu / Forward |
0.000012012240003969057 s |
0.000013288580057633223 s |
0.90 |
sum / Jax / cpu / Forward |
0.00001125527999647602 s |
0.000012854360011260724 s |
0.88 |
sum / HLOOpt / cpu / Forward |
0.000016323160002684746 s |
0.0000172800400378037 s |
0.94 |
sum / PartOpt / cpu / Forward |
0.000011959360003857 s |
0.00001706791999822599 s |
0.70 |
sum / IPartOpt / cpu / Forward |
0.000011761040011606383 s |
0.000012654980000661452 s |
0.93 |
sum / DefOpt / cpu / Forward |
0.00001700889999710853 s |
0.000017553959951328578 s |
0.97 |
sum / IDefOpt / cpu / Forward |
0.00001165500000524844 s |
0.000013098059989715692 s |
0.89 |
sum / JaXPipe / cpu / PreRev |
0.00001235705999761194 s |
0.000012499379963628598 s |
0.99 |
sum / JaXPipe / cpu / PostRev |
0.000012491020004290475 s |
0.000012826340016545146 s |
0.97 |
sum / JaXPipe / cpu / BothRev |
0.000015887439994912712 s |
0.000012429079970388556 s |
1.28 |
sum / Jax / cpu / BothRev |
0.000012843900010466314 s |
0.000012457939992600589 s |
1.03 |
sum / HLOOpt / cpu / PreRev |
0.0000120040600018001 s |
0.000011955399986618432 s |
1.00 |
sum / HLOOpt / cpu / PostRev |
0.000011410200013415306 s |
0.000016029180033001465 s |
0.71 |
sum / HLOOpt / cpu / BothRev |
0.000013808660000904636 s |
0.00001411953999195248 s |
0.98 |
sum / PartOpt / cpu / PreRev |
0.000012086099995940458 s |
0.000012572819996421458 s |
0.96 |
sum / PartOpt / cpu / PostRev |
0.000011975619991062558 s |
0.000012195079962111776 s |
0.98 |
sum / PartOpt / cpu / BothRev |
0.00001142363999861118 s |
0.000011735139933080064 s |
0.97 |
sum / IPartOpt / cpu / PreRev |
0.0000120522400061418 s |
0.00001590705996932229 s |
0.76 |
sum / IPartOpt / cpu / PostRev |
0.000011596380002174555 s |
0.000011663459990813863 s |
0.99 |
sum / IPartOpt / cpu / BothRev |
0.000012219540003570728 s |
0.000011670320027405978 s |
1.05 |
sum / DefOpt / cpu / PreRev |
0.000011831439994693938 s |
0.00001223856000251544 s |
0.97 |
sum / DefOpt / cpu / PostRev |
0.00001200267999593052 s |
0.000012280459986868664 s |
0.98 |
sum / DefOpt / cpu / BothRev |
0.000011404140004742655 s |
0.000011861679968205863 s |
0.96 |
sum / IDefOpt / cpu / PreRev |
0.000011652099985894892 s |
0.00001202029996420606 s |
0.97 |
sum / IDefOpt / cpu / PostRev |
0.000012267099991731811 s |
0.000011725960048352135 s |
1.05 |
sum / IDefOpt / cpu / BothRev |
0.000012170820009487216 s |
0.000012039119974360802 s |
1.01 |
sum / JaXPipe / cuda / Primal |
0.00000208 s |
0.000002143 s |
0.97 |
sum / Jax / cuda / Primal |
0.00000208 s |
0.0000021120000000000003 s |
0.98 |
sum / HLOOpt / cuda / Primal |
0.00000208 s |
0.0000021120000000000003 s |
0.98 |
sum / PartOpt / cuda / Primal |
0.00000208 s |
0.0000021120000000000003 s |
0.98 |
sum / IPartOpt / cuda / Primal |
0.00000208 s |
0.0000021120000000000003 s |
0.98 |
sum / DefOpt / cuda / Primal |
0.00000208 s |
0.0000021120000000000003 s |
0.98 |
sum / IDefOpt / cuda / Primal |
0.000002079 s |
0.000002143 s |
0.97 |
sum / JaXPipe / cuda / Forward |
0.000010464 s |
0.000011007 s |
0.95 |
sum / Jax / cuda / Forward |
0.000010304 s |
0.000015456 s |
0.67 |
sum / HLOOpt / cuda / Forward |
0.00001008 s |
0.000010815 s |
0.93 |
sum / PartOpt / cuda / Forward |
0.000010592 s |
0.000011967 s |
0.89 |
sum / IPartOpt / cuda / Forward |
0.0000104 s |
0.0000104 s |
1 |
sum / DefOpt / cuda / Forward |
0.000010432 s |
0.000010368 s |
1.01 |
sum / IDefOpt / cuda / Forward |
0.0000104 s |
0.000011712 s |
0.89 |
sum / JaXPipe / cuda / PreRev |
0.00001008 s |
0.000009888 s |
1.02 |
sum / JaXPipe / cuda / PostRev |
0.000010336 s |
0.000010207 s |
1.01 |
sum / JaXPipe / cuda / BothRev |
0.0000096 s |
0.000009952 s |
0.96 |
sum / Jax / cuda / BothRev |
0.00000992 s |
0.00001008 s |
0.98 |
sum / HLOOpt / cuda / PreRev |
0.00000992 s |
0.00000992 s |
1 |
sum / HLOOpt / cuda / PostRev |
0.000010047 s |
0.000010017 s |
1.00 |
sum / HLOOpt / cuda / BothRev |
0.000010208 s |
0.000009824 s |
1.04 |
sum / PartOpt / cuda / PreRev |
0.000010273 s |
0.000009984 s |
1.03 |
sum / PartOpt / cuda / PostRev |
0.000010112 s |
0.000010208 s |
0.99 |
sum / PartOpt / cuda / BothRev |
0.000010272 s |
0.000010144 s |
1.01 |
sum / IPartOpt / cuda / PreRev |
0.000010016 s |
0.000009952 s |
1.01 |
sum / IPartOpt / cuda / PostRev |
0.000009984 s |
0.000010144 s |
0.98 |
sum / IPartOpt / cuda / BothRev |
0.00000992 s |
0.000010176 s |
0.97 |
sum / DefOpt / cuda / PreRev |
0.000010112 s |
0.000009952 s |
1.02 |
sum / DefOpt / cuda / PostRev |
0.000009728 s |
0.000010176 s |
0.96 |
sum / DefOpt / cuda / BothRev |
0.000009984 s |
0.000010016 s |
1.00 |
sum / IDefOpt / cuda / PreRev |
0.000010016 s |
0.000009984 s |
1.00 |
sum / IDefOpt / cuda / PostRev |
0.000009952 s |
0.000010272 s |
0.97 |
sum / IDefOpt / cuda / BothRev |
0.000010336 s |
0.00001024 s |
1.01 |
sum / JaXPipe / tpu / Primal |
5.1665e-7 s |
5.1055e-7 s |
1.01 |
sum / Jax / tpu / Primal |
5.565e-7 s |
5.583250000000001e-7 s |
1.00 |
sum / HLOOpt / tpu / Primal |
5.26675e-7 s |
5.2135e-7 s |
1.01 |
sum / PartOpt / tpu / Primal |
5.56575e-7 s |
5.58275e-7 s |
1.00 |
sum / IPartOpt / tpu / Primal |
5.26975e-7 s |
5.22425e-7 s |
1.01 |
sum / DefOpt / tpu / Primal |
5.566e-7 s |
5.584499999999999e-7 s |
1.00 |
sum / IDefOpt / tpu / Primal |
5.26625e-7 s |
5.213e-7 s |
1.01 |
sum / JaXPipe / tpu / Forward |
0.0000015475249999999998 s |
0.0000015467249999999998 s |
1.00 |
sum / Jax / tpu / Forward |
0.0000015096 s |
0.0000014976500000000002 s |
1.01 |
sum / HLOOpt / tpu / Forward |
0.00000153035 s |
0.0000015402 s |
0.99 |
sum / PartOpt / tpu / Forward |
0.000001502225 s |
0.0000014953 s |
1.00 |
sum / IPartOpt / tpu / Forward |
0.000001535175 s |
0.000001534125 s |
1.00 |
sum / DefOpt / tpu / Forward |
0.000001507725 s |
0.000001501775 s |
1.00 |
sum / IDefOpt / tpu / Forward |
0.000001538325 s |
0.00000153315 s |
1.00 |
sum / JaXPipe / tpu / PreRev |
0.0000010021 s |
0.00000105375 s |
0.95 |
sum / JaXPipe / tpu / PostRev |
0.0000010339 s |
0.000001084475 s |
0.95 |
sum / JaXPipe / tpu / BothRev |
0.0000010031 s |
0.000001061775 s |
0.94 |
sum / Jax / tpu / BothRev |
0.000001036125 s |
0.000001087875 s |
0.95 |
sum / HLOOpt / tpu / PreRev |
0.00000101465 s |
0.00000104835 s |
0.97 |
sum / HLOOpt / tpu / PostRev |
0.00000104015 s |
0.000001083475 s |
0.96 |
sum / HLOOpt / tpu / BothRev |
0.00000101045 s |
0.000001045575 s |
0.97 |
sum / PartOpt / tpu / PreRev |
0.0000010413500000000002 s |
0.00000108545 s |
0.96 |
sum / PartOpt / tpu / PostRev |
0.0000010096 s |
0.000001057975 s |
0.95 |
sum / PartOpt / tpu / BothRev |
0.0000010373000000000002 s |
0.0000010851249999999998 s |
0.96 |
sum / IPartOpt / tpu / PreRev |
0.0000010097999999999998 s |
0.0000010513 s |
0.96 |
sum / IPartOpt / tpu / PostRev |
0.00000103305 s |
0.000001089675 s |
0.95 |
sum / IPartOpt / tpu / BothRev |
0.0000010066 s |
0.000001061925 s |
0.95 |
sum / DefOpt / tpu / PreRev |
0.0000010381 s |
0.000001097775 s |
0.95 |
sum / DefOpt / tpu / PostRev |
0.000001004925 s |
0.00000105545 s |
0.95 |
sum / DefOpt / tpu / BothRev |
0.0000010304 s |
0.0000010873249999999998 s |
0.95 |
sum / IDefOpt / tpu / PreRev |
0.000001000075 s |
0.0000010545 s |
0.95 |
sum / IDefOpt / tpu / PostRev |
0.000001034525 s |
0.000001090125 s |
0.95 |
sum / IDefOpt / tpu / BothRev |
0.000001000575 s |
0.00000105155 s |
0.95 |
sum / JaXPipe / cpu / Primal |
0.000014732 s |
0.000009742220008774891 s |
1.51 |
sum / Jax / cpu / Primal |
0.00001438 s |
0.00000903806000678742 s |
1.59 |
sum / HLOOpt / cpu / Primal |
0.000014384 s |
0.000012662220033234917 s |
1.14 |
sum / PartOpt / cpu / Primal |
0.000014534 s |
0.000008491360022162553 s |
1.71 |
sum / IPartOpt / cpu / Primal |
0.000014378 s |
0.00000859147996379761 s |
1.67 |
sum / DefOpt / cpu / Primal |
0.000014605 s |
0.000013167799979783012 s |
1.11 |
sum / IDefOpt / cpu / Primal |
0.000014792 s |
0.000008690520016898518 s |
1.70 |
sum / JaXPipe / cpu / Forward |
0.000020518 s |
0.000013288580057633223 s |
1.54 |
sum / Jax / cpu / Forward |
0.000020765 s |
0.000012854360011260724 s |
1.62 |
sum / HLOOpt / cpu / Forward |
0.000020066 s |
0.0000172800400378037 s |
1.16 |
sum / PartOpt / cpu / Forward |
0.000020638 s |
0.00001706791999822599 s |
1.21 |
sum / IPartOpt / cpu / Forward |
0.000019555 s |
0.000012654980000661452 s |
1.55 |
sum / DefOpt / cpu / Forward |
0.000020433 s |
0.000017553959951328578 s |
1.16 |
sum / IDefOpt / cpu / Forward |
0.000020119 s |
0.000013098059989715692 s |
1.54 |
sum / JaXPipe / cpu / PreRev |
0.000019066 s |
0.000012499379963628598 s |
1.53 |
sum / JaXPipe / cpu / PostRev |
0.000019634 s |
0.000012826340016545146 s |
1.53 |
sum / JaXPipe / cpu / BothRev |
0.000019585000000000003 s |
0.000012429079970388556 s |
1.58 |
sum / Jax / cpu / BothRev |
0.00001955 s |
0.000012457939992600589 s |
1.57 |
sum / HLOOpt / cpu / PreRev |
0.000019234000000000003 s |
0.000011955399986618432 s |
1.61 |
sum / HLOOpt / cpu / PostRev |
0.000019721 s |
0.000016029180033001465 s |
1.23 |
sum / HLOOpt / cpu / BothRev |
0.000019752 s |
0.00001411953999195248 s |
1.40 |
sum / PartOpt / cpu / PreRev |
0.000019082 s |
0.000012572819996421458 s |
1.52 |
sum / PartOpt / cpu / PostRev |
0.000019777 s |
0.000012195079962111776 s |
1.62 |
sum / PartOpt / cpu / BothRev |
0.00001971 s |
0.000011735139933080064 s |
1.68 |
sum / IPartOpt / cpu / PreRev |
0.000019383 s |
0.00001590705996932229 s |
1.22 |
sum / IPartOpt / cpu / PostRev |
0.000019888 s |
0.000011663459990813863 s |
1.71 |
sum / IPartOpt / cpu / BothRev |
0.000019730000000000003 s |
0.000011670320027405978 s |
1.69 |
sum / DefOpt / cpu / PreRev |
0.00001851 s |
0.00001223856000251544 s |
1.51 |
sum / DefOpt / cpu / PostRev |
0.000019464 s |
0.000012280459986868664 s |
1.58 |
sum / DefOpt / cpu / BothRev |
0.000019227 s |
0.000011861679968205863 s |
1.62 |
sum / IDefOpt / cpu / PreRev |
0.000018478 s |
0.00001202029996420606 s |
1.54 |
sum / IDefOpt / cpu / PostRev |
0.000019285 s |
0.000011725960048352135 s |
1.64 |
sum / IDefOpt / cpu / BothRev |
0.000019605 s |
0.000012039119974360802 s |
1.63 |
value_and_grad / JaXPipe / cpu / Primal |
0.000015838880005958345 s |
0.00001745141998071631 s |
0.91 |
value_and_grad / Jax / cpu / Primal |
0.000015497940000841482 s |
0.000015697619983257027 s |
0.99 |
value_and_grad / HLOOpt / cpu / Primal |
0.000014571780006917834 s |
0.00001602679996722145 s |
0.91 |
value_and_grad / PartOpt / cpu / Primal |
0.000015079859992965796 s |
0.000016012839987524787 s |
0.94 |
value_and_grad / IPartOpt / cpu / Primal |
0.0000154496400023163 s |
0.000015338099992732167 s |
1.01 |
value_and_grad / DefOpt / cpu / Primal |
0.000015043379999042373 s |
0.0000166041199281608 s |
0.91 |
value_and_grad / IDefOpt / cpu / Primal |
0.000014858900005947362 s |
0.000015574700009892696 s |
0.95 |
value_and_grad / JaXPipe / cuda / Primal |
0.000033344 s |
0.000033568 s |
0.99 |
value_and_grad / Jax / cuda / Primal |
0.000033888 s |
0.000033696 s |
1.01 |
value_and_grad / HLOOpt / cuda / Primal |
0.000033728 s |
0.000033696 s |
1.00 |
value_and_grad / PartOpt / cuda / Primal |
0.000033632 s |
0.000033312 s |
1.01 |
value_and_grad / IPartOpt / cuda / Primal |
0.000033984 s |
0.00003392 s |
1.00 |
value_and_grad / DefOpt / cuda / Primal |
0.000033664 s |
0.000033472 s |
1.01 |
value_and_grad / IDefOpt / cuda / Primal |
0.000032416 s |
0.00003344 s |
0.97 |
value_and_grad / JaXPipe / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / Jax / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / HLOOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / PartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IPartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / DefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IDefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / JaXPipe / cpu / Primal |
0.000024196 s |
0.00001745141998071631 s |
1.39 |
value_and_grad / Jax / cpu / Primal |
0.000023369 s |
0.000015697619983257027 s |
1.49 |
value_and_grad / HLOOpt / cpu / Primal |
0.000023537 s |
0.00001602679996722145 s |
1.47 |
value_and_grad / PartOpt / cpu / Primal |
0.000023469 s |
0.000016012839987524787 s |
1.47 |
value_and_grad / IPartOpt / cpu / Primal |
0.000023623 s |
0.000015338099992732167 s |
1.54 |
value_and_grad / DefOpt / cpu / Primal |
0.00002318 s |
0.0000166041199281608 s |
1.40 |
value_and_grad / IDefOpt / cpu / Primal |
0.000024123 s |
0.000015574700009892696 s |
1.55 |
jaxmd20 / JaXPipe / cuda / Primal |
0.001514205 s |
0.001519418 s |
1.00 |
jaxmd20 / Jax / cuda / Primal |
0.0014199969999999 s |
0.001550144 s |
0.92 |
jaxmd20 / HLOOpt / cuda / Primal |
0.0011384939999999 s |
0.001116446 s |
1.02 |
jaxmd20 / PartOpt / cuda / Primal |
0.0013158369999999 s |
0.001325692 s |
0.99 |
jaxmd20 / IPartOpt / cuda / Primal |
0.001369948 s |
0.001323068 s |
1.04 |
jaxmd20 / DefOpt / cuda / Primal |
0.000528863 s |
0.000562847 s |
0.94 |
jaxmd20 / IDefOpt / cuda / Primal |
0.0004933429999999 s |
0.000521662 s |
0.95 |
jaxmd20 / JaXPipe / cuda / Forward |
0.000816958 s |
0.000818078 s |
1.00 |
jaxmd20 / Jax / cuda / Forward |
0.001778716 s |
0.001826107 s |
0.97 |
jaxmd20 / HLOOpt / cuda / Forward |
0.000825054 s |
0.000835102 s |
0.99 |
jaxmd20 / PartOpt / cuda / Forward |
0.000824191 s |
0.000867135 s |
0.95 |
jaxmd20 / IPartOpt / cuda / Forward |
0.000845822 s |
0.000834141 s |
1.01 |
jaxmd20 / DefOpt / cuda / Forward |
0.000819006 s |
0.000830559 s |
0.99 |
jaxmd20 / IDefOpt / cuda / Forward |
0.000815422 s |
0.000836286 s |
0.98 |
jaxmd20 / JaXPipe / cuda / PreRev |
0.001655678 s |
0.001664188 s |
0.99 |
jaxmd20 / JaXPipe / cuda / PostRev |
0.005335381 s |
0.005894961 s |
0.91 |
jaxmd20 / JaXPipe / cuda / BothRev |
0.0016737249999999 s |
0.00167104 s |
1.00 |
jaxmd20 / Jax / cuda / BothRev |
0.00527202 s |
0.005287989 s |
1.00 |
jaxmd20 / HLOOpt / cuda / PreRev |
0.00170166 s |
0.001727004 s |
0.99 |
jaxmd20 / HLOOpt / cuda / PostRev |
0.005159988 s |
0.005237779 s |
0.99 |
jaxmd20 / HLOOpt / cuda / BothRev |
0.001632412 s |
0.001671901 s |
0.98 |
jaxmd20 / PartOpt / cuda / PreRev |
0.0016958359999999 s |
0.001737211 s |
0.98 |
jaxmd20 / PartOpt / cuda / PostRev |
0.0053760519999999 s |
0.005424818 s |
0.99 |
jaxmd20 / PartOpt / cuda / BothRev |
0.001676828 s |
0.001691996 s |
0.99 |
jaxmd20 / IPartOpt / cuda / PreRev |
0.001702236 s |
0.001741371 s |
0.98 |
jaxmd20 / IPartOpt / cuda / PostRev |
0.005356308 s |
0.005383125 s |
1.00 |
jaxmd20 / IPartOpt / cuda / BothRev |
0.001631741 s |
0.001743899 s |
0.94 |
jaxmd20 / DefOpt / cuda / PreRev |
0.001703548 s |
0.001727099 s |
0.99 |
jaxmd20 / DefOpt / cuda / PostRev |
0.002708794 s |
0.002745498 s |
0.99 |
jaxmd20 / DefOpt / cuda / BothRev |
0.00164582 s |
0.001666843 s |
0.99 |
jaxmd20 / IDefOpt / cuda / PreRev |
0.0017023 s |
0.0018710999999999 s |
0.91 |
jaxmd20 / IDefOpt / cuda / PostRev |
0.001977756 s |
0.002012187 s |
0.98 |
jaxmd20 / IDefOpt / cuda / BothRev |
0.0016621079999999 s |
0.00165782 s |
1.00 |
jaxmd20 / JaXPipe / tpu / Primal |
0.009288338125 s |
0.009265228125 s |
1.00 |
jaxmd20 / Jax / tpu / Primal |
0.009269065625 s |
0.009272055 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Primal |
0.0091692 s |
0.009155241875 s |
1.00 |
jaxmd20 / PartOpt / tpu / Primal |
0.009199598125 s |
0.009206595625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Primal |
0.009203113125 s |
0.009205539375 s |
1.00 |
jaxmd20 / DefOpt / tpu / Primal |
0.00874739875 s |
0.00875223875 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Primal |
0.008637930625 s |
0.0086319706249999 s |
1.00 |
jaxmd20 / JaXPipe / tpu / Forward |
0.01726621125 s |
0.01725431125 s |
1.00 |
jaxmd20 / Jax / tpu / Forward |
0.018737995 s |
0.01872932625 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Forward |
0.01723638625 s |
0.01724062375 s |
1.00 |
jaxmd20 / PartOpt / tpu / Forward |
0.017268876875 s |
0.01726465875 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Forward |
0.01726452125 s |
0.01725511375 s |
1.00 |
jaxmd20 / DefOpt / tpu / Forward |
0.017262946875 s |
0.017267023125 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Forward |
0.017265083125 s |
0.017254851875 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PreRev |
0.02535844625 s |
0.0253632874999999 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PostRev |
0.02187210125 s |
0.02187114125 s |
1.00 |
jaxmd20 / JaXPipe / tpu / BothRev |
0.02537150875 s |
0.0253652781249999 s |
1.00 |
jaxmd20 / Jax / tpu / BothRev |
0.021875314375 s |
0.021876271875 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PreRev |
0.025359616875 s |
0.025353813125 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PostRev |
0.02097529875 s |
0.020961249375 s |
1.00 |
jaxmd20 / HLOOpt / tpu / BothRev |
0.025278398125 s |
0.025266944375 s |
1.00 |
jaxmd20 / PartOpt / tpu / PreRev |
0.02536749875 s |
0.0253546081249999 s |
1.00 |
jaxmd20 / PartOpt / tpu / PostRev |
0.021533745625 s |
0.0215015887499999 s |
1.00 |
jaxmd20 / PartOpt / tpu / BothRev |
0.02527951875 s |
0.02526319 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PreRev |
0.0253673556249999 s |
0.025354286875 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PostRev |
0.021524795 s |
0.021510863125 s |
1.00 |
jaxmd20 / IPartOpt / tpu / BothRev |
0.0252794025 s |
0.02527126125 s |
1.00 |
jaxmd20 / DefOpt / tpu / PreRev |
0.0253659875 s |
0.02534782875 s |
1.00 |
jaxmd20 / DefOpt / tpu / PostRev |
0.018918110625 s |
0.0188936225 s |
1.00 |
jaxmd20 / DefOpt / tpu / BothRev |
0.025282743125 s |
0.02526132 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PreRev |
0.025365989375 s |
0.025350525 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PostRev |
0.018413318125 s |
0.018375641875 s |
1.00 |
jaxmd20 / IDefOpt / tpu / BothRev |
0.025285955 s |
0.025267529375 s |
1.00 |
jaxmd40 / JaXPipe / cpu / Primal |
0.087128366 s |
0.073189519 s |
1.19 |
jaxmd40 / Jax / cpu / Primal |
0.062924865 s |
0.062088033 s |
1.01 |
jaxmd40 / HLOOpt / cpu / Primal |
0.093542024 s |
0.093305827 s |
1.00 |
jaxmd40 / PartOpt / cpu / Primal |
0.0719967099999999 s |
0.071012165 s |
1.01 |
jaxmd40 / IPartOpt / cpu / Primal |
0.076234853 s |
0.076429691 s |
1.00 |
jaxmd40 / DefOpt / cpu / Primal |
0.088710815 s |
0.097710628 s |
0.91 |
jaxmd40 / IDefOpt / cpu / Primal |
0.089218281 s |
0.094188508 s |
0.95 |
jaxmd40 / JaXPipe / cpu / Forward |
0.165594481 s |
0.181294098 s |
0.91 |
jaxmd40 / Jax / cpu / Forward |
0.093084362 s |
0.0968687699999999 s |
0.96 |
jaxmd40 / HLOOpt / cpu / Forward |
0.1701688969999999 s |
0.169265531 s |
1.01 |
jaxmd40 / PartOpt / cpu / Forward |
0.166875254 s |
0.171988099 s |
0.97 |
jaxmd40 / IPartOpt / cpu / Forward |
0.17398824 s |
0.167338572 s |
1.04 |
jaxmd40 / DefOpt / cpu / Forward |
0.17601385 s |
0.175491958 s |
1.00 |
jaxmd40 / IDefOpt / cpu / Forward |
0.174184795 s |
0.178809637 s |
0.97 |
jaxmd40 / JaXPipe / cpu / PreRev |
0.239904073 s |
0.2251418449999999 s |
1.07 |
jaxmd40 / JaXPipe / cpu / PostRev |
0.150545654 s |
0.1506595829999999 s |
1.00 |
jaxmd40 / JaXPipe / cpu / BothRev |
0.2375609279999999 s |
0.233851988 s |
1.02 |
jaxmd40 / Jax / cpu / BothRev |
0.150644459 s |
0.148079004 s |
1.02 |
jaxmd40 / HLOOpt / cpu / PreRev |
0.235047131 s |
0.234927513 s |
1.00 |
jaxmd40 / HLOOpt / cpu / PostRev |
0.191001227 s |
0.185470716 s |
1.03 |
jaxmd40 / HLOOpt / cpu / BothRev |
0.25071463 s |
0.2584409299999999 s |
0.97 |
jaxmd40 / PartOpt / cpu / PreRev |
0.225936246 s |
0.22314567 s |
1.01 |
jaxmd40 / PartOpt / cpu / PostRev |
0.134630885 s |
0.139577358 s |
0.96 |
jaxmd40 / PartOpt / cpu / BothRev |
0.258274218 s |
0.249477685 s |
1.04 |
jaxmd40 / IPartOpt / cpu / PreRev |
0.249821856 s |
0.229576528 s |
1.09 |
jaxmd40 / IPartOpt / cpu / PostRev |
0.132872998 s |
0.1397846999999999 s |
0.95 |
jaxmd40 / IPartOpt / cpu / BothRev |
0.241774627 s |
0.262264603 s |
0.92 |
jaxmd40 / DefOpt / cpu / PreRev |
0.24332245 s |
0.2235982 s |
1.09 |
jaxmd40 / DefOpt / cpu / PostRev |
0.185615815 s |
0.183224511 s |
1.01 |
jaxmd40 / DefOpt / cpu / BothRev |
0.235466411 s |
0.258078644 s |
0.91 |
jaxmd40 / IDefOpt / cpu / PreRev |
0.2314958379999999 s |
0.224068647 s |
1.03 |
jaxmd40 / IDefOpt / cpu / PostRev |
0.187687464 s |
0.172570329 s |
1.09 |
jaxmd40 / IDefOpt / cpu / BothRev |
0.254376159 s |
0.267373618 s |
0.95 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal |
1.704326045 s |
1.704129989 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal |
1.706411658 s |
1.706821368 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal |
1.718811061 s |
1.717827882 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal |
1.6996707510000002 s |
1.699017653 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal |
1.697329531 s |
1.6982513 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal |
1.711860507 s |
1.710406564 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal |
1.961061921 s |
1.962014622 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal |
3.952643924375 s |
4.019778363125 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal |
3.03899281125 s |
3.0385960193750003 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal |
3.12113594625 s |
3.12114431125 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal |
3.059245798125 s |
3.058878809375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal |
3.059230625 s |
3.05891877625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal |
2.2635729500000004 s |
2.2633443037500003 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal |
4.7431810775 s |
4.7428916 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal |
6.441740369 s |
6.305729684 s |
1.02 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal |
6.287252185000001 s |
6.161372106 s |
1.02 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal |
6.310646281 s |
6.139615052 s |
1.03 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal |
6.440268734 s |
6.266638879 s |
1.03 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal |
6.328368245 s |
6.203732882000001 s |
1.02 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal |
2.548313525 s |
2.440213385 s |
1.04 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal |
6.9867942990000005 s |
6.919413947 s |
1.01 |
This comment was automatically generated by workflow using github-action-benchmark.
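For readers checking the table above, here is a minimal sketch of how the third value in each row appears to relate to the two timings: the first timing divided by the second, rounded to two decimals. The column meanings are an assumption, since the header row is not part of this excerpt. The helper name `ratio` and the `rows` list are hypothetical, and the three sample rows are copied from the table.

```python
# Hypothetical helper: the third value in each row appears to be
# (first timing) / (second timing). Column meanings are assumed,
# because the table header is not shown in this excerpt.
def ratio(first_s: float, second_s: float) -> float:
    return first_s / second_s


# (benchmark name, first timing [s], second timing [s], reported ratio)
rows = [
    ("slicing / JaXPipe / cpu / Primal",
     0.000007520179997300147, 0.000008040859984248528, 0.94),
    ("slicing / JaXPipe / cpu / Primal (later block)",
     0.000012754, 0.000008040859984248528, 1.59),
    ("sum / PartOpt / cpu / Forward",
     0.000011959360003857, 0.00001706791999822599, 0.70),
]

for name, first_s, second_s, reported in rows:
    computed = round(ratio(first_s, second_s), 2)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

Under this reading, a value below 1 means the first timing is the faster one. Rows where both timings are `0 s` are left out of the sketch because the division is undefined there.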
9fe0697 to 6a1ec6d
No description provided.