RecognizeMultiRotate #1956
    // Check density - only combine if utilization is reasonable
    // Avoid creating a MultiRotateOp with many unused intermediate results
    if (totalResults > 2 * static_cast<int32_t>(rotates.size()))
      return failure();
Do we need such a check? This one might be too lenient?
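To make the leniency concern concrete, here is a small standalone sketch of the same inequality. It is not the PR's code: `spannedResults` and `passesDensityCheck` are hypothetical names, and the way the result count is derived (one result per amount in the range spanning 0 and all rotate amounts) is an assumption for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Assumption for illustration: the combined op yields one result per amount
// in [min(amounts, 0), max(amounts, 0)], so gaps turn into unused results.
int32_t spannedResults(const std::vector<int32_t> &amounts) {
  int32_t lo = 0, hi = 0;
  for (int32_t a : amounts) {
    lo = std::min(lo, a);
    hi = std::max(hi, a);
  }
  return hi - lo + 1;
}

// Same shape as the quoted check: reject only when the result count exceeds
// twice the number of rotates, i.e. when more than half would be unused.
bool passesDensityCheck(const std::vector<int32_t> &amounts) {
  return spannedResults(amounts) <= 2 * static_cast<int32_t>(amounts.size());
}
```

Under that assumption, rotates by 1, 3 and 5 span six results and still pass (6 <= 6) even though half of the results are unused, while rotates by 1 and 5 are rejected (6 > 4).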
I think we should throw if there are any gaps [also checking whether an op is use_empty], though we don't need to return failure; we can just resize [and select the relevant subset].
I added filtering for a contiguous sequence that neighbors or includes rotate(x, 0)
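A rough standalone sketch of that kind of filter, for readers who want to see the idea without MLIR types. `selectRunNearIdentity` is a hypothetical helper, not the function in this PR, and picking the longest qualifying run when several exist is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: given the rotate amounts found on one operand, keep
// only the contiguous run of amounts that includes 0 or directly neighbors
// it (i.e. the run ends at -1 or starts at 1).
std::vector<int32_t> selectRunNearIdentity(std::vector<int32_t> amounts) {
  std::sort(amounts.begin(), amounts.end());
  amounts.erase(std::unique(amounts.begin(), amounts.end()), amounts.end());

  std::vector<int32_t> best;
  std::size_t i = 0;
  while (i < amounts.size()) {
    // Extend the run while consecutive amounts differ by exactly 1.
    std::size_t j = i;
    while (j + 1 < amounts.size() &&
           static_cast<int64_t>(amounts[j]) + 1 == amounts[j + 1])
      ++j;
    int32_t lo = amounts[i], hi = amounts[j];
    // [lo, hi] contains 0 or touches it exactly when lo <= 1 and hi >= -1.
    bool qualifies = lo <= 1 && hi >= -1;
    if (qualifies && (j - i + 1) > best.size())
      best.assign(amounts.begin() + i, amounts.begin() + j + 1);
    i = j + 1;
  }
  // Mirror the "fewer than 2 qualifying amounts" bail-out from the review.
  if (best.size() < 2)
    best.clear();
  return best;
}
```

For example, amounts {-1, 0, 1, 5} keep {-1, 0, 1}, amounts {1, 2, 3} are kept whole because the run neighbors 0, and {3, 4, 5} yields nothing.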
Force-pushed from aecb25c to 424244f.
      return failure();

    // Create the MultiRotateOp
    auto newOp = rewriter.create<enzymexla::MultiRotateOp>(
We need to move the newOp so that it is defined right after the operand; otherwise we risk a dominance error.
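The ordering problem can be shown with a toy model that uses no MLIR at all. Everything below (`ToyOp`, `verifyDominance`, the two example blocks) is hypothetical and only models a straight-line block. In MLIR itself the usual remedy is to set the rewriter's insertion point before creating the op, for example with `setInsertionPointAfterValue`; whether the PR does exactly that is not shown in this thread.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Toy model of a straight-line block: each op defines one value (`result`)
// and reads zero or more values (`operands`). Hypothetical types, not MLIR.
struct ToyOp {
  std::string result;
  std::vector<std::string> operands;
};

// Returns true if every operand is a block argument or is defined by an
// earlier op in the block.
bool verifyDominance(const std::vector<ToyOp> &block,
                     const std::vector<std::string> &blockArgs) {
  std::unordered_set<std::string> defined(blockArgs.begin(), blockArgs.end());
  for (const ToyOp &op : block) {
    for (const std::string &operand : op.operands)
      if (!defined.count(operand))
        return false;
    defined.insert(op.result);
  }
  return true;
}

// Before the rewrite: r1 = rotate(x, 1); u = use(r1); r2 = rotate(x, 2).
// Merged op m created where the last rotate was: u now reads m too early.
std::vector<ToyOp> mergedAtLastRotate() {
  return {ToyOp{"u", {"m"}}, ToyOp{"m", {"x"}}};
}

// Merged op m created right after the definition of x: all users follow it.
std::vector<ToyOp> mergedAfterOperand() {
  return {ToyOp{"m", {"x"}}, ToyOp{"u", {"m"}}};
}
```

With `x` as the only block argument, `verifyDominance` rejects the first block and accepts the second.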
Force-pushed from 424244f to 1936050.
I've been confusing myself. Currently it's implemented as the former but I now think the latter might be more logical?
Force-pushed from 1936050 to 48557ca.
      }
    }
    contiguousGroups.push_back({groupStart, groupEnd});
We should do the contiguous group check before the check for at least 2 rotates [in case we reduce the number because some are not contiguous].
There's a second check as well:
    // No qualifying groups found
    if (qualifyingAmounts.size() < 2)
      return failure();
Ah okay
* check if the rotates are actually used
* only merge rotates if they form a contiguous sequence that neighbors (or includes) the identity rotation
Force-pushed from 14bea1f to 74d31ac.
Re the existing rotate id need to check, but look at the existing rotate lower and recognize? also perhaps we should add better docs on?
EnzymeJAX Benchmarks
| Benchmark suite | Current: 1a288d6 | Previous: c6280c7 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cpu / Primal | 0.000007811020013832604 s | 0.0000070109200623846845 s | 1.11 |
| actmtch / Jax / cpu / Primal | 0.000007301200012079789 s | 0.00000701169999956619 s | 1.04 |
| actmtch / HLOOpt / cpu / Primal | 0.000011415199996918091 s | 0.000010657899974830796 s | 1.07 |
| actmtch / PartOpt / cpu / Primal | 0.000006702959972244571 s | 0.000006732300034855143 s | 1.00 |
| actmtch / IPartOpt / cpu / Primal | 0.000006998340013524284 s | 0.0000068570599705708445 s | 1.02 |
| actmtch / DefOpt / cpu / Primal | 0.000007483140007025213 s | 0.000011533039996720618 s | 0.65 |
| actmtch / IDefOpt / cpu / Primal | 0.000007393980004053447 s | 0.000007239939995997702 s | 1.02 |
| actmtch / JaXPipe / cpu / Forward | 0.000011722619992724504 s | 0.00001105510002162191 s | 1.06 |
| actmtch / Jax / cpu / Forward | 0.000010145140004169662 s | 0.000009777999985089993 s | 1.04 |
| actmtch / HLOOpt / cpu / Forward | 0.000016489940016981563 s | 0.000014957540024624904 s | 1.10 |
| actmtch / PartOpt / cpu / Forward | 0.000016076980073194137 s | 0.000015249399966705823 s | 1.05 |
| actmtch / IPartOpt / cpu / Forward | 0.00001142214004175912 s | 0.000010884359990086522 s | 1.05 |
| actmtch / DefOpt / cpu / Forward | 0.000015992240023479098 s | 0.000015586240015181828 s | 1.03 |
| actmtch / IDefOpt / cpu / Forward | 0.000011382200000298326 s | 0.000010999039996022474 s | 1.03 |
| actmtch / JaXPipe / cpu / PreRev | 0.00001244667998435034 s | 0.00001187840001875884 s | 1.05 |
| actmtch / JaXPipe / cpu / PostRev | 0.00001178777999484737 s | 0.000011447939969002618 s | 1.03 |
| actmtch / JaXPipe / cpu / BothRev | 0.000012524619969553895 s | 0.00001139407995651709 s | 1.10 |
| actmtch / Jax / cpu / BothRev | 0.000011000319946106174 s | 0.0000104005599678203 s | 1.06 |
| actmtch / HLOOpt / cpu / PreRev | 0.00001269331999537826 s | 0.000011989220020041105 s | 1.06 |
| actmtch / HLOOpt / cpu / PostRev | 0.00001652812003158033 s | 0.000015558620016236092 s | 1.06 |
| actmtch / HLOOpt / cpu / BothRev | 0.000014140020011836896 s | 0.000013079480004307695 s | 1.08 |
| actmtch / PartOpt / cpu / PreRev | 0.00001252550003300712 s | 0.00001192708000417042 s | 1.05 |
| actmtch / PartOpt / cpu / PostRev | 0.000011804919950009207 s | 0.000010795760035762214 s | 1.09 |
| actmtch / PartOpt / cpu / BothRev | 0.00001279191999856266 s | 0.000011267960035183933 s | 1.14 |
| actmtch / IPartOpt / cpu / PreRev | 0.000012690299972746287 s | 0.000013439059994198031 s | 0.94 |
| actmtch / IPartOpt / cpu / PostRev | 0.000011765780018322402 s | 0.000010422020004625665 s | 1.13 |
| actmtch / IPartOpt / cpu / BothRev | 0.000012225419995957054 s | 0.000011947300017709494 s | 1.02 |
| actmtch / DefOpt / cpu / PreRev | 0.000011876260005010407 s | 0.00001179932004561124 s | 1.01 |
| actmtch / DefOpt / cpu / PostRev | 0.00001234206004482985 s | 0.00001211730002069089 s | 1.02 |
| actmtch / DefOpt / cpu / BothRev | 0.000012431639988790269 s | 0.000011895040051967951 s | 1.05 |
| actmtch / IDefOpt / cpu / PreRev | 0.000012736880025840946 s | 0.000011312639981042591 s | 1.13 |
| actmtch / IDefOpt / cpu / PostRev | 0.000012277780006115791 s | 0.000012055839970344096 s | 1.02 |
| actmtch / IDefOpt / cpu / BothRev | 0.000012370519953037728 s | 0.00001205063998895639 s | 1.03 |
| actmtch / JaXPipe / cuda / Primal | 0.000002015 s | 0.000002015 s | 1 |
| actmtch / Jax / cuda / Primal | 0.000002015 s | 0.000002016 s | 1.00 |
| actmtch / HLOOpt / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / PartOpt / cuda / Primal | 0.000002016 s | 0.000002015 s | 1.00 |
| actmtch / IPartOpt / cuda / Primal | 0.000002016 s | 0.000002015 s | 1.00 |
| actmtch / DefOpt / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / IDefOpt / cuda / Primal | 0.000002015 s | 0.000002016 s | 1.00 |
| actmtch / JaXPipe / cuda / Forward | 0.000010528 s | 0.0000104 s | 1.01 |
| actmtch / Jax / cuda / Forward | 0.000010432 s | 0.000011488 s | 0.91 |
| actmtch / HLOOpt / cuda / Forward | 0.000010208 s | 0.000011710999999999998 s | 0.87 |
| actmtch / PartOpt / cuda / Forward | 0.00000976 s | 0.000010113 s | 0.97 |
| actmtch / IPartOpt / cuda / Forward | 0.000010208 s | 0.000010176 s | 1.00 |
| actmtch / DefOpt / cuda / Forward | 0.000009888 s | 0.000009888 s | 1 |
| actmtch / IDefOpt / cuda / Forward | 0.00001008 s | 0.000009952 s | 1.01 |
| actmtch / JaXPipe / cuda / PreRev | 0.000010144 s | 0.000010112 s | 1.00 |
| actmtch / JaXPipe / cuda / PostRev | 0.000010464 s | 0.000010784 s | 0.97 |
| actmtch / JaXPipe / cuda / BothRev | 0.000010527 s | 0.00000992 s | 1.06 |
| actmtch / Jax / cuda / BothRev | 0.000012 s | 0.000010111 s | 1.19 |
| actmtch / HLOOpt / cuda / PreRev | 0.000010049 s | 0.000010112 s | 0.99 |
| actmtch / HLOOpt / cuda / PostRev | 0.000010048 s | 0.000010048 s | 1 |
| actmtch / HLOOpt / cuda / BothRev | 0.000010016 s | 0.000010208 s | 0.98 |
| actmtch / PartOpt / cuda / PreRev | 0.000009952 s | 0.000011136 s | 0.89 |
| actmtch / PartOpt / cuda / PostRev | 0.000009888 s | 0.000009888 s | 1 |
| actmtch / PartOpt / cuda / BothRev | 0.000010048 s | 0.000009952 s | 1.01 |
| actmtch / IPartOpt / cuda / PreRev | 0.000010272 s | 0.000010017 s | 1.03 |
| actmtch / IPartOpt / cuda / PostRev | 0.000010144 s | 0.000010112 s | 1.00 |
| actmtch / IPartOpt / cuda / BothRev | 0.000009952 s | 0.000009536 s | 1.04 |
| actmtch / DefOpt / cuda / PreRev | 0.000010208 s | 0.000010016 s | 1.02 |
| actmtch / DefOpt / cuda / PostRev | 0.000010336 s | 0.000010049 s | 1.03 |
| actmtch / DefOpt / cuda / BothRev | 0.000009536 s | 0.000010111 s | 0.94 |
| actmtch / IDefOpt / cuda / PreRev | 0.0000104 s | 0.00000976 s | 1.07 |
| actmtch / IDefOpt / cuda / PostRev | 0.000010304 s | 0.000010432 s | 0.99 |
| actmtch / IDefOpt / cuda / BothRev | 0.000009792 s | 0.000009984 s | 0.98 |
| actmtch / JaXPipe / tpu / Primal | 5.63475e-7 s | 5.63675e-7 s | 1.00 |
| actmtch / Jax / tpu / Primal | 6.06775e-7 s | 6.068500000000001e-7 s | 1.00 |
| actmtch / HLOOpt / tpu / Primal | 0.0000021026 s | 0.00000210545 s | 1.00 |
| actmtch / PartOpt / tpu / Primal | 6.06275e-7 s | 6.06625e-7 s | 1.00 |
| actmtch / IPartOpt / tpu / Primal | 5.628999999999999e-7 s | 5.62425e-7 s | 1.00 |
| actmtch / DefOpt / tpu / Primal | 0.000002165475 s | 0.000002154425 s | 1.01 |
| actmtch / IDefOpt / tpu / Primal | 0.000002111075 s | 0.000002103825 s | 1.00 |
| actmtch / JaXPipe / tpu / Forward | 0.000003830625 s | 0.000003831975 s | 1.00 |
| actmtch / Jax / tpu / Forward | 0.00000122195 s | 0.000001222375 s | 1.00 |
| actmtch / HLOOpt / tpu / Forward | 0.0000039358250000000006 s | 0.000003929075000000001 s | 1.00 |
| actmtch / PartOpt / tpu / Forward | 0.000003911525 s | 0.00000391155 s | 1.00 |
| actmtch / IPartOpt / tpu / Forward | 0.000003925375 s | 0.000003937450000000001 s | 1.00 |
| actmtch / DefOpt / tpu / Forward | 0.000003915025 s | 0.000003910825 s | 1.00 |
| actmtch / IDefOpt / tpu / Forward | 0.000003938025 s | 0.000003938025 s | 1 |
| actmtch / JaXPipe / tpu / PreRev | 0.000003467450000000001 s | 0.000003491825 s | 0.99 |
| actmtch / JaXPipe / tpu / PostRev | 0.00000164135 s | 0.000001641675 s | 1.00 |
| actmtch / JaXPipe / tpu / BothRev | 0.0000034801 s | 0.0000034855750000000005 s | 1.00 |
| actmtch / Jax / tpu / BothRev | 0.00000164795 s | 0.00000163335 s | 1.01 |
| actmtch / HLOOpt / tpu / PreRev | 0.0000034749000000000004 s | 0.000003482125 s | 1.00 |
| actmtch / HLOOpt / tpu / PostRev | 0.000003407175 s | 0.0000034108 s | 1.00 |
| actmtch / HLOOpt / tpu / BothRev | 0.00000348205 s | 0.0000034754 s | 1.00 |
| actmtch / PartOpt / tpu / PreRev | 0.000003413975 s | 0.00000341845 s | 1.00 |
| actmtch / PartOpt / tpu / PostRev | 0.0000015952 s | 0.000001588075 s | 1.00 |
| actmtch / PartOpt / tpu / BothRev | 0.00000340825 s | 0.0000034159499999999995 s | 1.00 |
| actmtch / IPartOpt / tpu / PreRev | 0.000003471125 s | 0.0000034752250000000003 s | 1.00 |
| actmtch / IPartOpt / tpu / PostRev | 0.0000016378 s | 0.000001642975 s | 1.00 |
| actmtch / IPartOpt / tpu / BothRev | 0.0000034876 s | 0.0000034939750000000003 s | 1.00 |
| actmtch / DefOpt / tpu / PreRev | 0.000003413075 s | 0.000003421725 s | 1.00 |
| actmtch / DefOpt / tpu / PostRev | 0.000003418125 s | 0.000003405425 s | 1.00 |
| actmtch / DefOpt / tpu / BothRev | 0.00000340895 s | 0.000003413275 s | 1.00 |
| actmtch / IDefOpt / tpu / PreRev | 0.000003473825 s | 0.0000034806 s | 1.00 |
| actmtch / IDefOpt / tpu / PostRev | 0.00000341255 s | 0.0000033974500000000004 s | 1.00 |
| actmtch / IDefOpt / tpu / BothRev | 0.0000034711750000000003 s | 0.0000034731750000000004 s | 1.00 |
| actmtch / JaXPipe / cpu / Primal | 0.000013682 s | 0.0000070109200623846845 s | 1.95 |
| actmtch / Jax / cpu / Primal | 0.000014005 s | 0.00000701169999956619 s | 2.00 |
| actmtch / HLOOpt / cpu / Primal | 0.000014474 s | 0.000010657899974830796 s | 1.36 |
| actmtch / PartOpt / cpu / Primal | 0.000013438 s | 0.000006732300034855143 s | 2.00 |
| actmtch / IPartOpt / cpu / Primal | 0.00001357 s | 0.0000068570599705708445 s | 1.98 |
| actmtch / DefOpt / cpu / Primal | 0.000014415 s | 0.000011533039996720618 s | 1.25 |
| actmtch / IDefOpt / cpu / Primal | 0.000014195 s | 0.000007239939995997702 s | 1.96 |
| actmtch / JaXPipe / cpu / Forward | 0.000019397 s | 0.00001105510002162191 s | 1.75 |
| actmtch / Jax / cpu / Forward | 0.000017956 s | 0.000009777999985089993 s | 1.84 |
| actmtch / HLOOpt / cpu / Forward | 0.000019521 s | 0.000014957540024624904 s | 1.31 |
| actmtch / PartOpt / cpu / Forward | 0.000018891000000000003 s | 0.000015249399966705823 s | 1.24 |
| actmtch / IPartOpt / cpu / Forward | 0.000019068 s | 0.000010884359990086522 s | 1.75 |
| actmtch / DefOpt / cpu / Forward | 0.000019484 s | 0.000015586240015181828 s | 1.25 |
| actmtch / IDefOpt / cpu / Forward | 0.0000191 s | 0.000010999039996022474 s | 1.74 |
| actmtch / JaXPipe / cpu / PreRev | 0.000019634 s | 0.00001187840001875884 s | 1.65 |
| actmtch / JaXPipe / cpu / PostRev | 0.000017126 s | 0.000011447939969002618 s | 1.50 |
| actmtch / JaXPipe / cpu / BothRev | 0.000020024 s | 0.00001139407995651709 s | 1.76 |
| actmtch / Jax / cpu / BothRev | 0.000017763000000000003 s | 0.0000104005599678203 s | 1.71 |
| actmtch / HLOOpt / cpu / PreRev | 0.000019076 s | 0.000011989220020041105 s | 1.59 |
| actmtch / HLOOpt / cpu / PostRev | 0.000020375 s | 0.000015558620016236092 s | 1.31 |
| actmtch / HLOOpt / cpu / BothRev | 0.000019764 s | 0.000013079480004307695 s | 1.51 |
| actmtch / PartOpt / cpu / PreRev | 0.000019239 s | 0.00001192708000417042 s | 1.61 |
| actmtch / PartOpt / cpu / PostRev | 0.000017598 s | 0.000010795760035762214 s | 1.63 |
| actmtch / PartOpt / cpu / BothRev | 0.000019187 s | 0.000011267960035183933 s | 1.70 |
| actmtch / IPartOpt / cpu / PreRev | 0.00001939 s | 0.000013439059994198031 s | 1.44 |
| actmtch / IPartOpt / cpu / PostRev | 0.000017676999999999997 s | 0.000010422020004625665 s | 1.70 |
| actmtch / IPartOpt / cpu / BothRev | 0.000019786 s | 0.000011947300017709494 s | 1.66 |
| actmtch / DefOpt / cpu / PreRev | 0.00001996 s | 0.00001179932004561124 s | 1.69 |
| actmtch / DefOpt / cpu / PostRev | 0.000019288 s | 0.00001211730002069089 s | 1.59 |
| actmtch / DefOpt / cpu / BothRev | 0.000019826 s | 0.000011895040051967951 s | 1.67 |
| actmtch / IDefOpt / cpu / PreRev | 0.00001941 s | 0.000011312639981042591 s | 1.72 |
| actmtch / IDefOpt / cpu / PostRev | 0.000019606 s | 0.000012055839970344096 s | 1.63 |
| actmtch / IDefOpt / cpu / BothRev | 0.000019789 s | 0.00001205063998895639 s | 1.64 |
| actmtch / JaXPipe / cpu / Primal | 0.000008999999999999999 s | 0.0000070109200623846845 s | 1.28 |
| actmtch / Jax / cpu / Primal | 0.000008999999999999999 s | 0.00000701169999956619 s | 1.28 |
| actmtch / HLOOpt / cpu / Primal | 0.00001 s | 0.000010657899974830796 s | 0.94 |
| actmtch / PartOpt / cpu / Primal | 0.000008999999999999999 s | 0.000006732300034855143 s | 1.34 |
| actmtch / IPartOpt / cpu / Primal | 0.000008999999999999999 s | 0.0000068570599705708445 s | 1.31 |
| actmtch / DefOpt / cpu / Primal | 0.000008999999999999999 s | 0.000011533039996720618 s | 0.78 |
| actmtch / IDefOpt / cpu / Primal | 0.00001 s | 0.000007239939995997702 s | 1.38 |
| actmtch / JaXPipe / cpu / Forward | 0.000013 s | 0.00001105510002162191 s | 1.18 |
| actmtch / Jax / cpu / Forward | 0.000012 s | 0.000009777999985089993 s | 1.23 |
| actmtch / HLOOpt / cpu / Forward | 0.000014 s | 0.000014957540024624904 s | 0.94 |
| actmtch / PartOpt / cpu / Forward | 0.000014 s | 0.000015249399966705823 s | 0.92 |
| actmtch / IPartOpt / cpu / Forward | 0.000013 s | 0.000010884359990086522 s | 1.19 |
| actmtch / DefOpt / cpu / Forward | 0.000013 s | 0.000015586240015181828 s | 0.83 |
| actmtch / IDefOpt / cpu / Forward | 0.000013 s | 0.000010999039996022474 s | 1.18 |
| actmtch / JaXPipe / cpu / PreRev | 0.000014 s | 0.00001187840001875884 s | 1.18 |
| actmtch / JaXPipe / cpu / PostRev | 0.000012 s | 0.000011447939969002618 s | 1.05 |
| actmtch / JaXPipe / cpu / BothRev | 0.000014 s | 0.00001139407995651709 s | 1.23 |
| actmtch / Jax / cpu / BothRev | 0.000012 s | 0.0000104005599678203 s | 1.15 |
| actmtch / HLOOpt / cpu / PreRev | 0.000013 s | 0.000011989220020041105 s | 1.08 |
| actmtch / HLOOpt / cpu / PostRev | 0.000013 s | 0.000015558620016236092 s | 0.84 |
| actmtch / HLOOpt / cpu / BothRev | 0.000014 s | 0.000013079480004307695 s | 1.07 |
| actmtch / PartOpt / cpu / PreRev | 0.000014 s | 0.00001192708000417042 s | 1.17 |
| actmtch / PartOpt / cpu / PostRev | 0.000013 s | 0.000010795760035762214 s | 1.20 |
| actmtch / PartOpt / cpu / BothRev | 0.000015 s | 0.000011267960035183933 s | 1.33 |
| actmtch / IPartOpt / cpu / PreRev | 0.000014 s | 0.000013439059994198031 s | 1.04 |
| actmtch / IPartOpt / cpu / PostRev | 0.000012 s | 0.000010422020004625665 s | 1.15 |
| actmtch / IPartOpt / cpu / BothRev | 0.000014 s | 0.000011947300017709494 s | 1.17 |
| actmtch / DefOpt / cpu / PreRev | 0.000014 s | 0.00001179932004561124 s | 1.19 |
| actmtch / DefOpt / cpu / PostRev | 0.000013 s | 0.00001211730002069089 s | 1.07 |
| actmtch / DefOpt / cpu / BothRev | 0.000013 s | 0.000011895040051967951 s | 1.09 |
| actmtch / IDefOpt / cpu / PreRev | 0.000013 s | 0.000011312639981042591 s | 1.15 |
| actmtch / IDefOpt / cpu / PostRev | 0.000013 s | 0.000012055839970344096 s | 1.08 |
| actmtch / IDefOpt / cpu / BothRev | 0.000013 s | 0.00001205063998895639 s | 1.08 |
| add_one / JaXPipe / cpu / Primal | 0.000008299000010083546 s | 0.000007610080019730958 s | 1.09 |
| add_one / Jax / cpu / Primal | 0.000008125500007736264 s | 0.000007455760060111061 s | 1.09 |
| add_one / HLOOpt / cpu / Primal | 0.000011582559964153917 s | 0.000009914300017044298 s | 1.17 |
| add_one / PartOpt / cpu / Primal | 0.000007942099991851136 s | 0.000006775040001230081 s | 1.17 |
| add_one / IPartOpt / cpu / Primal | 0.000007567600023321574 s | 0.0000064760799796204085 s | 1.17 |
| add_one / DefOpt / cpu / Primal | 0.000011937939971176091 s | 0.000010284079962730177 s | 1.16 |
| add_one / IDefOpt / cpu / Primal | 0.000007066259986459044 s | 0.000006921240037627286 s | 1.02 |
| add_one / JaXPipe / cpu / Forward | 0.000011016719954568545 s | 0.00001079442002264841 s | 1.02 |
| add_one / Jax / cpu / Forward | 0.00001118312003200117 s | 0.000010645579977790476 s | 1.05 |
| add_one / HLOOpt / cpu / Forward | 0.00001585782006259251 s | 0.000014617440019719651 s | 1.08 |
| add_one / PartOpt / cpu / Forward | 0.000016053759991336848 s | 0.000015689119982198464 s | 1.02 |
| add_one / IPartOpt / cpu / Forward | 0.000011629020036707516 s | 0.000010346240042053978 s | 1.12 |
| add_one / DefOpt / cpu / Forward | 0.00001581619997523376 s | 0.000010955160005323706 s | 1.44 |
| add_one / IDefOpt / cpu / Forward | 0.000011246000003666267 s | 0.00001038683998558554 s | 1.08 |
| add_one / JaXPipe / cpu / PreRev | 0.000012999200016565738 s | 0.00001245874000233016 s | 1.04 |
| add_one / JaXPipe / cpu / PostRev | 0.00001272888001949468 s | 0.000012182040009065532 s | 1.04 |
| add_one / JaXPipe / cpu / BothRev | 0.00001680824000686698 s | 0.000011788040019382608 s | 1.43 |
| add_one / Jax / cpu / BothRev | 0.000012509200014392264 s | 0.000012370640006338362 s | 1.01 |
| add_one / HLOOpt / cpu / PreRev | 0.000012748379995173308 s | 0.000012197079968245816 s | 1.05 |
| add_one / HLOOpt / cpu / PostRev | 0.000016790900017440434 s | 0.00001592787995832623 s | 1.05 |
| add_one / HLOOpt / cpu / BothRev | 0.000014303660000223316 s | 0.000014361539979290682 s | 1.00 |
| add_one / PartOpt / cpu / PreRev | 0.000012691959982475964 s | 0.000012441599956218852 s | 1.02 |
| add_one / PartOpt / cpu / PostRev | 0.000012468760014598956 s | 0.000012768499964295188 s | 0.98 |
| add_one / PartOpt / cpu / BothRev | 0.0000124934000086796 s | 0.000012874940011897706 s | 0.97 |
| add_one / IPartOpt / cpu / PreRev | 0.000017635099993640324 s | 0.000015121999986149604 s | 1.17 |
| add_one / IPartOpt / cpu / PostRev | 0.000012381859978631835 s | 0.000012084039990440944 s | 1.02 |
| add_one / IPartOpt / cpu / BothRev | 0.000012873900004706227 s | 0.000011877479973918524 s | 1.08 |
| add_one / DefOpt / cpu / PreRev | 0.00001259321998077212 s | 0.000012192400008643745 s | 1.03 |
| add_one / DefOpt / cpu / PostRev | 0.000013065480043223945 s | 0.000012204380000184756 s | 1.07 |
| add_one / DefOpt / cpu / BothRev | 0.00001291836000746116 s | 0.000012285359971428987 s | 1.05 |
| add_one / IDefOpt / cpu / PreRev | 0.000012939040043420392 s | 0.0000128007200146385 s | 1.01 |
| add_one / IDefOpt / cpu / PostRev | 0.000013799180051137227 s | 0.000012469899966163211 s | 1.11 |
| add_one / IDefOpt / cpu / BothRev | 0.000013007160032429965 s | 0.00001239260000147624 s | 1.05 |
| add_one / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / JaXPipe / cuda / Forward | 0.000010593 s | 0.000010047 s | 1.05 |
| add_one / Jax / cuda / Forward | 0.000009728 s | 0.000009472 s | 1.03 |
| add_one / HLOOpt / cuda / Forward | 0.000010336 s | 0.000010176 s | 1.02 |
| add_one / PartOpt / cuda / Forward | 0.000010048 s | 0.000009888 s | 1.02 |
| add_one / IPartOpt / cuda / Forward | 0.000010049 s | 0.000009952 s | 1.01 |
| add_one / DefOpt / cuda / Forward | 0.00001024 s | 0.000010015 s | 1.02 |
| add_one / IDefOpt / cuda / Forward | 0.00001024 s | 0.00001008 s | 1.02 |
| add_one / JaXPipe / cuda / PreRev | 0.000025408 s | 0.000025792 s | 0.99 |
| add_one / JaXPipe / cuda / PostRev | 0.00002592 s | 0.000025568 s | 1.01 |
| add_one / JaXPipe / cuda / BothRev | 0.00002576 s | 0.000024991 s | 1.03 |
| add_one / Jax / cuda / BothRev | 0.000025312 s | 0.00002528 s | 1.00 |
| add_one / HLOOpt / cuda / PreRev | 0.00002528 s | 0.00002576 s | 0.98 |
| add_one / HLOOpt / cuda / PostRev | 0.000025025 s | 0.000025056 s | 1.00 |
| add_one / HLOOpt / cuda / BothRev | 0.000025184 s | 0.00002512 s | 1.00 |
| add_one / PartOpt / cuda / PreRev | 0.00002464 s | 0.000025215 s | 0.98 |
| add_one / PartOpt / cuda / PostRev | 0.000025632 s | 0.000024768 s | 1.03 |
| add_one / PartOpt / cuda / BothRev | 0.000026975 s | 0.000025024 s | 1.08 |
| add_one / IPartOpt / cuda / PreRev | 0.000025632 s | 0.0000264 s | 0.97 |
| add_one / IPartOpt / cuda / PostRev | 0.00002576 s | 0.000025632 s | 1.00 |
| add_one / IPartOpt / cuda / BothRev | 0.00002528 s | 0.000024607 s | 1.03 |
| add_one / DefOpt / cuda / PreRev | 0.000025504 s | 0.000024896 s | 1.02 |
| add_one / DefOpt / cuda / PostRev | 0.000025633 s | 0.000025184 s | 1.02 |
| add_one / DefOpt / cuda / BothRev | 0.00002608 s | 0.000024928 s | 1.05 |
| add_one / IDefOpt / cuda / PreRev | 0.000030496 s | 0.000024512 s | 1.24 |
| add_one / IDefOpt / cuda / PostRev | 0.000031008 s | 0.000030144 s | 1.03 |
| add_one / IDefOpt / cuda / BothRev | 0.000025536 s | 0.000025215 s | 1.01 |
| add_one / JaXPipe / tpu / Primal | 0.0000014255000000000002 s | 0.0000014410500000000002 s | 0.99 |
| add_one / Jax / tpu / Primal | 0.000001399025 s | 0.00000141065 s | 0.99 |
| add_one / HLOOpt / tpu / Primal | 0.0000014289250000000005 s | 0.0000014297999999999998 s | 1.00 |
| add_one / PartOpt / tpu / Primal | 0.000001402975 s | 0.00000142075 s | 0.99 |
| add_one / IPartOpt / tpu / Primal | 0.000001433675 s | 0.0000014248250000000002 s | 1.01 |
| add_one / DefOpt / tpu / Primal | 0.000001402475 s | 0.0000013976 s | 1.00 |
| add_one / IDefOpt / tpu / Primal | 0.000001423575 s | 0.0000014301 s | 1.00 |
| add_one / JaXPipe / tpu / Forward | 0.000001855425 s | 0.0000018513 s | 1.00 |
| add_one / Jax / tpu / Forward | 0.000001841025 s | 0.000001847325 s | 1.00 |
| add_one / HLOOpt / tpu / Forward | 0.000001855625 s | 0.000001859675 s | 1.00 |
| add_one / PartOpt / tpu / Forward | 0.000001834875 s | 0.000001844325 s | 0.99 |
| add_one / IPartOpt / tpu / Forward | 0.00000185095 s | 0.000001851 s | 1.00 |
| add_one / DefOpt / tpu / Forward | 0.000001838275 s | 0.0000018474 s | 1.00 |
| add_one / IDefOpt / tpu / Forward | 0.000001848525 s | 0.0000018475 s | 1.00 |
| add_one / JaXPipe / tpu / PreRev | 0.000002244475 s | 0.0000022364 s | 1.00 |
| add_one / JaXPipe / tpu / PostRev | 0.0000022402 s | 0.0000022444 s | 1.00 |
| add_one / JaXPipe / tpu / BothRev | 0.000002243775 s | 0.0000022395000000000003 s | 1.00 |
| add_one / Jax / tpu / BothRev | 0.000002241 s | 0.0000022446 s | 1.00 |
| add_one / HLOOpt / tpu / PreRev | 0.000002235875 s | 0.00000223825 s | 1.00 |
| add_one / HLOOpt / tpu / PostRev | 0.000002241075 s | 0.0000022425 s | 1.00 |
| add_one / HLOOpt / tpu / BothRev | 0.000002239575 s | 0.000002243675 s | 1.00 |
| add_one / PartOpt / tpu / PreRev | 0.00000224675 s | 0.000002238825 s | 1.00 |
| add_one / PartOpt / tpu / PostRev | 0.0000022424 s | 0.00000223935 s | 1.00 |
| add_one / PartOpt / tpu / BothRev | 0.000002242625 s | 0.0000022354750000000004 s | 1.00 |
| add_one / IPartOpt / tpu / PreRev | 0.000002237075 s | 0.0000022375999999999995 s | 1.00 |
| add_one / IPartOpt / tpu / PostRev | 0.000002234775 s | 0.00000224575 s | 1.00 |
| add_one / IPartOpt / tpu / BothRev | 0.0000022371750000000003 s | 0.00000224635 s | 1.00 |
| add_one / DefOpt / tpu / PreRev | 0.000002240325 s | 0.0000022418250000000003 s | 1.00 |
| add_one / DefOpt / tpu / PostRev | 0.0000022512 s | 0.000002233525 s | 1.01 |
| add_one / DefOpt / tpu / BothRev | 0.00000224285 s | 0.0000022407 s | 1.00 |
| add_one / IDefOpt / tpu / PreRev | 0.0000022366 s | 0.00000223925 s | 1.00 |
| add_one / IDefOpt / tpu / PostRev | 0.000002240025 s | 0.00000224175 s | 1.00 |
| add_one / IDefOpt / tpu / BothRev | 0.000002239425 s | 0.00000223755 s | 1.00 |
| add_one / JaXPipe / cpu / Primal | 0.000013098 s | 0.000007610080019730958 s | 1.72 |
| add_one / Jax / cpu / Primal | 0.000013449 s | 0.000007455760060111061 s | 1.80 |
| add_one / HLOOpt / cpu / Primal | 0.000013149 s | 0.000009914300017044298 s | 1.33 |
| add_one / PartOpt / cpu / Primal | 0.000013058 s | 0.000006775040001230081 s | 1.93 |
| add_one / IPartOpt / cpu / Primal | 0.000013057 s | 0.0000064760799796204085 s | 2.02 |
| add_one / DefOpt / cpu / Primal | 0.000012978 s | 0.000010284079962730177 s | 1.26 |
| add_one / IDefOpt / cpu / Primal | 0.000013122 s | 0.000006921240037627286 s | 1.90 |
| add_one / JaXPipe / cpu / Forward | 0.000018002 s | 0.00001079442002264841 s | 1.67 |
| add_one / Jax / cpu / Forward | 0.000017641 s | 0.000010645579977790476 s | 1.66 |
| add_one / HLOOpt / cpu / Forward | 0.000017415 s | 0.000014617440019719651 s | 1.19 |
| add_one / PartOpt / cpu / Forward | 0.000017259 s | 0.000015689119982198464 s | 1.10 |
| add_one / IPartOpt / cpu / Forward | 0.000017624999999999998 s | 0.000010346240042053978 s | 1.70 |
| add_one / DefOpt / cpu / Forward | 0.000017826000000000002 s | 0.000010955160005323706 s | 1.63 |
| add_one / IDefOpt / cpu / Forward | 0.00001763 s | 0.00001038683998558554 s | 1.70 |
| add_one / JaXPipe / cpu / PreRev | 0.000020413 s | 0.00001245874000233016 s | 1.64 |
| add_one / JaXPipe / cpu / PostRev | 0.000020815 s | 0.000012182040009065532 s | 1.71 |
| add_one / JaXPipe / cpu / BothRev | 0.000020504 s | 0.000011788040019382608 s | 1.74 |
| add_one / Jax / cpu / BothRev | 0.000019606 s | 0.000012370640006338362 s | 1.58 |
| add_one / HLOOpt / cpu / PreRev | 0.000019897000000000003 s | 0.000012197079968245816 s | 1.63 |
| add_one / HLOOpt / cpu / PostRev | 0.000020442 s | 0.00001592787995832623 s | 1.28 |
| add_one / HLOOpt / cpu / BothRev | 0.000019423 s | 0.000014361539979290682 s | 1.35 |
| add_one / PartOpt / cpu / PreRev | 0.000019763 s | 0.000012441599956218852 s | 1.59 |
| add_one / PartOpt / cpu / PostRev | 0.000020281 s | 0.000012768499964295188 s | 1.59 |
| add_one / PartOpt / cpu / BothRev | 0.000020445 s | 0.000012874940011897706 s | 1.59 |
| add_one / IPartOpt / cpu / PreRev | 0.000019924 s | 0.000015121999986149604 s | 1.32 |
| add_one / IPartOpt / cpu / PostRev | 0.000020368 s | 0.000012084039990440944 s | 1.69 |
| add_one / IPartOpt / cpu / BothRev | 0.000020543 s | 0.000011877479973918524 s | 1.73 |
| add_one / DefOpt / cpu / PreRev | 0.00002032 s | 0.000012192400008643745 s | 1.67 |
| add_one / DefOpt / cpu / PostRev | 0.000020467 s | 0.000012204380000184756 s | 1.68 |
| add_one / DefOpt / cpu / BothRev | 0.000020383 s | 0.000012285359971428987 s | 1.66 |
| add_one / IDefOpt / cpu / PreRev | 0.000019767 s | 0.0000128007200146385 s | 1.54 |
| add_one / IDefOpt / cpu / PostRev | 0.000020056 s | 0.000012469899966163211 s | 1.61 |
| add_one / IDefOpt / cpu / BothRev | 0.000019824 s | 0.00001239260000147624 s | 1.60 |
| add_one / JaXPipe / cpu / Primal | 0.000008999999999999999 s | 0.000007610080019730958 s | 1.18 |
| add_one / Jax / cpu / Primal | 0.000008 s | 0.000007455760060111061 s | 1.07 |
| add_one / HLOOpt / cpu / Primal | 0.000008999999999999999 s | 0.000009914300017044298 s | 0.91 |
| add_one / PartOpt / cpu / Primal | 0.000008 s | 0.000006775040001230081 s | 1.18 |
| add_one / IPartOpt / cpu / Primal | 0.000008999999999999999 s | 0.0000064760799796204085 s | 1.39 |
| add_one / DefOpt / cpu / Primal | 0.000008999999999999999 s | 0.000010284079962730177 s | 0.88 |
| add_one / IDefOpt / cpu / Primal | 0.000008999999999999999 s | 0.000006921240037627286 s | 1.30 |
| add_one / JaXPipe / cpu / Forward | 0.000012 s | 0.00001079442002264841 s | 1.11 |
| add_one / Jax / cpu / Forward | 0.000013 s | 0.000010645579977790476 s | 1.22 |
| add_one / HLOOpt / cpu / Forward | 0.000012 s | 0.000014617440019719651 s | 0.82 |
| add_one / PartOpt / cpu / Forward | 0.000012 s | 0.000015689119982198464 s | 0.76 |
| add_one / IPartOpt / cpu / Forward | 0.000013 s | 0.000010346240042053978 s | 1.26 |
| add_one / DefOpt / cpu / Forward | 0.000012 s | 0.000010955160005323706 s | 1.10 |
| add_one / IDefOpt / cpu / Forward | 0.000012 s | 0.00001038683998558554 s | 1.16 |
| add_one / JaXPipe / cpu / PreRev | 0.000014 s | 0.00001245874000233016 s | 1.12 |
| add_one / JaXPipe / cpu / PostRev | 0.000014 s | 0.000012182040009065532 s | 1.15 |
| add_one / JaXPipe / cpu / BothRev | 0.000013 s | 0.000011788040019382608 s | 1.10 |
| add_one / Jax / cpu / BothRev | 0.000013 s | 0.000012370640006338362 s | 1.05 |
| add_one / HLOOpt / cpu / PreRev | 0.000013 s | 0.000012197079968245816 s | 1.07 |
| add_one / HLOOpt / cpu / PostRev | 0.000014 s | 0.00001592787995832623 s | 0.88 |
| add_one / HLOOpt / cpu / BothRev | 0.000014 s | 0.000014361539979290682 s | 0.97 |
| add_one / PartOpt / cpu / PreRev | 0.000013 s | 0.000012441599956218852 s | 1.04 |
| add_one / PartOpt / cpu / PostRev | 0.000014 s | 0.000012768499964295188 s | 1.10 |
| add_one / PartOpt / cpu / BothRev | 0.000014 s | 0.000012874940011897706 s | 1.09 |
| add_one / IPartOpt / cpu / PreRev | 0.000043 s | 0.000015121999986149604 s | 2.84 |
| add_one / IPartOpt / cpu / PostRev | 0.000017 s | 0.000012084039990440944 s | 1.41 |
| add_one / IPartOpt / cpu / BothRev | 0.000014 s | 0.000011877479973918524 s | 1.18 |
| add_one / DefOpt / cpu / PreRev | 0.000014 s | 0.000012192400008643745 s | 1.15 |
| add_one / DefOpt / cpu / PostRev | 0.000014 s | 0.000012204380000184756 s | 1.15 |
| add_one / DefOpt / cpu / BothRev | 0.000015 s | 0.000012285359971428987 s | 1.22 |
| add_one / IDefOpt / cpu / PreRev | 0.000014 s | 0.0000128007200146385 s | 1.09 |
| add_one / IDefOpt / cpu / PostRev | 0.000029 s | 0.000012469899966163211 s | 2.33 |
| add_one / IDefOpt / cpu / BothRev | 0.000015 s | 0.00001239260000147624 s | 1.21 |
| add_two / JaXPipe / cpu / Primal | 0.000008081239984676358 s | 0.000008162819976860192 s | 0.99 |
| add_two / Jax / cpu / Primal | 0.000006833220004409668 s | 0.000007155640005294117 s | 0.95 |
| add_two / HLOOpt / cpu / Primal | 0.000011251159985476989 s | 0.000010254219951093546 s | 1.10 |
| add_two / PartOpt / cpu / Primal | 0.00000686575999679917 s | 0.00000718430002052628 s | 0.96 |
| add_two / IPartOpt / cpu / Primal | 0.00000699128000633209 s | 0.000007414760048050084 s | 0.94 |
| add_two / DefOpt / cpu / Primal | 0.000011676739995891694 s | 0.000011154560015711467 s | 1.05 |
| add_two / IDefOpt / cpu / Primal | 0.000007351920003202395 s | 0.0000070370199864555614 s | 1.04 |
| add_two / JaXPipe / cpu / Forward | 0.000010913339965554769 s | 0.00001075002000106906 s | 1.02 |
| add_two / Jax / cpu / Forward | 0.00001083413998458127 s | 0.000010894060023929342 s | 0.99 |
| add_two / HLOOpt / cpu / Forward | 0.00001650566001444531 s | 0.00001475107999795 s | 1.12 |
| add_two / PartOpt / cpu / Forward | 0.00001595563996488636 s | 0.00001495366000199283 s | 1.07 |
| add_two / IPartOpt / cpu / Forward | 0.000011088040009781253 s | 0.000011008659985236593 s | 1.01 |
| add_two / DefOpt / cpu / Forward | 0.000016130100048030725 s | 0.000015363799975602886 s | 1.05 |
| add_two / IDefOpt / cpu / Forward | 0.00001074757999049325 s | 0.000011219039997740765 s | 0.96 |
| add_two / JaXPipe / cpu / PreRev | 0.000015303739965020212 s | 0.000014846300009594416 s | 1.03 |
| add_two / JaXPipe / cpu / PostRev | 0.00001550980005049496 s | 0.000014568139995390084 s | 1.06 |
| add_two / JaXPipe / cpu / BothRev | 0.00001549651997265755 s | 0.0000145148999945377 s | 1.07 |
| add_two / Jax / cpu / BothRev | 0.00001604763999239367 s | 0.000015333739975176285 s | 1.05 |
| add_two / HLOOpt / cpu / PreRev | 0.000015250699989337593 s | 0.000014390839960469749 s | 1.06 |
| add_two / HLOOpt / cpu / PostRev | 0.000015635000036127167 s | 0.000014757260023543497 s | 1.06 |
| add_two / HLOOpt / cpu / BothRev | 0.00001686870000412455 s | 0.000016447680000055698 s | 1.03 |
| add_two / PartOpt / cpu / PreRev | 0.000015333320006902796 s | 0.000014473599985649344 s | 1.06 |
| add_two / PartOpt / cpu / PostRev | 0.000015136059973883677 s | 0.000014772960003028856 s | 1.02 |
add_two / PartOpt / cpu / BothRev |
0.00001565096005833766 s |
0.000014620759993704268 s |
1.07 |
add_two / IPartOpt / cpu / PreRev |
0.00001541286003885034 s |
0.00001467320001211192 s |
1.05 |
add_two / IPartOpt / cpu / PostRev |
0.000015139179977268212 s |
0.000014904119971106412 s |
1.02 |
add_two / IPartOpt / cpu / BothRev |
0.000014759420009795576 s |
0.000015015659992059229 s |
0.98 |
add_two / DefOpt / cpu / PreRev |
0.000015000220018919208 s |
0.000014867620029690442 s |
1.01 |
add_two / DefOpt / cpu / PostRev |
0.00001492532003794622 s |
0.000014927759948477615 s |
1.00 |
add_two / DefOpt / cpu / BothRev |
0.00001518806000603945 s |
0.000014545599970006152 s |
1.04 |
add_two / IDefOpt / cpu / PreRev |
0.000015363639986389897 s |
0.000015273900016836707 s |
1.01 |
add_two / IDefOpt / cpu / PostRev |
0.000016271180002149776 s |
0.000015041340038806083 s |
1.08 |
add_two / IDefOpt / cpu / BothRev |
0.0000153297399810981 s |
0.000015036300028441474 s |
1.02 |
add_two / JaXPipe / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
add_two / Jax / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
add_two / HLOOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
add_two / PartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
add_two / IPartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
add_two / DefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
add_two / IDefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
add_two / JaXPipe / cuda / Forward |
0.000009952 s |
0.000009984 s |
1.00 |
add_two / Jax / cuda / Forward |
0.00000992 s |
0.000010081 s |
0.98 |
add_two / HLOOpt / cuda / Forward |
0.000010048 s |
0.000009248 s |
1.09 |
add_two / PartOpt / cuda / Forward |
0.000009536 s |
0.000009504 s |
1.00 |
add_two / IPartOpt / cuda / Forward |
0.000009985 s |
0.000009312000000000002 s |
1.07 |
add_two / DefOpt / cuda / Forward |
0.000009728 s |
0.000009825 s |
0.99 |
add_two / IDefOpt / cuda / Forward |
0.000009728 s |
0.000009568 s |
1.02 |
add_two / JaXPipe / cuda / PreRev |
0.000033216 s |
0.000031648 s |
1.05 |
add_two / JaXPipe / cuda / PostRev |
0.000033376 s |
0.00003248 s |
1.03 |
add_two / JaXPipe / cuda / BothRev |
0.000032864 s |
0.000032832 s |
1.00 |
add_two / Jax / cuda / BothRev |
0.000033376 s |
0.00003184 s |
1.05 |
add_two / HLOOpt / cuda / PreRev |
0.000032416 s |
0.000031616 s |
1.03 |
add_two / HLOOpt / cuda / PostRev |
0.00003488 s |
0.00003232 s |
1.08 |
add_two / HLOOpt / cuda / BothRev |
0.00003296 s |
0.000032032 s |
1.03 |
add_two / PartOpt / cuda / PreRev |
0.000033056 s |
0.000032736 s |
1.01 |
add_two / PartOpt / cuda / PostRev |
0.000033119999999999995 s |
0.000033088 s |
1.00 |
add_two / PartOpt / cuda / BothRev |
0.000037504000000000005 s |
0.000031551 s |
1.19 |
add_two / IPartOpt / cuda / PreRev |
0.000032544 s |
0.000032160000000000004 s |
1.01 |
add_two / IPartOpt / cuda / PostRev |
0.000032767999999999995 s |
0.000031456 s |
1.04 |
add_two / IPartOpt / cuda / BothRev |
0.000037857 s |
0.000042144 s |
0.90 |
add_two / DefOpt / cuda / PreRev |
0.000032511 s |
0.000032800000000000004 s |
0.99 |
add_two / DefOpt / cuda / PostRev |
0.000033152000000000004 s |
0.000032416 s |
1.02 |
add_two / DefOpt / cuda / BothRev |
0.000033119999999999995 s |
0.000031424 s |
1.05 |
add_two / IDefOpt / cuda / PreRev |
0.000033216 s |
0.000032928 s |
1.01 |
add_two / IDefOpt / cuda / PostRev |
0.00003328 s |
0.000032864 s |
1.01 |
add_two / IDefOpt / cuda / BothRev |
0.000032928 s |
0.000032704 s |
1.01 |
add_two / JaXPipe / tpu / Primal |
0.000001437425 s |
0.0000014289250000000005 s |
1.01 |
add_two / Jax / tpu / Primal |
0.000001474975 s |
0.00000147225 s |
1.00 |
add_two / HLOOpt / tpu / Primal |
0.000001439975 s |
0.0000014288999999999998 s |
1.01 |
add_two / PartOpt / tpu / Primal |
0.000001481575 s |
0.0000014854 s |
1.00 |
add_two / IPartOpt / tpu / Primal |
0.0000014359749999999998 s |
0.0000014466750000000002 s |
0.99 |
add_two / DefOpt / tpu / Primal |
0.0000014767 s |
0.00000148395 s |
1.00 |
add_two / IDefOpt / tpu / Primal |
0.000001453475 s |
0.000001445225 s |
1.01 |
add_two / JaXPipe / tpu / Forward |
0.00000182825 s |
0.00000184105 s |
0.99 |
add_two / Jax / tpu / Forward |
0.000001831375 s |
0.000001822225 s |
1.01 |
add_two / HLOOpt / tpu / Forward |
0.0000018235 s |
0.000001831225 s |
1.00 |
add_two / PartOpt / tpu / Forward |
0.000001841225 s |
0.000001831625 s |
1.01 |
add_two / IPartOpt / tpu / Forward |
0.000001835575 s |
0.000001822125 s |
1.01 |
add_two / DefOpt / tpu / Forward |
0.00000183175 s |
0.000001831125 s |
1.00 |
add_two / IDefOpt / tpu / Forward |
0.000001822 s |
0.000001831125 s |
1.00 |
add_two / JaXPipe / tpu / PreRev |
0.00000284215 s |
0.0000028381250000000005 s |
1.00 |
add_two / JaXPipe / tpu / PostRev |
0.000002771025 s |
0.0000027414 s |
1.01 |
add_two / JaXPipe / tpu / BothRev |
0.000002847975 s |
0.0000028322750000000003 s |
1.01 |
add_two / Jax / tpu / BothRev |
0.00000275845 s |
0.0000027434750000000004 s |
1.01 |
add_two / HLOOpt / tpu / PreRev |
0.000002842975 s |
0.0000028451 s |
1.00 |
add_two / HLOOpt / tpu / PostRev |
0.0000027551750000000003 s |
0.000002758575 s |
1.00 |
add_two / HLOOpt / tpu / BothRev |
0.000002841875 s |
0.000002828825 s |
1.00 |
add_two / PartOpt / tpu / PreRev |
0.0000027578000000000005 s |
0.00000274625 s |
1.00 |
add_two / PartOpt / tpu / PostRev |
0.000002840025 s |
0.000002832975 s |
1.00 |
add_two / PartOpt / tpu / BothRev |
0.0000027477 s |
0.000002757475 s |
1.00 |
add_two / IPartOpt / tpu / PreRev |
0.00000283175 s |
0.000002834275 s |
1.00 |
add_two / IPartOpt / tpu / PostRev |
0.0000027611 s |
0.00000275975 s |
1.00 |
add_two / IPartOpt / tpu / BothRev |
0.000002843725 s |
0.000002849325 s |
1.00 |
add_two / DefOpt / tpu / PreRev |
0.0000027543 s |
0.000002746525 s |
1.00 |
add_two / DefOpt / tpu / PostRev |
0.000002845525 s |
0.000002838725 s |
1.00 |
add_two / DefOpt / tpu / BothRev |
0.00000275045 s |
0.00000275765 s |
1.00 |
add_two / IDefOpt / tpu / PreRev |
0.0000028452250000000004 s |
0.000002837125 s |
1.00 |
add_two / IDefOpt / tpu / PostRev |
0.00000274845 s |
0.000002754725 s |
1.00 |
add_two / IDefOpt / tpu / BothRev |
0.0000028410000000000004 s |
0.00000283905 s |
1.00 |
add_two / JaXPipe / cpu / Primal |
0.000013492 s |
0.000008162819976860192 s |
1.65 |
add_two / Jax / cpu / Primal |
0.000013586 s |
0.000007155640005294117 s |
1.90 |
add_two / HLOOpt / cpu / Primal |
0.000013784 s |
0.000010254219951093546 s |
1.34 |
add_two / PartOpt / cpu / Primal |
0.000013663 s |
0.00000718430002052628 s |
1.90 |
add_two / IPartOpt / cpu / Primal |
0.000013583 s |
0.000007414760048050084 s |
1.83 |
add_two / DefOpt / cpu / Primal |
0.000013295 s |
0.000011154560015711467 s |
1.19 |
add_two / IDefOpt / cpu / Primal |
0.000013198 s |
0.0000070370199864555614 s |
1.88 |
add_two / JaXPipe / cpu / Forward |
0.000018122 s |
0.00001075002000106906 s |
1.69 |
add_two / Jax / cpu / Forward |
0.000017913 s |
0.000010894060023929342 s |
1.64 |
add_two / HLOOpt / cpu / Forward |
0.000018168 s |
0.00001475107999795 s |
1.23 |
add_two / PartOpt / cpu / Forward |
0.00001777 s |
0.00001495366000199283 s |
1.19 |
add_two / IPartOpt / cpu / Forward |
0.000017646 s |
0.000011008659985236593 s |
1.60 |
add_two / DefOpt / cpu / Forward |
0.000017829999999999997 s |
0.000015363799975602886 s |
1.16 |
add_two / IDefOpt / cpu / Forward |
0.000017694 s |
0.000011219039997740765 s |
1.58 |
add_two / JaXPipe / cpu / PreRev |
0.000023832 s |
0.000014846300009594416 s |
1.61 |
add_two / JaXPipe / cpu / PostRev |
0.00002366 s |
0.000014568139995390084 s |
1.62 |
add_two / JaXPipe / cpu / BothRev |
0.00002468 s |
0.0000145148999945377 s |
1.70 |
add_two / Jax / cpu / BothRev |
0.000023778 s |
0.000015333739975176285 s |
1.55 |
add_two / HLOOpt / cpu / PreRev |
0.000023849 s |
0.000014390839960469749 s |
1.66 |
add_two / HLOOpt / cpu / PostRev |
0.000024789 s |
0.000014757260023543497 s |
1.68 |
add_two / HLOOpt / cpu / BothRev |
0.000023377 s |
0.000016447680000055698 s |
1.42 |
add_two / PartOpt / cpu / PreRev |
0.000023058 s |
0.000014473599985649344 s |
1.59 |
add_two / PartOpt / cpu / PostRev |
0.0000247 s |
0.000014772960003028856 s |
1.67 |
add_two / PartOpt / cpu / BothRev |
0.000024312 s |
0.000014620759993704268 s |
1.66 |
add_two / IPartOpt / cpu / PreRev |
0.000023005 s |
0.00001467320001211192 s |
1.57 |
add_two / IPartOpt / cpu / PostRev |
0.000024355000000000003 s |
0.000014904119971106412 s |
1.63 |
add_two / IPartOpt / cpu / BothRev |
0.000024546 s |
0.000015015659992059229 s |
1.63 |
add_two / DefOpt / cpu / PreRev |
0.000023485 s |
0.000014867620029690442 s |
1.58 |
add_two / DefOpt / cpu / PostRev |
0.000024138 s |
0.000014927759948477615 s |
1.62 |
add_two / DefOpt / cpu / BothRev |
0.00002461 s |
0.000014545599970006152 s |
1.69 |
add_two / IDefOpt / cpu / PreRev |
0.000023423 s |
0.000015273900016836707 s |
1.53 |
add_two / IDefOpt / cpu / PostRev |
0.00003692 s |
0.000015041340038806083 s |
2.45 |
add_two / IDefOpt / cpu / BothRev |
0.000024064 s |
0.000015036300028441474 s |
1.60 |
add_two / JaXPipe / cpu / Primal |
0.000008999999999999999 s |
0.000008162819976860192 s |
1.10 |
add_two / Jax / cpu / Primal |
0.000008999999999999999 s |
0.000007155640005294117 s |
1.26 |
add_two / HLOOpt / cpu / Primal |
0.000008999999999999999 s |
0.000010254219951093546 s |
0.88 |
add_two / PartOpt / cpu / Primal |
0.000008 s |
0.00000718430002052628 s |
1.11 |
add_two / IPartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000007414760048050084 s |
1.21 |
add_two / DefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000011154560015711467 s |
0.81 |
add_two / IDefOpt / cpu / Primal |
0.000008999999999999999 s |
0.0000070370199864555614 s |
1.28 |
add_two / JaXPipe / cpu / Forward |
0.000013 s |
0.00001075002000106906 s |
1.21 |
add_two / Jax / cpu / Forward |
0.000012 s |
0.000010894060023929342 s |
1.10 |
add_two / HLOOpt / cpu / Forward |
0.000013 s |
0.00001475107999795 s |
0.88 |
add_two / PartOpt / cpu / Forward |
0.000012 s |
0.00001495366000199283 s |
0.80 |
add_two / IPartOpt / cpu / Forward |
0.000012 s |
0.000011008659985236593 s |
1.09 |
add_two / DefOpt / cpu / Forward |
0.000013 s |
0.000015363799975602886 s |
0.85 |
add_two / IDefOpt / cpu / Forward |
0.000014 s |
0.000011219039997740765 s |
1.25 |
add_two / JaXPipe / cpu / PreRev |
0.000016 s |
0.000014846300009594416 s |
1.08 |
add_two / JaXPipe / cpu / PostRev |
0.000017 s |
0.000014568139995390084 s |
1.17 |
add_two / JaXPipe / cpu / BothRev |
0.000017 s |
0.0000145148999945377 s |
1.17 |
add_two / Jax / cpu / BothRev |
0.000016 s |
0.000015333739975176285 s |
1.04 |
add_two / HLOOpt / cpu / PreRev |
0.000017 s |
0.000014390839960469749 s |
1.18 |
add_two / HLOOpt / cpu / PostRev |
0.000016 s |
0.000014757260023543497 s |
1.08 |
add_two / HLOOpt / cpu / BothRev |
0.000017 s |
0.000016447680000055698 s |
1.03 |
add_two / PartOpt / cpu / PreRev |
0.000016 s |
0.000014473599985649344 s |
1.11 |
add_two / PartOpt / cpu / PostRev |
0.000017 s |
0.000014772960003028856 s |
1.15 |
add_two / PartOpt / cpu / BothRev |
0.000016 s |
0.000014620759993704268 s |
1.09 |
add_two / IPartOpt / cpu / PreRev |
0.000052 s |
0.00001467320001211192 s |
3.54 |
add_two / IPartOpt / cpu / PostRev |
0.000016 s |
0.000014904119971106412 s |
1.07 |
add_two / IPartOpt / cpu / BothRev |
0.000016 s |
0.000015015659992059229 s |
1.07 |
add_two / DefOpt / cpu / PreRev |
0.000017 s |
0.000014867620029690442 s |
1.14 |
add_two / DefOpt / cpu / PostRev |
0.000016 s |
0.000014927759948477615 s |
1.07 |
add_two / DefOpt / cpu / BothRev |
0.000016 s |
0.000014545599970006152 s |
1.10 |
add_two / IDefOpt / cpu / PreRev |
0.000015 s |
0.000015273900016836707 s |
0.98 |
add_two / IDefOpt / cpu / PostRev |
0.000016 s |
0.000015041340038806083 s |
1.06 |
add_two / IDefOpt / cpu / BothRev |
0.000016 s |
0.000015036300028441474 s |
1.06 |
cache / JaXPipe / cpu / Primal |
0.000007032439998511108 s |
0.000007066839998515206 s |
1.00 |
cache / Jax / cpu / Primal |
0.0000069124799665587485 s |
0.000007472959996448481 s |
0.92 |
cache / HLOOpt / cpu / Primal |
0.000006889740006954526 s |
0.000006779060013286653 s |
1.02 |
cache / PartOpt / cpu / Primal |
0.000007360560002780403 s |
0.000006673139987469767 s |
1.10 |
cache / IPartOpt / cpu / Primal |
0.000007197780032583978 s |
0.000007064180017550825 s |
1.02 |
cache / DefOpt / cpu / Primal |
0.000007525120045102085 s |
0.000006838919962319778 s |
1.10 |
cache / IDefOpt / cpu / Primal |
0.000007224980026876437 s |
0.0000064925199876597614 s |
1.11 |
cache / JaXPipe / cpu / Forward |
0.000014878199963277438 s |
0.00001453619999665534 s |
1.02 |
cache / Jax / cpu / Forward |
0.000014946679984859656 s |
0.000014236959959816886 s |
1.05 |
cache / HLOOpt / cpu / Forward |
0.00001939286004017049 s |
0.000019094519993814176 s |
1.02 |
cache / PartOpt / cpu / Forward |
0.000020110299992666117 s |
0.000018744980025076075 s |
1.07 |
cache / IPartOpt / cpu / Forward |
0.000014990820027378504 s |
0.000013940860044385773 s |
1.08 |
cache / DefOpt / cpu / Forward |
0.00002035759999671427 s |
0.000023298279975279 s |
0.87 |
cache / IDefOpt / cpu / Forward |
0.000015212139987852423 s |
0.000013929639981142828 s |
1.09 |
cache / JaXPipe / cpu / PreRev |
0.000016578160011704312 s |
0.00001627784004085697 s |
1.02 |
cache / JaXPipe / cpu / PostRev |
0.00002125364003404684 s |
0.00002106017997903109 s |
1.01 |
cache / JaXPipe / cpu / BothRev |
0.00001933483998072916 s |
0.000015854440007387894 s |
1.22 |
cache / Jax / cpu / BothRev |
0.000020725140029753677 s |
0.000021164679992580204 s |
0.98 |
cache / HLOOpt / cpu / PreRev |
0.00001697658003649849 s |
0.00001555754000946763 s |
1.09 |
cache / HLOOpt / cpu / PostRev |
0.000017357039996568347 s |
0.00001774892000867112 s |
0.98 |
cache / HLOOpt / cpu / BothRev |
0.00001915232001010736 s |
0.000017897059997267206 s |
1.07 |
cache / PartOpt / cpu / PreRev |
0.000016229460006798034 s |
0.000015700039957664556 s |
1.03 |
cache / PartOpt / cpu / PostRev |
0.000020753379967572986 s |
0.000026045420017908327 s |
0.80 |
cache / PartOpt / cpu / BothRev |
0.000016577880023760373 s |
0.000015716900006736977 s |
1.05 |
cache / IPartOpt / cpu / PreRev |
0.00002309939998667687 s |
0.00001823539995712053 s |
1.27 |
cache / IPartOpt / cpu / PostRev |
0.00002162253998903907 s |
0.000021174960011194344 s |
1.02 |
cache / IPartOpt / cpu / BothRev |
0.000016900039963729795 s |
0.000015372979996755022 s |
1.10 |
cache / DefOpt / cpu / PreRev |
0.000017515500003355557 s |
0.000015136819974941318 s |
1.16 |
cache / DefOpt / cpu / PostRev |
0.00002289114001541748 s |
0.000016664260010657016 s |
1.37 |
cache / DefOpt / cpu / BothRev |
0.00001779193999027484 s |
0.000016260779993899632 s |
1.09 |
cache / IDefOpt / cpu / PreRev |
0.000017509880008219626 s |
0.000016183900052055832 s |
1.08 |
cache / IDefOpt / cpu / PostRev |
0.00001735616003315954 s |
0.000015626220010744875 s |
1.11 |
cache / IDefOpt / cpu / BothRev |
0.000018139799940399823 s |
0.000015151899979173325 s |
1.20 |
cache / JaXPipe / cuda / Primal |
0.000002304 s |
0.0000023050000000000004 s |
1.00 |
cache / Jax / cuda / Primal |
0.000002272 s |
0.000002272 s |
1 |
cache / HLOOpt / cuda / Primal |
0.000002272 s |
0.000002273 s |
1.00 |
cache / PartOpt / cuda / Primal |
0.00000224 s |
0.00000224 s |
1 |
cache / IPartOpt / cuda / Primal |
0.000002273 s |
0.000002272 s |
1.00 |
cache / DefOpt / cuda / Primal |
0.000002272 s |
0.00000224 s |
1.01 |
cache / IDefOpt / cuda / Primal |
0.000002304 s |
0.000002304 s |
1 |
cache / JaXPipe / cuda / Forward |
0.000002335 s |
0.000002336 s |
1.00 |
cache / Jax / cuda / Forward |
0.000002304 s |
0.000002304 s |
1 |
cache / HLOOpt / cuda / Forward |
0.000002336 s |
0.000002335 s |
1.00 |
cache / PartOpt / cuda / Forward |
0.000002335 s |
0.000002335 s |
1 |
cache / IPartOpt / cuda / Forward |
0.000002273 s |
0.000002304 s |
0.99 |
cache / DefOpt / cuda / Forward |
0.00000224 s |
0.000002272 s |
0.99 |
cache / IDefOpt / cuda / Forward |
0.000002273 s |
0.000002304 s |
0.99 |
cache / JaXPipe / cuda / PreRev |
0.000011520000000000002 s |
0.000011264 s |
1.02 |
cache / JaXPipe / cuda / PostRev |
0.000011616 s |
0.000011520000000000002 s |
1.01 |
cache / JaXPipe / cuda / BothRev |
0.000011584 s |
0.000012192 s |
0.95 |
cache / Jax / cuda / BothRev |
0.000011488 s |
0.000011520000000000002 s |
1.00 |
cache / HLOOpt / cuda / PreRev |
0.000013504 s |
0.000013248 s |
1.02 |
cache / HLOOpt / cuda / PostRev |
0.000013535 s |
0.000013215 s |
1.02 |
cache / HLOOpt / cuda / BothRev |
0.000013504 s |
0.000013216 s |
1.02 |
cache / PartOpt / cuda / PreRev |
0.0000112 s |
0.000011775 s |
0.95 |
cache / PartOpt / cuda / PostRev |
0.00001184 s |
0.000011584 s |
1.02 |
cache / PartOpt / cuda / BothRev |
0.000011071 s |
0.000011264 s |
0.98 |
cache / IPartOpt / cuda / PreRev |
0.000012096 s |
0.000011584 s |
1.04 |
cache / IPartOpt / cuda / PostRev |
0.000011488 s |
0.00001168 s |
0.98 |
cache / IPartOpt / cuda / BothRev |
0.000011392 s |
0.000011712 s |
0.97 |
cache / DefOpt / cuda / PreRev |
0.000011328 s |
0.00001136 s |
1.00 |
cache / DefOpt / cuda / PostRev |
0.000011392 s |
0.000011776 s |
0.97 |
cache / DefOpt / cuda / BothRev |
0.000011296 s |
0.000011392 s |
0.99 |
cache / IDefOpt / cuda / PreRev |
0.0000112 s |
0.000011488 s |
0.97 |
cache / IDefOpt / cuda / PostRev |
0.000011520000000000002 s |
0.000011488 s |
1.00 |
cache / IDefOpt / cuda / BothRev |
0.000011712 s |
0.000011936 s |
0.98 |
cache / JaXPipe / tpu / Primal |
0.000002481025 s |
0.0000024641 s |
1.01 |
cache / Jax / tpu / Primal |
0.000002459425 s |
0.0000024560250000000003 s |
1.00 |
cache / HLOOpt / tpu / Primal |
0.000002454125 s |
0.000002472575 s |
0.99 |
cache / PartOpt / tpu / Primal |
0.00000246745 s |
0.000002467225 s |
1.00 |
cache / IPartOpt / tpu / Primal |
0.000002488475 s |
0.000002477975 s |
1.00 |
cache / DefOpt / tpu / Primal |
0.0000024755750000000004 s |
0.000002458625 s |
1.01 |
cache / IDefOpt / tpu / Primal |
0.000002447475 s |
0.0000024739 s |
0.99 |
cache / JaXPipe / tpu / Forward |
0.000003560075 s |
0.000003538975 s |
1.01 |
cache / Jax / tpu / Forward |
0.00000354345 s |
0.000003541325 s |
1.00 |
cache / HLOOpt / tpu / Forward |
0.00000353675 s |
0.000003561925 s |
0.99 |
cache / PartOpt / tpu / Forward |
0.0000035326000000000003 s |
0.0000035528749999999994 s |
0.99 |
cache / IPartOpt / tpu / Forward |
0.0000035527 s |
0.0000035541 s |
1.00 |
cache / DefOpt / tpu / Forward |
0.0000035262749999999995 s |
0.000003528425 s |
1.00 |
cache / IDefOpt / tpu / Forward |
0.0000035397 s |
0.00000352755 s |
1.00 |
cache / JaXPipe / tpu / PreRev |
0.000004992750000000001 s |
0.000004963125 s |
1.01 |
cache / JaXPipe / tpu / PostRev |
0.000005001175 s |
0.000004970225 s |
1.01 |
cache / JaXPipe / tpu / BothRev |
0.000005002 s |
0.00000497175 s |
1.01 |
cache / Jax / tpu / BothRev |
0.000005012075000000001 s |
0.00000498175 s |
1.01 |
cache / HLOOpt / tpu / PreRev |
0.00000398085 s |
0.0000039377 s |
1.01 |
cache / HLOOpt / tpu / PostRev |
0.0000041382 s |
0.000004116650000000001 s |
1.01 |
cache / HLOOpt / tpu / BothRev |
0.00000398985 s |
0.0000039415 s |
1.01 |
cache / PartOpt / tpu / PreRev |
0.00000503165 s |
0.00000497795 s |
1.01 |
cache / PartOpt / tpu / PostRev |
0.00000502305 s |
0.000004961175 s |
1.01 |
cache / PartOpt / tpu / BothRev |
0.0000050404 s |
0.000004959475 s |
1.02 |
cache / IPartOpt / tpu / PreRev |
0.0000050637750000000005 s |
0.000004961225000000001 s |
1.02 |
cache / IPartOpt / tpu / PostRev |
0.000005027975000000001 s |
0.000004974049999999999 s |
1.01 |
cache / IPartOpt / tpu / BothRev |
0.000005021975 s |
0.000004962675 s |
1.01 |
cache / DefOpt / tpu / PreRev |
0.000005029125 s |
0.000004962375 s |
1.01 |
cache / DefOpt / tpu / PostRev |
0.0000050181 s |
0.000004966875 s |
1.01 |
cache / DefOpt / tpu / BothRev |
0.000005005950000000001 s |
0.00000497465 s |
1.01 |
cache / IDefOpt / tpu / PreRev |
0.000005043174999999999 s |
0.00000495725 s |
1.02 |
cache / IDefOpt / tpu / PostRev |
0.000005048850000000001 s |
0.000004968925 s |
1.02 |
cache / IDefOpt / tpu / BothRev |
0.000005042225 s |
0.000004959 s |
1.02 |
cache / JaXPipe / cpu / Primal |
0.000013327 s |
0.000007066839998515206 s |
1.89 |
cache / Jax / cpu / Primal |
0.00001314 s |
0.000007472959996448481 s |
1.76 |
cache / HLOOpt / cpu / Primal |
0.000012941 s |
0.000006779060013286653 s |
1.91 |
cache / PartOpt / cpu / Primal |
0.00001315 s |
0.000006673139987469767 s |
1.97 |
cache / IPartOpt / cpu / Primal |
0.000012747 s |
0.000007064180017550825 s |
1.80 |
cache / DefOpt / cpu / Primal |
0.000012732 s |
0.000006838919962319778 s |
1.86 |
cache / IDefOpt / cpu / Primal |
0.000012795 s |
0.0000064925199876597614 s |
1.97 |
cache / JaXPipe / cpu / Forward |
0.000017901 s |
0.00001453619999665534 s |
1.23 |
cache / Jax / cpu / Forward |
0.000017477 s |
0.000014236959959816886 s |
1.23 |
cache / HLOOpt / cpu / Forward |
0.000017316 s |
0.000019094519993814176 s |
0.91 |
cache / PartOpt / cpu / Forward |
0.000017473 s |
0.000018744980025076075 s |
0.93 |
cache / IPartOpt / cpu / Forward |
0.000017687000000000002 s |
0.000013940860044385773 s |
1.27 |
cache / DefOpt / cpu / Forward |
0.000017196 s |
0.000023298279975279 s |
0.74 |
cache / IDefOpt / cpu / Forward |
0.000017542 s |
0.000013929639981142828 s |
1.26 |
cache / JaXPipe / cpu / PreRev |
0.000017177000000000002 s |
0.00001627784004085697 s |
1.06 |
cache / JaXPipe / cpu / PostRev |
0.000019992 s |
0.00002106017997903109 s |
0.95 |
cache / JaXPipe / cpu / BothRev |
0.000017902000000000002 s |
0.000015854440007387894 s |
1.13 |
cache / Jax / cpu / BothRev |
0.000020881 s |
0.000021164679992580204 s |
0.99 |
cache / HLOOpt / cpu / PreRev |
0.000018121 s |
0.00001555754000946763 s |
1.16 |
cache / HLOOpt / cpu / PostRev |
0.000017554 s |
0.00001774892000867112 s |
0.99 |
cache / HLOOpt / cpu / BothRev |
0.000017754 s |
0.000017897059997267206 s |
0.99 |
cache / PartOpt / cpu / PreRev |
0.000017331 s |
0.000015700039957664556 s |
1.10 |
cache / PartOpt / cpu / PostRev |
0.000020391 s |
0.000026045420017908327 s |
0.78 |
cache / PartOpt / cpu / BothRev |
0.000018021 s |
0.000015716900006736977 s |
1.15 |
cache / IPartOpt / cpu / PreRev |
0.00001814 s |
0.00001823539995712053 s |
0.99 |
cache / IPartOpt / cpu / PostRev |
0.000020742 s |
0.000021174960011194344 s |
0.98 |
cache / IPartOpt / cpu / BothRev |
0.00001804 s |
0.000015372979996755022 s |
1.17 |
cache / DefOpt / cpu / PreRev |
0.000017622 s |
0.000015136819974941318 s |
1.16 |
cache / DefOpt / cpu / PostRev |
0.000017899999999999998 s |
0.000016664260010657016 s |
1.07 |
cache / DefOpt / cpu / BothRev |
0.000017853 s |
0.000016260779993899632 s |
1.10 |
cache / IDefOpt / cpu / PreRev |
0.000017255 s |
0.000016183900052055832 s |
1.07 |
cache / IDefOpt / cpu / PostRev |
0.000018713 s |
0.000015626220010744875 s |
1.20 |
cache / IDefOpt / cpu / BothRev |
0.000018281 s |
0.000015151899979173325 s |
1.21 |
cache / JaXPipe / cpu / Primal |
0.000008999999999999999 s |
0.000007066839998515206 s |
1.27 |
cache / Jax / cpu / Primal |
0.00003 s |
0.000007472959996448481 s |
4.01 |
cache / HLOOpt / cpu / Primal |
0.000008 s |
0.000006779060013286653 s |
1.18 |
cache / PartOpt / cpu / Primal |
0.000008 s |
0.000006673139987469767 s |
1.20 |
cache / IPartOpt / cpu / Primal |
0.000008 s |
0.000007064180017550825 s |
1.13 |
cache / DefOpt / cpu / Primal |
0.000008 s |
0.000006838919962319778 s |
1.17 |
cache / IDefOpt / cpu / Primal |
0.000008999999999999999 s |
0.0000064925199876597614 s |
1.39 |
cache / JaXPipe / cpu / Forward |
0.000035000000000000004 s |
0.00001453619999665534 s |
2.41 |
cache / Jax / cpu / Forward |
0.000025 s |
0.000014236959959816886 s |
1.76 |
cache / HLOOpt / cpu / Forward |
0.000035000000000000004 s |
0.000019094519993814176 s |
1.83 |
cache / PartOpt / cpu / Forward |
0.000037 s |
0.000018744980025076075 s |
1.97 |
cache / IPartOpt / cpu / Forward |
0.00001 s |
0.000013940860044385773 s |
0.72 |
cache / DefOpt / cpu / Forward |
0.000017 s |
0.000023298279975279 s |
0.73 |
cache / IDefOpt / cpu / Forward |
0.000017999999999999997 s |
0.000013929639981142828 s |
1.29 |
cache / JaXPipe / cpu / PreRev |
0.000011 s |
0.00001627784004085697 s |
0.68 |
cache / JaXPipe / cpu / PostRev |
0.000046 s |
0.00002106017997903109 s |
2.18 |
cache / JaXPipe / cpu / BothRev |
0.000035999999999999994 s |
0.000015854440007387894 s |
2.27 |
cache / Jax / cpu / BothRev |
0.000035999999999999994 s |
0.000021164679992580204 s |
1.70 |
cache / HLOOpt / cpu / PreRev |
0.000011 s |
0.00001555754000946763 s |
0.71 |
cache / HLOOpt / cpu / PostRev |
0.000011 s |
0.00001774892000867112 s |
0.62 |
cache / HLOOpt / cpu / BothRev |
0.000014 s |
0.000017897059997267206 s |
0.78 |
cache / PartOpt / cpu / PreRev |
0.000013 s |
0.000015700039957664556 s |
0.83 |
cache / PartOpt / cpu / PostRev |
0.000013 s |
0.000026045420017908327 s |
0.50 |
cache / PartOpt / cpu / BothRev |
0.000011 s |
0.000015716900006736977 s |
0.70 |
cache / IPartOpt / cpu / PreRev |
0.000011 s |
0.00001823539995712053 s |
0.60 |
cache / IPartOpt / cpu / PostRev |
0.00003 s |
0.000021174960011194344 s |
1.42 |
cache / IPartOpt / cpu / BothRev |
0.000017 s |
0.000015372979996755022 s |
1.11 |
cache / DefOpt / cpu / PreRev |
0.000011 s |
0.000015136819974941318 s |
0.73 |
cache / DefOpt / cpu / PostRev |
0.000011 s |
0.000016664260010657016 s |
0.66 |
cache / DefOpt / cpu / BothRev |
0.000035000000000000004 s |
0.000016260779993899632 s |
2.15 |
cache / IDefOpt / cpu / PreRev |
0.000011 s |
0.000016183900052055832 s |
0.68 |
cache / IDefOpt / cpu / PostRev |
0.00001 s |
0.000015626220010744875 s |
0.64 |
cache / IDefOpt / cpu / BothRev |
0.000011 s |
0.000015151899979173325 s |
0.73 |
Concat / JaXPipe / cpu / Primal |
0.00000829906004582881 s |
0.00000728931998310145 s |
1.14 |
Concat / Jax / cpu / Primal |
0.000007768720042804489 s |
0.000007061659989631152 s |
1.10 |
Concat / HLOOpt / cpu / Primal |
0.000010710960032156435 s |
0.000009589940036676126 s |
1.12 |
Concat / PartOpt / cpu / Primal |
0.000007859679999455694 s |
0.00000695204000294325 s |
1.13 |
Concat / IPartOpt / cpu / Primal |
0.000007252259983943075 s |
0.000006810799986851635 s |
1.06 |
Concat / DefOpt / cpu / Primal |
0.000011469840028439647 s |
0.000010120979968633036 s |
1.13 |
Concat / IDefOpt / cpu / Primal |
0.000007617159963047016 s |
0.000006804140002714121 s |
1.12 |
Concat / JaXPipe / cpu / Forward |
0.000011274760008745944 s |
0.000010196720004387315 s |
1.11 |
Concat / Jax / cpu / Forward |
0.000011028039998564057 s |
0.00001051378000738623 s |
1.05 |
Concat / HLOOpt / cpu / Forward |
0.00001542666002023907 s |
0.000014242419974834774 s |
1.08 |
Concat / PartOpt / cpu / Forward |
0.00001604325998414424 s |
0.000014837000007901224 s |
1.08 |
Concat / IPartOpt / cpu / Forward |
0.000011542860002009548 s |
0.000010930519947578431 s |
1.06 |
Concat / DefOpt / cpu / Forward |
0.0000161349199788674 s |
0.00001516530002845684 s |
1.06 |
Concat / IDefOpt / cpu / Forward |
0.00001182561995847209 s |
0.000010654359984982876 s |
1.11 |
Concat / JaXPipe / cpu / PreRev |
0.000013263659984659171 s |
0.000012291220000406611 s |
1.08 |
Concat / JaXPipe / cpu / PostRev |
0.000013080599965178408 s |
0.000012073859988959156 s |
1.08 |
Concat / JaXPipe / cpu / BothRev |
0.000012982260068383768 s |
0.000011819039991678438 s |
1.10 |
Concat / Jax / cpu / BothRev |
0.000012999519994991716 s |
0.000012705760009339428 s |
1.02 |
Concat / HLOOpt / cpu / PreRev |
0.00001275908000934578 s |
0.000011614619997999398 s |
1.10 |
Concat / HLOOpt / cpu / PostRev |
0.000016885180011740887 s |
0.00001240181997673062 s |
1.36 |
Concat / HLOOpt / cpu / BothRev |
0.000014768179989914642 s |
0.000013595179925687262 s |
1.09 |
Concat / PartOpt / cpu / PreRev |
0.000012778419986716473 s |
0.000011691240015352378 s |
1.09 |
Concat / PartOpt / cpu / PostRev |
0.000013143920041329691 s |
0.00001208772000609315 s |
1.09 |
Concat / PartOpt / cpu / BothRev |
0.00001331450002908241 s |
0.000011353420031809946 s |
1.17 |
Concat / IPartOpt / cpu / PreRev |
0.000018031419967883268 s |
0.000012248699995325296 s |
1.47 |
Concat / IPartOpt / cpu / PostRev |
0.000012552600010167224 s |
0.000011892720031028149 s |
1.06 |
Concat / IPartOpt / cpu / BothRev |
0.000012262579994057886 s |
0.00001183884000965918 s |
1.04 |
Concat / DefOpt / cpu / PreRev |
0.000012731639999401523 s |
0.000012121159907110268 s |
1.05 |
Concat / DefOpt / cpu / PostRev |
0.000013212040003054426 s |
0.000011294820005787189 s |
1.17 |
Concat / DefOpt / cpu / BothRev |
0.000012634260001505026 s |
0.000012426319981386767 s |
1.02 |
Concat / IDefOpt / cpu / PreRev |
0.000012809499994546058 s |
0.000011653779965854484 s |
1.10 |
Concat / IDefOpt / cpu / PostRev |
0.000012846619965785066 s |
0.00001219744000081846 s |
1.05 |
Concat / IDefOpt / cpu / BothRev |
0.000012837839985877508 s |
0.00001177469997855951 s |
1.09 |
Concat / JaXPipe / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / Jax / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / HLOOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / PartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / IPartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / DefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / IDefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / JaXPipe / cuda / Forward |
0.000010209 s |
0.000009729 s |
1.05 |
Concat / Jax / cuda / Forward |
0.000009952 s |
0.000009824 s |
1.01 |
Concat / HLOOpt / cuda / Forward |
0.000010176 s |
0.000009951 s |
1.02 |
Concat / PartOpt / cuda / Forward |
0.000010048 s |
0.00000976 s |
1.03 |
Concat / IPartOpt / cuda / Forward |
0.000010336 s |
0.000009824 s |
1.05 |
Concat / DefOpt / cuda / Forward |
0.000009792 s |
0.000009792 s |
1 |
Concat / IDefOpt / cuda / Forward |
0.000010176 s |
0.000009856 s |
1.03 |
Concat / JaXPipe / cuda / PreRev |
0.000016704 s |
0.00001664 s |
1.00 |
Concat / JaXPipe / cuda / PostRev |
0.000016544 s |
0.000016576000000000002 s |
1.00 |
Concat / JaXPipe / cuda / BothRev |
0.000016448000000000002 s |
0.000016544 s |
0.99 |
Concat / Jax / cuda / BothRev |
0.000016576000000000002 s |
0.000016544 s |
1.00 |
Concat / HLOOpt / cuda / PreRev |
0.000016864 s |
0.000016672 s |
1.01 |
Concat / HLOOpt / cuda / PostRev |
0.000016608 s |
0.000016448000000000002 s |
1.01 |
Concat / HLOOpt / cuda / BothRev |
0.000016352 s |
0.00001664 s |
0.98 |
Concat / PartOpt / cuda / PreRev |
0.0000168 s |
0.000016832 s |
1.00 |
Concat / PartOpt / cuda / PostRev |
0.000016321 s |
0.000016608 s |
0.98 |
Concat / PartOpt / cuda / BothRev |
0.000017087 s |
0.000016608 s |
1.03 |
Concat / IPartOpt / cuda / PreRev |
0.000016352 s |
0.000016768000000000003 s |
0.98 |
Concat / IPartOpt / cuda / PostRev |
0.000016224 s |
0.000016 s |
1.01 |
Concat / IPartOpt / cuda / BothRev |
0.000016416 s |
0.00001632 s |
1.01 |
Concat / DefOpt / cuda / PreRev |
0.000016736 s |
0.000016416 s |
1.02 |
Concat / DefOpt / cuda / PostRev |
0.000016608 s |
0.000016768000000000003 s |
0.99 |
Concat / DefOpt / cuda / BothRev |
0.000015904000000000002 s |
0.000016737 s |
0.95 |
Concat / IDefOpt / cuda / PreRev |
0.000016832 s |
0.000017151 s |
0.98 |
Concat / IDefOpt / cuda / PostRev |
0.000016768000000000003 s |
0.000018272 s |
0.92 |
Concat / IDefOpt / cuda / BothRev |
0.000016575 s |
0.000016161 s |
1.03 |
Concat / JaXPipe / tpu / Primal |
0.0000015347 s |
0.00000153695 s |
1.00 |
Concat / Jax / tpu / Primal |
0.000001520925 s |
0.0000015382000000000002 s |
0.99 |
Concat / HLOOpt / tpu / Primal |
0.000001532775 s |
0.0000015347 s |
1.00 |
Concat / PartOpt / tpu / Primal |
0.0000015308250000000005 s |
0.000001523 s |
1.01 |
Concat / IPartOpt / tpu / Primal |
0.000001540875 s |
0.0000015328249999999998 s |
1.01 |
Concat / DefOpt / tpu / Primal |
0.000001525625 s |
0.000001518975 s |
1.00 |
Concat / IDefOpt / tpu / Primal |
0.00000153045 s |
0.00000152715 s |
1.00 |
Concat / JaXPipe / tpu / Forward |
0.0000015790499999999995 s |
0.0000015834999999999995 s |
1.00 |
Concat / Jax / tpu / Forward |
0.0000015596750000000002 s |
0.000001556725 s |
1.00 |
Concat / HLOOpt / tpu / Forward |
0.0000015848750000000002 s |
0.0000015840750000000002 s |
1.00 |
Concat / PartOpt / tpu / Forward |
0.000001563025 s |
0.0000015588750000000002 s |
1.00 |
Concat / IPartOpt / tpu / Forward |
0.000001587725 s |
0.0000015833 s |
1.00 |
Concat / DefOpt / tpu / Forward |
0.000001566775 s |
0.000001565725 s |
1.00 |
Concat / IDefOpt / tpu / Forward |
0.000001583725 s |
0.0000015870750000000002 s |
1.00 |
Concat / JaXPipe / tpu / PreRev |
0.000002010675 s |
0.00000201065 s |
1.00 |
Concat / JaXPipe / tpu / PostRev |
0.00000207025 s |
0.0000020853 s |
0.99 |
Concat / JaXPipe / tpu / BothRev |
0.00000201945 s |
0.0000020050000000000003 s |
1.01 |
Concat / Jax / tpu / BothRev |
0.000002076 s |
0.000002069025 s |
1.00 |
Concat / HLOOpt / tpu / PreRev |
0.000002023475 s |
0.000002011075 s |
1.01 |
Concat / HLOOpt / tpu / PostRev |
0.000002075375 s |
0.000002083 s |
1.00 |
Concat / HLOOpt / tpu / BothRev |
0.0000020157 s |
0.0000020125 s |
1.00 |
Concat / PartOpt / tpu / PreRev |
0.00000207855 s |
0.000002075825 s |
1.00 |
Concat / PartOpt / tpu / PostRev |
0.0000020102000000000003 s |
0.000002008925 s |
1.00 |
Concat / PartOpt / tpu / BothRev |
0.000002066275 s |
0.000002089875 s |
0.99 |
Concat / IPartOpt / tpu / PreRev |
0.0000020143 s |
0.000002010675 s |
1.00 |
Concat / IPartOpt / tpu / PostRev |
0.00000206245 s |
0.000002073175 s |
0.99 |
Concat / IPartOpt / tpu / BothRev |
0.0000020061750000000003 s |
0.0000020139000000000003 s |
1.00 |
Concat / DefOpt / tpu / PreRev |
0.0000020683999999999995 s |
0.0000020753 s |
1.00 |
Concat / DefOpt / tpu / PostRev |
0.000002005575 s |
0.000002000725 s |
1.00 |
Concat / DefOpt / tpu / BothRev |
0.000002070475 s |
0.000002083775 s |
0.99 |
Concat / IDefOpt / tpu / PreRev |
0.000002012925 s |
0.0000020033 s |
1.00 |
Concat / IDefOpt / tpu / PostRev |
0.000002065875 s |
0.00000207225 s |
1.00 |
Concat / IDefOpt / tpu / BothRev |
0.000002008625 s |
0.00000200575 s |
1.00 |
Concat / JaXPipe / cpu / Primal |
0.000013135 s |
0.00000728931998310145 s |
1.80 |
Concat / Jax / cpu / Primal |
0.000013291 s |
0.000007061659989631152 s |
1.88 |
Concat / HLOOpt / cpu / Primal |
0.000012851 s |
0.000009589940036676126 s |
1.34 |
Concat / PartOpt / cpu / Primal |
0.000012989 s |
0.00000695204000294325 s |
1.87 |
Concat / IPartOpt / cpu / Primal |
0.000012825 s |
0.000006810799986851635 s |
1.88 |
Concat / DefOpt / cpu / Primal |
0.000013006 s |
0.000010120979968633036 s |
1.29 |
Concat / IDefOpt / cpu / Primal |
0.000012924 s |
0.000006804140002714121 s |
1.90 |
Concat / JaXPipe / cpu / Forward |
0.000017676999999999997 s |
0.000010196720004387315 s |
1.73 |
Concat / Jax / cpu / Forward |
0.000017198 s |
0.00001051378000738623 s |
1.64 |
Concat / HLOOpt / cpu / Forward |
0.00001711 s |
0.000014242419974834774 s |
1.20 |
Concat / PartOpt / cpu / Forward |
0.000017670000000000002 s |
0.000014837000007901224 s |
1.19 |
Concat / IPartOpt / cpu / Forward |
0.000017327 s |
0.000010930519947578431 s |
1.59 |
Concat / DefOpt / cpu / Forward |
0.000017514 s |
0.00001516530002845684 s |
1.15 |
Concat / IDefOpt / cpu / Forward |
0.000017696 s |
0.000010654359984982876 s |
1.66 |
Concat / JaXPipe / cpu / PreRev |
0.000020065 s |
0.000012291220000406611 s |
1.63 |
Concat / JaXPipe / cpu / PostRev |
0.00001984 s |
0.000012073859988959156 s |
1.64 |
Concat / JaXPipe / cpu / BothRev |
0.000019791 s |
0.000011819039991678438 s |
1.67 |
Concat / Jax / cpu / BothRev |
0.000019968 s |
0.000012705760009339428 s |
1.57 |
Concat / HLOOpt / cpu / PreRev |
0.00002003 s |
0.000011614619997999398 s |
1.72 |
Concat / HLOOpt / cpu / PostRev |
0.000020277 s |
0.00001240181997673062 s |
1.64 |
Concat / HLOOpt / cpu / BothRev |
0.000019494 s |
0.000013595179925687262 s |
1.43 |
Concat / PartOpt / cpu / PreRev |
0.000020321 s |
0.000011691240015352378 s |
1.74 |
Concat / PartOpt / cpu / PostRev |
0.000020031 s |
0.00001208772000609315 s |
1.66 |
Concat / PartOpt / cpu / BothRev |
0.000019756 s |
0.000011353420031809946 s |
1.74 |
Concat / IPartOpt / cpu / PreRev |
0.000019728 s |
0.000012248699995325296 s |
1.61 |
Concat / IPartOpt / cpu / PostRev |
0.000019552 s |
0.000011892720031028149 s |
1.64 |
Concat / IPartOpt / cpu / BothRev |
0.000019684 s |
0.00001183884000965918 s |
1.66 |
Concat / DefOpt / cpu / PreRev |
0.000020076 s |
0.000012121159907110268 s |
1.66 |
Concat / DefOpt / cpu / PostRev |
0.000020269 s |
0.000011294820005787189 s |
1.79 |
Concat / DefOpt / cpu / BothRev |
0.000020537 s |
0.000012426319981386767 s |
1.65 |
Concat / IDefOpt / cpu / PreRev |
0.000020073 s |
0.000011653779965854484 s |
1.72 |
Concat / IDefOpt / cpu / PostRev |
0.000020138 s |
0.00001219744000081846 s |
1.65 |
Concat / IDefOpt / cpu / BothRev |
0.000019838 s |
0.00001177469997855951 s |
1.68 |
Concat / JaXPipe / cpu / Primal |
0.000008999999999999999 s |
0.00000728931998310145 s |
1.23 |
Concat / Jax / cpu / Primal |
0.000008999999999999999 s |
0.000007061659989631152 s |
1.27 |
Concat / HLOOpt / cpu / Primal |
0.000008999999999999999 s |
0.000009589940036676126 s |
0.94 |
Concat / PartOpt / cpu / Primal |
0.000008999999999999999 s |
0.00000695204000294325 s |
1.29 |
Concat / IPartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006810799986851635 s |
1.32 |
Concat / DefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000010120979968633036 s |
0.89 |
Concat / IDefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006804140002714121 s |
1.32 |
Concat / JaXPipe / cpu / Forward |
0.000013 s |
0.000010196720004387315 s |
1.27 |
Concat / Jax / cpu / Forward |
0.000012 s |
0.00001051378000738623 s |
1.14 |
Concat / HLOOpt / cpu / Forward |
0.000012 s |
0.000014242419974834774 s |
0.84 |
Concat / PartOpt / cpu / Forward |
0.000012 s |
0.000014837000007901224 s |
0.81 |
Concat / IPartOpt / cpu / Forward |
0.000013 s |
0.000010930519947578431 s |
1.19 |
Concat / DefOpt / cpu / Forward |
0.000041 s |
0.00001516530002845684 s |
2.70 |
Concat / IDefOpt / cpu / Forward |
0.000013 s |
0.000010654359984982876 s |
1.22 |
Concat / JaXPipe / cpu / PreRev |
0.000015 s |
0.000012291220000406611 s |
1.22 |
Concat / JaXPipe / cpu / PostRev |
0.000014 s |
0.000012073859988959156 s |
1.16 |
Concat / JaXPipe / cpu / BothRev |
0.000014 s |
0.000011819039991678438 s |
1.18 |
Concat / Jax / cpu / BothRev |
0.000014 s |
0.000012705760009339428 s |
1.10 |
Concat / HLOOpt / cpu / PreRev |
0.000015 s |
0.000011614619997999398 s |
1.29 |
Concat / HLOOpt / cpu / PostRev |
0.000014 s |
0.00001240181997673062 s |
1.13 |
Concat / HLOOpt / cpu / BothRev |
0.000015 s |
0.000013595179925687262 s |
1.10 |
Concat / PartOpt / cpu / PreRev |
0.000013 s |
0.000011691240015352378 s |
1.11 |
Concat / PartOpt / cpu / PostRev |
0.000015 s |
0.00001208772000609315 s |
1.24 |
Concat / PartOpt / cpu / BothRev |
0.000014 s |
0.000011353420031809946 s |
1.23 |
Concat / IPartOpt / cpu / PreRev |
0.000044 s |
0.000012248699995325296 s |
3.59 |
Concat / IPartOpt / cpu / PostRev |
0.000014 s |
0.000011892720031028149 s |
1.18 |
Concat / IPartOpt / cpu / BothRev |
0.000015 s |
0.00001183884000965918 s |
1.27 |
Concat / DefOpt / cpu / PreRev |
0.000014 s |
0.000012121159907110268 s |
1.16 |
Concat / DefOpt / cpu / PostRev |
0.000014 s |
0.000011294820005787189 s |
1.24 |
Concat / DefOpt / cpu / BothRev |
0.000023 s |
0.000012426319981386767 s |
1.85 |
Concat / IDefOpt / cpu / PreRev |
0.000014 s |
0.000011653779965854484 s |
1.20 |
Concat / IDefOpt / cpu / PostRev |
0.000014 s |
0.00001219744000081846 s |
1.15 |
Concat / IDefOpt / cpu / BothRev |
0.000015 s |
0.00001177469997855951 s |
1.27 |
const_scatter / JaXPipe / cpu / Primal |
0.000007858459948693053 s |
0.000007763400026306044 s |
1.01 |
const_scatter / Jax / cpu / Primal |
0.000008319000025949208 s |
0.0000070458600202982776 s |
1.18 |
const_scatter / HLOOpt / cpu / Primal |
0.000008191460010493756 s |
0.000006994660006967024 s |
1.17 |
const_scatter / PartOpt / cpu / Primal |
0.000007210560015664669 s |
0.0000072007200014923 s |
1.00 |
const_scatter / IPartOpt / cpu / Primal |
0.000007383600004686741 s |
0.000006805479988543084 s |
1.08 |
const_scatter / DefOpt / cpu / Primal |
0.000007526819972554222 s |
0.000006607559998883516 s |
1.14 |
const_scatter / IDefOpt / cpu / Primal |
0.000007296140047401423 s |
0.000006990259953454369 s |
1.04 |
const_scatter / JaXPipe / cpu / Forward |
0.00001072375997864583 s |
0.000010474540013092336 s |
1.02 |
const_scatter / Jax / cpu / Forward |
0.000010594060013318083 s |
0.000010772199975690456 s |
0.98 |
const_scatter / HLOOpt / cpu / Forward |
0.000014750539976375877 s |
0.000014144699998723808 s |
1.04 |
const_scatter / PartOpt / cpu / Forward |
0.000015150840017668088 s |
0.000015396779990624053 s |
0.98 |
const_scatter / IPartOpt / cpu / Forward |
0.000010379679997640778 s |
0.000010131839981113443 s |
1.02 |
const_scatter / DefOpt / cpu / Forward |
0.000015649040033167694 s |
0.000014296299987108796 s |
1.09 |
const_scatter / IDefOpt / cpu / Forward |
0.000010864899986700038 s |
0.00001000824006041512 s |
1.09 |
const_scatter / JaXPipe / cpu / PreRev |
0.0003033570000206 s |
0.0003034807000403 s |
1.00 |
const_scatter / JaXPipe / cpu / PostRev |
0.0002936581799713 s |
0.0002982542200425 s |
0.98 |
const_scatter / JaXPipe / cpu / BothRev |
0.0002863510400129 s |
0.000285023639999 s |
1.00 |
const_scatter / Jax / cpu / BothRev |
0.0002861106400359 s |
0.0002842879400213 s |
1.01 |
const_scatter / HLOOpt / cpu / PreRev |
0.0002875174600103 s |
0.0002853217399842 s |
1.01 |
const_scatter / HLOOpt / cpu / PostRev |
0.0002900775599755 s |
0.0002907451599548 s |
1.00 |
const_scatter / HLOOpt / cpu / BothRev |
0.0002879207200021 s |
0.0002864135400341 s |
1.01 |
const_scatter / PartOpt / cpu / PreRev |
0.0002904777799994 s |
0.0002835488199889 s |
1.02 |
const_scatter / PartOpt / cpu / PostRev |
0.0002927011999963 s |
0.0002898025000013 s |
1.01 |
const_scatter / PartOpt / cpu / BothRev |
0.0002876077000109 s |
0.0002821240400044 s |
1.02 |
const_scatter / IPartOpt / cpu / PreRev |
0.0002914237799723 s |
0.0002919069400104 s |
1.00 |
const_scatter / IPartOpt / cpu / PostRev |
0.000291716099955 s |
0.0002938434999578 s |
0.99 |
const_scatter / IPartOpt / cpu / BothRev |
0.0002864443399721 s |
0.0002843726199625 s |
1.01 |
const_scatter / DefOpt / cpu / PreRev |
0.0002932473599958 s |
0.0002895347999765 s |
1.01 |
const_scatter / DefOpt / cpu / PostRev |
0.0002930403200025 s |
0.000291647899985 s |
1.00 |
const_scatter / DefOpt / cpu / BothRev |
0.0002866178000112 s |
0.0002939875399533 s |
0.97 |
const_scatter / IDefOpt / cpu / PreRev |
0.000290391260014 s |
0.0002921339599561 s |
0.99 |
const_scatter / IDefOpt / cpu / PostRev |
0.0002924229199834 s |
0.0002916513599848 s |
1.00 |
const_scatter / IDefOpt / cpu / BothRev |
0.0002868765000403 s |
0.0002838417199109 s |
1.01 |
const_scatter / JaXPipe / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / Jax / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / HLOOpt / cuda / Primal |
0.000001888 s |
0.000001887 s |
1.00 |
const_scatter / PartOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / IPartOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / DefOpt / cuda / Primal |
0.000001887 s |
0.000001888 s |
1.00 |
const_scatter / IDefOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / JaXPipe / cuda / Forward |
0.000010144 s |
0.000009664 s |
1.05 |
const_scatter / Jax / cuda / Forward |
0.000010048 s |
0.00000944 s |
1.06 |
const_scatter / HLOOpt / cuda / Forward |
0.000010208 s |
0.000009728 s |
1.05 |
const_scatter / PartOpt / cuda / Forward |
0.000009888 s |
0.00000992 s |
1.00 |
const_scatter / IPartOpt / cuda / Forward |
0.000009856 s |
0.000009665 s |
1.02 |
const_scatter / DefOpt / cuda / Forward |
0.00001024 s |
0.000009535 s |
1.07 |
const_scatter / IDefOpt / cuda / Forward |
0.000010304 s |
0.000009952 s |
1.04 |
const_scatter / JaXPipe / cuda / PreRev |
0.000012575 s |
0.000012736 s |
0.99 |
const_scatter / JaXPipe / cuda / PostRev |
0.000025568 s |
0.0000168 s |
1.52 |
const_scatter / JaXPipe / cuda / BothRev |
0.000014016 s |
0.000012736 s |
1.10 |
const_scatter / Jax / cuda / BothRev |
0.000018752000000000003 s |
0.000016608 s |
1.13 |
const_scatter / HLOOpt / cuda / PreRev |
0.00001312 s |
0.000012672 s |
1.04 |
const_scatter / HLOOpt / cuda / PostRev |
0.000013632 s |
0.000012512 s |
1.09 |
const_scatter / HLOOpt / cuda / BothRev |
0.00001376 s |
0.00001296 s |
1.06 |
const_scatter / PartOpt / cuda / PreRev |
0.000014272 s |
0.000012992 s |
1.10 |
const_scatter / PartOpt / cuda / PostRev |
0.000016416 s |
0.000016383999999999998 s |
1.00 |
const_scatter / PartOpt / cuda / BothRev |
0.000013024 s |
0.000012736 s |
1.02 |
const_scatter / IPartOpt / cuda / PreRev |
0.000012448 s |
0.000012769 s |
0.97 |
const_scatter / IPartOpt / cuda / PostRev |
0.000016383999999999998 s |
0.000016416 s |
1.00 |
const_scatter / IPartOpt / cuda / BothRev |
0.000012384 s |
0.000012256 s |
1.01 |
const_scatter / DefOpt / cuda / PreRev |
0.000012448 s |
0.000012576 s |
0.99 |
const_scatter / DefOpt / cuda / PostRev |
0.000012929 s |
0.00001264 s |
1.02 |
const_scatter / DefOpt / cuda / BothRev |
0.000012864 s |
0.000013344 s |
0.96 |
const_scatter / IDefOpt / cuda / PreRev |
0.000012896 s |
0.000013088 s |
0.99 |
const_scatter / IDefOpt / cuda / PostRev |
0.000012767 s |
0.0000128 s |
1.00 |
const_scatter / IDefOpt / cuda / BothRev |
0.000012768 s |
0.000012767 s |
1.00 |
const_scatter / JaXPipe / tpu / Primal |
0.00000377545 s |
0.0000038034 s |
0.99 |
const_scatter / Jax / tpu / Primal |
0.00000383505 s |
0.00000380865 s |
1.01 |
const_scatter / HLOOpt / tpu / Primal |
9.53425e-7 s |
9.24775e-7 s |
1.03 |
const_scatter / PartOpt / tpu / Primal |
0.000003806675 s |
0.00000381505 s |
1.00 |
const_scatter / IPartOpt / tpu / Primal |
0.000003772525 s |
0.000003789125 s |
1.00 |
const_scatter / DefOpt / tpu / Primal |
9.73975e-7 s |
9.593500000000002e-7 s |
1.02 |
const_scatter / IDefOpt / tpu / Primal |
9.67225e-7 s |
9.3435e-7 s |
1.04 |
const_scatter / JaXPipe / tpu / Forward |
0.000001938325 s |
0.0000019250500000000003 s |
1.01 |
const_scatter / Jax / tpu / Forward |
0.000006491025 s |
0.000006493175000000001 s |
1.00 |
const_scatter / HLOOpt / tpu / Forward |
0.000001925175 s |
0.000001916875 s |
1.00 |
const_scatter / PartOpt / tpu / Forward |
0.000001958975 s |
0.000001943925 s |
1.01 |
const_scatter / IPartOpt / tpu / Forward |
0.00000192605 s |
0.00000192015 s |
1.00 |
const_scatter / DefOpt / tpu / Forward |
0.0000019687 s |
0.000001923975 s |
1.02 |
const_scatter / IDefOpt / tpu / Forward |
0.000001933 s |
0.0000019232 s |
1.01 |
const_scatter / JaXPipe / tpu / PreRev |
0.0000043261 s |
0.000004320925 s |
1.00 |
const_scatter / JaXPipe / tpu / PostRev |
0.0000066103 s |
0.00000660905 s |
1.00 |
const_scatter / JaXPipe / tpu / BothRev |
0.00000431415 s |
0.00000429955 s |
1.00 |
const_scatter / Jax / tpu / BothRev |
0.0000066142 s |
0.000006660325 s |
0.99 |
const_scatter / HLOOpt / tpu / PreRev |
0.0000043207 s |
0.000004301925 s |
1.00 |
const_scatter / HLOOpt / tpu / PostRev |
0.000004309425000000001 s |
0.000004297575000000001 s |
1.00 |
const_scatter / HLOOpt / tpu / BothRev |
0.00000431585 s |
0.00000430105 s |
1.00 |
const_scatter / PartOpt / tpu / PreRev |
0.0000043174750000000005 s |
0.0000043072 s |
1.00 |
const_scatter / PartOpt / tpu / PostRev |
0.00000660145 s |
0.000006595149999999999 s |
1.00 |
const_scatter / PartOpt / tpu / BothRev |
0.000004308375 s |
0.0000043066500000000005 s |
1.00 |
const_scatter / IPartOpt / tpu / PreRev |
0.0000043242 s |
0.000004302775 s |
1.00 |
const_scatter / IPartOpt / tpu / PostRev |
0.000006626975 s |
0.000006622050000000001 s |
1.00 |
const_scatter / IPartOpt / tpu / BothRev |
0.00000432295 s |
0.000004293175 s |
1.01 |
const_scatter / DefOpt / tpu / PreRev |
0.000004297425000000001 s |
0.000004297425000000001 s |
1 |
const_scatter / DefOpt / tpu / PostRev |
0.000004309725 s |
0.0000043024250000000006 s |
1.00 |
const_scatter / DefOpt / tpu / BothRev |
0.00000430405 s |
0.000004299799999999999 s |
1.00 |
const_scatter / IDefOpt / tpu / PreRev |
0.0000043157 s |
0.000004311499999999999 s |
1.00 |
const_scatter / IDefOpt / tpu / PostRev |
0.000004318675 s |
0.00000428725 s |
1.01 |
const_scatter / IDefOpt / tpu / BothRev |
0.0000043286 s |
0.00000428755 s |
1.01 |
const_scatter / JaXPipe / cpu / Primal |
0.000012945 s |
0.000007763400026306044 s |
1.67 |
const_scatter / Jax / cpu / Primal |
0.000013333 s |
0.0000070458600202982776 s |
1.89 |
const_scatter / HLOOpt / cpu / Primal |
0.000012977 s |
0.000006994660006967024 s |
1.86 |
const_scatter / PartOpt / cpu / Primal |
0.000012791 s |
0.0000072007200014923 s |
1.78 |
const_scatter / IPartOpt / cpu / Primal |
0.000012577 s |
0.000006805479988543084 s |
1.85 |
const_scatter / DefOpt / cpu / Primal |
0.000012885 s |
0.000006607559998883516 s |
1.95 |
const_scatter / IDefOpt / cpu / Primal |
0.000012843 s |
0.000006990259953454369 s |
1.84 |
const_scatter / JaXPipe / cpu / Forward |
0.000017576999999999998 s |
0.000010474540013092336 s |
1.68 |
const_scatter / Jax / cpu / Forward |
0.000016751 s |
0.000010772199975690456 s |
1.56 |
const_scatter / HLOOpt / cpu / Forward |
0.000016712000000000002 s |
0.000014144699998723808 s |
1.18 |
const_scatter / PartOpt / cpu / Forward |
0.000016934 s |
0.000015396779990624053 s |
1.10 |
const_scatter / IPartOpt / cpu / Forward |
0.00001694 s |
0.000010131839981113443 s |
1.67 |
const_scatter / DefOpt / cpu / Forward |
0.000016749 s |
0.000014296299987108796 s |
1.17 |
const_scatter / IDefOpt / cpu / Forward |
0.000016804 s |
0.00001000824006041512 s |
1.68 |
const_scatter / JaXPipe / cpu / PreRev |
0.0004947759999999 s |
0.0003034807000403 s |
1.63 |
const_scatter / JaXPipe / cpu / PostRev |
0.0005280479999999 s |
0.0002982542200425 s |
1.77 |
const_scatter / JaXPipe / cpu / BothRev |
0.000522243 s |
0.000285023639999 s |
1.83 |
const_scatter / Jax / cpu / BothRev |
0.000516483 s |
0.0002842879400213 s |
1.82 |
const_scatter / HLOOpt / cpu / PreRev |
0.000515135 s |
0.0002853217399842 s |
1.81 |
const_scatter / HLOOpt / cpu / PostRev |
0.0005107709999999 s |
0.0002907451599548 s |
1.76 |
const_scatter / HLOOpt / cpu / BothRev |
0.000491423 s |
0.0002864135400341 s |
1.72 |
const_scatter / PartOpt / cpu / PreRev |
0.000503264 s |
0.0002835488199889 s |
1.77 |
const_scatter / PartOpt / cpu / PostRev |
0.00050828 s |
0.0002898025000013 s |
1.75 |
const_scatter / PartOpt / cpu / BothRev |
0.000525003 s |
0.0002821240400044 s |
1.86 |
const_scatter / IPartOpt / cpu / PreRev |
0.000500425 s |
0.0002919069400104 s |
1.71 |
const_scatter / IPartOpt / cpu / PostRev |
0.000511533 s |
0.0002938434999578 s |
1.74 |
const_scatter / IPartOpt / cpu / BothRev |
0.00050068 s |
0.0002843726199625 s |
1.76 |
const_scatter / DefOpt / cpu / PreRev |
0.0005138019999999 s |
0.0002895347999765 s |
1.77 |
const_scatter / DefOpt / cpu / PostRev |
0.000523517 s |
0.000291647899985 s |
1.80 |
const_scatter / DefOpt / cpu / BothRev |
0.000533237 s |
0.0002939875399533 s |
1.81 |
const_scatter / IDefOpt / cpu / PreRev |
0.000517185 s |
0.0002921339599561 s |
1.77 |
const_scatter / IDefOpt / cpu / PostRev |
0.000521314 s |
0.0002916513599848 s |
1.79 |
const_scatter / IDefOpt / cpu / BothRev |
0.000501794 s |
0.0002838417199109 s |
1.77 |
const_scatter / JaXPipe / cpu / Primal |
0.000008 s |
0.000007763400026306044 s |
1.03 |
const_scatter / Jax / cpu / Primal |
0.000008 s |
0.0000070458600202982776 s |
1.14 |
const_scatter / HLOOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006994660006967024 s |
1.29 |
const_scatter / PartOpt / cpu / Primal |
0.000011 s |
0.0000072007200014923 s |
1.53 |
const_scatter / IPartOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006805479988543084 s |
1.32 |
const_scatter / DefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006607559998883516 s |
1.36 |
const_scatter / IDefOpt / cpu / Primal |
0.000008999999999999999 s |
0.000006990259953454369 s |
1.29 |
const_scatter / JaXPipe / cpu / Forward |
0.000012 s |
0.000010474540013092336 s |
1.15 |
const_scatter / Jax / cpu / Forward |
0.000038 s |
0.000010772199975690456 s |
3.53 |
const_scatter / HLOOpt / cpu / Forward |
0.000013 s |
0.000014144699998723808 s |
0.92 |
const_scatter / PartOpt / cpu / Forward |
0.000012 s |
0.000015396779990624053 s |
0.78 |
const_scatter / IPartOpt / cpu / Forward |
0.000041 s |
0.000010131839981113443 s |
4.05 |
const_scatter / DefOpt / cpu / Forward |
0.000012 s |
0.000014296299987108796 s |
0.84 |
const_scatter / IDefOpt / cpu / Forward |
0.000013 s |
0.00001000824006041512 s |
1.30 |
const_scatter / JaXPipe / cpu / PreRev |
0.000499 s |
0.0003034807000403 s |
1.64 |
const_scatter / JaXPipe / cpu / PostRev |
0.00035 s |
0.0002982542200425 s |
1.17 |
const_scatter / JaXPipe / cpu / BothRev |
0.000347 s |
0.000285023639999 s |
1.22 |
const_scatter / Jax / cpu / BothRev |
0.000357 s |
0.0002842879400213 s |
1.26 |
const_scatter / HLOOpt / cpu / PreRev |
0.000389 s |
0.0002853217399842 s |
1.36 |
const_scatter / HLOOpt / cpu / PostRev |
0.0003489999999999 s |
0.0002907451599548 s |
1.20 |
const_scatter / HLOOpt / cpu / BothRev |
0.000535 s |
0.0002864135400341 s |
1.87 |
const_scatter / PartOpt / cpu / PreRev |
0.000347 s |
0.0002835488199889 s |
1.22 |
const_scatter / PartOpt / cpu / PostRev |
0.000363 s |
0.0002898025000013 s |
1.25 |
const_scatter / PartOpt / cpu / BothRev |
0.000406 s |
0.0002821240400044 s |
1.44 |
const_scatter / IPartOpt / cpu / PreRev |
0.0004129999999999 s |
0.0002919069400104 s |
1.41 |
const_scatter / IPartOpt / cpu / PostRev |
0.0003529999999999 s |
0.0002938434999578 s |
1.20 |
const_scatter / IPartOpt / cpu / BothRev |
0.000463 s |
0.0002843726199625 s |
1.63 |
const_scatter / DefOpt / cpu / PreRev |
0.000402 s |
0.0002895347999765 s |
1.39 |
const_scatter / DefOpt / cpu / PostRev |
0.000414 s |
0.000291647899985 s |
1.42 |
const_scatter / DefOpt / cpu / BothRev |
0.000385 s |
0.0002939875399533 s |
1.31 |
const_scatter / IDefOpt / cpu / PreRev |
0.000345 s |
0.0002921339599561 s |
1.18 |
const_scatter / IDefOpt / cpu / PostRev |
0.0004869999999999 s |
0.0002916513599848 s |
1.67 |
const_scatter / IDefOpt / cpu / BothRev |
0.00056 s |
0.0002838417199109 s |
1.97 |
GenDot / JaXPipe / cpu / Primal |
0.000009710840013212872 s |
0.000007976340002642246 s |
1.22 |
GenDot / Jax / cpu / Primal |
0.00000787888004197157 s |
0.000007622319990332471 s |
1.03 |
GenDot / HLOOpt / cpu / Primal |
0.000012147539982834132 s |
0.000012125279999963822 s |
1.00 |
GenDot / PartOpt / cpu / Primal |
0.000007728399941697716 s |
0.000007736880015727365 s |
1.00 |
GenDot / IPartOpt / cpu / Primal |
0.000008285080057248705 s |
0.000008522200005245395 s |
0.97 |
GenDot / DefOpt / cpu / Primal |
0.00001270487999136094 s |
0.000007448159985870006 s |
1.71 |
GenDot / IDefOpt / cpu / Primal |
0.000007625459993505501 s |
0.000007787959984852933 s |
0.98 |
GenDot / JaXPipe / cpu / Forward |
0.00001248287998350861 s |
0.000011303560040687445 s |
1.10 |
GenDot / Jax / cpu / Forward |
0.000011816540054496726 s |
0.000010663020048014004 s |
1.11 |
GenDot / HLOOpt / cpu / Forward |
0.000015159180029513664 s |
0.00001102295996133762 s |
1.38 |
GenDot / PartOpt / cpu / Forward |
0.000016477140015922486 s |
0.000016053339995778514 s |
1.03 |
GenDot / IPartOpt / cpu / Forward |
0.000012227399965922812 s |
0.000011285260025033494 s |
1.08 |
GenDot / DefOpt / cpu / Forward |
0.000016626900014671263 s |
0.00001600067995241261 s |
1.04 |
GenDot / IDefOpt / cpu / Forward |
0.000011610940027821926 s |
0.000010661779988367926 s |
1.09 |
GenDot / JaXPipe / cpu / PreRev |
0.0000125530399964191 s |
0.000012155380027252247 s |
1.03 |
GenDot / JaXPipe / cpu / PostRev |
0.000011516340000525816 s |
0.000011094039991803584 s |
1.04 |
GenDot / JaXPipe / cpu / BothRev |
0.000017316259954895942 s |
0.000011813439987236053 s |
1.47 |
GenDot / Jax / cpu / BothRev |
0.00001180726001621224 s |
0.000011727659957614378 s |
1.01 |
GenDot / HLOOpt / cpu / PreRev |
0.000012128299968026113 s |
0.000011548159973244765 s |
1.05 |
GenDot / HLOOpt / cpu / PostRev |
0.000016628180010229698 s |
0.000015893500003585358 s |
1.05 |
GenDot / HLOOpt / cpu / BothRev |
0.000014366500026881113 s |
0.000016876320014489465 s |
0.85 |
GenDot / PartOpt / cpu / PreRev |
0.000012396180018185987 s |
0.00001174090000858996 s |
1.06 |
GenDot / PartOpt / cpu / PostRev |
0.000012323360015216168 s |
0.000010544900023887748 s |
1.17 |
GenDot / PartOpt / cpu / BothRev |
0.000011786739978560943 s |
0.000011326679987178067 s |
1.04 |
GenDot / IPartOpt / cpu / PreRev |
0.000012197320002087508 s |
0.00001432055995792325 s |
0.85 |
GenDot / IPartOpt / cpu / PostRev |
0.000011331659989082254 s |
0.000010636360002536094 s |
1.07 |
GenDot / IPartOpt / cpu / BothRev |
0.000011832999998659945 s |
0.00001155084003585216 s |
1.02 |
GenDot / DefOpt / cpu / PreRev |
0.000012428019981598482 s |
0.000011734499958038214 s |
1.06 |
GenDot / DefOpt / cpu / PostRev |
0.00001270140003725828 s |
0.00001169869996374473 s |
1.09 |
GenDot / DefOpt / cpu / BothRev |
0.000012315360017964848 s |
0.000011772220004786504 s |
1.05 |
GenDot / IDefOpt / cpu / PreRev |
0.00001252021997970587 s |
0.000011911119991054876 s |
1.05 |
GenDot / IDefOpt / cpu / PostRev |
0.000012556040046547424 s |
0.000011878620016432253 s |
1.06 |
GenDot / IDefOpt / cpu / BothRev |
0.000012506619987107116 s |
0.000011148640032843104 s |
1.12 |
GenDot / JaXPipe / cuda / Primal |
0.000002015 s |
0.000002015 s |
1 |
GenDot / Jax / cuda / Primal |
0.000002015 s |
0.000002015 s |
1 |
GenDot / HLOOpt / cuda / Primal |
0.000002015 s |
0.000001984 s |
1.02 |
GenDot / PartOpt / cuda / Primal |
0.000002016 s |
0.000002015 s |
1.00 |
GenDot / IPartOpt / cuda / Primal |
0.000002016 s |
0.000002015 s |
1.00 |
GenDot / DefOpt / cuda / Primal |
0.000002015 s |
0.000001984 s |
1.02 |
GenDot / IDefOpt / cuda / Primal |
0.000002015 s |
0.000001984 s |
1.02 |
GenDot / JaXPipe / cuda / Forward |
0.000009984 s |
0.000009856 s |
1.01 |
GenDot / Jax / cuda / Forward |
0.000010209 s |
0.00001008 s |
1.01 |
GenDot / HLOOpt / cuda / Forward |
0.000010207 s |
0.00000992 s |
1.03 |
GenDot / PartOpt / cuda / Forward |
0.000010272 s |
0.000010176 s |
1.01 |
GenDot / IPartOpt / cuda / Forward |
0.000010336 s |
0.000010304 s |
1.00 |
GenDot / DefOpt / cuda / Forward |
0.000010016 s |
0.000009984 s |
1.00 |
GenDot / IDefOpt / cuda / Forward |
0.000010369 s |
0.000010304 s |
1.01 |
GenDot / JaXPipe / cuda / PreRev |
0.00000992 s |
0.000010176 s |
0.97 |
GenDot / JaXPipe / cuda / PostRev |
0.000010272 s |
0.000010144 s |
1.01 |
GenDot / JaXPipe / cuda / BothRev |
0.000010048 s |
0.000010208 s |
0.98 |
GenDot / Jax / cuda / BothRev |
0.000010368 s |
0.000011488 s |
0.90 |
GenDot / HLOOpt / cuda / PreRev |
0.000010144 s |
0.000011104 s |
0.91 |
GenDot / HLOOpt / cuda / PostRev |
0.00001008 s |
0.000011103 s |
0.91 |
GenDot / HLOOpt / cuda / BothRev |
0.00001024 s |
0.00001136 s |
0.90 |
GenDot / PartOpt / cuda / PreRev |
0.000009984 s |
0.00001168 s |
0.85 |
GenDot / PartOpt / cuda / PostRev |
0.000010176 s |
0.000009696 s |
1.05 |
GenDot / PartOpt / cuda / BothRev |
0.000010208 s |
0.000009889 s |
1.03 |
GenDot / IPartOpt / cuda / PreRev |
0.000010208 s |
0.00001008 s |
1.01 |
GenDot / IPartOpt / cuda / PostRev |
0.000010528 s |
0.000011296 s |
0.93 |
GenDot / IPartOpt / cuda / BothRev |
0.000010815 s |
0.000010208 s |
1.06 |
GenDot / DefOpt / cuda / PreRev |
0.00001072 s |
0.000009664 s |
1.11 |
GenDot / DefOpt / cuda / PostRev |
0.000010592 s |
0.000009503 s |
1.11 |
GenDot / DefOpt / cuda / BothRev |
0.000009856 s |
0.000010016 s |
0.98 |
GenDot / IDefOpt / cuda / PreRev |
0.000010176 s |
0.000010112 s |
1.01 |
GenDot / IDefOpt / cuda / PostRev |
0.000010209 s |
0.000009889 s |
1.03 |
GenDot / IDefOpt / cuda / BothRev |
0.000009952 s |
0.000010144 s |
0.98 |
GenDot / JaXPipe / tpu / Primal |
9.256e-7 s |
9.30225e-7 s |
1.00 |
GenDot / Jax / tpu / Primal |
9.35825e-7 s |
9.357e-7 s |
1.00 |
GenDot / HLOOpt / tpu / Primal |
0.0000015487 s |
0.0000015747 s |
0.98 |
GenDot / PartOpt / tpu / Primal |
9.357e-7 s |
9.36175e-7 s |
1.00 |
GenDot / IPartOpt / tpu / Primal |
9.3595e-7 s |
9.4085e-7 s |
0.99 |
GenDot / DefOpt / tpu / Primal |
0.000001491675 s |
0.000001483575 s |
1.01 |
GenDot / IDefOpt / tpu / Primal |
0.00000155825 s |
0.0000015670249999999998 s |
0.99 |
GenDot / JaXPipe / tpu / Forward |
0.000003168825 s |
0.000003160525 s |
1.00 |
GenDot / Jax / tpu / Forward |
0.000002326225 s |
0.0000023322 s |
1.00 |
GenDot / HLOOpt / tpu / Forward |
0.00000312795 s |
0.0000031071500000000004 s |
1.01 |
GenDot / PartOpt / tpu / Forward |
0.0000032094 s |
0.0000032155250000000004 s |
1.00 |
GenDot / IPartOpt / tpu / Forward |
0.000003106475 s |
0.000003114575 s |
1.00 |
GenDot / DefOpt / tpu / Forward |
0.000003209225 s |
0.000003211525 s |
1.00 |
GenDot / IDefOpt / tpu / Forward |
0.000003114225 s |
0.00000311245 s |
1.00 |
GenDot / JaXPipe / tpu / PreRev |
0.000002947025 s |
0.00000295445 s |
1.00 |
GenDot / JaXPipe / tpu / PostRev |
0.000002405675 s |
0.000002400925 s |
1.00 |
GenDot / JaXPipe / tpu / BothRev |
0.0000029545500000000004 s |
0.0000029649750000000004 s |
1.00 |
GenDot / Jax / tpu / BothRev |
0.0000024078 s |
0.000002409925 s |
1.00 |
GenDot / HLOOpt / tpu / PreRev |
0.00000294865 s |
0.000002956325 s |
1.00 |
GenDot / HLOOpt / tpu / PostRev |
0.000002937 s |
0.00000293495 s |
1.00 |
GenDot / HLOOpt / tpu / BothRev |
0.000002953875 s |
0.000002958175 s |
1.00 |
GenDot / PartOpt / tpu / PreRev |
0.000002925925 s |
0.0000029199 s |
1.00 |
GenDot / PartOpt / tpu / PostRev |
0.000002384575 s |
0.000002395525 s |
1.00 |
GenDot / PartOpt / tpu / BothRev |
0.000002932875 s |
0.000002937575 s |
1.00 |
GenDot / IPartOpt / tpu / PreRev |
0.0000029503 s |
0.00000295935 s |
1.00 |
GenDot / IPartOpt / tpu / PostRev |
0.00000240985 s |
0.000002409825 s |
1.00 |
GenDot / IPartOpt / tpu / BothRev |
0.000002945675 s |
0.00000296385 s |
0.99 |
GenDot / DefOpt / tpu / PreRev |
0.0000029353 s |
0.00000292795 s |
1.00 |
GenDot / DefOpt / tpu / PostRev |
0.000002952125 s |
0.000002959675 s |
1.00 |
GenDot / DefOpt / tpu / BothRev |
0.0000029304 s |
0.000002942425 s |
1.00 |
GenDot / IDefOpt / tpu / PreRev |
0.000002957 s |
0.00000296965 s |
1.00 |
GenDot / IDefOpt / tpu / PostRev |
0.0000029308 s |
0.00000292445 s |
1.00 |
GenDot / IDefOpt / tpu / BothRev |
0.000002956925 s |
0.0000029509 s |
1.00 |
GenDot / JaXPipe / cpu / Primal |
0.000015042 s |
0.000007976340002642246 s |
1.89 |
GenDot / Jax / cpu / Primal |
0.000015412 s |
0.000007622319990332471 s |
2.02 |
GenDot / HLOOpt / cpu / Primal |
0.000014071 s |
0.000012125279999963822 s |
1.16 |
GenDot / PartOpt / cpu / Primal |
0.00001525 s |
0.000007736880015727365 s |
1.97 |
GenDot / IPartOpt / cpu / Primal |
0.000014611 s |
0.000008522200005245395 s |
1.71 |
GenDot / DefOpt / cpu / Primal |
0.000014126 s |
0.000007448159985870006 s |
1.90 |
GenDot / IDefOpt / cpu / Primal |
0.000014207 s |
0.000007787959984852933 s |
1.82 |
GenDot / JaXPipe / cpu / Forward |
0.00001935 s |
0.000011303560040687445 s |
1.71 |
GenDot / Jax / cpu / Forward |
0.000020797 s |
0.000010663020048014004 s |
1.95 |
GenDot / HLOOpt / cpu / Forward |
0.000018746 s |
0.00001102295996133762 s |
1.70 |
GenDot / PartOpt / cpu / Forward |
0.000019276 s |
0.000016053339995778514 s |
1.20 |
GenDot / IPartOpt / cpu / Forward |
0.000019659 s |
0.000011285260025033494 s |
1.74 |
GenDot / DefOpt / cpu / Forward |
0.000019225 s |
0.00001600067995241261 s |
1.20 |
GenDot / IDefOpt / cpu / Forward |
0.000019296 s |
0.000010661779988367926 s |
1.81 |
GenDot / JaXPipe / cpu / PreRev |
0.000019987 s |
0.000012155380027252247 s |
1.64 |
GenDot / JaXPipe / cpu / PostRev |
0.000021131 s |
0.000011094039991803584 s |
1.90 |
GenDot / JaXPipe / cpu / BothRev |
0.000020627 s |
0.000011813439987236053 s |
1.75 |
GenDot / Jax / cpu / BothRev |
0.00002108 s |
0.000011727659957614378 s |
1.80 |
GenDot / HLOOpt / cpu / PreRev |
0.000019222 s |
0.000011548159973244765 s |
1.66 |
GenDot / HLOOpt / cpu / PostRev |
0.000019594 s |
0.000015893500003585358 s |
1.23 |
GenDot / HLOOpt / cpu / BothRev |
0.000019729 s |
0.000016876320014489465 s |
1.17 |
GenDot / PartOpt / cpu / PreRev |
0.000019672 s |
0.00001174090000858996 s |
1.68 |
GenDot / PartOpt / cpu / PostRev |
0.00002183 s |
0.000010544900023887748 s |
2.07 |
GenDot / PartOpt / cpu / BothRev |
0.00001997 s |
0.000011326679987178067 s |
1.76 |
GenDot / IPartOpt / cpu / PreRev |
0.000019124 s |
0.00001432055995792325 s |
1.34 |
GenDot / IPartOpt / cpu / PostRev |
0.000021158 s |
0.000010636360002536094 s |
1.99 |
GenDot / IPartOpt / cpu / BothRev |
0.000020059 s |
0.00001155084003585216 s |
1.74 |
GenDot / DefOpt / cpu / PreRev |
0.000019013 s |
0.000011734499958038214 s |
1.62 |
GenDot / DefOpt / cpu / PostRev |
0.000020357 s |
0.00001169869996374473 s |
1.74 |
GenDot / DefOpt / cpu / BothRev |
0.000019279 s |
0.000011772220004786504 s |
1.64 |
GenDot / IDefOpt / cpu / PreRev |
0.000019516 s |
0.000011911119991054876 s |
1.64 |
GenDot / IDefOpt / cpu / PostRev |
0.0000199 s |
0.000011878620016432253 s |
1.68 |
GenDot / IDefOpt / cpu / BothRev |
0.00002027 s |
0.000011148640032843104 s |
1.82 |
GenDot / JaXPipe / cpu / Primal |
0.00001 s |
0.000007976340002642246 s |
1.25 |
GenDot / Jax / cpu / Primal |
0.00001 s |
0.000007622319990332471 s |
1.31 |
GenDot / HLOOpt / cpu / Primal |
0.000034 s |
0.000012125279999963822 s |
2.80 |
GenDot / PartOpt / cpu / Primal |
0.00001 s |
0.000007736880015727365 s |
1.29 |
GenDot / IPartOpt / cpu / Primal |
0.000013 s |
0.000008522200005245395 s |
1.53 |
GenDot / DefOpt / cpu / Primal |
0.00001 s |
0.000007448159985870006 s |
1.34 |
GenDot / IDefOpt / cpu / Primal |
0.000009 s |
0.000007787959984852933 s |
1.16 |
GenDot / JaXPipe / cpu / Forward |
0.000014 s |
0.000011303560040687445 s |
1.24 |
GenDot / Jax / cpu / Forward |
0.000015 s |
0.000010663020048014004 s |
1.41 |
GenDot / HLOOpt / cpu / Forward |
0.000015 s |
0.00001102295996133762 s |
1.36 |
GenDot / PartOpt / cpu / Forward |
0.000016 s |
0.000016053339995778514 s |
1.00 |
GenDot / IPartOpt / cpu / Forward |
0.000014 s |
0.000011285260025033494 s |
1.24 |
GenDot / DefOpt / cpu / Forward |
0.000013 s |
0.00001600067995241261 s |
0.81 |
GenDot / IDefOpt / cpu / Forward |
0.000013 s |
0.000010661779988367926 s |
1.22 |
GenDot / JaXPipe / cpu / PreRev |
0.000014 s |
0.000012155380027252247 s |
1.15 |
GenDot / JaXPipe / cpu / PostRev |
0.000015 s |
0.000011094039991803584 s |
1.35 |
GenDot / JaXPipe / cpu / BothRev |
0.000026 s |
0.000011813439987236053 s |
2.20 |
GenDot / Jax / cpu / BothRev |
0.000015 s |
0.000011727659957614378 s |
1.28 |
GenDot / HLOOpt / cpu / PreRev |
0.000014 s |
0.000011548159973244765 s |
1.21 |
GenDot / HLOOpt / cpu / PostRev |
0.000019 s |
0.000015893500003585358 s |
1.20 |
GenDot / HLOOpt / cpu / BothRev |
0.000014 s |
0.000016876320014489465 s |
0.83 |
GenDot / PartOpt / cpu / PreRev |
0.000014 s |
0.00001174090000858996 s |
1.19 |
GenDot / PartOpt / cpu / PostRev |
0.000015 s |
0.000010544900023887748 s |
1.42 |
GenDot / PartOpt / cpu / BothRev |
0.000014 s |
0.000011326679987178067 s |
1.24 |
GenDot / IPartOpt / cpu / PreRev |
0.000014 s |
0.00001432055995792325 s |
0.98 |
GenDot / IPartOpt / cpu / PostRev |
0.000016 s |
0.000010636360002536094 s |
1.50 |
GenDot / IPartOpt / cpu / BothRev |
0.000014 s |
0.00001155084003585216 s |
1.21 |
GenDot / DefOpt / cpu / PreRev |
0.000013 s |
0.000011734499958038214 s |
1.11 |
GenDot / DefOpt / cpu / PostRev |
0.000015 s |
0.00001169869996374473 s |
1.28 |
GenDot / DefOpt / cpu / BothRev |
0.000014 s |
0.000011772220004786504 s |
1.19 |
GenDot / IDefOpt / cpu / PreRev |
0.000014 s |
0.000011911119991054876 s |
1.18 |
GenDot / IDefOpt / cpu / PostRev |
0.000015 s |
0.000011878620016432253 s |
1.26 |
GenDot / IDefOpt / cpu / BothRev |
0.000044 s |
0.000011148640032843104 s |
3.95 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000011161940019519534 s |
0.000011683219945552991 s |
0.96 |
hlo_ffi / Jax / cpu / Primal |
0.000011351999992257334 s |
0.00001093347996174998 s |
1.04 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000014426640000237968 s |
0.000010873880028157143 s |
1.33 |
hlo_ffi / PartOpt / cpu / Primal |
0.000011115759953099767 s |
0.000010604800017972591 s |
1.05 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000010841720040843938 s |
0.000010708660011005122 s |
1.01 |
hlo_ffi / DefOpt / cpu / Primal |
0.000014101220021984774 s |
0.000014841120018900257 s |
0.95 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000010800600048241904 s |
0.000011009920008291374 s |
0.98 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000016400420017816943 s |
0.000016810779980005462 s |
0.98 |
hlo_ffi / Jax / cpu / Forward |
0.000016374499982703127 s |
0.000016500559986525332 s |
0.99 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000015835140011404292 s |
0.000016243679965555202 s |
0.97 |
hlo_ffi / PartOpt / cpu / Forward |
0.000016182399967874516 s |
0.000016591320008956245 s |
0.98 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000016748439984439757 s |
0.0000167376999706903 s |
1.00 |
hlo_ffi / DefOpt / cpu / Forward |
0.00001623438002752664 s |
0.000016887200044948257 s |
0.96 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000016157259969986627 s |
0.00001706424001895357 s |
0.95 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000016021979954530254 s |
0.00001627081997867208 s |
0.98 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.0000160561800475989 s |
0.000016024260066842543 s |
1.00 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.00001613188003830146 s |
0.000016259600006378605 s |
0.99 |
hlo_ffi / Jax / cpu / BothRev |
0.000016405300011683722 s |
0.000016137319989866227 s |
1.02 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.00001585205996889272 s |
0.000016437020030934944 s |
0.96 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000015580380049868837 s |
0.00001556615996378241 s |
1.00 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000017279060039072646 s |
0.000017776600006982335 s |
0.97 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000016523120020792703 s |
0.00001579934000801586 s |
1.05 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000016308779995597432 s |
0.0000158483600080217 s |
1.03 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000016716080008336576 s |
0.00001626846002181992 s |
1.03 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.00001639620000787545 s |
0.00001559686000291549 s |
1.05 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000016330520011251793 s |
0.000016143759967235383 s |
1.01 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000016686200033291244 s |
0.000016162379997695096 s |
1.03 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000015798820022610016 s |
0.000016102459967441972 s |
0.98 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000016521560019100436 s |
0.00001652375997764466 s |
1.00 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000015938539972921718 s |
0.00001630163998925127 s |
0.98 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000016353860000890564 s |
0.000016101619985420255 s |
1.02 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000015935660021568763 s |
0.000015861459987718264 s |
1.00 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000016035239987104433 s |
0.000016633619961794465 s |
0.96 |
hlo_ffi / JaXPipe / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / Jax / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / HLOOpt / cuda / Primal |
0.000001983 s |
0.000001983 s |
1.00 |
hlo_ffi / PartOpt / cuda / Primal |
0.000001983 s |
0.000001983 s |
1.00 |
hlo_ffi / IPartOpt / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / DefOpt / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / IDefOpt / cuda / Primal |
0.000001983 s |
0.000001983 s |
1.00 |
hlo_ffi / JaXPipe / cuda / Forward |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / Jax / cuda / Forward |
0.00000208 s |
0.000002047 s |
1.02 |
hlo_ffi / HLOOpt / cuda / Forward |
0.00000208 s |
0.000002047 s |
1.02 |
hlo_ffi / PartOpt / cuda / Forward |
0.00000208 s |
0.000002047 s |
1.02 |
hlo_ffi / IPartOpt / cuda / Forward |
0.00000208 s |
0.000002047 s |
1.02 |
hlo_ffi / DefOpt / cuda / Forward |
0.00000208 s |
0.000002048 s |
1.02 |
hlo_ffi / IDefOpt / cuda / Forward |
0.00000208 s |
0.000002048 s |
1.02 |
hlo_ffi / JaXPipe / cuda / PreRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / JaXPipe / cuda / PostRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / JaXPipe / cuda / BothRev |
0.000002047 s |
0.000002047 s |
1.00 |
hlo_ffi / Jax / cuda / BothRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / HLOOpt / cuda / PreRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / HLOOpt / cuda / PostRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / HLOOpt / cuda / BothRev |
0.000002048 s |
0.000002048 s |
1.00 |
hlo_ffi / PartOpt / cuda / PreRev |
0.000002047 s |
0.000002047 s |
1.00 |
hlo_ffi / PartOpt / cuda / PostRev |
0.000002047 s |
0.000002047 s |
1.00 |
hlo_ffi / PartOpt / cuda / BothRev |
0.000002047 s |
0.000002047 s |
1.00 |
hlo_ffi / IPartOpt / cuda / PreRev |
0.000002048 s |
0.000002048 s |
1.00 |
hlo_ffi / IPartOpt / cuda / PostRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / IPartOpt / cuda / BothRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / DefOpt / cuda / PreRev |
0.000002048 s |
0.000002048 s |
1.00 |
hlo_ffi / DefOpt / cuda / PostRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / DefOpt / cuda / BothRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / IDefOpt / cuda / PreRev |
0.000002047 s |
0.000002047 s |
1.00 |
hlo_ffi / IDefOpt / cuda / PostRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / IDefOpt / cuda / BothRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / JaXPipe / tpu / Primal |
9.2765e-7 s |
9.3345e-7 s |
0.99 |
hlo_ffi / Jax / tpu / Primal |
9.4985e-7 s |
9.5385e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Primal |
9.0635e-7 s |
9.08725e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Primal |
9.50525e-7 s |
9.57e-7 s |
0.99 |
hlo_ffi / IPartOpt / tpu / Primal |
9.0945e-7 s |
9.13175e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Primal |
9.53925e-7 s |
9.671e-7 s |
0.99 |
hlo_ffi / IDefOpt / tpu / Primal |
9.0615e-7 s |
9.112e-7 s |
0.99 |
hlo_ffi / JaXPipe / tpu / Forward |
9.49575e-7 s |
9.48875e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Forward |
9.8165e-7 s |
9.8175e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Forward |
9.74625e-7 s |
9.73825e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Forward |
9.34375e-7 s |
9.34275e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / Forward |
9.74175e-7 s |
9.73675e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Forward |
9.3395e-7 s |
9.341e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Forward |
9.74e-7 s |
9.738e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PreRev |
9.38775e-7 s |
9.388e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PostRev |
9.64375e-7 s |
9.64375e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / BothRev |
9.6035e-7 s |
9.601e-7 s |
1.00 |
hlo_ffi / Jax / tpu / BothRev |
9.64675e-7 s |
9.648e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PreRev |
9.60575e-7 s |
9.60525e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PostRev |
9.6515e-7 s |
9.6485e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / BothRev |
9.60375e-7 s |
9.60475e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PreRev |
9.6525e-7 s |
9.65375e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PostRev |
9.60375e-7 s |
9.596e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / BothRev |
9.652e-7 s |
9.64525e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PreRev |
9.6045e-7 s |
9.5985e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PostRev |
9.65175e-7 s |
9.646e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / BothRev |
9.59875e-7 s |
9.6025e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PreRev |
9.651e-7 s |
9.64575e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PostRev |
9.6015e-7 s |
9.60475e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / BothRev |
9.6495e-7 s |
9.64275e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PreRev |
9.59825e-7 s |
9.608e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PostRev |
9.6495e-7 s |
9.64675e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / BothRev |
9.602e-7 s |
9.599e-7 s |
1.00 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000017666 s |
0.000011683219945552991 s |
1.51 |
hlo_ffi / Jax / cpu / Primal |
0.000017556 s |
0.00001093347996174998 s |
1.61 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000017354 s |
0.000010873880028157143 s |
1.60 |
hlo_ffi / PartOpt / cpu / Primal |
0.000017032 s |
0.000010604800017972591 s |
1.61 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000017371 s |
0.000010708660011005122 s |
1.62 |
hlo_ffi / DefOpt / cpu / Primal |
0.000017774 s |
0.000014841120018900257 s |
1.20 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000016956 s |
0.000011009920008291374 s |
1.54 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000024754 s |
0.000016810779980005462 s |
1.47 |
hlo_ffi / Jax / cpu / Forward |
0.000024594 s |
0.000016500559986525332 s |
1.49 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000024073 s |
0.000016243679965555202 s |
1.48 |
hlo_ffi / PartOpt / cpu / Forward |
0.000023941 s |
0.000016591320008956245 s |
1.44 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000024387 s |
0.0000167376999706903 s |
1.46 |
hlo_ffi / DefOpt / cpu / Forward |
0.00002399 s |
0.000016887200044948257 s |
1.42 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000023929 s |
0.00001706424001895357 s |
1.40 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000024599 s |
0.00001627081997867208 s |
1.51 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.00002485 s |
0.000016024260066842543 s |
1.55 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000025 s |
0.000016259600006378605 s |
1.54 |
hlo_ffi / Jax / cpu / BothRev |
0.000024307 s |
0.000016137319989866227 s |
1.51 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000026238 s |
0.000016437020030934944 s |
1.60 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000025836 s |
0.00001556615996378241 s |
1.66 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000024421 s |
0.000017776600006982335 s |
1.37 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000024529 s |
0.00001579934000801586 s |
1.55 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000025199 s |
0.0000158483600080217 s |
1.59 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000025386 s |
0.00001626846002181992 s |
1.56 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000024544 s |
0.00001559686000291549 s |
1.57 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000024798 s |
0.000016143759967235383 s |
1.54 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000024542 s |
0.000016162379997695096 s |
1.52 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000024733 s |
0.000016102459967441972 s |
1.54 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000025887 s |
0.00001652375997764466 s |
1.57 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000025966 s |
0.00001630163998925127 s |
1.59 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000024077 s |
0.000016101619985420255 s |
1.50 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000025648 s |
0.000015861459987718264 s |
1.62 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.00002578 s |
0.000016633619961794465 s |
1.55 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000012 s |
0.000011683219945552991 s |
1.03 |
hlo_ffi / Jax / cpu / Primal |
0.000012 s |
0.00001093347996174998 s |
1.10 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000013 s |
0.000010873880028157143 s |
1.20 |
hlo_ffi / PartOpt / cpu / Primal |
0.000013 s |
0.000010604800017972591 s |
1.23 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000013 s |
0.000010708660011005122 s |
1.21 |
hlo_ffi / DefOpt / cpu / Primal |
0.000013 s |
0.000014841120018900257 s |
0.88 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000013 s |
0.000011009920008291374 s |
1.18 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000018 s |
0.000016810779980005462 s |
1.07 |
hlo_ffi / Jax / cpu / Forward |
0.000018 s |
0.000016500559986525332 s |
1.09 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000017 s |
0.000016243679965555202 s |
1.05 |
hlo_ffi / PartOpt / cpu / Forward |
0.000017 s |
0.000016591320008956245 s |
1.02 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000016 s |
0.0000167376999706903 s |
0.96 |
hlo_ffi / DefOpt / cpu / Forward |
0.000017 s |
0.000016887200044948257 s |
1.01 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000018 s |
0.00001706424001895357 s |
1.05 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000017 s |
0.00001627081997867208 s |
1.04 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000017 s |
0.000016024260066842543 s |
1.06 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000017 s |
0.000016259600006378605 s |
1.05 |
hlo_ffi / Jax / cpu / BothRev |
0.000017 s |
0.000016137319989866227 s |
1.05 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000017 s |
0.000016437020030934944 s |
1.03 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000017 s |
0.00001556615996378241 s |
1.09 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000056 s |
0.000017776600006982335 s |
3.15 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000018 s |
0.00001579934000801586 s |
1.14 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000018 s |
0.0000158483600080217 s |
1.14 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000017 s |
0.00001626846002181992 s |
1.04 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000017 s |
0.00001559686000291549 s |
1.09 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000018 s |
0.000016143759967235383 s |
1.11 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000017 s |
0.000016162379997695096 s |
1.05 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000018 s |
0.000016102459967441972 s |
1.12 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000017 s |
0.00001652375997764466 s |
1.03 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000017 s |
0.00001630163998925127 s |
1.04 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000017 s |
0.000016101619985420255 s |
1.06 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000017 s |
0.000015861459987718264 s |
1.07 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000017 s |
0.000016633619961794465 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.0011394164000193 s |
0.0010979923999911 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.0010211476000222 s |
0.0009642218000408 s |
1.06 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.0009822407999308 s |
0.0009914535999996 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0009377344001222 s |
0.0009346603998892 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0009606779999558 s |
0.0009413688000677 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.0010039738000159 s |
0.001004744400052 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.000997136999922 s |
0.0010086406000482 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.0028962450000108 s |
0.0027766954000981 s |
1.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.0023513898000601 s |
0.0023500500001318 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.0024714459998904 s |
0.0022446068000135 s |
1.10 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.0022732717999133 s |
0.0022493833998851 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.0022815433999312 s |
0.0021829849999448 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.0022654426000372 s |
0.002215745400008 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.0022645318000286 s |
0.0024366134001866 s |
0.93 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.0066986801999519 s |
0.0066071121999812 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.0063430310000512 s |
0.0060330907999741 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.0057026240001505 s |
0.0055423939998945 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.0042185371999039 s |
0.0057159447999765 s |
0.74 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.0061754990000736 s |
0.0056442914000399 s |
1.09 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.0033079599999837 s |
0.0054469434000566 s |
0.61 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.0057557478000489 s |
0.0055817624000155 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.0048426846001348 s |
0.0072480184000596 s |
0.67 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.0061545509999632 s |
0.0057854932002555 s |
1.06 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.0034732670001176 s |
0.0056285987998307 s |
0.62 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.005796855600056 s |
0.005813245399986 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.0036269055999582 s |
0.0059623540000757 s |
0.61 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.0058944771999449 s |
0.0058417539999027 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.0034753680000903 s |
0.005578324200087 s |
0.62 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.0056282336000549 s |
0.0057755715999519 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.0040375607998612 s |
0.0058006471998851 s |
0.70 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.0057279065999864 s |
0.0057635594001112 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.0034775234001244 s |
0.0058320177999121 s |
0.60 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.0057100048000393 s |
0.0059808674001033 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal |
0.000274145 s |
0.00028128 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal |
0.000272544 s |
0.000280961 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal |
0.000287233 s |
0.000288864 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal |
0.000272545 s |
0.00028032 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal |
0.000274497 s |
0.00028272 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal |
0.000288032 s |
0.000289537 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal |
0.00028784 s |
0.000288833 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward |
0.000558465 s |
0.000562049 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward |
0.000538433 s |
0.000539745 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward |
0.000557761 s |
0.000561377 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward |
0.00055824 s |
0.000561089 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward |
0.000558881 s |
0.000559425 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward |
0.00055853 s |
0.000561569 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward |
0.000558337 s |
0.000561472 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev |
0.001024482 s |
0.001054273 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev |
0.000986849 s |
0.001012514 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev |
0.001022466 s |
0.001048611 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev |
0.000981601 s |
0.001013345 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev |
0.001007138 s |
0.001035266 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev |
0.001033346 s |
0.001061505 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev |
0.00100749 s |
0.001033346 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev |
0.001024129 s |
0.00105165 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev |
0.000970658 s |
0.000998978 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev |
0.001024033 s |
0.001049281 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev |
0.00102237 s |
0.001049185 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev |
0.000972354 s |
0.000999874 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev |
0.001021825 s |
0.001049249 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev |
0.001017825 s |
0.001045313 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev |
0.000956705 s |
0.000981729 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev |
0.001019874 s |
0.001048194 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev |
0.001018177 s |
0.001044449 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev |
0.001019554 s |
0.001045025 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev |
0.001020769 s |
0.001046785 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal |
0.00012390525 s |
0.000124138 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal |
0.000126831 s |
0.0001265315 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal |
0.00015265225 s |
0.00015248825 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal |
0.00013381225 s |
0.00013425675 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal |
0.000130815 s |
0.00013130825 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal |
0.00014763875 s |
0.000148157 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal |
0.0001508275 s |
0.0001509649999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward |
0.0002119565 s |
0.00021195375 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward |
0.0002610409999999 s |
0.00026127075 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward |
0.000211834 s |
0.0002120705 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward |
0.0002183465 s |
0.000218458 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward |
0.00021252675 s |
0.000212148 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward |
0.0002181719999999 s |
0.0002188435 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward |
0.00021192975 s |
0.00021205775 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev |
0.00035582275 s |
0.0003543785 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev |
0.00025678275 s |
0.00025620625 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev |
0.00035629325 s |
0.0003539884999999 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev |
0.0002569964999999 s |
0.00025666725 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev |
0.00035625675 s |
0.000354111 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev |
0.00029120825 s |
0.0002908894999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev |
0.000356395 s |
0.00035398575 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev |
0.0003561189999999 s |
0.00035565625 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev |
0.0002720195 s |
0.00027080325 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev |
0.0003558807499999 s |
0.0003549992499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev |
0.000356267 s |
0.0003538355 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev |
0.00027263975 s |
0.0002722535 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev |
0.0003564195 s |
0.0003536705 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev |
0.00035795875 s |
0.0003572295 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev |
0.00028446375 s |
0.00028288975 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev |
0.00035822725 s |
0.00035754225 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev |
0.000358309 s |
0.00035657975 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev |
0.0003014005 s |
0.0003010629999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev |
0.0003587345 s |
0.00035641975 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.002241534 s |
0.0010979923999911 s |
2.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.002216465 s |
0.0009642218000408 s |
2.30 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.00212408 s |
0.0009914535999996 s |
2.14 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.002197731 s |
0.0009346603998892 s |
2.35 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0021961119999999 s |
0.0009413688000677 s |
2.33 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.002373761 s |
0.001004744400052 s |
2.36 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.00225613 s |
0.0010086406000482 s |
2.24 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.005529156 s |
0.0027766954000981 s |
1.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.005023817 s |
0.0023500500001318 s |
2.14 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.005419653 s |
0.0022446068000135 s |
2.41 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.005645761 s |
0.0022493833998851 s |
2.51 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.005693438 s |
0.0021829849999448 s |
2.61 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.005283076 s |
0.002215745400008 s |
2.38 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.0054864449999999 s |
0.0024366134001866 s |
2.25 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.010451774 s |
0.0066071121999812 s |
1.58 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.009404883 s |
0.0060330907999741 s |
1.56 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.007992624 s |
0.0055423939998945 s |
1.44 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.008630675 s |
0.0057159447999765 s |
1.51 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.009159435 s |
0.0056442914000399 s |
1.62 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.007988127 s |
0.0054469434000566 s |
1.47 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.008718377 s |
0.0055817624000155 s |
1.56 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.009059383 s |
0.0072480184000596 s |
1.25 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.0090106369999999 s |
0.0057854932002555 s |
1.56 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.00852264 s |
0.0056285987998307 s |
1.51 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.007899994 s |
0.005813245399986 s |
1.36 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.008903211 s |
0.0059623540000757 s |
1.49 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.009060197 s |
0.0058417539999027 s |
1.55 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.008628433 s |
0.005578324200087 s |
1.55 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.0072552 s |
0.0057755715999519 s |
1.26 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.0087431219999999 s |
0.0058006471998851 s |
1.51 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.008413147 s |
0.0057635594001112 s |
1.46 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.008607123 s |
0.0058320177999121 s |
1.48 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.008416237 s |
0.0059808674001033 s |
1.41 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.002305 s |
0.0010979923999911 s |
2.10 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.003407 s |
0.0009642218000408 s |
3.53 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.001609 s |
0.0009914535999996 s |
1.62 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0019299999999999 s |
0.0009346603998892 s |
2.06 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.00162 s |
0.0009413688000677 s |
1.72 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.001964 s |
0.001004744400052 s |
1.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.00179 s |
0.0010086406000482 s |
1.77 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.004434 s |
0.0027766954000981 s |
1.60 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.00452 s |
0.0023500500001318 s |
1.92 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.00414 s |
0.0022446068000135 s |
1.84 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.0041589999999999 s |
0.0022493833998851 s |
1.85 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.004301 s |
0.0021829849999448 s |
1.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.0052089999999999 s |
0.002215745400008 s |
2.35 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.004606 s |
0.0024366134001866 s |
1.89 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.011292 s |
0.0066071121999812 s |
1.71 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.0098339999999999 s |
0.0060330907999741 s |
1.63 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.00958 s |
0.0055423939998945 s |
1.73 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.012016 s |
0.0057159447999765 s |
2.10 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.015022 s |
0.0056442914000399 s |
2.66 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.007902 s |
0.0054469434000566 s |
1.45 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.014385 s |
0.0055817624000155 s |
2.58 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.009478 s |
0.0072480184000596 s |
1.31 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.011652 s |
0.0057854932002555 s |
2.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.006878 s |
0.0056285987998307 s |
1.22 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.011532 s |
0.005813245399986 s |
1.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.009721 s |
0.0059623540000757 s |
1.63 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.0085069999999999 s |
0.0058417539999027 s |
1.46 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.0080139999999999 s |
0.005578324200087 s |
1.44 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.008335 s |
0.0057755715999519 s |
1.44 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.009495 s |
0.0058006471998851 s |
1.64 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.009662 s |
0.0057635594001112 s |
1.68 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.026705 s |
0.0058320177999121 s |
4.58 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.014134 s |
0.0059808674001033 s |
2.36 |
scatter_sum / JaXPipe / cpu / Primal |
0.000009147839991783255 s |
0.000008440359997621273 s |
1.08 |
scatter_sum / Jax / cpu / Primal |
0.000009004340035971836 s |
0.0000087820000044303 s |
1.03 |
scatter_sum / HLOOpt / cpu / Primal |
0.000012051700023221202 s |
0.000012110340012441157 s |
1.00 |
scatter_sum / PartOpt / cpu / Primal |
0.000009201820012094684 s |
0.000008506519961883895 s |
1.08 |
scatter_sum / IPartOpt / cpu / Primal |
0.00000890010001057817 s |
0.000008739120039535919 s |
1.02 |
scatter_sum / DefOpt / cpu / Primal |
0.000008363259912584908 s |
0.000008176920000551035 s |
1.02 |
scatter_sum / IDefOpt / cpu / Primal |
0.00000909112001863832 s |
0.000008401979957852746 s |
1.08 |
scatter_sum / JaXPipe / cpu / Forward |
0.000013786280014755904 s |
0.000012322020047577098 s |
1.12 |
scatter_sum / Jax / cpu / Forward |
0.000013691759995708708 s |
0.000011957599945162656 s |
1.15 |
scatter_sum / HLOOpt / cpu / Forward |
0.00001846332002969575 s |
0.000017227240014108248 s |
1.07 |
scatter_sum / PartOpt / cpu / Forward |
0.000019410379973123785 s |
0.000017908880017785123 s |
1.08 |
scatter_sum / IPartOpt / cpu / Forward |
0.000013634399983857291 s |
0.000012497800025812466 s |
1.09 |
scatter_sum / DefOpt / cpu / Forward |
0.000019325459998071893 s |
0.000017549120011608467 s |
1.10 |
scatter_sum / IDefOpt / cpu / Forward |
0.000013564039982156828 s |
0.000012501700020948191 s |
1.08 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000013040960011494462 s |
0.00001307078001445916 s |
1.00 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000012946059941896235 s |
0.00001225669997438672 s |
1.06 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000013096539960315568 s |
0.000017648939983700985 s |
0.74 |
scatter_sum / Jax / cpu / BothRev |
0.000013294119971760664 s |
0.000012138579986640252 s |
1.10 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000012564399994516862 s |
0.000012648520005313913 s |
0.99 |
scatter_sum / HLOOpt / cpu / PostRev |
0.000016941559970291564 s |
0.000017356560047119274 s |
0.98 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000014695219970235483 s |
0.000014113280030869646 s |
1.04 |
scatter_sum / PartOpt / cpu / PreRev |
0.000013426379973680014 s |
0.000011910819985132548 s |
1.13 |
scatter_sum / PartOpt / cpu / PostRev |
0.000013138760023139184 s |
0.000012337380039753044 s |
1.06 |
scatter_sum / PartOpt / cpu / BothRev |
0.000013234120033303042 s |
0.000012810720018023855 s |
1.03 |
scatter_sum / IPartOpt / cpu / PreRev |
0.00001925157999721705 s |
0.00001804224000807153 s |
1.07 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000012707099976978496 s |
0.000013031860025876086 s |
0.98 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000013232420042186276 s |
0.00001302892000239808 s |
1.02 |
scatter_sum / DefOpt / cpu / PreRev |
0.000012935120066686068 s |
0.000012569900018206682 s |
1.03 |
scatter_sum / DefOpt / cpu / PostRev |
0.000012908339986097415 s |
0.000012809520012524444 s |
1.01 |
scatter_sum / DefOpt / cpu / BothRev |
0.000013276920026328298 s |
0.000012639920005312888 s |
1.05 |
scatter_sum / IDefOpt / cpu / PreRev |
0.00001304314001572493 s |
0.000011934459989788592 s |
1.09 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000013412940024863928 s |
0.000012639860005947413 s |
1.06 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000012983579972569716 s |
0.00001264491999791062 s |
1.03 |
scatter_sum / JaXPipe / cuda / Primal |
0.000010144 s |
0.000009888 s |
1.03 |
scatter_sum / Jax / cuda / Primal |
0.000009984 s |
0.000010016 s |
1.00 |
scatter_sum / HLOOpt / cuda / Primal |
0.000009952 s |
0.00001008 s |
0.99 |
scatter_sum / PartOpt / cuda / Primal |
0.000009856 s |
0.00000992 s |
0.99 |
scatter_sum / IPartOpt / cuda / Primal |
0.00001008 s |
0.00001008 s |
1.00 |
scatter_sum / DefOpt / cuda / Primal |
0.000010112 s |
0.000009984 s |
1.01 |
scatter_sum / IDefOpt / cuda / Primal |
0.000011393 s |
0.000009824 s |
1.16 |
scatter_sum / JaXPipe / cuda / Forward |
0.000019936 s |
0.000017344 s |
1.15 |
scatter_sum / Jax / cuda / Forward |
0.000017247999999999998 s |
0.000017216 s |
1.00 |
scatter_sum / HLOOpt / cuda / Forward |
0.000017313 s |
0.000021664 s |
0.80 |
scatter_sum / PartOpt / cuda / Forward |
0.000016864 s |
0.000017375999999999998 s |
0.97 |
scatter_sum / IPartOpt / cuda / Forward |
0.000017056 s |
0.000017472 s |
0.98 |
scatter_sum / DefOpt / cuda / Forward |
0.000017631 s |
0.000017856 s |
0.99 |
scatter_sum / IDefOpt / cuda / Forward |
0.00001728 s |
0.000017247999999999998 s |
1.00 |
scatter_sum / JaXPipe / cuda / PreRev |
0.000016864 s |
0.000017408 s |
0.97 |
scatter_sum / JaXPipe / cuda / PostRev |
0.000016896000000000002 s |
0.000016705 s |
1.01 |
scatter_sum / JaXPipe / cuda / BothRev |
0.0000168 s |
0.000017375000000000002 s |
0.97 |
scatter_sum / Jax / cuda / BothRev |
0.000017152 s |
0.0000168 s |
1.02 |
scatter_sum / HLOOpt / cuda / PreRev |
0.000016832 s |
0.000017472 s |
0.96 |
scatter_sum / HLOOpt / cuda / PostRev |
0.000017152 s |
0.000017056 s |
1.01 |
scatter_sum / HLOOpt / cuda / BothRev |
0.000016992 s |
0.000018208 s |
0.93 |
scatter_sum / PartOpt / cuda / PreRev |
0.000017375999999999998 s |
0.000017152 s |
1.01 |
scatter_sum / PartOpt / cuda / PostRev |
0.00001712 s |
0.000026272 s |
0.65 |
scatter_sum / PartOpt / cuda / BothRev |
0.00001696 s |
0.000016576000000000002 s |
1.02 |
scatter_sum / IPartOpt / cuda / PreRev |
0.000016992 s |
0.000017247999999999998 s |
0.99 |
scatter_sum / IPartOpt / cuda / PostRev |
0.000016992 s |
0.000017952 s |
0.95 |
scatter_sum / IPartOpt / cuda / BothRev |
0.000017792 s |
0.000017152 s |
1.04 |
scatter_sum / DefOpt / cuda / PreRev |
0.00001824 s |
0.000017696 s |
1.03 |
scatter_sum / DefOpt / cuda / PostRev |
0.000017375999999999998 s |
0.000016383999999999998 s |
1.06 |
scatter_sum / DefOpt / cuda / BothRev |
0.000016255999999999998 s |
0.000017664 s |
0.92 |
scatter_sum / IDefOpt / cuda / PreRev |
0.00001712 s |
0.000017344 s |
0.99 |
scatter_sum / IDefOpt / cuda / PostRev |
0.000016927999999999998 s |
0.000016896000000000002 s |
1.00 |
scatter_sum / IDefOpt / cuda / BothRev |
0.000016927999999999998 s |
0.000017343 s |
0.98 |
scatter_sum / JaXPipe / tpu / Primal |
0.000001351175 s |
0.0000013434250000000002 s |
1.01 |
scatter_sum / Jax / tpu / Primal |
0.0000013538 s |
0.000001413975 s |
0.96 |
scatter_sum / HLOOpt / tpu / Primal |
0.000001360975 s |
0.0000013532 s |
1.01 |
scatter_sum / PartOpt / tpu / Primal |
0.000001353 s |
0.00000141455 s |
0.96 |
scatter_sum / IPartOpt / tpu / Primal |
0.0000013611 s |
0.000001352775 s |
1.01 |
scatter_sum / DefOpt / tpu / Primal |
0.000001353475 s |
0.000001414125 s |
0.96 |
scatter_sum / IDefOpt / tpu / Primal |
0.0000013609 s |
0.00000135315 s |
1.01 |
scatter_sum / JaXPipe / tpu / Forward |
0.0000026942500000000005 s |
0.00000271065 s |
0.99 |
scatter_sum / Jax / tpu / Forward |
0.000002735875 s |
0.000002731275 s |
1.00 |
scatter_sum / HLOOpt / tpu / Forward |
0.0000027027 s |
0.00000271145 s |
1.00 |
scatter_sum / PartOpt / tpu / Forward |
0.000002708575 s |
0.0000027006 s |
1.00 |
scatter_sum / IPartOpt / tpu / Forward |
0.00000270545 s |
0.0000027127000000000003 s |
1.00 |
scatter_sum / DefOpt / tpu / Forward |
0.00000271315 s |
0.00000270155 s |
1.00 |
scatter_sum / IDefOpt / tpu / Forward |
0.000002693175 s |
0.0000027107 s |
0.99 |
scatter_sum / JaXPipe / tpu / PreRev |
0.000002698325 s |
0.000002693900000000001 s |
1.00 |
scatter_sum / JaXPipe / tpu / PostRev |
0.00000268905 s |
0.000002693625 s |
1.00 |
scatter_sum / JaXPipe / tpu / BothRev |
0.0000027118 s |
0.000002707425 s |
1.00 |
scatter_sum / Jax / tpu / BothRev |
0.0000027470499999999994 s |
0.000002752975 s |
1.00 |
scatter_sum / HLOOpt / tpu / PreRev |
0.0000027194 s |
0.000002708075 s |
1.00 |
scatter_sum / HLOOpt / tpu / PostRev |
0.0000027416 s |
0.000002755075 s |
1.00 |
scatter_sum / HLOOpt / tpu / BothRev |
0.0000027108 s |
0.000002710625 s |
1.00 |
scatter_sum / PartOpt / tpu / PreRev |
0.0000027517000000000003 s |
0.0000027484750000000004 s |
1.00 |
scatter_sum / PartOpt / tpu / PostRev |
0.000002711075 s |
0.000002705775 s |
1.00 |
scatter_sum / PartOpt / tpu / BothRev |
0.00000274615 s |
0.000002757 s |
1.00 |
scatter_sum / IPartOpt / tpu / PreRev |
0.00000272 s |
0.000002708875 s |
1.00 |
scatter_sum / IPartOpt / tpu / PostRev |
0.00000275135 s |
0.0000027508500000000005 s |
1.00 |
scatter_sum / IPartOpt / tpu / BothRev |
0.000002714875 s |
0.0000027068 s |
1.00 |
scatter_sum / DefOpt / tpu / PreRev |
0.000002747 s |
0.0000027486 s |
1.00 |
scatter_sum / DefOpt / tpu / PostRev |
0.000002718175 s |
0.0000027060250000000005 s |
1.00 |
scatter_sum / DefOpt / tpu / BothRev |
0.000002749725 s |
0.00000275165 s |
1.00 |
scatter_sum / IDefOpt / tpu / PreRev |
0.0000027113250000000003 s |
0.000002707525 s |
1.00 |
scatter_sum / IDefOpt / tpu / PostRev |
0.0000027499 s |
0.0000027528750000000003 s |
1.00 |
scatter_sum / IDefOpt / tpu / BothRev |
0.0000027216000000000003 s |
0.0000027087 s |
1.00 |
scatter_sum / JaXPipe / cpu / Primal |
0.000016007 s |
0.000008440359997621273 s |
1.90 |
scatter_sum / Jax / cpu / Primal |
0.000015613 s |
0.0000087820000044303 s |
1.78 |
scatter_sum / HLOOpt / cpu / Primal |
0.000016389999999999997 s |
0.000012110340012441157 s |
1.35 |
scatter_sum / PartOpt / cpu / Primal |
0.000016182 s |
0.000008506519961883895 s |
1.90 |
scatter_sum / IPartOpt / cpu / Primal |
0.000015813 s |
0.000008739120039535919 s |
1.81 |
scatter_sum / DefOpt / cpu / Primal |
0.000016063999999999997 s |
0.000008176920000551035 s |
1.96 |
scatter_sum / IDefOpt / cpu / Primal |
0.000015775 s |
0.000008401979957852746 s |
1.88 |
scatter_sum / JaXPipe / cpu / Forward |
0.000023522000000000003 s |
0.000012322020047577098 s |
1.91 |
scatter_sum / Jax / cpu / Forward |
0.000022868 s |
0.000011957599945162656 s |
1.91 |
scatter_sum / HLOOpt / cpu / Forward |
0.000022327 s |
0.000017227240014108248 s |
1.30 |
scatter_sum / PartOpt / cpu / Forward |
0.000022705 s |
0.000017908880017785123 s |
1.27 |
scatter_sum / IPartOpt / cpu / Forward |
0.000022456 s |
0.000012497800025812466 s |
1.80 |
scatter_sum / DefOpt / cpu / Forward |
0.000023163 s |
0.000017549120011608467 s |
1.32 |
scatter_sum / IDefOpt / cpu / Forward |
0.000022182 s |
0.000012501700020948191 s |
1.77 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000022991 s |
0.00001307078001445916 s |
1.76 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000024367 s |
0.00001225669997438672 s |
1.99 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000023811 s |
0.000017648939983700985 s |
1.35 |
scatter_sum / Jax / cpu / BothRev |
0.000022798 s |
0.000012138579986640252 s |
1.88 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000022226 s |
0.000012648520005313913 s |
1.76 |
scatter_sum / HLOOpt / cpu / PostRev |
0.000023701 s |
0.000017356560047119274 s |
1.37 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000022956 s |
0.000014113280030869646 s |
1.63 |
scatter_sum / PartOpt / cpu / PreRev |
0.000022595 s |
0.000011910819985132548 s |
1.90 |
scatter_sum / PartOpt / cpu / PostRev |
0.000024134 s |
0.000012337380039753044 s |
1.96 |
scatter_sum / PartOpt / cpu / BothRev |
0.000023868 s |
0.000012810720018023855 s |
1.86 |
scatter_sum / IPartOpt / cpu / PreRev |
0.000022124 s |
0.00001804224000807153 s |
1.23 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000023774 s |
0.000013031860025876086 s |
1.82 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000023657 s |
0.00001302892000239808 s |
1.82 |
scatter_sum / DefOpt / cpu / PreRev |
0.000022659 s |
0.000012569900018206682 s |
1.80 |
scatter_sum / DefOpt / cpu / PostRev |
0.000023969 s |
0.000012809520012524444 s |
1.87 |
scatter_sum / DefOpt / cpu / BothRev |
0.00002365 s |
0.000012639920005312888 s |
1.87 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000022522 s |
0.000011934459989788592 s |
1.89 |
scatter_sum / IDefOpt / cpu / PostRev |
0.00002277 s |
0.000012639860005947413 s |
1.80 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000023648 s |
0.00001264491999791062 s |
1.87 |
scatter_sum / JaXPipe / cpu / Primal |
0.000011 s |
0.000008440359997621273 s |
1.30 |
scatter_sum / Jax / cpu / Primal |
0.000035000000000000004 s |
0.0000087820000044303 s |
3.99 |
scatter_sum / HLOOpt / cpu / Primal |
0.000011 s |
0.000012110340012441157 s |
0.91 |
scatter_sum / PartOpt / cpu / Primal |
0.000011 s |
0.000008506519961883895 s |
1.29 |
scatter_sum / IPartOpt / cpu / Primal |
0.000016 s |
0.000008739120039535919 s |
1.83 |
scatter_sum / DefOpt / cpu / Primal |
0.000015 s |
0.000008176920000551035 s |
1.83 |
scatter_sum / IDefOpt / cpu / Primal |
0.000011 s |
0.000008401979957852746 s |
1.31 |
scatter_sum / JaXPipe / cpu / Forward |
0.000016 s |
0.000012322020047577098 s |
1.30 |
scatter_sum / Jax / cpu / Forward |
0.000016 s |
0.000011957599945162656 s |
1.34 |
scatter_sum / HLOOpt / cpu / Forward |
0.000016 s |
0.000017227240014108248 s |
0.93 |
scatter_sum / PartOpt / cpu / Forward |
0.000016 s |
0.000017908880017785123 s |
0.89 |
scatter_sum / IPartOpt / cpu / Forward |
0.000016 s |
0.000012497800025812466 s |
1.28 |
scatter_sum / DefOpt / cpu / Forward |
0.000019 s |
0.000017549120011608467 s |
1.08 |
scatter_sum / IDefOpt / cpu / Forward |
0.000017 s |
0.000012501700020948191 s |
1.36 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000016 s |
0.00001307078001445916 s |
1.22 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000016 s |
0.00001225669997438672 s |
1.31 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000024 s |
0.000017648939983700985 s |
1.36 |
scatter_sum / Jax / cpu / BothRev |
0.000019 s |
0.000012138579986640252 s |
1.57 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000017 s |
0.000012648520005313913 s |
1.34 |
scatter_sum / HLOOpt / cpu / PostRev |
0.000017 s |
0.000017356560047119274 s |
0.98 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000019 s |
0.000014113280030869646 s |
1.35 |
scatter_sum / PartOpt / cpu / PreRev |
0.000017999999999999997 s |
0.000011910819985132548 s |
1.51 |
scatter_sum / PartOpt / cpu / PostRev |
0.000016 s |
0.000012337380039753044 s |
1.30 |
scatter_sum / PartOpt / cpu / BothRev |
0.000017 s |
0.000012810720018023855 s |
1.33 |
scatter_sum / IPartOpt / cpu / PreRev |
0.000017 s |
0.00001804224000807153 s |
0.94 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000016 s |
0.000013031860025876086 s |
1.23 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000017 s |
0.00001302892000239808 s |
1.30 |
scatter_sum / DefOpt / cpu / PreRev |
0.000016 s |
0.000012569900018206682 s |
1.27 |
scatter_sum / DefOpt / cpu / PostRev |
0.000017 s |
0.000012809520012524444 s |
1.33 |
scatter_sum / DefOpt / cpu / BothRev |
0.000016 s |
0.000012639920005312888 s |
1.27 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000017 s |
0.000011934459989788592 s |
1.42 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000017 s |
0.000012639860005947413 s |
1.34 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000017 s |
0.00001264491999791062 s |
1.34 |
slicing / JaXPipe / cpu / Primal |
0.000008105140004772693 s |
0.000007139699982872117 s |
1.14 |
slicing / Jax / cpu / Primal |
0.000006579500004590954 s |
0.000006225620008990518 s |
1.06 |
slicing / HLOOpt / cpu / Primal |
0.000011492300009194878 s |
0.000009641319975344231 s |
1.19 |
slicing / PartOpt / cpu / Primal |
0.000007086600035108858 s |
0.000006203919992913143 s |
1.14 |
slicing / IPartOpt / cpu / Primal |
0.000008129899988489342 s |
0.000006752039989805781 s |
1.20 |
slicing / DefOpt / cpu / Primal |
0.000011949499930778983 s |
0.000010541800020291702 s |
1.13 |
slicing / IDefOpt / cpu / Primal |
0.000006886060036777053 s |
0.000006175259986775927 s |
1.12 |
slicing / JaXPipe / cpu / Forward |
0.000010785320000650245 s |
0.000009888000040518818 s |
1.09 |
slicing / Jax / cpu / Forward |
0.000011396900008548984 s |
0.000010671199988792067 s |
1.07 |
slicing / HLOOpt / cpu / Forward |
0.000016003540022211382 s |
0.000014793900018048587 s |
1.08 |
slicing / PartOpt / cpu / Forward |
0.000017621219967622892 s |
0.000014559259980160278 s |
1.21 |
slicing / IPartOpt / cpu / Forward |
0.000010517399987293176 s |
0.000009506959941063542 s |
1.11 |
slicing / DefOpt / cpu / Forward |
0.00001519784000265645 s |
0.000015004759989096783 s |
1.01 |
slicing / IDefOpt / cpu / Forward |
0.00001048092000928591 s |
0.00000969529999565566 s |
1.08 |
slicing / JaXPipe / cpu / PreRev |
0.00001156928000455082 s |
0.000010796019987537876 s |
1.07 |
slicing / JaXPipe / cpu / PostRev |
0.000011554560023796511 s |
0.000010617620000630268 s |
1.09 |
slicing / JaXPipe / cpu / BothRev |
0.00001537107999865839 s |
0.000013817860008202842 s |
1.11 |
slicing / Jax / cpu / BothRev |
0.000011821740017694535 s |
0.000010457820008014095 s |
1.13 |
slicing / HLOOpt / cpu / PreRev |
0.00001087052000912081 s |
0.000010395860017524684 s |
1.05 |
slicing / HLOOpt / cpu / PostRev |
0.000011583280002014365 s |
0.00001077822001207096 s |
1.07 |
slicing / HLOOpt / cpu / BothRev |
0.000012874519979959588 s |
0.000012587820001499495 s |
1.02 |
slicing / PartOpt / cpu / PreRev |
0.000011304140052743603 s |
0.000010113780035680977 s |
1.12 |
slicing / PartOpt / cpu / PostRev |
0.00001165402000879112 s |
0.00001018205997752375 s |
1.14 |
slicing / PartOpt / cpu / BothRev |
0.000011771480030802197 s |
0.000010186500012423496 s |
1.16 |
slicing / IPartOpt / cpu / PreRev |
0.000016095400005724515 s |
0.000010268260002703756 s |
1.57 |
slicing / IPartOpt / cpu / PostRev |
0.000011564819997147425 s |
0.000011290060001556412 s |
1.02 |
slicing / IPartOpt / cpu / BothRev |
0.000011689720013237093 s |
0.0000105920400164905 s |
1.10 |
slicing / DefOpt / cpu / PreRev |
0.00001079020001270692 s |
0.000010875059988393332 s |
0.99 |
slicing / DefOpt / cpu / PostRev |
0.000011439800018706592 s |
0.000011181340005350648 s |
1.02 |
slicing / DefOpt / cpu / BothRev |
0.000011289100029898693 s |
0.000011073839996242896 s |
1.02 |
slicing / IDefOpt / cpu / PreRev |
0.000010925060050794855 s |
0.000010779179983728682 s |
1.01 |
slicing / IDefOpt / cpu / PostRev |
0.000011775099965234404 s |
0.00001098723997529305 s |
1.07 |
slicing / IDefOpt / cpu / BothRev |
0.00001111052003579971 s |
0.000010734000024967828 s |
1.04 |
slicing / JaXPipe / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / Jax / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / HLOOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / PartOpt / cuda / Primal |
0.000001887 s |
0.000001888 s |
1.00 |
slicing / IPartOpt / cuda / Primal |
0.000001887 s |
0.000001888 s |
1.00 |
slicing / DefOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / IDefOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / JaXPipe / cuda / Forward |
0.000010656 s |
0.000010016 s |
1.06 |
slicing / Jax / cuda / Forward |
0.000010528 s |
0.000009824 s |
1.07 |
slicing / HLOOpt / cuda / Forward |
0.000010176 s |
0.000009728 s |
1.05 |
slicing / PartOpt / cuda / Forward |
0.00000976 s |
0.000009824 s |
0.99 |
slicing / IPartOpt / cuda / Forward |
0.000010016 s |
0.000009921 s |
1.01 |
slicing / DefOpt / cuda / Forward |
0.000009888 s |
0.000009631 s |
1.03 |
slicing / IDefOpt / cuda / Forward |
0.000010176 s |
0.00000992 s |
1.03 |
slicing / JaXPipe / cuda / PreRev |
0.000009952 s |
0.000010273 s |
0.97 |
slicing / JaXPipe / cuda / PostRev |
0.000009889 s |
0.000010176 s |
0.97 |
slicing / JaXPipe / cuda / BothRev |
0.000010207 s |
0.000009952 s |
1.03 |
slicing / Jax / cuda / BothRev |
0.000009568 s |
0.000010816 s |
0.88 |
slicing / HLOOpt / cuda / PreRev |
0.000010112 s |
0.00001024 s |
0.99 |
slicing / HLOOpt / cuda / PostRev |
0.000009792 s |
0.000009792 s |
1 |
slicing / HLOOpt / cuda / BothRev |
0.000009536 s |
0.000010048 s |
0.95 |
slicing / PartOpt / cuda / PreRev |
0.00001008 s |
0.000009696 s |
1.04 |
slicing / PartOpt / cuda / PostRev |
0.000009951 s |
0.000009792 s |
1.02 |
slicing / PartOpt / cuda / BothRev |
0.000009824 s |
0.000010176 s |
0.97 |
slicing / IPartOpt / cuda / PreRev |
0.000010016 s |
0.00000992 s |
1.01 |
slicing / IPartOpt / cuda / PostRev |
0.00001008 s |
0.00000944 s |
1.07 |
slicing / IPartOpt / cuda / BothRev |
0.000009856 s |
0.000009856 s |
1 |
slicing / DefOpt / cuda / PreRev |
0.000009888 s |
0.000009856 s |
1.00 |
slicing / DefOpt / cuda / PostRev |
0.000010432 s |
0.00000992 s |
1.05 |
slicing / DefOpt / cuda / BothRev |
0.00001008 s |
0.000010017 s |
1.01 |
slicing / IDefOpt / cuda / PreRev |
0.000015136 s |
0.000010496 s |
1.44 |
slicing / IDefOpt / cuda / PostRev |
0.000010048 s |
0.000010112 s |
0.99 |
slicing / IDefOpt / cuda / BothRev |
0.000009697 s |
0.000009984 s |
0.97 |
slicing / JaXPipe / tpu / Primal |
9.688499999999998e-7 s |
0.000001015375 s |
0.95 |
slicing / Jax / tpu / Primal |
9.666750000000002e-7 s |
9.71525e-7 s |
1.00 |
slicing / HLOOpt / tpu / Primal |
9.628e-7 s |
0.00000103085 s |
0.93 |
slicing / PartOpt / tpu / Primal |
9.5975e-7 s |
9.64475e-7 s |
1.00 |
slicing / IPartOpt / tpu / Primal |
9.59775e-7 s |
0.000001018 s |
0.94 |
slicing / DefOpt / tpu / Primal |
9.65725e-7 s |
9.70225e-7 s |
1.00 |
slicing / IDefOpt / tpu / Primal |
9.601e-7 s |
0.000001015725 s |
0.95 |
slicing / JaXPipe / tpu / Forward |
0.00000140885 s |
0.000001419375 s |
0.99 |
slicing / Jax / tpu / Forward |
0.00000141925 s |
0.000001471425 s |
0.96 |
slicing / HLOOpt / tpu / Forward |
0.000001517975 s |
0.0000015144750000000002 s |
1.00 |
slicing / PartOpt / tpu / Forward |
0.00000144435 s |
0.000001496475 s |
0.97 |
slicing / IPartOpt / tpu / Forward |
0.00000152035 s |
0.00000151695 s |
1.00 |
slicing / DefOpt / tpu / Forward |
0.000001438075 s |
0.00000148795 s |
0.97 |
slicing / IDefOpt / tpu / Forward |
0.0000015166750000000002 s |
0.0000015108750000000002 s |
1.00 |
slicing / JaXPipe / tpu / PreRev |
0.000002331775 s |
0.00000256505 s |
0.91 |
slicing / JaXPipe / tpu / PostRev |
0.00000250855 s |
0.0000025232500000000004 s |
0.99 |
slicing / JaXPipe / tpu / BothRev |
0.0000023656 s |
0.00000262075 s |
0.90 |
slicing / Jax / tpu / BothRev |
0.0000025276749999999995 s |
0.0000025406 s |
0.99 |
slicing / HLOOpt / tpu / PreRev |
0.00000235305 s |
0.000002599725 s |
0.91 |
slicing / HLOOpt / tpu / PostRev |
0.000002521575 s |
0.000002542675 s |
0.99 |
slicing / HLOOpt / tpu / BothRev |
0.0000023554500000000005 s |
0.000002587675 s |
0.91 |
slicing / PartOpt / tpu / PreRev |
0.0000025455250000000005 s |
0.0000025377499999999995 s |
1.00 |
slicing / PartOpt / tpu / PostRev |
0.000002347025 s |
0.000002587575 s |
0.91 |
slicing / PartOpt / tpu / BothRev |
0.000002534375 s |
0.00000254335 s |
1.00 |
slicing / IPartOpt / tpu / PreRev |
0.000002359275 s |
0.0000025787 s |
0.91 |
slicing / IPartOpt / tpu / PostRev |
0.000002540675 s |
0.000002556 s |
0.99 |
slicing / IPartOpt / tpu / BothRev |
0.000002349425 s |
0.000002581425 s |
0.91 |
slicing / DefOpt / tpu / PreRev |
0.00000253495 s |
0.0000025392 s |
1.00 |
slicing / DefOpt / tpu / PostRev |
0.000002354125 s |
0.0000025871 s |
0.91 |
slicing / DefOpt / tpu / BothRev |
0.00000253195 s |
0.000002539625 s |
1.00 |
slicing / IDefOpt / tpu / PreRev |
0.00000234255 s |
0.0000025818000000000004 s |
0.91 |
slicing / IDefOpt / tpu / PostRev |
0.00000252715 s |
0.0000025369 s |
1.00 |
slicing / IDefOpt / tpu / BothRev |
0.000002348875 s |
0.0000025960250000000003 s |
0.90 |
slicing / JaXPipe / cpu / Primal |
0.000012989 s |
0.000007139699982872117 s |
1.82 |
slicing / Jax / cpu / Primal |
0.000013042 s |
0.000006225620008990518 s |
2.09 |
slicing / HLOOpt / cpu / Primal |
0.000012918 s |
0.000009641319975344231 s |
1.34 |
slicing / PartOpt / cpu / Primal |
0.000012983 s |
0.000006203919992913143 s |
2.09 |
slicing / IPartOpt / cpu / Primal |
0.00001277 s |
0.000006752039989805781 s |
1.89 |
slicing / DefOpt / cpu / Primal |
0.000013083 s |
0.000010541800020291702 s |
1.24 |
slicing / IDefOpt / cpu / Primal |
0.000012736 s |
0.000006175259986775927 s |
2.06 |
slicing / JaXPipe / cpu / Forward |
0.000017434000000000003 s |
0.000009888000040518818 s |
1.76 |
slicing / Jax / cpu / Forward |
0.000016664000000000002 s |
0.000010671199988792067 s |
1.56 |
slicing / HLOOpt / cpu / Forward |
0.000016676 s |
0.000014793900018048587 s |
1.13 |
slicing / PartOpt / cpu / Forward |
0.000017174 s |
0.000014559259980160278 s |
1.18 |
slicing / IPartOpt / cpu / Forward |
0.000016823 s |
0.000009506959941063542 s |
1.77 |
slicing / DefOpt / cpu / Forward |
0.000016843 s |
0.000015004759989096783 s |
1.12 |
slicing / IDefOpt / cpu / Forward |
0.000016721 s |
0.00000969529999565566 s |
1.72 |
slicing / JaXPipe / cpu / PreRev |
0.000017229 s |
0.000010796019987537876 s |
1.60 |
slicing / JaXPipe / cpu / PostRev |
0.000017779 s |
0.000010617620000630268 s |
1.67 |
slicing / JaXPipe / cpu / BothRev |
0.000017587999999999998 s |
0.000013817860008202842 s |
1.27 |
slicing / Jax / cpu / BothRev |
0.000018046 s |
0.000010457820008014095 s |
1.73 |
slicing / HLOOpt / cpu / PreRev |
0.000017004 s |
0.000010395860017524684 s |
1.64 |
slicing / HLOOpt / cpu / PostRev |
0.000017477 s |
0.00001077822001207096 s |
1.62 |
slicing / HLOOpt / cpu / BothRev |
0.000017706 s |
0.000012587820001499495 s |
1.41 |
slicing / PartOpt / cpu / PreRev |
0.00001706 s |
0.000010113780035680977 s |
1.69 |
slicing / PartOpt / cpu / PostRev |
0.000018226 s |
0.00001018205997752375 s |
1.79 |
slicing / PartOpt / cpu / BothRev |
0.000018010000000000002 s |
0.000010186500012423496 s |
1.77 |
slicing / IPartOpt / cpu / PreRev |
0.000017629 s |
0.000010268260002703756 s |
1.72 |
slicing / IPartOpt / cpu / PostRev |
0.000018687 s |
0.000011290060001556412 s |
1.66 |
slicing / IPartOpt / cpu / BothRev |
0.000017926 s |
0.0000105920400164905 s |
1.69 |
slicing / DefOpt / cpu / PreRev |
0.000017676999999999997 s |
0.000010875059988393332 s |
1.63 |
slicing / DefOpt / cpu / PostRev |
0.000018077 s |
0.000011181340005350648 s |
1.62 |
slicing / DefOpt / cpu / BothRev |
0.000017523 s |
0.000011073839996242896 s |
1.58 |
slicing / IDefOpt / cpu / PreRev |
0.000017451999999999998 s |
0.000010779179983728682 s |
1.62 |
slicing / IDefOpt / cpu / PostRev |
0.000017395000000000002 s |
0.00001098723997529305 s |
1.58 |
slicing / IDefOpt / cpu / BothRev |
0.000017852 s |
0.000010734000024967828 s |
1.66 |
slicing / JaXPipe / cpu / Primal |
0.000008999999999999999 s |
0.000007139699982872117 s |
1.26 |
slicing / Jax / cpu / Primal |
0.000008999999999999999 s |
0.000006225620008990518 s |
1.45 |
slicing / HLOOpt / cpu / Primal |
0.000008 s |
0.000009641319975344231 s |
0.83 |
slicing / PartOpt / cpu / Primal |
0.000008 s |
0.000006203919992913143 s |
1.29 |
slicing / IPartOpt / cpu / Primal |
0.000031 s |
0.000006752039989805781 s |
4.59 |
slicing / DefOpt / cpu / Primal |
0.00003 s |
0.000010541800020291702 s |
2.85 |
slicing / IDefOpt / cpu / Primal |
0.000019 s |
0.000006175259986775927 s |
3.08 |
slicing / JaXPipe / cpu / Forward |
0.000012 s |
0.000009888000040518818 s |
1.21 |
slicing / Jax / cpu / Forward |
0.000012 s |
0.000010671199988792067 s |
1.12 |
slicing / HLOOpt / cpu / Forward |
0.000013 s |
0.000014793900018048587 s |
0.88 |
slicing / PartOpt / cpu / Forward |
0.000011 s |
0.000014559259980160278 s |
0.76 |
slicing / IPartOpt / cpu / Forward |
0.000012 s |
0.000009506959941063542 s |
1.26 |
slicing / DefOpt / cpu / Forward |
0.000012 s |
0.000015004759989096783 s |
0.80 |
slicing / IDefOpt / cpu / Forward |
0.000012 s |
0.00000969529999565566 s |
1.24 |
slicing / JaXPipe / cpu / PreRev |
0.000012 s |
0.000010796019987537876 s |
1.11 |
slicing / JaXPipe / cpu / PostRev |
0.000012 s |
0.000010617620000630268 s |
1.13 |
slicing / JaXPipe / cpu / BothRev |
0.000012 s |
0.000013817860008202842 s |
0.87 |
slicing / Jax / cpu / BothRev |
0.000012 s |
0.000010457820008014095 s |
1.15 |
slicing / HLOOpt / cpu / PreRev |
0.000014 s |
0.000010395860017524684 s |
1.35 |
slicing / HLOOpt / cpu / PostRev |
0.000012 s |
0.00001077822001207096 s |
1.11 |
slicing / HLOOpt / cpu / BothRev |
0.000012 s |
0.000012587820001499495 s |
0.95 |
slicing / PartOpt / cpu / PreRev |
0.000013 s |
0.000010113780035680977 s |
1.29 |
slicing / PartOpt / cpu / PostRev |
0.000012 s |
0.00001018205997752375 s |
1.18 |
slicing / PartOpt / cpu / BothRev |
0.000012 s |
0.000010186500012423496 s |
1.18 |
slicing / IPartOpt / cpu / PreRev |
0.000012 s |
0.000010268260002703756 s |
1.17 |
slicing / IPartOpt / cpu / PostRev |
0.000012 s |
0.000011290060001556412 s |
1.06 |
slicing / IPartOpt / cpu / BothRev |
0.000012 s |
0.0000105920400164905 s |
1.13 |
slicing / DefOpt / cpu / PreRev |
0.000012 s |
0.000010875059988393332 s |
1.10 |
slicing / DefOpt / cpu / PostRev |
0.000012 s |
0.000011181340005350648 s |
1.07 |
slicing / DefOpt / cpu / BothRev |
0.000012 s |
0.000011073839996242896 s |
1.08 |
slicing / IDefOpt / cpu / PreRev |
0.000012 s |
0.000010779179983728682 s |
1.11 |
slicing / IDefOpt / cpu / PostRev |
0.000013 s |
0.00001098723997529305 s |
1.18 |
slicing / IDefOpt / cpu / BothRev |
0.000012 s |
0.000010734000024967828 s |
1.12 |
sum / JaXPipe / cpu / Primal |
0.00000946393995945982 s |
0.000009182520043395926 s |
1.03 |
sum / Jax / cpu / Primal |
0.000008301480011141394 s |
0.000008828279960653163 s |
0.94 |
sum / HLOOpt / cpu / Primal |
0.000012508000036177691 s |
0.000012311380023675156 s |
1.02 |
sum / PartOpt / cpu / Primal |
0.000008498340012010886 s |
0.000008517040032529622 s |
1.00 |
sum / IPartOpt / cpu / Primal |
0.000008799700026429491 s |
0.00000768539998716733 s |
1.14 |
sum / DefOpt / cpu / Primal |
0.00001349050000499119 s |
0.000008223459944929346 s |
1.64 |
sum / IDefOpt / cpu / Primal |
0.000008610579989181133 s |
0.00000776467998548469 s |
1.11 |
sum / JaXPipe / cpu / Forward |
0.000012908859998788102 s |
0.000011874920019181446 s |
1.09 |
sum / Jax / cpu / Forward |
0.000012571580036819796 s |
0.0000123945599898434 s |
1.01 |
sum / HLOOpt / cpu / Forward |
0.00001828192002903961 s |
0.00001648988003580598 s |
1.11 |
sum / PartOpt / cpu / Forward |
0.000017889519949676468 s |
0.000016962420004347224 s |
1.05 |
sum / IPartOpt / cpu / Forward |
0.000012520399995992194 s |
0.000011852979978357326 s |
1.06 |
sum / DefOpt / cpu / Forward |
0.00001823218001845817 s |
0.000017227739990630654 s |
1.06 |
sum / IDefOpt / cpu / Forward |
0.000013308400066307512 s |
0.000012147320021540508 s |
1.10 |
sum / JaXPipe / cpu / PreRev |
0.000011959299999944053 s |
0.00001209846000165271 s |
0.99 |
sum / JaXPipe / cpu / PostRev |
0.000012749879997500104 s |
0.000012334540024312446 s |
1.03 |
sum / JaXPipe / cpu / BothRev |
0.000016190540027309907 s |
0.00001149539999460103 s |
1.41 |
sum / Jax / cpu / BothRev |
0.000012685779975072364 s |
0.000011714179963746574 s |
1.08 |
sum / HLOOpt / cpu / PreRev |
0.000012113940028939396 s |
0.00001129744000536448 s |
1.07 |
sum / HLOOpt / cpu / PostRev |
0.000016572340045968302 s |
0.00001198017997921852 s |
1.38 |
sum / HLOOpt / cpu / BothRev |
0.000013566459992944146 s |
0.000013544299981731456 s |
1.00 |
sum / PartOpt / cpu / PreRev |
0.000012229200028741617 s |
0.000011844500004372094 s |
1.03 |
sum / PartOpt / cpu / PostRev |
0.00001193619998957729 s |
0.000011333900010868092 s |
1.05 |
sum / PartOpt / cpu / BothRev |
0.000012072679955963397 s |
0.000011511460006659036 s |
1.05 |
sum / IPartOpt / cpu / PreRev |
0.00001194627997392672 s |
0.000013520800030164537 s |
0.88 |
sum / IPartOpt / cpu / PostRev |
0.000012495500031945994 s |
0.000011325380037305876 s |
1.10 |
sum / IPartOpt / cpu / BothRev |
0.000011917880019609584 s |
0.000011000579979736358 s |
1.08 |
sum / DefOpt / cpu / PreRev |
0.000012012799988951885 s |
0.000011095880036009477 s |
1.08 |
sum / DefOpt / cpu / PostRev |
0.00001192972001263115 s |
0.000011411460000090302 s |
1.05 |
sum / DefOpt / cpu / BothRev |
0.000012338120004642406 s |
0.00001147460001448053 s |
1.08 |
sum / IDefOpt / cpu / PreRev |
0.000012404099925333868 s |
0.000010913199957940378 s |
1.14 |
sum / IDefOpt / cpu / PostRev |
0.000012005920025330852 s |
0.000011719559970515548 s |
1.02 |
sum / IDefOpt / cpu / BothRev |
0.000012019479954687996 s |
0.000011089339977843338 s |
1.08 |
sum / JaXPipe / cuda / Primal |
0.000002048 s |
0.000002048 s |
1 |
sum / Jax / cuda / Primal |
0.000002047 s |
0.000002048 s |
1.00 |
sum / HLOOpt / cuda / Primal |
0.000002047 s |
0.000002047 s |
1 |
sum / PartOpt / cuda / Primal |
0.000002048 s |
0.00000208 s |
0.98 |
sum / IPartOpt / cuda / Primal |
0.000002048 s |
0.00000208 s |
0.98 |
sum / DefOpt / cuda / Primal |
0.000002048 s |
0.00000208 s |
0.98 |
sum / IDefOpt / cuda / Primal |
0.000002047 s |
0.00000208 s |
0.98 |
sum / JaXPipe / cuda / Forward |
0.000010016 s |
0.000010176 s |
0.98 |
sum / Jax / cuda / Forward |
0.000010016 s |
0.00001008 s |
0.99 |
sum / HLOOpt / cuda / Forward |
0.000010016 s |
0.00001008 s |
0.99 |
sum / PartOpt / cuda / Forward |
0.000010048 s |
0.000010272 s |
0.98 |
sum / IPartOpt / cuda / Forward |
0.00001024 s |
0.000010272 s |
1.00 |
sum / DefOpt / cuda / Forward |
0.000010272 s |
0.000009952 s |
1.03 |
sum / IDefOpt / cuda / Forward |
0.000009504 s |
0.000010336 s |
0.92 |
sum / JaXPipe / cuda / PreRev |
0.000009984 s |
0.000009856 s |
1.01 |
sum / JaXPipe / cuda / PostRev |
0.000009952 s |
0.00000976 s |
1.02 |
sum / JaXPipe / cuda / BothRev |
0.000009696 s |
0.000010048 s |
0.96 |
sum / Jax / cuda / BothRev |
0.000009345 s |
0.000009888 s |
0.95 |
sum / HLOOpt / cuda / PreRev |
0.000009824 s |
0.00000928 s |
1.06 |
sum / HLOOpt / cuda / PostRev |
0.000010176 s |
0.000009792 s |
1.04 |
sum / HLOOpt / cuda / BothRev |
0.000009856 s |
0.000009664 s |
1.02 |
sum / PartOpt / cuda / PreRev |
0.000009696 s |
0.000009824 s |
0.99 |
sum / PartOpt / cuda / PostRev |
0.00001008 s |
0.000009792 s |
1.03 |
sum / PartOpt / cuda / BothRev |
0.000010112 s |
0.000009952 s |
1.02 |
sum / IPartOpt / cuda / PreRev |
0.000009984 s |
0.000009663 s |
1.03 |
sum / IPartOpt / cuda / PostRev |
0.00001024 s |
0.000009536 s |
1.07 |
sum / IPartOpt / cuda / BothRev |
0.000009856 s |
0.000009536 s |
1.03 |
sum / DefOpt / cuda / PreRev |
0.00000976 s |
0.000009888 s |
0.99 |
sum / DefOpt / cuda / PostRev |
0.000009728 s |
0.000009729 s |
1.00 |
sum / DefOpt / cuda / BothRev |
0.00001024 s |
0.000009504 s |
1.08 |
sum / IDefOpt / cuda / PreRev |
0.0000104 s |
0.00000992 s |
1.05 |
sum / IDefOpt / cuda / PostRev |
0.00000992 s |
0.000009664 s |
1.03 |
sum / IDefOpt / cuda / BothRev |
0.000009568 s |
0.000009504 s |
1.01 |
sum / JaXPipe / tpu / Primal |
5.03e-7 s |
5.10575e-7 s |
0.99 |
sum / Jax / tpu / Primal |
5.566999999999999e-7 s |
5.584499999999999e-7 s |
1.00 |
sum / HLOOpt / tpu / Primal |
5.1305e-7 s |
5.2135e-7 s |
0.98 |
sum / PartOpt / tpu / Primal |
5.57075e-7 s |
5.5805e-7 s |
1.00 |
sum / IPartOpt / tpu / Primal |
5.129000000000001e-7 s |
5.21425e-7 s |
0.98 |
sum / DefOpt / tpu / Primal |
5.5695e-7 s |
5.58625e-7 s |
1.00 |
sum / IDefOpt / tpu / Primal |
5.1275e-7 s |
5.215750000000001e-7 s |
0.98 |
sum / JaXPipe / tpu / Forward |
0.0000015509 s |
0.0000015555749999999995 s |
1.00 |
sum / Jax / tpu / Forward |
0.000001492 s |
0.00000149725 s |
1.00 |
sum / HLOOpt / tpu / Forward |
0.000001528575 s |
0.000001531225 s |
1.00 |
sum / PartOpt / tpu / Forward |
0.000001492575 s |
0.000001504575 s |
0.99 |
sum / IPartOpt / tpu / Forward |
0.000001531775 s |
0.0000015308250000000005 s |
1.00 |
sum / DefOpt / tpu / Forward |
0.0000014939749999999995 s |
0.000001492625 s |
1.00 |
sum / IDefOpt / tpu / Forward |
0.000001528425 s |
0.0000015318 s |
1.00 |
sum / JaXPipe / tpu / PreRev |
9.9025e-7 s |
0.000001052375 s |
0.94 |
sum / JaXPipe / tpu / PostRev |
0.000001051775 s |
0.00000109095 s |
0.96 |
sum / JaXPipe / tpu / BothRev |
9.921e-7 s |
0.0000010573 s |
0.94 |
sum / Jax / tpu / BothRev |
0.0000010337499999999998 s |
0.0000010866 s |
0.95 |
sum / HLOOpt / tpu / PreRev |
9.92875e-7 s |
0.000001045525 s |
0.95 |
sum / HLOOpt / tpu / PostRev |
0.000001040125 s |
0.00000108695 s |
0.96 |
sum / HLOOpt / tpu / BothRev |
9.90525e-7 s |
0.0000010473499999999998 s |
0.95 |
sum / PartOpt / tpu / PreRev |
0.000001033275 s |
0.00000108655 s |
0.95 |
sum / PartOpt / tpu / PostRev |
9.96925e-7 s |
0.000001055375 s |
0.94 |
sum / PartOpt / tpu / BothRev |
0.0000010386 s |
0.000001086675 s |
0.96 |
sum / IPartOpt / tpu / PreRev |
9.92825e-7 s |
0.0000010486 s |
0.95 |
sum / IPartOpt / tpu / PostRev |
0.00000104065 s |
0.000001084325 s |
0.96 |
sum / IPartOpt / tpu / BothRev |
9.93825e-7 s |
0.000001049875 s |
0.95 |
sum / DefOpt / tpu / PreRev |
0.00000103695 s |
0.000001085475 s |
0.96 |
sum / DefOpt / tpu / PostRev |
9.962e-7 s |
0.0000010529 s |
0.95 |
sum / DefOpt / tpu / BothRev |
0.0000010416499999999998 s |
0.00000108915 s |
0.96 |
sum / IDefOpt / tpu / PreRev |
9.96175e-7 s |
0.00000104825 s |
0.95 |
sum / IDefOpt / tpu / PostRev |
0.000001046875 s |
0.00000109085 s |
0.96 |
sum / IDefOpt / tpu / BothRev |
9.876e-7 s |
0.000001048925 s |
0.94 |
sum / JaXPipe / cpu / Primal |
0.000014772 s |
0.000009182520043395926 s |
1.61 |
sum / Jax / cpu / Primal |
0.000014617 s |
0.000008828279960653163 s |
1.66 |
sum / HLOOpt / cpu / Primal |
0.00001506 s |
0.000012311380023675156 s |
1.22 |
sum / PartOpt / cpu / Primal |
0.000014692 s |
0.000008517040032529622 s |
1.73 |
sum / IPartOpt / cpu / Primal |
0.000014378 s |
0.00000768539998716733 s |
1.87 |
sum / DefOpt / cpu / Primal |
0.000014507 s |
0.000008223459944929346 s |
1.76 |
sum / IDefOpt / cpu / Primal |
0.000014602 s |
0.00000776467998548469 s |
1.88 |
sum / JaXPipe / cpu / Forward |
0.000019846 s |
0.000011874920019181446 s |
1.67 |
sum / Jax / cpu / Forward |
0.000020383 s |
0.0000123945599898434 s |
1.64 |
sum / HLOOpt / cpu / Forward |
0.000020161 s |
0.00001648988003580598 s |
1.22 |
sum / PartOpt / cpu / Forward |
0.000020074 s |
0.000016962420004347224 s |
1.18 |
sum / IPartOpt / cpu / Forward |
0.000020236 s |
0.000011852979978357326 s |
1.71 |
sum / DefOpt / cpu / Forward |
0.000020103 s |
0.000017227739990630654 s |
1.17 |
sum / IDefOpt / cpu / Forward |
0.000020028 s |
0.000012147320021540508 s |
1.65 |
sum / JaXPipe / cpu / PreRev |
0.000019266 s |
0.00001209846000165271 s |
1.59 |
sum / JaXPipe / cpu / PostRev |
0.000019116 s |
0.000012334540024312446 s |
1.55 |
sum / JaXPipe / cpu / BothRev |
0.000019605 s |
0.00001149539999460103 s |
1.71 |
sum / Jax / cpu / BothRev |
0.000019227 s |
0.000011714179963746574 s |
1.64 |
sum / HLOOpt / cpu / PreRev |
0.000018577 s |
0.00001129744000536448 s |
1.64 |
sum / HLOOpt / cpu / PostRev |
0.000019268 s |
0.00001198017997921852 s |
1.61 |
sum / HLOOpt / cpu / BothRev |
0.000019988 s |
0.000013544299981731456 s |
1.48 |
sum / PartOpt / cpu / PreRev |
0.000019255 s |
0.000011844500004372094 s |
1.63 |
sum / PartOpt / cpu / PostRev |
0.000019342 s |
0.000011333900010868092 s |
1.71 |
sum / PartOpt / cpu / BothRev |
0.000019047 s |
0.000011511460006659036 s |
1.65 |
sum / IPartOpt / cpu / PreRev |
0.000019193 s |
0.000013520800030164537 s |
1.42 |
sum / IPartOpt / cpu / PostRev |
0.000020248 s |
0.000011325380037305876 s |
1.79 |
sum / IPartOpt / cpu / BothRev |
0.00001947 s |
0.000011000579979736358 s |
1.77 |
sum / DefOpt / cpu / PreRev |
0.0000194 s |
0.000011095880036009477 s |
1.75 |
sum / DefOpt / cpu / PostRev |
0.000019899 s |
0.000011411460000090302 s |
1.74 |
sum / DefOpt / cpu / BothRev |
0.000019542 s |
0.00001147460001448053 s |
1.70 |
sum / IDefOpt / cpu / PreRev |
0.000019256 s |
0.000010913199957940378 s |
1.76 |
sum / IDefOpt / cpu / PostRev |
0.000019245000000000003 s |
0.000011719559970515548 s |
1.64 |
sum / IDefOpt / cpu / BothRev |
0.000019839 s |
0.000011089339977843338 s |
1.79 |
sum / JaXPipe / cpu / Primal |
0.00001 s |
0.000009182520043395926 s |
1.09 |
sum / Jax / cpu / Primal |
0.00001 s |
0.000008828279960653163 s |
1.13 |
sum / HLOOpt / cpu / Primal |
0.00001 s |
0.000012311380023675156 s |
0.81 |
sum / PartOpt / cpu / Primal |
0.00001 s |
0.000008517040032529622 s |
1.17 |
sum / IPartOpt / cpu / Primal |
0.00001 s |
0.00000768539998716733 s |
1.30 |
sum / DefOpt / cpu / Primal |
0.00001 s |
0.000008223459944929346 s |
1.22 |
sum / IDefOpt / cpu / Primal |
0.00001 s |
0.00000776467998548469 s |
1.29 |
sum / JaXPipe / cpu / Forward |
0.000045 s |
0.000011874920019181446 s |
3.79 |
sum / Jax / cpu / Forward |
0.000015 s |
0.0000123945599898434 s |
1.21 |
sum / HLOOpt / cpu / Forward |
0.000015 s |
0.00001648988003580598 s |
0.91 |
sum / PartOpt / cpu / Forward |
0.000015 s |
0.000016962420004347224 s |
0.88 |
sum / IPartOpt / cpu / Forward |
0.000014 s |
0.000011852979978357326 s |
1.18 |
sum / DefOpt / cpu / Forward |
0.000014 s |
0.000017227739990630654 s |
0.81 |
sum / IDefOpt / cpu / Forward |
0.000014 s |
0.000012147320021540508 s |
1.15 |
sum / JaXPipe / cpu / PreRev |
0.000016 s |
0.00001209846000165271 s |
1.32 |
sum / JaXPipe / cpu / PostRev |
0.000014 s |
0.000012334540024312446 s |
1.14 |
sum / JaXPipe / cpu / BothRev |
0.000013 s |
0.00001149539999460103 s |
1.13 |
sum / Jax / cpu / BothRev |
0.000014 s |
0.000011714179963746574 s |
1.20 |
sum / HLOOpt / cpu / PreRev |
0.000013 s |
0.00001129744000536448 s |
1.15 |
sum / HLOOpt / cpu / PostRev |
0.000013 s |
0.00001198017997921852 s |
1.09 |
sum / HLOOpt / cpu / BothRev |
0.000013 s |
0.000013544299981731456 s |
0.96 |
sum / PartOpt / cpu / PreRev |
0.000042 s |
0.000011844500004372094 s |
3.55 |
sum / PartOpt / cpu / PostRev |
0.000019 s |
0.000011333900010868092 s |
1.68 |
sum / PartOpt / cpu / BothRev |
0.000014 s |
0.000011511460006659036 s |
1.22 |
sum / IPartOpt / cpu / PreRev |
0.000044 s |
0.000013520800030164537 s |
3.25 |
sum / IPartOpt / cpu / PostRev |
0.000013 s |
0.000011325380037305876 s |
1.15 |
sum / IPartOpt / cpu / BothRev |
0.000014 s |
0.000011000579979736358 s |
1.27 |
sum / DefOpt / cpu / PreRev |
0.000016 s |
0.000011095880036009477 s |
1.44 |
sum / DefOpt / cpu / PostRev |
0.000013 s |
0.000011411460000090302 s |
1.14 |
sum / DefOpt / cpu / BothRev |
0.000014 s |
0.00001147460001448053 s |
1.22 |
sum / IDefOpt / cpu / PreRev |
0.000013 s |
0.000010913199957940378 s |
1.19 |
sum / IDefOpt / cpu / PostRev |
0.000013 s |
0.000011719559970515548 s |
1.11 |
sum / IDefOpt / cpu / BothRev |
0.000013 s |
0.000011089339977843338 s |
1.17 |
value_and_grad / JaXPipe / cpu / Primal |
0.000016502819989909766 s |
0.000015938259984977776 s |
1.04 |
value_and_grad / Jax / cpu / Primal |
0.000016065919953689445 s |
0.00001551725998069742 s |
1.04 |
value_and_grad / HLOOpt / cpu / Primal |
0.00001577392002218403 s |
0.000014285360011854209 s |
1.10 |
value_and_grad / PartOpt / cpu / Primal |
0.00001569205999658152 s |
0.000014652620029664833 s |
1.07 |
value_and_grad / IPartOpt / cpu / Primal |
0.00001533910000944161 s |
0.00001474139997299062 s |
1.04 |
value_and_grad / DefOpt / cpu / Primal |
0.000015350619987657412 s |
0.000014477040012934594 s |
1.06 |
value_and_grad / IDefOpt / cpu / Primal |
0.000016010740037017967 s |
0.000015675319991714788 s |
1.02 |
value_and_grad / JaXPipe / cuda / Primal |
0.000033759999999999995 s |
0.000032769 s |
1.03 |
value_and_grad / Jax / cuda / Primal |
0.000034176 s |
0.000032736 s |
1.04 |
value_and_grad / HLOOpt / cuda / Primal |
0.000033856 s |
0.000033024 s |
1.03 |
value_and_grad / PartOpt / cuda / Primal |
0.000033664 s |
0.000033024 s |
1.02 |
value_and_grad / IPartOpt / cuda / Primal |
0.000034528000000000006 s |
0.000033728 s |
1.02 |
value_and_grad / DefOpt / cuda / Primal |
0.000034432 s |
0.00003328 s |
1.03 |
value_and_grad / IDefOpt / cuda / Primal |
0.000034688 s |
0.000032928 s |
1.05 |
value_and_grad / JaXPipe / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / Jax / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / HLOOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / PartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IPartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / DefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IDefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / JaXPipe / cpu / Primal |
0.000023964 s |
0.000015938259984977776 s |
1.50 |
value_and_grad / Jax / cpu / Primal |
0.000022842000000000003 s |
0.00001551725998069742 s |
1.47 |
value_and_grad / HLOOpt / cpu / Primal |
0.000023582 s |
0.000014285360011854209 s |
1.65 |
value_and_grad / PartOpt / cpu / Primal |
0.000023285 s |
0.000014652620029664833 s |
1.59 |
value_and_grad / IPartOpt / cpu / Primal |
0.000024032 s |
0.00001474139997299062 s |
1.63 |
value_and_grad / DefOpt / cpu / Primal |
0.00002405 s |
0.000014477040012934594 s |
1.66 |
value_and_grad / IDefOpt / cpu / Primal |
0.000022984 s |
0.000015675319991714788 s |
1.47 |
value_and_grad / JaXPipe / cpu / Primal |
0.000016 s |
0.000015938259984977776 s |
1.00 |
value_and_grad / Jax / cpu / Primal |
0.000017 s |
0.00001551725998069742 s |
1.10 |
value_and_grad / HLOOpt / cpu / Primal |
0.000016 s |
0.000014285360011854209 s |
1.12 |
value_and_grad / PartOpt / cpu / Primal |
0.000016 s |
0.000014652620029664833 s |
1.09 |
value_and_grad / IPartOpt / cpu / Primal |
0.000016 s |
0.00001474139997299062 s |
1.09 |
value_and_grad / DefOpt / cpu / Primal |
0.000016 s |
0.000014477040012934594 s |
1.11 |
value_and_grad / IDefOpt / cpu / Primal |
0.000052 s |
0.000015675319991714788 s |
3.32 |
jaxmd20 / JaXPipe / cuda / Primal |
0.001538818 s |
0.001489091 s |
1.03 |
jaxmd20 / Jax / cuda / Primal |
0.001473666 s |
0.001436738 s |
1.03 |
jaxmd20 / HLOOpt / cuda / Primal |
0.001105762 s |
0.001079234 s |
1.02 |
jaxmd20 / PartOpt / cuda / Primal |
0.001343555 s |
0.001282083 s |
1.05 |
jaxmd20 / IPartOpt / cuda / Primal |
0.001350146 s |
0.001299937 s |
1.04 |
jaxmd20 / DefOpt / cuda / Primal |
0.000527233 s |
0.000553953 s |
0.95 |
jaxmd20 / IDefOpt / cuda / Primal |
0.000501089 s |
0.000493377 s |
1.02 |
jaxmd20 / JaXPipe / cuda / Forward |
0.000812449 s |
0.000823393 s |
0.99 |
jaxmd20 / Jax / cuda / Forward |
0.001788387 s |
0.001806562 s |
0.99 |
jaxmd20 / HLOOpt / cuda / Forward |
0.000828225 s |
0.000826082 s |
1.00 |
jaxmd20 / PartOpt / cuda / Forward |
0.000824097 s |
0.000817729 s |
1.01 |
jaxmd20 / IPartOpt / cuda / Forward |
0.000823585 s |
0.0008304009999999 s |
0.99 |
jaxmd20 / DefOpt / cuda / Forward |
0.000843456 s |
0.000861762 s |
0.98 |
jaxmd20 / IDefOpt / cuda / Forward |
0.000825538 s |
0.000861697 s |
0.96 |
jaxmd20 / JaXPipe / cuda / PreRev |
0.001698179 s |
0.001678978 s |
1.01 |
jaxmd20 / JaXPipe / cuda / PostRev |
0.005286792 s |
0.005292679 s |
1.00 |
jaxmd20 / JaXPipe / cuda / BothRev |
0.001687587 s |
0.001657699 s |
1.02 |
jaxmd20 / Jax / cuda / BothRev |
0.005284104 s |
0.005276871 s |
1.00 |
jaxmd20 / HLOOpt / cuda / PreRev |
0.001715586 s |
0.001739106 s |
0.99 |
jaxmd20 / HLOOpt / cuda / PostRev |
0.005190983 s |
0.0051658 s |
1.00 |
jaxmd20 / HLOOpt / cuda / BothRev |
0.0016418589999999 s |
0.00163197 s |
1.01 |
jaxmd20 / PartOpt / cuda / PreRev |
0.001729571 s |
0.001712547 s |
1.01 |
jaxmd20 / PartOpt / cuda / PostRev |
0.005369672 s |
0.0053235919999999 s |
1.01 |
jaxmd20 / PartOpt / cuda / BothRev |
0.001691267 s |
0.0016367069999999 s |
1.03 |
jaxmd20 / IPartOpt / cuda / PreRev |
0.001720162 s |
0.001706818 s |
1.01 |
jaxmd20 / IPartOpt / cuda / PostRev |
0.005377992 s |
0.005346889 s |
1.01 |
jaxmd20 / IPartOpt / cuda / BothRev |
0.0016504369999999 s |
0.001632098 s |
1.01 |
jaxmd20 / DefOpt / cuda / PreRev |
0.001719395 s |
0.001707459 s |
1.01 |
jaxmd20 / DefOpt / cuda / PostRev |
0.002747779 s |
0.002712549 s |
1.01 |
jaxmd20 / DefOpt / cuda / BothRev |
0.001641153 s |
0.0016335059999999 s |
1.00 |
jaxmd20 / IDefOpt / cuda / PreRev |
0.00174365 s |
0.001709282 s |
1.02 |
jaxmd20 / IDefOpt / cuda / PostRev |
0.001989765 s |
0.001992451 s |
1.00 |
jaxmd20 / IDefOpt / cuda / BothRev |
0.001643491 s |
0.001657283 s |
0.99 |
jaxmd20 / JaXPipe / tpu / Primal |
0.009274043125 s |
0.0092747475 s |
1.00 |
jaxmd20 / Jax / tpu / Primal |
0.009265734375 s |
0.009263269375 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Primal |
0.0091657775 s |
0.009166279375 s |
1.00 |
jaxmd20 / PartOpt / tpu / Primal |
0.0091969462499999 s |
0.009197805625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Primal |
0.00920220875 s |
0.0092011981249999 s |
1.00 |
jaxmd20 / DefOpt / tpu / Primal |
0.008745315625 s |
0.0087459237499999 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Primal |
0.008631079375 s |
0.0086314499999999 s |
1.00 |
jaxmd20 / JaXPipe / tpu / Forward |
0.017264915 s |
0.0172624974999999 s |
1.00 |
jaxmd20 / Jax / tpu / Forward |
0.01872638375 s |
0.018729604375 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Forward |
0.017236946875 s |
0.017236786875 s |
1.00 |
jaxmd20 / PartOpt / tpu / Forward |
0.017267438125 s |
0.017269618125 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Forward |
0.0172611375 s |
0.0172633 s |
1.00 |
jaxmd20 / DefOpt / tpu / Forward |
0.017265241875 s |
0.017262036875 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Forward |
0.017264225 s |
0.017262990625 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PreRev |
0.025345483125 s |
0.02533983375 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PostRev |
0.0218921199999999 s |
0.021892336875 s |
1.00 |
jaxmd20 / JaXPipe / tpu / BothRev |
0.0253572975 s |
0.025355716875 s |
1.00 |
jaxmd20 / Jax / tpu / BothRev |
0.021891861875 s |
0.021893774375 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PreRev |
0.025349535 s |
0.02535478625 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PostRev |
0.02098512875 s |
0.020731351875 s |
1.01 |
jaxmd20 / HLOOpt / tpu / BothRev |
0.025265225625 s |
0.0252639975 s |
1.00 |
jaxmd20 / PartOpt / tpu / PreRev |
0.02536283625 s |
0.0253615693749999 s |
1.00 |
jaxmd20 / PartOpt / tpu / PostRev |
0.021509740625 s |
0.021509948125 s |
1.00 |
jaxmd20 / PartOpt / tpu / BothRev |
0.0252895174999999 s |
0.02528850625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PreRev |
0.02534875375 s |
0.025349745 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PostRev |
0.0215413031249999 s |
0.02154356125 s |
1.00 |
jaxmd20 / IPartOpt / tpu / BothRev |
0.025262615 s |
0.025260910625 s |
1.00 |
jaxmd20 / DefOpt / tpu / PreRev |
0.0253608881249999 s |
0.025364399375 s |
1.00 |
jaxmd20 / DefOpt / tpu / PostRev |
0.01877164375 s |
0.018772566875 s |
1.00 |
jaxmd20 / DefOpt / tpu / BothRev |
0.0252865962499999 s |
0.02528473125 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PreRev |
0.02534712375 s |
0.02534748625 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PostRev |
0.01811546125 s |
0.01811826 s |
1.00 |
jaxmd20 / IDefOpt / tpu / BothRev |
0.025260875625 s |
0.02526199 s |
1.00 |
jaxmd40 / JaXPipe / cpu / Primal |
0.088139626 s |
0.064389043 s |
1.37 |
jaxmd40 / Jax / cpu / Primal |
0.07484411 s |
0.075098109 s |
1.00 |
jaxmd40 / HLOOpt / cpu / Primal |
0.08396951 s |
0.09393414 s |
0.89 |
jaxmd40 / PartOpt / cpu / Primal |
0.063254848 s |
0.065813923 s |
0.96 |
jaxmd40 / IPartOpt / cpu / Primal |
0.074285567 s |
0.071054981 s |
1.05 |
jaxmd40 / DefOpt / cpu / Primal |
0.090359072 s |
0.088649606 s |
1.02 |
jaxmd40 / IDefOpt / cpu / Primal |
0.092169058 s |
0.08105066 s |
1.14 |
jaxmd40 / JaXPipe / cpu / Forward |
0.164415489 s |
0.1586726459999999 s |
1.04 |
jaxmd40 / Jax / cpu / Forward |
0.104059021 s |
0.090318597 s |
1.15 |
jaxmd40 / HLOOpt / cpu / Forward |
0.1636139049999999 s |
0.164154542 s |
1.00 |
jaxmd40 / PartOpt / cpu / Forward |
0.161517569 s |
0.156601635 s |
1.03 |
jaxmd40 / IPartOpt / cpu / Forward |
0.166110405 s |
0.164045594 s |
1.01 |
jaxmd40 / DefOpt / cpu / Forward |
0.172983424 s |
0.153992216 s |
1.12 |
jaxmd40 / IDefOpt / cpu / Forward |
0.174806117 s |
0.156165278 s |
1.12 |
jaxmd40 / JaXPipe / cpu / PreRev |
0.256795694 s |
0.2168472209999999 s |
1.18 |
jaxmd40 / JaXPipe / cpu / PostRev |
0.143777397 s |
0.137391516 s |
1.05 |
jaxmd40 / JaXPipe / cpu / BothRev |
0.239807983 s |
0.224425981 s |
1.07 |
jaxmd40 / Jax / cpu / BothRev |
0.147690925 s |
0.1403212519999999 s |
1.05 |
jaxmd40 / HLOOpt / cpu / PreRev |
0.221257155 s |
0.229696034 s |
0.96 |
jaxmd40 / HLOOpt / cpu / PostRev |
0.181558132 s |
0.1734641039999999 s |
1.05 |
jaxmd40 / HLOOpt / cpu / BothRev |
0.2519533419999999 s |
0.238734951 s |
1.06 |
jaxmd40 / PartOpt / cpu / PreRev |
0.2345179339999999 s |
0.230188916 s |
1.02 |
jaxmd40 / PartOpt / cpu / PostRev |
0.139092571 s |
0.1301312979999999 s |
1.07 |
jaxmd40 / PartOpt / cpu / BothRev |
0.237424331 s |
0.2444349479999999 s |
0.97 |
jaxmd40 / IPartOpt / cpu / PreRev |
0.241023134 s |
0.21103832 s |
1.14 |
jaxmd40 / IPartOpt / cpu / PostRev |
0.126814143 s |
0.133954614 s |
0.95 |
jaxmd40 / IPartOpt / cpu / BothRev |
0.248520878 s |
0.235675218 s |
1.05 |
jaxmd40 / DefOpt / cpu / PreRev |
0.224910529 s |
0.218206029 s |
1.03 |
jaxmd40 / DefOpt / cpu / PostRev |
0.16998 s |
0.170328459 s |
1.00 |
jaxmd40 / DefOpt / cpu / BothRev |
0.257545449 s |
0.25450203 s |
1.01 |
jaxmd40 / IDefOpt / cpu / PreRev |
0.217652286 s |
0.221404833 s |
0.98 |
jaxmd40 / IDefOpt / cpu / PostRev |
0.17534624 s |
0.1760200079999999 s |
1.00 |
jaxmd40 / IDefOpt / cpu / BothRev |
0.257813627 s |
0.22752861 s |
1.13 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal |
1.701544392 s |
1.705519003 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal |
1.704584304 s |
1.707962428 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal |
1.716863181 s |
1.7195978399999998 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal |
1.696285122 s |
1.699366622 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal |
1.695293262 s |
1.69751975 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal |
1.6638795279999998 s |
1.6681674169999998 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal |
1.910072394 s |
1.915522691 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal |
3.988526180625 s |
4.01699938375 s |
0.99 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal |
3.038666975625 s |
3.03880095375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal |
3.121071391875 s |
3.121094105625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal |
3.059029013125 s |
3.059155715 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal |
3.05898760625 s |
3.059077515 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal |
2.102623036875 s |
2.102721625625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal |
4.356143315000001 s |
4.356271070625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal |
6.223370612 s |
5.851205109 s |
1.06 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal |
6.110045841 s |
5.961654927 s |
1.02 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal |
6.118277398999999 s |
6.033430212 s |
1.01 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal |
6.192022785 s |
6.018784604 s |
1.03 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal |
6.124048152 s |
5.926606443 s |
1.03 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal |
2.375033473 s |
2.251407652 s |
1.05 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal |
6.706174891 s |
6.355471251 s |
1.06 |
This comment was automatically generated by workflow using github-action-benchmark.
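For reading the rows above: the last value in each row matches the first timing divided by the second, rounded to two decimals (for example, 0.088139626 / 0.064389043 gives 1.37 for jaxmd40 / JaXPipe / cpu / Primal). A minimal sketch of that check follows; `timingRatio` is a hypothetical helper name and is not part of github-action-benchmark.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format first/second with two decimals, which is
// how the ratio column in the rows above appears to be derived.
std::string timingRatio(double first, double second) {
  char buffer[32];
  std::snprintf(buffer, sizeof(buffer), "%.2f", first / second);
  return std::string(buffer);
}
```

Under this reading, a ratio of 1.00 means the two timings agree to within rounding, and values far from 1.00 (such as the cpu rows for jaxmd40) mark the rows worth a closer look.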
a251345 to e8acf0b
A positive rotation amount signifies rotation to the left. I've added this to RotateOp's description.
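To make the sign convention concrete, here is a small 1-D sketch. It assumes that "rotation to the left" means elements move toward lower indices with wraparound, so that `result[i] == input[(i + amount) % n]`. The helper `rotateLeft` is hypothetical and only illustrates the convention; it is not the RotateOp implementation, which operates along a tensor dimension.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rotate a 1-D sequence left by `amount` positions,
// wrapping the leading elements around to the end.
std::vector<int> rotateLeft(std::vector<int> values, std::size_t amount) {
  if (values.empty())
    return values;
  amount %= values.size();
  // std::rotate makes the element at `begin + amount` the new first element,
  // which is exactly a left rotation by `amount`.
  std::rotate(values.begin(),
              values.begin() + static_cast<std::ptrdiff_t>(amount),
              values.end());
  return values;
}
```

With this convention, rotating `{0, 1, 2, 3, 4}` by 2 yields `{2, 3, 4, 0, 1}`, and a rotation by 0 leaves the input unchanged.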
Needs a rebase; otherwise good to merge!
ref: #1949