@jumerckx
Collaborator

@jumerckx jumerckx commented Jan 21, 2026

ref: #1949

// Check density - only combine if utilization is reasonable
// Avoid creating a MultiRotateOp with many unused intermediate results
if (totalResults > 2 * static_cast<int32_t>(rotates.size()))
  return failure();
Collaborator Author

Do we need such a check? This one might be too lenient?
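
To make the leniency question concrete, here is a minimal standalone sketch of the density test. It assumes (this is not confirmed by the PR) that the merged op would produce one result for every amount between the smallest and largest rotation; `isDenseEnough` is a hypothetical helper, not a function from the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring `totalResults > 2 * rotates.size()`:
// returns true when merging would NOT be rejected by the density check.
bool isDenseEnough(const std::vector<int32_t> &amounts) {
  if (amounts.empty())
    return false;
  auto minMax = std::minmax_element(amounts.begin(), amounts.end());
  // Assumption: one result per amount in the closed range [min, max].
  int32_t totalResults = *minMax.second - *minMax.first + 1;
  return totalResults <= 2 * static_cast<int32_t>(amounts.size());
}
```

Under that assumption, amounts `{0, 1, 4}` pass the check (5 results for 3 rotates) even though two of the five results would be unused, which is the kind of case the "too lenient" concern points at.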

Member

I think we should throw if there are any gaps (also checking whether an op is use_empty). That said, we don't need to return failure; we can just resize and select the relevant subset.

Collaborator Author

I added filtering for a contiguous sequence that neighbors or includes rotate(x, 0)
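
A rough standalone sketch of that filtering, under assumptions of my own: amounts are plain integers, a run "neighbors" the identity when it ends at -1 or starts at 1, and fewer than two surviving amounts means nothing is merged. The function name and signature are hypothetical and do not come from the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch: keep only amounts that belong to a contiguous run
// which includes 0 or sits directly next to it.
std::vector<int32_t> qualifyingAmounts(std::vector<int32_t> amounts) {
  std::sort(amounts.begin(), amounts.end());
  amounts.erase(std::unique(amounts.begin(), amounts.end()), amounts.end());

  // Split the sorted amounts into [start, end] runs of consecutive integers.
  std::vector<std::pair<int32_t, int32_t>> contiguousGroups;
  for (std::size_t i = 0; i < amounts.size();) {
    std::size_t j = i;
    while (j + 1 < amounts.size() && amounts[j + 1] == amounts[j] + 1)
      ++j;
    contiguousGroups.push_back({amounts[i], amounts[j]});
    i = j + 1;
  }

  std::vector<int32_t> result;
  for (const auto &group : contiguousGroups) {
    // Skip runs that neither cover 0 nor border it on either side.
    if (group.first > 1 || group.second < -1)
      continue;
    for (int32_t a = group.first; a <= group.second; ++a)
      result.push_back(a);
  }
  // Fewer than two qualifying amounts: nothing worth merging.
  if (result.size() < 2)
    result.clear();
  return result;
}
```

For example, `{0, 1, 2, 5}` would reduce to `{0, 1, 2}`, while `{3, 4}` would yield nothing because that run is detached from the identity rotation. How to treat two runs on either side of a missing 0 is a design choice this sketch does not try to settle; it simply keeps both.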

@jumerckx jumerckx self-assigned this Jan 21, 2026
@jumerckx jumerckx force-pushed the jm/multirotate_recognize branch from aecb25c to 424244f Compare January 21, 2026 21:51
return failure();

// Create the MultiRotateOp
auto newOp = rewriter.create<enzymexla::MultiRotateOp>(
Member

We need to move the newOp so that it is defined right after the operand; otherwise we risk a dominance error.

@jumerckx jumerckx force-pushed the jm/multirotate_recognize branch from 424244f to 1936050 Compare January 22, 2026 10:08
@jumerckx
Collaborator Author

I've been confusing myself.
Do you consider a positive rotation amount to be a leftwards or rightwards rotation?
i.e.

a = rotate(x, amount=0)
b = rotate(x, amount=1)
c = rotate(x, amount=2)
-->
c, b, a = multirotate(x, left=2, right=0)

or

a, b, c = multirotate(x, left=0, right=2)

Currently it's implemented as the former but I now think the latter might be more logical?
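
To pin down what the second ordering would mean, here is a small standalone model on plain vectors. It assumes, purely for illustration, that a positive amount is a leftward rotation (element `amount` moves to index 0); that is exactly the open question above, not an established fact about the op. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumption for illustration: a positive amount rotates left, so element
// `amount` of the input becomes element 0 of the output.
std::vector<int> rotateLeft(std::vector<int> x, std::size_t amount) {
  if (!x.empty()) {
    auto shift = static_cast<std::ptrdiff_t>(amount % x.size());
    std::rotate(x.begin(), x.begin() + shift, x.end());
  }
  return x;
}

// Hypothetical result order for the second convention (left=0, right=N):
// results are listed from amount 0 up to amount `right`.
std::vector<std::vector<int>> multiRotateAscending(const std::vector<int> &x,
                                                   std::size_t right) {
  std::vector<std::vector<int>> results;
  for (std::size_t amount = 0; amount <= right; ++amount)
    results.push_back(rotateLeft(x, amount));
  return results;
}
```

With `x = {1, 2, 3, 4}` and `right = 2`, the results come out in the order `a, b, c` from the example; the first convention would return the same three vectors in reverse order.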

@jumerckx jumerckx force-pushed the jm/multirotate_recognize branch from 1936050 to 48557ca Compare January 22, 2026 10:09
Base automatically changed from jm/multirotate_and_multislice to main January 23, 2026 02:56
}
}
contiguousGroups.push_back({groupStart, groupEnd});

Member

We should do the contiguous-group check before the check for at least 2 rotates, in case we reduce the number because some are not contiguous.

Collaborator Author

There's a second check as well:

    // No qualifying groups found
    if (qualifyingAmounts.size() < 2)
      return failure();

Member

Ah okay

jumerckx and others added 6 commits January 23, 2026 02:48
* check if the rotates are actually used
* only merge rotates if they form a contiguous sequence that neighbors
(or includes) the identity rotation
@jumerckx jumerckx force-pushed the jm/multirotate_recognize branch from 14bea1f to 74d31ac Compare January 23, 2026 09:08
@wsmoses
Member

wsmoses commented Jan 23, 2026

Re the existing rotate: I'd need to check, but take a look at the existing rotate lower and recognize?

also perhaps we should add better docs on?

Contributor

@github-actions github-actions bot left a comment

EnzymeJAX Benchmarks

Details
Benchmark suite Current: 1a288d6 Previous: c6280c7 Ratio
actmtch / JaXPipe / cpu / Primal 0.000007811020013832604 s 0.0000070109200623846845 s 1.11
actmtch / Jax / cpu / Primal 0.000007301200012079789 s 0.00000701169999956619 s 1.04
actmtch / HLOOpt / cpu / Primal 0.000011415199996918091 s 0.000010657899974830796 s 1.07
actmtch / PartOpt / cpu / Primal 0.000006702959972244571 s 0.000006732300034855143 s 1.00
actmtch / IPartOpt / cpu / Primal 0.000006998340013524284 s 0.0000068570599705708445 s 1.02
actmtch / DefOpt / cpu / Primal 0.000007483140007025213 s 0.000011533039996720618 s 0.65
actmtch / IDefOpt / cpu / Primal 0.000007393980004053447 s 0.000007239939995997702 s 1.02
actmtch / JaXPipe / cpu / Forward 0.000011722619992724504 s 0.00001105510002162191 s 1.06
actmtch / Jax / cpu / Forward 0.000010145140004169662 s 0.000009777999985089993 s 1.04
actmtch / HLOOpt / cpu / Forward 0.000016489940016981563 s 0.000014957540024624904 s 1.10
actmtch / PartOpt / cpu / Forward 0.000016076980073194137 s 0.000015249399966705823 s 1.05
actmtch / IPartOpt / cpu / Forward 0.00001142214004175912 s 0.000010884359990086522 s 1.05
actmtch / DefOpt / cpu / Forward 0.000015992240023479098 s 0.000015586240015181828 s 1.03
actmtch / IDefOpt / cpu / Forward 0.000011382200000298326 s 0.000010999039996022474 s 1.03
actmtch / JaXPipe / cpu / PreRev 0.00001244667998435034 s 0.00001187840001875884 s 1.05
actmtch / JaXPipe / cpu / PostRev 0.00001178777999484737 s 0.000011447939969002618 s 1.03
actmtch / JaXPipe / cpu / BothRev 0.000012524619969553895 s 0.00001139407995651709 s 1.10
actmtch / Jax / cpu / BothRev 0.000011000319946106174 s 0.0000104005599678203 s 1.06
actmtch / HLOOpt / cpu / PreRev 0.00001269331999537826 s 0.000011989220020041105 s 1.06
actmtch / HLOOpt / cpu / PostRev 0.00001652812003158033 s 0.000015558620016236092 s 1.06
actmtch / HLOOpt / cpu / BothRev 0.000014140020011836896 s 0.000013079480004307695 s 1.08
actmtch / PartOpt / cpu / PreRev 0.00001252550003300712 s 0.00001192708000417042 s 1.05
actmtch / PartOpt / cpu / PostRev 0.000011804919950009207 s 0.000010795760035762214 s 1.09
actmtch / PartOpt / cpu / BothRev 0.00001279191999856266 s 0.000011267960035183933 s 1.14
actmtch / IPartOpt / cpu / PreRev 0.000012690299972746287 s 0.000013439059994198031 s 0.94
actmtch / IPartOpt / cpu / PostRev 0.000011765780018322402 s 0.000010422020004625665 s 1.13
actmtch / IPartOpt / cpu / BothRev 0.000012225419995957054 s 0.000011947300017709494 s 1.02
actmtch / DefOpt / cpu / PreRev 0.000011876260005010407 s 0.00001179932004561124 s 1.01
actmtch / DefOpt / cpu / PostRev 0.00001234206004482985 s 0.00001211730002069089 s 1.02
actmtch / DefOpt / cpu / BothRev 0.000012431639988790269 s 0.000011895040051967951 s 1.05
actmtch / IDefOpt / cpu / PreRev 0.000012736880025840946 s 0.000011312639981042591 s 1.13
actmtch / IDefOpt / cpu / PostRev 0.000012277780006115791 s 0.000012055839970344096 s 1.02
actmtch / IDefOpt / cpu / BothRev 0.000012370519953037728 s 0.00001205063998895639 s 1.03
actmtch / JaXPipe / cuda / Primal 0.000002015 s 0.000002015 s 1
actmtch / Jax / cuda / Primal 0.000002015 s 0.000002016 s 1.00
actmtch / HLOOpt / cuda / Primal 0.000002016 s 0.000002016 s 1
actmtch / PartOpt / cuda / Primal 0.000002016 s 0.000002015 s 1.00
actmtch / IPartOpt / cuda / Primal 0.000002016 s 0.000002015 s 1.00
actmtch / DefOpt / cuda / Primal 0.000002016 s 0.000002016 s 1
actmtch / IDefOpt / cuda / Primal 0.000002015 s 0.000002016 s 1.00
actmtch / JaXPipe / cuda / Forward 0.000010528 s 0.0000104 s 1.01
actmtch / Jax / cuda / Forward 0.000010432 s 0.000011488 s 0.91
actmtch / HLOOpt / cuda / Forward 0.000010208 s 0.000011710999999999998 s 0.87
actmtch / PartOpt / cuda / Forward 0.00000976 s 0.000010113 s 0.97
actmtch / IPartOpt / cuda / Forward 0.000010208 s 0.000010176 s 1.00
actmtch / DefOpt / cuda / Forward 0.000009888 s 0.000009888 s 1
actmtch / IDefOpt / cuda / Forward 0.00001008 s 0.000009952 s 1.01
actmtch / JaXPipe / cuda / PreRev 0.000010144 s 0.000010112 s 1.00
actmtch / JaXPipe / cuda / PostRev 0.000010464 s 0.000010784 s 0.97
actmtch / JaXPipe / cuda / BothRev 0.000010527 s 0.00000992 s 1.06
actmtch / Jax / cuda / BothRev 0.000012 s 0.000010111 s 1.19
actmtch / HLOOpt / cuda / PreRev 0.000010049 s 0.000010112 s 0.99
actmtch / HLOOpt / cuda / PostRev 0.000010048 s 0.000010048 s 1
actmtch / HLOOpt / cuda / BothRev 0.000010016 s 0.000010208 s 0.98
actmtch / PartOpt / cuda / PreRev 0.000009952 s 0.000011136 s 0.89
actmtch / PartOpt / cuda / PostRev 0.000009888 s 0.000009888 s 1
actmtch / PartOpt / cuda / BothRev 0.000010048 s 0.000009952 s 1.01
actmtch / IPartOpt / cuda / PreRev 0.000010272 s 0.000010017 s 1.03
actmtch / IPartOpt / cuda / PostRev 0.000010144 s 0.000010112 s 1.00
actmtch / IPartOpt / cuda / BothRev 0.000009952 s 0.000009536 s 1.04
actmtch / DefOpt / cuda / PreRev 0.000010208 s 0.000010016 s 1.02
actmtch / DefOpt / cuda / PostRev 0.000010336 s 0.000010049 s 1.03
actmtch / DefOpt / cuda / BothRev 0.000009536 s 0.000010111 s 0.94
actmtch / IDefOpt / cuda / PreRev 0.0000104 s 0.00000976 s 1.07
actmtch / IDefOpt / cuda / PostRev 0.000010304 s 0.000010432 s 0.99
actmtch / IDefOpt / cuda / BothRev 0.000009792 s 0.000009984 s 0.98
actmtch / JaXPipe / tpu / Primal 5.63475e-7 s 5.63675e-7 s 1.00
actmtch / Jax / tpu / Primal 6.06775e-7 s 6.068500000000001e-7 s 1.00
actmtch / HLOOpt / tpu / Primal 0.0000021026 s 0.00000210545 s 1.00
actmtch / PartOpt / tpu / Primal 6.06275e-7 s 6.06625e-7 s 1.00
actmtch / IPartOpt / tpu / Primal 5.628999999999999e-7 s 5.62425e-7 s 1.00
actmtch / DefOpt / tpu / Primal 0.000002165475 s 0.000002154425 s 1.01
actmtch / IDefOpt / tpu / Primal 0.000002111075 s 0.000002103825 s 1.00
actmtch / JaXPipe / tpu / Forward 0.000003830625 s 0.000003831975 s 1.00
actmtch / Jax / tpu / Forward 0.00000122195 s 0.000001222375 s 1.00
actmtch / HLOOpt / tpu / Forward 0.0000039358250000000006 s 0.000003929075000000001 s 1.00
actmtch / PartOpt / tpu / Forward 0.000003911525 s 0.00000391155 s 1.00
actmtch / IPartOpt / tpu / Forward 0.000003925375 s 0.000003937450000000001 s 1.00
actmtch / DefOpt / tpu / Forward 0.000003915025 s 0.000003910825 s 1.00
actmtch / IDefOpt / tpu / Forward 0.000003938025 s 0.000003938025 s 1
actmtch / JaXPipe / tpu / PreRev 0.000003467450000000001 s 0.000003491825 s 0.99
actmtch / JaXPipe / tpu / PostRev 0.00000164135 s 0.000001641675 s 1.00
actmtch / JaXPipe / tpu / BothRev 0.0000034801 s 0.0000034855750000000005 s 1.00
actmtch / Jax / tpu / BothRev 0.00000164795 s 0.00000163335 s 1.01
actmtch / HLOOpt / tpu / PreRev 0.0000034749000000000004 s 0.000003482125 s 1.00
actmtch / HLOOpt / tpu / PostRev 0.000003407175 s 0.0000034108 s 1.00
actmtch / HLOOpt / tpu / BothRev 0.00000348205 s 0.0000034754 s 1.00
actmtch / PartOpt / tpu / PreRev 0.000003413975 s 0.00000341845 s 1.00
actmtch / PartOpt / tpu / PostRev 0.0000015952 s 0.000001588075 s 1.00
actmtch / PartOpt / tpu / BothRev 0.00000340825 s 0.0000034159499999999995 s 1.00
actmtch / IPartOpt / tpu / PreRev 0.000003471125 s 0.0000034752250000000003 s 1.00
actmtch / IPartOpt / tpu / PostRev 0.0000016378 s 0.000001642975 s 1.00
actmtch / IPartOpt / tpu / BothRev 0.0000034876 s 0.0000034939750000000003 s 1.00
actmtch / DefOpt / tpu / PreRev 0.000003413075 s 0.000003421725 s 1.00
actmtch / DefOpt / tpu / PostRev 0.000003418125 s 0.000003405425 s 1.00
actmtch / DefOpt / tpu / BothRev 0.00000340895 s 0.000003413275 s 1.00
actmtch / IDefOpt / tpu / PreRev 0.000003473825 s 0.0000034806 s 1.00
actmtch / IDefOpt / tpu / PostRev 0.00000341255 s 0.0000033974500000000004 s 1.00
actmtch / IDefOpt / tpu / BothRev 0.0000034711750000000003 s 0.0000034731750000000004 s 1.00
actmtch / JaXPipe / cpu / Primal 0.000013682 s 0.0000070109200623846845 s 1.95
actmtch / Jax / cpu / Primal 0.000014005 s 0.00000701169999956619 s 2.00
actmtch / HLOOpt / cpu / Primal 0.000014474 s 0.000010657899974830796 s 1.36
actmtch / PartOpt / cpu / Primal 0.000013438 s 0.000006732300034855143 s 2.00
actmtch / IPartOpt / cpu / Primal 0.00001357 s 0.0000068570599705708445 s 1.98
actmtch / DefOpt / cpu / Primal 0.000014415 s 0.000011533039996720618 s 1.25
actmtch / IDefOpt / cpu / Primal 0.000014195 s 0.000007239939995997702 s 1.96
actmtch / JaXPipe / cpu / Forward 0.000019397 s 0.00001105510002162191 s 1.75
actmtch / Jax / cpu / Forward 0.000017956 s 0.000009777999985089993 s 1.84
actmtch / HLOOpt / cpu / Forward 0.000019521 s 0.000014957540024624904 s 1.31
actmtch / PartOpt / cpu / Forward 0.000018891000000000003 s 0.000015249399966705823 s 1.24
actmtch / IPartOpt / cpu / Forward 0.000019068 s 0.000010884359990086522 s 1.75
actmtch / DefOpt / cpu / Forward 0.000019484 s 0.000015586240015181828 s 1.25
actmtch / IDefOpt / cpu / Forward 0.0000191 s 0.000010999039996022474 s 1.74
actmtch / JaXPipe / cpu / PreRev 0.000019634 s 0.00001187840001875884 s 1.65
actmtch / JaXPipe / cpu / PostRev 0.000017126 s 0.000011447939969002618 s 1.50
actmtch / JaXPipe / cpu / BothRev 0.000020024 s 0.00001139407995651709 s 1.76
actmtch / Jax / cpu / BothRev 0.000017763000000000003 s 0.0000104005599678203 s 1.71
actmtch / HLOOpt / cpu / PreRev 0.000019076 s 0.000011989220020041105 s 1.59
actmtch / HLOOpt / cpu / PostRev 0.000020375 s 0.000015558620016236092 s 1.31
actmtch / HLOOpt / cpu / BothRev 0.000019764 s 0.000013079480004307695 s 1.51
actmtch / PartOpt / cpu / PreRev 0.000019239 s 0.00001192708000417042 s 1.61
actmtch / PartOpt / cpu / PostRev 0.000017598 s 0.000010795760035762214 s 1.63
actmtch / PartOpt / cpu / BothRev 0.000019187 s 0.000011267960035183933 s 1.70
actmtch / IPartOpt / cpu / PreRev 0.00001939 s 0.000013439059994198031 s 1.44
actmtch / IPartOpt / cpu / PostRev 0.000017676999999999997 s 0.000010422020004625665 s 1.70
actmtch / IPartOpt / cpu / BothRev 0.000019786 s 0.000011947300017709494 s 1.66
actmtch / DefOpt / cpu / PreRev 0.00001996 s 0.00001179932004561124 s 1.69
actmtch / DefOpt / cpu / PostRev 0.000019288 s 0.00001211730002069089 s 1.59
actmtch / DefOpt / cpu / BothRev 0.000019826 s 0.000011895040051967951 s 1.67
actmtch / IDefOpt / cpu / PreRev 0.00001941 s 0.000011312639981042591 s 1.72
actmtch / IDefOpt / cpu / PostRev 0.000019606 s 0.000012055839970344096 s 1.63
actmtch / IDefOpt / cpu / BothRev 0.000019789 s 0.00001205063998895639 s 1.64
actmtch / JaXPipe / cpu / Primal 0.000008999999999999999 s 0.0000070109200623846845 s 1.28
actmtch / Jax / cpu / Primal 0.000008999999999999999 s 0.00000701169999956619 s 1.28
actmtch / HLOOpt / cpu / Primal 0.00001 s 0.000010657899974830796 s 0.94
actmtch / PartOpt / cpu / Primal 0.000008999999999999999 s 0.000006732300034855143 s 1.34
actmtch / IPartOpt / cpu / Primal 0.000008999999999999999 s 0.0000068570599705708445 s 1.31
actmtch / DefOpt / cpu / Primal 0.000008999999999999999 s 0.000011533039996720618 s 0.78
actmtch / IDefOpt / cpu / Primal 0.00001 s 0.000007239939995997702 s 1.38
actmtch / JaXPipe / cpu / Forward 0.000013 s 0.00001105510002162191 s 1.18
actmtch / Jax / cpu / Forward 0.000012 s 0.000009777999985089993 s 1.23
actmtch / HLOOpt / cpu / Forward 0.000014 s 0.000014957540024624904 s 0.94
actmtch / PartOpt / cpu / Forward 0.000014 s 0.000015249399966705823 s 0.92
actmtch / IPartOpt / cpu / Forward 0.000013 s 0.000010884359990086522 s 1.19
actmtch / DefOpt / cpu / Forward 0.000013 s 0.000015586240015181828 s 0.83
actmtch / IDefOpt / cpu / Forward 0.000013 s 0.000010999039996022474 s 1.18
actmtch / JaXPipe / cpu / PreRev 0.000014 s 0.00001187840001875884 s 1.18
actmtch / JaXPipe / cpu / PostRev 0.000012 s 0.000011447939969002618 s 1.05
actmtch / JaXPipe / cpu / BothRev 0.000014 s 0.00001139407995651709 s 1.23
actmtch / Jax / cpu / BothRev 0.000012 s 0.0000104005599678203 s 1.15
actmtch / HLOOpt / cpu / PreRev 0.000013 s 0.000011989220020041105 s 1.08
actmtch / HLOOpt / cpu / PostRev 0.000013 s 0.000015558620016236092 s 0.84
actmtch / HLOOpt / cpu / BothRev 0.000014 s 0.000013079480004307695 s 1.07
actmtch / PartOpt / cpu / PreRev 0.000014 s 0.00001192708000417042 s 1.17
actmtch / PartOpt / cpu / PostRev 0.000013 s 0.000010795760035762214 s 1.20
actmtch / PartOpt / cpu / BothRev 0.000015 s 0.000011267960035183933 s 1.33
actmtch / IPartOpt / cpu / PreRev 0.000014 s 0.000013439059994198031 s 1.04
actmtch / IPartOpt / cpu / PostRev 0.000012 s 0.000010422020004625665 s 1.15
actmtch / IPartOpt / cpu / BothRev 0.000014 s 0.000011947300017709494 s 1.17
actmtch / DefOpt / cpu / PreRev 0.000014 s 0.00001179932004561124 s 1.19
actmtch / DefOpt / cpu / PostRev 0.000013 s 0.00001211730002069089 s 1.07
actmtch / DefOpt / cpu / BothRev 0.000013 s 0.000011895040051967951 s 1.09
actmtch / IDefOpt / cpu / PreRev 0.000013 s 0.000011312639981042591 s 1.15
actmtch / IDefOpt / cpu / PostRev 0.000013 s 0.000012055839970344096 s 1.08
actmtch / IDefOpt / cpu / BothRev 0.000013 s 0.00001205063998895639 s 1.08
add_one / JaXPipe / cpu / Primal 0.000008299000010083546 s 0.000007610080019730958 s 1.09
add_one / Jax / cpu / Primal 0.000008125500007736264 s 0.000007455760060111061 s 1.09
add_one / HLOOpt / cpu / Primal 0.000011582559964153917 s 0.000009914300017044298 s 1.17
add_one / PartOpt / cpu / Primal 0.000007942099991851136 s 0.000006775040001230081 s 1.17
add_one / IPartOpt / cpu / Primal 0.000007567600023321574 s 0.0000064760799796204085 s 1.17
add_one / DefOpt / cpu / Primal 0.000011937939971176091 s 0.000010284079962730177 s 1.16
add_one / IDefOpt / cpu / Primal 0.000007066259986459044 s 0.000006921240037627286 s 1.02
add_one / JaXPipe / cpu / Forward 0.000011016719954568545 s 0.00001079442002264841 s 1.02
add_one / Jax / cpu / Forward 0.00001118312003200117 s 0.000010645579977790476 s 1.05
add_one / HLOOpt / cpu / Forward 0.00001585782006259251 s 0.000014617440019719651 s 1.08
add_one / PartOpt / cpu / Forward 0.000016053759991336848 s 0.000015689119982198464 s 1.02
add_one / IPartOpt / cpu / Forward 0.000011629020036707516 s 0.000010346240042053978 s 1.12
add_one / DefOpt / cpu / Forward 0.00001581619997523376 s 0.000010955160005323706 s 1.44
add_one / IDefOpt / cpu / Forward 0.000011246000003666267 s 0.00001038683998558554 s 1.08
add_one / JaXPipe / cpu / PreRev 0.000012999200016565738 s 0.00001245874000233016 s 1.04
add_one / JaXPipe / cpu / PostRev 0.00001272888001949468 s 0.000012182040009065532 s 1.04
add_one / JaXPipe / cpu / BothRev 0.00001680824000686698 s 0.000011788040019382608 s 1.43
add_one / Jax / cpu / BothRev 0.000012509200014392264 s 0.000012370640006338362 s 1.01
add_one / HLOOpt / cpu / PreRev 0.000012748379995173308 s 0.000012197079968245816 s 1.05
add_one / HLOOpt / cpu / PostRev 0.000016790900017440434 s 0.00001592787995832623 s 1.05
add_one / HLOOpt / cpu / BothRev 0.000014303660000223316 s 0.000014361539979290682 s 1.00
add_one / PartOpt / cpu / PreRev 0.000012691959982475964 s 0.000012441599956218852 s 1.02
add_one / PartOpt / cpu / PostRev 0.000012468760014598956 s 0.000012768499964295188 s 0.98
add_one / PartOpt / cpu / BothRev 0.0000124934000086796 s 0.000012874940011897706 s 0.97
add_one / IPartOpt / cpu / PreRev 0.000017635099993640324 s 0.000015121999986149604 s 1.17
add_one / IPartOpt / cpu / PostRev 0.000012381859978631835 s 0.000012084039990440944 s 1.02
add_one / IPartOpt / cpu / BothRev 0.000012873900004706227 s 0.000011877479973918524 s 1.08
add_one / DefOpt / cpu / PreRev 0.00001259321998077212 s 0.000012192400008643745 s 1.03
add_one / DefOpt / cpu / PostRev 0.000013065480043223945 s 0.000012204380000184756 s 1.07
add_one / DefOpt / cpu / BothRev 0.00001291836000746116 s 0.000012285359971428987 s 1.05
add_one / IDefOpt / cpu / PreRev 0.000012939040043420392 s 0.0000128007200146385 s 1.01
add_one / IDefOpt / cpu / PostRev 0.000013799180051137227 s 0.000012469899966163211 s 1.11
add_one / IDefOpt / cpu / BothRev 0.000013007160032429965 s 0.00001239260000147624 s 1.05
add_one / JaXPipe / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / Jax / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / HLOOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / PartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / IPartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / DefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / IDefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / JaXPipe / cuda / Forward 0.000010593 s 0.000010047 s 1.05
add_one / Jax / cuda / Forward 0.000009728 s 0.000009472 s 1.03
add_one / HLOOpt / cuda / Forward 0.000010336 s 0.000010176 s 1.02
add_one / PartOpt / cuda / Forward 0.000010048 s 0.000009888 s 1.02
add_one / IPartOpt / cuda / Forward 0.000010049 s 0.000009952 s 1.01
add_one / DefOpt / cuda / Forward 0.00001024 s 0.000010015 s 1.02
add_one / IDefOpt / cuda / Forward 0.00001024 s 0.00001008 s 1.02
add_one / JaXPipe / cuda / PreRev 0.000025408 s 0.000025792 s 0.99
add_one / JaXPipe / cuda / PostRev 0.00002592 s 0.000025568 s 1.01
add_one / JaXPipe / cuda / BothRev 0.00002576 s 0.000024991 s 1.03
add_one / Jax / cuda / BothRev 0.000025312 s 0.00002528 s 1.00
add_one / HLOOpt / cuda / PreRev 0.00002528 s 0.00002576 s 0.98
add_one / HLOOpt / cuda / PostRev 0.000025025 s 0.000025056 s 1.00
add_one / HLOOpt / cuda / BothRev 0.000025184 s 0.00002512 s 1.00
add_one / PartOpt / cuda / PreRev 0.00002464 s 0.000025215 s 0.98
add_one / PartOpt / cuda / PostRev 0.000025632 s 0.000024768 s 1.03
add_one / PartOpt / cuda / BothRev 0.000026975 s 0.000025024 s 1.08
add_one / IPartOpt / cuda / PreRev 0.000025632 s 0.0000264 s 0.97
add_one / IPartOpt / cuda / PostRev 0.00002576 s 0.000025632 s 1.00
add_one / IPartOpt / cuda / BothRev 0.00002528 s 0.000024607 s 1.03
add_one / DefOpt / cuda / PreRev 0.000025504 s 0.000024896 s 1.02
add_one / DefOpt / cuda / PostRev 0.000025633 s 0.000025184 s 1.02
add_one / DefOpt / cuda / BothRev 0.00002608 s 0.000024928 s 1.05
add_one / IDefOpt / cuda / PreRev 0.000030496 s 0.000024512 s 1.24
add_one / IDefOpt / cuda / PostRev 0.000031008 s 0.000030144 s 1.03
add_one / IDefOpt / cuda / BothRev 0.000025536 s 0.000025215 s 1.01
add_one / JaXPipe / tpu / Primal 0.0000014255000000000002 s 0.0000014410500000000002 s 0.99
add_one / Jax / tpu / Primal 0.000001399025 s 0.00000141065 s 0.99
add_one / HLOOpt / tpu / Primal 0.0000014289250000000005 s 0.0000014297999999999998 s 1.00
add_one / PartOpt / tpu / Primal 0.000001402975 s 0.00000142075 s 0.99
add_one / IPartOpt / tpu / Primal 0.000001433675 s 0.0000014248250000000002 s 1.01
add_one / DefOpt / tpu / Primal 0.000001402475 s 0.0000013976 s 1.00
add_one / IDefOpt / tpu / Primal 0.000001423575 s 0.0000014301 s 1.00
add_one / JaXPipe / tpu / Forward 0.000001855425 s 0.0000018513 s 1.00
add_one / Jax / tpu / Forward 0.000001841025 s 0.000001847325 s 1.00
add_one / HLOOpt / tpu / Forward 0.000001855625 s 0.000001859675 s 1.00
add_one / PartOpt / tpu / Forward 0.000001834875 s 0.000001844325 s 0.99
add_one / IPartOpt / tpu / Forward 0.00000185095 s 0.000001851 s 1.00
add_one / DefOpt / tpu / Forward 0.000001838275 s 0.0000018474 s 1.00
add_one / IDefOpt / tpu / Forward 0.000001848525 s 0.0000018475 s 1.00
add_one / JaXPipe / tpu / PreRev 0.000002244475 s 0.0000022364 s 1.00
add_one / JaXPipe / tpu / PostRev 0.0000022402 s 0.0000022444 s 1.00
add_one / JaXPipe / tpu / BothRev 0.000002243775 s 0.0000022395000000000003 s 1.00
add_one / Jax / tpu / BothRev 0.000002241 s 0.0000022446 s 1.00
add_one / HLOOpt / tpu / PreRev 0.000002235875 s 0.00000223825 s 1.00
add_one / HLOOpt / tpu / PostRev 0.000002241075 s 0.0000022425 s 1.00
add_one / HLOOpt / tpu / BothRev 0.000002239575 s 0.000002243675 s 1.00
add_one / PartOpt / tpu / PreRev 0.00000224675 s 0.000002238825 s 1.00
add_one / PartOpt / tpu / PostRev 0.0000022424 s 0.00000223935 s 1.00
add_one / PartOpt / tpu / BothRev 0.000002242625 s 0.0000022354750000000004 s 1.00
add_one / IPartOpt / tpu / PreRev 0.000002237075 s 0.0000022375999999999995 s 1.00
add_one / IPartOpt / tpu / PostRev 0.000002234775 s 0.00000224575 s 1.00
add_one / IPartOpt / tpu / BothRev 0.0000022371750000000003 s 0.00000224635 s 1.00
add_one / DefOpt / tpu / PreRev 0.000002240325 s 0.0000022418250000000003 s 1.00
add_one / DefOpt / tpu / PostRev 0.0000022512 s 0.000002233525 s 1.01
add_one / DefOpt / tpu / BothRev 0.00000224285 s 0.0000022407 s 1.00
add_one / IDefOpt / tpu / PreRev 0.0000022366 s 0.00000223925 s 1.00
add_one / IDefOpt / tpu / PostRev 0.000002240025 s 0.00000224175 s 1.00
add_one / IDefOpt / tpu / BothRev 0.000002239425 s 0.00000223755 s 1.00
add_one / JaXPipe / cpu / Primal 0.000013098 s 0.000007610080019730958 s 1.72
add_one / Jax / cpu / Primal 0.000013449 s 0.000007455760060111061 s 1.80
add_one / HLOOpt / cpu / Primal 0.000013149 s 0.000009914300017044298 s 1.33
add_one / PartOpt / cpu / Primal 0.000013058 s 0.000006775040001230081 s 1.93
add_one / IPartOpt / cpu / Primal 0.000013057 s 0.0000064760799796204085 s 2.02
add_one / DefOpt / cpu / Primal 0.000012978 s 0.000010284079962730177 s 1.26
add_one / IDefOpt / cpu / Primal 0.000013122 s 0.000006921240037627286 s 1.90
add_one / JaXPipe / cpu / Forward 0.000018002 s 0.00001079442002264841 s 1.67
add_one / Jax / cpu / Forward 0.000017641 s 0.000010645579977790476 s 1.66
add_one / HLOOpt / cpu / Forward 0.000017415 s 0.000014617440019719651 s 1.19
add_one / PartOpt / cpu / Forward 0.000017259 s 0.000015689119982198464 s 1.10
add_one / IPartOpt / cpu / Forward 0.000017624999999999998 s 0.000010346240042053978 s 1.70
add_one / DefOpt / cpu / Forward 0.000017826000000000002 s 0.000010955160005323706 s 1.63
add_one / IDefOpt / cpu / Forward 0.00001763 s 0.00001038683998558554 s 1.70
add_one / JaXPipe / cpu / PreRev 0.000020413 s 0.00001245874000233016 s 1.64
add_one / JaXPipe / cpu / PostRev 0.000020815 s 0.000012182040009065532 s 1.71
add_one / JaXPipe / cpu / BothRev 0.000020504 s 0.000011788040019382608 s 1.74
add_one / Jax / cpu / BothRev 0.000019606 s 0.000012370640006338362 s 1.58
add_one / HLOOpt / cpu / PreRev 0.000019897000000000003 s 0.000012197079968245816 s 1.63
add_one / HLOOpt / cpu / PostRev 0.000020442 s 0.00001592787995832623 s 1.28
add_one / HLOOpt / cpu / BothRev 0.000019423 s 0.000014361539979290682 s 1.35
add_one / PartOpt / cpu / PreRev 0.000019763 s 0.000012441599956218852 s 1.59
add_one / PartOpt / cpu / PostRev 0.000020281 s 0.000012768499964295188 s 1.59
add_one / PartOpt / cpu / BothRev 0.000020445 s 0.000012874940011897706 s 1.59
add_one / IPartOpt / cpu / PreRev 0.000019924 s 0.000015121999986149604 s 1.32
add_one / IPartOpt / cpu / PostRev 0.000020368 s 0.000012084039990440944 s 1.69
add_one / IPartOpt / cpu / BothRev 0.000020543 s 0.000011877479973918524 s 1.73
add_one / DefOpt / cpu / PreRev 0.00002032 s 0.000012192400008643745 s 1.67
add_one / DefOpt / cpu / PostRev 0.000020467 s 0.000012204380000184756 s 1.68
add_one / DefOpt / cpu / BothRev 0.000020383 s 0.000012285359971428987 s 1.66
add_one / IDefOpt / cpu / PreRev 0.000019767 s 0.0000128007200146385 s 1.54
add_one / IDefOpt / cpu / PostRev 0.000020056 s 0.000012469899966163211 s 1.61
add_one / IDefOpt / cpu / BothRev 0.000019824 s 0.00001239260000147624 s 1.60
add_one / JaXPipe / cpu / Primal 0.000008999999999999999 s 0.000007610080019730958 s 1.18
add_one / Jax / cpu / Primal 0.000008 s 0.000007455760060111061 s 1.07
add_one / HLOOpt / cpu / Primal 0.000008999999999999999 s 0.000009914300017044298 s 0.91
add_one / PartOpt / cpu / Primal 0.000008 s 0.000006775040001230081 s 1.18
add_one / IPartOpt / cpu / Primal 0.000008999999999999999 s 0.0000064760799796204085 s 1.39
add_one / DefOpt / cpu / Primal 0.000008999999999999999 s 0.000010284079962730177 s 0.88
add_one / IDefOpt / cpu / Primal 0.000008999999999999999 s 0.000006921240037627286 s 1.30
add_one / JaXPipe / cpu / Forward 0.000012 s 0.00001079442002264841 s 1.11
add_one / Jax / cpu / Forward 0.000013 s 0.000010645579977790476 s 1.22
add_one / HLOOpt / cpu / Forward 0.000012 s 0.000014617440019719651 s 0.82
add_one / PartOpt / cpu / Forward 0.000012 s 0.000015689119982198464 s 0.76
add_one / IPartOpt / cpu / Forward 0.000013 s 0.000010346240042053978 s 1.26
add_one / DefOpt / cpu / Forward 0.000012 s 0.000010955160005323706 s 1.10
add_one / IDefOpt / cpu / Forward 0.000012 s 0.00001038683998558554 s 1.16
add_one / JaXPipe / cpu / PreRev 0.000014 s 0.00001245874000233016 s 1.12
add_one / JaXPipe / cpu / PostRev 0.000014 s 0.000012182040009065532 s 1.15
add_one / JaXPipe / cpu / BothRev 0.000013 s 0.000011788040019382608 s 1.10
add_one / Jax / cpu / BothRev 0.000013 s 0.000012370640006338362 s 1.05
add_one / HLOOpt / cpu / PreRev 0.000013 s 0.000012197079968245816 s 1.07
add_one / HLOOpt / cpu / PostRev 0.000014 s 0.00001592787995832623 s 0.88
add_one / HLOOpt / cpu / BothRev 0.000014 s 0.000014361539979290682 s 0.97
add_one / PartOpt / cpu / PreRev 0.000013 s 0.000012441599956218852 s 1.04
add_one / PartOpt / cpu / PostRev 0.000014 s 0.000012768499964295188 s 1.10
add_one / PartOpt / cpu / BothRev 0.000014 s 0.000012874940011897706 s 1.09
add_one / IPartOpt / cpu / PreRev 0.000043 s 0.000015121999986149604 s 2.84
add_one / IPartOpt / cpu / PostRev 0.000017 s 0.000012084039990440944 s 1.41
add_one / IPartOpt / cpu / BothRev 0.000014 s 0.000011877479973918524 s 1.18
add_one / DefOpt / cpu / PreRev 0.000014 s 0.000012192400008643745 s 1.15
add_one / DefOpt / cpu / PostRev 0.000014 s 0.000012204380000184756 s 1.15
add_one / DefOpt / cpu / BothRev 0.000015 s 0.000012285359971428987 s 1.22
add_one / IDefOpt / cpu / PreRev 0.000014 s 0.0000128007200146385 s 1.09
add_one / IDefOpt / cpu / PostRev 0.000029 s 0.000012469899966163211 s 2.33
add_one / IDefOpt / cpu / BothRev 0.000015 s 0.00001239260000147624 s 1.21
add_two / JaXPipe / cpu / Primal 0.000008081239984676358 s 0.000008162819976860192 s 0.99
add_two / Jax / cpu / Primal 0.000006833220004409668 s 0.000007155640005294117 s 0.95
add_two / HLOOpt / cpu / Primal 0.000011251159985476989 s 0.000010254219951093546 s 1.10
add_two / PartOpt / cpu / Primal 0.00000686575999679917 s 0.00000718430002052628 s 0.96
add_two / IPartOpt / cpu / Primal 0.00000699128000633209 s 0.000007414760048050084 s 0.94
add_two / DefOpt / cpu / Primal 0.000011676739995891694 s 0.000011154560015711467 s 1.05
add_two / IDefOpt / cpu / Primal 0.000007351920003202395 s 0.0000070370199864555614 s 1.04
add_two / JaXPipe / cpu / Forward 0.000010913339965554769 s 0.00001075002000106906 s 1.02
add_two / Jax / cpu / Forward 0.00001083413998458127 s 0.000010894060023929342 s 0.99
add_two / HLOOpt / cpu / Forward 0.00001650566001444531 s 0.00001475107999795 s 1.12
add_two / PartOpt / cpu / Forward 0.00001595563996488636 s 0.00001495366000199283 s 1.07
add_two / IPartOpt / cpu / Forward 0.000011088040009781253 s 0.000011008659985236593 s 1.01
add_two / DefOpt / cpu / Forward 0.000016130100048030725 s 0.000015363799975602886 s 1.05
add_two / IDefOpt / cpu / Forward 0.00001074757999049325 s 0.000011219039997740765 s 0.96
add_two / JaXPipe / cpu / PreRev 0.000015303739965020212 s 0.000014846300009594416 s 1.03
add_two / JaXPipe / cpu / PostRev 0.00001550980005049496 s 0.000014568139995390084 s 1.06
add_two / JaXPipe / cpu / BothRev 0.00001549651997265755 s 0.0000145148999945377 s 1.07
add_two / Jax / cpu / BothRev 0.00001604763999239367 s 0.000015333739975176285 s 1.05
add_two / HLOOpt / cpu / PreRev 0.000015250699989337593 s 0.000014390839960469749 s 1.06
add_two / HLOOpt / cpu / PostRev 0.000015635000036127167 s 0.000014757260023543497 s 1.06
add_two / HLOOpt / cpu / BothRev 0.00001686870000412455 s 0.000016447680000055698 s 1.03
add_two / PartOpt / cpu / PreRev 0.000015333320006902796 s 0.000014473599985649344 s 1.06
add_two / PartOpt / cpu / PostRev 0.000015136059973883677 s 0.000014772960003028856 s 1.02
add_two / PartOpt / cpu / BothRev 0.00001565096005833766 s 0.000014620759993704268 s 1.07
add_two / IPartOpt / cpu / PreRev 0.00001541286003885034 s 0.00001467320001211192 s 1.05
add_two / IPartOpt / cpu / PostRev 0.000015139179977268212 s 0.000014904119971106412 s 1.02
add_two / IPartOpt / cpu / BothRev 0.000014759420009795576 s 0.000015015659992059229 s 0.98
add_two / DefOpt / cpu / PreRev 0.000015000220018919208 s 0.000014867620029690442 s 1.01
add_two / DefOpt / cpu / PostRev 0.00001492532003794622 s 0.000014927759948477615 s 1.00
add_two / DefOpt / cpu / BothRev 0.00001518806000603945 s 0.000014545599970006152 s 1.04
add_two / IDefOpt / cpu / PreRev 0.000015363639986389897 s 0.000015273900016836707 s 1.01
add_two / IDefOpt / cpu / PostRev 0.000016271180002149776 s 0.000015041340038806083 s 1.08
add_two / IDefOpt / cpu / BothRev 0.0000153297399810981 s 0.000015036300028441474 s 1.02
add_two / JaXPipe / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
add_two / Jax / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
add_two / HLOOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
add_two / PartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
add_two / IPartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
add_two / DefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
add_two / IDefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
add_two / JaXPipe / cuda / Forward 0.000009952 s 0.000009984 s 1.00
add_two / Jax / cuda / Forward 0.00000992 s 0.000010081 s 0.98
add_two / HLOOpt / cuda / Forward 0.000010048 s 0.000009248 s 1.09
add_two / PartOpt / cuda / Forward 0.000009536 s 0.000009504 s 1.00
add_two / IPartOpt / cuda / Forward 0.000009985 s 0.000009312000000000002 s 1.07
add_two / DefOpt / cuda / Forward 0.000009728 s 0.000009825 s 0.99
add_two / IDefOpt / cuda / Forward 0.000009728 s 0.000009568 s 1.02
add_two / JaXPipe / cuda / PreRev 0.000033216 s 0.000031648 s 1.05
add_two / JaXPipe / cuda / PostRev 0.000033376 s 0.00003248 s 1.03
add_two / JaXPipe / cuda / BothRev 0.000032864 s 0.000032832 s 1.00
add_two / Jax / cuda / BothRev 0.000033376 s 0.00003184 s 1.05
add_two / HLOOpt / cuda / PreRev 0.000032416 s 0.000031616 s 1.03
add_two / HLOOpt / cuda / PostRev 0.00003488 s 0.00003232 s 1.08
add_two / HLOOpt / cuda / BothRev 0.00003296 s 0.000032032 s 1.03
add_two / PartOpt / cuda / PreRev 0.000033056 s 0.000032736 s 1.01
add_two / PartOpt / cuda / PostRev 0.000033119999999999995 s 0.000033088 s 1.00
add_two / PartOpt / cuda / BothRev 0.000037504000000000005 s 0.000031551 s 1.19
add_two / IPartOpt / cuda / PreRev 0.000032544 s 0.000032160000000000004 s 1.01
add_two / IPartOpt / cuda / PostRev 0.000032767999999999995 s 0.000031456 s 1.04
add_two / IPartOpt / cuda / BothRev 0.000037857 s 0.000042144 s 0.90
add_two / DefOpt / cuda / PreRev 0.000032511 s 0.000032800000000000004 s 0.99
add_two / DefOpt / cuda / PostRev 0.000033152000000000004 s 0.000032416 s 1.02
add_two / DefOpt / cuda / BothRev 0.000033119999999999995 s 0.000031424 s 1.05
add_two / IDefOpt / cuda / PreRev 0.000033216 s 0.000032928 s 1.01
add_two / IDefOpt / cuda / PostRev 0.00003328 s 0.000032864 s 1.01
add_two / IDefOpt / cuda / BothRev 0.000032928 s 0.000032704 s 1.01
add_two / JaXPipe / tpu / Primal 0.000001437425 s 0.0000014289250000000005 s 1.01
add_two / Jax / tpu / Primal 0.000001474975 s 0.00000147225 s 1.00
add_two / HLOOpt / tpu / Primal 0.000001439975 s 0.0000014288999999999998 s 1.01
add_two / PartOpt / tpu / Primal 0.000001481575 s 0.0000014854 s 1.00
add_two / IPartOpt / tpu / Primal 0.0000014359749999999998 s 0.0000014466750000000002 s 0.99
add_two / DefOpt / tpu / Primal 0.0000014767 s 0.00000148395 s 1.00
add_two / IDefOpt / tpu / Primal 0.000001453475 s 0.000001445225 s 1.01
add_two / JaXPipe / tpu / Forward 0.00000182825 s 0.00000184105 s 0.99
add_two / Jax / tpu / Forward 0.000001831375 s 0.000001822225 s 1.01
add_two / HLOOpt / tpu / Forward 0.0000018235 s 0.000001831225 s 1.00
add_two / PartOpt / tpu / Forward 0.000001841225 s 0.000001831625 s 1.01
add_two / IPartOpt / tpu / Forward 0.000001835575 s 0.000001822125 s 1.01
add_two / DefOpt / tpu / Forward 0.00000183175 s 0.000001831125 s 1.00
add_two / IDefOpt / tpu / Forward 0.000001822 s 0.000001831125 s 1.00
add_two / JaXPipe / tpu / PreRev 0.00000284215 s 0.0000028381250000000005 s 1.00
add_two / JaXPipe / tpu / PostRev 0.000002771025 s 0.0000027414 s 1.01
add_two / JaXPipe / tpu / BothRev 0.000002847975 s 0.0000028322750000000003 s 1.01
add_two / Jax / tpu / BothRev 0.00000275845 s 0.0000027434750000000004 s 1.01
add_two / HLOOpt / tpu / PreRev 0.000002842975 s 0.0000028451 s 1.00
add_two / HLOOpt / tpu / PostRev 0.0000027551750000000003 s 0.000002758575 s 1.00
add_two / HLOOpt / tpu / BothRev 0.000002841875 s 0.000002828825 s 1.00
add_two / PartOpt / tpu / PreRev 0.0000027578000000000005 s 0.00000274625 s 1.00
add_two / PartOpt / tpu / PostRev 0.000002840025 s 0.000002832975 s 1.00
add_two / PartOpt / tpu / BothRev 0.0000027477 s 0.000002757475 s 1.00
add_two / IPartOpt / tpu / PreRev 0.00000283175 s 0.000002834275 s 1.00
add_two / IPartOpt / tpu / PostRev 0.0000027611 s 0.00000275975 s 1.00
add_two / IPartOpt / tpu / BothRev 0.000002843725 s 0.000002849325 s 1.00
add_two / DefOpt / tpu / PreRev 0.0000027543 s 0.000002746525 s 1.00
add_two / DefOpt / tpu / PostRev 0.000002845525 s 0.000002838725 s 1.00
add_two / DefOpt / tpu / BothRev 0.00000275045 s 0.00000275765 s 1.00
add_two / IDefOpt / tpu / PreRev 0.0000028452250000000004 s 0.000002837125 s 1.00
add_two / IDefOpt / tpu / PostRev 0.00000274845 s 0.000002754725 s 1.00
add_two / IDefOpt / tpu / BothRev 0.0000028410000000000004 s 0.00000283905 s 1.00
add_two / JaXPipe / cpu / Primal 0.000013492 s 0.000008162819976860192 s 1.65
add_two / Jax / cpu / Primal 0.000013586 s 0.000007155640005294117 s 1.90
add_two / HLOOpt / cpu / Primal 0.000013784 s 0.000010254219951093546 s 1.34
add_two / PartOpt / cpu / Primal 0.000013663 s 0.00000718430002052628 s 1.90
add_two / IPartOpt / cpu / Primal 0.000013583 s 0.000007414760048050084 s 1.83
add_two / DefOpt / cpu / Primal 0.000013295 s 0.000011154560015711467 s 1.19
add_two / IDefOpt / cpu / Primal 0.000013198 s 0.0000070370199864555614 s 1.88
add_two / JaXPipe / cpu / Forward 0.000018122 s 0.00001075002000106906 s 1.69
add_two / Jax / cpu / Forward 0.000017913 s 0.000010894060023929342 s 1.64
add_two / HLOOpt / cpu / Forward 0.000018168 s 0.00001475107999795 s 1.23
add_two / PartOpt / cpu / Forward 0.00001777 s 0.00001495366000199283 s 1.19
add_two / IPartOpt / cpu / Forward 0.000017646 s 0.000011008659985236593 s 1.60
add_two / DefOpt / cpu / Forward 0.000017829999999999997 s 0.000015363799975602886 s 1.16
add_two / IDefOpt / cpu / Forward 0.000017694 s 0.000011219039997740765 s 1.58
add_two / JaXPipe / cpu / PreRev 0.000023832 s 0.000014846300009594416 s 1.61
add_two / JaXPipe / cpu / PostRev 0.00002366 s 0.000014568139995390084 s 1.62
add_two / JaXPipe / cpu / BothRev 0.00002468 s 0.0000145148999945377 s 1.70
add_two / Jax / cpu / BothRev 0.000023778 s 0.000015333739975176285 s 1.55
add_two / HLOOpt / cpu / PreRev 0.000023849 s 0.000014390839960469749 s 1.66
add_two / HLOOpt / cpu / PostRev 0.000024789 s 0.000014757260023543497 s 1.68
add_two / HLOOpt / cpu / BothRev 0.000023377 s 0.000016447680000055698 s 1.42
add_two / PartOpt / cpu / PreRev 0.000023058 s 0.000014473599985649344 s 1.59
add_two / PartOpt / cpu / PostRev 0.0000247 s 0.000014772960003028856 s 1.67
add_two / PartOpt / cpu / BothRev 0.000024312 s 0.000014620759993704268 s 1.66
add_two / IPartOpt / cpu / PreRev 0.000023005 s 0.00001467320001211192 s 1.57
add_two / IPartOpt / cpu / PostRev 0.000024355000000000003 s 0.000014904119971106412 s 1.63
add_two / IPartOpt / cpu / BothRev 0.000024546 s 0.000015015659992059229 s 1.63
add_two / DefOpt / cpu / PreRev 0.000023485 s 0.000014867620029690442 s 1.58
add_two / DefOpt / cpu / PostRev 0.000024138 s 0.000014927759948477615 s 1.62
add_two / DefOpt / cpu / BothRev 0.00002461 s 0.000014545599970006152 s 1.69
add_two / IDefOpt / cpu / PreRev 0.000023423 s 0.000015273900016836707 s 1.53
add_two / IDefOpt / cpu / PostRev 0.00003692 s 0.000015041340038806083 s 2.45
add_two / IDefOpt / cpu / BothRev 0.000024064 s 0.000015036300028441474 s 1.60
add_two / JaXPipe / cpu / Primal 0.000008999999999999999 s 0.000008162819976860192 s 1.10
add_two / Jax / cpu / Primal 0.000008999999999999999 s 0.000007155640005294117 s 1.26
add_two / HLOOpt / cpu / Primal 0.000008999999999999999 s 0.000010254219951093546 s 0.88
add_two / PartOpt / cpu / Primal 0.000008 s 0.00000718430002052628 s 1.11
add_two / IPartOpt / cpu / Primal 0.000008999999999999999 s 0.000007414760048050084 s 1.21
add_two / DefOpt / cpu / Primal 0.000008999999999999999 s 0.000011154560015711467 s 0.81
add_two / IDefOpt / cpu / Primal 0.000008999999999999999 s 0.0000070370199864555614 s 1.28
add_two / JaXPipe / cpu / Forward 0.000013 s 0.00001075002000106906 s 1.21
add_two / Jax / cpu / Forward 0.000012 s 0.000010894060023929342 s 1.10
add_two / HLOOpt / cpu / Forward 0.000013 s 0.00001475107999795 s 0.88
add_two / PartOpt / cpu / Forward 0.000012 s 0.00001495366000199283 s 0.80
add_two / IPartOpt / cpu / Forward 0.000012 s 0.000011008659985236593 s 1.09
add_two / DefOpt / cpu / Forward 0.000013 s 0.000015363799975602886 s 0.85
add_two / IDefOpt / cpu / Forward 0.000014 s 0.000011219039997740765 s 1.25
add_two / JaXPipe / cpu / PreRev 0.000016 s 0.000014846300009594416 s 1.08
add_two / JaXPipe / cpu / PostRev 0.000017 s 0.000014568139995390084 s 1.17
add_two / JaXPipe / cpu / BothRev 0.000017 s 0.0000145148999945377 s 1.17
add_two / Jax / cpu / BothRev 0.000016 s 0.000015333739975176285 s 1.04
add_two / HLOOpt / cpu / PreRev 0.000017 s 0.000014390839960469749 s 1.18
add_two / HLOOpt / cpu / PostRev 0.000016 s 0.000014757260023543497 s 1.08
add_two / HLOOpt / cpu / BothRev 0.000017 s 0.000016447680000055698 s 1.03
add_two / PartOpt / cpu / PreRev 0.000016 s 0.000014473599985649344 s 1.11
add_two / PartOpt / cpu / PostRev 0.000017 s 0.000014772960003028856 s 1.15
add_two / PartOpt / cpu / BothRev 0.000016 s 0.000014620759993704268 s 1.09
add_two / IPartOpt / cpu / PreRev 0.000052 s 0.00001467320001211192 s 3.54
add_two / IPartOpt / cpu / PostRev 0.000016 s 0.000014904119971106412 s 1.07
add_two / IPartOpt / cpu / BothRev 0.000016 s 0.000015015659992059229 s 1.07
add_two / DefOpt / cpu / PreRev 0.000017 s 0.000014867620029690442 s 1.14
add_two / DefOpt / cpu / PostRev 0.000016 s 0.000014927759948477615 s 1.07
add_two / DefOpt / cpu / BothRev 0.000016 s 0.000014545599970006152 s 1.10
add_two / IDefOpt / cpu / PreRev 0.000015 s 0.000015273900016836707 s 0.98
add_two / IDefOpt / cpu / PostRev 0.000016 s 0.000015041340038806083 s 1.06
add_two / IDefOpt / cpu / BothRev 0.000016 s 0.000015036300028441474 s 1.06
cache / JaXPipe / cpu / Primal 0.000007032439998511108 s 0.000007066839998515206 s 1.00
cache / Jax / cpu / Primal 0.0000069124799665587485 s 0.000007472959996448481 s 0.92
cache / HLOOpt / cpu / Primal 0.000006889740006954526 s 0.000006779060013286653 s 1.02
cache / PartOpt / cpu / Primal 0.000007360560002780403 s 0.000006673139987469767 s 1.10
cache / IPartOpt / cpu / Primal 0.000007197780032583978 s 0.000007064180017550825 s 1.02
cache / DefOpt / cpu / Primal 0.000007525120045102085 s 0.000006838919962319778 s 1.10
cache / IDefOpt / cpu / Primal 0.000007224980026876437 s 0.0000064925199876597614 s 1.11
cache / JaXPipe / cpu / Forward 0.000014878199963277438 s 0.00001453619999665534 s 1.02
cache / Jax / cpu / Forward 0.000014946679984859656 s 0.000014236959959816886 s 1.05
cache / HLOOpt / cpu / Forward 0.00001939286004017049 s 0.000019094519993814176 s 1.02
cache / PartOpt / cpu / Forward 0.000020110299992666117 s 0.000018744980025076075 s 1.07
cache / IPartOpt / cpu / Forward 0.000014990820027378504 s 0.000013940860044385773 s 1.08
cache / DefOpt / cpu / Forward 0.00002035759999671427 s 0.000023298279975279 s 0.87
cache / IDefOpt / cpu / Forward 0.000015212139987852423 s 0.000013929639981142828 s 1.09
cache / JaXPipe / cpu / PreRev 0.000016578160011704312 s 0.00001627784004085697 s 1.02
cache / JaXPipe / cpu / PostRev 0.00002125364003404684 s 0.00002106017997903109 s 1.01
cache / JaXPipe / cpu / BothRev 0.00001933483998072916 s 0.000015854440007387894 s 1.22
cache / Jax / cpu / BothRev 0.000020725140029753677 s 0.000021164679992580204 s 0.98
cache / HLOOpt / cpu / PreRev 0.00001697658003649849 s 0.00001555754000946763 s 1.09
cache / HLOOpt / cpu / PostRev 0.000017357039996568347 s 0.00001774892000867112 s 0.98
cache / HLOOpt / cpu / BothRev 0.00001915232001010736 s 0.000017897059997267206 s 1.07
cache / PartOpt / cpu / PreRev 0.000016229460006798034 s 0.000015700039957664556 s 1.03
cache / PartOpt / cpu / PostRev 0.000020753379967572986 s 0.000026045420017908327 s 0.80
cache / PartOpt / cpu / BothRev 0.000016577880023760373 s 0.000015716900006736977 s 1.05
cache / IPartOpt / cpu / PreRev 0.00002309939998667687 s 0.00001823539995712053 s 1.27
cache / IPartOpt / cpu / PostRev 0.00002162253998903907 s 0.000021174960011194344 s 1.02
cache / IPartOpt / cpu / BothRev 0.000016900039963729795 s 0.000015372979996755022 s 1.10
cache / DefOpt / cpu / PreRev 0.000017515500003355557 s 0.000015136819974941318 s 1.16
cache / DefOpt / cpu / PostRev 0.00002289114001541748 s 0.000016664260010657016 s 1.37
cache / DefOpt / cpu / BothRev 0.00001779193999027484 s 0.000016260779993899632 s 1.09
cache / IDefOpt / cpu / PreRev 0.000017509880008219626 s 0.000016183900052055832 s 1.08
cache / IDefOpt / cpu / PostRev 0.00001735616003315954 s 0.000015626220010744875 s 1.11
cache / IDefOpt / cpu / BothRev 0.000018139799940399823 s 0.000015151899979173325 s 1.20
cache / JaXPipe / cuda / Primal 0.000002304 s 0.0000023050000000000004 s 1.00
cache / Jax / cuda / Primal 0.000002272 s 0.000002272 s 1.00
cache / HLOOpt / cuda / Primal 0.000002272 s 0.000002273 s 1.00
cache / PartOpt / cuda / Primal 0.00000224 s 0.00000224 s 1.00
cache / IPartOpt / cuda / Primal 0.000002273 s 0.000002272 s 1.00
cache / DefOpt / cuda / Primal 0.000002272 s 0.00000224 s 1.01
cache / IDefOpt / cuda / Primal 0.000002304 s 0.000002304 s 1.00
cache / JaXPipe / cuda / Forward 0.000002335 s 0.000002336 s 1.00
cache / Jax / cuda / Forward 0.000002304 s 0.000002304 s 1.00
cache / HLOOpt / cuda / Forward 0.000002336 s 0.000002335 s 1.00
cache / PartOpt / cuda / Forward 0.000002335 s 0.000002335 s 1.00
cache / IPartOpt / cuda / Forward 0.000002273 s 0.000002304 s 0.99
cache / DefOpt / cuda / Forward 0.00000224 s 0.000002272 s 0.99
cache / IDefOpt / cuda / Forward 0.000002273 s 0.000002304 s 0.99
cache / JaXPipe / cuda / PreRev 0.000011520000000000002 s 0.000011264 s 1.02
cache / JaXPipe / cuda / PostRev 0.000011616 s 0.000011520000000000002 s 1.01
cache / JaXPipe / cuda / BothRev 0.000011584 s 0.000012192 s 0.95
cache / Jax / cuda / BothRev 0.000011488 s 0.000011520000000000002 s 1.00
cache / HLOOpt / cuda / PreRev 0.000013504 s 0.000013248 s 1.02
cache / HLOOpt / cuda / PostRev 0.000013535 s 0.000013215 s 1.02
cache / HLOOpt / cuda / BothRev 0.000013504 s 0.000013216 s 1.02
cache / PartOpt / cuda / PreRev 0.0000112 s 0.000011775 s 0.95
cache / PartOpt / cuda / PostRev 0.00001184 s 0.000011584 s 1.02
cache / PartOpt / cuda / BothRev 0.000011071 s 0.000011264 s 0.98
cache / IPartOpt / cuda / PreRev 0.000012096 s 0.000011584 s 1.04
cache / IPartOpt / cuda / PostRev 0.000011488 s 0.00001168 s 0.98
cache / IPartOpt / cuda / BothRev 0.000011392 s 0.000011712 s 0.97
cache / DefOpt / cuda / PreRev 0.000011328 s 0.00001136 s 1.00
cache / DefOpt / cuda / PostRev 0.000011392 s 0.000011776 s 0.97
cache / DefOpt / cuda / BothRev 0.000011296 s 0.000011392 s 0.99
cache / IDefOpt / cuda / PreRev 0.0000112 s 0.000011488 s 0.97
cache / IDefOpt / cuda / PostRev 0.000011520000000000002 s 0.000011488 s 1.00
cache / IDefOpt / cuda / BothRev 0.000011712 s 0.000011936 s 0.98
cache / JaXPipe / tpu / Primal 0.000002481025 s 0.0000024641 s 1.01
cache / Jax / tpu / Primal 0.000002459425 s 0.0000024560250000000003 s 1.00
cache / HLOOpt / tpu / Primal 0.000002454125 s 0.000002472575 s 0.99
cache / PartOpt / tpu / Primal 0.00000246745 s 0.000002467225 s 1.00
cache / IPartOpt / tpu / Primal 0.000002488475 s 0.000002477975 s 1.00
cache / DefOpt / tpu / Primal 0.0000024755750000000004 s 0.000002458625 s 1.01
cache / IDefOpt / tpu / Primal 0.000002447475 s 0.0000024739 s 0.99
cache / JaXPipe / tpu / Forward 0.000003560075 s 0.000003538975 s 1.01
cache / Jax / tpu / Forward 0.00000354345 s 0.000003541325 s 1.00
cache / HLOOpt / tpu / Forward 0.00000353675 s 0.000003561925 s 0.99
cache / PartOpt / tpu / Forward 0.0000035326000000000003 s 0.0000035528749999999994 s 0.99
cache / IPartOpt / tpu / Forward 0.0000035527 s 0.0000035541 s 1.00
cache / DefOpt / tpu / Forward 0.0000035262749999999995 s 0.000003528425 s 1.00
cache / IDefOpt / tpu / Forward 0.0000035397 s 0.00000352755 s 1.00
cache / JaXPipe / tpu / PreRev 0.000004992750000000001 s 0.000004963125 s 1.01
cache / JaXPipe / tpu / PostRev 0.000005001175 s 0.000004970225 s 1.01
cache / JaXPipe / tpu / BothRev 0.000005002 s 0.00000497175 s 1.01
cache / Jax / tpu / BothRev 0.000005012075000000001 s 0.00000498175 s 1.01
cache / HLOOpt / tpu / PreRev 0.00000398085 s 0.0000039377 s 1.01
cache / HLOOpt / tpu / PostRev 0.0000041382 s 0.000004116650000000001 s 1.01
cache / HLOOpt / tpu / BothRev 0.00000398985 s 0.0000039415 s 1.01
cache / PartOpt / tpu / PreRev 0.00000503165 s 0.00000497795 s 1.01
cache / PartOpt / tpu / PostRev 0.00000502305 s 0.000004961175 s 1.01
cache / PartOpt / tpu / BothRev 0.0000050404 s 0.000004959475 s 1.02
cache / IPartOpt / tpu / PreRev 0.0000050637750000000005 s 0.000004961225000000001 s 1.02
cache / IPartOpt / tpu / PostRev 0.000005027975000000001 s 0.000004974049999999999 s 1.01
cache / IPartOpt / tpu / BothRev 0.000005021975 s 0.000004962675 s 1.01
cache / DefOpt / tpu / PreRev 0.000005029125 s 0.000004962375 s 1.01
cache / DefOpt / tpu / PostRev 0.0000050181 s 0.000004966875 s 1.01
cache / DefOpt / tpu / BothRev 0.000005005950000000001 s 0.00000497465 s 1.01
cache / IDefOpt / tpu / PreRev 0.000005043174999999999 s 0.00000495725 s 1.02
cache / IDefOpt / tpu / PostRev 0.000005048850000000001 s 0.000004968925 s 1.02
cache / IDefOpt / tpu / BothRev 0.000005042225 s 0.000004959 s 1.02
cache / JaXPipe / cpu / Primal 0.000013327 s 0.000007066839998515206 s 1.89
cache / Jax / cpu / Primal 0.00001314 s 0.000007472959996448481 s 1.76
cache / HLOOpt / cpu / Primal 0.000012941 s 0.000006779060013286653 s 1.91
cache / PartOpt / cpu / Primal 0.00001315 s 0.000006673139987469767 s 1.97
cache / IPartOpt / cpu / Primal 0.000012747 s 0.000007064180017550825 s 1.80
cache / DefOpt / cpu / Primal 0.000012732 s 0.000006838919962319778 s 1.86
cache / IDefOpt / cpu / Primal 0.000012795 s 0.0000064925199876597614 s 1.97
cache / JaXPipe / cpu / Forward 0.000017901 s 0.00001453619999665534 s 1.23
cache / Jax / cpu / Forward 0.000017477 s 0.000014236959959816886 s 1.23
cache / HLOOpt / cpu / Forward 0.000017316 s 0.000019094519993814176 s 0.91
cache / PartOpt / cpu / Forward 0.000017473 s 0.000018744980025076075 s 0.93
cache / IPartOpt / cpu / Forward 0.000017687000000000002 s 0.000013940860044385773 s 1.27
cache / DefOpt / cpu / Forward 0.000017196 s 0.000023298279975279 s 0.74
cache / IDefOpt / cpu / Forward 0.000017542 s 0.000013929639981142828 s 1.26
cache / JaXPipe / cpu / PreRev 0.000017177000000000002 s 0.00001627784004085697 s 1.06
cache / JaXPipe / cpu / PostRev 0.000019992 s 0.00002106017997903109 s 0.95
cache / JaXPipe / cpu / BothRev 0.000017902000000000002 s 0.000015854440007387894 s 1.13
cache / Jax / cpu / BothRev 0.000020881 s 0.000021164679992580204 s 0.99
cache / HLOOpt / cpu / PreRev 0.000018121 s 0.00001555754000946763 s 1.16
cache / HLOOpt / cpu / PostRev 0.000017554 s 0.00001774892000867112 s 0.99
cache / HLOOpt / cpu / BothRev 0.000017754 s 0.000017897059997267206 s 0.99
cache / PartOpt / cpu / PreRev 0.000017331 s 0.000015700039957664556 s 1.10
cache / PartOpt / cpu / PostRev 0.000020391 s 0.000026045420017908327 s 0.78
cache / PartOpt / cpu / BothRev 0.000018021 s 0.000015716900006736977 s 1.15
cache / IPartOpt / cpu / PreRev 0.00001814 s 0.00001823539995712053 s 0.99
cache / IPartOpt / cpu / PostRev 0.000020742 s 0.000021174960011194344 s 0.98
cache / IPartOpt / cpu / BothRev 0.00001804 s 0.000015372979996755022 s 1.17
cache / DefOpt / cpu / PreRev 0.000017622 s 0.000015136819974941318 s 1.16
cache / DefOpt / cpu / PostRev 0.000017899999999999998 s 0.000016664260010657016 s 1.07
cache / DefOpt / cpu / BothRev 0.000017853 s 0.000016260779993899632 s 1.10
cache / IDefOpt / cpu / PreRev 0.000017255 s 0.000016183900052055832 s 1.07
cache / IDefOpt / cpu / PostRev 0.000018713 s 0.000015626220010744875 s 1.20
cache / IDefOpt / cpu / BothRev 0.000018281 s 0.000015151899979173325 s 1.21
cache / JaXPipe / cpu / Primal 0.000008999999999999999 s 0.000007066839998515206 s 1.27
cache / Jax / cpu / Primal 0.00003 s 0.000007472959996448481 s 4.01
cache / HLOOpt / cpu / Primal 0.000008 s 0.000006779060013286653 s 1.18
cache / PartOpt / cpu / Primal 0.000008 s 0.000006673139987469767 s 1.20
cache / IPartOpt / cpu / Primal 0.000008 s 0.000007064180017550825 s 1.13
cache / DefOpt / cpu / Primal 0.000008 s 0.000006838919962319778 s 1.17
cache / IDefOpt / cpu / Primal 0.000008999999999999999 s 0.0000064925199876597614 s 1.39
cache / JaXPipe / cpu / Forward 0.000035000000000000004 s 0.00001453619999665534 s 2.41
cache / Jax / cpu / Forward 0.000025 s 0.000014236959959816886 s 1.76
cache / HLOOpt / cpu / Forward 0.000035000000000000004 s 0.000019094519993814176 s 1.83
cache / PartOpt / cpu / Forward 0.000037 s 0.000018744980025076075 s 1.97
cache / IPartOpt / cpu / Forward 0.00001 s 0.000013940860044385773 s 0.72
cache / DefOpt / cpu / Forward 0.000017 s 0.000023298279975279 s 0.73
cache / IDefOpt / cpu / Forward 0.000017999999999999997 s 0.000013929639981142828 s 1.29
cache / JaXPipe / cpu / PreRev 0.000011 s 0.00001627784004085697 s 0.68
cache / JaXPipe / cpu / PostRev 0.000046 s 0.00002106017997903109 s 2.18
cache / JaXPipe / cpu / BothRev 0.000035999999999999994 s 0.000015854440007387894 s 2.27
cache / Jax / cpu / BothRev 0.000035999999999999994 s 0.000021164679992580204 s 1.70
cache / HLOOpt / cpu / PreRev 0.000011 s 0.00001555754000946763 s 0.71
cache / HLOOpt / cpu / PostRev 0.000011 s 0.00001774892000867112 s 0.62
cache / HLOOpt / cpu / BothRev 0.000014 s 0.000017897059997267206 s 0.78
cache / PartOpt / cpu / PreRev 0.000013 s 0.000015700039957664556 s 0.83
cache / PartOpt / cpu / PostRev 0.000013 s 0.000026045420017908327 s 0.50
cache / PartOpt / cpu / BothRev 0.000011 s 0.000015716900006736977 s 0.70
cache / IPartOpt / cpu / PreRev 0.000011 s 0.00001823539995712053 s 0.60
cache / IPartOpt / cpu / PostRev 0.00003 s 0.000021174960011194344 s 1.42
cache / IPartOpt / cpu / BothRev 0.000017 s 0.000015372979996755022 s 1.11
cache / DefOpt / cpu / PreRev 0.000011 s 0.000015136819974941318 s 0.73
cache / DefOpt / cpu / PostRev 0.000011 s 0.000016664260010657016 s 0.66
cache / DefOpt / cpu / BothRev 0.000035000000000000004 s 0.000016260779993899632 s 2.15
cache / IDefOpt / cpu / PreRev 0.000011 s 0.000016183900052055832 s 0.68
cache / IDefOpt / cpu / PostRev 0.00001 s 0.000015626220010744875 s 0.64
cache / IDefOpt / cpu / BothRev 0.000011 s 0.000015151899979173325 s 0.73
Concat / JaXPipe / cpu / Primal 0.00000829906004582881 s 0.00000728931998310145 s 1.14
Concat / Jax / cpu / Primal 0.000007768720042804489 s 0.000007061659989631152 s 1.10
Concat / HLOOpt / cpu / Primal 0.000010710960032156435 s 0.000009589940036676126 s 1.12
Concat / PartOpt / cpu / Primal 0.000007859679999455694 s 0.00000695204000294325 s 1.13
Concat / IPartOpt / cpu / Primal 0.000007252259983943075 s 0.000006810799986851635 s 1.06
Concat / DefOpt / cpu / Primal 0.000011469840028439647 s 0.000010120979968633036 s 1.13
Concat / IDefOpt / cpu / Primal 0.000007617159963047016 s 0.000006804140002714121 s 1.12
Concat / JaXPipe / cpu / Forward 0.000011274760008745944 s 0.000010196720004387315 s 1.11
Concat / Jax / cpu / Forward 0.000011028039998564057 s 0.00001051378000738623 s 1.05
Concat / HLOOpt / cpu / Forward 0.00001542666002023907 s 0.000014242419974834774 s 1.08
Concat / PartOpt / cpu / Forward 0.00001604325998414424 s 0.000014837000007901224 s 1.08
Concat / IPartOpt / cpu / Forward 0.000011542860002009548 s 0.000010930519947578431 s 1.06
Concat / DefOpt / cpu / Forward 0.0000161349199788674 s 0.00001516530002845684 s 1.06
Concat / IDefOpt / cpu / Forward 0.00001182561995847209 s 0.000010654359984982876 s 1.11
Concat / JaXPipe / cpu / PreRev 0.000013263659984659171 s 0.000012291220000406611 s 1.08
Concat / JaXPipe / cpu / PostRev 0.000013080599965178408 s 0.000012073859988959156 s 1.08
Concat / JaXPipe / cpu / BothRev 0.000012982260068383768 s 0.000011819039991678438 s 1.10
Concat / Jax / cpu / BothRev 0.000012999519994991716 s 0.000012705760009339428 s 1.02
Concat / HLOOpt / cpu / PreRev 0.00001275908000934578 s 0.000011614619997999398 s 1.10
Concat / HLOOpt / cpu / PostRev 0.000016885180011740887 s 0.00001240181997673062 s 1.36
Concat / HLOOpt / cpu / BothRev 0.000014768179989914642 s 0.000013595179925687262 s 1.09
Concat / PartOpt / cpu / PreRev 0.000012778419986716473 s 0.000011691240015352378 s 1.09
Concat / PartOpt / cpu / PostRev 0.000013143920041329691 s 0.00001208772000609315 s 1.09
Concat / PartOpt / cpu / BothRev 0.00001331450002908241 s 0.000011353420031809946 s 1.17
Concat / IPartOpt / cpu / PreRev 0.000018031419967883268 s 0.000012248699995325296 s 1.47
Concat / IPartOpt / cpu / PostRev 0.000012552600010167224 s 0.000011892720031028149 s 1.06
Concat / IPartOpt / cpu / BothRev 0.000012262579994057886 s 0.00001183884000965918 s 1.04
Concat / DefOpt / cpu / PreRev 0.000012731639999401523 s 0.000012121159907110268 s 1.05
Concat / DefOpt / cpu / PostRev 0.000013212040003054426 s 0.000011294820005787189 s 1.17
Concat / DefOpt / cpu / BothRev 0.000012634260001505026 s 0.000012426319981386767 s 1.02
Concat / IDefOpt / cpu / PreRev 0.000012809499994546058 s 0.000011653779965854484 s 1.10
Concat / IDefOpt / cpu / PostRev 0.000012846619965785066 s 0.00001219744000081846 s 1.05
Concat / IDefOpt / cpu / BothRev 0.000012837839985877508 s 0.00001177469997855951 s 1.09
Concat / JaXPipe / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
Concat / Jax / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
Concat / HLOOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
Concat / PartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
Concat / IPartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
Concat / DefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
Concat / IDefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
Concat / JaXPipe / cuda / Forward 0.000010209 s 0.000009729 s 1.05
Concat / Jax / cuda / Forward 0.000009952 s 0.000009824 s 1.01
Concat / HLOOpt / cuda / Forward 0.000010176 s 0.000009951 s 1.02
Concat / PartOpt / cuda / Forward 0.000010048 s 0.00000976 s 1.03
Concat / IPartOpt / cuda / Forward 0.000010336 s 0.000009824 s 1.05
Concat / DefOpt / cuda / Forward 0.000009792 s 0.000009792 s 1.00
Concat / IDefOpt / cuda / Forward 0.000010176 s 0.000009856 s 1.03
Concat / JaXPipe / cuda / PreRev 0.000016704 s 0.00001664 s 1.00
Concat / JaXPipe / cuda / PostRev 0.000016544 s 0.000016576000000000002 s 1.00
Concat / JaXPipe / cuda / BothRev 0.000016448000000000002 s 0.000016544 s 0.99
Concat / Jax / cuda / BothRev 0.000016576000000000002 s 0.000016544 s 1.00
Concat / HLOOpt / cuda / PreRev 0.000016864 s 0.000016672 s 1.01
Concat / HLOOpt / cuda / PostRev 0.000016608 s 0.000016448000000000002 s 1.01
Concat / HLOOpt / cuda / BothRev 0.000016352 s 0.00001664 s 0.98
Concat / PartOpt / cuda / PreRev 0.0000168 s 0.000016832 s 1.00
Concat / PartOpt / cuda / PostRev 0.000016321 s 0.000016608 s 0.98
Concat / PartOpt / cuda / BothRev 0.000017087 s 0.000016608 s 1.03
Concat / IPartOpt / cuda / PreRev 0.000016352 s 0.000016768000000000003 s 0.98
Concat / IPartOpt / cuda / PostRev 0.000016224 s 0.000016 s 1.01
Concat / IPartOpt / cuda / BothRev 0.000016416 s 0.00001632 s 1.01
Concat / DefOpt / cuda / PreRev 0.000016736 s 0.000016416 s 1.02
Concat / DefOpt / cuda / PostRev 0.000016608 s 0.000016768000000000003 s 0.99
Concat / DefOpt / cuda / BothRev 0.000015904000000000002 s 0.000016737 s 0.95
Concat / IDefOpt / cuda / PreRev 0.000016832 s 0.000017151 s 0.98
Concat / IDefOpt / cuda / PostRev 0.000016768000000000003 s 0.000018272 s 0.92
Concat / IDefOpt / cuda / BothRev 0.000016575 s 0.000016161 s 1.03
Concat / JaXPipe / tpu / Primal 0.0000015347 s 0.00000153695 s 1.00
Concat / Jax / tpu / Primal 0.000001520925 s 0.0000015382000000000002 s 0.99
Concat / HLOOpt / tpu / Primal 0.000001532775 s 0.0000015347 s 1.00
Concat / PartOpt / tpu / Primal 0.0000015308250000000005 s 0.000001523 s 1.01
Concat / IPartOpt / tpu / Primal 0.000001540875 s 0.0000015328249999999998 s 1.01
Concat / DefOpt / tpu / Primal 0.000001525625 s 0.000001518975 s 1.00
Concat / IDefOpt / tpu / Primal 0.00000153045 s 0.00000152715 s 1.00
Concat / JaXPipe / tpu / Forward 0.0000015790499999999995 s 0.0000015834999999999995 s 1.00
Concat / Jax / tpu / Forward 0.0000015596750000000002 s 0.000001556725 s 1.00
Concat / HLOOpt / tpu / Forward 0.0000015848750000000002 s 0.0000015840750000000002 s 1.00
Concat / PartOpt / tpu / Forward 0.000001563025 s 0.0000015588750000000002 s 1.00
Concat / IPartOpt / tpu / Forward 0.000001587725 s 0.0000015833 s 1.00
Concat / DefOpt / tpu / Forward 0.000001566775 s 0.000001565725 s 1.00
Concat / IDefOpt / tpu / Forward 0.000001583725 s 0.0000015870750000000002 s 1.00
Concat / JaXPipe / tpu / PreRev 0.000002010675 s 0.00000201065 s 1.00
Concat / JaXPipe / tpu / PostRev 0.00000207025 s 0.0000020853 s 0.99
Concat / JaXPipe / tpu / BothRev 0.00000201945 s 0.0000020050000000000003 s 1.01
Concat / Jax / tpu / BothRev 0.000002076 s 0.000002069025 s 1.00
Concat / HLOOpt / tpu / PreRev 0.000002023475 s 0.000002011075 s 1.01
Concat / HLOOpt / tpu / PostRev 0.000002075375 s 0.000002083 s 1.00
Concat / HLOOpt / tpu / BothRev 0.0000020157 s 0.0000020125 s 1.00
Concat / PartOpt / tpu / PreRev 0.00000207855 s 0.000002075825 s 1.00
Concat / PartOpt / tpu / PostRev 0.0000020102000000000003 s 0.000002008925 s 1.00
Concat / PartOpt / tpu / BothRev 0.000002066275 s 0.000002089875 s 0.99
Concat / IPartOpt / tpu / PreRev 0.0000020143 s 0.000002010675 s 1.00
Concat / IPartOpt / tpu / PostRev 0.00000206245 s 0.000002073175 s 0.99
Concat / IPartOpt / tpu / BothRev 0.0000020061750000000003 s 0.0000020139000000000003 s 1.00
Concat / DefOpt / tpu / PreRev 0.0000020683999999999995 s 0.0000020753 s 1.00
Concat / DefOpt / tpu / PostRev 0.000002005575 s 0.000002000725 s 1.00
Concat / DefOpt / tpu / BothRev 0.000002070475 s 0.000002083775 s 0.99
Concat / IDefOpt / tpu / PreRev 0.000002012925 s 0.0000020033 s 1.00
Concat / IDefOpt / tpu / PostRev 0.000002065875 s 0.00000207225 s 1.00
Concat / IDefOpt / tpu / BothRev 0.000002008625 s 0.00000200575 s 1.00
Concat / JaXPipe / cpu / Primal 0.000013135 s 0.00000728931998310145 s 1.80
Concat / Jax / cpu / Primal 0.000013291 s 0.000007061659989631152 s 1.88
Concat / HLOOpt / cpu / Primal 0.000012851 s 0.000009589940036676126 s 1.34
Concat / PartOpt / cpu / Primal 0.000012989 s 0.00000695204000294325 s 1.87
Concat / IPartOpt / cpu / Primal 0.000012825 s 0.000006810799986851635 s 1.88
Concat / DefOpt / cpu / Primal 0.000013006 s 0.000010120979968633036 s 1.29
Concat / IDefOpt / cpu / Primal 0.000012924 s 0.000006804140002714121 s 1.90
Concat / JaXPipe / cpu / Forward 0.000017676999999999997 s 0.000010196720004387315 s 1.73
Concat / Jax / cpu / Forward 0.000017198 s 0.00001051378000738623 s 1.64
Concat / HLOOpt / cpu / Forward 0.00001711 s 0.000014242419974834774 s 1.20
Concat / PartOpt / cpu / Forward 0.000017670000000000002 s 0.000014837000007901224 s 1.19
Concat / IPartOpt / cpu / Forward 0.000017327 s 0.000010930519947578431 s 1.59
Concat / DefOpt / cpu / Forward 0.000017514 s 0.00001516530002845684 s 1.15
Concat / IDefOpt / cpu / Forward 0.000017696 s 0.000010654359984982876 s 1.66
Concat / JaXPipe / cpu / PreRev 0.000020065 s 0.000012291220000406611 s 1.63
Concat / JaXPipe / cpu / PostRev 0.00001984 s 0.000012073859988959156 s 1.64
Concat / JaXPipe / cpu / BothRev 0.000019791 s 0.000011819039991678438 s 1.67
Concat / Jax / cpu / BothRev 0.000019968 s 0.000012705760009339428 s 1.57
Concat / HLOOpt / cpu / PreRev 0.00002003 s 0.000011614619997999398 s 1.72
Concat / HLOOpt / cpu / PostRev 0.000020277 s 0.00001240181997673062 s 1.64
Concat / HLOOpt / cpu / BothRev 0.000019494 s 0.000013595179925687262 s 1.43
Concat / PartOpt / cpu / PreRev 0.000020321 s 0.000011691240015352378 s 1.74
Concat / PartOpt / cpu / PostRev 0.000020031 s 0.00001208772000609315 s 1.66
Concat / PartOpt / cpu / BothRev 0.000019756 s 0.000011353420031809946 s 1.74
Concat / IPartOpt / cpu / PreRev 0.000019728 s 0.000012248699995325296 s 1.61
Concat / IPartOpt / cpu / PostRev 0.000019552 s 0.000011892720031028149 s 1.64
Concat / IPartOpt / cpu / BothRev 0.000019684 s 0.00001183884000965918 s 1.66
Concat / DefOpt / cpu / PreRev 0.000020076 s 0.000012121159907110268 s 1.66
Concat / DefOpt / cpu / PostRev 0.000020269 s 0.000011294820005787189 s 1.79
Concat / DefOpt / cpu / BothRev 0.000020537 s 0.000012426319981386767 s 1.65
Concat / IDefOpt / cpu / PreRev 0.000020073 s 0.000011653779965854484 s 1.72
Concat / IDefOpt / cpu / PostRev 0.000020138 s 0.00001219744000081846 s 1.65
Concat / IDefOpt / cpu / BothRev 0.000019838 s 0.00001177469997855951 s 1.68
Concat / JaXPipe / cpu / Primal 0.000008999999999999999 s 0.00000728931998310145 s 1.23
Concat / Jax / cpu / Primal 0.000008999999999999999 s 0.000007061659989631152 s 1.27
Concat / HLOOpt / cpu / Primal 0.000008999999999999999 s 0.000009589940036676126 s 0.94
Concat / PartOpt / cpu / Primal 0.000008999999999999999 s 0.00000695204000294325 s 1.29
Concat / IPartOpt / cpu / Primal 0.000008999999999999999 s 0.000006810799986851635 s 1.32
Concat / DefOpt / cpu / Primal 0.000008999999999999999 s 0.000010120979968633036 s 0.89
Concat / IDefOpt / cpu / Primal 0.000008999999999999999 s 0.000006804140002714121 s 1.32
Concat / JaXPipe / cpu / Forward 0.000013 s 0.000010196720004387315 s 1.27
Concat / Jax / cpu / Forward 0.000012 s 0.00001051378000738623 s 1.14
Concat / HLOOpt / cpu / Forward 0.000012 s 0.000014242419974834774 s 0.84
Concat / PartOpt / cpu / Forward 0.000012 s 0.000014837000007901224 s 0.81
Concat / IPartOpt / cpu / Forward 0.000013 s 0.000010930519947578431 s 1.19
Concat / DefOpt / cpu / Forward 0.000041 s 0.00001516530002845684 s 2.70
Concat / IDefOpt / cpu / Forward 0.000013 s 0.000010654359984982876 s 1.22
Concat / JaXPipe / cpu / PreRev 0.000015 s 0.000012291220000406611 s 1.22
Concat / JaXPipe / cpu / PostRev 0.000014 s 0.000012073859988959156 s 1.16
Concat / JaXPipe / cpu / BothRev 0.000014 s 0.000011819039991678438 s 1.18
Concat / Jax / cpu / BothRev 0.000014 s 0.000012705760009339428 s 1.10
Concat / HLOOpt / cpu / PreRev 0.000015 s 0.000011614619997999398 s 1.29
Concat / HLOOpt / cpu / PostRev 0.000014 s 0.00001240181997673062 s 1.13
Concat / HLOOpt / cpu / BothRev 0.000015 s 0.000013595179925687262 s 1.10
Concat / PartOpt / cpu / PreRev 0.000013 s 0.000011691240015352378 s 1.11
Concat / PartOpt / cpu / PostRev 0.000015 s 0.00001208772000609315 s 1.24
Concat / PartOpt / cpu / BothRev 0.000014 s 0.000011353420031809946 s 1.23
Concat / IPartOpt / cpu / PreRev 0.000044 s 0.000012248699995325296 s 3.59
Concat / IPartOpt / cpu / PostRev 0.000014 s 0.000011892720031028149 s 1.18
Concat / IPartOpt / cpu / BothRev 0.000015 s 0.00001183884000965918 s 1.27
Concat / DefOpt / cpu / PreRev 0.000014 s 0.000012121159907110268 s 1.16
Concat / DefOpt / cpu / PostRev 0.000014 s 0.000011294820005787189 s 1.24
Concat / DefOpt / cpu / BothRev 0.000023 s 0.000012426319981386767 s 1.85
Concat / IDefOpt / cpu / PreRev 0.000014 s 0.000011653779965854484 s 1.20
Concat / IDefOpt / cpu / PostRev 0.000014 s 0.00001219744000081846 s 1.15
Concat / IDefOpt / cpu / BothRev 0.000015 s 0.00001177469997855951 s 1.27
const_scatter / JaXPipe / cpu / Primal 0.000007858459948693053 s 0.000007763400026306044 s 1.01
const_scatter / Jax / cpu / Primal 0.000008319000025949208 s 0.0000070458600202982776 s 1.18
const_scatter / HLOOpt / cpu / Primal 0.000008191460010493756 s 0.000006994660006967024 s 1.17
const_scatter / PartOpt / cpu / Primal 0.000007210560015664669 s 0.0000072007200014923 s 1.00
const_scatter / IPartOpt / cpu / Primal 0.000007383600004686741 s 0.000006805479988543084 s 1.08
const_scatter / DefOpt / cpu / Primal 0.000007526819972554222 s 0.000006607559998883516 s 1.14
const_scatter / IDefOpt / cpu / Primal 0.000007296140047401423 s 0.000006990259953454369 s 1.04
const_scatter / JaXPipe / cpu / Forward 0.00001072375997864583 s 0.000010474540013092336 s 1.02
const_scatter / Jax / cpu / Forward 0.000010594060013318083 s 0.000010772199975690456 s 0.98
const_scatter / HLOOpt / cpu / Forward 0.000014750539976375877 s 0.000014144699998723808 s 1.04
const_scatter / PartOpt / cpu / Forward 0.000015150840017668088 s 0.000015396779990624053 s 0.98
const_scatter / IPartOpt / cpu / Forward 0.000010379679997640778 s 0.000010131839981113443 s 1.02
const_scatter / DefOpt / cpu / Forward 0.000015649040033167694 s 0.000014296299987108796 s 1.09
const_scatter / IDefOpt / cpu / Forward 0.000010864899986700038 s 0.00001000824006041512 s 1.09
const_scatter / JaXPipe / cpu / PreRev 0.0003033570000206 s 0.0003034807000403 s 1.00
const_scatter / JaXPipe / cpu / PostRev 0.0002936581799713 s 0.0002982542200425 s 0.98
const_scatter / JaXPipe / cpu / BothRev 0.0002863510400129 s 0.000285023639999 s 1.00
const_scatter / Jax / cpu / BothRev 0.0002861106400359 s 0.0002842879400213 s 1.01
const_scatter / HLOOpt / cpu / PreRev 0.0002875174600103 s 0.0002853217399842 s 1.01
const_scatter / HLOOpt / cpu / PostRev 0.0002900775599755 s 0.0002907451599548 s 1.00
const_scatter / HLOOpt / cpu / BothRev 0.0002879207200021 s 0.0002864135400341 s 1.01
const_scatter / PartOpt / cpu / PreRev 0.0002904777799994 s 0.0002835488199889 s 1.02
const_scatter / PartOpt / cpu / PostRev 0.0002927011999963 s 0.0002898025000013 s 1.01
const_scatter / PartOpt / cpu / BothRev 0.0002876077000109 s 0.0002821240400044 s 1.02
const_scatter / IPartOpt / cpu / PreRev 0.0002914237799723 s 0.0002919069400104 s 1.00
const_scatter / IPartOpt / cpu / PostRev 0.000291716099955 s 0.0002938434999578 s 0.99
const_scatter / IPartOpt / cpu / BothRev 0.0002864443399721 s 0.0002843726199625 s 1.01
const_scatter / DefOpt / cpu / PreRev 0.0002932473599958 s 0.0002895347999765 s 1.01
const_scatter / DefOpt / cpu / PostRev 0.0002930403200025 s 0.000291647899985 s 1.00
const_scatter / DefOpt / cpu / BothRev 0.0002866178000112 s 0.0002939875399533 s 0.97
const_scatter / IDefOpt / cpu / PreRev 0.000290391260014 s 0.0002921339599561 s 0.99
const_scatter / IDefOpt / cpu / PostRev 0.0002924229199834 s 0.0002916513599848 s 1.00
const_scatter / IDefOpt / cpu / BothRev 0.0002868765000403 s 0.0002838417199109 s 1.01
const_scatter / JaXPipe / cuda / Primal 0.000001887 s 0.000001887 s 1.00
const_scatter / Jax / cuda / Primal 0.000001887 s 0.000001887 s 1.00
const_scatter / HLOOpt / cuda / Primal 0.000001888 s 0.000001887 s 1.00
const_scatter / PartOpt / cuda / Primal 0.000001887 s 0.000001887 s 1.00
const_scatter / IPartOpt / cuda / Primal 0.000001887 s 0.000001887 s 1.00
const_scatter / DefOpt / cuda / Primal 0.000001887 s 0.000001888 s 1.00
const_scatter / IDefOpt / cuda / Primal 0.000001887 s 0.000001887 s 1.00
const_scatter / JaXPipe / cuda / Forward 0.000010144 s 0.000009664 s 1.05
const_scatter / Jax / cuda / Forward 0.000010048 s 0.00000944 s 1.06
const_scatter / HLOOpt / cuda / Forward 0.000010208 s 0.000009728 s 1.05
const_scatter / PartOpt / cuda / Forward 0.000009888 s 0.00000992 s 1.00
const_scatter / IPartOpt / cuda / Forward 0.000009856 s 0.000009665 s 1.02
const_scatter / DefOpt / cuda / Forward 0.00001024 s 0.000009535 s 1.07
const_scatter / IDefOpt / cuda / Forward 0.000010304 s 0.000009952 s 1.04
const_scatter / JaXPipe / cuda / PreRev 0.000012575 s 0.000012736 s 0.99
const_scatter / JaXPipe / cuda / PostRev 0.000025568 s 0.0000168 s 1.52
const_scatter / JaXPipe / cuda / BothRev 0.000014016 s 0.000012736 s 1.10
const_scatter / Jax / cuda / BothRev 0.000018752000000000003 s 0.000016608 s 1.13
const_scatter / HLOOpt / cuda / PreRev 0.00001312 s 0.000012672 s 1.04
const_scatter / HLOOpt / cuda / PostRev 0.000013632 s 0.000012512 s 1.09
const_scatter / HLOOpt / cuda / BothRev 0.00001376 s 0.00001296 s 1.06
const_scatter / PartOpt / cuda / PreRev 0.000014272 s 0.000012992 s 1.10
const_scatter / PartOpt / cuda / PostRev 0.000016416 s 0.000016383999999999998 s 1.00
const_scatter / PartOpt / cuda / BothRev 0.000013024 s 0.000012736 s 1.02
const_scatter / IPartOpt / cuda / PreRev 0.000012448 s 0.000012769 s 0.97
const_scatter / IPartOpt / cuda / PostRev 0.000016383999999999998 s 0.000016416 s 1.00
const_scatter / IPartOpt / cuda / BothRev 0.000012384 s 0.000012256 s 1.01
const_scatter / DefOpt / cuda / PreRev 0.000012448 s 0.000012576 s 0.99
const_scatter / DefOpt / cuda / PostRev 0.000012929 s 0.00001264 s 1.02
const_scatter / DefOpt / cuda / BothRev 0.000012864 s 0.000013344 s 0.96
const_scatter / IDefOpt / cuda / PreRev 0.000012896 s 0.000013088 s 0.99
const_scatter / IDefOpt / cuda / PostRev 0.000012767 s 0.0000128 s 1.00
const_scatter / IDefOpt / cuda / BothRev 0.000012768 s 0.000012767 s 1.00
const_scatter / JaXPipe / tpu / Primal 0.00000377545 s 0.0000038034 s 0.99
const_scatter / Jax / tpu / Primal 0.00000383505 s 0.00000380865 s 1.01
const_scatter / HLOOpt / tpu / Primal 9.53425e-7 s 9.24775e-7 s 1.03
const_scatter / PartOpt / tpu / Primal 0.000003806675 s 0.00000381505 s 1.00
const_scatter / IPartOpt / tpu / Primal 0.000003772525 s 0.000003789125 s 1.00
const_scatter / DefOpt / tpu / Primal 9.73975e-7 s 9.593500000000002e-7 s 1.02
const_scatter / IDefOpt / tpu / Primal 9.67225e-7 s 9.3435e-7 s 1.04
const_scatter / JaXPipe / tpu / Forward 0.000001938325 s 0.0000019250500000000003 s 1.01
const_scatter / Jax / tpu / Forward 0.000006491025 s 0.000006493175000000001 s 1.00
const_scatter / HLOOpt / tpu / Forward 0.000001925175 s 0.000001916875 s 1.00
const_scatter / PartOpt / tpu / Forward 0.000001958975 s 0.000001943925 s 1.01
const_scatter / IPartOpt / tpu / Forward 0.00000192605 s 0.00000192015 s 1.00
const_scatter / DefOpt / tpu / Forward 0.0000019687 s 0.000001923975 s 1.02
const_scatter / IDefOpt / tpu / Forward 0.000001933 s 0.0000019232 s 1.01
const_scatter / JaXPipe / tpu / PreRev 0.0000043261 s 0.000004320925 s 1.00
const_scatter / JaXPipe / tpu / PostRev 0.0000066103 s 0.00000660905 s 1.00
const_scatter / JaXPipe / tpu / BothRev 0.00000431415 s 0.00000429955 s 1.00
const_scatter / Jax / tpu / BothRev 0.0000066142 s 0.000006660325 s 0.99
const_scatter / HLOOpt / tpu / PreRev 0.0000043207 s 0.000004301925 s 1.00
const_scatter / HLOOpt / tpu / PostRev 0.000004309425000000001 s 0.000004297575000000001 s 1.00
const_scatter / HLOOpt / tpu / BothRev 0.00000431585 s 0.00000430105 s 1.00
const_scatter / PartOpt / tpu / PreRev 0.0000043174750000000005 s 0.0000043072 s 1.00
const_scatter / PartOpt / tpu / PostRev 0.00000660145 s 0.000006595149999999999 s 1.00
const_scatter / PartOpt / tpu / BothRev 0.000004308375 s 0.0000043066500000000005 s 1.00
const_scatter / IPartOpt / tpu / PreRev 0.0000043242 s 0.000004302775 s 1.00
const_scatter / IPartOpt / tpu / PostRev 0.000006626975 s 0.000006622050000000001 s 1.00
const_scatter / IPartOpt / tpu / BothRev 0.00000432295 s 0.000004293175 s 1.01
const_scatter / DefOpt / tpu / PreRev 0.000004297425000000001 s 0.000004297425000000001 s 1.00
const_scatter / DefOpt / tpu / PostRev 0.000004309725 s 0.0000043024250000000006 s 1.00
const_scatter / DefOpt / tpu / BothRev 0.00000430405 s 0.000004299799999999999 s 1.00
const_scatter / IDefOpt / tpu / PreRev 0.0000043157 s 0.000004311499999999999 s 1.00
const_scatter / IDefOpt / tpu / PostRev 0.000004318675 s 0.00000428725 s 1.01
const_scatter / IDefOpt / tpu / BothRev 0.0000043286 s 0.00000428755 s 1.01
const_scatter / JaXPipe / cpu / Primal 0.000012945 s 0.000007763400026306044 s 1.67
const_scatter / Jax / cpu / Primal 0.000013333 s 0.0000070458600202982776 s 1.89
const_scatter / HLOOpt / cpu / Primal 0.000012977 s 0.000006994660006967024 s 1.86
const_scatter / PartOpt / cpu / Primal 0.000012791 s 0.0000072007200014923 s 1.78
const_scatter / IPartOpt / cpu / Primal 0.000012577 s 0.000006805479988543084 s 1.85
const_scatter / DefOpt / cpu / Primal 0.000012885 s 0.000006607559998883516 s 1.95
const_scatter / IDefOpt / cpu / Primal 0.000012843 s 0.000006990259953454369 s 1.84
const_scatter / JaXPipe / cpu / Forward 0.000017576999999999998 s 0.000010474540013092336 s 1.68
const_scatter / Jax / cpu / Forward 0.000016751 s 0.000010772199975690456 s 1.56
const_scatter / HLOOpt / cpu / Forward 0.000016712000000000002 s 0.000014144699998723808 s 1.18
const_scatter / PartOpt / cpu / Forward 0.000016934 s 0.000015396779990624053 s 1.10
const_scatter / IPartOpt / cpu / Forward 0.00001694 s 0.000010131839981113443 s 1.67
const_scatter / DefOpt / cpu / Forward 0.000016749 s 0.000014296299987108796 s 1.17
const_scatter / IDefOpt / cpu / Forward 0.000016804 s 0.00001000824006041512 s 1.68
const_scatter / JaXPipe / cpu / PreRev 0.0004947759999999 s 0.0003034807000403 s 1.63
const_scatter / JaXPipe / cpu / PostRev 0.0005280479999999 s 0.0002982542200425 s 1.77
const_scatter / JaXPipe / cpu / BothRev 0.000522243 s 0.000285023639999 s 1.83
const_scatter / Jax / cpu / BothRev 0.000516483 s 0.0002842879400213 s 1.82
const_scatter / HLOOpt / cpu / PreRev 0.000515135 s 0.0002853217399842 s 1.81
const_scatter / HLOOpt / cpu / PostRev 0.0005107709999999 s 0.0002907451599548 s 1.76
const_scatter / HLOOpt / cpu / BothRev 0.000491423 s 0.0002864135400341 s 1.72
const_scatter / PartOpt / cpu / PreRev 0.000503264 s 0.0002835488199889 s 1.77
const_scatter / PartOpt / cpu / PostRev 0.00050828 s 0.0002898025000013 s 1.75
const_scatter / PartOpt / cpu / BothRev 0.000525003 s 0.0002821240400044 s 1.86
const_scatter / IPartOpt / cpu / PreRev 0.000500425 s 0.0002919069400104 s 1.71
const_scatter / IPartOpt / cpu / PostRev 0.000511533 s 0.0002938434999578 s 1.74
const_scatter / IPartOpt / cpu / BothRev 0.00050068 s 0.0002843726199625 s 1.76
const_scatter / DefOpt / cpu / PreRev 0.0005138019999999 s 0.0002895347999765 s 1.77
const_scatter / DefOpt / cpu / PostRev 0.000523517 s 0.000291647899985 s 1.80
const_scatter / DefOpt / cpu / BothRev 0.000533237 s 0.0002939875399533 s 1.81
const_scatter / IDefOpt / cpu / PreRev 0.000517185 s 0.0002921339599561 s 1.77
const_scatter / IDefOpt / cpu / PostRev 0.000521314 s 0.0002916513599848 s 1.79
const_scatter / IDefOpt / cpu / BothRev 0.000501794 s 0.0002838417199109 s 1.77
const_scatter / JaXPipe / cpu / Primal 0.000008 s 0.000007763400026306044 s 1.03
const_scatter / Jax / cpu / Primal 0.000008 s 0.0000070458600202982776 s 1.14
const_scatter / HLOOpt / cpu / Primal 0.000008999999999999999 s 0.000006994660006967024 s 1.29
const_scatter / PartOpt / cpu / Primal 0.000011 s 0.0000072007200014923 s 1.53
const_scatter / IPartOpt / cpu / Primal 0.000008999999999999999 s 0.000006805479988543084 s 1.32
const_scatter / DefOpt / cpu / Primal 0.000008999999999999999 s 0.000006607559998883516 s 1.36
const_scatter / IDefOpt / cpu / Primal 0.000008999999999999999 s 0.000006990259953454369 s 1.29
const_scatter / JaXPipe / cpu / Forward 0.000012 s 0.000010474540013092336 s 1.15
const_scatter / Jax / cpu / Forward 0.000038 s 0.000010772199975690456 s 3.53
const_scatter / HLOOpt / cpu / Forward 0.000013 s 0.000014144699998723808 s 0.92
const_scatter / PartOpt / cpu / Forward 0.000012 s 0.000015396779990624053 s 0.78
const_scatter / IPartOpt / cpu / Forward 0.000041 s 0.000010131839981113443 s 4.05
const_scatter / DefOpt / cpu / Forward 0.000012 s 0.000014296299987108796 s 0.84
const_scatter / IDefOpt / cpu / Forward 0.000013 s 0.00001000824006041512 s 1.30
const_scatter / JaXPipe / cpu / PreRev 0.000499 s 0.0003034807000403 s 1.64
const_scatter / JaXPipe / cpu / PostRev 0.00035 s 0.0002982542200425 s 1.17
const_scatter / JaXPipe / cpu / BothRev 0.000347 s 0.000285023639999 s 1.22
const_scatter / Jax / cpu / BothRev 0.000357 s 0.0002842879400213 s 1.26
const_scatter / HLOOpt / cpu / PreRev 0.000389 s 0.0002853217399842 s 1.36
const_scatter / HLOOpt / cpu / PostRev 0.0003489999999999 s 0.0002907451599548 s 1.20
const_scatter / HLOOpt / cpu / BothRev 0.000535 s 0.0002864135400341 s 1.87
const_scatter / PartOpt / cpu / PreRev 0.000347 s 0.0002835488199889 s 1.22
const_scatter / PartOpt / cpu / PostRev 0.000363 s 0.0002898025000013 s 1.25
const_scatter / PartOpt / cpu / BothRev 0.000406 s 0.0002821240400044 s 1.44
const_scatter / IPartOpt / cpu / PreRev 0.0004129999999999 s 0.0002919069400104 s 1.41
const_scatter / IPartOpt / cpu / PostRev 0.0003529999999999 s 0.0002938434999578 s 1.20
const_scatter / IPartOpt / cpu / BothRev 0.000463 s 0.0002843726199625 s 1.63
const_scatter / DefOpt / cpu / PreRev 0.000402 s 0.0002895347999765 s 1.39
const_scatter / DefOpt / cpu / PostRev 0.000414 s 0.000291647899985 s 1.42
const_scatter / DefOpt / cpu / BothRev 0.000385 s 0.0002939875399533 s 1.31
const_scatter / IDefOpt / cpu / PreRev 0.000345 s 0.0002921339599561 s 1.18
const_scatter / IDefOpt / cpu / PostRev 0.0004869999999999 s 0.0002916513599848 s 1.67
const_scatter / IDefOpt / cpu / BothRev 0.00056 s 0.0002838417199109 s 1.97
GenDot / JaXPipe / cpu / Primal 0.000009710840013212872 s 0.000007976340002642246 s 1.22
GenDot / Jax / cpu / Primal 0.00000787888004197157 s 0.000007622319990332471 s 1.03
GenDot / HLOOpt / cpu / Primal 0.000012147539982834132 s 0.000012125279999963822 s 1.00
GenDot / PartOpt / cpu / Primal 0.000007728399941697716 s 0.000007736880015727365 s 1.00
GenDot / IPartOpt / cpu / Primal 0.000008285080057248705 s 0.000008522200005245395 s 0.97
GenDot / DefOpt / cpu / Primal 0.00001270487999136094 s 0.000007448159985870006 s 1.71
GenDot / IDefOpt / cpu / Primal 0.000007625459993505501 s 0.000007787959984852933 s 0.98
GenDot / JaXPipe / cpu / Forward 0.00001248287998350861 s 0.000011303560040687445 s 1.10
GenDot / Jax / cpu / Forward 0.000011816540054496726 s 0.000010663020048014004 s 1.11
GenDot / HLOOpt / cpu / Forward 0.000015159180029513664 s 0.00001102295996133762 s 1.38
GenDot / PartOpt / cpu / Forward 0.000016477140015922486 s 0.000016053339995778514 s 1.03
GenDot / IPartOpt / cpu / Forward 0.000012227399965922812 s 0.000011285260025033494 s 1.08
GenDot / DefOpt / cpu / Forward 0.000016626900014671263 s 0.00001600067995241261 s 1.04
GenDot / IDefOpt / cpu / Forward 0.000011610940027821926 s 0.000010661779988367926 s 1.09
GenDot / JaXPipe / cpu / PreRev 0.0000125530399964191 s 0.000012155380027252247 s 1.03
GenDot / JaXPipe / cpu / PostRev 0.000011516340000525816 s 0.000011094039991803584 s 1.04
GenDot / JaXPipe / cpu / BothRev 0.000017316259954895942 s 0.000011813439987236053 s 1.47
GenDot / Jax / cpu / BothRev 0.00001180726001621224 s 0.000011727659957614378 s 1.01
GenDot / HLOOpt / cpu / PreRev 0.000012128299968026113 s 0.000011548159973244765 s 1.05
GenDot / HLOOpt / cpu / PostRev 0.000016628180010229698 s 0.000015893500003585358 s 1.05
GenDot / HLOOpt / cpu / BothRev 0.000014366500026881113 s 0.000016876320014489465 s 0.85
GenDot / PartOpt / cpu / PreRev 0.000012396180018185987 s 0.00001174090000858996 s 1.06
GenDot / PartOpt / cpu / PostRev 0.000012323360015216168 s 0.000010544900023887748 s 1.17
GenDot / PartOpt / cpu / BothRev 0.000011786739978560943 s 0.000011326679987178067 s 1.04
GenDot / IPartOpt / cpu / PreRev 0.000012197320002087508 s 0.00001432055995792325 s 0.85
GenDot / IPartOpt / cpu / PostRev 0.000011331659989082254 s 0.000010636360002536094 s 1.07
GenDot / IPartOpt / cpu / BothRev 0.000011832999998659945 s 0.00001155084003585216 s 1.02
GenDot / DefOpt / cpu / PreRev 0.000012428019981598482 s 0.000011734499958038214 s 1.06
GenDot / DefOpt / cpu / PostRev 0.00001270140003725828 s 0.00001169869996374473 s 1.09
GenDot / DefOpt / cpu / BothRev 0.000012315360017964848 s 0.000011772220004786504 s 1.05
GenDot / IDefOpt / cpu / PreRev 0.00001252021997970587 s 0.000011911119991054876 s 1.05
GenDot / IDefOpt / cpu / PostRev 0.000012556040046547424 s 0.000011878620016432253 s 1.06
GenDot / IDefOpt / cpu / BothRev 0.000012506619987107116 s 0.000011148640032843104 s 1.12
GenDot / JaXPipe / cuda / Primal 0.000002015 s 0.000002015 s 1.00
GenDot / Jax / cuda / Primal 0.000002015 s 0.000002015 s 1.00
GenDot / HLOOpt / cuda / Primal 0.000002015 s 0.000001984 s 1.02
GenDot / PartOpt / cuda / Primal 0.000002016 s 0.000002015 s 1.00
GenDot / IPartOpt / cuda / Primal 0.000002016 s 0.000002015 s 1.00
GenDot / DefOpt / cuda / Primal 0.000002015 s 0.000001984 s 1.02
GenDot / IDefOpt / cuda / Primal 0.000002015 s 0.000001984 s 1.02
GenDot / JaXPipe / cuda / Forward 0.000009984 s 0.000009856 s 1.01
GenDot / Jax / cuda / Forward 0.000010209 s 0.00001008 s 1.01
GenDot / HLOOpt / cuda / Forward 0.000010207 s 0.00000992 s 1.03
GenDot / PartOpt / cuda / Forward 0.000010272 s 0.000010176 s 1.01
GenDot / IPartOpt / cuda / Forward 0.000010336 s 0.000010304 s 1.00
GenDot / DefOpt / cuda / Forward 0.000010016 s 0.000009984 s 1.00
GenDot / IDefOpt / cuda / Forward 0.000010369 s 0.000010304 s 1.01
GenDot / JaXPipe / cuda / PreRev 0.00000992 s 0.000010176 s 0.97
GenDot / JaXPipe / cuda / PostRev 0.000010272 s 0.000010144 s 1.01
GenDot / JaXPipe / cuda / BothRev 0.000010048 s 0.000010208 s 0.98
GenDot / Jax / cuda / BothRev 0.000010368 s 0.000011488 s 0.90
GenDot / HLOOpt / cuda / PreRev 0.000010144 s 0.000011104 s 0.91
GenDot / HLOOpt / cuda / PostRev 0.00001008 s 0.000011103 s 0.91
GenDot / HLOOpt / cuda / BothRev 0.00001024 s 0.00001136 s 0.90
GenDot / PartOpt / cuda / PreRev 0.000009984 s 0.00001168 s 0.85
GenDot / PartOpt / cuda / PostRev 0.000010176 s 0.000009696 s 1.05
GenDot / PartOpt / cuda / BothRev 0.000010208 s 0.000009889 s 1.03
GenDot / IPartOpt / cuda / PreRev 0.000010208 s 0.00001008 s 1.01
GenDot / IPartOpt / cuda / PostRev 0.000010528 s 0.000011296 s 0.93
GenDot / IPartOpt / cuda / BothRev 0.000010815 s 0.000010208 s 1.06
GenDot / DefOpt / cuda / PreRev 0.00001072 s 0.000009664 s 1.11
GenDot / DefOpt / cuda / PostRev 0.000010592 s 0.000009503 s 1.11
GenDot / DefOpt / cuda / BothRev 0.000009856 s 0.000010016 s 0.98
GenDot / IDefOpt / cuda / PreRev 0.000010176 s 0.000010112 s 1.01
GenDot / IDefOpt / cuda / PostRev 0.000010209 s 0.000009889 s 1.03
GenDot / IDefOpt / cuda / BothRev 0.000009952 s 0.000010144 s 0.98
GenDot / JaXPipe / tpu / Primal 9.256e-7 s 9.30225e-7 s 1.00
GenDot / Jax / tpu / Primal 9.35825e-7 s 9.357e-7 s 1.00
GenDot / HLOOpt / tpu / Primal 0.0000015487 s 0.0000015747 s 0.98
GenDot / PartOpt / tpu / Primal 9.357e-7 s 9.36175e-7 s 1.00
GenDot / IPartOpt / tpu / Primal 9.3595e-7 s 9.4085e-7 s 0.99
GenDot / DefOpt / tpu / Primal 0.000001491675 s 0.000001483575 s 1.01
GenDot / IDefOpt / tpu / Primal 0.00000155825 s 0.0000015670249999999998 s 0.99
GenDot / JaXPipe / tpu / Forward 0.000003168825 s 0.000003160525 s 1.00
GenDot / Jax / tpu / Forward 0.000002326225 s 0.0000023322 s 1.00
GenDot / HLOOpt / tpu / Forward 0.00000312795 s 0.0000031071500000000004 s 1.01
GenDot / PartOpt / tpu / Forward 0.0000032094 s 0.0000032155250000000004 s 1.00
GenDot / IPartOpt / tpu / Forward 0.000003106475 s 0.000003114575 s 1.00
GenDot / DefOpt / tpu / Forward 0.000003209225 s 0.000003211525 s 1.00
GenDot / IDefOpt / tpu / Forward 0.000003114225 s 0.00000311245 s 1.00
GenDot / JaXPipe / tpu / PreRev 0.000002947025 s 0.00000295445 s 1.00
GenDot / JaXPipe / tpu / PostRev 0.000002405675 s 0.000002400925 s 1.00
GenDot / JaXPipe / tpu / BothRev 0.0000029545500000000004 s 0.0000029649750000000004 s 1.00
GenDot / Jax / tpu / BothRev 0.0000024078 s 0.000002409925 s 1.00
GenDot / HLOOpt / tpu / PreRev 0.00000294865 s 0.000002956325 s 1.00
GenDot / HLOOpt / tpu / PostRev 0.0000029370000000000004 s 0.00000293495 s 1.00
GenDot / HLOOpt / tpu / BothRev 0.000002953875 s 0.000002958175 s 1.00
GenDot / PartOpt / tpu / PreRev 0.000002925925 s 0.0000029199000000000006 s 1.00
GenDot / PartOpt / tpu / PostRev 0.000002384575 s 0.000002395525 s 1.00
GenDot / PartOpt / tpu / BothRev 0.000002932875 s 0.000002937575 s 1.00
GenDot / IPartOpt / tpu / PreRev 0.0000029503000000000004 s 0.00000295935 s 1.00
GenDot / IPartOpt / tpu / PostRev 0.00000240985 s 0.000002409825 s 1.00
GenDot / IPartOpt / tpu / BothRev 0.000002945675 s 0.00000296385 s 0.99
GenDot / DefOpt / tpu / PreRev 0.0000029353 s 0.0000029279500000000005 s 1.00
GenDot / DefOpt / tpu / PostRev 0.000002952125 s 0.000002959675 s 1.00
GenDot / DefOpt / tpu / BothRev 0.0000029304 s 0.000002942425 s 1.00
GenDot / IDefOpt / tpu / PreRev 0.000002957 s 0.00000296965 s 1.00
GenDot / IDefOpt / tpu / PostRev 0.0000029308 s 0.0000029244500000000004 s 1.00
GenDot / IDefOpt / tpu / BothRev 0.0000029569250000000005 s 0.0000029509000000000004 s 1.00
GenDot / JaXPipe / cpu / Primal 0.000015042 s 0.000007976340002642246 s 1.89
GenDot / Jax / cpu / Primal 0.000015412 s 0.000007622319990332471 s 2.02
GenDot / HLOOpt / cpu / Primal 0.000014071 s 0.000012125279999963822 s 1.16
GenDot / PartOpt / cpu / Primal 0.00001525 s 0.000007736880015727365 s 1.97
GenDot / IPartOpt / cpu / Primal 0.000014611 s 0.000008522200005245395 s 1.71
GenDot / DefOpt / cpu / Primal 0.000014126 s 0.000007448159985870006 s 1.90
GenDot / IDefOpt / cpu / Primal 0.000014207 s 0.000007787959984852933 s 1.82
GenDot / JaXPipe / cpu / Forward 0.00001935 s 0.000011303560040687445 s 1.71
GenDot / Jax / cpu / Forward 0.000020797 s 0.000010663020048014004 s 1.95
GenDot / HLOOpt / cpu / Forward 0.000018746 s 0.00001102295996133762 s 1.70
GenDot / PartOpt / cpu / Forward 0.000019276000000000003 s 0.000016053339995778514 s 1.20
GenDot / IPartOpt / cpu / Forward 0.000019659 s 0.000011285260025033494 s 1.74
GenDot / DefOpt / cpu / Forward 0.000019225 s 0.00001600067995241261 s 1.20
GenDot / IDefOpt / cpu / Forward 0.000019296 s 0.000010661779988367926 s 1.81
GenDot / JaXPipe / cpu / PreRev 0.000019987 s 0.000012155380027252247 s 1.64
GenDot / JaXPipe / cpu / PostRev 0.000021131 s 0.000011094039991803584 s 1.90
GenDot / JaXPipe / cpu / BothRev 0.000020627 s 0.000011813439987236053 s 1.75
GenDot / Jax / cpu / BothRev 0.00002108 s 0.000011727659957614378 s 1.80
GenDot / HLOOpt / cpu / PreRev 0.000019222 s 0.000011548159973244765 s 1.66
GenDot / HLOOpt / cpu / PostRev 0.000019594 s 0.000015893500003585358 s 1.23
GenDot / HLOOpt / cpu / BothRev 0.000019729 s 0.000016876320014489465 s 1.17
GenDot / PartOpt / cpu / PreRev 0.000019672 s 0.00001174090000858996 s 1.68
GenDot / PartOpt / cpu / PostRev 0.00002183 s 0.000010544900023887748 s 2.07
GenDot / PartOpt / cpu / BothRev 0.00001997 s 0.000011326679987178067 s 1.76
GenDot / IPartOpt / cpu / PreRev 0.000019124 s 0.00001432055995792325 s 1.34
GenDot / IPartOpt / cpu / PostRev 0.000021158 s 0.000010636360002536094 s 1.99
GenDot / IPartOpt / cpu / BothRev 0.000020059 s 0.00001155084003585216 s 1.74
GenDot / DefOpt / cpu / PreRev 0.000019013 s 0.000011734499958038214 s 1.62
GenDot / DefOpt / cpu / PostRev 0.000020357 s 0.00001169869996374473 s 1.74
GenDot / DefOpt / cpu / BothRev 0.000019279 s 0.000011772220004786504 s 1.64
GenDot / IDefOpt / cpu / PreRev 0.000019516 s 0.000011911119991054876 s 1.64
GenDot / IDefOpt / cpu / PostRev 0.000019900000000000003 s 0.000011878620016432253 s 1.68
GenDot / IDefOpt / cpu / BothRev 0.00002027 s 0.000011148640032843104 s 1.82
GenDot / JaXPipe / cpu / Primal 0.00001 s 0.000007976340002642246 s 1.25
GenDot / Jax / cpu / Primal 0.00001 s 0.000007622319990332471 s 1.31
GenDot / HLOOpt / cpu / Primal 0.000034 s 0.000012125279999963822 s 2.80
GenDot / PartOpt / cpu / Primal 0.00001 s 0.000007736880015727365 s 1.29
GenDot / IPartOpt / cpu / Primal 0.000013 s 0.000008522200005245395 s 1.53
GenDot / DefOpt / cpu / Primal 0.00001 s 0.000007448159985870006 s 1.34
GenDot / IDefOpt / cpu / Primal 0.000008999999999999999 s 0.000007787959984852933 s 1.16
GenDot / JaXPipe / cpu / Forward 0.000014 s 0.000011303560040687445 s 1.24
GenDot / Jax / cpu / Forward 0.000015 s 0.000010663020048014004 s 1.41
GenDot / HLOOpt / cpu / Forward 0.000015 s 0.00001102295996133762 s 1.36
GenDot / PartOpt / cpu / Forward 0.000016 s 0.000016053339995778514 s 1.00
GenDot / IPartOpt / cpu / Forward 0.000014 s 0.000011285260025033494 s 1.24
GenDot / DefOpt / cpu / Forward 0.000013 s 0.00001600067995241261 s 0.81
GenDot / IDefOpt / cpu / Forward 0.000013 s 0.000010661779988367926 s 1.22
GenDot / JaXPipe / cpu / PreRev 0.000014 s 0.000012155380027252247 s 1.15
GenDot / JaXPipe / cpu / PostRev 0.000015 s 0.000011094039991803584 s 1.35
GenDot / JaXPipe / cpu / BothRev 0.000026 s 0.000011813439987236053 s 2.20
GenDot / Jax / cpu / BothRev 0.000015 s 0.000011727659957614378 s 1.28
GenDot / HLOOpt / cpu / PreRev 0.000014 s 0.000011548159973244765 s 1.21
GenDot / HLOOpt / cpu / PostRev 0.000019 s 0.000015893500003585358 s 1.20
GenDot / HLOOpt / cpu / BothRev 0.000014 s 0.000016876320014489465 s 0.83
GenDot / PartOpt / cpu / PreRev 0.000014 s 0.00001174090000858996 s 1.19
GenDot / PartOpt / cpu / PostRev 0.000015 s 0.000010544900023887748 s 1.42
GenDot / PartOpt / cpu / BothRev 0.000014 s 0.000011326679987178067 s 1.24
GenDot / IPartOpt / cpu / PreRev 0.000014 s 0.00001432055995792325 s 0.98
GenDot / IPartOpt / cpu / PostRev 0.000016 s 0.000010636360002536094 s 1.50
GenDot / IPartOpt / cpu / BothRev 0.000014 s 0.00001155084003585216 s 1.21
GenDot / DefOpt / cpu / PreRev 0.000013 s 0.000011734499958038214 s 1.11
GenDot / DefOpt / cpu / PostRev 0.000015 s 0.00001169869996374473 s 1.28
GenDot / DefOpt / cpu / BothRev 0.000014 s 0.000011772220004786504 s 1.19
GenDot / IDefOpt / cpu / PreRev 0.000014 s 0.000011911119991054876 s 1.18
GenDot / IDefOpt / cpu / PostRev 0.000015 s 0.000011878620016432253 s 1.26
GenDot / IDefOpt / cpu / BothRev 0.000044 s 0.000011148640032843104 s 3.95
hlo_ffi / JaXPipe / cpu / Primal 0.000011161940019519534 s 0.000011683219945552991 s 0.96
hlo_ffi / Jax / cpu / Primal 0.000011351999992257334 s 0.00001093347996174998 s 1.04
hlo_ffi / HLOOpt / cpu / Primal 0.000014426640000237968 s 0.000010873880028157143 s 1.33
hlo_ffi / PartOpt / cpu / Primal 0.000011115759953099767 s 0.000010604800017972591 s 1.05
hlo_ffi / IPartOpt / cpu / Primal 0.000010841720040843938 s 0.000010708660011005122 s 1.01
hlo_ffi / DefOpt / cpu / Primal 0.000014101220021984774 s 0.000014841120018900257 s 0.95
hlo_ffi / IDefOpt / cpu / Primal 0.000010800600048241904 s 0.000011009920008291374 s 0.98
hlo_ffi / JaXPipe / cpu / Forward 0.000016400420017816943 s 0.000016810779980005462 s 0.98
hlo_ffi / Jax / cpu / Forward 0.000016374499982703127 s 0.000016500559986525332 s 0.99
hlo_ffi / HLOOpt / cpu / Forward 0.000015835140011404292 s 0.000016243679965555202 s 0.97
hlo_ffi / PartOpt / cpu / Forward 0.000016182399967874516 s 0.000016591320008956245 s 0.98
hlo_ffi / IPartOpt / cpu / Forward 0.000016748439984439757 s 0.0000167376999706903 s 1.00
hlo_ffi / DefOpt / cpu / Forward 0.00001623438002752664 s 0.000016887200044948257 s 0.96
hlo_ffi / IDefOpt / cpu / Forward 0.000016157259969986627 s 0.00001706424001895357 s 0.95
hlo_ffi / JaXPipe / cpu / PreRev 0.000016021979954530254 s 0.00001627081997867208 s 0.98
hlo_ffi / JaXPipe / cpu / PostRev 0.0000160561800475989 s 0.000016024260066842543 s 1.00
hlo_ffi / JaXPipe / cpu / BothRev 0.00001613188003830146 s 0.000016259600006378605 s 0.99
hlo_ffi / Jax / cpu / BothRev 0.000016405300011683722 s 0.000016137319989866227 s 1.02
hlo_ffi / HLOOpt / cpu / PreRev 0.00001585205996889272 s 0.000016437020030934944 s 0.96
hlo_ffi / HLOOpt / cpu / PostRev 0.000015580380049868837 s 0.00001556615996378241 s 1.00
hlo_ffi / HLOOpt / cpu / BothRev 0.000017279060039072646 s 0.000017776600006982335 s 0.97
hlo_ffi / PartOpt / cpu / PreRev 0.000016523120020792703 s 0.00001579934000801586 s 1.05
hlo_ffi / PartOpt / cpu / PostRev 0.000016308779995597432 s 0.0000158483600080217 s 1.03
hlo_ffi / PartOpt / cpu / BothRev 0.000016716080008336576 s 0.00001626846002181992 s 1.03
hlo_ffi / IPartOpt / cpu / PreRev 0.00001639620000787545 s 0.00001559686000291549 s 1.05
hlo_ffi / IPartOpt / cpu / PostRev 0.000016330520011251793 s 0.000016143759967235383 s 1.01
hlo_ffi / IPartOpt / cpu / BothRev 0.000016686200033291244 s 0.000016162379997695096 s 1.03
hlo_ffi / DefOpt / cpu / PreRev 0.000015798820022610016 s 0.000016102459967441972 s 0.98
hlo_ffi / DefOpt / cpu / PostRev 0.000016521560019100436 s 0.00001652375997764466 s 1.00
hlo_ffi / DefOpt / cpu / BothRev 0.000015938539972921718 s 0.00001630163998925127 s 0.98
hlo_ffi / IDefOpt / cpu / PreRev 0.000016353860000890564 s 0.000016101619985420255 s 1.02
hlo_ffi / IDefOpt / cpu / PostRev 0.000015935660021568763 s 0.000015861459987718264 s 1.00
hlo_ffi / IDefOpt / cpu / BothRev 0.000016035239987104433 s 0.000016633619961794465 s 0.96
hlo_ffi / JaXPipe / cuda / Primal 0.000001984 s 0.000001983 s 1.00
hlo_ffi / Jax / cuda / Primal 0.000001984 s 0.000001983 s 1.00
hlo_ffi / HLOOpt / cuda / Primal 0.000001983 s 0.000001983 s 1.00
hlo_ffi / PartOpt / cuda / Primal 0.000001983 s 0.000001983 s 1.00
hlo_ffi / IPartOpt / cuda / Primal 0.000001984 s 0.000001983 s 1.00
hlo_ffi / DefOpt / cuda / Primal 0.000001984 s 0.000001983 s 1.00
hlo_ffi / IDefOpt / cuda / Primal 0.000001983 s 0.000001983 s 1.00
hlo_ffi / JaXPipe / cuda / Forward 0.000002048 s 0.000002047 s 1.00
hlo_ffi / Jax / cuda / Forward 0.00000208 s 0.000002047 s 1.02
hlo_ffi / HLOOpt / cuda / Forward 0.00000208 s 0.000002047 s 1.02
hlo_ffi / PartOpt / cuda / Forward 0.00000208 s 0.000002047 s 1.02
hlo_ffi / IPartOpt / cuda / Forward 0.00000208 s 0.000002047 s 1.02
hlo_ffi / DefOpt / cuda / Forward 0.00000208 s 0.000002048 s 1.02
hlo_ffi / IDefOpt / cuda / Forward 0.00000208 s 0.000002048 s 1.02
hlo_ffi / JaXPipe / cuda / PreRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / JaXPipe / cuda / PostRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / JaXPipe / cuda / BothRev 0.000002047 s 0.000002047 s 1.00
hlo_ffi / Jax / cuda / BothRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / HLOOpt / cuda / PreRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / HLOOpt / cuda / PostRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / HLOOpt / cuda / BothRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / PartOpt / cuda / PreRev 0.000002047 s 0.000002047 s 1.00
hlo_ffi / PartOpt / cuda / PostRev 0.000002047 s 0.000002047 s 1.00
hlo_ffi / PartOpt / cuda / BothRev 0.000002047 s 0.000002047 s 1.00
hlo_ffi / IPartOpt / cuda / PreRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / IPartOpt / cuda / PostRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / IPartOpt / cuda / BothRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / DefOpt / cuda / PreRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / DefOpt / cuda / PostRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / DefOpt / cuda / BothRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / IDefOpt / cuda / PreRev 0.000002047 s 0.000002047 s 1.00
hlo_ffi / IDefOpt / cuda / PostRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / IDefOpt / cuda / BothRev 0.000002047 s 0.000002048 s 1.00
hlo_ffi / JaXPipe / tpu / Primal 9.2765e-7 s 9.3345e-7 s 0.99
hlo_ffi / Jax / tpu / Primal 9.4985e-7 s 9.5385e-7 s 1.00
hlo_ffi / HLOOpt / tpu / Primal 9.0635e-7 s 9.08725e-7 s 1.00
hlo_ffi / PartOpt / tpu / Primal 9.50525e-7 s 9.57e-7 s 0.99
hlo_ffi / IPartOpt / tpu / Primal 9.0945e-7 s 9.13175e-7 s 1.00
hlo_ffi / DefOpt / tpu / Primal 9.53925e-7 s 9.671e-7 s 0.99
hlo_ffi / IDefOpt / tpu / Primal 9.0615e-7 s 9.112e-7 s 0.99
hlo_ffi / JaXPipe / tpu / Forward 9.49575e-7 s 9.48875e-7 s 1.00
hlo_ffi / Jax / tpu / Forward 9.8165e-7 s 9.8175e-7 s 1.00
hlo_ffi / HLOOpt / tpu / Forward 9.74625e-7 s 9.73825e-7 s 1.00
hlo_ffi / PartOpt / tpu / Forward 9.34375e-7 s 9.34275e-7 s 1.00
hlo_ffi / IPartOpt / tpu / Forward 9.74175e-7 s 9.73675e-7 s 1.00
hlo_ffi / DefOpt / tpu / Forward 9.3395e-7 s 9.341e-7 s 1.00
hlo_ffi / IDefOpt / tpu / Forward 9.74e-7 s 9.738e-7 s 1.00
hlo_ffi / JaXPipe / tpu / PreRev 9.38775e-7 s 9.388e-7 s 1.00
hlo_ffi / JaXPipe / tpu / PostRev 9.64375e-7 s 9.64375e-7 s 1.00
hlo_ffi / JaXPipe / tpu / BothRev 9.6035e-7 s 9.601e-7 s 1.00
hlo_ffi / Jax / tpu / BothRev 9.64675e-7 s 9.648e-7 s 1.00
hlo_ffi / HLOOpt / tpu / PreRev 9.60575e-7 s 9.60525e-7 s 1.00
hlo_ffi / HLOOpt / tpu / PostRev 9.6515e-7 s 9.6485e-7 s 1.00
hlo_ffi / HLOOpt / tpu / BothRev 9.60375e-7 s 9.60475e-7 s 1.00
hlo_ffi / PartOpt / tpu / PreRev 9.6525e-7 s 9.65375e-7 s 1.00
hlo_ffi / PartOpt / tpu / PostRev 9.60375e-7 s 9.596e-7 s 1.00
hlo_ffi / PartOpt / tpu / BothRev 9.652e-7 s 9.64525e-7 s 1.00
hlo_ffi / IPartOpt / tpu / PreRev 9.6045e-7 s 9.5985e-7 s 1.00
hlo_ffi / IPartOpt / tpu / PostRev 9.65175e-7 s 9.646e-7 s 1.00
hlo_ffi / IPartOpt / tpu / BothRev 9.59875e-7 s 9.6025e-7 s 1.00
hlo_ffi / DefOpt / tpu / PreRev 9.651e-7 s 9.64575e-7 s 1.00
hlo_ffi / DefOpt / tpu / PostRev 9.6015e-7 s 9.60475e-7 s 1.00
hlo_ffi / DefOpt / tpu / BothRev 9.6495e-7 s 9.64275e-7 s 1.00
hlo_ffi / IDefOpt / tpu / PreRev 9.59825e-7 s 9.608e-7 s 1.00
hlo_ffi / IDefOpt / tpu / PostRev 9.6495e-7 s 9.64675e-7 s 1.00
hlo_ffi / IDefOpt / tpu / BothRev 9.602e-7 s 9.599e-7 s 1.00
hlo_ffi / JaXPipe / cpu / Primal 0.000017666 s 0.000011683219945552991 s 1.51
hlo_ffi / Jax / cpu / Primal 0.000017556 s 0.00001093347996174998 s 1.61
hlo_ffi / HLOOpt / cpu / Primal 0.000017354 s 0.000010873880028157143 s 1.60
hlo_ffi / PartOpt / cpu / Primal 0.000017032 s 0.000010604800017972591 s 1.61
hlo_ffi / IPartOpt / cpu / Primal 0.000017371 s 0.000010708660011005122 s 1.62
hlo_ffi / DefOpt / cpu / Primal 0.000017774 s 0.000014841120018900257 s 1.20
hlo_ffi / IDefOpt / cpu / Primal 0.000016956 s 0.000011009920008291374 s 1.54
hlo_ffi / JaXPipe / cpu / Forward 0.000024754 s 0.000016810779980005462 s 1.47
hlo_ffi / Jax / cpu / Forward 0.000024594 s 0.000016500559986525332 s 1.49
hlo_ffi / HLOOpt / cpu / Forward 0.000024073 s 0.000016243679965555202 s 1.48
hlo_ffi / PartOpt / cpu / Forward 0.000023941 s 0.000016591320008956245 s 1.44
hlo_ffi / IPartOpt / cpu / Forward 0.000024387 s 0.0000167376999706903 s 1.46
hlo_ffi / DefOpt / cpu / Forward 0.00002399 s 0.000016887200044948257 s 1.42
hlo_ffi / IDefOpt / cpu / Forward 0.000023929 s 0.00001706424001895357 s 1.40
hlo_ffi / JaXPipe / cpu / PreRev 0.000024599 s 0.00001627081997867208 s 1.51
hlo_ffi / JaXPipe / cpu / PostRev 0.00002485 s 0.000016024260066842543 s 1.55
hlo_ffi / JaXPipe / cpu / BothRev 0.000025 s 0.000016259600006378605 s 1.54
hlo_ffi / Jax / cpu / BothRev 0.000024307 s 0.000016137319989866227 s 1.51
hlo_ffi / HLOOpt / cpu / PreRev 0.000026238 s 0.000016437020030934944 s 1.60
hlo_ffi / HLOOpt / cpu / PostRev 0.000025836 s 0.00001556615996378241 s 1.66
hlo_ffi / HLOOpt / cpu / BothRev 0.000024421 s 0.000017776600006982335 s 1.37
hlo_ffi / PartOpt / cpu / PreRev 0.000024529 s 0.00001579934000801586 s 1.55
hlo_ffi / PartOpt / cpu / PostRev 0.000025199 s 0.0000158483600080217 s 1.59
hlo_ffi / PartOpt / cpu / BothRev 0.000025386 s 0.00001626846002181992 s 1.56
hlo_ffi / IPartOpt / cpu / PreRev 0.000024544 s 0.00001559686000291549 s 1.57
hlo_ffi / IPartOpt / cpu / PostRev 0.000024798 s 0.000016143759967235383 s 1.54
hlo_ffi / IPartOpt / cpu / BothRev 0.000024542 s 0.000016162379997695096 s 1.52
hlo_ffi / DefOpt / cpu / PreRev 0.000024733 s 0.000016102459967441972 s 1.54
hlo_ffi / DefOpt / cpu / PostRev 0.000025887 s 0.00001652375997764466 s 1.57
hlo_ffi / DefOpt / cpu / BothRev 0.000025966 s 0.00001630163998925127 s 1.59
hlo_ffi / IDefOpt / cpu / PreRev 0.000024077 s 0.000016101619985420255 s 1.50
hlo_ffi / IDefOpt / cpu / PostRev 0.000025648 s 0.000015861459987718264 s 1.62
hlo_ffi / IDefOpt / cpu / BothRev 0.00002578 s 0.000016633619961794465 s 1.55
hlo_ffi / JaXPipe / cpu / Primal 0.000012 s 0.000011683219945552991 s 1.03
hlo_ffi / Jax / cpu / Primal 0.000012 s 0.00001093347996174998 s 1.10
hlo_ffi / HLOOpt / cpu / Primal 0.000013 s 0.000010873880028157143 s 1.20
hlo_ffi / PartOpt / cpu / Primal 0.000013 s 0.000010604800017972591 s 1.23
hlo_ffi / IPartOpt / cpu / Primal 0.000013 s 0.000010708660011005122 s 1.21
hlo_ffi / DefOpt / cpu / Primal 0.000013 s 0.000014841120018900257 s 0.88
hlo_ffi / IDefOpt / cpu / Primal 0.000013 s 0.000011009920008291374 s 1.18
hlo_ffi / JaXPipe / cpu / Forward 0.000018 s 0.000016810779980005462 s 1.07
hlo_ffi / Jax / cpu / Forward 0.000018 s 0.000016500559986525332 s 1.09
hlo_ffi / HLOOpt / cpu / Forward 0.000017 s 0.000016243679965555202 s 1.05
hlo_ffi / PartOpt / cpu / Forward 0.000017 s 0.000016591320008956245 s 1.02
hlo_ffi / IPartOpt / cpu / Forward 0.000016 s 0.0000167376999706903 s 0.96
hlo_ffi / DefOpt / cpu / Forward 0.000017 s 0.000016887200044948257 s 1.01
hlo_ffi / IDefOpt / cpu / Forward 0.000018 s 0.00001706424001895357 s 1.05
hlo_ffi / JaXPipe / cpu / PreRev 0.000017 s 0.00001627081997867208 s 1.04
hlo_ffi / JaXPipe / cpu / PostRev 0.000017 s 0.000016024260066842543 s 1.06
hlo_ffi / JaXPipe / cpu / BothRev 0.000017 s 0.000016259600006378605 s 1.05
hlo_ffi / Jax / cpu / BothRev 0.000017 s 0.000016137319989866227 s 1.05
hlo_ffi / HLOOpt / cpu / PreRev 0.000017 s 0.000016437020030934944 s 1.03
hlo_ffi / HLOOpt / cpu / PostRev 0.000017 s 0.00001556615996378241 s 1.09
hlo_ffi / HLOOpt / cpu / BothRev 0.000056 s 0.000017776600006982335 s 3.15
hlo_ffi / PartOpt / cpu / PreRev 0.000018 s 0.00001579934000801586 s 1.14
hlo_ffi / PartOpt / cpu / PostRev 0.000018 s 0.0000158483600080217 s 1.14
hlo_ffi / PartOpt / cpu / BothRev 0.000017 s 0.00001626846002181992 s 1.04
hlo_ffi / IPartOpt / cpu / PreRev 0.000017 s 0.00001559686000291549 s 1.09
hlo_ffi / IPartOpt / cpu / PostRev 0.000018 s 0.000016143759967235383 s 1.11
hlo_ffi / IPartOpt / cpu / BothRev 0.000017 s 0.000016162379997695096 s 1.05
hlo_ffi / DefOpt / cpu / PreRev 0.000018 s 0.000016102459967441972 s 1.12
hlo_ffi / DefOpt / cpu / PostRev 0.000017 s 0.00001652375997764466 s 1.03
hlo_ffi / DefOpt / cpu / BothRev 0.000017 s 0.00001630163998925127 s 1.04
hlo_ffi / IDefOpt / cpu / PreRev 0.000017 s 0.000016101619985420255 s 1.06
hlo_ffi / IDefOpt / cpu / PostRev 0.000017 s 0.000015861459987718264 s 1.07
hlo_ffi / IDefOpt / cpu / BothRev 0.000017 s 0.000016633619961794465 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal 0.0011394164000193 s 0.0010979923999911 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal 0.0010211476000222 s 0.0009642218000408 s 1.06
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal 0.0009822407999308 s 0.0009914535999996 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal 0.0009377344001222 s 0.0009346603998892 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal 0.0009606779999558 s 0.0009413688000677 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal 0.0010039738000159 s 0.001004744400052 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal 0.000997136999922 s 0.0010086406000482 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward 0.0028962450000108 s 0.0027766954000981 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward 0.0023513898000601 s 0.0023500500001318 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward 0.0024714459998904 s 0.0022446068000135 s 1.10
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward 0.0022732717999133 s 0.0022493833998851 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward 0.0022815433999312 s 0.0021829849999448 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward 0.0022654426000372 s 0.002215745400008 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward 0.0022645318000286 s 0.0024366134001866 s 0.93
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev 0.0066986801999519 s 0.0066071121999812 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev 0.0063430310000512 s 0.0060330907999741 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev 0.0057026240001505 s 0.0055423939998945 s 1.03
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev 0.0042185371999039 s 0.0057159447999765 s 0.74
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev 0.0061754990000736 s 0.0056442914000399 s 1.09
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev 0.0033079599999837 s 0.0054469434000566 s 0.61
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev 0.0057557478000489 s 0.0055817624000155 s 1.03
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev 0.0048426846001348 s 0.0072480184000596 s 0.67
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev 0.0061545509999632 s 0.0057854932002555 s 1.06
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev 0.0034732670001176 s 0.0056285987998307 s 0.62
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev 0.005796855600056 s 0.005813245399986 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev 0.0036269055999582 s 0.0059623540000757 s 0.61
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev 0.0058944771999449 s 0.0058417539999027 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev 0.0034753680000903 s 0.005578324200087 s 0.62
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev 0.0056282336000549 s 0.0057755715999519 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev 0.0040375607998612 s 0.0058006471998851 s 0.70
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev 0.0057279065999864 s 0.0057635594001112 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev 0.0034775234001244 s 0.0058320177999121 s 0.60
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev 0.0057100048000393 s 0.0059808674001033 s 0.95
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal 0.000274145 s 0.0002812799999999 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal 0.000272544 s 0.000280961 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal 0.000287233 s 0.000288864 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal 0.0002725449999999 s 0.00028032 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal 0.000274497 s 0.00028272 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal 0.000288032 s 0.000289537 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal 0.00028784 s 0.000288833 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward 0.000558465 s 0.000562049 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward 0.000538433 s 0.000539745 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward 0.0005577609999999 s 0.000561377 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward 0.00055824 s 0.000561089 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward 0.000558881 s 0.0005594249999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward 0.00055853 s 0.0005615689999999 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward 0.000558337 s 0.0005614719999999 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev 0.001024482 s 0.001054273 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev 0.000986849 s 0.001012514 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev 0.001022466 s 0.0010486109999999 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev 0.000981601 s 0.001013345 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev 0.001007138 s 0.001035266 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev 0.001033346 s 0.001061505 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev 0.00100749 s 0.001033346 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev 0.001024129 s 0.00105165 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev 0.000970658 s 0.000998978 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev 0.001024033 s 0.001049281 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev 0.00102237 s 0.001049185 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev 0.000972354 s 0.000999874 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev 0.001021825 s 0.001049249 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev 0.001017825 s 0.001045313 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev 0.000956705 s 0.000981729 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev 0.001019874 s 0.001048194 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev 0.001018177 s 0.001044449 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev 0.001019554 s 0.001045025 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev 0.001020769 s 0.001046785 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal 0.00012390525 s 0.000124138 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal 0.000126831 s 0.0001265315 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal 0.0001526522499999 s 0.00015248825 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal 0.00013381225 s 0.0001342567499999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal 0.000130815 s 0.00013130825 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal 0.00014763875 s 0.000148157 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal 0.0001508275 s 0.0001509649999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward 0.0002119565 s 0.00021195375 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward 0.0002610409999999 s 0.00026127075 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward 0.000211834 s 0.0002120705 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward 0.0002183465 s 0.000218458 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward 0.00021252675 s 0.000212148 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward 0.0002181719999999 s 0.0002188435 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward 0.00021192975 s 0.00021205775 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev 0.00035582275 s 0.0003543785 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev 0.00025678275 s 0.00025620625 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev 0.00035629325 s 0.0003539884999999 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev 0.0002569964999999 s 0.00025666725 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev 0.00035625675 s 0.000354111 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev 0.00029120825 s 0.0002908894999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev 0.000356395 s 0.00035398575 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev 0.0003561189999999 s 0.00035565625 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev 0.0002720195 s 0.00027080325 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev 0.0003558807499999 s 0.0003549992499999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev 0.000356267 s 0.0003538355 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev 0.00027263975 s 0.0002722535 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev 0.0003564195 s 0.0003536705 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev 0.00035795875 s 0.0003572295 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev 0.00028446375 s 0.00028288975 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev 0.00035822725 s 0.00035754225 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev 0.000358309 s 0.00035657975 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev 0.0003014005 s 0.0003010629999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev 0.0003587345 s 0.00035641975 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal 0.002241534 s 0.0010979923999911 s 2.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal 0.002216465 s 0.0009642218000408 s 2.30
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal 0.00212408 s 0.0009914535999996 s 2.14
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal 0.002197731 s 0.0009346603998892 s 2.35
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal 0.0021961119999999 s 0.0009413688000677 s 2.33
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal 0.002373761 s 0.001004744400052 s 2.36
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal 0.00225613 s 0.0010086406000482 s 2.24
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward 0.005529156 s 0.0027766954000981 s 1.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward 0.005023817 s 0.0023500500001318 s 2.14
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward 0.005419653 s 0.0022446068000135 s 2.41
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward 0.005645761 s 0.0022493833998851 s 2.51
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward 0.005693438 s 0.0021829849999448 s 2.61
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward 0.005283076 s 0.002215745400008 s 2.38
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward 0.0054864449999999 s 0.0024366134001866 s 2.25
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev 0.010451774 s 0.0066071121999812 s 1.58
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev 0.009404883 s 0.0060330907999741 s 1.56
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev 0.007992624 s 0.0055423939998945 s 1.44
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev 0.008630675 s 0.0057159447999765 s 1.51
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev 0.009159435 s 0.0056442914000399 s 1.62
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev 0.007988127 s 0.0054469434000566 s 1.47
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev 0.008718377 s 0.0055817624000155 s 1.56
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev 0.009059383 s 0.0072480184000596 s 1.25
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev 0.0090106369999999 s 0.0057854932002555 s 1.56
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev 0.00852264 s 0.0056285987998307 s 1.51
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev 0.007899994 s 0.005813245399986 s 1.36
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev 0.008903211 s 0.0059623540000757 s 1.49
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev 0.009060197 s 0.0058417539999027 s 1.55
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev 0.008628433 s 0.005578324200087 s 1.55
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev 0.0072552 s 0.0057755715999519 s 1.26
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev 0.0087431219999999 s 0.0058006471998851 s 1.51
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev 0.008413147 s 0.0057635594001112 s 1.46
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev 0.008607123 s 0.0058320177999121 s 1.48
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev 0.008416237 s 0.0059808674001033 s 1.41
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal 0.002305 s 0.0010979923999911 s 2.10
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal 0.003407 s 0.0009642218000408 s 3.53
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal 0.001609 s 0.0009914535999996 s 1.62
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal 0.0019299999999999 s 0.0009346603998892 s 2.06
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal 0.00162 s 0.0009413688000677 s 1.72
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal 0.001964 s 0.001004744400052 s 1.95
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal 0.00179 s 0.0010086406000482 s 1.77
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward 0.004434 s 0.0027766954000981 s 1.60
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward 0.00452 s 0.0023500500001318 s 1.92
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward 0.00414 s 0.0022446068000135 s 1.84
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward 0.0041589999999999 s 0.0022493833998851 s 1.85
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward 0.004301 s 0.0021829849999448 s 1.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward 0.0052089999999999 s 0.002215745400008 s 2.35
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward 0.004606 s 0.0024366134001866 s 1.89
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev 0.011292 s 0.0066071121999812 s 1.71
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev 0.0098339999999999 s 0.0060330907999741 s 1.63
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev 0.00958 s 0.0055423939998945 s 1.73
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev 0.012016 s 0.0057159447999765 s 2.10
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev 0.015022 s 0.0056442914000399 s 2.66
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev 0.007902 s 0.0054469434000566 s 1.45
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev 0.014385 s 0.0055817624000155 s 2.58
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev 0.009478 s 0.0072480184000596 s 1.31
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev 0.011652 s 0.0057854932002555 s 2.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev 0.006878 s 0.0056285987998307 s 1.22
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev 0.011532 s 0.005813245399986 s 1.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev 0.009721 s 0.0059623540000757 s 1.63
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev 0.0085069999999999 s 0.0058417539999027 s 1.46
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev 0.0080139999999999 s 0.005578324200087 s 1.44
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev 0.008335 s 0.0057755715999519 s 1.44
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev 0.009495 s 0.0058006471998851 s 1.64
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev 0.009662 s 0.0057635594001112 s 1.68
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev 0.026705 s 0.0058320177999121 s 4.58
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev 0.014134 s 0.0059808674001033 s 2.36
scatter_sum / JaXPipe / cpu / Primal 0.000009147839991783255 s 0.000008440359997621273 s 1.08
scatter_sum / Jax / cpu / Primal 0.000009004340035971836 s 0.0000087820000044303 s 1.03
scatter_sum / HLOOpt / cpu / Primal 0.000012051700023221202 s 0.000012110340012441157 s 1.00
scatter_sum / PartOpt / cpu / Primal 0.000009201820012094684 s 0.000008506519961883895 s 1.08
scatter_sum / IPartOpt / cpu / Primal 0.00000890010001057817 s 0.000008739120039535919 s 1.02
scatter_sum / DefOpt / cpu / Primal 0.000008363259912584908 s 0.000008176920000551035 s 1.02
scatter_sum / IDefOpt / cpu / Primal 0.00000909112001863832 s 0.000008401979957852746 s 1.08
scatter_sum / JaXPipe / cpu / Forward 0.000013786280014755904 s 0.000012322020047577098 s 1.12
scatter_sum / Jax / cpu / Forward 0.000013691759995708708 s 0.000011957599945162656 s 1.15
scatter_sum / HLOOpt / cpu / Forward 0.00001846332002969575 s 0.000017227240014108248 s 1.07
scatter_sum / PartOpt / cpu / Forward 0.000019410379973123785 s 0.000017908880017785123 s 1.08
scatter_sum / IPartOpt / cpu / Forward 0.000013634399983857291 s 0.000012497800025812466 s 1.09
scatter_sum / DefOpt / cpu / Forward 0.000019325459998071893 s 0.000017549120011608467 s 1.10
scatter_sum / IDefOpt / cpu / Forward 0.000013564039982156828 s 0.000012501700020948191 s 1.08
scatter_sum / JaXPipe / cpu / PreRev 0.000013040960011494462 s 0.00001307078001445916 s 1.00
scatter_sum / JaXPipe / cpu / PostRev 0.000012946059941896235 s 0.00001225669997438672 s 1.06
scatter_sum / JaXPipe / cpu / BothRev 0.000013096539960315568 s 0.000017648939983700985 s 0.74
scatter_sum / Jax / cpu / BothRev 0.000013294119971760664 s 0.000012138579986640252 s 1.10
scatter_sum / HLOOpt / cpu / PreRev 0.000012564399994516862 s 0.000012648520005313913 s 0.99
scatter_sum / HLOOpt / cpu / PostRev 0.000016941559970291564 s 0.000017356560047119274 s 0.98
scatter_sum / HLOOpt / cpu / BothRev 0.000014695219970235483 s 0.000014113280030869646 s 1.04
scatter_sum / PartOpt / cpu / PreRev 0.000013426379973680014 s 0.000011910819985132548 s 1.13
scatter_sum / PartOpt / cpu / PostRev 0.000013138760023139184 s 0.000012337380039753044 s 1.06
scatter_sum / PartOpt / cpu / BothRev 0.000013234120033303042 s 0.000012810720018023855 s 1.03
scatter_sum / IPartOpt / cpu / PreRev 0.00001925157999721705 s 0.00001804224000807153 s 1.07
scatter_sum / IPartOpt / cpu / PostRev 0.000012707099976978496 s 0.000013031860025876086 s 0.98
scatter_sum / IPartOpt / cpu / BothRev 0.000013232420042186276 s 0.00001302892000239808 s 1.02
scatter_sum / DefOpt / cpu / PreRev 0.000012935120066686068 s 0.000012569900018206682 s 1.03
scatter_sum / DefOpt / cpu / PostRev 0.000012908339986097415 s 0.000012809520012524444 s 1.01
scatter_sum / DefOpt / cpu / BothRev 0.000013276920026328298 s 0.000012639920005312888 s 1.05
scatter_sum / IDefOpt / cpu / PreRev 0.00001304314001572493 s 0.000011934459989788592 s 1.09
scatter_sum / IDefOpt / cpu / PostRev 0.000013412940024863928 s 0.000012639860005947413 s 1.06
scatter_sum / IDefOpt / cpu / BothRev 0.000012983579972569716 s 0.00001264491999791062 s 1.03
scatter_sum / JaXPipe / cuda / Primal 0.000010144 s 0.000009888 s 1.03
scatter_sum / Jax / cuda / Primal 0.000009984 s 0.000010016 s 1.00
scatter_sum / HLOOpt / cuda / Primal 0.000009952 s 0.00001008 s 0.99
scatter_sum / PartOpt / cuda / Primal 0.000009856 s 0.00000992 s 0.99
scatter_sum / IPartOpt / cuda / Primal 0.00001008 s 0.00001008 s 1.00
scatter_sum / DefOpt / cuda / Primal 0.000010112 s 0.000009984 s 1.01
scatter_sum / IDefOpt / cuda / Primal 0.000011393 s 0.000009824 s 1.16
scatter_sum / JaXPipe / cuda / Forward 0.000019936 s 0.000017344 s 1.15
scatter_sum / Jax / cuda / Forward 0.000017247999999999998 s 0.000017216 s 1.00
scatter_sum / HLOOpt / cuda / Forward 0.000017313 s 0.000021664 s 0.80
scatter_sum / PartOpt / cuda / Forward 0.000016864 s 0.000017375999999999998 s 0.97
scatter_sum / IPartOpt / cuda / Forward 0.000017056 s 0.000017472 s 0.98
scatter_sum / DefOpt / cuda / Forward 0.000017631 s 0.000017856 s 0.99
scatter_sum / IDefOpt / cuda / Forward 0.00001728 s 0.000017247999999999998 s 1.00
scatter_sum / JaXPipe / cuda / PreRev 0.000016864 s 0.000017408 s 0.97
scatter_sum / JaXPipe / cuda / PostRev 0.000016896000000000002 s 0.000016705 s 1.01
scatter_sum / JaXPipe / cuda / BothRev 0.0000168 s 0.000017375000000000002 s 0.97
scatter_sum / Jax / cuda / BothRev 0.000017152 s 0.0000168 s 1.02
scatter_sum / HLOOpt / cuda / PreRev 0.000016832 s 0.000017472 s 0.96
scatter_sum / HLOOpt / cuda / PostRev 0.000017152 s 0.000017056 s 1.01
scatter_sum / HLOOpt / cuda / BothRev 0.000016992 s 0.000018208 s 0.93
scatter_sum / PartOpt / cuda / PreRev 0.000017375999999999998 s 0.000017152 s 1.01
scatter_sum / PartOpt / cuda / PostRev 0.00001712 s 0.000026272 s 0.65
scatter_sum / PartOpt / cuda / BothRev 0.00001696 s 0.000016576000000000002 s 1.02
scatter_sum / IPartOpt / cuda / PreRev 0.000016992 s 0.000017247999999999998 s 0.99
scatter_sum / IPartOpt / cuda / PostRev 0.000016992 s 0.000017952 s 0.95
scatter_sum / IPartOpt / cuda / BothRev 0.000017792 s 0.000017152 s 1.04
scatter_sum / DefOpt / cuda / PreRev 0.00001824 s 0.000017696 s 1.03
scatter_sum / DefOpt / cuda / PostRev 0.000017375999999999998 s 0.000016383999999999998 s 1.06
scatter_sum / DefOpt / cuda / BothRev 0.000016255999999999998 s 0.000017664 s 0.92
scatter_sum / IDefOpt / cuda / PreRev 0.00001712 s 0.000017344 s 0.99
scatter_sum / IDefOpt / cuda / PostRev 0.000016927999999999998 s 0.000016896000000000002 s 1.00
scatter_sum / IDefOpt / cuda / BothRev 0.000016927999999999998 s 0.000017343 s 0.98
scatter_sum / JaXPipe / tpu / Primal 0.000001351175 s 0.0000013434250000000002 s 1.01
scatter_sum / Jax / tpu / Primal 0.0000013538 s 0.000001413975 s 0.96
scatter_sum / HLOOpt / tpu / Primal 0.000001360975 s 0.0000013532 s 1.01
scatter_sum / PartOpt / tpu / Primal 0.000001353 s 0.00000141455 s 0.96
scatter_sum / IPartOpt / tpu / Primal 0.0000013611 s 0.000001352775 s 1.01
scatter_sum / DefOpt / tpu / Primal 0.000001353475 s 0.000001414125 s 0.96
scatter_sum / IDefOpt / tpu / Primal 0.0000013609 s 0.00000135315 s 1.01
scatter_sum / JaXPipe / tpu / Forward 0.0000026942500000000005 s 0.00000271065 s 0.99
scatter_sum / Jax / tpu / Forward 0.000002735875 s 0.000002731275 s 1.00
scatter_sum / HLOOpt / tpu / Forward 0.0000027027 s 0.00000271145 s 1.00
scatter_sum / PartOpt / tpu / Forward 0.000002708575 s 0.0000027006 s 1.00
scatter_sum / IPartOpt / tpu / Forward 0.00000270545 s 0.0000027127000000000003 s 1.00
scatter_sum / DefOpt / tpu / Forward 0.00000271315 s 0.00000270155 s 1.00
scatter_sum / IDefOpt / tpu / Forward 0.000002693175 s 0.0000027107 s 0.99
scatter_sum / JaXPipe / tpu / PreRev 0.000002698325 s 0.000002693900000000001 s 1.00
scatter_sum / JaXPipe / tpu / PostRev 0.00000268905 s 0.000002693625 s 1.00
scatter_sum / JaXPipe / tpu / BothRev 0.0000027118 s 0.000002707425 s 1.00
scatter_sum / Jax / tpu / BothRev 0.0000027470499999999994 s 0.000002752975 s 1.00
scatter_sum / HLOOpt / tpu / PreRev 0.0000027194 s 0.000002708075 s 1.00
scatter_sum / HLOOpt / tpu / PostRev 0.0000027416 s 0.000002755075 s 1.00
scatter_sum / HLOOpt / tpu / BothRev 0.0000027108 s 0.000002710625 s 1.00
scatter_sum / PartOpt / tpu / PreRev 0.0000027517000000000003 s 0.0000027484750000000004 s 1.00
scatter_sum / PartOpt / tpu / PostRev 0.000002711075 s 0.000002705775 s 1.00
scatter_sum / PartOpt / tpu / BothRev 0.00000274615 s 0.000002757 s 1.00
scatter_sum / IPartOpt / tpu / PreRev 0.00000272 s 0.000002708875 s 1.00
scatter_sum / IPartOpt / tpu / PostRev 0.00000275135 s 0.0000027508500000000005 s 1.00
scatter_sum / IPartOpt / tpu / BothRev 0.000002714875 s 0.0000027068 s 1.00
scatter_sum / DefOpt / tpu / PreRev 0.000002747 s 0.0000027486 s 1.00
scatter_sum / DefOpt / tpu / PostRev 0.000002718175 s 0.0000027060250000000005 s 1.00
scatter_sum / DefOpt / tpu / BothRev 0.000002749725 s 0.00000275165 s 1.00
scatter_sum / IDefOpt / tpu / PreRev 0.0000027113250000000003 s 0.000002707525 s 1.00
scatter_sum / IDefOpt / tpu / PostRev 0.0000027499 s 0.0000027528750000000003 s 1.00
scatter_sum / IDefOpt / tpu / BothRev 0.0000027216000000000003 s 0.0000027087 s 1.00
scatter_sum / JaXPipe / cpu / Primal 0.000016007 s 0.000008440359997621273 s 1.90
scatter_sum / Jax / cpu / Primal 0.000015613 s 0.0000087820000044303 s 1.78
scatter_sum / HLOOpt / cpu / Primal 0.000016389999999999997 s 0.000012110340012441157 s 1.35
scatter_sum / PartOpt / cpu / Primal 0.000016182 s 0.000008506519961883895 s 1.90
scatter_sum / IPartOpt / cpu / Primal 0.000015813 s 0.000008739120039535919 s 1.81
scatter_sum / DefOpt / cpu / Primal 0.000016063999999999997 s 0.000008176920000551035 s 1.96
scatter_sum / IDefOpt / cpu / Primal 0.000015775 s 0.000008401979957852746 s 1.88
scatter_sum / JaXPipe / cpu / Forward 0.000023522000000000003 s 0.000012322020047577098 s 1.91
scatter_sum / Jax / cpu / Forward 0.000022868 s 0.000011957599945162656 s 1.91
scatter_sum / HLOOpt / cpu / Forward 0.000022327 s 0.000017227240014108248 s 1.30
scatter_sum / PartOpt / cpu / Forward 0.000022705 s 0.000017908880017785123 s 1.27
scatter_sum / IPartOpt / cpu / Forward 0.000022456 s 0.000012497800025812466 s 1.80
scatter_sum / DefOpt / cpu / Forward 0.000023163 s 0.000017549120011608467 s 1.32
scatter_sum / IDefOpt / cpu / Forward 0.000022182 s 0.000012501700020948191 s 1.77
scatter_sum / JaXPipe / cpu / PreRev 0.000022991 s 0.00001307078001445916 s 1.76
scatter_sum / JaXPipe / cpu / PostRev 0.000024367 s 0.00001225669997438672 s 1.99
scatter_sum / JaXPipe / cpu / BothRev 0.000023811 s 0.000017648939983700985 s 1.35
scatter_sum / Jax / cpu / BothRev 0.000022798 s 0.000012138579986640252 s 1.88
scatter_sum / HLOOpt / cpu / PreRev 0.000022226 s 0.000012648520005313913 s 1.76
scatter_sum / HLOOpt / cpu / PostRev 0.000023701 s 0.000017356560047119274 s 1.37
scatter_sum / HLOOpt / cpu / BothRev 0.000022956 s 0.000014113280030869646 s 1.63
scatter_sum / PartOpt / cpu / PreRev 0.000022595 s 0.000011910819985132548 s 1.90
scatter_sum / PartOpt / cpu / PostRev 0.000024134 s 0.000012337380039753044 s 1.96
scatter_sum / PartOpt / cpu / BothRev 0.000023868 s 0.000012810720018023855 s 1.86
scatter_sum / IPartOpt / cpu / PreRev 0.000022124 s 0.00001804224000807153 s 1.23
scatter_sum / IPartOpt / cpu / PostRev 0.000023774 s 0.000013031860025876086 s 1.82
scatter_sum / IPartOpt / cpu / BothRev 0.000023657 s 0.00001302892000239808 s 1.82
scatter_sum / DefOpt / cpu / PreRev 0.000022659 s 0.000012569900018206682 s 1.80
scatter_sum / DefOpt / cpu / PostRev 0.000023969 s 0.000012809520012524444 s 1.87
scatter_sum / DefOpt / cpu / BothRev 0.00002365 s 0.000012639920005312888 s 1.87
scatter_sum / IDefOpt / cpu / PreRev 0.000022522 s 0.000011934459989788592 s 1.89
scatter_sum / IDefOpt / cpu / PostRev 0.00002277 s 0.000012639860005947413 s 1.80
scatter_sum / IDefOpt / cpu / BothRev 0.000023648 s 0.00001264491999791062 s 1.87
scatter_sum / JaXPipe / cpu / Primal 0.000011 s 0.000008440359997621273 s 1.30
scatter_sum / Jax / cpu / Primal 0.000035000000000000004 s 0.0000087820000044303 s 3.99
scatter_sum / HLOOpt / cpu / Primal 0.000011 s 0.000012110340012441157 s 0.91
scatter_sum / PartOpt / cpu / Primal 0.000011 s 0.000008506519961883895 s 1.29
scatter_sum / IPartOpt / cpu / Primal 0.000016 s 0.000008739120039535919 s 1.83
scatter_sum / DefOpt / cpu / Primal 0.000015 s 0.000008176920000551035 s 1.83
scatter_sum / IDefOpt / cpu / Primal 0.000011 s 0.000008401979957852746 s 1.31
scatter_sum / JaXPipe / cpu / Forward 0.000016 s 0.000012322020047577098 s 1.30
scatter_sum / Jax / cpu / Forward 0.000016 s 0.000011957599945162656 s 1.34
scatter_sum / HLOOpt / cpu / Forward 0.000016 s 0.000017227240014108248 s 0.93
scatter_sum / PartOpt / cpu / Forward 0.000016 s 0.000017908880017785123 s 0.89
scatter_sum / IPartOpt / cpu / Forward 0.000016 s 0.000012497800025812466 s 1.28
scatter_sum / DefOpt / cpu / Forward 0.000019 s 0.000017549120011608467 s 1.08
scatter_sum / IDefOpt / cpu / Forward 0.000017 s 0.000012501700020948191 s 1.36
scatter_sum / JaXPipe / cpu / PreRev 0.000016 s 0.00001307078001445916 s 1.22
scatter_sum / JaXPipe / cpu / PostRev 0.000016 s 0.00001225669997438672 s 1.31
scatter_sum / JaXPipe / cpu / BothRev 0.000024 s 0.000017648939983700985 s 1.36
scatter_sum / Jax / cpu / BothRev 0.000019 s 0.000012138579986640252 s 1.57
scatter_sum / HLOOpt / cpu / PreRev 0.000017 s 0.000012648520005313913 s 1.34
scatter_sum / HLOOpt / cpu / PostRev 0.000017 s 0.000017356560047119274 s 0.98
scatter_sum / HLOOpt / cpu / BothRev 0.000019 s 0.000014113280030869646 s 1.35
scatter_sum / PartOpt / cpu / PreRev 0.000017999999999999997 s 0.000011910819985132548 s 1.51
scatter_sum / PartOpt / cpu / PostRev 0.000016 s 0.000012337380039753044 s 1.30
scatter_sum / PartOpt / cpu / BothRev 0.000017 s 0.000012810720018023855 s 1.33
scatter_sum / IPartOpt / cpu / PreRev 0.000017 s 0.00001804224000807153 s 0.94
scatter_sum / IPartOpt / cpu / PostRev 0.000016 s 0.000013031860025876086 s 1.23
scatter_sum / IPartOpt / cpu / BothRev 0.000017 s 0.00001302892000239808 s 1.30
scatter_sum / DefOpt / cpu / PreRev 0.000016 s 0.000012569900018206682 s 1.27
scatter_sum / DefOpt / cpu / PostRev 0.000017 s 0.000012809520012524444 s 1.33
scatter_sum / DefOpt / cpu / BothRev 0.000016 s 0.000012639920005312888 s 1.27
scatter_sum / IDefOpt / cpu / PreRev 0.000017 s 0.000011934459989788592 s 1.42
scatter_sum / IDefOpt / cpu / PostRev 0.000017 s 0.000012639860005947413 s 1.34
scatter_sum / IDefOpt / cpu / BothRev 0.000017 s 0.00001264491999791062 s 1.34
slicing / JaXPipe / cpu / Primal 0.000008105140004772693 s 0.000007139699982872117 s 1.14
slicing / Jax / cpu / Primal 0.000006579500004590954 s 0.000006225620008990518 s 1.06
slicing / HLOOpt / cpu / Primal 0.000011492300009194878 s 0.000009641319975344231 s 1.19
slicing / PartOpt / cpu / Primal 0.000007086600035108858 s 0.000006203919992913143 s 1.14
slicing / IPartOpt / cpu / Primal 0.000008129899988489342 s 0.000006752039989805781 s 1.20
slicing / DefOpt / cpu / Primal 0.000011949499930778983 s 0.000010541800020291702 s 1.13
slicing / IDefOpt / cpu / Primal 0.000006886060036777053 s 0.000006175259986775927 s 1.12
slicing / JaXPipe / cpu / Forward 0.000010785320000650245 s 0.000009888000040518818 s 1.09
slicing / Jax / cpu / Forward 0.000011396900008548984 s 0.000010671199988792067 s 1.07
slicing / HLOOpt / cpu / Forward 0.000016003540022211382 s 0.000014793900018048587 s 1.08
slicing / PartOpt / cpu / Forward 0.000017621219967622892 s 0.000014559259980160278 s 1.21
slicing / IPartOpt / cpu / Forward 0.000010517399987293176 s 0.000009506959941063542 s 1.11
slicing / DefOpt / cpu / Forward 0.00001519784000265645 s 0.000015004759989096783 s 1.01
slicing / IDefOpt / cpu / Forward 0.00001048092000928591 s 0.00000969529999565566 s 1.08
slicing / JaXPipe / cpu / PreRev 0.00001156928000455082 s 0.000010796019987537876 s 1.07
slicing / JaXPipe / cpu / PostRev 0.000011554560023796511 s 0.000010617620000630268 s 1.09
slicing / JaXPipe / cpu / BothRev 0.00001537107999865839 s 0.000013817860008202842 s 1.11
slicing / Jax / cpu / BothRev 0.000011821740017694535 s 0.000010457820008014095 s 1.13
slicing / HLOOpt / cpu / PreRev 0.00001087052000912081 s 0.000010395860017524684 s 1.05
slicing / HLOOpt / cpu / PostRev 0.000011583280002014365 s 0.00001077822001207096 s 1.07
slicing / HLOOpt / cpu / BothRev 0.000012874519979959588 s 0.000012587820001499495 s 1.02
slicing / PartOpt / cpu / PreRev 0.000011304140052743603 s 0.000010113780035680977 s 1.12
slicing / PartOpt / cpu / PostRev 0.00001165402000879112 s 0.00001018205997752375 s 1.14
slicing / PartOpt / cpu / BothRev 0.000011771480030802197 s 0.000010186500012423496 s 1.16
slicing / IPartOpt / cpu / PreRev 0.000016095400005724515 s 0.000010268260002703756 s 1.57
slicing / IPartOpt / cpu / PostRev 0.000011564819997147425 s 0.000011290060001556412 s 1.02
slicing / IPartOpt / cpu / BothRev 0.000011689720013237093 s 0.0000105920400164905 s 1.10
slicing / DefOpt / cpu / PreRev 0.00001079020001270692 s 0.000010875059988393332 s 0.99
slicing / DefOpt / cpu / PostRev 0.000011439800018706592 s 0.000011181340005350648 s 1.02
slicing / DefOpt / cpu / BothRev 0.000011289100029898693 s 0.000011073839996242896 s 1.02
slicing / IDefOpt / cpu / PreRev 0.000010925060050794855 s 0.000010779179983728682 s 1.01
slicing / IDefOpt / cpu / PostRev 0.000011775099965234404 s 0.00001098723997529305 s 1.07
slicing / IDefOpt / cpu / BothRev 0.00001111052003579971 s 0.000010734000024967828 s 1.04
slicing / JaXPipe / cuda / Primal 0.000001887 s 0.000001887 s 1.00
slicing / Jax / cuda / Primal 0.000001887 s 0.000001887 s 1.00
slicing / HLOOpt / cuda / Primal 0.000001887 s 0.000001887 s 1.00
slicing / PartOpt / cuda / Primal 0.000001887 s 0.000001888 s 1.00
slicing / IPartOpt / cuda / Primal 0.000001887 s 0.000001888 s 1.00
slicing / DefOpt / cuda / Primal 0.000001887 s 0.000001887 s 1.00
slicing / IDefOpt / cuda / Primal 0.000001887 s 0.000001887 s 1.00
slicing / JaXPipe / cuda / Forward 0.000010656 s 0.000010016 s 1.06
slicing / Jax / cuda / Forward 0.000010528 s 0.000009824 s 1.07
slicing / HLOOpt / cuda / Forward 0.000010176 s 0.000009728 s 1.05
slicing / PartOpt / cuda / Forward 0.00000976 s 0.000009824 s 0.99
slicing / IPartOpt / cuda / Forward 0.000010016 s 0.000009921 s 1.01
slicing / DefOpt / cuda / Forward 0.000009888 s 0.000009631 s 1.03
slicing / IDefOpt / cuda / Forward 0.000010176 s 0.00000992 s 1.03
slicing / JaXPipe / cuda / PreRev 0.000009952 s 0.000010273 s 0.97
slicing / JaXPipe / cuda / PostRev 0.000009889 s 0.000010176 s 0.97
slicing / JaXPipe / cuda / BothRev 0.000010207 s 0.000009952 s 1.03
slicing / Jax / cuda / BothRev 0.000009568 s 0.000010816 s 0.88
slicing / HLOOpt / cuda / PreRev 0.000010112 s 0.00001024 s 0.99
slicing / HLOOpt / cuda / PostRev 0.000009792 s 0.000009792 s 1.00
slicing / HLOOpt / cuda / BothRev 0.000009536 s 0.000010048 s 0.95
slicing / PartOpt / cuda / PreRev 0.00001008 s 0.000009696 s 1.04
slicing / PartOpt / cuda / PostRev 0.000009951 s 0.000009792 s 1.02
slicing / PartOpt / cuda / BothRev 0.000009824 s 0.000010176 s 0.97
slicing / IPartOpt / cuda / PreRev 0.000010016 s 0.00000992 s 1.01
slicing / IPartOpt / cuda / PostRev 0.00001008 s 0.00000944 s 1.07
slicing / IPartOpt / cuda / BothRev 0.000009856 s 0.000009856 s 1.00
slicing / DefOpt / cuda / PreRev 0.000009888 s 0.000009856 s 1.00
slicing / DefOpt / cuda / PostRev 0.000010432 s 0.00000992 s 1.05
slicing / DefOpt / cuda / BothRev 0.00001008 s 0.000010017 s 1.01
slicing / IDefOpt / cuda / PreRev 0.000015136 s 0.000010496 s 1.44
slicing / IDefOpt / cuda / PostRev 0.000010048 s 0.000010112 s 0.99
slicing / IDefOpt / cuda / BothRev 0.000009697 s 0.000009984 s 0.97
slicing / JaXPipe / tpu / Primal 0.0000009688499999999998 s 0.000001015375 s 0.95
slicing / Jax / tpu / Primal 0.0000009666750000000002 s 0.000000971525 s 1.00
slicing / HLOOpt / tpu / Primal 0.0000009628 s 0.00000103085 s 0.93
slicing / PartOpt / tpu / Primal 0.00000095975 s 0.000000964475 s 1.00
slicing / IPartOpt / tpu / Primal 0.000000959775 s 0.000001018 s 0.94
slicing / DefOpt / tpu / Primal 0.000000965725 s 0.000000970225 s 1.00
slicing / IDefOpt / tpu / Primal 0.0000009601 s 0.000001015725 s 0.95
slicing / JaXPipe / tpu / Forward 0.00000140885 s 0.000001419375 s 0.99
slicing / Jax / tpu / Forward 0.00000141925 s 0.000001471425 s 0.96
slicing / HLOOpt / tpu / Forward 0.000001517975 s 0.0000015144750000000002 s 1.00
slicing / PartOpt / tpu / Forward 0.00000144435 s 0.000001496475 s 0.97
slicing / IPartOpt / tpu / Forward 0.00000152035 s 0.00000151695 s 1.00
slicing / DefOpt / tpu / Forward 0.000001438075 s 0.00000148795 s 0.97
slicing / IDefOpt / tpu / Forward 0.0000015166750000000002 s 0.0000015108750000000002 s 1.00
slicing / JaXPipe / tpu / PreRev 0.000002331775 s 0.00000256505 s 0.91
slicing / JaXPipe / tpu / PostRev 0.00000250855 s 0.0000025232500000000004 s 0.99
slicing / JaXPipe / tpu / BothRev 0.0000023656 s 0.00000262075 s 0.90
slicing / Jax / tpu / BothRev 0.0000025276749999999995 s 0.0000025406 s 0.99
slicing / HLOOpt / tpu / PreRev 0.00000235305 s 0.000002599725 s 0.91
slicing / HLOOpt / tpu / PostRev 0.000002521575 s 0.000002542675 s 0.99
slicing / HLOOpt / tpu / BothRev 0.0000023554500000000005 s 0.000002587675 s 0.91
slicing / PartOpt / tpu / PreRev 0.0000025455250000000005 s 0.0000025377499999999995 s 1.00
slicing / PartOpt / tpu / PostRev 0.000002347025 s 0.000002587575 s 0.91
slicing / PartOpt / tpu / BothRev 0.000002534375 s 0.00000254335 s 1.00
slicing / IPartOpt / tpu / PreRev 0.000002359275 s 0.0000025787 s 0.91
slicing / IPartOpt / tpu / PostRev 0.000002540675 s 0.000002556 s 0.99
slicing / IPartOpt / tpu / BothRev 0.000002349425 s 0.000002581425 s 0.91
slicing / DefOpt / tpu / PreRev 0.00000253495 s 0.0000025392 s 1.00
slicing / DefOpt / tpu / PostRev 0.000002354125 s 0.0000025871 s 0.91
slicing / DefOpt / tpu / BothRev 0.00000253195 s 0.000002539625 s 1.00
slicing / IDefOpt / tpu / PreRev 0.00000234255 s 0.0000025818000000000004 s 0.91
slicing / IDefOpt / tpu / PostRev 0.00000252715 s 0.0000025369 s 1.00
slicing / IDefOpt / tpu / BothRev 0.000002348875 s 0.0000025960250000000003 s 0.90
slicing / JaXPipe / cpu / Primal 0.000012989 s 0.000007139699982872117 s 1.82
slicing / Jax / cpu / Primal 0.000013042 s 0.000006225620008990518 s 2.09
slicing / HLOOpt / cpu / Primal 0.000012918 s 0.000009641319975344231 s 1.34
slicing / PartOpt / cpu / Primal 0.000012983 s 0.000006203919992913143 s 2.09
slicing / IPartOpt / cpu / Primal 0.00001277 s 0.000006752039989805781 s 1.89
slicing / DefOpt / cpu / Primal 0.000013083 s 0.000010541800020291702 s 1.24
slicing / IDefOpt / cpu / Primal 0.000012736 s 0.000006175259986775927 s 2.06
slicing / JaXPipe / cpu / Forward 0.000017434000000000003 s 0.000009888000040518818 s 1.76
slicing / Jax / cpu / Forward 0.000016664000000000002 s 0.000010671199988792067 s 1.56
slicing / HLOOpt / cpu / Forward 0.000016676 s 0.000014793900018048587 s 1.13
slicing / PartOpt / cpu / Forward 0.000017174 s 0.000014559259980160278 s 1.18
slicing / IPartOpt / cpu / Forward 0.000016823 s 0.000009506959941063542 s 1.77
slicing / DefOpt / cpu / Forward 0.000016843 s 0.000015004759989096783 s 1.12
slicing / IDefOpt / cpu / Forward 0.000016721 s 0.00000969529999565566 s 1.72
slicing / JaXPipe / cpu / PreRev 0.000017229 s 0.000010796019987537876 s 1.60
slicing / JaXPipe / cpu / PostRev 0.000017779 s 0.000010617620000630268 s 1.67
slicing / JaXPipe / cpu / BothRev 0.000017587999999999998 s 0.000013817860008202842 s 1.27
slicing / Jax / cpu / BothRev 0.000018046 s 0.000010457820008014095 s 1.73
slicing / HLOOpt / cpu / PreRev 0.000017004 s 0.000010395860017524684 s 1.64
slicing / HLOOpt / cpu / PostRev 0.000017477 s 0.00001077822001207096 s 1.62
slicing / HLOOpt / cpu / BothRev 0.000017706 s 0.000012587820001499495 s 1.41
slicing / PartOpt / cpu / PreRev 0.00001706 s 0.000010113780035680977 s 1.69
slicing / PartOpt / cpu / PostRev 0.000018226 s 0.00001018205997752375 s 1.79
slicing / PartOpt / cpu / BothRev 0.000018010000000000002 s 0.000010186500012423496 s 1.77
slicing / IPartOpt / cpu / PreRev 0.000017629 s 0.000010268260002703756 s 1.72
slicing / IPartOpt / cpu / PostRev 0.000018687 s 0.000011290060001556412 s 1.66
slicing / IPartOpt / cpu / BothRev 0.000017926 s 0.0000105920400164905 s 1.69
slicing / DefOpt / cpu / PreRev 0.000017676999999999997 s 0.000010875059988393332 s 1.63
slicing / DefOpt / cpu / PostRev 0.000018077 s 0.000011181340005350648 s 1.62
slicing / DefOpt / cpu / BothRev 0.000017523 s 0.000011073839996242896 s 1.58
slicing / IDefOpt / cpu / PreRev 0.000017451999999999998 s 0.000010779179983728682 s 1.62
slicing / IDefOpt / cpu / PostRev 0.000017395000000000002 s 0.00001098723997529305 s 1.58
slicing / IDefOpt / cpu / BothRev 0.000017852 s 0.000010734000024967828 s 1.66
slicing / JaXPipe / cpu / Primal 0.000008999999999999999 s 0.000007139699982872117 s 1.26
slicing / Jax / cpu / Primal 0.000008999999999999999 s 0.000006225620008990518 s 1.45
slicing / HLOOpt / cpu / Primal 0.000008 s 0.000009641319975344231 s 0.83
slicing / PartOpt / cpu / Primal 0.000008 s 0.000006203919992913143 s 1.29
slicing / IPartOpt / cpu / Primal 0.000031 s 0.000006752039989805781 s 4.59
slicing / DefOpt / cpu / Primal 0.00003 s 0.000010541800020291702 s 2.85
slicing / IDefOpt / cpu / Primal 0.000019 s 0.000006175259986775927 s 3.08
slicing / JaXPipe / cpu / Forward 0.000012 s 0.000009888000040518818 s 1.21
slicing / Jax / cpu / Forward 0.000012 s 0.000010671199988792067 s 1.12
slicing / HLOOpt / cpu / Forward 0.000013 s 0.000014793900018048587 s 0.88
slicing / PartOpt / cpu / Forward 0.000011 s 0.000014559259980160278 s 0.76
slicing / IPartOpt / cpu / Forward 0.000012 s 0.000009506959941063542 s 1.26
slicing / DefOpt / cpu / Forward 0.000012 s 0.000015004759989096783 s 0.80
slicing / IDefOpt / cpu / Forward 0.000012 s 0.00000969529999565566 s 1.24
slicing / JaXPipe / cpu / PreRev 0.000012 s 0.000010796019987537876 s 1.11
slicing / JaXPipe / cpu / PostRev 0.000012 s 0.000010617620000630268 s 1.13
slicing / JaXPipe / cpu / BothRev 0.000012 s 0.000013817860008202842 s 0.87
slicing / Jax / cpu / BothRev 0.000012 s 0.000010457820008014095 s 1.15
slicing / HLOOpt / cpu / PreRev 0.000014 s 0.000010395860017524684 s 1.35
slicing / HLOOpt / cpu / PostRev 0.000012 s 0.00001077822001207096 s 1.11
slicing / HLOOpt / cpu / BothRev 0.000012 s 0.000012587820001499495 s 0.95
slicing / PartOpt / cpu / PreRev 0.000013 s 0.000010113780035680977 s 1.29
slicing / PartOpt / cpu / PostRev 0.000012 s 0.00001018205997752375 s 1.18
slicing / PartOpt / cpu / BothRev 0.000012 s 0.000010186500012423496 s 1.18
slicing / IPartOpt / cpu / PreRev 0.000012 s 0.000010268260002703756 s 1.17
slicing / IPartOpt / cpu / PostRev 0.000012 s 0.000011290060001556412 s 1.06
slicing / IPartOpt / cpu / BothRev 0.000012 s 0.0000105920400164905 s 1.13
slicing / DefOpt / cpu / PreRev 0.000012 s 0.000010875059988393332 s 1.10
slicing / DefOpt / cpu / PostRev 0.000012 s 0.000011181340005350648 s 1.07
slicing / DefOpt / cpu / BothRev 0.000012 s 0.000011073839996242896 s 1.08
slicing / IDefOpt / cpu / PreRev 0.000012 s 0.000010779179983728682 s 1.11
slicing / IDefOpt / cpu / PostRev 0.000013 s 0.00001098723997529305 s 1.18
slicing / IDefOpt / cpu / BothRev 0.000012 s 0.000010734000024967828 s 1.12
sum / JaXPipe / cpu / Primal 0.00000946393995945982 s 0.000009182520043395926 s 1.03
sum / Jax / cpu / Primal 0.000008301480011141394 s 0.000008828279960653163 s 0.94
sum / HLOOpt / cpu / Primal 0.000012508000036177691 s 0.000012311380023675156 s 1.02
sum / PartOpt / cpu / Primal 0.000008498340012010886 s 0.000008517040032529622 s 1.00
sum / IPartOpt / cpu / Primal 0.000008799700026429491 s 0.00000768539998716733 s 1.14
sum / DefOpt / cpu / Primal 0.00001349050000499119 s 0.000008223459944929346 s 1.64
sum / IDefOpt / cpu / Primal 0.000008610579989181133 s 0.00000776467998548469 s 1.11
sum / JaXPipe / cpu / Forward 0.000012908859998788102 s 0.000011874920019181446 s 1.09
sum / Jax / cpu / Forward 0.000012571580036819796 s 0.0000123945599898434 s 1.01
sum / HLOOpt / cpu / Forward 0.00001828192002903961 s 0.00001648988003580598 s 1.11
sum / PartOpt / cpu / Forward 0.000017889519949676468 s 0.000016962420004347224 s 1.05
sum / IPartOpt / cpu / Forward 0.000012520399995992194 s 0.000011852979978357326 s 1.06
sum / DefOpt / cpu / Forward 0.00001823218001845817 s 0.000017227739990630654 s 1.06
sum / IDefOpt / cpu / Forward 0.000013308400066307512 s 0.000012147320021540508 s 1.10
sum / JaXPipe / cpu / PreRev 0.000011959299999944053 s 0.00001209846000165271 s 0.99
sum / JaXPipe / cpu / PostRev 0.000012749879997500104 s 0.000012334540024312446 s 1.03
sum / JaXPipe / cpu / BothRev 0.000016190540027309907 s 0.00001149539999460103 s 1.41
sum / Jax / cpu / BothRev 0.000012685779975072364 s 0.000011714179963746574 s 1.08
sum / HLOOpt / cpu / PreRev 0.000012113940028939396 s 0.00001129744000536448 s 1.07
sum / HLOOpt / cpu / PostRev 0.000016572340045968302 s 0.00001198017997921852 s 1.38
sum / HLOOpt / cpu / BothRev 0.000013566459992944146 s 0.000013544299981731456 s 1.00
sum / PartOpt / cpu / PreRev 0.000012229200028741617 s 0.000011844500004372094 s 1.03
sum / PartOpt / cpu / PostRev 0.00001193619998957729 s 0.000011333900010868092 s 1.05
sum / PartOpt / cpu / BothRev 0.000012072679955963397 s 0.000011511460006659036 s 1.05
sum / IPartOpt / cpu / PreRev 0.00001194627997392672 s 0.000013520800030164537 s 0.88
sum / IPartOpt / cpu / PostRev 0.000012495500031945994 s 0.000011325380037305876 s 1.10
sum / IPartOpt / cpu / BothRev 0.000011917880019609584 s 0.000011000579979736358 s 1.08
sum / DefOpt / cpu / PreRev 0.000012012799988951885 s 0.000011095880036009477 s 1.08
sum / DefOpt / cpu / PostRev 0.00001192972001263115 s 0.000011411460000090302 s 1.05
sum / DefOpt / cpu / BothRev 0.000012338120004642406 s 0.00001147460001448053 s 1.08
sum / IDefOpt / cpu / PreRev 0.000012404099925333868 s 0.000010913199957940378 s 1.14
sum / IDefOpt / cpu / PostRev 0.000012005920025330852 s 0.000011719559970515548 s 1.02
sum / IDefOpt / cpu / BothRev 0.000012019479954687996 s 0.000011089339977843338 s 1.08
sum / JaXPipe / cuda / Primal 0.000002048 s 0.000002048 s 1
sum / Jax / cuda / Primal 0.000002047 s 0.000002048 s 1.00
sum / HLOOpt / cuda / Primal 0.000002047 s 0.000002047 s 1
sum / PartOpt / cuda / Primal 0.000002048 s 0.00000208 s 0.98
sum / IPartOpt / cuda / Primal 0.000002048 s 0.00000208 s 0.98
sum / DefOpt / cuda / Primal 0.000002048 s 0.00000208 s 0.98
sum / IDefOpt / cuda / Primal 0.000002047 s 0.00000208 s 0.98
sum / JaXPipe / cuda / Forward 0.000010016 s 0.000010176 s 0.98
sum / Jax / cuda / Forward 0.000010016 s 0.00001008 s 0.99
sum / HLOOpt / cuda / Forward 0.000010016 s 0.00001008 s 0.99
sum / PartOpt / cuda / Forward 0.000010048 s 0.000010272 s 0.98
sum / IPartOpt / cuda / Forward 0.00001024 s 0.000010272 s 1.00
sum / DefOpt / cuda / Forward 0.000010272 s 0.000009952 s 1.03
sum / IDefOpt / cuda / Forward 0.000009504 s 0.000010336 s 0.92
sum / JaXPipe / cuda / PreRev 0.000009984 s 0.000009856 s 1.01
sum / JaXPipe / cuda / PostRev 0.000009952 s 0.00000976 s 1.02
sum / JaXPipe / cuda / BothRev 0.000009696 s 0.000010048 s 0.96
sum / Jax / cuda / BothRev 0.000009345 s 0.000009888 s 0.95
sum / HLOOpt / cuda / PreRev 0.000009824 s 0.00000928 s 1.06
sum / HLOOpt / cuda / PostRev 0.000010176 s 0.000009792 s 1.04
sum / HLOOpt / cuda / BothRev 0.000009856 s 0.000009664 s 1.02
sum / PartOpt / cuda / PreRev 0.000009696 s 0.000009824 s 0.99
sum / PartOpt / cuda / PostRev 0.00001008 s 0.000009792 s 1.03
sum / PartOpt / cuda / BothRev 0.000010112 s 0.000009952 s 1.02
sum / IPartOpt / cuda / PreRev 0.000009984 s 0.000009663 s 1.03
sum / IPartOpt / cuda / PostRev 0.00001024 s 0.000009536 s 1.07
sum / IPartOpt / cuda / BothRev 0.000009856 s 0.000009536 s 1.03
sum / DefOpt / cuda / PreRev 0.00000976 s 0.000009888 s 0.99
sum / DefOpt / cuda / PostRev 0.000009728 s 0.000009729 s 1.00
sum / DefOpt / cuda / BothRev 0.00001024 s 0.000009504 s 1.08
sum / IDefOpt / cuda / PreRev 0.0000104 s 0.00000992 s 1.05
sum / IDefOpt / cuda / PostRev 0.00000992 s 0.000009664 s 1.03
sum / IDefOpt / cuda / BothRev 0.000009568 s 0.000009504 s 1.01
sum / JaXPipe / tpu / Primal 5.03e-7 s 5.10575e-7 s 0.99
sum / Jax / tpu / Primal 5.566999999999999e-7 s 5.584499999999999e-7 s 1.00
sum / HLOOpt / tpu / Primal 5.1305e-7 s 5.2135e-7 s 0.98
sum / PartOpt / tpu / Primal 5.57075e-7 s 5.5805e-7 s 1.00
sum / IPartOpt / tpu / Primal 5.129000000000001e-7 s 5.21425e-7 s 0.98
sum / DefOpt / tpu / Primal 5.5695e-7 s 5.58625e-7 s 1.00
sum / IDefOpt / tpu / Primal 5.1275e-7 s 5.215750000000001e-7 s 0.98
sum / JaXPipe / tpu / Forward 0.0000015509 s 0.0000015555749999999995 s 1.00
sum / Jax / tpu / Forward 0.000001492 s 0.00000149725 s 1.00
sum / HLOOpt / tpu / Forward 0.000001528575 s 0.000001531225 s 1.00
sum / PartOpt / tpu / Forward 0.000001492575 s 0.000001504575 s 0.99
sum / IPartOpt / tpu / Forward 0.000001531775 s 0.0000015308250000000005 s 1.00
sum / DefOpt / tpu / Forward 0.0000014939749999999995 s 0.000001492625 s 1.00
sum / IDefOpt / tpu / Forward 0.000001528425 s 0.0000015318 s 1.00
sum / JaXPipe / tpu / PreRev 9.9025e-7 s 0.000001052375 s 0.94
sum / JaXPipe / tpu / PostRev 0.000001051775 s 0.00000109095 s 0.96
sum / JaXPipe / tpu / BothRev 9.921e-7 s 0.0000010573 s 0.94
sum / Jax / tpu / BothRev 0.0000010337499999999998 s 0.0000010866 s 0.95
sum / HLOOpt / tpu / PreRev 9.92875e-7 s 0.000001045525 s 0.95
sum / HLOOpt / tpu / PostRev 0.000001040125 s 0.00000108695 s 0.96
sum / HLOOpt / tpu / BothRev 9.90525e-7 s 0.0000010473499999999998 s 0.95
sum / PartOpt / tpu / PreRev 0.000001033275 s 0.00000108655 s 0.95
sum / PartOpt / tpu / PostRev 9.96925e-7 s 0.000001055375 s 0.94
sum / PartOpt / tpu / BothRev 0.0000010386 s 0.000001086675 s 0.96
sum / IPartOpt / tpu / PreRev 9.92825e-7 s 0.0000010486 s 0.95
sum / IPartOpt / tpu / PostRev 0.00000104065 s 0.000001084325 s 0.96
sum / IPartOpt / tpu / BothRev 9.93825e-7 s 0.000001049875 s 0.95
sum / DefOpt / tpu / PreRev 0.00000103695 s 0.000001085475 s 0.96
sum / DefOpt / tpu / PostRev 9.962e-7 s 0.0000010529 s 0.95
sum / DefOpt / tpu / BothRev 0.0000010416499999999998 s 0.00000108915 s 0.96
sum / IDefOpt / tpu / PreRev 9.96175e-7 s 0.00000104825 s 0.95
sum / IDefOpt / tpu / PostRev 0.000001046875 s 0.00000109085 s 0.96
sum / IDefOpt / tpu / BothRev 9.876e-7 s 0.000001048925 s 0.94
sum / JaXPipe / cpu / Primal 0.000014772 s 0.000009182520043395926 s 1.61
sum / Jax / cpu / Primal 0.000014617 s 0.000008828279960653163 s 1.66
sum / HLOOpt / cpu / Primal 0.00001506 s 0.000012311380023675156 s 1.22
sum / PartOpt / cpu / Primal 0.000014692 s 0.000008517040032529622 s 1.73
sum / IPartOpt / cpu / Primal 0.000014378 s 0.00000768539998716733 s 1.87
sum / DefOpt / cpu / Primal 0.000014507 s 0.000008223459944929346 s 1.76
sum / IDefOpt / cpu / Primal 0.000014602 s 0.00000776467998548469 s 1.88
sum / JaXPipe / cpu / Forward 0.000019846 s 0.000011874920019181446 s 1.67
sum / Jax / cpu / Forward 0.000020383 s 0.0000123945599898434 s 1.64
sum / HLOOpt / cpu / Forward 0.000020161 s 0.00001648988003580598 s 1.22
sum / PartOpt / cpu / Forward 0.000020074 s 0.000016962420004347224 s 1.18
sum / IPartOpt / cpu / Forward 0.000020236 s 0.000011852979978357326 s 1.71
sum / DefOpt / cpu / Forward 0.000020103 s 0.000017227739990630654 s 1.17
sum / IDefOpt / cpu / Forward 0.000020028 s 0.000012147320021540508 s 1.65
sum / JaXPipe / cpu / PreRev 0.000019266 s 0.00001209846000165271 s 1.59
sum / JaXPipe / cpu / PostRev 0.000019116 s 0.000012334540024312446 s 1.55
sum / JaXPipe / cpu / BothRev 0.000019605 s 0.00001149539999460103 s 1.71
sum / Jax / cpu / BothRev 0.000019227 s 0.000011714179963746574 s 1.64
sum / HLOOpt / cpu / PreRev 0.000018577 s 0.00001129744000536448 s 1.64
sum / HLOOpt / cpu / PostRev 0.000019268 s 0.00001198017997921852 s 1.61
sum / HLOOpt / cpu / BothRev 0.000019988 s 0.000013544299981731456 s 1.48
sum / PartOpt / cpu / PreRev 0.000019255 s 0.000011844500004372094 s 1.63
sum / PartOpt / cpu / PostRev 0.000019342 s 0.000011333900010868092 s 1.71
sum / PartOpt / cpu / BothRev 0.000019047 s 0.000011511460006659036 s 1.65
sum / IPartOpt / cpu / PreRev 0.000019193 s 0.000013520800030164537 s 1.42
sum / IPartOpt / cpu / PostRev 0.000020248 s 0.000011325380037305876 s 1.79
sum / IPartOpt / cpu / BothRev 0.00001947 s 0.000011000579979736358 s 1.77
sum / DefOpt / cpu / PreRev 0.0000194 s 0.000011095880036009477 s 1.75
sum / DefOpt / cpu / PostRev 0.000019899 s 0.000011411460000090302 s 1.74
sum / DefOpt / cpu / BothRev 0.000019542 s 0.00001147460001448053 s 1.70
sum / IDefOpt / cpu / PreRev 0.000019256 s 0.000010913199957940378 s 1.76
sum / IDefOpt / cpu / PostRev 0.000019245000000000003 s 0.000011719559970515548 s 1.64
sum / IDefOpt / cpu / BothRev 0.000019839 s 0.000011089339977843338 s 1.79
sum / JaXPipe / cpu / Primal 0.00001 s 0.000009182520043395926 s 1.09
sum / Jax / cpu / Primal 0.00001 s 0.000008828279960653163 s 1.13
sum / HLOOpt / cpu / Primal 0.00001 s 0.000012311380023675156 s 0.81
sum / PartOpt / cpu / Primal 0.00001 s 0.000008517040032529622 s 1.17
sum / IPartOpt / cpu / Primal 0.00001 s 0.00000768539998716733 s 1.30
sum / DefOpt / cpu / Primal 0.00001 s 0.000008223459944929346 s 1.22
sum / IDefOpt / cpu / Primal 0.00001 s 0.00000776467998548469 s 1.29
sum / JaXPipe / cpu / Forward 0.000045 s 0.000011874920019181446 s 3.79
sum / Jax / cpu / Forward 0.000015 s 0.0000123945599898434 s 1.21
sum / HLOOpt / cpu / Forward 0.000015 s 0.00001648988003580598 s 0.91
sum / PartOpt / cpu / Forward 0.000015 s 0.000016962420004347224 s 0.88
sum / IPartOpt / cpu / Forward 0.000014 s 0.000011852979978357326 s 1.18
sum / DefOpt / cpu / Forward 0.000014 s 0.000017227739990630654 s 0.81
sum / IDefOpt / cpu / Forward 0.000014 s 0.000012147320021540508 s 1.15
sum / JaXPipe / cpu / PreRev 0.000016 s 0.00001209846000165271 s 1.32
sum / JaXPipe / cpu / PostRev 0.000014 s 0.000012334540024312446 s 1.14
sum / JaXPipe / cpu / BothRev 0.000013 s 0.00001149539999460103 s 1.13
sum / Jax / cpu / BothRev 0.000014 s 0.000011714179963746574 s 1.20
sum / HLOOpt / cpu / PreRev 0.000013 s 0.00001129744000536448 s 1.15
sum / HLOOpt / cpu / PostRev 0.000013 s 0.00001198017997921852 s 1.09
sum / HLOOpt / cpu / BothRev 0.000013 s 0.000013544299981731456 s 0.96
sum / PartOpt / cpu / PreRev 0.000042 s 0.000011844500004372094 s 3.55
sum / PartOpt / cpu / PostRev 0.000019 s 0.000011333900010868092 s 1.68
sum / PartOpt / cpu / BothRev 0.000014 s 0.000011511460006659036 s 1.22
sum / IPartOpt / cpu / PreRev 0.000044 s 0.000013520800030164537 s 3.25
sum / IPartOpt / cpu / PostRev 0.000013 s 0.000011325380037305876 s 1.15
sum / IPartOpt / cpu / BothRev 0.000014 s 0.000011000579979736358 s 1.27
sum / DefOpt / cpu / PreRev 0.000016 s 0.000011095880036009477 s 1.44
sum / DefOpt / cpu / PostRev 0.000013 s 0.000011411460000090302 s 1.14
sum / DefOpt / cpu / BothRev 0.000014 s 0.00001147460001448053 s 1.22
sum / IDefOpt / cpu / PreRev 0.000013 s 0.000010913199957940378 s 1.19
sum / IDefOpt / cpu / PostRev 0.000013 s 0.000011719559970515548 s 1.11
sum / IDefOpt / cpu / BothRev 0.000013 s 0.000011089339977843338 s 1.17
value_and_grad / JaXPipe / cpu / Primal 0.000016502819989909766 s 0.000015938259984977776 s 1.04
value_and_grad / Jax / cpu / Primal 0.000016065919953689445 s 0.00001551725998069742 s 1.04
value_and_grad / HLOOpt / cpu / Primal 0.00001577392002218403 s 0.000014285360011854209 s 1.10
value_and_grad / PartOpt / cpu / Primal 0.00001569205999658152 s 0.000014652620029664833 s 1.07
value_and_grad / IPartOpt / cpu / Primal 0.00001533910000944161 s 0.00001474139997299062 s 1.04
value_and_grad / DefOpt / cpu / Primal 0.000015350619987657412 s 0.000014477040012934594 s 1.06
value_and_grad / IDefOpt / cpu / Primal 0.000016010740037017967 s 0.000015675319991714788 s 1.02
value_and_grad / JaXPipe / cuda / Primal 0.000033759999999999995 s 0.000032769 s 1.03
value_and_grad / Jax / cuda / Primal 0.000034176 s 0.000032736 s 1.04
value_and_grad / HLOOpt / cuda / Primal 0.000033856 s 0.000033024 s 1.03
value_and_grad / PartOpt / cuda / Primal 0.000033664 s 0.000033024 s 1.02
value_and_grad / IPartOpt / cuda / Primal 0.000034528000000000006 s 0.000033728 s 1.02
value_and_grad / DefOpt / cuda / Primal 0.000034432 s 0.00003328 s 1.03
value_and_grad / IDefOpt / cuda / Primal 0.000034688 s 0.000032928 s 1.05
value_and_grad / JaXPipe / tpu / Primal 0 s 0 s 1
value_and_grad / Jax / tpu / Primal 0 s 0 s 1
value_and_grad / HLOOpt / tpu / Primal 0 s 0 s 1
value_and_grad / PartOpt / tpu / Primal 0 s 0 s 1
value_and_grad / IPartOpt / tpu / Primal 0 s 0 s 1
value_and_grad / DefOpt / tpu / Primal 0 s 0 s 1
value_and_grad / IDefOpt / tpu / Primal 0 s 0 s 1
value_and_grad / JaXPipe / cpu / Primal 0.000023964 s 0.000015938259984977776 s 1.50
value_and_grad / Jax / cpu / Primal 0.000022842000000000003 s 0.00001551725998069742 s 1.47
value_and_grad / HLOOpt / cpu / Primal 0.000023582 s 0.000014285360011854209 s 1.65
value_and_grad / PartOpt / cpu / Primal 0.000023285 s 0.000014652620029664833 s 1.59
value_and_grad / IPartOpt / cpu / Primal 0.000024032 s 0.00001474139997299062 s 1.63
value_and_grad / DefOpt / cpu / Primal 0.00002405 s 0.000014477040012934594 s 1.66
value_and_grad / IDefOpt / cpu / Primal 0.000022984 s 0.000015675319991714788 s 1.47
value_and_grad / JaXPipe / cpu / Primal 0.000016 s 0.000015938259984977776 s 1.00
value_and_grad / Jax / cpu / Primal 0.000017 s 0.00001551725998069742 s 1.10
value_and_grad / HLOOpt / cpu / Primal 0.000016 s 0.000014285360011854209 s 1.12
value_and_grad / PartOpt / cpu / Primal 0.000016 s 0.000014652620029664833 s 1.09
value_and_grad / IPartOpt / cpu / Primal 0.000016 s 0.00001474139997299062 s 1.09
value_and_grad / DefOpt / cpu / Primal 0.000016 s 0.000014477040012934594 s 1.11
value_and_grad / IDefOpt / cpu / Primal 0.000052 s 0.000015675319991714788 s 3.32
jaxmd20 / JaXPipe / cuda / Primal 0.001538818 s 0.001489091 s 1.03
jaxmd20 / Jax / cuda / Primal 0.001473666 s 0.001436738 s 1.03
jaxmd20 / HLOOpt / cuda / Primal 0.001105762 s 0.001079234 s 1.02
jaxmd20 / PartOpt / cuda / Primal 0.001343555 s 0.001282083 s 1.05
jaxmd20 / IPartOpt / cuda / Primal 0.001350146 s 0.001299937 s 1.04
jaxmd20 / DefOpt / cuda / Primal 0.000527233 s 0.000553953 s 0.95
jaxmd20 / IDefOpt / cuda / Primal 0.000501089 s 0.000493377 s 1.02
jaxmd20 / JaXPipe / cuda / Forward 0.000812449 s 0.000823393 s 0.99
jaxmd20 / Jax / cuda / Forward 0.001788387 s 0.001806562 s 0.99
jaxmd20 / HLOOpt / cuda / Forward 0.000828225 s 0.000826082 s 1.00
jaxmd20 / PartOpt / cuda / Forward 0.000824097 s 0.000817729 s 1.01
jaxmd20 / IPartOpt / cuda / Forward 0.000823585 s 0.0008304009999999 s 0.99
jaxmd20 / DefOpt / cuda / Forward 0.000843456 s 0.000861762 s 0.98
jaxmd20 / IDefOpt / cuda / Forward 0.000825538 s 0.000861697 s 0.96
jaxmd20 / JaXPipe / cuda / PreRev 0.001698179 s 0.001678978 s 1.01
jaxmd20 / JaXPipe / cuda / PostRev 0.005286792 s 0.005292679 s 1.00
jaxmd20 / JaXPipe / cuda / BothRev 0.001687587 s 0.001657699 s 1.02
jaxmd20 / Jax / cuda / BothRev 0.005284104 s 0.005276871 s 1.00
jaxmd20 / HLOOpt / cuda / PreRev 0.001715586 s 0.001739106 s 0.99
jaxmd20 / HLOOpt / cuda / PostRev 0.005190983 s 0.0051658 s 1.00
jaxmd20 / HLOOpt / cuda / BothRev 0.0016418589999999 s 0.00163197 s 1.01
jaxmd20 / PartOpt / cuda / PreRev 0.001729571 s 0.001712547 s 1.01
jaxmd20 / PartOpt / cuda / PostRev 0.005369672 s 0.0053235919999999 s 1.01
jaxmd20 / PartOpt / cuda / BothRev 0.001691267 s 0.0016367069999999 s 1.03
jaxmd20 / IPartOpt / cuda / PreRev 0.001720162 s 0.001706818 s 1.01
jaxmd20 / IPartOpt / cuda / PostRev 0.005377992 s 0.005346889 s 1.01
jaxmd20 / IPartOpt / cuda / BothRev 0.0016504369999999 s 0.001632098 s 1.01
jaxmd20 / DefOpt / cuda / PreRev 0.001719395 s 0.001707459 s 1.01
jaxmd20 / DefOpt / cuda / PostRev 0.002747779 s 0.002712549 s 1.01
jaxmd20 / DefOpt / cuda / BothRev 0.001641153 s 0.0016335059999999 s 1.00
jaxmd20 / IDefOpt / cuda / PreRev 0.00174365 s 0.001709282 s 1.02
jaxmd20 / IDefOpt / cuda / PostRev 0.001989765 s 0.001992451 s 1.00
jaxmd20 / IDefOpt / cuda / BothRev 0.001643491 s 0.001657283 s 0.99
jaxmd20 / JaXPipe / tpu / Primal 0.009274043125 s 0.0092747475 s 1.00
jaxmd20 / Jax / tpu / Primal 0.009265734375 s 0.009263269375 s 1.00
jaxmd20 / HLOOpt / tpu / Primal 0.0091657775 s 0.009166279375 s 1.00
jaxmd20 / PartOpt / tpu / Primal 0.0091969462499999 s 0.009197805625 s 1.00
jaxmd20 / IPartOpt / tpu / Primal 0.00920220875 s 0.0092011981249999 s 1.00
jaxmd20 / DefOpt / tpu / Primal 0.008745315625 s 0.0087459237499999 s 1.00
jaxmd20 / IDefOpt / tpu / Primal 0.008631079375 s 0.0086314499999999 s 1.00
jaxmd20 / JaXPipe / tpu / Forward 0.017264915 s 0.0172624974999999 s 1.00
jaxmd20 / Jax / tpu / Forward 0.01872638375 s 0.018729604375 s 1.00
jaxmd20 / HLOOpt / tpu / Forward 0.017236946875 s 0.017236786875 s 1.00
jaxmd20 / PartOpt / tpu / Forward 0.017267438125 s 0.017269618125 s 1.00
jaxmd20 / IPartOpt / tpu / Forward 0.0172611375 s 0.0172633 s 1.00
jaxmd20 / DefOpt / tpu / Forward 0.017265241875 s 0.017262036875 s 1.00
jaxmd20 / IDefOpt / tpu / Forward 0.017264225 s 0.017262990625 s 1.00
jaxmd20 / JaXPipe / tpu / PreRev 0.025345483125 s 0.02533983375 s 1.00
jaxmd20 / JaXPipe / tpu / PostRev 0.0218921199999999 s 0.021892336875 s 1.00
jaxmd20 / JaXPipe / tpu / BothRev 0.0253572975 s 0.025355716875 s 1.00
jaxmd20 / Jax / tpu / BothRev 0.021891861875 s 0.021893774375 s 1.00
jaxmd20 / HLOOpt / tpu / PreRev 0.025349535 s 0.02535478625 s 1.00
jaxmd20 / HLOOpt / tpu / PostRev 0.02098512875 s 0.020731351875 s 1.01
jaxmd20 / HLOOpt / tpu / BothRev 0.025265225625 s 0.0252639975 s 1.00
jaxmd20 / PartOpt / tpu / PreRev 0.02536283625 s 0.0253615693749999 s 1.00
jaxmd20 / PartOpt / tpu / PostRev 0.021509740625 s 0.021509948125 s 1.00
jaxmd20 / PartOpt / tpu / BothRev 0.0252895174999999 s 0.02528850625 s 1.00
jaxmd20 / IPartOpt / tpu / PreRev 0.02534875375 s 0.025349745 s 1.00
jaxmd20 / IPartOpt / tpu / PostRev 0.0215413031249999 s 0.02154356125 s 1.00
jaxmd20 / IPartOpt / tpu / BothRev 0.025262615 s 0.025260910625 s 1.00
jaxmd20 / DefOpt / tpu / PreRev 0.0253608881249999 s 0.025364399375 s 1.00
jaxmd20 / DefOpt / tpu / PostRev 0.01877164375 s 0.018772566875 s 1.00
jaxmd20 / DefOpt / tpu / BothRev 0.0252865962499999 s 0.02528473125 s 1.00
jaxmd20 / IDefOpt / tpu / PreRev 0.02534712375 s 0.02534748625 s 1.00
jaxmd20 / IDefOpt / tpu / PostRev 0.01811546125 s 0.01811826 s 1.00
jaxmd20 / IDefOpt / tpu / BothRev 0.025260875625 s 0.02526199 s 1.00
jaxmd40 / JaXPipe / cpu / Primal 0.088139626 s 0.064389043 s 1.37
jaxmd40 / Jax / cpu / Primal 0.07484411 s 0.075098109 s 1.00
jaxmd40 / HLOOpt / cpu / Primal 0.08396951 s 0.09393414 s 0.89
jaxmd40 / PartOpt / cpu / Primal 0.063254848 s 0.065813923 s 0.96
jaxmd40 / IPartOpt / cpu / Primal 0.074285567 s 0.071054981 s 1.05
jaxmd40 / DefOpt / cpu / Primal 0.090359072 s 0.088649606 s 1.02
jaxmd40 / IDefOpt / cpu / Primal 0.092169058 s 0.08105066 s 1.14
jaxmd40 / JaXPipe / cpu / Forward 0.164415489 s 0.1586726459999999 s 1.04
jaxmd40 / Jax / cpu / Forward 0.104059021 s 0.090318597 s 1.15
jaxmd40 / HLOOpt / cpu / Forward 0.1636139049999999 s 0.164154542 s 1.00
jaxmd40 / PartOpt / cpu / Forward 0.161517569 s 0.156601635 s 1.03
jaxmd40 / IPartOpt / cpu / Forward 0.166110405 s 0.164045594 s 1.01
jaxmd40 / DefOpt / cpu / Forward 0.172983424 s 0.153992216 s 1.12
jaxmd40 / IDefOpt / cpu / Forward 0.174806117 s 0.156165278 s 1.12
jaxmd40 / JaXPipe / cpu / PreRev 0.256795694 s 0.2168472209999999 s 1.18
jaxmd40 / JaXPipe / cpu / PostRev 0.143777397 s 0.137391516 s 1.05
jaxmd40 / JaXPipe / cpu / BothRev 0.239807983 s 0.224425981 s 1.07
jaxmd40 / Jax / cpu / BothRev 0.147690925 s 0.1403212519999999 s 1.05
jaxmd40 / HLOOpt / cpu / PreRev 0.221257155 s 0.229696034 s 0.96
jaxmd40 / HLOOpt / cpu / PostRev 0.181558132 s 0.1734641039999999 s 1.05
jaxmd40 / HLOOpt / cpu / BothRev 0.2519533419999999 s 0.238734951 s 1.06
jaxmd40 / PartOpt / cpu / PreRev 0.2345179339999999 s 0.230188916 s 1.02
jaxmd40 / PartOpt / cpu / PostRev 0.139092571 s 0.1301312979999999 s 1.07
jaxmd40 / PartOpt / cpu / BothRev 0.237424331 s 0.2444349479999999 s 0.97
jaxmd40 / IPartOpt / cpu / PreRev 0.241023134 s 0.21103832 s 1.14
jaxmd40 / IPartOpt / cpu / PostRev 0.126814143 s 0.133954614 s 0.95
jaxmd40 / IPartOpt / cpu / BothRev 0.248520878 s 0.235675218 s 1.05
jaxmd40 / DefOpt / cpu / PreRev 0.224910529 s 0.218206029 s 1.03
jaxmd40 / DefOpt / cpu / PostRev 0.16998 s 0.170328459 s 1.00
jaxmd40 / DefOpt / cpu / BothRev 0.257545449 s 0.25450203 s 1.01
jaxmd40 / IDefOpt / cpu / PreRev 0.217652286 s 0.221404833 s 0.98
jaxmd40 / IDefOpt / cpu / PostRev 0.17534624 s 0.1760200079999999 s 1.00
jaxmd40 / IDefOpt / cpu / BothRev 0.257813627 s 0.22752861 s 1.13
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal 1.701544392 s 1.705519003 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal 1.704584304 s 1.707962428 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal 1.716863181 s 1.7195978399999998 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal 1.696285122 s 1.699366622 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal 1.695293262 s 1.69751975 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal 1.6638795279999998 s 1.6681674169999998 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal 1.910072394 s 1.915522691 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal 3.988526180625 s 4.01699938375 s 0.99
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal 3.038666975625 s 3.03880095375 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal 3.121071391875 s 3.121094105625 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal 3.059029013125 s 3.059155715 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal 3.05898760625 s 3.059077515 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal 2.102623036875 s 2.102721625625 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal 4.356143315000001 s 4.356271070625 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal 6.223370612 s 5.851205109 s 1.06
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal 6.110045841 s 5.961654927 s 1.02
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal 6.118277398999999 s 6.033430212 s 1.01
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal 6.192022785 s 6.018784604 s 1.03
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal 6.124048152 s 5.926606443 s 1.03
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal 2.375033473 s 2.251407652 s 1.05
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal 6.706174891 s 6.355471251 s 1.06

This comment was automatically generated by workflow using github-action-benchmark.

@jumerckx jumerckx force-pushed the jm/multirotate_recognize branch from a251345 to e8acf0b Compare January 24, 2026 17:34
@jumerckx
Collaborator Author

A (positive) rotation amount signifies a rotation to the left. I've added this to RotateOp's description.
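
To make the convention concrete, here is a minimal sketch in plain Python. It is not the RotateOp implementation; `rotate_left` is a hypothetical helper, and the sketch assumes that "rotation to the left" means the element at index `i` moves to index `i - amount`, wrapping around at the ends.

```python
def rotate_left(values, amount):
    """Rotate a list left by `amount` positions, wrapping around.

    Assumption for illustration: a positive amount moves every element
    towards the front, so the first `amount` elements wrap to the back.
    """
    n = len(values)
    if n == 0:
        return []
    k = amount % n  # a negative amount becomes the equivalent left rotation
    return values[k:] + values[:k]


x = [0, 1, 2, 3, 4]
a = rotate_left(x, 0)  # unchanged
b = rotate_left(x, 1)  # first element wraps around to the end
c = rotate_left(x, 2)  # first two elements wrap around to the end
```

Under this reading, `rotate_left(x, -1)` is a rotation to the right by one position.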

@wsmoses
Member

wsmoses commented Jan 24, 2026

Needs a rebase; otherwise good to merge!

@wsmoses wsmoses merged commit bb071ab into main Jan 26, 2026
24 of 25 checks passed
@wsmoses wsmoses deleted the jm/multirotate_recognize branch January 26, 2026 06:31