feat: compare op simplification for non-negative operands #1916
Merged
536d134 to e29d921
e29d921 to c2ea08a
c2ea08a to c9032fc
wsmoses approved these changes on Jan 9, 2026
Contributor
EnzymeJAX Benchmarks
| Benchmark suite | Current: c9032fc | Previous: 6fa5904 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cpu / Primal | 0.000007201319931482431 s | 0.000013693 s | 0.53 |
| actmtch / Jax / cpu / Primal | 0.000006899879936099751 s | 0.000013585 s | 0.51 |
| actmtch / HLOOpt / cpu / Primal | 0.000008180900003935676 s | 0.000014202 s | 0.58 |
| actmtch / PartOpt / cpu / Primal | 0.00000693867999871145 s | 0.000013269 s | 0.52 |
| actmtch / IPartOpt / cpu / Primal | 0.000007204520061350195 s | 0.000013422 s | 0.54 |
| actmtch / DefOpt / cpu / Primal | 0.00000776888005930232 s | 0.000013765 s | 0.56 |
| actmtch / IDefOpt / cpu / Primal | 0.000007733199981885264 s | 0.000014063 s | 0.55 |
| actmtch / JaXPipe / cpu / Forward | 0.000011701580078806728 s | 0.000019581 s | 0.60 |
| actmtch / Jax / cpu / Forward | 0.000010763479931483744 s | 0.000017748 s | 0.61 |
| actmtch / HLOOpt / cpu / Forward | 0.000011524779947649222 s | 0.000018994 s | 0.61 |
| actmtch / PartOpt / cpu / Forward | 0.000011748980014090194 s | 0.000018623 s | 0.63 |
| actmtch / IPartOpt / cpu / Forward | 0.000012245100006111896 s | 0.000019281 s | 0.64 |
| actmtch / DefOpt / cpu / Forward | 0.000011359860091033624 s | 0.000018901 s | 0.60 |
| actmtch / IDefOpt / cpu / Forward | 0.000011715440032276091 s | 0.000019099 s | 0.61 |
| actmtch / JaXPipe / cpu / PreRev | 0.000011614999875746434 s | 0.000019544 s | 0.59 |
| actmtch / JaXPipe / cpu / PostRev | 0.000011189559954800642 s | 0.000017476 s | 0.64 |
| actmtch / JaXPipe / cpu / BothRev | 0.000011909900058526544 s | 0.000019317 s | 0.62 |
| actmtch / Jax / cpu / BothRev | 0.000010502559998712968 s | 0.000017624 s | 0.60 |
| actmtch / HLOOpt / cpu / PreRev | 0.000012520419932116057 s | 0.000019388 s | 0.65 |
| actmtch / HLOOpt / cpu / PostRev | 0.000013937340081611185 s | 0.000019539 s | 0.71 |
| actmtch / HLOOpt / cpu / BothRev | 0.000012083559995517135 s | 0.000019324 s | 0.63 |
| actmtch / PartOpt / cpu / PreRev | 0.000011736840024241246 s | 0.000019265 s | 0.61 |
| actmtch / PartOpt / cpu / PostRev | 0.000011038480006391182 s | 0.000017578 s | 0.63 |
| actmtch / PartOpt / cpu / BothRev | 0.00001202295998155023 s | 0.000019093 s | 0.63 |
| actmtch / IPartOpt / cpu / PreRev | 0.00001199831991471001 s | 0.000019073 s | 0.63 |
| actmtch / IPartOpt / cpu / PostRev | 0.000010725960055424369 s | 0.000017405 s | 0.62 |
| actmtch / IPartOpt / cpu / BothRev | 0.00001221437998538022 s | 0.000019417 s | 0.63 |
| actmtch / DefOpt / cpu / PreRev | 0.000011741720027202972 s | 0.000019016 s | 0.62 |
| actmtch / DefOpt / cpu / PostRev | 0.000012539139970613175 s | 0.000019018 s | 0.66 |
| actmtch / DefOpt / cpu / BothRev | 0.000011934440008189995 s | 0.000019349 s | 0.62 |
| actmtch / IDefOpt / cpu / PreRev | 0.00001198208001369494 s | 0.000019193 s | 0.62 |
| actmtch / IDefOpt / cpu / PostRev | 0.00001256468001884059 s | 0.000018937 s | 0.66 |
| actmtch / IDefOpt / cpu / BothRev | 0.00001217254000948742 s | 0.000019236 s | 0.63 |
| actmtch / JaXPipe / cuda / Primal | 0.000002016 s | 0.0000024 s | 0.84 |
| actmtch / Jax / cuda / Primal | 0.000002016 s | 0.0000024 s | 0.84 |
| actmtch / HLOOpt / cuda / Primal | 0.000002015 s | 0.0000024 s | 0.84 |
| actmtch / PartOpt / cuda / Primal | 0.000002015 s | 0.0000024 s | 0.84 |
| actmtch / IPartOpt / cuda / Primal | 0.000002015 s | 0.0000024 s | 0.84 |
| actmtch / DefOpt / cuda / Primal | 0.000002015 s | 0.0000024 s | 0.84 |
| actmtch / IDefOpt / cuda / Primal | 0.000002016 s | 0.0000024 s | 0.84 |
| actmtch / JaXPipe / cuda / Forward | 0.000010112 s | 0.000010687 s | 0.95 |
| actmtch / Jax / cuda / Forward | 0.000009792 s | 0.00001104 s | 0.89 |
| actmtch / HLOOpt / cuda / Forward | 0.000009793 s | 0.0000104 s | 0.94 |
| actmtch / PartOpt / cuda / Forward | 0.00001008 s | 0.000010848 s | 0.93 |
| actmtch / IPartOpt / cuda / Forward | 0.00001008 s | 0.000010528 s | 0.96 |
| actmtch / DefOpt / cuda / Forward | 0.000010112 s | 0.00001088 s | 0.93 |
| actmtch / IDefOpt / cuda / Forward | 0.000009888 s | 0.000011071 s | 0.89 |
| actmtch / JaXPipe / cuda / PreRev | 0.000009664 s | 0.000010432 s | 0.93 |
| actmtch / JaXPipe / cuda / PostRev | 0.000010048 s | 0.00001072 s | 0.94 |
| actmtch / JaXPipe / cuda / BothRev | 0.000009824 s | 0.000010496 s | 0.94 |
| actmtch / Jax / cuda / BothRev | 0.000010304 s | 0.000010849 s | 0.95 |
| actmtch / HLOOpt / cuda / PreRev | 0.000010048 s | 0.000010592 s | 0.95 |
| actmtch / HLOOpt / cuda / PostRev | 0.000010144 s | 0.000010783 s | 0.94 |
| actmtch / HLOOpt / cuda / BothRev | 0.000010272 s | 0.000010528 s | 0.98 |
| actmtch / PartOpt / cuda / PreRev | 0.00001008 s | 0.00001072 s | 0.94 |
| actmtch / PartOpt / cuda / PostRev | 0.000010048 s | 0.000010752 s | 0.93 |
| actmtch / PartOpt / cuda / BothRev | 0.000010049 s | 0.000010047 s | 1.00 |
| actmtch / IPartOpt / cuda / PreRev | 0.000010016 s | 0.000010463 s | 0.96 |
| actmtch / IPartOpt / cuda / PostRev | 0.000010208 s | 0.000011424 s | 0.89 |
| actmtch / IPartOpt / cuda / BothRev | 0.000010112 s | 0.000010656 s | 0.95 |
| actmtch / DefOpt / cuda / PreRev | 0.00001136 s | 0.000010688 s | 1.06 |
| actmtch / DefOpt / cuda / PostRev | 0.000009951 s | 0.000010624 s | 0.94 |
| actmtch / DefOpt / cuda / BothRev | 0.000010144 s | 0.000010624 s | 0.95 |
| actmtch / IDefOpt / cuda / PreRev | 0.00000976 s | 0.00001072 s | 0.91 |
| actmtch / IDefOpt / cuda / PostRev | 0.000010176 s | 0.000010272 s | 0.99 |
| actmtch / IDefOpt / cuda / BothRev | 0.000010112 s | 0.00001088 s | 0.93 |
| actmtch / JaXPipe / cpu / Primal | 0.000017187 s | 0.000013693 s | 1.26 |
| actmtch / Jax / cpu / Primal | 0.000017111000000000003 s | 0.000013585 s | 1.26 |
| actmtch / HLOOpt / cpu / Primal | 0.000017604000000000003 s | 0.000014202 s | 1.24 |
| actmtch / PartOpt / cpu / Primal | 0.000016953 s | 0.000013269 s | 1.28 |
| actmtch / IPartOpt / cpu / Primal | 0.000017122 s | 0.000013422 s | 1.28 |
| actmtch / DefOpt / cpu / Primal | 0.00001799 s | 0.000013765 s | 1.31 |
| actmtch / IDefOpt / cpu / Primal | 0.000017565000000000002 s | 0.000014063 s | 1.25 |
| actmtch / JaXPipe / cpu / Forward | 0.000024319 s | 0.000019581 s | 1.24 |
| actmtch / Jax / cpu / Forward | 0.000023146 s | 0.000017748 s | 1.30 |
| actmtch / HLOOpt / cpu / Forward | 0.000024225 s | 0.000018994 s | 1.28 |
| actmtch / PartOpt / cpu / Forward | 0.000025067 s | 0.000018623 s | 1.35 |
| actmtch / IPartOpt / cpu / Forward | 0.000023899 s | 0.000019281 s | 1.24 |
| actmtch / DefOpt / cpu / Forward | 0.000024835 s | 0.000018901 s | 1.31 |
| actmtch / IDefOpt / cpu / Forward | 0.000024257 s | 0.000019099 s | 1.27 |
| actmtch / JaXPipe / cpu / PreRev | 0.000024999 s | 0.000019544 s | 1.28 |
| actmtch / JaXPipe / cpu / PostRev | 0.000022427 s | 0.000017476 s | 1.28 |
| actmtch / JaXPipe / cpu / BothRev | 0.000031608 s | 0.000019317 s | 1.64 |
| actmtch / Jax / cpu / BothRev | 0.000022738 s | 0.000017624 s | 1.29 |
| actmtch / HLOOpt / cpu / PreRev | 0.000024907 s | 0.000019388 s | 1.28 |
| actmtch / HLOOpt / cpu / PostRev | 0.000025351 s | 0.000019539 s | 1.30 |
| actmtch / HLOOpt / cpu / BothRev | 0.000024748 s | 0.000019324 s | 1.28 |
| actmtch / PartOpt / cpu / PreRev | 0.000024524 s | 0.000019265 s | 1.27 |
| actmtch / PartOpt / cpu / PostRev | 0.000021846 s | 0.000017578 s | 1.24 |
| actmtch / PartOpt / cpu / BothRev | 0.00002493 s | 0.000019093 s | 1.31 |
| actmtch / IPartOpt / cpu / PreRev | 0.00002437 s | 0.000019073 s | 1.28 |
| actmtch / IPartOpt / cpu / PostRev | 0.000022499000000000003 s | 0.000017405 s | 1.29 |
| actmtch / IPartOpt / cpu / BothRev | 0.000031097 s | 0.000019417 s | 1.60 |
| actmtch / DefOpt / cpu / PreRev | 0.000024822 s | 0.000019016 s | 1.31 |
| actmtch / DefOpt / cpu / PostRev | 0.00002538 s | 0.000019018 s | 1.33 |
| actmtch / DefOpt / cpu / BothRev | 0.000025226 s | 0.000019349 s | 1.30 |
| actmtch / IDefOpt / cpu / PreRev | 0.00002453 s | 0.000019193 s | 1.28 |
| actmtch / IDefOpt / cpu / PostRev | 0.000024517 s | 0.000018937 s | 1.29 |
| actmtch / IDefOpt / cpu / BothRev | 0.000024662 s | 0.000019236 s | 1.28 |
| add_one / JaXPipe / cpu / Primal | 0.000007034380050754408 s | 0.000012845 s | 0.55 |
| add_one / Jax / cpu / Primal | 0.000007319819906115299 s | 0.000013048 s | 0.56 |
| add_one / HLOOpt / cpu / Primal | 0.000007308860094781267 s | 0.000012855 s | 0.57 |
| add_one / PartOpt / cpu / Primal | 0.000006844860017736209 s | 0.000013045 s | 0.52 |
| add_one / IPartOpt / cpu / Primal | 0.000007604719994560582 s | 0.000012816 s | 0.59 |
| add_one / DefOpt / cpu / Primal | 0.000007214479974209098 s | 0.000012893 s | 0.56 |
| add_one / IDefOpt / cpu / Primal | 0.0000072192400330095555 s | 0.000012853 s | 0.56 |
| add_one / JaXPipe / cpu / Forward | 0.000011213860125280916 s | 0.000017968 s | 0.62 |
| add_one / Jax / cpu / Forward | 0.000011201679990335834 s | 0.000017565000000000002 s | 0.64 |
| add_one / HLOOpt / cpu / Forward | 0.000011516659960761898 s | 0.000017607999999999998 s | 0.65 |
| add_one / PartOpt / cpu / Forward | 0.000011175659965374509 s | 0.00001764 s | 0.63 |
| add_one / IPartOpt / cpu / Forward | 0.000010871740068978396 s | 0.000017749000000000002 s | 0.61 |
| add_one / DefOpt / cpu / Forward | 0.000011006819931935752 s | 0.000017621000000000003 s | 0.62 |
| add_one / IDefOpt / cpu / Forward | 0.000011178679997101428 s | 0.000017704000000000002 s | 0.63 |
| add_one / JaXPipe / cpu / PreRev | 0.00001332221996563021 s | 0.000020014 s | 0.67 |
| add_one / JaXPipe / cpu / PostRev | 0.000012538280043372653 s | 0.00001934 s | 0.65 |
| add_one / JaXPipe / cpu / BothRev | 0.000013508799984265352 s | 0.000019994 s | 0.68 |
| add_one / Jax / cpu / BothRev | 0.000013378680014284328 s | 0.000019969 s | 0.67 |
| add_one / HLOOpt / cpu / PreRev | 0.00001322725998761598 s | 0.000019379 s | 0.68 |
| add_one / HLOOpt / cpu / PostRev | 0.000014914419989509042 s | 0.000019822 s | 0.75 |
| add_one / HLOOpt / cpu / BothRev | 0.00001320842005952727 s | 0.000030914 s | 0.43 |
| add_one / PartOpt / cpu / PreRev | 0.0000127738600349403 s | 0.000019639 s | 0.65 |
| add_one / PartOpt / cpu / PostRev | 0.000013462459955917438 s | 0.000019533 s | 0.69 |
| add_one / PartOpt / cpu / BothRev | 0.000013185060015530326 s | 0.000019776 s | 0.67 |
| add_one / IPartOpt / cpu / PreRev | 0.000013336700012587244 s | 0.000019921 s | 0.67 |
| add_one / IPartOpt / cpu / PostRev | 0.000012693620010395537 s | 0.000019436 s | 0.65 |
| add_one / IPartOpt / cpu / BothRev | 0.000012813619996450144 s | 0.000019761 s | 0.65 |
| add_one / DefOpt / cpu / PreRev | 0.000012694759971054736 s | 0.000019927 s | 0.64 |
| add_one / DefOpt / cpu / PostRev | 0.000012853939970227658 s | 0.000019651 s | 0.65 |
| add_one / DefOpt / cpu / BothRev | 0.000012963379849679769 s | 0.000019777 s | 0.66 |
| add_one / IDefOpt / cpu / PreRev | 0.000013084940019325584 s | 0.00001971 s | 0.66 |
| add_one / IDefOpt / cpu / PostRev | 0.000013241919969004812 s | 0.000019798 s | 0.67 |
| add_one / IDefOpt / cpu / BothRev | 0.000012539740055217408 s | 0.000019814 s | 0.63 |
| add_one / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.000002304 s | 0.83 |
| add_one / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.000002335 s | 0.82 |
| add_one / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002304 s | 0.83 |
| add_one / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002304 s | 0.83 |
| add_one / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002304 s | 0.83 |
| add_one / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002303 s | 0.83 |
| add_one / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002335 s | 0.82 |
| add_one / JaXPipe / cuda / Forward | 0.00001136 s | 0.000010783 s | 1.05 |
| add_one / Jax / cuda / Forward | 0.000010016 s | 0.000010784 s | 0.93 |
| add_one / HLOOpt / cuda / Forward | 0.00000976 s | 0.000010752 s | 0.91 |
| add_one / PartOpt / cuda / Forward | 0.00001024 s | 0.000010496 s | 0.98 |
| add_one / IPartOpt / cuda / Forward | 0.00000992 s | 0.000010464 s | 0.95 |
| add_one / DefOpt / cuda / Forward | 0.000010016 s | 0.000010752 s | 0.93 |
| add_one / IDefOpt / cuda / Forward | 0.000009664 s | 0.000010464 s | 0.92 |
| add_one / JaXPipe / cuda / PreRev | 0.0000248 s | 0.000025248 s | 0.98 |
| add_one / JaXPipe / cuda / PostRev | 0.000024928 s | 0.000025952 s | 0.96 |
| add_one / JaXPipe / cuda / BothRev | 0.000025216 s | 0.000025376 s | 0.99 |
| add_one / Jax / cuda / BothRev | 0.000024704 s | 0.000025824 s | 0.96 |
| add_one / HLOOpt / cuda / PreRev | 0.000025024 s | 0.00002592 s | 0.97 |
| add_one / HLOOpt / cuda / PostRev | 0.00002496 s | 0.0000256 s | 0.98 |
| add_one / HLOOpt / cuda / BothRev | 0.000025056 s | 0.000025856 s | 0.97 |
| add_one / PartOpt / cuda / PreRev | 0.000025376 s | 0.000026048 s | 0.97 |
| add_one / PartOpt / cuda / PostRev | 0.000024863 s | 0.000025536 s | 0.97 |
| add_one / PartOpt / cuda / BothRev | 0.000025184 s | 0.000025984 s | 0.97 |
| add_one / IPartOpt / cuda / PreRev | 0.000025024 s | 0.000026208 s | 0.95 |
| add_one / IPartOpt / cuda / PostRev | 0.00002512 s | 0.00002544 s | 0.99 |
| add_one / IPartOpt / cuda / BothRev | 0.000025152 s | 0.000026304 s | 0.96 |
| add_one / DefOpt / cuda / PreRev | 0.000025536 s | 0.000026976 s | 0.95 |
| add_one / DefOpt / cuda / PostRev | 0.000025088 s | 0.000025568 s | 0.98 |
| add_one / DefOpt / cuda / BothRev | 0.000025344 s | 0.000025983 s | 0.98 |
| add_one / IDefOpt / cuda / PreRev | 0.000025632 s | 0.000025792 s | 0.99 |
| add_one / IDefOpt / cuda / PostRev | 0.0000248 s | 0.000025504 s | 0.97 |
| add_one / IDefOpt / cuda / BothRev | 0.000024607 s | 0.000025728 s | 0.96 |
| add_one / JaXPipe / cpu / Primal | 0.000016910999999999998 s | 0.000012845 s | 1.32 |
| add_one / Jax / cpu / Primal | 0.000016255 s | 0.000013048 s | 1.25 |
| add_one / HLOOpt / cpu / Primal | 0.000016346 s | 0.000012855 s | 1.27 |
| add_one / PartOpt / cpu / Primal | 0.000016702 s | 0.000013045 s | 1.28 |
| add_one / IPartOpt / cpu / Primal | 0.000017168 s | 0.000012816 s | 1.34 |
| add_one / DefOpt / cpu / Primal | 0.000016122000000000003 s | 0.000012893 s | 1.25 |
| add_one / IDefOpt / cpu / Primal | 0.000016597 s | 0.000012853 s | 1.29 |
| add_one / JaXPipe / cpu / Forward | 0.000022809 s | 0.000017968 s | 1.27 |
| add_one / Jax / cpu / Forward | 0.00002325 s | 0.000017565000000000002 s | 1.32 |
| add_one / HLOOpt / cpu / Forward | 0.000023168 s | 0.000017607999999999998 s | 1.32 |
| add_one / PartOpt / cpu / Forward | 0.000022062 s | 0.00001764 s | 1.25 |
| add_one / IPartOpt / cpu / Forward | 0.000022365 s | 0.000017749000000000002 s | 1.26 |
| add_one / DefOpt / cpu / Forward | 0.00002254 s | 0.000017621000000000003 s | 1.28 |
| add_one / IDefOpt / cpu / Forward | 0.00002353 s | 0.000017704000000000002 s | 1.33 |
| add_one / JaXPipe / cpu / PreRev | 0.000024604 s | 0.000020014 s | 1.23 |
| add_one / JaXPipe / cpu / PostRev | 0.000024662 s | 0.00001934 s | 1.28 |
| add_one / JaXPipe / cpu / BothRev | 0.000024958 s | 0.000019994 s | 1.25 |
| add_one / Jax / cpu / BothRev | 0.00002419 s | 0.000019969 s | 1.21 |
| add_one / HLOOpt / cpu / PreRev | 0.000025334 s | 0.000019379 s | 1.31 |
| add_one / HLOOpt / cpu / PostRev | 0.000024793 s | 0.000019822 s | 1.25 |
| add_one / HLOOpt / cpu / BothRev | 0.000024742 s | 0.000030914 s | 0.80 |
| add_one / PartOpt / cpu / PreRev | 0.000024883 s | 0.000019639 s | 1.27 |
| add_one / PartOpt / cpu / PostRev | 0.000025087 s | 0.000019533 s | 1.28 |
| add_one / PartOpt / cpu / BothRev | 0.000030182 s | 0.000019776 s | 1.53 |
| add_one / IPartOpt / cpu / PreRev | 0.00002498 s | 0.000019921 s | 1.25 |
| add_one / IPartOpt / cpu / PostRev | 0.000024899 s | 0.000019436 s | 1.28 |
| add_one / IPartOpt / cpu / BothRev | 0.000024874 s | 0.000019761 s | 1.26 |
| add_one / DefOpt / cpu / PreRev | 0.000024266 s | 0.000019927 s | 1.22 |
| add_one / DefOpt / cpu / PostRev | 0.000024573 s | 0.000019651 s | 1.25 |
| add_one / DefOpt / cpu / BothRev | 0.000024245 s | 0.000019777 s | 1.23 |
| add_one / IDefOpt / cpu / PreRev | 0.000025175 s | 0.00001971 s | 1.28 |
| add_one / IDefOpt / cpu / PostRev | 0.000024673 s | 0.000019798 s | 1.25 |
| add_one / IDefOpt / cpu / BothRev | 0.000024585 s | 0.000019814 s | 1.24 |
| add_two / JaXPipe / cpu / Primal | 0.000008259959977294783 s | 0.000013174 s | 0.63 |
| add_two / Jax / cpu / Primal | 0.000007852259950595907 s | 0.000013197 s | 0.60 |
| add_two / HLOOpt / cpu / Primal | 0.000007710800036875299 s | 0.000013221 s | 0.58 |
| add_two / PartOpt / cpu / Primal | 0.000007940079995023552 s | 0.000013127 s | 0.60 |
| add_two / IPartOpt / cpu / Primal | 0.000007903200021246448 s | 0.000013694 s | 0.58 |
| add_two / DefOpt / cpu / Primal | 0.000007476479913748335 s | 0.000013493 s | 0.55 |
| add_two / IDefOpt / cpu / Primal | 0.00000761823996072053 s | 0.000013285 s | 0.57 |
| add_two / JaXPipe / cpu / Forward | 0.000011124900011054706 s | 0.00001841 s | 0.60 |
| add_two / Jax / cpu / Forward | 0.00001174106002508779 s | 0.000018237 s | 0.64 |
| add_two / HLOOpt / cpu / Forward | 0.000011820599956990918 s | 0.000018121 s | 0.65 |
| add_two / PartOpt / cpu / Forward | 0.00001198922000185121 s | 0.000017817 s | 0.67 |
| add_two / IPartOpt / cpu / Forward | 0.00001154243993369164 s | 0.000018347 s | 0.63 |
| add_two / DefOpt / cpu / Forward | 0.000011452540038590086 s | 0.000017979 s | 0.64 |
| add_two / IDefOpt / cpu / Forward | 0.000011464599992905278 s | 0.000017652 s | 0.65 |
| add_two / JaXPipe / cpu / PreRev | 0.000015792379963386337 s | 0.000023426 s | 0.67 |
| add_two / JaXPipe / cpu / PostRev | 0.000015142579941311852 s | 0.000023189 s | 0.65 |
| add_two / JaXPipe / cpu / BothRev | 0.000015681359873269684 s | 0.000023003 s | 0.68 |
| add_two / Jax / cpu / BothRev | 0.000016002479987946572 s | 0.000022819 s | 0.70 |
| add_two / HLOOpt / cpu / PreRev | 0.000014953359932405872 s | 0.000023062 s | 0.65 |
| add_two / HLOOpt / cpu / PostRev | 0.000017233040034625445 s | 0.000023305 s | 0.74 |
| add_two / HLOOpt / cpu / BothRev | 0.00001474040001994581 s | 0.000022702 s | 0.65 |
| add_two / PartOpt / cpu / PreRev | 0.000015172719977272208 s | 0.000023151 s | 0.66 |
| add_two / PartOpt / cpu / PostRev | 0.000015370119999715826 s | 0.000023277 s | 0.66 |
| add_two / PartOpt / cpu / BothRev | 0.000015777060016262113 s | 0.000023147 s | 0.68 |
| add_two / IPartOpt / cpu / PreRev | 0.000016025779987103307 s | 0.00002274 s | 0.70 |
| add_two / IPartOpt / cpu / PostRev | 0.000014813960060564568 s | 0.000023172 s | 0.64 |
| add_two / IPartOpt / cpu / BothRev | 0.000015196239965007408 s | 0.00002328 s | 0.65 |
| add_two / DefOpt / cpu / PreRev | 0.000015920499972708057 s | 0.000022870000000000003 s | 0.70 |
| add_two / DefOpt / cpu / PostRev | 0.000015671059973101365 s | 0.000023133 s | 0.68 |
| add_two / DefOpt / cpu / BothRev | 0.00001569124000525335 s | 0.000023381 s | 0.67 |
| add_two / IDefOpt / cpu / PreRev | 0.000015320500006055225 s | 0.00002299 s | 0.67 |
| add_two / IDefOpt / cpu / PostRev | 0.00001476474015362328 s | 0.000022887 s | 0.65 |
| add_two / IDefOpt / cpu / BothRev | 0.00001537245991130476 s | 0.000023289 s | 0.66 |
| add_two / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / JaXPipe / cuda / Forward | 0.000009536 s | 0.000010529 s | 0.91 |
| add_two / Jax / cuda / Forward | 0.00000992 s | 0.000010272 s | 0.97 |
| add_two / HLOOpt / cuda / Forward | 0.00000912 s | 0.0000104 s | 0.88 |
| add_two / PartOpt / cuda / Forward | 0.0000096 s | 0.000010591 s | 0.91 |
| add_two / IPartOpt / cuda / Forward | 0.000009632 s | 0.00001088 s | 0.89 |
| add_two / DefOpt / cuda / Forward | 0.00000992 s | 0.000010208 s | 0.97 |
| add_two / IDefOpt / cuda / Forward | 0.00000976 s | 0.00001072 s | 0.91 |
| add_two / JaXPipe / cuda / PreRev | 0.000032256 s | 0.0000352 s | 0.92 |
| add_two / JaXPipe / cuda / PostRev | 0.000031583 s | 0.000033472 s | 0.94 |
| add_two / JaXPipe / cuda / BothRev | 0.000032544 s | 0.000033472 s | 0.97 |
| add_two / Jax / cuda / BothRev | 0.000032800000000000004 s | 0.00003296 s | 1.00 |
| add_two / HLOOpt / cuda / PreRev | 0.000032256 s | 0.000033183 s | 0.97 |
| add_two / HLOOpt / cuda / PostRev | 0.000032256 s | 0.000032671 s | 0.99 |
| add_two / HLOOpt / cuda / BothRev | 0.000032352 s | 0.000033119999999999995 s | 0.98 |
| add_two / PartOpt / cuda / PreRev | 0.000050592 s | 0.000033696 s | 1.50 |
| add_two / PartOpt / cuda / PostRev | 0.00003168 s | 0.000032864 s | 0.96 |
| add_two / PartOpt / cuda / BothRev | 0.000032447 s | 0.000033408 s | 0.97 |
| add_two / IPartOpt / cuda / PreRev | 0.000032735 s | 0.000033695 s | 0.97 |
| add_two / IPartOpt / cuda / PostRev | 0.000032608 s | 0.000033248 s | 0.98 |
| add_two / IPartOpt / cuda / BothRev | 0.000032160000000000004 s | 0.000033216 s | 0.97 |
| add_two / DefOpt / cuda / PreRev | 0.000032672 s | 0.000033184 s | 0.98 |
| add_two / DefOpt / cuda / PostRev | 0.000032 s | 0.000033217000000000004 s | 0.96 |
| add_two / DefOpt / cuda / BothRev | 0.000032608 s | 0.000033248 s | 0.98 |
| add_two / IDefOpt / cuda / PreRev | 0.000032608 s | 0.000033407 s | 0.98 |
| add_two / IDefOpt / cuda / PostRev | 0.000032448 s | 0.000033344 s | 0.97 |
| add_two / IDefOpt / cuda / BothRev | 0.000032 s | 0.000033375000000000005 s | 0.96 |
| add_two / JaXPipe / cpu / Primal | 0.000017527 s | 0.000013174 s | 1.33 |
| add_two / Jax / cpu / Primal | 0.000016643000000000003 s | 0.000013197 s | 1.26 |
| add_two / HLOOpt / cpu / Primal | 0.000017032 s | 0.000013221 s | 1.29 |
| add_two / PartOpt / cpu / Primal | 0.000017135 s | 0.000013127 s | 1.31 |
| add_two / IPartOpt / cpu / Primal | 0.000017109 s | 0.000013694 s | 1.25 |
| add_two / DefOpt / cpu / Primal | 0.000017247999999999998 s | 0.000013493 s | 1.28 |
| add_two / IDefOpt / cpu / Primal | 0.000017021999999999997 s | 0.000013285 s | 1.28 |
| add_two / JaXPipe / cpu / Forward | 0.000022628 s | 0.00001841 s | 1.23 |
| add_two / Jax / cpu / Forward | 0.000022445 s | 0.000018237 s | 1.23 |
| add_two / HLOOpt / cpu / Forward | 0.000022991 s | 0.000018121 s | 1.27 |
| add_two / PartOpt / cpu / Forward | 0.000022732 s | 0.000017817 s | 1.28 |
| add_two / IPartOpt / cpu / Forward | 0.000022375 s | 0.000018347 s | 1.22 |
| add_two / DefOpt / cpu / Forward | 0.00002273 s | 0.000017979 s | 1.26 |
| add_two / IDefOpt / cpu / Forward | 0.000022687 s | 0.000017652 s | 1.29 |
| add_two / JaXPipe / cpu / PreRev | 0.000029078 s | 0.000023426 s | 1.24 |
| add_two / JaXPipe / cpu / PostRev | 0.00002851 s | 0.000023189 s | 1.23 |
| add_two / JaXPipe / cpu / BothRev | 0.000028740000000000003 s | 0.000023003 s | 1.25 |
| add_two / Jax / cpu / BothRev | 0.000028589 s | 0.000022819 s | 1.25 |
| add_two / HLOOpt / cpu / PreRev | 0.000027803 s | 0.000023062 s | 1.21 |
| add_two / HLOOpt / cpu / PostRev | 0.000029359 s | 0.000023305 s | 1.26 |
| add_two / HLOOpt / cpu / BothRev | 0.000028242 s | 0.000022702 s | 1.24 |
| add_two / PartOpt / cpu / PreRev | 0.000028773 s | 0.000023151 s | 1.24 |
| add_two / PartOpt / cpu / PostRev | 0.000029281000000000003 s | 0.000023277 s | 1.26 |
| add_two / PartOpt / cpu / BothRev | 0.000029223 s | 0.000023147 s | 1.26 |
| add_two / IPartOpt / cpu / PreRev | 0.000034102 s | 0.00002274 s | 1.50 |
| add_two / IPartOpt / cpu / PostRev | 0.000028833 s | 0.000023172 s | 1.24 |
| add_two / IPartOpt / cpu / BothRev | 0.000028773 s | 0.00002328 s | 1.24 |
| add_two / DefOpt / cpu / PreRev | 0.000029006 s | 0.000022870000000000003 s | 1.27 |
| add_two / DefOpt / cpu / PostRev | 0.000028359 s | 0.000023133 s | 1.23 |
| add_two / DefOpt / cpu / BothRev | 0.00002806 s | 0.000023381 s | 1.20 |
| add_two / IDefOpt / cpu / PreRev | 0.000028431000000000003 s | 0.00002299 s | 1.24 |
| add_two / IDefOpt / cpu / PostRev | 0.000035423 s | 0.000022887 s | 1.55 |
| add_two / IDefOpt / cpu / BothRev | 0.00002943 s | 0.000023289 s | 1.26 |
| cache / JaXPipe / cpu / Primal | 0.000007788720013195417 s | 0.000012915 s | 0.60 |
| cache / Jax / cpu / Primal | 0.000006901620054122759 s | 0.000013084 s | 0.53 |
| cache / HLOOpt / cpu / Primal | 0.000007173459998739417 s | 0.00001278 s | 0.56 |
| cache / PartOpt / cpu / Primal | 0.000006703539984300733 s | 0.000012674 s | 0.53 |
| cache / IPartOpt / cpu / Primal | 0.000007330079915845999 s | 0.000012571 s | 0.58 |
| cache / DefOpt / cpu / Primal | 0.00000720321993867401 s | 0.000013024 s | 0.55 |
| cache / IDefOpt / cpu / Primal | 0.000007182459921750706 s | 0.000012491 s | 0.58 |
| cache / JaXPipe / cpu / Forward | 0.000016098240048449953 s | 0.000023874 s | 0.67 |
| cache / Jax / cpu / Forward | 0.000015890279992163415 s | 0.000017412000000000002 s | 0.91 |
| cache / HLOOpt / cpu / Forward | 0.00001702454002952436 s | 0.000016971 s | 1.00 |
| cache / PartOpt / cpu / Forward | 0.000016424240020569413 s | 0.000017065 s | 0.96 |
| cache / IPartOpt / cpu / Forward | 0.000016462699913972756 s | 0.000017104 s | 0.96 |
| cache / DefOpt / cpu / Forward | 0.000016849940020620125 s | 0.000017124 s | 0.98 |
| cache / IDefOpt / cpu / Forward | 0.00001571123995745438 s | 0.000017069999999999998 s | 0.92 |
| cache / JaXPipe / cpu / PreRev | 0.000017700279986456735 s | 0.000017882999999999998 s | 0.99 |
| cache / JaXPipe / cpu / PostRev | 0.000021089599977131 s | 0.000020973 s | 1.01 |
| cache / JaXPipe / cpu / BothRev | 0.00001802619995942223 s | 0.00001835 s | 0.98 |
| cache / Jax / cpu / BothRev | 0.00002078466006423696 s | 0.000021256 s | 0.98 |
| cache / HLOOpt / cpu / PreRev | 0.000017231199890375138 s | 0.000017772 s | 0.97 |
| cache / HLOOpt / cpu / PostRev | 0.000019978979998995783 s | 0.000017704000000000002 s | 1.13 |
| cache / HLOOpt / cpu / BothRev | 0.000017123080087912968 s | 0.000018288 s | 0.94 |
| cache / PartOpt / cpu / PreRev | 0.000015736880031909093 s | 0.000017628 s | 0.89 |
| cache / PartOpt / cpu / PostRev | 0.00002113714002916822 s | 0.000019973 s | 1.06 |
| cache / PartOpt / cpu / BothRev | 0.000017007519982144004 s | 0.000017819 s | 0.95 |
| cache / IPartOpt / cpu / PreRev | 0.000017031740062520838 s | 0.000017434999999999998 s | 0.98 |
| cache / IPartOpt / cpu / PostRev | 0.000019669999965117316 s | 0.00002052 s | 0.96 |
| cache / IPartOpt / cpu / BothRev | 0.00001779625994458911 s | 0.00001842 s | 0.97 |
| cache / DefOpt / cpu / PreRev | 0.00001696633988103713 s | 0.000017628 s | 0.96 |
| cache / DefOpt / cpu / PostRev | 0.000016383759993914283 s | 0.00001768 s | 0.93 |
| cache / DefOpt / cpu / BothRev | 0.000016485000032844254 s | 0.000017449 s | 0.94 |
| cache / IDefOpt / cpu / PreRev | 0.00001667702001213911 s | 0.000017389000000000002 s | 0.96 |
| cache / IDefOpt / cpu / PostRev | 0.000015776520067447565 s | 0.00001734 s | 0.91 |
| cache / IDefOpt / cpu / BothRev | 0.00003045335990464082 s | 0.000018119 s | 1.68 |
| cache / JaXPipe / cuda / Primal | 0.000002303 s | 0.000002335 s | 0.99 |
| cache / Jax / cuda / Primal | 0.000002304 s | 0.000002335 s | 0.99 |
| cache / HLOOpt / cuda / Primal | 0.00000224 s | 0.000002335 s | 0.96 |
| cache / PartOpt / cuda / Primal | 0.00000224 s | 0.000002335 s | 0.96 |
| cache / IPartOpt / cuda / Primal | 0.000002304 s | 0.000002335 s | 0.99 |
| cache / DefOpt / cuda / Primal | 0.00000224 s | 0.000002335 s | 0.96 |
| cache / IDefOpt / cuda / Primal | 0.000002209 s | 0.000002336 s | 0.95 |
| cache / JaXPipe / cuda / Forward | 0.000002335 s | 0.000002336 s | 1.00 |
| cache / Jax / cuda / Forward | 0.000002335 s | 0.0000023670000000000004 s | 0.99 |
| cache / HLOOpt / cuda / Forward | 0.000002335 s | 0.000002335 s | 1 |
| cache / PartOpt / cuda / Forward | 0.000002335 s | 0.0000023670000000000004 s | 0.99 |
| cache / IPartOpt / cuda / Forward | 0.000002335 s | 0.000002336 s | 1.00 |
| cache / DefOpt / cuda / Forward | 0.000002272 s | 0.0000023670000000000004 s | 0.96 |
| cache / IDefOpt / cuda / Forward | 0.000002335 s | 0.000002336 s | 1.00 |
| cache / JaXPipe / cuda / PreRev | 0.000010496 s | 0.000010591 s | 0.99 |
| cache / JaXPipe / cuda / PostRev | 0.000010304 s | 0.000010752 s | 0.96 |
| cache / JaXPipe / cuda / BothRev | 0.000010304 s | 0.000010816 s | 0.95 |
| cache / Jax / cuda / BothRev | 0.000010944 s | 0.00001072 s | 1.02 |
| cache / HLOOpt / cuda / PreRev | 0.000013536 s | 0.0000136 s | 1.00 |
| cache / HLOOpt / cuda / PostRev | 0.000013472 s | 0.000013568 s | 0.99 |
| cache / HLOOpt / cuda / BothRev | 0.000013536 s | 0.0000136 s | 1.00 |
| cache / PartOpt / cuda / PreRev | 0.000010784 s | 0.000010688 s | 1.01 |
| cache / PartOpt / cuda / PostRev | 0.000010784 s | 0.000010688 s | 1.01 |
| cache / PartOpt / cuda / BothRev | 0.000010495 s | 0.000010752 s | 0.98 |
| cache / IPartOpt / cuda / PreRev | 0.00001088 s | 0.00001088 s | 1 |
| cache / IPartOpt / cuda / PostRev | 0.0000104 s | 0.000010592 s | 0.98 |
| cache / IPartOpt / cuda / BothRev | 0.000010752 s | 0.000010592 s | 1.02 |
| cache / DefOpt / cuda / PreRev | 0.000011103 s | 0.000010688 s | 1.04 |
| cache / DefOpt / cuda / PostRev | 0.00001088 s | 0.000010784 s | 1.01 |
| cache / DefOpt / cuda / BothRev | 0.00001056 s | 0.000010815 s | 0.98 |
| cache / IDefOpt / cuda / PreRev | 0.000010656 s | 0.000010592 s | 1.01 |
| cache / IDefOpt / cuda / PostRev | 0.000010816 s | 0.000010432 s | 1.04 |
| cache / IDefOpt / cuda / BothRev | 0.00001072 s | 0.000010944 s | 0.98 |
| cache / JaXPipe / cpu / Primal | 0.000019204 s | 0.000012915 s | 1.49 |
| cache / Jax / cpu / Primal | 0.000019196 s | 0.000013084 s | 1.47 |
| cache / HLOOpt / cpu / Primal | 0.000019121 s | 0.00001278 s | 1.50 |
| cache / PartOpt / cpu / Primal | 0.000019375 s | 0.000012674 s | 1.53 |
| cache / IPartOpt / cpu / Primal | 0.000019127 s | 0.000012571 s | 1.52 |
| cache / DefOpt / cpu / Primal | 0.000019011 s | 0.000013024 s | 1.46 |
| cache / IDefOpt / cpu / Primal | 0.00001919 s | 0.000012491 s | 1.54 |
| cache / JaXPipe / cpu / Forward | 0.000027992 s | 0.000023874 s | 1.17 |
| cache / Jax / cpu / Forward | 0.000029526 s | 0.000017412000000000002 s | 1.70 |
| cache / HLOOpt / cpu / Forward | 0.000028862 s | 0.000016971 s | 1.70 |
| cache / PartOpt / cpu / Forward | 0.0000287 s | 0.000017065 s | 1.68 |
| cache / IPartOpt / cpu / Forward | 0.000023351 s | 0.000017104 s | 1.37 |
| cache / DefOpt / cpu / Forward | 0.000022127 s | 0.000017124 s | 1.29 |
| cache / IDefOpt / cpu / Forward | 0.000022363000000000003 s | 0.000017069999999999998 s | 1.31 |
| cache / JaXPipe / cpu / PreRev | 0.000023223 s | 0.000017882999999999998 s | 1.30 |
| cache / JaXPipe / cpu / PostRev | 0.000027043 s | 0.000020973 s | 1.29 |
| cache / JaXPipe / cpu / BothRev | 0.000023516 s | 0.00001835 s | 1.28 |
| cache / Jax / cpu / BothRev | 0.00002702 s | 0.000021256 s | 1.27 |
| cache / HLOOpt / cpu / PreRev | 0.000031205 s | 0.000017772 s | 1.76 |
| cache / HLOOpt / cpu / PostRev | 0.000022456 s | 0.000017704000000000002 s | 1.27 |
| cache / HLOOpt / cpu / BothRev | 0.000023465 s | 0.000018288 s | 1.28 |
| cache / PartOpt / cpu / PreRev | 0.000022647 s | 0.000017628 s | 1.28 |
| cache / PartOpt / cpu / PostRev | 0.000026537 s | 0.000019973 s | 1.33 |
| cache / PartOpt / cpu / BothRev | 0.000027707 s | 0.000017819 s | 1.55 |
| cache / IPartOpt / cpu / PreRev | 0.000024641 s | 0.000017434999999999998 s | 1.41 |
| cache / IPartOpt / cpu / PostRev | 0.000025745 s | 0.00002052 s | 1.25 |
| cache / IPartOpt / cpu / BothRev | 0.000033968 s | 0.00001842 s | 1.84 |
| cache / DefOpt / cpu / PreRev | 0.000035927000000000003 s | 0.000017628 s | 2.04 |
| cache / DefOpt / cpu / PostRev | 0.000034671 s | 0.00001768 s | 1.96 |
| cache / DefOpt / cpu / BothRev | 0.000022249 s | 0.000017449 s | 1.28 |
| cache / IDefOpt / cpu / PreRev | 0.000022741 s | 0.000017389000000000002 s | 1.31 |
| cache / IDefOpt / cpu / PostRev | 0.000022222 s | 0.00001734 s | 1.28 |
| cache / IDefOpt / cpu / BothRev | 0.00002324 s | 0.000018119 s | 1.28 |
| Concat / JaXPipe / cpu / Primal | 0.000007456739949702751 s | 0.000012834 s | 0.58 |
| Concat / Jax / cpu / Primal | 0.000007775160083838273 s | 0.000012809 s | 0.61 |
| Concat / HLOOpt / cpu / Primal | 0.000007000319947110256 s | 0.000013168 s | 0.53 |
| Concat / PartOpt / cpu / Primal | 0.000007046319951768964 s | 0.000013103 s | 0.54 |
| Concat / IPartOpt / cpu / Primal | 0.000007259900030476274 s | 0.000012935 s | 0.56 |
| Concat / DefOpt / cpu / Primal | 0.000007452939953509485 s | 0.000013538 s | 0.55 |
| Concat / IDefOpt / cpu / Primal | 0.000007563439976365771 s | 0.000012679 s | 0.60 |
| Concat / JaXPipe / cpu / Forward | 0.000011245020068599844 s | 0.000018003 s | 0.62 |
| Concat / Jax / cpu / Forward | 0.000011482760037324624 s | 0.000017769 s | 0.65 |
| Concat / HLOOpt / cpu / Forward | 0.000011660739983199164 s | 0.000017542999999999998 s | 0.66 |
| Concat / PartOpt / cpu / Forward | 0.000010989980019076027 s | 0.000017561 s | 0.63 |
| Concat / IPartOpt / cpu / Forward | 0.0000111760000254435 s | 0.000017515 s | 0.64 |
| Concat / DefOpt / cpu / Forward | 0.00001143011993917753 s | 0.0000176 s | 0.65 |
| Concat / IDefOpt / cpu / Forward | 0.000010942740009340924 s | 0.000017825 s | 0.61 |
| Concat / JaXPipe / cpu / PreRev | 0.000013116779937263346 s | 0.00002049 s | 0.64 |
| Concat / JaXPipe / cpu / PostRev | 0.000012138660040363902 s | 0.000019962 s | 0.61 |
| Concat / JaXPipe / cpu / BothRev | 0.00001262137991943746 s | 0.000019878 s | 0.63 |
| Concat / Jax / cpu / BothRev | 0.000012852839990955544 s | 0.000020003 s | 0.64 |
| Concat / HLOOpt / cpu / PreRev | 0.000012378939991322114 s | 0.000020007 s | 0.62 |
| Concat / HLOOpt / cpu / PostRev | 0.000014624119994550713 s | 0.000020165 s | 0.73 |
| Concat / HLOOpt / cpu / BothRev | 0.00001253108004675596 s | 0.000019592 s | 0.64 |
| Concat / PartOpt / cpu / PreRev | 0.000012716259971057295 s | 0.000020409 s | 0.62 |
| Concat / PartOpt / cpu / PostRev | 0.000012448319903342054 s | 0.00002033 s | 0.61 |
| Concat / PartOpt / cpu / BothRev | 0.000013392720065894536 s | 0.000019275 s | 0.69 |
| Concat / IPartOpt / cpu / PreRev | 0.000012670139931287847 s | 0.000019956 s | 0.63 |
| Concat / IPartOpt / cpu / PostRev | 0.00001253014004760189 s | 0.000019605 s | 0.64 |
| Concat / IPartOpt / cpu / BothRev | 0.000012867919976997655 s | 0.000019625 s | 0.66 |
| Concat / DefOpt / cpu / PreRev | 0.000012563260024762713 s | 0.000020967 s | 0.60 |
| Concat / DefOpt / cpu / PostRev | 0.000012539779945655029 s | 0.000019458 s | 0.64 |
| Concat / DefOpt / cpu / BothRev | 0.000013063999995210909 s | 0.00001977 s | 0.66 |
| Concat / IDefOpt / cpu / PreRev | 0.00001310702005866915 s | 0.000020008 s | 0.66 |
| Concat / IDefOpt / cpu / PostRev | 0.00001235310000993195 s | 0.000019651 s | 0.63 |
| Concat / IDefOpt / cpu / BothRev | 0.000012811820033675758 s | 0.000019793 s | 0.65 |
| Concat / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.000002463 s | 0.78 |
| Concat / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.000002463 s | 0.78 |
| Concat / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002432 s | 0.79 |
| Concat / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002463 s | 0.78 |
| Concat / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| Concat / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002463 s | 0.78 |
| Concat / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002432 s | 0.79 |
| Concat / JaXPipe / cuda / Forward | 0.00000992 s | 0.000010656 s | 0.93 |
| Concat / Jax / cuda / Forward | 0.000010112 s | 0.000010527 s | 0.96 |
| Concat / HLOOpt / cuda / Forward | 0.000009952 s | 0.000010688 s | 0.93 |
| Concat / PartOpt / cuda / Forward | 0.000009728 s | 0.000010848 s | 0.90 |
| Concat / IPartOpt / cuda / Forward | 0.000009984 s | 0.00001136 s | 0.88 |
| Concat / DefOpt / cuda / Forward | 0.000010015 s | 0.000010944 s | 0.92 |
| Concat / IDefOpt / cuda / Forward | 0.00000992 s | 0.000011712 s | 0.85 |
| Concat / JaXPipe / cuda / PreRev | 0.000016416 s | 0.000018752000000000003 s | 0.88 |
| Concat / JaXPipe / cuda / PostRev | 0.000016545 s | 0.000018208 s | 0.91 |
| Concat / JaXPipe / cuda / BothRev | 0.000016255999999999998 s | 0.000018688 s | 0.87 |
| Concat / Jax / cuda / BothRev | 0.00001664 s | 0.00001856 s | 0.90 |
| Concat / HLOOpt / cuda / PreRev | 0.000016255 s | 0.000016704 s | 0.97 |
| Concat / HLOOpt / cuda / PostRev | 0.000016255999999999998 s | 0.000016927999999999998 s | 0.96 |
| Concat / HLOOpt / cuda / BothRev | 0.000016128 s | 0.000016864 s | 0.96 |
| Concat / PartOpt / cuda / PreRev | 0.0000168 s | 0.00001712 s | 0.98 |
| Concat / PartOpt / cuda / PostRev | 0.000017632 s | 0.000017151 s | 1.03 |
| Concat / PartOpt / cuda / BothRev | 0.000016 s | 0.000017216 s | 0.93 |
| Concat / IPartOpt / cuda / PreRev | 0.000016127 s | 0.000017024 s | 0.95 |
| Concat / IPartOpt / cuda / PostRev | 0.00001616 s | 0.00001712 s | 0.94 |
| Concat / IPartOpt / cuda / BothRev | 0.000015872 s | 0.000017056 s | 0.93 |
| Concat / DefOpt / cuda / PreRev | 0.000016096 s | 0.000017151 s | 0.94 |
| Concat / DefOpt / cuda / PostRev | 0.000016224 s | 0.000016927000000000002 s | 0.96 |
| Concat / DefOpt / cuda / BothRev | 0.000016416 s | 0.0000168 s | 0.98 |
| Concat / IDefOpt / cuda / PreRev | 0.000016352 s | 0.000016736 s | 0.98 |
| Concat / IDefOpt / cuda / PostRev | 0.000015904000000000002 s | 0.00001712 s | 0.93 |
| Concat / IDefOpt / cuda / BothRev | 0.000016032 s | 0.000016896000000000002 s | 0.95 |
| Concat / JaXPipe / cpu / Primal | 0.00001601 s | 0.000012834 s | 1.25 |
| Concat / Jax / cpu / Primal | 0.00001617 s | 0.000012809 s | 1.26 |
| Concat / HLOOpt / cpu / Primal | 0.0000163 s | 0.000013168 s | 1.24 |
| Concat / PartOpt / cpu / Primal | 0.000016381 s | 0.000013103 s | 1.25 |
| Concat / IPartOpt / cpu / Primal | 0.000016062 s | 0.000012935 s | 1.24 |
| Concat / DefOpt / cpu / Primal | 0.00001619 s | 0.000013538 s | 1.20 |
| Concat / IDefOpt / cpu / Primal | 0.000016722 s | 0.000012679 s | 1.32 |
| Concat / JaXPipe / cpu / Forward | 0.000022536 s | 0.000018003 s | 1.25 |
| Concat / Jax / cpu / Forward | 0.000022376 s | 0.000017769 s | 1.26 |
| Concat / HLOOpt / cpu / Forward | 0.000023705 s | 0.000017542999999999998 s | 1.35 |
| Concat / PartOpt / cpu / Forward | 0.000021783 s | 0.000017561 s | 1.24 |
| Concat / IPartOpt / cpu / Forward | 0.000022378 s | 0.000017515 s | 1.28 |
| Concat / DefOpt / cpu / Forward | 0.00002213 s | 0.0000176 s | 1.26 |
| Concat / IDefOpt / cpu / Forward | 0.000022386 s | 0.000017825 s | 1.26 |
| Concat / JaXPipe / cpu / PreRev | 0.000025438 s | 0.00002049 s | 1.24 |
| Concat / JaXPipe / cpu / PostRev | 0.000025675 s | 0.000019962 s | 1.29 |
| Concat / JaXPipe / cpu / BothRev | 0.000025249 s | 0.000019878 s | 1.27 |
| Concat / Jax / cpu / BothRev | 0.000025144 s | 0.000020003 s | 1.26 |
| Concat / HLOOpt / cpu / PreRev | 0.000025239 s | 0.000020007 s | 1.26 |
| Concat / HLOOpt / cpu / PostRev | 0.000025772 s | 0.000020165 s | 1.28 |
| Concat / HLOOpt / cpu / BothRev | 0.000025274 s | 0.000019592 s | 1.29 |
| Concat / PartOpt / cpu / PreRev | 0.000024645 s | 0.000020409 s | 1.21 |
| Concat / PartOpt / cpu / PostRev | 0.000025331 s | 0.00002033 s | 1.25 |
| Concat / PartOpt / cpu / BothRev | 0.000025811 s | 0.000019275 s | 1.34 |
| Concat / IPartOpt / cpu / PreRev | 0.000025042 s | 0.000019956 s | 1.25 |
| Concat / IPartOpt / cpu / PostRev | 0.000024779 s | 0.000019605 s | 1.26 |
| Concat / IPartOpt / cpu / BothRev | 0.000024942 s | 0.000019625 s | 1.27 |
| Concat / DefOpt / cpu / PreRev | 0.00002553 s | 0.000020967 s | 1.22 |
| Concat / DefOpt / cpu / PostRev | 0.000025127 s | 0.000019458 s | 1.29 |
| Concat / DefOpt / cpu / BothRev | 0.000024911 s | 0.00001977 s | 1.26 |
| Concat / IDefOpt / cpu / PreRev | 0.000024969 s | 0.000020008 s | 1.25 |
| Concat / IDefOpt / cpu / PostRev | 0.00002478 s | 0.000019651 s | 1.26 |
| Concat / IDefOpt / cpu / BothRev | 0.000025152 s | 0.000019793 s | 1.27 |
| const_scatter / JaXPipe / cpu / Primal | 0.000007154420090955682 s | 0.000012852 s | 0.56 |
| const_scatter / Jax / cpu / Primal | 0.000006759899952157866 s | 0.000012697 s | 0.53 |
| const_scatter / HLOOpt / cpu / Primal | 0.000007399020050797844 s | 0.000013442 s | 0.55 |
| const_scatter / PartOpt / cpu / Primal | 0.000007071519958117278 s | 0.000012651 s | 0.56 |
| const_scatter / IPartOpt / cpu / Primal | 0.000007502479929826222 s | 0.000012527 s | 0.60 |
| const_scatter / DefOpt / cpu / Primal | 0.000007541679933638079 s | 0.00001329 s | 0.57 |
| const_scatter / IDefOpt / cpu / Primal | 0.000007414639949274715 s | 0.000013248 s | 0.56 |
| const_scatter / JaXPipe / cpu / Forward | 0.000011879740068252433 s | 0.000018184 s | 0.65 |
| const_scatter / Jax / cpu / Forward | 0.000010172320035053416 s | 0.000016712999999999997 s | 0.61 |
| const_scatter / HLOOpt / cpu / Forward | 0.000012081260010745609 s | 0.000017864 s | 0.68 |
| const_scatter / PartOpt / cpu / Forward | 0.000011645359936665044 s | 0.000017628 s | 0.66 |
| const_scatter / IPartOpt / cpu / Forward | 0.00001197507997858338 s | 0.000017791 s | 0.67 |
| const_scatter / DefOpt / cpu / Forward | 0.000011461919948487775 s | 0.000018052 s | 0.63 |
| const_scatter / IDefOpt / cpu / Forward | 0.000011299039961158997 s | 0.000018058 s | 0.63 |
| const_scatter / JaXPipe / cpu / PreRev | 0.0002902136800003 s | 0.000494193 s | 0.59 |
| const_scatter / JaXPipe / cpu / PostRev | 0.0002845714599243 s | 0.000490383 s | 0.58 |
| const_scatter / JaXPipe / cpu / BothRev | 0.0002870615399842 s | 0.000515952 s | 0.56 |
| const_scatter / Jax / cpu / BothRev | 0.000284687319945 s | 0.000499809 s | 0.57 |
| const_scatter / HLOOpt / cpu / PreRev | 0.0002851120000013 s | 0.000506807 s | 0.56 |
| const_scatter / HLOOpt / cpu / PostRev | 0.0002875938599936 s | 0.000532947 s | 0.54 |
| const_scatter / HLOOpt / cpu / BothRev | 0.0002868951599702 s | 0.000492198 s | 0.58 |
| const_scatter / PartOpt / cpu / PreRev | 0.0002844207199996 s | 0.000513277 s | 0.55 |
| const_scatter / PartOpt / cpu / PostRev | 0.0002857115599726 s | 0.000495613 s | 0.58 |
| const_scatter / PartOpt / cpu / BothRev | 0.0002854904398373 s | 0.000528017 s | 0.54 |
| const_scatter / IPartOpt / cpu / PreRev | 0.0002865611399101 s | 0.000518134 s | 0.55 |
| const_scatter / IPartOpt / cpu / PostRev | 0.0002861602600023 s | 0.000499348 s | 0.57 |
| const_scatter / IPartOpt / cpu / BothRev | 0.0002903588000481 s | 0.000509812 s | 0.57 |
| const_scatter / DefOpt / cpu / PreRev | 0.0002857440000661 s | 0.000512476 s | 0.56 |
| const_scatter / DefOpt / cpu / PostRev | 0.0002870006400189 s | 0.000525034 s | 0.55 |
| const_scatter / DefOpt / cpu / BothRev | 0.0002847741400182 s | 0.000517393 s | 0.55 |
| const_scatter / IDefOpt / cpu / PreRev | 0.0002877239400368 s | 0.000538514 s | 0.53 |
| const_scatter / IDefOpt / cpu / PostRev | 0.0002858229000412 s | 0.000505837 s | 0.57 |
| const_scatter / IDefOpt / cpu / BothRev | 0.0002864221199524 s | 0.000512877 s | 0.56 |
| const_scatter / JaXPipe / cuda / Primal | 0.000001887 s | 0.000002463 s | 0.77 |
| const_scatter / Jax / cuda / Primal | 0.000001887 s | 0.000002433 s | 0.78 |
| const_scatter / HLOOpt / cuda / Primal | 0.000001887 s | 0.000002463 s | 0.77 |
| const_scatter / PartOpt / cuda / Primal | 0.000001888 s | 0.000002463 s | 0.77 |
| const_scatter / IPartOpt / cuda / Primal | 0.000001887 s | 0.000002463 s | 0.77 |
| const_scatter / DefOpt / cuda / Primal | 0.000001887 s | 0.000002463 s | 0.77 |
| const_scatter / IDefOpt / cuda / Primal | 0.000001887 s | 0.000002463 s | 0.77 |
| const_scatter / JaXPipe / cuda / Forward | 0.000009888 s | 0.00001104 s | 0.90 |
| const_scatter / Jax / cuda / Forward | 0.00000992 s | 0.000010272 s | 0.97 |
| const_scatter / HLOOpt / cuda / Forward | 0.000009664 s | 0.000010497 s | 0.92 |
| const_scatter / PartOpt / cuda / Forward | 0.00000976 s | 0.000011137 s | 0.88 |
| const_scatter / IPartOpt / cuda / Forward | 0.0000096 s | 0.000010656 s | 0.90 |
| const_scatter / DefOpt / cuda / Forward | 0.00000992 s | 0.000010656 s | 0.93 |
| const_scatter / IDefOpt / cuda / Forward | 0.00001008 s | 0.000011009 s | 0.92 |
| const_scatter / JaXPipe / cuda / PreRev | 0.000015937 s | 0.000017472 s | 0.91 |
| const_scatter / JaXPipe / cuda / PostRev | 0.00001632 s | 0.00001696 s | 0.96 |
| const_scatter / JaXPipe / cuda / BothRev | 0.000015809 s | 0.000016672 s | 0.95 |
| const_scatter / Jax / cuda / BothRev | 0.00001632 s | 0.00001712 s | 0.95 |
| const_scatter / HLOOpt / cuda / PreRev | 0.000015935999999999998 s | 0.000017184 s | 0.93 |
| const_scatter / HLOOpt / cuda / PostRev | 0.000016351 s | 0.00001696 s | 0.96 |
| const_scatter / HLOOpt / cuda / BothRev | 0.00001584 s | 0.000017408 s | 0.91 |
| const_scatter / PartOpt / cuda / PreRev | 0.000016542999999999997 s | 0.00001664 s | 0.99 |
| const_scatter / PartOpt / cuda / PostRev | 0.000016383999999999998 s | 0.0000168 s | 0.98 |
| const_scatter / PartOpt / cuda / BothRev | 0.000016416 s | 0.000016864 s | 0.97 |
| const_scatter / IPartOpt / cuda / PreRev | 0.000016096 s | 0.000016927999999999998 s | 0.95 |
| const_scatter / IPartOpt / cuda / PostRev | 0.000016096 s | 0.000016512 s | 0.97 |
| const_scatter / IPartOpt / cuda / BothRev | 0.000016128 s | 0.00001744 s | 0.92 |
| const_scatter / DefOpt / cuda / PreRev | 0.000016768000000000003 s | 0.000016958999999999998 s | 0.99 |
| const_scatter / DefOpt / cuda / PostRev | 0.000015711 s | 0.000017056 s | 0.92 |
| const_scatter / DefOpt / cuda / BothRev | 0.000016705 s | 0.000017312 s | 0.96 |
| const_scatter / IDefOpt / cuda / PreRev | 0.000016511 s | 0.000017152 s | 0.96 |
| const_scatter / IDefOpt / cuda / PostRev | 0.000015584000000000002 s | 0.000017344 s | 0.90 |
| const_scatter / IDefOpt / cuda / BothRev | 0.00001616 s | 0.000017088 s | 0.95 |
| const_scatter / JaXPipe / cpu / Primal | 0.000016519 s | 0.000012852 s | 1.29 |
| const_scatter / Jax / cpu / Primal | 0.000016459000000000003 s | 0.000012697 s | 1.30 |
| const_scatter / HLOOpt / cpu / Primal | 0.000017311 s | 0.000013442 s | 1.29 |
| const_scatter / PartOpt / cpu / Primal | 0.000016421 s | 0.000012651 s | 1.30 |
| const_scatter / IPartOpt / cpu / Primal | 0.000016572 s | 0.000012527 s | 1.32 |
| const_scatter / DefOpt / cpu / Primal | 0.000017155 s | 0.00001329 s | 1.29 |
| const_scatter / IDefOpt / cpu / Primal | 0.000017225000000000002 s | 0.000013248 s | 1.30 |
| const_scatter / JaXPipe / cpu / Forward | 0.000022848 s | 0.000018184 s | 1.26 |
| const_scatter / Jax / cpu / Forward | 0.000021435 s | 0.000016712999999999997 s | 1.28 |
| const_scatter / HLOOpt / cpu / Forward | 0.000022498 s | 0.000017864 s | 1.26 |
| const_scatter / PartOpt / cpu / Forward | 0.000022697 s | 0.000017628 s | 1.29 |
| const_scatter / IPartOpt / cpu / Forward | 0.000022812 s | 0.000017791 s | 1.28 |
| const_scatter / DefOpt / cpu / Forward | 0.000023017 s | 0.000018052 s | 1.28 |
| const_scatter / IDefOpt / cpu / Forward | 0.000022727 s | 0.000018058 s | 1.26 |
| const_scatter / JaXPipe / cpu / PreRev | 0.000574196 s | 0.000494193 s | 1.16 |
| const_scatter / JaXPipe / cpu / PostRev | 0.000551594 s | 0.000490383 s | 1.12 |
| const_scatter / JaXPipe / cpu / BothRev | 0.00054812 s | 0.000515952 s | 1.06 |
| const_scatter / Jax / cpu / BothRev | 0.00055762 s | 0.000499809 s | 1.12 |
| const_scatter / HLOOpt / cpu / PreRev | 0.000563702 s | 0.000506807 s | 1.11 |
| const_scatter / HLOOpt / cpu / PostRev | 0.000553084 s | 0.000532947 s | 1.04 |
| const_scatter / HLOOpt / cpu / BothRev | 0.000566906 s | 0.000492198 s | 1.15 |
| const_scatter / PartOpt / cpu / PreRev | 0.000577253 s | 0.000513277 s | 1.12 |
| const_scatter / PartOpt / cpu / PostRev | 0.000541583 s | 0.000495613 s | 1.09 |
| const_scatter / PartOpt / cpu / BothRev | 0.000554452 s | 0.000528017 s | 1.05 |
| const_scatter / IPartOpt / cpu / PreRev | 0.0005387769999999 s | 0.000518134 s | 1.04 |
| const_scatter / IPartOpt / cpu / PostRev | 0.000540141 s | 0.000499348 s | 1.08 |
| const_scatter / IPartOpt / cpu / BothRev | 0.000546308 s | 0.000509812 s | 1.07 |
| const_scatter / DefOpt / cpu / PreRev | 0.0005743559999999 s | 0.000512476 s | 1.12 |
| const_scatter / DefOpt / cpu / PostRev | 0.000568489 s | 0.000525034 s | 1.08 |
| const_scatter / DefOpt / cpu / BothRev | 0.000555437 s | 0.000517393 s | 1.07 |
| const_scatter / IDefOpt / cpu / PreRev | 0.000562543 s | 0.000538514 s | 1.04 |
| const_scatter / IDefOpt / cpu / PostRev | 0.0005744529999999 s | 0.000505837 s | 1.14 |
| const_scatter / IDefOpt / cpu / BothRev | 0.000563271 s | 0.000512877 s | 1.10 |
| GenDot / JaXPipe / cpu / Primal | 0.000008655499950691592 s | 0.000014693 s | 0.59 |
| GenDot / Jax / cpu / Primal | 0.00000805137997303973 s | 0.000014872 s | 0.54 |
| GenDot / HLOOpt / cpu / Primal | 0.00000810146000731038 s | 0.000013814 s | 0.59 |
| GenDot / PartOpt / cpu / Primal | 0.000007599399978062138 s | 0.00001455 s | 0.52 |
| GenDot / IPartOpt / cpu / Primal | 0.000008302819987875409 s | 0.000015255 s | 0.54 |
| GenDot / DefOpt / cpu / Primal | 0.00000757250003516674 s | 0.000014114 s | 0.54 |
| GenDot / IDefOpt / cpu / Primal | 0.000007959999984450405 s | 0.000014076 s | 0.57 |
| GenDot / JaXPipe / cpu / Forward | 0.000011477960088086547 s | 0.000019737 s | 0.58 |
| GenDot / Jax / cpu / Forward | 0.000011042220048693709 s | 0.000020213 s | 0.55 |
| GenDot / HLOOpt / cpu / Forward | 0.00001179348004370695 s | 0.000019072 s | 0.62 |
| GenDot / PartOpt / cpu / Forward | 0.000011551900061022024 s | 0.000018608 s | 0.62 |
| GenDot / IPartOpt / cpu / Forward | 0.000012953839996043825 s | 0.000019455 s | 0.67 |
| GenDot / DefOpt / cpu / Forward | 0.000011589319983613676 s | 0.000019162 s | 0.60 |
| GenDot / IDefOpt / cpu / Forward | 0.000012002180028503062 s | 0.000018916 s | 0.63 |
| GenDot / JaXPipe / cpu / PreRev | 0.000011949440013268031 s | 0.000019669 s | 0.61 |
| GenDot / JaXPipe / cpu / PostRev | 0.000011416520028433296 s | 0.000020344 s | 0.56 |
| GenDot / JaXPipe / cpu / BothRev | 0.000012764299935952294 s | 0.000019259 s | 0.66 |
| GenDot / Jax / cpu / BothRev | 0.000011430320009822026 s | 0.000020241 s | 0.56 |
| GenDot / HLOOpt / cpu / PreRev | 0.000012257779999345076 s | 0.000019136 s | 0.64 |
| GenDot / HLOOpt / cpu / PostRev | 0.000014491279998765094 s | 0.000019546 s | 0.74 |
| GenDot / HLOOpt / cpu / BothRev | 0.000012335060109762707 s | 0.000019856 s | 0.62 |
| GenDot / PartOpt / cpu / PreRev | 0.000012051580106344772 s | 0.000019271 s | 0.63 |
| GenDot / PartOpt / cpu / PostRev | 0.000011537380032677902 s | 0.000020379000000000003 s | 0.57 |
| GenDot / PartOpt / cpu / BothRev | 0.00001303833998463233 s | 0.000019498 s | 0.67 |
| GenDot / IPartOpt / cpu / PreRev | 0.000012118520025978796 s | 0.000018723 s | 0.65 |
| GenDot / IPartOpt / cpu / PostRev | 0.00001158631997896009 s | 0.000020888 s | 0.55 |
| GenDot / IPartOpt / cpu / BothRev | 0.0000120522000179335 s | 0.000019431 s | 0.62 |
| GenDot / DefOpt / cpu / PreRev | 0.000012296419954509477 s | 0.000019354 s | 0.64 |
| GenDot / DefOpt / cpu / PostRev | 0.000012982240059500328 s | 0.000019336 s | 0.67 |
| GenDot / DefOpt / cpu / BothRev | 0.000012908679855172522 s | 0.000019724 s | 0.65 |
| GenDot / IDefOpt / cpu / PreRev | 0.000011804280020442092 s | 0.000019382 s | 0.61 |
| GenDot / IDefOpt / cpu / PostRev | 0.00001281500004552072 s | 0.000019436 s | 0.66 |
| GenDot / IDefOpt / cpu / BothRev | 0.000011647059945971703 s | 0.000019075000000000003 s | 0.61 |
| GenDot / JaXPipe / cuda / Primal | 0.000002016 s | 0.000002527 s | 0.80 |
| GenDot / Jax / cuda / Primal | 0.000002015 s | 0.000002528 s | 0.80 |
| GenDot / HLOOpt / cuda / Primal | 0.000002015 s | 0.000002527 s | 0.80 |
| GenDot / PartOpt / cuda / Primal | 0.000002016 s | 0.000002528 s | 0.80 |
| GenDot / IPartOpt / cuda / Primal | 0.000002015 s | 0.00000256 s | 0.79 |
| GenDot / DefOpt / cuda / Primal | 0.000001983 s | 0.000002527 s | 0.78 |
| GenDot / IDefOpt / cuda / Primal | 0.000002015 s | 0.000002528 s | 0.80 |
| GenDot / JaXPipe / cuda / Forward | 0.000010016 s | 0.00001072 s | 0.93 |
| GenDot / Jax / cuda / Forward | 0.000009632 s | 0.000010624 s | 0.91 |
| GenDot / HLOOpt / cuda / Forward | 0.000009856 s | 0.000010591 s | 0.93 |
| GenDot / PartOpt / cuda / Forward | 0.000009856 s | 0.000010464 s | 0.94 |
| GenDot / IPartOpt / cuda / Forward | 0.000009856 s | 0.000010911 s | 0.90 |
| GenDot / DefOpt / cuda / Forward | 0.000009632 s | 0.000010816 s | 0.89 |
| GenDot / IDefOpt / cuda / Forward | 0.00000976 s | 0.000010816 s | 0.90 |
| GenDot / JaXPipe / cuda / PreRev | 0.000010144 s | 0.000010816 s | 0.94 |
| GenDot / JaXPipe / cuda / PostRev | 0.000009569 s | 0.000010752 s | 0.89 |
| GenDot / JaXPipe / cuda / BothRev | 0.000009728 s | 0.000010528 s | 0.92 |
| GenDot / Jax / cuda / BothRev | 0.000009632 s | 0.000015007 s | 0.64 |
| GenDot / HLOOpt / cuda / PreRev | 0.000009664 s | 0.000010528 s | 0.92 |
| GenDot / HLOOpt / cuda / PostRev | 0.000010048 s | 0.000010656 s | 0.94 |
| GenDot / HLOOpt / cuda / BothRev | 0.00000976 s | 0.000010336 s | 0.94 |
| GenDot / PartOpt / cuda / PreRev | 0.000009855 s | 0.000010624 s | 0.93 |
| GenDot / PartOpt / cuda / PostRev | 0.000009696 s | 0.000010496 s | 0.92 |
| GenDot / PartOpt / cuda / BothRev | 0.000009984 s | 0.000010945 s | 0.91 |
| GenDot / IPartOpt / cuda / PreRev | 0.000009728 s | 0.000010495 s | 0.93 |
| GenDot / IPartOpt / cuda / PostRev | 0.000009568 s | 0.000010368 s | 0.92 |
| GenDot / IPartOpt / cuda / BothRev | 0.000009889 s | 0.00001056 s | 0.94 |
| GenDot / DefOpt / cuda / PreRev | 0.000009663 s | 0.000011232 s | 0.86 |
| GenDot / DefOpt / cuda / PostRev | 0.000009952 s | 0.00001056 s | 0.94 |
| GenDot / DefOpt / cuda / BothRev | 0.00001008 s | 0.000010688 s | 0.94 |
| GenDot / IDefOpt / cuda / PreRev | 0.000010112 s | 0.000010752 s | 0.94 |
| GenDot / IDefOpt / cuda / PostRev | 0.0000096 s | 0.000011168 s | 0.86 |
| GenDot / IDefOpt / cuda / BothRev | 0.000010016 s | 0.000011327 s | 0.88 |
| GenDot / JaXPipe / cpu / Primal | 0.000019408 s | 0.000014693 s | 1.32 |
| GenDot / Jax / cpu / Primal | 0.000019218 s | 0.000014872 s | 1.29 |
| GenDot / HLOOpt / cpu / Primal | 0.000017813 s | 0.000013814 s | 1.29 |
| GenDot / PartOpt / cpu / Primal | 0.000019042 s | 0.00001455 s | 1.31 |
| GenDot / IPartOpt / cpu / Primal | 0.000018721000000000003 s | 0.000015255 s | 1.23 |
| GenDot / DefOpt / cpu / Primal | 0.000029979 s | 0.000014114 s | 2.12 |
| GenDot / IDefOpt / cpu / Primal | 0.00001834 s | 0.000014076 s | 1.30 |
| GenDot / JaXPipe / cpu / Forward | 0.000024669 s | 0.000019737 s | 1.25 |
| GenDot / Jax / cpu / Forward | 0.000026454 s | 0.000020213 s | 1.31 |
| GenDot / HLOOpt / cpu / Forward | 0.000024443 s | 0.000019072 s | 1.28 |
| GenDot / PartOpt / cpu / Forward | 0.000024813 s | 0.000018608 s | 1.33 |
| GenDot / IPartOpt / cpu / Forward | 0.000024456 s | 0.000019455 s | 1.26 |
| GenDot / DefOpt / cpu / Forward | 0.000024638 s | 0.000019162 s | 1.29 |
| GenDot / IDefOpt / cpu / Forward | 0.00002376 s | 0.000018916 s | 1.26 |
| GenDot / JaXPipe / cpu / PreRev | 0.000024978 s | 0.000019669 s | 1.27 |
| GenDot / JaXPipe / cpu / PostRev | 0.000025587 s | 0.000020344 s | 1.26 |
| GenDot / JaXPipe / cpu / BothRev | 0.000024689 s | 0.000019259 s | 1.28 |
| GenDot / Jax / cpu / BothRev | 0.000026965 s | 0.000020241 s | 1.33 |
| GenDot / HLOOpt / cpu / PreRev | 0.000025357 s | 0.000019136 s | 1.33 |
| GenDot / HLOOpt / cpu / PostRev | 0.000024891 s | 0.000019546 s | 1.27 |
| GenDot / HLOOpt / cpu / BothRev | 0.00002541 s | 0.000019856 s | 1.28 |
| GenDot / PartOpt / cpu / PreRev | 0.000026054 s | 0.000019271 s | 1.35 |
| GenDot / PartOpt / cpu / PostRev | 0.000026357 s | 0.000020379000000000003 s | 1.29 |
| GenDot / PartOpt / cpu / BothRev | 0.000026232 s | 0.000019498 s | 1.35 |
| GenDot / IPartOpt / cpu / PreRev | 0.000024981 s | 0.000018723 s | 1.33 |
| GenDot / IPartOpt / cpu / PostRev | 0.000025918 s | 0.000020888 s | 1.24 |
| GenDot / IPartOpt / cpu / BothRev | 0.000030182 s | 0.000019431 s | 1.55 |
| GenDot / DefOpt / cpu / PreRev | 0.000024657 s | 0.000019354 s | 1.27 |
| GenDot / DefOpt / cpu / PostRev | 0.00002571 s | 0.000019336 s | 1.33 |
| GenDot / DefOpt / cpu / BothRev | 0.000025181 s | 0.000019724 s | 1.28 |
| GenDot / IDefOpt / cpu / PreRev | 0.000024852 s | 0.000019382 s | 1.28 |
| GenDot / IDefOpt / cpu / PostRev | 0.000024769 s | 0.000019436 s | 1.27 |
| GenDot / IDefOpt / cpu / BothRev | 0.000024708 s | 0.000019075000000000003 s | 1.30 |
| hlo_ffi / JaXPipe / cpu / Primal | 0.00001026426003591041 s | 0.000017462 s | 0.59 |
| hlo_ffi / Jax / cpu / Primal | 0.000009718339988467053 s | 0.000017542999999999998 s | 0.55 |
| hlo_ffi / HLOOpt / cpu / Primal | 0.000009967559944925596 s | 0.000017298 s | 0.58 |
| hlo_ffi / PartOpt / cpu / Primal | 0.0000092871000742889 s | 0.000017353 s | 0.54 |
| hlo_ffi / IPartOpt / cpu / Primal | 0.000009971979961846957 s | 0.000017631 s | 0.57 |
| hlo_ffi / DefOpt / cpu / Primal | 0.000009495319973211736 s | 0.000017361999999999997 s | 0.55 |
| hlo_ffi / IDefOpt / cpu / Primal | 0.00000944432005780982 s | 0.000017690999999999997 s | 0.53 |
| hlo_ffi / JaXPipe / cpu / Forward | 0.0000139294599830464 s | 0.000024507 s | 0.57 |
| hlo_ffi / Jax / cpu / Forward | 0.0000135791999855428 s | 0.00002347 s | 0.58 |
| hlo_ffi / HLOOpt / cpu / Forward | 0.000013622679925902049 s | 0.000023905 s | 0.57 |
| hlo_ffi / PartOpt / cpu / Forward | 0.000013768179960607086 s | 0.000023943 s | 0.58 |
| hlo_ffi / IPartOpt / cpu / Forward | 0.000013878459976695012 s | 0.000023503 s | 0.59 |
| hlo_ffi / DefOpt / cpu / Forward | 0.00001381172005494591 s | 0.000023983 s | 0.58 |
| hlo_ffi / IDefOpt / cpu / Forward | 0.000013870379989384674 s | 0.000023725 s | 0.58 |
| hlo_ffi / JaXPipe / cpu / PreRev | 0.000014068299969949294 s | 0.000024384 s | 0.58 |
| hlo_ffi / JaXPipe / cpu / PostRev | 0.000014098520096013087 s | 0.000023532 s | 0.60 |
| hlo_ffi / JaXPipe / cpu / BothRev | 0.000014235319977160545 s | 0.000023683 s | 0.60 |
| hlo_ffi / Jax / cpu / BothRev | 0.000013988200025778497 s | 0.000024125 s | 0.58 |
| hlo_ffi / HLOOpt / cpu / PreRev | 0.000014236480037652654 s | 0.000023644 s | 0.60 |
| hlo_ffi / HLOOpt / cpu / PostRev | 0.000016109660027723293 s | 0.000023644 s | 0.68 |
| hlo_ffi / HLOOpt / cpu / BothRev | 0.000014328660054161446 s | 0.00002379 s | 0.60 |
| hlo_ffi / PartOpt / cpu / PreRev | 0.000013927000036346728 s | 0.000023865 s | 0.58 |
| hlo_ffi / PartOpt / cpu / PostRev | 0.000014315100033854832 s | 0.0000235 s | 0.61 |
| hlo_ffi / PartOpt / cpu / BothRev | 0.00001412080011505168 s | 0.000024603 s | 0.57 |
| hlo_ffi / IPartOpt / cpu / PreRev | 0.000014175540090946017 s | 0.000023755 s | 0.60 |
| hlo_ffi / IPartOpt / cpu / PostRev | 0.000013966919996164506 s | 0.000023591 s | 0.59 |
| hlo_ffi / IPartOpt / cpu / BothRev | 0.000013886939977965084 s | 0.000023426 s | 0.59 |
| hlo_ffi / DefOpt / cpu / PreRev | 0.000013985340083308984 s | 0.00002395 s | 0.58 |
| hlo_ffi / DefOpt / cpu / PostRev | 0.000013873640018573496 s | 0.000023763 s | 0.58 |
| hlo_ffi / DefOpt / cpu / BothRev | 0.0000140299399572541 s | 0.000023529 s | 0.60 |
| hlo_ffi / IDefOpt / cpu / PreRev | 0.000014588140038540586 s | 0.000023718 s | 0.62 |
| hlo_ffi / IDefOpt / cpu / PostRev | 0.000014058259985176848 s | 0.000023959 s | 0.59 |
| hlo_ffi / IDefOpt / cpu / BothRev | 0.000014363659975060728 s | 0.000024330000000000003 s | 0.59 |
| hlo_ffi / JaXPipe / cuda / Primal | 0.000001984 s | 0.0000023670000000000004 s | 0.84 |
| hlo_ffi / Jax / cuda / Primal | 0.000001984 s | 0.000002368 s | 0.84 |
| hlo_ffi / HLOOpt / cuda / Primal | 0.000001984 s | 0.0000023670000000000004 s | 0.84 |
| hlo_ffi / PartOpt / cuda / Primal | 0.000001984 s | 0.000002368 s | 0.84 |
| hlo_ffi / IPartOpt / cuda / Primal | 0.000001983 s | 0.0000023670000000000004 s | 0.84 |
| hlo_ffi / DefOpt / cuda / Primal | 0.000001984 s | 0.000002368 s | 0.84 |
| hlo_ffi / IDefOpt / cuda / Primal | 0.000001983 s | 0.0000023670000000000004 s | 0.84 |
| hlo_ffi / JaXPipe / cuda / Forward | 0.00000208 s | 0.000002463 s | 0.84 |
| hlo_ffi / Jax / cuda / Forward | 0.00000208 s | 0.000002464 s | 0.84 |
| hlo_ffi / HLOOpt / cuda / Forward | 0.00000208 s | 0.000002463 s | 0.84 |
| hlo_ffi / PartOpt / cuda / Forward | 0.00000208 s | 0.000002463 s | 0.84 |
| hlo_ffi / IPartOpt / cuda / Forward | 0.00000208 s | 0.000002463 s | 0.84 |
| hlo_ffi / DefOpt / cuda / Forward | 0.00000208 s | 0.000002464 s | 0.84 |
| hlo_ffi / IDefOpt / cuda / Forward | 0.00000208 s | 0.000002463 s | 0.84 |
| hlo_ffi / JaXPipe / cuda / PreRev | 0.000002048 s | 0.000002432 s | 0.84 |
| hlo_ffi / JaXPipe / cuda / PostRev | 0.000002047 s | 0.000002432 s | 0.84 |
| hlo_ffi / JaXPipe / cuda / BothRev | 0.000002048 s | 0.000002432 s | 0.84 |
| hlo_ffi / Jax / cuda / BothRev | 0.000002048 s | 0.000002463 s | 0.83 |
| hlo_ffi / HLOOpt / cuda / PreRev | 0.000002048 s | 0.000002432 s | 0.84 |
| hlo_ffi / HLOOpt / cuda / PostRev | 0.000002047 s | 0.000002433 s | 0.84 |
| hlo_ffi / HLOOpt / cuda / BothRev | 0.000002048 s | 0.000002432 s | 0.84 |
| hlo_ffi / PartOpt / cuda / PreRev | 0.000002048 s | 0.000002432 s | 0.84 |
| hlo_ffi / PartOpt / cuda / PostRev | 0.000002048 s | 0.000002431 s | 0.84 |
| hlo_ffi / PartOpt / cuda / BothRev | 0.000002047 s | 0.000002433 s | 0.84 |
| hlo_ffi / IPartOpt / cuda / PreRev | 0.000002048 s | 0.000002431 s | 0.84 |
| hlo_ffi / IPartOpt / cuda / PostRev | 0.000002048 s | 0.000002463 s | 0.83 |
| hlo_ffi / IPartOpt / cuda / BothRev | 0.000002047 s | 0.000002432 s | 0.84 |
| hlo_ffi / DefOpt / cuda / PreRev | 0.000002048 s | 0.000002432 s | 0.84 |
| hlo_ffi / DefOpt / cuda / PostRev | 0.000002048 s | 0.000002432 s | 0.84 |
| hlo_ffi / DefOpt / cuda / BothRev | 0.000002047 s | 0.000002432 s | 0.84 |
| hlo_ffi / IDefOpt / cuda / PreRev | 0.000002048 s | 0.000002432 s | 0.84 |
| hlo_ffi / IDefOpt / cuda / PostRev | 0.000002047 s | 0.000002432 s | 0.84 |
| hlo_ffi / IDefOpt / cuda / BothRev | 0.000002047 s | 0.000002463 s | 0.83 |
| hlo_ffi / JaXPipe / cpu / Primal | 0.000022758 s | 0.000017462 s | 1.30 |
| hlo_ffi / Jax / cpu / Primal | 0.000021705 s | 0.000017542999999999998 s | 1.24 |
| hlo_ffi / HLOOpt / cpu / Primal | 0.000021358 s | 0.000017298 s | 1.23 |
| hlo_ffi / PartOpt / cpu / Primal | 0.000021739 s | 0.000017353 s | 1.25 |
| hlo_ffi / IPartOpt / cpu / Primal | 0.000022104 s | 0.000017631 s | 1.25 |
| hlo_ffi / DefOpt / cpu / Primal | 0.000021974 s | 0.000017361999999999997 s | 1.27 |
| hlo_ffi / IDefOpt / cpu / Primal | 0.000021558 s | 0.000017690999999999997 s | 1.22 |
| hlo_ffi / JaXPipe / cpu / Forward | 0.00003027 s | 0.000024507 s | 1.24 |
| hlo_ffi / Jax / cpu / Forward | 0.000029797 s | 0.00002347 s | 1.27 |
| hlo_ffi / HLOOpt / cpu / Forward | 0.000029703 s | 0.000023905 s | 1.24 |
| hlo_ffi / PartOpt / cpu / Forward | 0.000029986 s | 0.000023943 s | 1.25 |
| hlo_ffi / IPartOpt / cpu / Forward | 0.000030212 s | 0.000023503 s | 1.29 |
| hlo_ffi / DefOpt / cpu / Forward | 0.000028995 s | 0.000023983 s | 1.21 |
| hlo_ffi / IDefOpt / cpu / Forward | 0.000029699 s | 0.000023725 s | 1.25 |
| hlo_ffi / JaXPipe / cpu / PreRev | 0.000029948 s | 0.000024384 s | 1.23 |
| hlo_ffi / JaXPipe / cpu / PostRev | 0.000029682 s | 0.000023532 s | 1.26 |
| hlo_ffi / JaXPipe / cpu / BothRev | 0.000029163 s | 0.000023683 s | 1.23 |
| hlo_ffi / Jax / cpu / BothRev | 0.000030212 s | 0.000024125 s | 1.25 |
| hlo_ffi / HLOOpt / cpu / PreRev | 0.000030195 s | 0.000023644 s | 1.28 |
| hlo_ffi / HLOOpt / cpu / PostRev | 0.000029494 s | 0.000023644 s | 1.25 |
| hlo_ffi / HLOOpt / cpu / BothRev | 0.000030975 s | 0.00002379 s | 1.30 |
| hlo_ffi / PartOpt / cpu / PreRev | 0.000030471 s | 0.000023865 s | 1.28 |
| hlo_ffi / PartOpt / cpu / PostRev | 0.000036754 s | 0.0000235 s | 1.56 |
| hlo_ffi / PartOpt / cpu / BothRev | 0.00003115 s | 0.000024603 s | 1.27 |
| hlo_ffi / IPartOpt / cpu / PreRev | 0.000030224 s | 0.000023755 s | 1.27 |
| hlo_ffi / IPartOpt / cpu / PostRev | 0.000030987 s | 0.000023591 s | 1.31 |
| hlo_ffi / IPartOpt / cpu / BothRev | 0.000029766 s | 0.000023426 s | 1.27 |
| hlo_ffi / DefOpt / cpu / PreRev | 0.000029992 s | 0.00002395 s | 1.25 |
| hlo_ffi / DefOpt / cpu / PostRev | 0.00003044 s | 0.000023763 s | 1.28 |
| hlo_ffi / DefOpt / cpu / BothRev | 0.000030658000000000004 s | 0.000023529 s | 1.30 |
| hlo_ffi / IDefOpt / cpu / PreRev | 0.000031735 s | 0.000023718 s | 1.34 |
| hlo_ffi / IDefOpt / cpu / PostRev | 0.000031658 s | 0.000023959 s | 1.32 |
| hlo_ffi / IDefOpt / cpu / BothRev | 0.000030994 s | 0.000024330000000000003 s | 1.27 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal | 0.0008948130001954 s | 0.001763568 s | 0.51 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal | 0.0009404604001247 s | 0.001485087 s | 0.63 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal | 0.0009444160001294 s | 0.001712441 s | 0.55 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal | 0.0008793189999778 s | 0.0015765599999999 s | 0.56 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal | 0.0008898544001567 s | 0.0015197989999999 s | 0.59 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal | 0.0009432273998754 s | 0.0017493749999999 s | 0.54 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal | 0.0009511481999652 s | 0.001603363 s | 0.59 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward | 0.00217659000009 s | 0.00437569 s | 0.50 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward | 0.0023137436000979 s | 0.004520619 s | 0.51 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward | 0.0022392284001398 s | 0.004573614 s | 0.49 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward | 0.0021778880001875 s | 0.004502652 s | 0.48 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward | 0.0022469533996627 s | 0.004387929 s | 0.51 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward | 0.0021796589997393 s | 0.004582755 s | 0.48 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward | 0.002219416600019 s | 0.005122053 s | 0.43 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev | 0.0051047533999735 s | 0.007972802 s | 0.64 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev | 0.0061896753997643 s | 0.007790199 s | 0.79 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev | 0.0061058772000251 s | 0.007701489 s | 0.79 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev | 0.0038429115998951 s | 0.008807338 s | 0.44 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev | 0.0057168687999364 s | 0.008201033 s | 0.70 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev | 0.0053523091999522 s | 0.006543085 s | 0.82 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev | 0.0056836800000382 s | 0.009350031 s | 0.61 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev | 0.0035767370001849 s | 0.0074474669999999 s | 0.48 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev | 0.0058589523998307 s | 0.009056619 s | 0.65 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev | 0.0035170327999367 s | 0.007933135 s | 0.44 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev | 0.0056425257998853 s | 0.008086585 s | 0.70 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev | 0.0050290429999222 s | 0.007406945 s | 0.68 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev | 0.0056632367997735 s | 0.009752857 s | 0.58 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev | 0.0036136127997451 s | 0.007572427 s | 0.48 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev | 0.0055444097999497 s | 0.008630254 s | 0.64 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev | 0.0049569376000363 s | 0.007841915 s | 0.63 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev | 0.0057989885999631 s | 0.008346618 s | 0.69 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev | 0.0035721701999136 s | 0.007180154 s | 0.50 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev | 0.0058907759999783 s | 0.007490605 s | 0.79 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal | 0.000281664 s | 0.0002993589999999 s | 0.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal | 0.000280704 s | 0.000299712 s | 0.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal | 0.000288737 s | 0.000306464 s | 0.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal | 0.000280768 s | 0.00029968 s | 0.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal | 0.0002814079999999 s | 0.000298464 s | 0.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal | 0.00028848 s | 0.0003068159999999 s | 0.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal | 0.000288671 s | 0.000306144 s | 0.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward | 0.000557888 s | 0.000584607 s | 0.95 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward | 0.0005401589999999 s | 0.000567936 s | 0.95 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward | 0.000558048 s | 0.000583904 s | 0.96 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward | 0.0005573759999999 s | 0.000583296 s | 0.96 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward | 0.000559104 s | 0.000584128 s | 0.96 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward | 0.0005572169999999 s | 0.000583488 s | 0.95 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward | 0.000557408 s | 0.00058448 s | 0.95 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev | 0.00102784 s | 0.001060191 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev | 0.000987744 s | 0.001013759 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev | 0.001027744 s | 0.0010551669999999 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev | 0.000990144 s | 0.001011808 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev | 0.001016385 s | 0.001040191 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev | 0.001041184 s | 0.00106496 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev | 0.001015041 s | 0.001041535 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev | 0.001030913 s | 0.001055135 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev | 0.00097824 s | 0.001001472 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev | 0.001029536 s | 0.001056832 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev | 0.001028545 s | 0.001054911 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev | 0.000979456 s | 0.001002655 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev | 0.001029376 s | 0.001056224 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev | 0.001024928 s | 0.001055168 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev | 0.000963008 s | 0.000989984 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev | 0.001026497 s | 0.001056703 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev | 0.001024736 s | 0.001056032 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev | 0.00102288 s | 0.001059072 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev | 0.001024737 s | 0.001057055 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal | 0.003871581 s | 0.001763568 s | 2.20 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal | 0.0039086929999999 s | 0.001485087 s | 2.63 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal | 0.0039098579999999 s | 0.001712441 s | 2.28 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal | 0.0042190349999999 s | 0.0015765599999999 s | 2.68 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal | 0.003862298 s | 0.0015197989999999 s | 2.54 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal | 0.003975592 s | 0.0017493749999999 s | 2.27 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal | 0.004404758 s | 0.001603363 s | 2.75 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward | 0.009305477 s | 0.00437569 s | 2.13 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward | 0.009031173 s | 0.004520619 s | 2.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward | 0.009220028 s | 0.004573614 s | 2.02 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward | 0.008917317 s | 0.004502652 s | 1.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward | 0.008532958 s | 0.004387929 s | 1.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward | 0.009110337 s | 0.004582755 s | 1.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward | 0.009191176 s | 0.005122053 s | 1.79 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev | 0.014379832 s | 0.007972802 s | 1.80 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev | 0.012626419 s | 0.007790199 s | 1.62 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev | 0.011883875 s | 0.007701489 s | 1.54 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev | 0.013049146 s | 0.008807338 s | 1.48 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev | 0.0117612439999999 s | 0.008201033 s | 1.43 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev | 0.012597004 s | 0.006543085 s | 1.93 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev | 0.012250112 s | 0.009350031 s | 1.31 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev | 0.012735262 s | 0.0074474669999999 s | 1.71 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev | 0.0127782319999999 s | 0.009056619 s | 1.41 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev | 0.011520987 s | 0.007933135 s | 1.45 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev | 0.0122450069999999 s | 0.008086585 s | 1.51 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev | 0.012909061 s | 0.007406945 s | 1.74 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev | 0.011932401 s | 0.009752857 s | 1.22 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev | 0.012230362 s | 0.007572427 s | 1.62 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev | 0.011278726 s | 0.008630254 s | 1.31 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev | 0.013528509 s | 0.007841915 s | 1.73 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev | 0.013437691 s | 0.008346618 s | 1.61 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev | 0.012209785 s | 0.007180154 s | 1.70 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev | 0.011579109 s | 0.007490605 s | 1.55 |
| scatter_sum / JaXPipe / cpu / Primal | 0.000009154540002782596 s | 0.00001584 s | 0.58 |
| scatter_sum / Jax / cpu / Primal | 0.000009623459991416894 s | 0.000015703000000000002 s | 0.61 |
| scatter_sum / HLOOpt / cpu / Primal | 0.000008490639938827371 s | 0.000015737999999999997 s | 0.54 |
| scatter_sum / PartOpt / cpu / Primal | 0.000008398960089834872 s | 0.000015625 s | 0.54 |
| scatter_sum / IPartOpt / cpu / Primal | 0.00000821985991933616 s | 0.000015793000000000003 s | 0.52 |
| scatter_sum / DefOpt / cpu / Primal | 0.000008579280020057923 s | 0.000015398 s | 0.56 |
| scatter_sum / IDefOpt / cpu / Primal | 0.000008555859985790447 s | 0.000015412 s | 0.56 |
| scatter_sum / JaXPipe / cpu / Forward | 0.00001269419997697696 s | 0.000022844 s | 0.56 |
| scatter_sum / Jax / cpu / Forward | 0.000012817679980798855 s | 0.000022334 s | 0.57 |
| scatter_sum / HLOOpt / cpu / Forward | 0.000013275460059958277 s | 0.000022719 s | 0.58 |
| scatter_sum / PartOpt / cpu / Forward | 0.000012351099958323176 s | 0.000022468 s | 0.55 |
| scatter_sum / IPartOpt / cpu / Forward | 0.0000129965800078935 s | 0.000022597 s | 0.58 |
| scatter_sum / DefOpt / cpu / Forward | 0.000012573780113598331 s | 0.000022084 s | 0.57 |
| scatter_sum / IDefOpt / cpu / Forward | 0.000012421560004440837 s | 0.000038063 s | 0.33 |
| scatter_sum / JaXPipe / cpu / PreRev | 0.00001290754007641226 s | 0.000023225 s | 0.56 |
| scatter_sum / JaXPipe / cpu / PostRev | 0.000013031940034125 s | 0.000022668 s | 0.57 |
| scatter_sum / JaXPipe / cpu / BothRev | 0.00001383471997542074 s | 0.000022202 s | 0.62 |
| scatter_sum / Jax / cpu / BothRev | 0.000012943959973199526 s | 0.000022582 s | 0.57 |
| scatter_sum / HLOOpt / cpu / PreRev | 0.00001327399999354384 s | 0.000022679 s | 0.59 |
| scatter_sum / HLOOpt / cpu / PostRev | 0.000015600480037392116 s | 0.000022522 s | 0.69 |
| scatter_sum / HLOOpt / cpu / BothRev | 0.000013094819951220416 s | 0.000022435 s | 0.58 |
| scatter_sum / PartOpt / cpu / PreRev | 0.000013007339966861763 s | 0.000022439 s | 0.58 |
| scatter_sum / PartOpt / cpu / PostRev | 0.000013879980015190084 s | 0.000022575 s | 0.61 |
| scatter_sum / PartOpt / cpu / BothRev | 0.000014095720016484848 s | 0.000022569 s | 0.62 |
| scatter_sum / IPartOpt / cpu / PreRev | 0.000013099059942760504 s | 0.00002262 s | 0.58 |
| scatter_sum / IPartOpt / cpu / PostRev | 0.000013950880002084886 s | 0.0000223 s | 0.63 |
| scatter_sum / IPartOpt / cpu / BothRev | 0.00001307709997490747 s | 0.00002268 s | 0.58 |
| scatter_sum / DefOpt / cpu / PreRev | 0.000013517839961423306 s | 0.000022839000000000003 s | 0.59 |
| scatter_sum / DefOpt / cpu / PostRev | 0.00001339279995590914 s | 0.000022353 s | 0.60 |
| scatter_sum / DefOpt / cpu / BothRev | 0.00001310420006120694 s | 0.000022472 s | 0.58 |
| scatter_sum / IDefOpt / cpu / PreRev | 0.000013152019946574 s | 0.000022360000000000003 s | 0.59 |
| scatter_sum / IDefOpt / cpu / PostRev | 0.000013564219989348204 s | 0.000021894 s | 0.62 |
| scatter_sum / IDefOpt / cpu / BothRev | 0.000012663240067922742 s | 0.000023007 s | 0.55 |
| scatter_sum / JaXPipe / cuda / Primal | 0.000009953 s | 0.000011648 s | 0.85 |
| scatter_sum / Jax / cuda / Primal | 0.000009888 s | 0.000010816 s | 0.91 |
| scatter_sum / HLOOpt / cuda / Primal | 0.000009984 s | 0.000010784 s | 0.93 |
| scatter_sum / PartOpt / cuda / Primal | 0.000009792 s | 0.00001088 s | 0.90 |
| scatter_sum / IPartOpt / cuda / Primal | 0.000010303 s | 0.000010913 s | 0.94 |
| scatter_sum / DefOpt / cuda / Primal | 0.000010336 s | 0.000011072 s | 0.93 |
| scatter_sum / IDefOpt / cuda / Primal | 0.00001008 s | 0.00001104 s | 0.91 |
| scatter_sum / JaXPipe / cuda / Forward | 0.00001712 s | 0.000018304 s | 0.94 |
| scatter_sum / Jax / cuda / Forward | 0.000016288 s | 0.000016414999999999998 s | 0.99 |
| scatter_sum / HLOOpt / cuda / Forward | 0.00001712 s | 0.000017505 s | 0.98 |
| scatter_sum / PartOpt / cuda / Forward | 0.000017056 s | 0.000017313 s | 0.99 |
| scatter_sum / IPartOpt / cuda / Forward | 0.00001696 s | 0.0000176 s | 0.96 |
| scatter_sum / DefOpt / cuda / Forward | 0.000017503999999999997 s | 0.000017696 s | 0.99 |
| scatter_sum / IDefOpt / cuda / Forward | 0.000016896000000000002 s | 0.000017183 s | 0.98 |
| scatter_sum / JaXPipe / cuda / PreRev | 0.000016864 s | 0.00001728 s | 0.98 |
| scatter_sum / JaXPipe / cuda / PostRev | 0.000016383999999999998 s | 0.000017184 s | 0.95 |
| scatter_sum / JaXPipe / cuda / BothRev | 0.0000168 s | 0.000017824 s | 0.94 |
| scatter_sum / Jax / cuda / BothRev | 0.000016927999999999998 s | 0.000017568000000000002 s | 0.96 |
| scatter_sum / HLOOpt / cuda / PreRev | 0.000016992 s | 0.000018176 s | 0.93 |
| scatter_sum / HLOOpt / cuda / PostRev | 0.00001712 s | 0.000017536 s | 0.98 |
| scatter_sum / HLOOpt / cuda / BothRev | 0.000017375999999999998 s | 0.000017664 s | 0.98 |
| scatter_sum / PartOpt / cuda / PreRev | 0.000017056 s | 0.000018016 s | 0.95 |
| scatter_sum / PartOpt / cuda / PostRev | 0.000016768000000000003 s | 0.000017536 s | 0.96 |
| scatter_sum / PartOpt / cuda / BothRev | 0.000017184 s | 0.000018207 s | 0.94 |
| scatter_sum / IPartOpt / cuda / PreRev | 0.00001696 s | 0.000017919999999999998 s | 0.95 |
| scatter_sum / IPartOpt / cuda / PostRev | 0.000016576000000000002 s | 0.000016927999999999998 s | 0.98 |
| scatter_sum / IPartOpt / cuda / BothRev | 0.000016544 s | 0.000017824 s | 0.93 |
| scatter_sum / DefOpt / cuda / PreRev | 0.000016288 s | 0.000017952 s | 0.91 |
| scatter_sum / DefOpt / cuda / PostRev | 0.000017375999999999998 s | 0.000016992 s | 1.02 |
| scatter_sum / DefOpt / cuda / BothRev | 0.000016672 s | 0.000017088 s | 0.98 |
| scatter_sum / IDefOpt / cuda / PreRev | 0.000016545 s | 0.0000176 s | 0.94 |
| scatter_sum / IDefOpt / cuda / PostRev | 0.000016672 s | 0.000017408 s | 0.96 |
| scatter_sum / IDefOpt / cuda / BothRev | 0.000017024 s | 0.000017728 s | 0.96 |
| scatter_sum / JaXPipe / cpu / Primal | 0.00001984 s | 0.00001584 s | 1.25 |
| scatter_sum / Jax / cpu / Primal | 0.000032305 s | 0.000015703000000000002 s | 2.06 |
| scatter_sum / HLOOpt / cpu / Primal | 0.000019872 s | 0.000015737999999999997 s | 1.26 |
| scatter_sum / PartOpt / cpu / Primal | 0.000019809 s | 0.000015625 s | 1.27 |
| scatter_sum / IPartOpt / cpu / Primal | 0.000020168 s | 0.000015793000000000003 s | 1.28 |
| scatter_sum / DefOpt / cpu / Primal | 0.00002545 s | 0.000015398 s | 1.65 |
| scatter_sum / IDefOpt / cpu / Primal | 0.000019842 s | 0.000015412 s | 1.29 |
| scatter_sum / JaXPipe / cpu / Forward | 0.000029514 s | 0.000022844 s | 1.29 |
| scatter_sum / Jax / cpu / Forward | 0.000029559 s | 0.000022334 s | 1.32 |
| scatter_sum / HLOOpt / cpu / Forward | 0.000029877 s | 0.000022719 s | 1.32 |
| scatter_sum / PartOpt / cpu / Forward | 0.000028278000000000003 s | 0.000022468 s | 1.26 |
| scatter_sum / IPartOpt / cpu / Forward | 0.000028307 s | 0.000022597 s | 1.25 |
| scatter_sum / DefOpt / cpu / Forward | 0.000028855 s | 0.000022084 s | 1.31 |
| scatter_sum / IDefOpt / cpu / Forward | 0.000028536 s | 0.000038063 s | 0.75 |
| scatter_sum / JaXPipe / cpu / PreRev | 0.000029477 s | 0.000023225 s | 1.27 |
| scatter_sum / JaXPipe / cpu / PostRev | 0.000028519 s | 0.000022668 s | 1.26 |
| scatter_sum / JaXPipe / cpu / BothRev | 0.000029352 s | 0.000022202 s | 1.32 |
| scatter_sum / Jax / cpu / BothRev | 0.000028989 s | 0.000022582 s | 1.28 |
| scatter_sum / HLOOpt / cpu / PreRev | 0.000029257 s | 0.000022679 s | 1.29 |
| scatter_sum / HLOOpt / cpu / PostRev | 0.000029472 s | 0.000022522 s | 1.31 |
| scatter_sum / HLOOpt / cpu / BothRev | 0.000029047 s | 0.000022435 s | 1.29 |
| scatter_sum / PartOpt / cpu / PreRev | 0.000030898 s | 0.000022439 s | 1.38 |
| scatter_sum / PartOpt / cpu / PostRev | 0.000029493 s | 0.000022575 s | 1.31 |
| scatter_sum / PartOpt / cpu / BothRev | 0.000029687 s | 0.000022569 s | 1.32 |
| scatter_sum / IPartOpt / cpu / PreRev | 0.000030404 s | 0.00002262 s | 1.34 |
| scatter_sum / IPartOpt / cpu / PostRev | 0.000029498 s | 0.0000223 s | 1.32 |
| scatter_sum / IPartOpt / cpu / BothRev | 0.000029419 s | 0.00002268 s | 1.30 |
| scatter_sum / DefOpt / cpu / PreRev | 0.000035113 s | 0.000022839000000000003 s | 1.54 |
| scatter_sum / DefOpt / cpu / PostRev | 0.000029329 s | 0.000022353 s | 1.31 |
| scatter_sum / DefOpt / cpu / BothRev | 0.000028844 s | 0.000022472 s | 1.28 |
| scatter_sum / IDefOpt / cpu / PreRev | 0.000028152 s | 0.000022360000000000003 s | 1.26 |
| scatter_sum / IDefOpt / cpu / PostRev | 0.000029448 s | 0.000021894 s | 1.35 |
| scatter_sum / IDefOpt / cpu / BothRev | 0.000028675 s | 0.000023007 s | 1.25 |
| slicing / JaXPipe / cpu / Primal | 0.000006988280038058292 s | 0.000013153 s | 0.53 |
| slicing / Jax / cpu / Primal | 0.000007066620037221583 s | 0.000012428 s | 0.57 |
| slicing / HLOOpt / cpu / Primal | 0.000006935719939065166 s | 0.000012639 s | 0.55 |
| slicing / PartOpt / cpu / Primal | 0.000006532999923365423 s | 0.000012471 s | 0.52 |
| slicing / IPartOpt / cpu / Primal | 0.00000698045998433372 s | 0.000012629 s | 0.55 |
| slicing / DefOpt / cpu / Primal | 0.000007035759972495726 s | 0.000012661 s | 0.56 |
| slicing / IDefOpt / cpu / Primal | 0.000006859160075691761 s | 0.000012617 s | 0.54 |
| slicing / JaXPipe / cpu / Forward | 0.0000106280599902675 s | 0.000016938999999999998 s | 0.63 |
| slicing / Jax / cpu / Forward | 0.000010632459998305422 s | 0.000016388 s | 0.65 |
| slicing / HLOOpt / cpu / Forward | 0.000010340820044802968 s | 0.000016751999999999998 s | 0.62 |
| slicing / PartOpt / cpu / Forward | 0.000010007799955928933 s | 0.000016729000000000002 s | 0.60 |
| slicing / IPartOpt / cpu / Forward | 0.000010841939983947668 s | 0.000016976 s | 0.64 |
| slicing / DefOpt / cpu / Forward | 0.00001065325992385624 s | 0.000016904 s | 0.63 |
| slicing / IDefOpt / cpu / Forward | 0.000010414560019853525 s | 0.000017279 s | 0.60 |
| slicing / JaXPipe / cpu / PreRev | 0.000011049479981011246 s | 0.000017972 s | 0.61 |
| slicing / JaXPipe / cpu / PostRev | 0.000011307579879940022 s | 0.000017929 s | 0.63 |
| slicing / JaXPipe / cpu / BothRev | 0.000011088060036854583 s | 0.000017579000000000002 s | 0.63 |
| slicing / Jax / cpu / BothRev | 0.000011041820071113762 s | 0.000017622 s | 0.63 |
| slicing / HLOOpt / cpu / PreRev | 0.000011057679930672748 s | 0.000017704999999999997 s | 0.62 |
| slicing / HLOOpt / cpu / PostRev | 0.000012811960023100256 s | 0.000017486 s | 0.73 |
| slicing / HLOOpt / cpu / BothRev | 0.000010890100093092767 s | 0.000017561 s | 0.62 |
| slicing / PartOpt / cpu / PreRev | 0.00001082938002582523 s | 0.000017477 s | 0.62 |
| slicing / PartOpt / cpu / PostRev | 0.000011123199965368255 s | 0.000017547 s | 0.63 |
| slicing / PartOpt / cpu / BothRev | 0.000011550819999683882 s | 0.000017319 s | 0.67 |
| slicing / IPartOpt / cpu / PreRev | 0.000010511539985600393 s | 0.000017587 s | 0.60 |
| slicing / IPartOpt / cpu / PostRev | 0.000011013359944627157 s | 0.000017902000000000002 s | 0.62 |
| slicing / IPartOpt / cpu / BothRev | 0.00001140196009146166 s | 0.000017478 s | 0.65 |
| slicing / DefOpt / cpu / PreRev | 0.000010362459906900767 s | 0.00001729 s | 0.60 |
| slicing / DefOpt / cpu / PostRev | 0.00001125102000514744 s | 0.000017531000000000002 s | 0.64 |
| slicing / DefOpt / cpu / BothRev | 0.000011375419999239968 s | 0.000017117 s | 0.66 |
| slicing / IDefOpt / cpu / PreRev | 0.000010499399959371658 s | 0.000017479 s | 0.60 |
| slicing / IDefOpt / cpu / PostRev | 0.000011229100091441068 s | 0.000017400999999999998 s | 0.65 |
| slicing / IDefOpt / cpu / BothRev | 0.000010706280099839206 s | 0.000017239999999999998 s | 0.62 |
| slicing / JaXPipe / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / Jax / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / HLOOpt / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / PartOpt / cuda / Primal | 0.000001887 s | 0.000002272 s | 0.83 |
| slicing / IPartOpt / cuda / Primal | 0.000001887 s | 0.000002272 s | 0.83 |
| slicing / DefOpt / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / IDefOpt / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / JaXPipe / cuda / Forward | 0.000009984 s | 0.000010336 s | 0.97 |
| slicing / Jax / cuda / Forward | 0.00001008 s | 0.00001024 s | 0.98 |
| slicing / HLOOpt / cuda / Forward | 0.000010016 s | 0.000009536 s | 1.05 |
| slicing / PartOpt / cuda / Forward | 0.000009696 s | 0.000010655 s | 0.91 |
| slicing / IPartOpt / cuda / Forward | 0.000010017 s | 0.000010112 s | 0.99 |
| slicing / DefOpt / cuda / Forward | 0.000009984 s | 0.000010464 s | 0.95 |
| slicing / IDefOpt / cuda / Forward | 0.00001008 s | 0.000010176 s | 0.99 |
| slicing / JaXPipe / cuda / PreRev | 0.000010143 s | 0.000010272 s | 0.99 |
| slicing / JaXPipe / cuda / PostRev | 0.000009824 s | 0.000010496 s | 0.94 |
| slicing / JaXPipe / cuda / BothRev | 0.000009951 s | 0.000010336 s | 0.96 |
| slicing / Jax / cuda / BothRev | 0.000010944 s | 0.0000104 s | 1.05 |
| slicing / HLOOpt / cuda / PreRev | 0.000010177 s | 0.000010368 s | 0.98 |
| slicing / HLOOpt / cuda / PostRev | 0.000010816 s | 0.000010559 s | 1.02 |
| slicing / HLOOpt / cuda / BothRev | 0.00001088 s | 0.000010208 s | 1.07 |
| slicing / PartOpt / cuda / PreRev | 0.000011136 s | 0.000010336 s | 1.08 |
| slicing / PartOpt / cuda / PostRev | 0.000009728 s | 0.000010144 s | 0.96 |
| slicing / PartOpt / cuda / BothRev | 0.00000992 s | 0.000010273 s | 0.97 |
| slicing / IPartOpt / cuda / PreRev | 0.000009664 s | 0.000010496 s | 0.92 |
| slicing / IPartOpt / cuda / PostRev | 0.00001072 s | 0.000010304 s | 1.04 |
| slicing / IPartOpt / cuda / BothRev | 0.000009024 s | 0.000010529 s | 0.86 |
| slicing / DefOpt / cuda / PreRev | 0.000009536 s | 0.000010624 s | 0.90 |
| slicing / DefOpt / cuda / PostRev | 0.00000976 s | 0.000010273 s | 0.95 |
| slicing / DefOpt / cuda / BothRev | 0.000012672 s | 0.0000104 s | 1.22 |
| slicing / IDefOpt / cuda / PreRev | 0.000010048 s | 0.000010304 s | 0.98 |
| slicing / IDefOpt / cuda / PostRev | 0.000009856 s | 0.000010432 s | 0.94 |
| slicing / IDefOpt / cuda / BothRev | 0.00000976 s | 0.000010752 s | 0.91 |
| slicing / JaXPipe / cpu / Primal | 0.000016136999999999998 s | 0.000013153 s | 1.23 |
| slicing / Jax / cpu / Primal | 0.000015879999999999997 s | 0.000012428 s | 1.28 |
| slicing / HLOOpt / cpu / Primal | 0.000016332 s | 0.000012639 s | 1.29 |
| slicing / PartOpt / cpu / Primal | 0.000016057 s | 0.000012471 s | 1.29 |
| slicing / IPartOpt / cpu / Primal | 0.000016131 s | 0.000012629 s | 1.28 |
| slicing / DefOpt / cpu / Primal | 0.000016325 s | 0.000012661 s | 1.29 |
| slicing / IDefOpt / cpu / Primal | 0.00001594 s | 0.000012617 s | 1.26 |
| slicing / JaXPipe / cpu / Forward | 0.000021582 s | 0.000016938999999999998 s | 1.27 |
| slicing / Jax / cpu / Forward | 0.000020663 s | 0.000016388 s | 1.26 |
| slicing / HLOOpt / cpu / Forward | 0.000021306 s | 0.000016751999999999998 s | 1.27 |
| slicing / PartOpt / cpu / Forward | 0.000021174 s | 0.000016729000000000002 s | 1.27 |
| slicing / IPartOpt / cpu / Forward | 0.00002111 s | 0.000016976 s | 1.24 |
| slicing / DefOpt / cpu / Forward | 0.000021703 s | 0.000016904 s | 1.28 |
| slicing / IDefOpt / cpu / Forward | 0.000021648 s | 0.000017279 s | 1.25 |
| slicing / JaXPipe / cpu / PreRev | 0.000022414 s | 0.000017972 s | 1.25 |
| slicing / JaXPipe / cpu / PostRev | 0.000027144000000000003 s | 0.000017929 s | 1.51 |
| slicing / JaXPipe / cpu / BothRev | 0.000022022 s | 0.000017579000000000002 s | 1.25 |
| slicing / Jax / cpu / BothRev | 0.000021541000000000003 s | 0.000017622 s | 1.22 |
| slicing / HLOOpt / cpu / PreRev | 0.000022528 s | 0.000017704999999999997 s | 1.27 |
| slicing / HLOOpt / cpu / PostRev | 0.000027577 s | 0.000017486 s | 1.58 |
| slicing / HLOOpt / cpu / BothRev | 0.000022011 s | 0.000017561 s | 1.25 |
| slicing / PartOpt / cpu / PreRev | 0.000022279 s | 0.000017477 s | 1.27 |
| slicing / PartOpt / cpu / PostRev | 0.000022178 s | 0.000017547 s | 1.26 |
| slicing / PartOpt / cpu / BothRev | 0.000021563 s | 0.000017319 s | 1.25 |
| slicing / IPartOpt / cpu / PreRev | 0.000021747 s | 0.000017587 s | 1.24 |
| slicing / IPartOpt / cpu / PostRev | 0.000022020000000000003 s | 0.000017902000000000002 s | 1.23 |
| slicing / IPartOpt / cpu / BothRev | 0.000022323 s | 0.000017478 s | 1.28 |
| slicing / DefOpt / cpu / PreRev | 0.000022468 s | 0.00001729 s | 1.30 |
| slicing / DefOpt / cpu / PostRev | 0.000027606 s | 0.000017531000000000002 s | 1.57 |
| slicing / DefOpt / cpu / BothRev | 0.000021705 s | 0.000017117 s | 1.27 |
| slicing / IDefOpt / cpu / PreRev | 0.000022063 s | 0.000017479 s | 1.26 |
| slicing / IDefOpt / cpu / PostRev | 0.000022101 s | 0.000017400999999999998 s | 1.27 |
| slicing / IDefOpt / cpu / BothRev | 0.000021656 s | 0.000017239999999999998 s | 1.26 |
| sum / JaXPipe / cpu / Primal | 0.000008094679960777285 s | 0.000014912 s | 0.54 |
| sum / Jax / cpu / Primal | 0.000008412319984927308 s | 0.000014552 s | 0.58 |
| sum / HLOOpt / cpu / Primal | 0.000008335219954460626 s | 0.000014439 s | 0.58 |
| sum / PartOpt / cpu / Primal | 0.00000827539995952975 s | 0.000014603 s | 0.57 |
| sum / IPartOpt / cpu / Primal | 0.000008628000014141435 s | 0.000014616 s | 0.59 |
| sum / DefOpt / cpu / Primal | 0.000008524339955329197 s | 0.000014516 s | 0.59 |
| sum / IDefOpt / cpu / Primal | 0.000008062200031417887 s | 0.000014567 s | 0.55 |
| sum / JaXPipe / cpu / Forward | 0.000012649779946514171 s | 0.000020212 s | 0.63 |
| sum / Jax / cpu / Forward | 0.000012850500024796929 s | 0.000019509 s | 0.66 |
| sum / HLOOpt / cpu / Forward | 0.000012065720020473235 s | 0.000019181 s | 0.63 |
| sum / PartOpt / cpu / Forward | 0.000012285040029382798 s | 0.000019969 s | 0.62 |
| sum / IPartOpt / cpu / Forward | 0.000012768180022249 s | 0.000020112 s | 0.63 |
| sum / DefOpt / cpu / Forward | 0.000012387380065774778 s | 0.000019514 s | 0.63 |
| sum / IDefOpt / cpu / Forward | 0.00001203039999381872 s | 0.000020122 s | 0.60 |
| sum / JaXPipe / cpu / PreRev | 0.00001196961997266044 s | 0.000019419 s | 0.62 |
| sum / JaXPipe / cpu / PostRev | 0.00001230307996593183 s | 0.000018786 s | 0.65 |
| sum / JaXPipe / cpu / BothRev | 0.000011749379937100456 s | 0.000018807 s | 0.62 |
| sum / Jax / cpu / BothRev | 0.000011738560006051558 s | 0.000018584 s | 0.63 |
| sum / HLOOpt / cpu / PreRev | 0.000011904619987035404 s | 0.000019231000000000003 s | 0.62 |
| sum / HLOOpt / cpu / PostRev | 0.000013870320035493933 s | 0.000018961 s | 0.73 |
| sum / HLOOpt / cpu / BothRev | 0.000011738459961634364 s | 0.000018822 s | 0.62 |
| sum / PartOpt / cpu / PreRev | 0.000011421019971749048 s | 0.000018878 s | 0.60 |
| sum / PartOpt / cpu / PostRev | 0.000011838479986181485 s | 0.000018693 s | 0.63 |
| sum / PartOpt / cpu / BothRev | 0.000012038359982398103 s | 0.000018933 s | 0.64 |
| sum / IPartOpt / cpu / PreRev | 0.000012066260114806938 s | 0.000018947 s | 0.64 |
| sum / IPartOpt / cpu / PostRev | 0.00001161054002295714 s | 0.000018804 s | 0.62 |
| sum / IPartOpt / cpu / BothRev | 0.000012356760089460297 s | 0.000019245000000000003 s | 0.64 |
| sum / DefOpt / cpu / PreRev | 0.000012004619966319297 s | 0.000019159 s | 0.63 |
| sum / DefOpt / cpu / PostRev | 0.000012117299975216156 s | 0.000018694 s | 0.65 |
| sum / DefOpt / cpu / BothRev | 0.000012032840022584425 s | 0.0000187 s | 0.64 |
| sum / IDefOpt / cpu / PreRev | 0.000011628960073721828 s | 0.000018622 s | 0.62 |
| sum / IDefOpt / cpu / PostRev | 0.00001172898006188916 s | 0.000018793 s | 0.62 |
| sum / IDefOpt / cpu / BothRev | 0.0000119376400107285 s | 0.000018927 s | 0.63 |
| sum / JaXPipe / cuda / Primal | 0.000002047 s | 0.000002495 s | 0.82 |
| sum / Jax / cuda / Primal | 0.000002048 s | 0.000002496 s | 0.82 |
| sum / HLOOpt / cuda / Primal | 0.000002048 s | 0.000002496 s | 0.82 |
| sum / PartOpt / cuda / Primal | 0.000002047 s | 0.000002496 s | 0.82 |
| sum / IPartOpt / cuda / Primal | 0.000002048 s | 0.000002496 s | 0.82 |
| sum / DefOpt / cuda / Primal | 0.000002047 s | 0.000002496 s | 0.82 |
| sum / IDefOpt / cuda / Primal | 0.000002047 s | 0.000002495 s | 0.82 |
| sum / JaXPipe / cuda / Forward | 0.000010208 s | 0.000010624 s | 0.96 |
| sum / Jax / cuda / Forward | 0.000010144 s | 0.00001104 s | 0.92 |
| sum / HLOOpt / cuda / Forward | 0.0000104 s | 0.00001072 s | 0.97 |
| sum / PartOpt / cuda / Forward | 0.000010016 s | 0.000010496 s | 0.95 |
| sum / IPartOpt / cuda / Forward | 0.000010016 s | 0.000010912 s | 0.92 |
| sum / DefOpt / cuda / Forward | 0.000010911 s | 0.000010912 s | 1.00 |
| sum / IDefOpt / cuda / Forward | 0.000010336 s | 0.00001056 s | 0.98 |
| sum / JaXPipe / cuda / PreRev | 0.000010625 s | 0.000010176 s | 1.04 |
| sum / JaXPipe / cuda / PostRev | 0.000010752 s | 0.000010432 s | 1.03 |
| sum / JaXPipe / cuda / BothRev | 0.000009887 s | 0.000010271 s | 0.96 |
| sum / Jax / cuda / BothRev | 0.00001008 s | 0.000010112 s | 1.00 |
| sum / HLOOpt / cuda / PreRev | 0.000009792 s | 0.00001024 s | 0.96 |
| sum / HLOOpt / cuda / PostRev | 0.00000944 s | 0.0000104 s | 0.91 |
| sum / HLOOpt / cuda / BothRev | 0.000009856 s | 0.000010464 s | 0.94 |
| sum / PartOpt / cuda / PreRev | 0.000010176 s | 0.000010592 s | 0.96 |
| sum / PartOpt / cuda / PostRev | 0.000010112 s | 0.000010336 s | 0.98 |
| sum / PartOpt / cuda / BothRev | 0.000010016 s | 0.0000104 s | 0.96 |
| sum / IPartOpt / cuda / PreRev | 0.000009984 s | 0.000010592 s | 0.94 |
| sum / IPartOpt / cuda / PostRev | 0.000009344 s | 0.0000104 s | 0.90 |
| sum / IPartOpt / cuda / BothRev | 0.000009568 s | 0.00000992 s | 0.96 |
| sum / DefOpt / cuda / PreRev | 0.000009792 s | 0.000010816 s | 0.91 |
| sum / DefOpt / cuda / PostRev | 0.000009376 s | 0.000010432 s | 0.90 |
| sum / DefOpt / cuda / BothRev | 0.000009663 s | 0.000010304 s | 0.94 |
| sum / IDefOpt / cuda / PreRev | 0.000009983 s | 0.000010752 s | 0.93 |
| sum / IDefOpt / cuda / PostRev | 0.000010176 s | 0.000010688 s | 0.95 |
| sum / IDefOpt / cuda / BothRev | 0.000009792 s | 0.000010496 s | 0.93 |
| sum / JaXPipe / cpu / Primal | 0.000018885 s | 0.000014912 s | 1.27 |
| sum / Jax / cpu / Primal | 0.000018780000000000003 s | 0.000014552 s | 1.29 |
| sum / HLOOpt / cpu / Primal | 0.000019347 s | 0.000014439 s | 1.34 |
| sum / PartOpt / cpu / Primal | 0.000019156 s | 0.000014603 s | 1.31 |
| sum / IPartOpt / cpu / Primal | 0.000018077 s | 0.000014616 s | 1.24 |
| sum / DefOpt / cpu / Primal | 0.000018712 s | 0.000014516 s | 1.29 |
| sum / IDefOpt / cpu / Primal | 0.000023731 s | 0.000014567 s | 1.63 |
| sum / JaXPipe / cpu / Forward | 0.000025582 s | 0.000020212 s | 1.27 |
| sum / Jax / cpu / Forward | 0.000025029 s | 0.000019509 s | 1.28 |
| sum / HLOOpt / cpu / Forward | 0.000024867 s | 0.000019181 s | 1.30 |
| sum / PartOpt / cpu / Forward | 0.000024642 s | 0.000019969 s | 1.23 |
| sum / IPartOpt / cpu / Forward | 0.000025543 s | 0.000020112 s | 1.27 |
| sum / DefOpt / cpu / Forward | 0.000025109 s | 0.000019514 s | 1.29 |
| sum / IDefOpt / cpu / Forward | 0.000030989 s | 0.000020122 s | 1.54 |
| sum / JaXPipe / cpu / PreRev | 0.000024745 s | 0.000019419 s | 1.27 |
| sum / JaXPipe / cpu / PostRev | 0.000024067 s | 0.000018786 s | 1.28 |
| sum / JaXPipe / cpu / BothRev | 0.000022797 s | 0.000018807 s | 1.21 |
| sum / Jax / cpu / BothRev | 0.000023684 s | 0.000018584 s | 1.27 |
| sum / HLOOpt / cpu / PreRev | 0.000023799 s | 0.000019231000000000003 s | 1.24 |
| sum / HLOOpt / cpu / PostRev | 0.000024567 s | 0.000018961 s | 1.30 |
| sum / HLOOpt / cpu / BothRev | 0.000024352 s | 0.000018822 s | 1.29 |
| sum / PartOpt / cpu / PreRev | 0.00002404 s | 0.000018878 s | 1.27 |
| sum / PartOpt / cpu / PostRev | 0.000024541 s | 0.000018693 s | 1.31 |
| sum / PartOpt / cpu / BothRev | 0.000024032 s | 0.000018933 s | 1.27 |
| sum / IPartOpt / cpu / PreRev | 0.000024361 s | 0.000018947 s | 1.29 |
| sum / IPartOpt / cpu / PostRev | 0.0000247 s | 0.000018804 s | 1.31 |
| sum / IPartOpt / cpu / BothRev | 0.000024032 s | 0.000019245000000000003 s | 1.25 |
| sum / DefOpt / cpu / PreRev | 0.000024139 s | 0.000019159 s | 1.26 |
| sum / DefOpt / cpu / PostRev | 0.00002442 s | 0.000018694 s | 1.31 |
| sum / DefOpt / cpu / BothRev | 0.000023737 s | 0.0000187 s | 1.27 |
| sum / IDefOpt / cpu / PreRev | 0.000024027 s | 0.000018622 s | 1.29 |
| sum / IDefOpt / cpu / PostRev | 0.000023509 s | 0.000018793 s | 1.25 |
| sum / IDefOpt / cpu / BothRev | 0.000024619 s | 0.000018927 s | 1.30 |
| value_and_grad / JaXPipe / cpu / Primal | 0.000015381460052594775 s | 0.000023072 s | 0.67 |
| value_and_grad / Jax / cpu / Primal | 0.000015481699902011313 s | 0.000022463 s | 0.69 |
| value_and_grad / HLOOpt / cpu / Primal | 0.000015754139931232202 s | 0.000022896 s | 0.69 |
| value_and_grad / PartOpt / cpu / Primal | 0.000015026540004328129 s | 0.000022911 s | 0.66 |
| value_and_grad / IPartOpt / cpu / Primal | 0.0000145659999907366 s | 0.000022942 s | 0.63 |
| value_and_grad / DefOpt / cpu / Primal | 0.0000151333398935094 s | 0.000022862 s | 0.66 |
| value_and_grad / IDefOpt / cpu / Primal | 0.00001472878000640776 s | 0.000022936 s | 0.64 |
| value_and_grad / JaXPipe / cuda / Primal | 0.000033535 s | 0.000034944 s | 0.96 |
| value_and_grad / Jax / cuda / Primal | 0.000033184 s | 0.000036448 s | 0.91 |
| value_and_grad / HLOOpt / cuda / Primal | 0.000033664 s | 0.000035904 s | 0.94 |
| value_and_grad / PartOpt / cuda / Primal | 0.000033472 s | 0.000036352 s | 0.92 |
| value_and_grad / IPartOpt / cuda / Primal | 0.000033472 s | 0.000036671 s | 0.91 |
| value_and_grad / DefOpt / cuda / Primal | 0.000033057000000000006 s | 0.000036384 s | 0.91 |
| value_and_grad / IDefOpt / cuda / Primal | 0.000033504 s | 0.000035392 s | 0.95 |
| value_and_grad / JaXPipe / cpu / Primal | 0.000028601000000000003 s | 0.000023072 s | 1.24 |
| value_and_grad / Jax / cpu / Primal | 0.000028157 s | 0.000022463 s | 1.25 |
| value_and_grad / HLOOpt / cpu / Primal | 0.000028392 s | 0.000022896 s | 1.24 |
| value_and_grad / PartOpt / cpu / Primal | 0.000030939 s | 0.000022911 s | 1.35 |
| value_and_grad / IPartOpt / cpu / Primal | 0.000028996 s | 0.000022942 s | 1.26 |
| value_and_grad / DefOpt / cpu / Primal | 0.000029375 s | 0.000022862 s | 1.28 |
| value_and_grad / IDefOpt / cpu / Primal | 0.000028282 s | 0.000022936 s | 1.23 |
| jaxmd20 / JaXPipe / cuda / Primal | 0.001417056 s | 0.001456319 s | 0.97 |
| jaxmd20 / Jax / cuda / Primal | 0.001478624 s | 0.00147408 s | 1.00 |
| jaxmd20 / HLOOpt / cuda / Primal | 0.001346143 s | 0.001342494 s | 1.00 |
| jaxmd20 / PartOpt / cuda / Primal | 0.001303138 s | 0.001357439 s | 0.96 |
| jaxmd20 / IPartOpt / cuda / Primal | 0.001306336 s | 0.001378975 s | 0.95 |
| jaxmd20 / DefOpt / cuda / Primal | 0.000926559 s | 0.000939903 s | 0.99 |
| jaxmd20 / IDefOpt / cuda / Primal | 0.0009497609999999 s | 0.000970495 s | 0.98 |
| jaxmd20 / JaXPipe / cuda / Forward | 0.001552993 s | 0.001624415 s | 0.96 |
| jaxmd20 / Jax / cuda / Forward | 0.001814432 s | 0.001870656 s | 0.97 |
| jaxmd20 / HLOOpt / cuda / Forward | 0.00164416 s | 0.001714815 s | 0.96 |
| jaxmd20 / PartOpt / cuda / Forward | 0.001636512 s | 0.001707679 s | 0.96 |
| jaxmd20 / IPartOpt / cuda / Forward | 0.001636385 s | 0.001716833 s | 0.95 |
| jaxmd20 / DefOpt / cuda / Forward | 0.00165936 s | 0.001706431 s | 0.97 |
| jaxmd20 / IDefOpt / cuda / Forward | 0.001626848 s | 0.001709119 s | 0.95 |
| jaxmd20 / JaXPipe / cuda / PreRev | 0.002656319 s | 0.002762878 s | 0.96 |
| jaxmd20 / JaXPipe / cuda / PostRev | 0.005333183 s | 0.005460187 s | 0.98 |
| jaxmd20 / JaXPipe / cuda / BothRev | 0.002693312 s | 0.002752605 s | 0.98 |
| jaxmd20 / Jax / cuda / BothRev | 0.005326783 s | 0.005413761 s | 0.98 |
| jaxmd20 / HLOOpt / cuda / PreRev | 0.00275213 s | 0.002928446 s | 0.94 |
| jaxmd20 / HLOOpt / cuda / PostRev | 0.005331266 s | 0.005459772 s | 0.98 |
| jaxmd20 / HLOOpt / cuda / BothRev | 0.002721218 s | 0.00282531 s | 0.96 |
| jaxmd20 / PartOpt / cuda / PreRev | 0.0028088329999999 s | 0.002893278 s | 0.97 |
| jaxmd20 / PartOpt / cuda / PostRev | 0.005564352 s | 0.00555782 s | 1.00 |
| jaxmd20 / PartOpt / cuda / BothRev | 0.002761762 s | 0.002812062 s | 0.98 |
| jaxmd20 / IPartOpt / cuda / PreRev | 0.002804225 s | 0.00290387 s | 0.97 |
| jaxmd20 / IPartOpt / cuda / PostRev | 0.005539298 s | 0.005551674 s | 1.00 |
| jaxmd20 / IPartOpt / cuda / BothRev | 0.00276352 s | 0.00283331 s | 0.98 |
| jaxmd20 / DefOpt / cuda / PreRev | 0.002830913 s | 0.002903646 s | 0.97 |
| jaxmd20 / DefOpt / cuda / PostRev | 0.002784509 s | 0.002856703 s | 0.97 |
| jaxmd20 / DefOpt / cuda / BothRev | 0.002754179 s | 0.002813854 s | 0.98 |
| jaxmd20 / IDefOpt / cuda / PreRev | 0.0028197139999999 s | 0.002919998 s | 0.97 |
| jaxmd20 / IDefOpt / cuda / PostRev | 0.002319455 s | 0.0023530549999999 s | 0.99 |
| jaxmd20 / IDefOpt / cuda / BothRev | 0.0027590089999999 s | 0.0028736929999999 s | 0.96 |
| jaxmd40 / JaXPipe / cpu / Primal | 0.080871744 s | 0.066913492 s | 1.21 |
| jaxmd40 / Jax / cpu / Primal | 0.086542735 s | 0.072603204 s | 1.19 |
| jaxmd40 / HLOOpt / cpu / Primal | 0.107738907 s | 0.08960644 s | 1.20 |
| jaxmd40 / PartOpt / cpu / Primal | 0.0875070699999999 s | 0.068425087 s | 1.28 |
| jaxmd40 / IPartOpt / cpu / Primal | 0.090285724 s | 0.071339947 s | 1.27 |
| jaxmd40 / DefOpt / cpu / Primal | 0.111931281 s | 0.082478036 s | 1.36 |
| jaxmd40 / IDefOpt / cpu / Primal | 0.111787245 s | 0.0873339509999999 s | 1.28 |
| jaxmd40 / JaXPipe / cpu / Forward | 0.2082339299999999 s | 0.162652715 s | 1.28 |
| jaxmd40 / Jax / cpu / Forward | 0.11438239 s | 0.083123315 s | 1.38 |
| jaxmd40 / HLOOpt / cpu / Forward | 0.202549076 s | 0.1585048329999999 s | 1.28 |
| jaxmd40 / PartOpt / cpu / Forward | 0.206394582 s | 0.156655901 s | 1.32 |
| jaxmd40 / IPartOpt / cpu / Forward | 0.2166409219999999 s | 0.158348678 s | 1.37 |
| jaxmd40 / DefOpt / cpu / Forward | 0.215566341 s | 0.160751004 s | 1.34 |
| jaxmd40 / IDefOpt / cpu / Forward | 0.219232094 s | 0.156729236 s | 1.40 |
| jaxmd40 / JaXPipe / cpu / PreRev | 0.285179328 s | 0.236199343 s | 1.21 |
| jaxmd40 / JaXPipe / cpu / PostRev | 0.1827301579999999 s | 0.139126863 s | 1.31 |
| jaxmd40 / JaXPipe / cpu / BothRev | 0.279620878 s | 0.222342406 s | 1.26 |
| jaxmd40 / Jax / cpu / BothRev | 0.175687619 s | 0.139139926 s | 1.26 |
| jaxmd40 / HLOOpt / cpu / PreRev | 0.271401804 s | 0.229555762 s | 1.18 |
| jaxmd40 / HLOOpt / cpu / PostRev | 0.225985165 s | 0.166843464 s | 1.35 |
| jaxmd40 / HLOOpt / cpu / BothRev | 0.296669074 s | 0.227585488 s | 1.30 |
| jaxmd40 / PartOpt / cpu / PreRev | 0.268977918 s | 0.2264403519999999 s | 1.19 |
| jaxmd40 / PartOpt / cpu / PostRev | 0.158362114 s | 0.1348537139999999 s | 1.17 |
| jaxmd40 / PartOpt / cpu / BothRev | 0.308250832 s | 0.243010073 s | 1.27 |
| jaxmd40 / IPartOpt / cpu / PreRev | 0.278988759 s | 0.212828879 s | 1.31 |
| jaxmd40 / IPartOpt / cpu / PostRev | 0.171958436 s | 0.1201934069999999 s | 1.43 |
| jaxmd40 / IPartOpt / cpu / BothRev | 0.307117154 s | 0.241794462 s | 1.27 |
| jaxmd40 / DefOpt / cpu / PreRev | 0.265080799 s | 0.235562985 s | 1.13 |
| jaxmd40 / DefOpt / cpu / PostRev | 0.224490867 s | 0.163910763 s | 1.37 |
| jaxmd40 / DefOpt / cpu / BothRev | 0.328720387 s | 0.260948006 s | 1.26 |
| jaxmd40 / IDefOpt / cpu / PreRev | 0.270359511 s | 0.2254909209999999 s | 1.20 |
| jaxmd40 / IDefOpt / cpu / PostRev | 0.218760151 s | 0.174102821 s | 1.26 |
| jaxmd40 / IDefOpt / cpu / BothRev | 0.2854744329999999 s | 0.249471538 s | 1.14 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal | 1.702636371 s | 1.704837354 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal | 1.7054310700000002 s | 1.708882085 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal | 1.7146967709999998 s | 1.7198347470000002 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal | 1.696679978 s | 1.696605072 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal | 1.694667207 s | 1.693524776 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal | 1.666413894 s | 1.668373454 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal | 1.912049832 s | 1.910824517 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal | 7.425411464 s | 5.977225408000001 s | 1.24 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal | 7.481438081999999 s | 5.948982886 s | 1.26 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal | 7.255880417 s | 5.901809903999999 s | 1.23 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal | 7.565334965 s | 5.981963038 s | 1.26 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal | 7.679269209 s | 6.013644241 s | 1.28 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal | 3.254772846 s | 2.365326266 s | 1.38 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal | 8.417110942999999 s | 6.395372049 s | 1.32 |
This comment was automatically generated by workflow using github-action-benchmark.
No description provided.