Add multislice simplification #1958
4d629fd to
a3a7c56
Compare
// If no results are used, this should be handled by dead code elimination
if (usedCount == 0)
  return failure();
also might as well do deletion
same success comment
    op.getLoc(), resultTypes, op.getOperand(), startIndices, limitIndices,
    op.getStrides(), op.getDimension(), newLeftAmount, newRightAmount);

// Map old results to new results
make sure to keep sharding
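As a rough illustration of the "map old results to new results" step in the snippet above, here is a toy model in plain C++ (no MLIR). It assumes the pattern replaces a multi-result op with one that only produces the results that still have uses; the names `countUsed` and `remapUsedResults` are hypothetical and are not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy model, not MLIR. Hypothetical helper: number of results the
// replacement op would produce, given one "used" flag per old result.
std::size_t countUsed(const std::vector<bool> &used) {
  std::size_t n = 0;
  for (bool u : used)
    n += u ? 1 : 0;
  return n;
}

// Hypothetical helper: for each old result index, the index of the matching
// result on the replacement op, or std::nullopt if the old result is unused.
std::vector<std::optional<std::size_t>>
remapUsedResults(const std::vector<bool> &used) {
  std::vector<std::optional<std::size_t>> mapping(used.size());
  std::size_t next = 0;
  for (std::size_t i = 0; i < used.size(); ++i) {
    if (used[i])
      mapping[i] = next++;
  }
  // Every used result received exactly one slot on the new op.
  assert(next == countUsed(used));
  return mapping;
}
```

With `used = {true, false, true}`, old result 0 maps to new result 0, old result 1 has no replacement, and old result 2 maps to new result 1. When `countUsed` returns 0 there is nothing to map, which corresponds to the `usedCount == 0` case discussed earlier in the review.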
a3a7c56 to
61ffc4c
Compare
}

auto sliceOp = rewriter.create<stablehlo::SliceOp>(
    op.getLoc(), op.getOperand(),
same replaceOpWithNewOp comment [with the sharding comment] from rotate
6a60956 to
b2c180e
Compare
void mlir::transform::addMultiSliceOpt(RewritePatternSet &patterns,
                                       MLIRContext &context,
                                       PatternBenefit benefit) {
Same comment here: do we need this definition?
                        MLIRContext &context, PatternBenefit benefit);
void addEnzymeHLOUnroll(RewritePatternSet &patterns, int64_t maxNumIterations,
                        MLIRContext &context, PatternBenefit benefit);
void addMultiSliceOpt(RewritePatternSet &patterns, MLIRContext &context,
this still needs removing right?
yep, sorry!
build fails:
EnzymeJAX Benchmarks
Details
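For reading the table below: the Ratio column appears to be the current time divided by the previous time, shown to two decimals, so values above 1 mean the current commit took longer. A minimal C++ sketch of that arithmetic follows; the helper name `benchmarkRatio` is hypothetical and is not part of the benchmark tooling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: current time over previous time, rounded to two
// decimals, which is how the Ratio column appears to be displayed.
double benchmarkRatio(double currentSeconds, double previousSeconds) {
  assert(previousSeconds > 0.0);
  return std::round(currentSeconds / previousSeconds * 100.0) / 100.0;
}
```

For the `actmtch / HLOOpt / cpu / Primal` row, `benchmarkRatio(0.000007436860023517511, 0.000009913419989970864)` gives 0.75, matching the table.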
| Benchmark suite | Current: 0d0f264 | Previous: bb071ab | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cpu / Primal | 0.000006922680004208815 s | 0.00000694008001119073 s | 1.00 |
| actmtch / Jax / cpu / Primal | 0.000006707680031468044 s | 0.000006830459988123039 s | 0.98 |
| actmtch / HLOOpt / cpu / Primal | 0.000007436860023517511 s | 0.000009913419989970864 s | 0.75 |
| actmtch / PartOpt / cpu / Primal | 0.000006741900006090873 s | 0.000006217799964360893 s | 1.08 |
| actmtch / IPartOpt / cpu / Primal | 0.0000064510400079598185 s | 0.000006514500037155813 s | 0.99 |
| actmtch / DefOpt / cpu / Primal | 0.000011685319950629491 s | 0.000011061580007662997 s | 1.06 |
| actmtch / IDefOpt / cpu / Primal | 0.000007385560029433691 s | 0.000006896759978189948 s | 1.07 |
| actmtch / JaXPipe / cpu / Forward | 0.000010776579965749989 s | 0.000010935439995591878 s | 0.99 |
| actmtch / Jax / cpu / Forward | 0.000009474520002186182 s | 0.000009495160047663376 s | 1.00 |
| actmtch / HLOOpt / cpu / Forward | 0.000014807180014031472 s | 0.000014464259993474116 s | 1.02 |
| actmtch / PartOpt / cpu / Forward | 0.00001519465998171654 s | 0.000015027659965198836 s | 1.01 |
| actmtch / IPartOpt / cpu / Forward | 0.000010630659971866407 s | 0.000010643459982020433 s | 1.00 |
| actmtch / DefOpt / cpu / Forward | 0.000015256540000336828 s | 0.000014413620010600426 s | 1.06 |
| actmtch / IDefOpt / cpu / Forward | 0.000010763580012280728 s | 0.00001057197997397452 s | 1.02 |
| actmtch / JaXPipe / cpu / PreRev | 0.000011385760008124634 s | 0.000010950099995170605 s | 1.04 |
| actmtch / JaXPipe / cpu / PostRev | 0.000010079699986818014 s | 0.000010357139954066952 s | 0.97 |
| actmtch / JaXPipe / cpu / BothRev | 0.000010704240021368604 s | 0.000014666399965790334 s | 0.73 |
| actmtch / Jax / cpu / BothRev | 0.000009644120018492684 s | 0.000009740360010255244 s | 0.99 |
| actmtch / HLOOpt / cpu / PreRev | 0.000011047500020140431 s | 0.000010884519979299513 s | 1.01 |
| actmtch / HLOOpt / cpu / PostRev | 0.000015670159982619225 s | 0.00001493965999543434 s | 1.05 |
| actmtch / HLOOpt / cpu / BothRev | 0.000012228620007590508 s | 0.000012820839965570484 s | 0.95 |
| actmtch / PartOpt / cpu / PreRev | 0.000010227720003967989 s | 0.000011093580005763216 s | 0.92 |
| actmtch / PartOpt / cpu / PostRev | 0.00000987922003332642 s | 0.000009808780005187144 s | 1.01 |
| actmtch / PartOpt / cpu / BothRev | 0.000010955340003420134 s | 0.00001122273999499157 s | 0.98 |
| actmtch / IPartOpt / cpu / PreRev | 0.000014503379989037055 s | 0.0000113531800070632 s | 1.28 |
| actmtch / IPartOpt / cpu / PostRev | 0.000010034760025519065 s | 0.000010120260067196797 s | 0.99 |
| actmtch / IPartOpt / cpu / BothRev | 0.000010352080043958268 s | 0.00001113555997108051 s | 0.93 |
| actmtch / DefOpt / cpu / PreRev | 0.000011315520032439964 s | 0.000010477780015207828 s | 1.08 |
| actmtch / DefOpt / cpu / PostRev | 0.000011221400018257555 s | 0.000010674739960450098 s | 1.05 |
| actmtch / DefOpt / cpu / BothRev | 0.000010702859972298027 s | 0.000010495660035303445 s | 1.02 |
| actmtch / IDefOpt / cpu / PreRev | 0.00001132979998146766 s | 0.000010619839986247826 s | 1.07 |
| actmtch / IDefOpt / cpu / PostRev | 0.000010499039954083857 s | 0.000011006340000676574 s | 0.95 |
| actmtch / IDefOpt / cpu / BothRev | 0.000010313560014765244 s | 0.000010521520025577046 s | 0.98 |
| actmtch / JaXPipe / cuda / Primal | 0.000002015 s | 0.000002047 s | 0.98 |
| actmtch / Jax / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / HLOOpt / cuda / Primal | 0.000002047 s | 0.000002016 s | 1.02 |
| actmtch / PartOpt / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / IPartOpt / cuda / Primal | 0.0000020170000000000003 s | 0.000002015 s | 1.00 |
| actmtch / DefOpt / cuda / Primal | 0.000002015 s | 0.000002047 s | 0.98 |
| actmtch / IDefOpt / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / JaXPipe / cuda / Forward | 0.000009984 s | 0.000010847 s | 0.92 |
| actmtch / Jax / cuda / Forward | 0.0000096 s | 0.00001056 s | 0.91 |
| actmtch / HLOOpt / cuda / Forward | 0.00001008 s | 0.000010689 s | 0.94 |
| actmtch / PartOpt / cuda / Forward | 0.00001008 s | 0.000010304 s | 0.98 |
| actmtch / IPartOpt / cuda / Forward | 0.000009983 s | 0.000010592 s | 0.94 |
| actmtch / DefOpt / cuda / Forward | 0.000009856 s | 0.0000104 s | 0.95 |
| actmtch / IDefOpt / cuda / Forward | 0.000010016 s | 0.000010496 s | 0.95 |
| actmtch / JaXPipe / cuda / PreRev | 0.000010559 s | 0.000010815 s | 0.98 |
| actmtch / JaXPipe / cuda / PostRev | 0.000009696 s | 0.000010208 s | 0.95 |
| actmtch / JaXPipe / cuda / BothRev | 0.000009727 s | 0.00001008 s | 0.96 |
| actmtch / Jax / cuda / BothRev | 0.000009632 s | 0.00000976 s | 0.99 |
| actmtch / HLOOpt / cuda / PreRev | 0.000009984 s | 0.000010687 s | 0.93 |
| actmtch / HLOOpt / cuda / PostRev | 0.000010144 s | 0.000010016 s | 1.01 |
| actmtch / HLOOpt / cuda / BothRev | 0.000009824 s | 0.00001008 s | 0.97 |
| actmtch / PartOpt / cuda / PreRev | 0.000010048 s | 0.00001104 s | 0.91 |
| actmtch / PartOpt / cuda / PostRev | 0.000010048 s | 0.00001008 s | 1.00 |
| actmtch / PartOpt / cuda / BothRev | 0.00000992 s | 0.000009631 s | 1.03 |
| actmtch / IPartOpt / cuda / PreRev | 0.000010048 s | 0.000014688 s | 0.68 |
| actmtch / IPartOpt / cuda / PostRev | 0.00001008 s | 0.000010688 s | 0.94 |
| actmtch / IPartOpt / cuda / BothRev | 0.00001056 s | 0.00001056 s | 1 |
| actmtch / DefOpt / cuda / PreRev | 0.000010208 s | 0.000010464 s | 0.98 |
| actmtch / DefOpt / cuda / PostRev | 0.000009344 s | 0.0000104 s | 0.90 |
| actmtch / DefOpt / cuda / BothRev | 0.000009408 s | 0.00001008 s | 0.93 |
| actmtch / IDefOpt / cuda / PreRev | 0.000009568 s | 0.000010496 s | 0.91 |
| actmtch / IDefOpt / cuda / PostRev | 0.000009824 s | 0.0000104 s | 0.94 |
| actmtch / IDefOpt / cuda / BothRev | 0.00001008 s | 0.000010592 s | 0.95 |
| actmtch / JaXPipe / tpu / Primal | 5.63025e-7 s | 5.63975e-7 s | 1.00 |
| actmtch / Jax / tpu / Primal | 6.070249999999999e-7 s | 6.071e-7 s | 1.00 |
| actmtch / HLOOpt / tpu / Primal | 0.000002103325 s | 0.0000021071 s | 1.00 |
| actmtch / PartOpt / tpu / Primal | 6.06525e-7 s | 6.06825e-7 s | 1.00 |
| actmtch / IPartOpt / tpu / Primal | 5.62525e-7 s | 5.626e-7 s | 1.00 |
| actmtch / DefOpt / tpu / Primal | 0.0000021585750000000003 s | 0.0000021625 s | 1.00 |
| actmtch / IDefOpt / tpu / Primal | 0.0000021019750000000003 s | 0.000002092475 s | 1.00 |
| actmtch / JaXPipe / tpu / Forward | 0.0000038218 s | 0.00000382475 s | 1.00 |
| actmtch / Jax / tpu / Forward | 0.000001230325 s | 0.000001216875 s | 1.01 |
| actmtch / HLOOpt / tpu / Forward | 0.0000039444 s | 0.000003937474999999999 s | 1.00 |
| actmtch / PartOpt / tpu / Forward | 0.0000039164 s | 0.00000391355 s | 1.00 |
| actmtch / IPartOpt / tpu / Forward | 0.000003944375 s | 0.000003935775 s | 1.00 |
| actmtch / DefOpt / tpu / Forward | 0.000003918225 s | 0.000003911125 s | 1.00 |
| actmtch / IDefOpt / tpu / Forward | 0.00000393035 s | 0.000003934225 s | 1.00 |
| actmtch / JaXPipe / tpu / PreRev | 0.000003477075 s | 0.000003476425 s | 1.00 |
| actmtch / JaXPipe / tpu / PostRev | 0.0000016477 s | 0.000001637175 s | 1.01 |
| actmtch / JaXPipe / tpu / BothRev | 0.0000034755000000000004 s | 0.000003482925 s | 1.00 |
| actmtch / Jax / tpu / BothRev | 0.00000164705 s | 0.0000016424 s | 1.00 |
| actmtch / HLOOpt / tpu / PreRev | 0.000003473925 s | 0.0000034862500000000003 s | 1.00 |
| actmtch / HLOOpt / tpu / PostRev | 0.000003417575 s | 0.0000034163 s | 1.00 |
| actmtch / HLOOpt / tpu / BothRev | 0.000003468225 s | 0.000003487525 s | 0.99 |
| actmtch / PartOpt / tpu / PreRev | 0.000003409225 s | 0.000003413775 s | 1.00 |
| actmtch / PartOpt / tpu / PostRev | 0.000001586025 s | 0.0000015953999999999998 s | 0.99 |
| actmtch / PartOpt / tpu / BothRev | 0.000003414275 s | 0.000003413975 s | 1.00 |
| actmtch / IPartOpt / tpu / PreRev | 0.000003464675 s | 0.00000347355 s | 1.00 |
| actmtch / IPartOpt / tpu / PostRev | 0.00000163885 s | 0.000001643075 s | 1.00 |
| actmtch / IPartOpt / tpu / BothRev | 0.000003475025 s | 0.00000348125 s | 1.00 |
| actmtch / DefOpt / tpu / PreRev | 0.0000034086 s | 0.0000034240500000000003 s | 1.00 |
| actmtch / DefOpt / tpu / PostRev | 0.0000034095 s | 0.0000034275 s | 0.99 |
| actmtch / DefOpt / tpu / BothRev | 0.0000034197 s | 0.000003414375 s | 1.00 |
| actmtch / IDefOpt / tpu / PreRev | 0.000003478075 s | 0.000003470475 s | 1.00 |
| actmtch / IDefOpt / tpu / PostRev | 0.000003399 s | 0.000003411775 s | 1.00 |
| actmtch / IDefOpt / tpu / BothRev | 0.00000346885 s | 0.0000034695 s | 1.00 |
| actmtch / JaXPipe / cpu / Primal | 0.00001585 s | 0.00000694008001119073 s | 2.28 |
| actmtch / Jax / cpu / Primal | 0.000016595 s | 0.000006830459988123039 s | 2.43 |
| actmtch / HLOOpt / cpu / Primal | 0.000017224 s | 0.000009913419989970864 s | 1.74 |
| actmtch / PartOpt / cpu / Primal | 0.000016223 s | 0.000006217799964360893 s | 2.61 |
| actmtch / IPartOpt / cpu / Primal | 0.000016725 s | 0.000006514500037155813 s | 2.57 |
| actmtch / DefOpt / cpu / Primal | 0.000017114 s | 0.000011061580007662997 s | 1.55 |
| actmtch / IDefOpt / cpu / Primal | 0.000017183 s | 0.000006896759978189948 s | 2.49 |
| actmtch / JaXPipe / cpu / Forward | 0.000023126 s | 0.000010935439995591878 s | 2.11 |
| actmtch / Jax / cpu / Forward | 0.000022368 s | 0.000009495160047663376 s | 2.36 |
| actmtch / HLOOpt / cpu / Forward | 0.000023513 s | 0.000014464259993474116 s | 1.63 |
| actmtch / PartOpt / cpu / Forward | 0.000022792 s | 0.000015027659965198836 s | 1.52 |
| actmtch / IPartOpt / cpu / Forward | 0.000023233 s | 0.000010643459982020433 s | 2.18 |
| actmtch / DefOpt / cpu / Forward | 0.000022817 s | 0.000014413620010600426 s | 1.58 |
| actmtch / IDefOpt / cpu / Forward | 0.000023443 s | 0.00001057197997397452 s | 2.22 |
| actmtch / JaXPipe / cpu / PreRev | 0.000023538 s | 0.000010950099995170605 s | 2.15 |
| actmtch / JaXPipe / cpu / PostRev | 0.000021261 s | 0.000010357139954066952 s | 2.05 |
| actmtch / JaXPipe / cpu / BothRev | 0.000023345 s | 0.000014666399965790334 s | 1.59 |
| actmtch / Jax / cpu / BothRev | 0.000021594 s | 0.000009740360010255244 s | 2.22 |
| actmtch / HLOOpt / cpu / PreRev | 0.000023555 s | 0.000010884519979299513 s | 2.16 |
| actmtch / HLOOpt / cpu / PostRev | 0.000023763 s | 0.00001493965999543434 s | 1.59 |
| actmtch / HLOOpt / cpu / BothRev | 0.000024015000000000003 s | 0.000012820839965570484 s | 1.87 |
| actmtch / PartOpt / cpu / PreRev | 0.000023667 s | 0.000011093580005763216 s | 2.13 |
| actmtch / PartOpt / cpu / PostRev | 0.000021309 s | 0.000009808780005187144 s | 2.17 |
| actmtch / PartOpt / cpu / BothRev | 0.00002394 s | 0.00001122273999499157 s | 2.13 |
| actmtch / IPartOpt / cpu / PreRev | 0.000023736 s | 0.0000113531800070632 s | 2.09 |
| actmtch / IPartOpt / cpu / PostRev | 0.000021884 s | 0.000010120260067196797 s | 2.16 |
| actmtch / IPartOpt / cpu / BothRev | 0.000023677 s | 0.00001113555997108051 s | 2.13 |
| actmtch / DefOpt / cpu / PreRev | 0.000023929 s | 0.000010477780015207828 s | 2.28 |
| actmtch / DefOpt / cpu / PostRev | 0.000024253 s | 0.000010674739960450098 s | 2.27 |
| actmtch / DefOpt / cpu / BothRev | 0.000023494 s | 0.000010495660035303445 s | 2.24 |
| actmtch / IDefOpt / cpu / PreRev | 0.000023168 s | 0.000010619839986247826 s | 2.18 |
| actmtch / IDefOpt / cpu / PostRev | 0.000023666 s | 0.000011006340000676574 s | 2.15 |
| actmtch / IDefOpt / cpu / BothRev | 0.00002373 s | 0.000010521520025577046 s | 2.26 |
| add_one / JaXPipe / cpu / Primal | 0.000007419180019496707 s | 0.00000752472000385751 s | 0.99 |
| add_one / Jax / cpu / Primal | 0.000006683779993181816 s | 0.000007206639938885928 s | 0.93 |
| add_one / HLOOpt / cpu / Primal | 0.000006441080022341339 s | 0.00001020033998429426 s | 0.63 |
| add_one / PartOpt / cpu / Primal | 0.000006262699980652542 s | 0.000006483379984274506 s | 0.97 |
| add_one / IPartOpt / cpu / Primal | 0.000006520199949591188 s | 0.000007264080022650887 s | 0.90 |
| add_one / DefOpt / cpu / Primal | 0.000010351119981351077 s | 0.00001108921997001744 s | 0.93 |
| add_one / IDefOpt / cpu / Primal | 0.000006974599991735886 s | 0.000006771060034225229 s | 1.03 |
| add_one / JaXPipe / cpu / Forward | 0.000010205639991909264 s | 0.00001025116001983406 s | 1.00 |
| add_one / Jax / cpu / Forward | 0.000011320160037939786 s | 0.00000995673997749691 s | 1.14 |
| add_one / HLOOpt / cpu / Forward | 0.000014209039973138717 s | 0.000014886220023981876 s | 0.95 |
| add_one / PartOpt / cpu / Forward | 0.00001368597997498 s | 0.00001484262002122705 s | 0.92 |
| add_one / IPartOpt / cpu / Forward | 0.000009566959988660528 s | 0.00000993311999081925 s | 0.96 |
| add_one / DefOpt / cpu / Forward | 0.000014696760035803891 s | 0.000015093059992068448 s | 0.97 |
| add_one / IDefOpt / cpu / Forward | 0.000009842939998634393 s | 0.000010064300004160032 s | 0.98 |
| add_one / JaXPipe / cpu / PreRev | 0.000011414900045565446 s | 0.000011601280020840933 s | 0.98 |
| add_one / JaXPipe / cpu / PostRev | 0.00001103010000406357 s | 0.000011605040008362266 s | 0.95 |
| add_one / JaXPipe / cpu / BothRev | 0.000013230260019554408 s | 0.00001129558000684483 s | 1.17 |
| add_one / Jax / cpu / BothRev | 0.000010707639985412244 s | 0.00001170777999504935 s | 0.91 |
| add_one / HLOOpt / cpu / PreRev | 0.000011337679989082972 s | 0.000011194540029464406 s | 1.01 |
| add_one / HLOOpt / cpu / PostRev | 0.000015331020003941377 s | 0.000011465500001577311 s | 1.34 |
| add_one / HLOOpt / cpu / BothRev | 0.0000165456600007019 s | 0.000017417280014342394 s | 0.95 |
| add_one / PartOpt / cpu / PreRev | 0.000010961900052279816 s | 0.000011230060035813947 s | 0.98 |
| add_one / PartOpt / cpu / PostRev | 0.000011039439987143852 s | 0.000011293500028841665 s | 0.98 |
| add_one / PartOpt / cpu / BothRev | 0.000011220940032217183 s | 0.00001118008005505544 s | 1.00 |
| add_one / IPartOpt / cpu / PreRev | 0.000015995380035747075 s | 0.00001686254002379428 s | 0.95 |
| add_one / IPartOpt / cpu / PostRev | 0.000010996059982062434 s | 0.000011557100024219836 s | 0.95 |
| add_one / IPartOpt / cpu / BothRev | 0.000011431580014686915 s | 0.00001106198002162273 s | 1.03 |
| add_one / DefOpt / cpu / PreRev | 0.00001102045996049128 s | 0.000011062340045100429 s | 1.00 |
| add_one / DefOpt / cpu / PostRev | 0.000011594799943850375 s | 0.000010930259950328036 s | 1.06 |
| add_one / DefOpt / cpu / BothRev | 0.000011351380017003976 s | 0.000011320499997964362 s | 1.00 |
| add_one / IDefOpt / cpu / PreRev | 0.000011180020001120283 s | 0.0000110442599907401 s | 1.01 |
| add_one / IDefOpt / cpu / PostRev | 0.000011913299977095448 s | 0.00001119873999414267 s | 1.06 |
| add_one / IDefOpt / cpu / BothRev | 0.000011332740004945665 s | 0.00001130389996433223 s | 1.00 |
| add_one / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / DefOpt / cuda / Primal | 0.000001951 s | 0.0000019200000000000003 s | 1.02 |
| add_one / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / JaXPipe / cuda / Forward | 0.000010175 s | 0.000010272 s | 0.99 |
| add_one / Jax / cuda / Forward | 0.00001024 s | 0.000010463 s | 0.98 |
| add_one / HLOOpt / cuda / Forward | 0.00001008 s | 0.000010112 s | 1.00 |
| add_one / PartOpt / cuda / Forward | 0.000010016 s | 0.000009887 s | 1.01 |
| add_one / IPartOpt / cuda / Forward | 0.000009887 s | 0.000010432 s | 0.95 |
| add_one / DefOpt / cuda / Forward | 0.000010112 s | 0.000010272 s | 0.98 |
| add_one / IDefOpt / cuda / Forward | 0.000009792 s | 0.000010208 s | 0.96 |
| add_one / JaXPipe / cuda / PreRev | 0.000024063 s | 0.000026208 s | 0.92 |
| add_one / JaXPipe / cuda / PostRev | 0.000024479 s | 0.000026816 s | 0.91 |
| add_one / JaXPipe / cuda / BothRev | 0.000024607 s | 0.000025248 s | 0.97 |
| add_one / Jax / cuda / BothRev | 0.00002464 s | 0.000025152 s | 0.98 |
| add_one / HLOOpt / cuda / PreRev | 0.00002528 s | 0.00002496 s | 1.01 |
| add_one / HLOOpt / cuda / PostRev | 0.000025024 s | 0.00002512 s | 1.00 |
| add_one / HLOOpt / cuda / BothRev | 0.000024704 s | 0.000025344 s | 0.97 |
| add_one / PartOpt / cuda / PreRev | 0.000025024 s | 0.000024992 s | 1.00 |
| add_one / PartOpt / cuda / PostRev | 0.000025215 s | 0.00002528 s | 1.00 |
| add_one / PartOpt / cuda / BothRev | 0.000025312 s | 0.000025439 s | 1.00 |
| add_one / IPartOpt / cuda / PreRev | 0.000025056 s | 0.000025632 s | 0.98 |
| add_one / IPartOpt / cuda / PostRev | 0.000025248 s | 0.00002528 s | 1.00 |
| add_one / IPartOpt / cuda / BothRev | 0.000024863 s | 0.000025312 s | 0.98 |
| add_one / DefOpt / cuda / PreRev | 0.000026336 s | 0.000025024 s | 1.05 |
| add_one / DefOpt / cuda / PostRev | 0.000024448 s | 0.00002496 s | 0.98 |
| add_one / DefOpt / cuda / BothRev | 0.000025183 s | 0.000025632 s | 0.98 |
| add_one / IDefOpt / cuda / PreRev | 0.000024576 s | 0.0000256 s | 0.96 |
| add_one / IDefOpt / cuda / PostRev | 0.000024352 s | 0.000024992 s | 0.97 |
| add_one / IDefOpt / cuda / BothRev | 0.00002448 s | 0.000024416 s | 1.00 |
| add_one / JaXPipe / tpu / Primal | 0.00000143155 s | 0.000001429075 s | 1.00 |
| add_one / Jax / tpu / Primal | 0.000001401775 s | 0.000001404775 s | 1.00 |
| add_one / HLOOpt / tpu / Primal | 0.000001421175 s | 0.000001432975 s | 0.99 |
| add_one / PartOpt / tpu / Primal | 0.0000014000499999999998 s | 0.000001405375 s | 1.00 |
| add_one / IPartOpt / tpu / Primal | 0.0000014236750000000002 s | 0.000001428 s | 1.00 |
| add_one / DefOpt / tpu / Primal | 0.000001402575 s | 0.0000014243749999999998 s | 0.98 |
| add_one / IDefOpt / tpu / Primal | 0.000001423625 s | 0.000001429225 s | 1.00 |
| add_one / JaXPipe / tpu / Forward | 0.00000186065 s | 0.0000018722 s | 0.99 |
| add_one / Jax / tpu / Forward | 0.000001839975 s | 0.00000185155 s | 0.99 |
| add_one / HLOOpt / tpu / Forward | 0.00000185385 s | 0.00000185065 s | 1.00 |
| add_one / PartOpt / tpu / Forward | 0.000001837525 s | 0.000001847 s | 0.99 |
| add_one / IPartOpt / tpu / Forward | 0.000001854875 s | 0.0000018569 s | 1.00 |
| add_one / DefOpt / tpu / Forward | 0.00000184175 s | 0.0000018493 s | 1.00 |
| add_one / IDefOpt / tpu / Forward | 0.000001854 s | 0.000001853475 s | 1.00 |
| add_one / JaXPipe / tpu / PreRev | 0.000002245975 s | 0.0000022365750000000003 s | 1.00 |
| add_one / JaXPipe / tpu / PostRev | 0.00000223725 s | 0.000002247325 s | 1.00 |
| add_one / JaXPipe / tpu / BothRev | 0.0000022329 s | 0.0000022439 s | 1.00 |
| add_one / Jax / tpu / BothRev | 0.0000022441 s | 0.0000022476 s | 1.00 |
| add_one / HLOOpt / tpu / PreRev | 0.000002247225 s | 0.0000022388 s | 1.00 |
| add_one / HLOOpt / tpu / PostRev | 0.000002236175 s | 0.000002238575 s | 1.00 |
| add_one / HLOOpt / tpu / BothRev | 0.00000224165 s | 0.000002246225 s | 1.00 |
| add_one / PartOpt / tpu / PreRev | 0.00000223895 s | 0.000002243125 s | 1.00 |
| add_one / PartOpt / tpu / PostRev | 0.00000224895 s | 0.000002234625 s | 1.01 |
| add_one / PartOpt / tpu / BothRev | 0.0000022533750000000003 s | 0.000002238475 s | 1.01 |
| add_one / IPartOpt / tpu / PreRev | 0.00000223735 s | 0.000002239425 s | 1.00 |
| add_one / IPartOpt / tpu / PostRev | 0.00000224365 s | 0.000002247275 s | 1.00 |
| add_one / IPartOpt / tpu / BothRev | 0.00000223935 s | 0.00000224215 s | 1.00 |
| add_one / DefOpt / tpu / PreRev | 0.0000022373 s | 0.0000022334 s | 1.00 |
| add_one / DefOpt / tpu / PostRev | 0.0000022364 s | 0.0000022361 s | 1.00 |
| add_one / DefOpt / tpu / BothRev | 0.00000223275 s | 0.000002245 s | 0.99 |
| add_one / IDefOpt / tpu / PreRev | 0.00000223795 s | 0.0000022311750000000003 s | 1.00 |
| add_one / IDefOpt / tpu / PostRev | 0.000002237025 s | 0.000002242125 s | 1.00 |
| add_one / IDefOpt / tpu / BothRev | 0.000002233375 s | 0.000002238475 s | 1.00 |
| add_one / JaXPipe / cpu / Primal | 0.000015702 s | 0.00000752472000385751 s | 2.09 |
| add_one / Jax / cpu / Primal | 0.00001602 s | 0.000007206639938885928 s | 2.22 |
| add_one / HLOOpt / cpu / Primal | 0.000015842 s | 0.00001020033998429426 s | 1.55 |
| add_one / PartOpt / cpu / Primal | 0.000015823 s | 0.000006483379984274506 s | 2.44 |
| add_one / IPartOpt / cpu / Primal | 0.000015977000000000003 s | 0.000007264080022650887 s | 2.20 |
| add_one / DefOpt / cpu / Primal | 0.000015605 s | 0.00001108921997001744 s | 1.41 |
| add_one / IDefOpt / cpu / Primal | 0.000015808 s | 0.000006771060034225229 s | 2.33 |
| add_one / JaXPipe / cpu / Forward | 0.000021538000000000003 s | 0.00001025116001983406 s | 2.10 |
| add_one / Jax / cpu / Forward | 0.000021609 s | 0.00000995673997749691 s | 2.17 |
| add_one / HLOOpt / cpu / Forward | 0.000021216 s | 0.000014886220023981876 s | 1.43 |
| add_one / PartOpt / cpu / Forward | 0.000021656 s | 0.00001484262002122705 s | 1.46 |
| add_one / IPartOpt / cpu / Forward | 0.000021584 s | 0.00000993311999081925 s | 2.17 |
| add_one / DefOpt / cpu / Forward | 0.000021612 s | 0.000015093059992068448 s | 1.43 |
| add_one / IDefOpt / cpu / Forward | 0.00002145 s | 0.000010064300004160032 s | 2.13 |
| add_one / JaXPipe / cpu / PreRev | 0.000024185000000000003 s | 0.000011601280020840933 s | 2.08 |
| add_one / JaXPipe / cpu / PostRev | 0.000024051 s | 0.000011605040008362266 s | 2.07 |
| add_one / JaXPipe / cpu / BothRev | 0.00002393 s | 0.00001129558000684483 s | 2.12 |
| add_one / Jax / cpu / BothRev | 0.000023654 s | 0.00001170777999504935 s | 2.02 |
| add_one / HLOOpt / cpu / PreRev | 0.000023871 s | 0.000011194540029464406 s | 2.13 |
| add_one / HLOOpt / cpu / PostRev | 0.000023947 s | 0.000011465500001577311 s | 2.09 |
| add_one / HLOOpt / cpu / BothRev | 0.000023317 s | 0.000017417280014342394 s | 1.34 |
| add_one / PartOpt / cpu / PreRev | 0.000023662 s | 0.000011230060035813947 s | 2.11 |
| add_one / PartOpt / cpu / PostRev | 0.000024017 s | 0.000011293500028841665 s | 2.13 |
| add_one / PartOpt / cpu / BothRev | 0.000024302 s | 0.00001118008005505544 s | 2.17 |
| add_one / IPartOpt / cpu / PreRev | 0.000023848000000000003 s | 0.00001686254002379428 s | 1.41 |
| add_one / IPartOpt / cpu / PostRev | 0.000024048 s | 0.000011557100024219836 s | 2.08 |
| add_one / IPartOpt / cpu / BothRev | 0.000023498 s | 0.00001106198002162273 s | 2.12 |
| add_one / DefOpt / cpu / PreRev | 0.000024122 s | 0.000011062340045100429 s | 2.18 |
| add_one / DefOpt / cpu / PostRev | 0.000024427 s | 0.000010930259950328036 s | 2.23 |
| add_one / DefOpt / cpu / BothRev | 0.000023948 s | 0.000011320499997964362 s | 2.12 |
| add_one / IDefOpt / cpu / PreRev | 0.000023517 s | 0.0000110442599907401 s | 2.13 |
| add_one / IDefOpt / cpu / PostRev | 0.000024344000000000003 s | 0.00001119873999414267 s | 2.17 |
| add_one / IDefOpt / cpu / BothRev | 0.000023821 s | 0.00001130389996433223 s | 2.11 |
| add_two / JaXPipe / cpu / Primal | 0.000007554700005130144 s | 0.00000779061997491226 s | 0.97 |
| add_two / Jax / cpu / Primal | 0.000006578679958693101 s | 0.0000071242400281334994 s | 0.92 |
| add_two / HLOOpt / cpu / Primal | 0.000010488239995538606 s | 0.000011331699979564292 s | 0.93 |
| add_two / PartOpt / cpu / Primal | 0.000006971039993004524 s | 0.000006988320010350435 s | 1.00 |
| add_two / IPartOpt / cpu / Primal | 0.000006704200013700756 s | 0.000007562239989056252 s | 0.89 |
| add_two / DefOpt / cpu / Primal | 0.000010738399996625958 s | 0.000007133199978852645 s | 1.51 |
| add_two / IDefOpt / cpu / Primal | 0.000007055079986457713 s | 0.000006988179975451203 s | 1.01 |
| add_two / JaXPipe / cpu / Forward | 0.00001010052007586637 s | 0.000010467719976077206 s | 0.96 |
| add_two / Jax / cpu / Forward | 0.000010374140028943657 s | 0.00001014771996779018 s | 1.02 |
| add_two / HLOOpt / cpu / Forward | 0.000014734979968125116 s | 0.000015272099981302746 s | 0.96 |
| add_two / PartOpt / cpu / Forward | 0.0000145965599858755 s | 0.000014978740009610192 s | 0.97 |
| add_two / IPartOpt / cpu / Forward | 0.000010327779964427464 s | 0.000010259620003125749 s | 1.01 |
| add_two / DefOpt / cpu / Forward | 0.000014405180018002283 s | 0.000015081880019351956 s | 0.96 |
| add_two / IDefOpt / cpu / Forward | 0.000010043919983218075 s | 0.000010221640013696745 s | 0.98 |
| add_two / JaXPipe / cpu / PreRev | 0.000013687980053873617 s | 0.00001410851999935403 s | 0.97 |
| add_two / JaXPipe / cpu / PostRev | 0.00001318387996434467 s | 0.000013603640009023365 s | 0.97 |
| add_two / JaXPipe / cpu / BothRev | 0.000013730460022998158 s | 0.00001366785998470732 s | 1.00 |
| add_two / Jax / cpu / BothRev | 0.000013033659997745418 s | 0.000013924240020060096 s | 0.94 |
| add_two / HLOOpt / cpu / PreRev | 0.0000137636200452107 s | 0.000013977280032122508 s | 0.98 |
| add_two / HLOOpt / cpu / PostRev | 0.000013095559988869356 s | 0.00001403646003382164 s | 0.93 |
| add_two / HLOOpt / cpu / BothRev | 0.000015035820033517669 s | 0.000015436239973496413 s | 0.97 |
| add_two / PartOpt / cpu / PreRev | 0.00001374123996356502 s | 0.000013534179988710091 s | 1.02 |
| add_two / PartOpt / cpu / PostRev | 0.00001316482000220276 s | 0.000013693600021724703 s | 0.96 |
| add_two / PartOpt / cpu / BothRev | 0.000013651259996549924 s | 0.000013705980018130504 s | 1.00 |
| add_two / IPartOpt / cpu / PreRev | 0.000013624180000988416 s | 0.000013599339972643063 s | 1.00 |
| add_two / IPartOpt / cpu / PostRev | 0.000013394839970715112 s | 0.00001350412002466328 s | 0.99 |
| add_two / IPartOpt / cpu / BothRev | 0.000013487699989127578 s | 0.000013644700020449818 s | 0.99 |
| add_two / DefOpt / cpu / PreRev | 0.000013821899974573173 s | 0.000014368500060299994 s | 0.96 |
| add_two / DefOpt / cpu / PostRev | 0.000013729740003327606 s | 0.000013709300001210067 s | 1.00 |
| add_two / DefOpt / cpu / BothRev | 0.000013606080028694122 s | 0.000013578719981524044 s | 1.00 |
| add_two / IDefOpt / cpu / PreRev | 0.000014109220019236091 s | 0.000013646060060636956 s | 1.03 |
| add_two / IDefOpt / cpu / PostRev | 0.00001442523997866374 s | 0.000014498359960271043 s | 0.99 |
| add_two / IDefOpt / cpu / BothRev | 0.000013957880009911604 s | 0.000013887720015191008 s | 1.01 |
| add_two / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.000001951 s | 0.98 |
| add_two / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000001951 s | 0.98 |
| add_two / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / JaXPipe / cuda / Forward | 0.000009824 s | 0.000010368 s | 0.95 |
| add_two / Jax / cuda / Forward | 0.00000944 s | 0.000009984 s | 0.95 |
| add_two / HLOOpt / cuda / Forward | 0.000009631 s | 0.000010176 s | 0.95 |
| add_two / PartOpt / cuda / Forward | 0.000009215 s | 0.000010048 s | 0.92 |
| add_two / IPartOpt / cuda / Forward | 0.00000976 s | 0.000010912 s | 0.89 |
| add_two / DefOpt / cuda / Forward | 0.000009696 s | 0.00001008 s | 0.96 |
| add_two / IDefOpt / cuda / Forward | 0.000009183 s | 0.000009312000000000002 s | 0.99 |
| add_two / JaXPipe / cuda / PreRev | 0.000032416 s | 0.000032127999999999995 s | 1.01 |
| add_two / JaXPipe / cuda / PostRev | 0.000031967 s | 0.000031776 s | 1.01 |
| add_two / JaXPipe / cuda / BothRev | 0.000031904000000000005 s | 0.000032 s | 1.00 |
| add_two / Jax / cuda / BothRev | 0.000032352 s | 0.000031967 s | 1.01 |
| add_two / HLOOpt / cuda / PreRev | 0.000031584 s | 0.000031936 s | 0.99 |
| add_two / HLOOpt / cuda / PostRev | 0.000031968 s | 0.000031488 s | 1.02 |
| add_two / HLOOpt / cuda / BothRev | 0.0000312 s | 0.000032032 s | 0.97 |
| add_two / PartOpt / cuda / PreRev | 0.0000312 s | 0.000031904000000000005 s | 0.98 |
| add_two / PartOpt / cuda / PostRev | 0.000032416 s | 0.000034656 s | 0.94 |
| add_two / PartOpt / cuda / BothRev | 0.000032256 s | 0.000033119999999999995 s | 0.97 |
| add_two / IPartOpt / cuda / PreRev | 0.000033824 s | 0.000031584 s | 1.07 |
| add_two / IPartOpt / cuda / PostRev | 0.000030719 s | 0.00003168 s | 0.97 |
| add_two / IPartOpt / cuda / BothRev | 0.0000312 s | 0.000031296 s | 1.00 |
| add_two / DefOpt / cuda / PreRev | 0.000031712 s | 0.000032032 s | 0.99 |
| add_two / DefOpt / cuda / PostRev | 0.000032800000000000004 s | 0.00003184 s | 1.03 |
| add_two / DefOpt / cuda / BothRev | 0.000032159 s | 0.000031711 s | 1.01 |
| add_two / IDefOpt / cuda / PreRev | 0.000031808000000000004 s | 0.000031584 s | 1.01 |
| add_two / IDefOpt / cuda / PostRev | 0.000031264 s | 0.000031456 s | 0.99 |
| add_two / IDefOpt / cuda / BothRev | 0.000031168000000000004 s | 0.000032384 s | 0.96 |
| add_two / JaXPipe / tpu / Primal | 0.000001441775 s | 0.000001423875 s | 1.01 |
| add_two / Jax / tpu / Primal | 0.0000014679000000000002 s | 0.000001472625 s | 1.00 |
| add_two / HLOOpt / tpu / Primal | 0.000001429375 s | 0.0000014379 s | 0.99 |
| add_two / PartOpt / tpu / Primal | 0.000001469375 s | 0.0000014742 s | 1.00 |
| add_two / IPartOpt / tpu / Primal | 0.0000014317750000000002 s | 0.0000014400250000000002 s | 0.99 |
| add_two / DefOpt / tpu / Primal | 0.000001473875 s | 0.000001485775 s | 0.99 |
| add_two / IDefOpt / tpu / Primal | 0.000001431325 s | 0.000001434175 s | 1.00 |
| add_two / JaXPipe / tpu / Forward | 0.000001829075 s | 0.000001822325 s | 1.00 |
| add_two / Jax / tpu / Forward | 0.0000018201 s | 0.0000018356 s | 0.99 |
| add_two / HLOOpt / tpu / Forward | 0.0000018214250000000003 s | 0.000001828375 s | 1.00 |
| add_two / PartOpt / tpu / Forward | 0.000001821675 s | 0.000001823575 s | 1.00 |
| add_two / IPartOpt / tpu / Forward | 0.00000183405 s | 0.0000018298 s | 1.00 |
| add_two / DefOpt / tpu / Forward | 0.0000018325250000000003 s | 0.0000018305 s | 1.00 |
| add_two / IDefOpt / tpu / Forward | 0.0000018227 s | 0.000001819425 s | 1.00 |
| add_two / JaXPipe / tpu / PreRev | 0.000002850475 s | 0.00000284435 s | 1.00 |
| add_two / JaXPipe / tpu / PostRev | 0.000002757025 s | 0.000002750925 s | 1.00 |
| add_two / JaXPipe / tpu / BothRev | 0.0000028392 s | 0.000002833275 s | 1.00 |
| add_two / Jax / tpu / BothRev | 0.0000027611999999999995 s | 0.000002747925 s | 1.00 |
| add_two / HLOOpt / tpu / PreRev | 0.0000028403 s | 0.0000028348250000000004 s | 1.00 |
| add_two / HLOOpt / tpu / PostRev | 0.000002742125 s | 0.000002747775 s | 1.00 |
| add_two / HLOOpt / tpu / BothRev | 0.00000283065 s | 0.0000028392250000000003 s | 1.00 |
| add_two / PartOpt / tpu / PreRev | 0.000002747075 s | 0.000002740275 s | 1.00 |
| add_two / PartOpt / tpu / PostRev | 0.00000283455 s | 0.0000028397 s | 1.00 |
| add_two / PartOpt / tpu / BothRev | 0.0000027643 s | 0.000002755125 s | 1.00 |
| add_two / IPartOpt / tpu / PreRev | 0.000002848225 s | 0.000002836275 s | 1.00 |
| add_two / IPartOpt / tpu / PostRev | 0.000002744625 s | 0.0000027511 s | 1.00 |
| add_two / IPartOpt / tpu / BothRev | 0.000002835525 s | 0.00000284145 s | 1.00 |
| add_two / DefOpt / tpu / PreRev | 0.000002761025 s | 0.000002759 s | 1.00 |
| add_two / DefOpt / tpu / PostRev | 0.0000028406 s | 0.000002839325 s | 1.00 |
| add_two / DefOpt / tpu / BothRev | 0.000002740925 s | 0.0000027486250000000003 s | 1.00 |
add_two / IDefOpt / tpu / PreRev |
0.0000028295500000000003 s |
0.000002842575 s |
1.00 |
add_two / IDefOpt / tpu / PostRev |
0.0000027453 s |
0.0000027472 s |
1.00 |
add_two / IDefOpt / tpu / BothRev |
0.000002830475 s |
0.000002838375 s |
1.00 |
add_two / JaXPipe / cpu / Primal |
0.000016513 s |
0.00000779061997491226 s |
2.12 |
add_two / Jax / cpu / Primal |
0.000016171 s |
0.0000071242400281334994 s |
2.27 |
add_two / HLOOpt / cpu / Primal |
0.000016128 s |
0.000011331699979564292 s |
1.42 |
add_two / PartOpt / cpu / Primal |
0.000016598999999999998 s |
0.000006988320010350435 s |
2.38 |
add_two / IPartOpt / cpu / Primal |
0.000016444 s |
0.000007562239989056252 s |
2.17 |
add_two / DefOpt / cpu / Primal |
0.000016295 s |
0.000007133199978852645 s |
2.28 |
add_two / IDefOpt / cpu / Primal |
0.000016339 s |
0.000006988179975451203 s |
2.34 |
add_two / JaXPipe / cpu / Forward |
0.000022985 s |
0.000010467719976077206 s |
2.20 |
add_two / Jax / cpu / Forward |
0.000021674 s |
0.00001014771996779018 s |
2.14 |
add_two / HLOOpt / cpu / Forward |
0.000022228 s |
0.000015272099981302746 s |
1.46 |
add_two / PartOpt / cpu / Forward |
0.00002151 s |
0.000014978740009610192 s |
1.44 |
add_two / IPartOpt / cpu / Forward |
0.000021884 s |
0.000010259620003125749 s |
2.13 |
add_two / DefOpt / cpu / Forward |
0.000021894 s |
0.000015081880019351956 s |
1.45 |
add_two / IDefOpt / cpu / Forward |
0.000022088 s |
0.000010221640013696745 s |
2.16 |
add_two / JaXPipe / cpu / PreRev |
0.000027844 s |
0.00001410851999935403 s |
1.97 |
add_two / JaXPipe / cpu / PostRev |
0.000028522 s |
0.000013603640009023365 s |
2.10 |
add_two / JaXPipe / cpu / BothRev |
0.000028267 s |
0.00001366785998470732 s |
2.07 |
add_two / Jax / cpu / BothRev |
0.000028512 s |
0.000013924240020060096 s |
2.05 |
add_two / HLOOpt / cpu / PreRev |
0.000027731 s |
0.000013977280032122508 s |
1.98 |
add_two / HLOOpt / cpu / PostRev |
0.000028146 s |
0.00001403646003382164 s |
2.01 |
add_two / HLOOpt / cpu / BothRev |
0.000027908 s |
0.000015436239973496413 s |
1.81 |
add_two / PartOpt / cpu / PreRev |
0.000027969 s |
0.000013534179988710091 s |
2.07 |
add_two / PartOpt / cpu / PostRev |
0.000029152 s |
0.000013693600021724703 s |
2.13 |
add_two / PartOpt / cpu / BothRev |
0.000028902 s |
0.000013705980018130504 s |
2.11 |
add_two / IPartOpt / cpu / PreRev |
0.000027492 s |
0.000013599339972643063 s |
2.02 |
add_two / IPartOpt / cpu / PostRev |
0.000029069 s |
0.00001350412002466328 s |
2.15 |
add_two / IPartOpt / cpu / BothRev |
0.000028844 s |
0.000013644700020449818 s |
2.11 |
add_two / DefOpt / cpu / PreRev |
0.000027635 s |
0.000014368500060299994 s |
1.92 |
add_two / DefOpt / cpu / PostRev |
0.000028692 s |
0.000013709300001210067 s |
2.09 |
add_two / DefOpt / cpu / BothRev |
0.00002829 s |
0.000013578719981524044 s |
2.08 |
add_two / IDefOpt / cpu / PreRev |
0.000028371 s |
0.000013646060060636956 s |
2.08 |
add_two / IDefOpt / cpu / PostRev |
0.000028766 s |
0.000014498359960271043 s |
1.98 |
add_two / IDefOpt / cpu / BothRev |
0.000028685 s |
0.000013887720015191008 s |
2.07 |
cache / JaXPipe / cpu / Primal |
0.000006798859994887607 s |
0.0000067652400139195375 s |
1.00 |
cache / Jax / cpu / Primal |
0.000007038260055196588 s |
0.000006858279984953697 s |
1.03 |
cache / HLOOpt / cpu / Primal |
0.000006787320025978261 s |
0.00000677435999023146 s |
1.00 |
cache / PartOpt / cpu / Primal |
0.000006213520036908449 s |
0.000006694500043522566 s |
0.93 |
cache / IPartOpt / cpu / Primal |
0.000006543520012201043 s |
0.000006187839981066645 s |
1.06 |
cache / DefOpt / cpu / Primal |
0.00000620830000116257 s |
0.000006312319965218194 s |
0.98 |
cache / IDefOpt / cpu / Primal |
0.0000068050199934077685 s |
0.000006500580029751291 s |
1.05 |
cache / JaXPipe / cpu / Forward |
0.000014500920005957596 s |
0.000015542040036962133 s |
0.93 |
cache / Jax / cpu / Forward |
0.000014496779986075126 s |
0.000015676000011808355 s |
0.92 |
cache / HLOOpt / cpu / Forward |
0.00001931077998051478 s |
0.00002061443994534784 s |
0.94 |
cache / PartOpt / cpu / Forward |
0.000019487320014377477 s |
0.00002036323999163869 s |
0.96 |
cache / IPartOpt / cpu / Forward |
0.000015134740024222991 s |
0.000015725520006526496 s |
0.96 |
cache / DefOpt / cpu / Forward |
0.000019329640017531347 s |
0.000023495319992434817 s |
0.82 |
cache / IDefOpt / cpu / Forward |
0.000014706659985677106 s |
0.000014389620027941418 s |
1.02 |
cache / JaXPipe / cpu / PreRev |
0.000016374439983337652 s |
0.00001592889998391911 s |
1.03 |
cache / JaXPipe / cpu / PostRev |
0.00001944923998053128 s |
0.00002100328002597962 s |
0.93 |
cache / JaXPipe / cpu / BothRev |
0.000015905099990050075 s |
0.000016675520018907264 s |
0.95 |
cache / Jax / cpu / BothRev |
0.0000203493400294974 s |
0.00002007403998504742 s |
1.01 |
cache / HLOOpt / cpu / PreRev |
0.000016112780022012883 s |
0.000016019780005080975 s |
1.01 |
cache / HLOOpt / cpu / PostRev |
0.000016222400017795734 s |
0.000020259680022718383 s |
0.80 |
cache / HLOOpt / cpu / BothRev |
0.000018553599938968547 s |
0.000018096279964083804 s |
1.03 |
cache / PartOpt / cpu / PreRev |
0.00001563560004797182 s |
0.000015796039979250053 s |
0.99 |
cache / PartOpt / cpu / PostRev |
0.00002128174000972649 s |
0.00002046994002739666 s |
1.04 |
cache / PartOpt / cpu / BothRev |
0.00001607645996955398 s |
0.000015548000001217587 s |
1.03 |
cache / IPartOpt / cpu / PreRev |
0.000021464520004883523 s |
0.000015691099979449063 s |
1.37 |
cache / IPartOpt / cpu / PostRev |
0.000025047639992408223 s |
0.000020764780028912355 s |
1.21 |
cache / IPartOpt / cpu / BothRev |
0.000015266539976437342 s |
0.00001614888001313375 s |
0.95 |
cache / DefOpt / cpu / PreRev |
0.000015241980017890457 s |
0.000015705839978181757 s |
0.97 |
cache / DefOpt / cpu / PostRev |
0.000015454540007340257 s |
0.000015708819946667064 s |
0.98 |
cache / DefOpt / cpu / BothRev |
0.000015894719972493476 s |
0.00001629423999474966 s |
0.98 |
cache / IDefOpt / cpu / PreRev |
0.00001601078000931011 s |
0.000015533799987679232 s |
1.03 |
cache / IDefOpt / cpu / PostRev |
0.000021224579986665047 s |
0.000016040699983932426 s |
1.32 |
cache / IDefOpt / cpu / BothRev |
0.00002138297991223226 s |
0.000015954640039126387 s |
1.34 |
cache / JaXPipe / cuda / Primal |
0.000002336 s |
0.000002304 s |
1.01 |
cache / Jax / cuda / Primal |
0.000002272 s |
0.000002271 s |
1.00 |
cache / HLOOpt / cuda / Primal |
0.000002304 s |
0.000002272 s |
1.01 |
cache / PartOpt / cuda / Primal |
0.000002304 s |
0.000002271 s |
1.01 |
cache / IPartOpt / cuda / Primal |
0.000002272 s |
0.000002272 s |
1.00 |
cache / DefOpt / cuda / Primal |
0.000002272 s |
0.000002208 s |
1.03 |
cache / IDefOpt / cuda / Primal |
0.000002335 s |
0.000002304 s |
1.01 |
cache / JaXPipe / cuda / Forward |
0.0000023670000000000004 s |
0.000002336 s |
1.01 |
cache / Jax / cuda / Forward |
0.000002304 s |
0.000002303 s |
1.00 |
cache / HLOOpt / cuda / Forward |
0.0000023670000000000004 s |
0.000002335 s |
1.01 |
cache / PartOpt / cuda / Forward |
0.0000023670000000000004 s |
0.000002336 s |
1.01 |
cache / IPartOpt / cuda / Forward |
0.000002337 s |
0.000002304 s |
1.01 |
cache / DefOpt / cuda / Forward |
0.000002272 s |
0.000002272 s |
1.00 |
cache / IDefOpt / cuda / Forward |
0.000002304 s |
0.000002272 s |
1.01 |
cache / JaXPipe / cuda / PreRev |
0.00001168 s |
0.000013248 s |
0.88 |
cache / JaXPipe / cuda / PostRev |
0.000011488 s |
0.000012608 s |
0.91 |
cache / JaXPipe / cuda / BothRev |
0.000011231 s |
0.000011295 s |
0.99 |
cache / Jax / cuda / BothRev |
0.000011520000000000002 s |
0.000011520000000000002 s |
1.00 |
cache / HLOOpt / cuda / PreRev |
0.000013248 s |
0.000013184 s |
1.00 |
cache / HLOOpt / cuda / PostRev |
0.000013184 s |
0.000013152 s |
1.00 |
cache / HLOOpt / cuda / BothRev |
0.000013248 s |
0.000013183 s |
1.00 |
cache / PartOpt / cuda / PreRev |
0.000011648 s |
0.000011552 s |
1.01 |
cache / PartOpt / cuda / PostRev |
0.000011296 s |
0.000011936 s |
0.95 |
cache / PartOpt / cuda / BothRev |
0.000011296 s |
0.000012928 s |
0.87 |
cache / IPartOpt / cuda / PreRev |
0.000011296 s |
0.000012128 s |
0.93 |
cache / IPartOpt / cuda / PostRev |
0.000011424 s |
0.000011776 s |
0.97 |
cache / IPartOpt / cuda / BothRev |
0.00001168 s |
0.000011744 s |
0.99 |
cache / DefOpt / cuda / PreRev |
0.000011424 s |
0.000011103 s |
1.03 |
cache / DefOpt / cuda / PostRev |
0.000010976 s |
0.000011264 s |
0.97 |
cache / DefOpt / cuda / BothRev |
0.000011455999999999998 s |
0.000011776 s |
0.97 |
cache / IDefOpt / cuda / PreRev |
0.000011168 s |
0.000012097 s |
0.92 |
cache / IDefOpt / cuda / PostRev |
0.000012704 s |
0.00001184 s |
1.07 |
cache / IDefOpt / cuda / BothRev |
0.000011168 s |
0.000012031 s |
0.93 |
cache / JaXPipe / tpu / Primal |
0.000002470375 s |
0.000002464075 s |
1.00 |
cache / Jax / tpu / Primal |
0.000002454525 s |
0.000002457125 s |
1.00 |
cache / HLOOpt / tpu / Primal |
0.00000246085 s |
0.00000245415 s |
1.00 |
cache / PartOpt / tpu / Primal |
0.000002466325 s |
0.0000024715 s |
1.00 |
cache / IPartOpt / tpu / Primal |
0.000002462975 s |
0.0000024584 s |
1.00 |
cache / DefOpt / tpu / Primal |
0.000002457125 s |
0.0000024666 s |
1.00 |
cache / IDefOpt / tpu / Primal |
0.0000024604 s |
0.00000247435 s |
0.99 |
cache / JaXPipe / tpu / Forward |
0.000003562475 s |
0.000003556175 s |
1.00 |
cache / Jax / tpu / Forward |
0.000003539 s |
0.000003537225 s |
1.00 |
cache / HLOOpt / tpu / Forward |
0.000003565 s |
0.0000035645250000000004 s |
1.00 |
cache / PartOpt / tpu / Forward |
0.000003538975 s |
0.000003514975 s |
1.01 |
cache / IPartOpt / tpu / Forward |
0.00000356655 s |
0.00000354975 s |
1.00 |
cache / DefOpt / tpu / Forward |
0.000003538075 s |
0.0000035269 s |
1.00 |
cache / IDefOpt / tpu / Forward |
0.000003559175 s |
0.000003550825 s |
1.00 |
cache / JaXPipe / tpu / PreRev |
0.000004946825 s |
0.000004951825 s |
1.00 |
cache / JaXPipe / tpu / PostRev |
0.0000049474 s |
0.0000049492 s |
1.00 |
cache / JaXPipe / tpu / BothRev |
0.000004991675 s |
0.000004963575 s |
1.01 |
cache / Jax / tpu / BothRev |
0.000005008325 s |
0.000004954825 s |
1.01 |
cache / HLOOpt / tpu / PreRev |
0.0000039637 s |
0.000003940500000000001 s |
1.01 |
cache / HLOOpt / tpu / PostRev |
0.000004113724999999999 s |
0.000004129325 s |
1.00 |
cache / HLOOpt / tpu / BothRev |
0.000003955025 s |
0.000003954075000000001 s |
1.00 |
cache / PartOpt / tpu / PreRev |
0.0000050171 s |
0.000004974225 s |
1.01 |
cache / PartOpt / tpu / PostRev |
0.000005007725 s |
0.000004956625 s |
1.01 |
cache / PartOpt / tpu / BothRev |
0.00000497605 s |
0.000004975175 s |
1.00 |
cache / IPartOpt / tpu / PreRev |
0.00000497825 s |
0.000004960675000000001 s |
1.00 |
cache / IPartOpt / tpu / PostRev |
0.000004959425 s |
0.0000049813 s |
1.00 |
cache / IPartOpt / tpu / BothRev |
0.000004974799999999999 s |
0.000004991 s |
1.00 |
cache / DefOpt / tpu / PreRev |
0.000004974175 s |
0.00000497925 s |
1.00 |
cache / DefOpt / tpu / PostRev |
0.000004989225 s |
0.00000497525 s |
1.00 |
cache / DefOpt / tpu / BothRev |
0.000004982125 s |
0.0000049921 s |
1.00 |
cache / IDefOpt / tpu / PreRev |
0.000004968475 s |
0.0000049675750000000005 s |
1.00 |
cache / IDefOpt / tpu / PostRev |
0.000004972725 s |
0.000004989175 s |
1.00 |
cache / IDefOpt / tpu / BothRev |
0.0000049797 s |
0.000004989925 s |
1.00 |
cache / JaXPipe / cpu / Primal |
0.000017814 s |
0.0000067652400139195375 s |
2.63 |
cache / Jax / cpu / Primal |
0.000018318 s |
0.000006858279984953697 s |
2.67 |
cache / HLOOpt / cpu / Primal |
0.000018049 s |
0.00000677435999023146 s |
2.66 |
cache / PartOpt / cpu / Primal |
0.000018749 s |
0.000006694500043522566 s |
2.80 |
cache / IPartOpt / cpu / Primal |
0.000018119 s |
0.000006187839981066645 s |
2.93 |
cache / DefOpt / cpu / Primal |
0.000018558 s |
0.000006312319965218194 s |
2.94 |
cache / IDefOpt / cpu / Primal |
0.000018136 s |
0.000006500580029751291 s |
2.79 |
cache / JaXPipe / cpu / Forward |
0.000021039 s |
0.000015542040036962133 s |
1.35 |
cache / Jax / cpu / Forward |
0.000020866 s |
0.000015676000011808355 s |
1.33 |
cache / HLOOpt / cpu / Forward |
0.000021683 s |
0.00002061443994534784 s |
1.05 |
cache / PartOpt / cpu / Forward |
0.000020825 s |
0.00002036323999163869 s |
1.02 |
cache / IPartOpt / cpu / Forward |
0.000021696 s |
0.000015725520006526496 s |
1.38 |
cache / DefOpt / cpu / Forward |
0.000021148 s |
0.000023495319992434817 s |
0.90 |
cache / IDefOpt / cpu / Forward |
0.000021219 s |
0.000014389620027941418 s |
1.47 |
cache / JaXPipe / cpu / PreRev |
0.000022861 s |
0.00001592889998391911 s |
1.44 |
cache / JaXPipe / cpu / PostRev |
0.000026608 s |
0.00002100328002597962 s |
1.27 |
cache / JaXPipe / cpu / BothRev |
0.00002181 s |
0.000016675520018907264 s |
1.31 |
cache / Jax / cpu / BothRev |
0.000026408 s |
0.00002007403998504742 s |
1.32 |
cache / HLOOpt / cpu / PreRev |
0.000022897 s |
0.000016019780005080975 s |
1.43 |
cache / HLOOpt / cpu / PostRev |
0.000021878000000000003 s |
0.000020259680022718383 s |
1.08 |
cache / HLOOpt / cpu / BothRev |
0.000022923 s |
0.000018096279964083804 s |
1.27 |
cache / PartOpt / cpu / PreRev |
0.000021707 s |
0.000015796039979250053 s |
1.37 |
cache / PartOpt / cpu / PostRev |
0.000026948 s |
0.00002046994002739666 s |
1.32 |
cache / PartOpt / cpu / BothRev |
0.000023061 s |
0.000015548000001217587 s |
1.48 |
cache / IPartOpt / cpu / PreRev |
0.000022449 s |
0.000015691099979449063 s |
1.43 |
cache / IPartOpt / cpu / PostRev |
0.000026661 s |
0.000020764780028912355 s |
1.28 |
cache / IPartOpt / cpu / BothRev |
0.000022132 s |
0.00001614888001313375 s |
1.37 |
cache / DefOpt / cpu / PreRev |
0.000021317 s |
0.000015705839978181757 s |
1.36 |
cache / DefOpt / cpu / PostRev |
0.000022388000000000003 s |
0.000015708819946667064 s |
1.43 |
cache / DefOpt / cpu / BothRev |
0.000022027 s |
0.00001629423999474966 s |
1.35 |
cache / IDefOpt / cpu / PreRev |
0.000022102 s |
0.000015533799987679232 s |
1.42 |
cache / IDefOpt / cpu / PostRev |
0.000022004 s |
0.000016040699983932426 s |
1.37 |
cache / IDefOpt / cpu / BothRev |
0.000021734 s |
0.000015954640039126387 s |
1.36 |
Concat / JaXPipe / cpu / Primal |
0.000006967280023673084 s |
0.000007477999961338355 s |
0.93 |
Concat / Jax / cpu / Primal |
0.000006650620034633903 s |
0.000007448159994964953 s |
0.89 |
Concat / HLOOpt / cpu / Primal |
0.000007009540031504002 s |
0.00000979825999820605 s |
0.72 |
Concat / PartOpt / cpu / Primal |
0.000006336420001389342 s |
0.00000708663998011616 s |
0.89 |
Concat / IPartOpt / cpu / Primal |
0.000006815859987909789 s |
0.000006348000015350408 s |
1.07 |
Concat / DefOpt / cpu / Primal |
0.000010165939984290164 s |
0.000010170200021093478 s |
1.00 |
Concat / IDefOpt / cpu / Primal |
0.00000638889998299419 s |
0.000006924580011400394 s |
0.92 |
Concat / JaXPipe / cpu / Forward |
0.000009533020047456376 s |
0.000009657939999669908 s |
0.99 |
Concat / Jax / cpu / Forward |
0.00000952329996835033 s |
0.000010302459977538091 s |
0.92 |
Concat / HLOOpt / cpu / Forward |
0.000013221300005170631 s |
0.000013886780016036937 s |
0.95 |
Concat / PartOpt / cpu / Forward |
0.00001386562001243874 s |
0.00001402134001182276 s |
0.99 |
Concat / IPartOpt / cpu / Forward |
0.000010232379981971462 s |
0.000009403939984622413 s |
1.09 |
Concat / DefOpt / cpu / Forward |
0.000014705979974678483 s |
0.000014035200047146646 s |
1.05 |
Concat / IDefOpt / cpu / Forward |
0.000010126540009878228 s |
0.000010131599992746487 s |
1.00 |
Concat / JaXPipe / cpu / PreRev |
0.000011960380033997351 s |
0.000011709379987223655 s |
1.02 |
Concat / JaXPipe / cpu / PostRev |
0.000011302900029477314 s |
0.00001162401999863505 s |
0.97 |
Concat / JaXPipe / cpu / BothRev |
0.000015279180006473324 s |
0.000011647779965642256 s |
1.31 |
Concat / Jax / cpu / BothRev |
0.000011239319983360474 s |
0.00001116463993639627 s |
1.01 |
Concat / HLOOpt / cpu / PreRev |
0.00001148808001744328 s |
0.000011569120024432778 s |
0.99 |
Concat / HLOOpt / cpu / PostRev |
0.000015201919986793654 s |
0.0000153801800297515 s |
0.99 |
Concat / HLOOpt / cpu / BothRev |
0.000012988759954168928 s |
0.000012803999989046132 s |
1.01 |
Concat / PartOpt / cpu / PreRev |
0.00001113491996875382 s |
0.000011782940000557573 s |
0.95 |
Concat / PartOpt / cpu / PostRev |
0.00001100297997254529 s |
0.000011856719975185115 s |
0.93 |
Concat / PartOpt / cpu / BothRev |
0.000011534040013430056 s |
0.00001157544000307098 s |
1.00 |
Concat / IPartOpt / cpu / PreRev |
0.000011605379986576736 s |
0.000012226760018165806 s |
0.95 |
Concat / IPartOpt / cpu / PostRev |
0.000011655339994831592 s |
0.000011645340000541182 s |
1.00 |
Concat / IPartOpt / cpu / BothRev |
0.000011621439998634742 s |
0.00001154382000095211 s |
1.01 |
Concat / DefOpt / cpu / PreRev |
0.000011100419987997156 s |
0.000011529539979164836 s |
0.96 |
Concat / DefOpt / cpu / PostRev |
0.00001140251998549502 s |
0.00001154548002887168 s |
0.99 |
Concat / DefOpt / cpu / BothRev |
0.000011119960008727505 s |
0.00001121641997997358 s |
0.99 |
Concat / IDefOpt / cpu / PreRev |
0.00001146574001722911 s |
0.000011383780001779087 s |
1.01 |
Concat / IDefOpt / cpu / PostRev |
0.000011704199978339605 s |
0.000011767759997383109 s |
0.99 |
Concat / IDefOpt / cpu / BothRev |
0.000011073660007241414 s |
0.000011008720011886908 s |
1.01 |
Concat / JaXPipe / cuda / Primal |
0.000001951 s |
0.000001951 s |
1.00 |
Concat / Jax / cuda / Primal |
0.000001951 s |
0.000001951 s |
1.00 |
Concat / HLOOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1.00 |
Concat / PartOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1.00 |
Concat / IPartOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1.00 |
Concat / DefOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1.00 |
Concat / IDefOpt / cuda / Primal |
0.000001951 s |
0.000001951 s |
1.00 |
Concat / JaXPipe / cuda / Forward |
0.000009823 s |
0.00001008 s |
0.97 |
Concat / Jax / cuda / Forward |
0.000009568 s |
0.000011328 s |
0.84 |
Concat / HLOOpt / cuda / Forward |
0.000009888 s |
0.000010017 s |
0.99 |
Concat / PartOpt / cuda / Forward |
0.000010016 s |
0.000010368 s |
0.97 |
Concat / IPartOpt / cuda / Forward |
0.00000976 s |
0.000010464 s |
0.93 |
Concat / DefOpt / cuda / Forward |
0.000012416 s |
0.00001008 s |
1.23 |
Concat / IDefOpt / cuda / Forward |
0.000009856 s |
0.000009888 s |
1.00 |
Concat / JaXPipe / cuda / PreRev |
0.000016544 s |
0.000016639 s |
0.99 |
Concat / JaXPipe / cuda / PostRev |
0.000015616 s |
0.000016032 s |
0.97 |
Concat / JaXPipe / cuda / BothRev |
0.000015935999999999998 s |
0.000015552 s |
1.02 |
Concat / Jax / cuda / BothRev |
0.000016096 s |
0.000018657 s |
0.86 |
Concat / HLOOpt / cuda / PreRev |
0.000016063999999999997 s |
0.000017824 s |
0.90 |
Concat / HLOOpt / cuda / PostRev |
0.000015712 s |
0.000016512 s |
0.95 |
Concat / HLOOpt / cuda / BothRev |
0.000015776 s |
0.000016768999999999998 s |
0.94 |
Concat / PartOpt / cuda / PreRev |
0.00001616 s |
0.000017503999999999997 s |
0.92 |
Concat / PartOpt / cuda / PostRev |
0.000016224 s |
0.000016447 s |
0.99 |
Concat / PartOpt / cuda / BothRev |
0.000016032 s |
0.000016447 s |
0.97 |
Concat / IPartOpt / cuda / PreRev |
0.000019904 s |
0.000016735 s |
1.19 |
Concat / IPartOpt / cuda / PostRev |
0.000016383999999999998 s |
0.000016224 s |
1.01 |
Concat / IPartOpt / cuda / BothRev |
0.000016383999999999998 s |
0.000016576000000000002 s |
0.99 |
Concat / DefOpt / cuda / PreRev |
0.000015904000000000002 s |
0.000018144 s |
0.88 |
Concat / DefOpt / cuda / PostRev |
0.000016255 s |
0.000016 s |
1.02 |
Concat / DefOpt / cuda / BothRev |
0.000015776 s |
0.000018624000000000003 s |
0.85 |
Concat / IDefOpt / cuda / PreRev |
0.000016704 s |
0.000016224 s |
1.03 |
Concat / IDefOpt / cuda / PostRev |
0.000016 s |
0.000016416 s |
0.97 |
Concat / IDefOpt / cuda / BothRev |
0.000016448000000000002 s |
0.000016927999999999998 s |
0.97 |
Concat / JaXPipe / tpu / Primal |
0.0000015182 s |
0.0000015328 s |
0.99 |
Concat / Jax / tpu / Primal |
0.0000015384000000000002 s |
0.000001532775 s |
1.00 |
Concat / HLOOpt / tpu / Primal |
0.0000015207 s |
0.00000152725 s |
1.00 |
Concat / PartOpt / tpu / Primal |
0.0000015287 s |
0.0000015307999999999998 s |
1.00 |
Concat / IPartOpt / tpu / Primal |
0.00000153195 s |
0.000001521475 s |
1.01 |
Concat / DefOpt / tpu / Primal |
0.000001539625 s |
0.000001526875 s |
1.01 |
Concat / IDefOpt / tpu / Primal |
0.0000015165 s |
0.000001526525 s |
0.99 |
Concat / JaXPipe / tpu / Forward |
0.000001571925 s |
0.0000015857750000000005 s |
0.99 |
Concat / Jax / tpu / Forward |
0.000001535425 s |
0.00000155355 s |
0.99 |
Concat / HLOOpt / tpu / Forward |
0.000001580275 s |
0.0000015830499999999998 s |
1.00 |
Concat / PartOpt / tpu / Forward |
0.00000154685 s |
0.000001563175 s |
0.99 |
Concat / IPartOpt / tpu / Forward |
0.000001597975 s |
0.0000015681 s |
1.02 |
Concat / DefOpt / tpu / Forward |
0.0000015478 s |
0.00000154725 s |
1.00 |
Concat / IDefOpt / tpu / Forward |
0.0000015739 s |
0.0000015760750000000002 s |
1.00 |
Concat / JaXPipe / tpu / PreRev |
0.0000020047 s |
0.000001992425 s |
1.01 |
Concat / JaXPipe / tpu / PostRev |
0.000002085925 s |
0.0000020814 s |
1.00 |
Concat / JaXPipe / tpu / BothRev |
0.0000019940000000000003 s |
0.000002005825 s |
0.99 |
Concat / Jax / tpu / BothRev |
0.0000020712 s |
0.0000020721 s |
1.00 |
Concat / HLOOpt / tpu / PreRev |
0.0000019976500000000003 s |
0.0000020036 s |
1.00 |
Concat / HLOOpt / tpu / PostRev |
0.000002069275 s |
0.00000207245 s |
1.00 |
Concat / HLOOpt / tpu / BothRev |
0.0000020058000000000003 s |
0.0000020043 s |
1.00 |
Concat / PartOpt / tpu / PreRev |
0.0000020757000000000003 s |
0.000002071125 s |
1.00 |
Concat / PartOpt / tpu / PostRev |
0.00000199385 s |
0.00000199385 s |
1.00 |
Concat / PartOpt / tpu / BothRev |
0.00000206445 s |
0.000002068375 s |
1.00 |
Concat / IPartOpt / tpu / PreRev |
0.0000019965250000000003 s |
0.000001999575 s |
1.00 |
Concat / IPartOpt / tpu / PostRev |
0.000002072875 s |
0.00000206675 s |
1.00 |
Concat / IPartOpt / tpu / BothRev |
0.000001993825 s |
0.000001995025 s |
1.00 |
Concat / DefOpt / tpu / PreRev |
0.000002077275 s |
0.000002065175 s |
1.01 |
Concat / DefOpt / tpu / PostRev |
0.000001996425 s |
0.0000019980750000000003 s |
1.00 |
Concat / DefOpt / tpu / BothRev |
0.0000020737500000000003 s |
0.000002064875 s |
1.00 |
Concat / IDefOpt / tpu / PreRev |
0.000001998575 s |
0.000002001975 s |
1.00 |
Concat / IDefOpt / tpu / PostRev |
0.0000020645 s |
0.0000020654 s |
1.00 |
Concat / IDefOpt / tpu / BothRev |
0.00000200805 s |
0.00000199555 s |
1.01 |
Concat / JaXPipe / cpu / Primal |
0.000015684 s |
0.000007477999961338355 s |
2.10 |
Concat / Jax / cpu / Primal |
0.000015889 s |
0.000007448159994964953 s |
2.13 |
Concat / HLOOpt / cpu / Primal |
0.000015745 s |
0.00000979825999820605 s |
1.61 |
Concat / PartOpt / cpu / Primal |
0.000015720000000000002 s |
0.00000708663998011616 s |
2.22 |
Concat / IPartOpt / cpu / Primal |
0.000015756 s |
0.000006348000015350408 s |
2.48 |
Concat / DefOpt / cpu / Primal |
0.000015685 s |
0.000010170200021093478 s |
1.54 |
Concat / IDefOpt / cpu / Primal |
0.000015938999999999998 s |
0.000006924580011400394 s |
2.30 |
Concat / JaXPipe / cpu / Forward |
0.000021655 s |
0.000009657939999669908 s |
2.24 |
Concat / Jax / cpu / Forward |
0.000020761000000000003 s |
0.000010302459977538091 s |
2.02 |
Concat / HLOOpt / cpu / Forward |
0.000021397 s |
0.000013886780016036937 s |
1.54 |
Concat / PartOpt / cpu / Forward |
0.000021292 s |
0.00001402134001182276 s |
1.52 |
Concat / IPartOpt / cpu / Forward |
0.00002129 s |
0.000009403939984622413 s |
2.26 |
Concat / DefOpt / cpu / Forward |
0.00002141 s |
0.000014035200047146646 s |
1.53 |
Concat / IDefOpt / cpu / Forward |
0.00002141 s |
0.000010131599992746487 s |
2.11 |
Concat / JaXPipe / cpu / PreRev |
0.000024330000000000003 s |
0.000011709379987223655 s |
2.08 |
Concat / JaXPipe / cpu / PostRev |
0.000024042 s |
0.00001162401999863505 s |
2.07 |
Concat / JaXPipe / cpu / BothRev |
0.0000237 s |
0.000011647779965642256 s |
2.03 |
Concat / Jax / cpu / BothRev |
0.000024021 s |
0.00001116463993639627 s |
2.15 |
Concat / HLOOpt / cpu / PreRev |
0.00002413 s |
0.000011569120024432778 s |
2.09 |
Concat / HLOOpt / cpu / PostRev |
0.000023686 s |
0.0000153801800297515 s |
1.54 |
Concat / HLOOpt / cpu / BothRev |
0.000023358 s |
0.000012803999989046132 s |
1.82 |
Concat / PartOpt / cpu / PreRev |
0.000023957 s |
0.000011782940000557573 s |
2.03 |
Concat / PartOpt / cpu / PostRev |
0.00002407 s |
0.000011856719975185115 s |
2.03 |
Concat / PartOpt / cpu / BothRev |
0.000023997 s |
0.00001157544000307098 s |
2.07 |
Concat / IPartOpt / cpu / PreRev |
0.000023323 s |
0.000012226760018165806 s |
1.91 |
Concat / IPartOpt / cpu / PostRev |
0.000023616 s |
0.000011645340000541182 s |
2.03 |
Concat / IPartOpt / cpu / BothRev |
0.000024009 s |
0.00001154382000095211 s |
2.08 |
Concat / DefOpt / cpu / PreRev |
0.000023931 s |
0.000011529539979164836 s |
2.08 |
Concat / DefOpt / cpu / PostRev |
0.000024455 s |
0.00001154548002887168 s |
2.12 |
Concat / DefOpt / cpu / BothRev |
0.0000244 s |
0.00001121641997997358 s |
2.18 |
Concat / IDefOpt / cpu / PreRev |
0.000024365 s |
0.000011383780001779087 s |
2.14 |
Concat / IDefOpt / cpu / PostRev |
0.000024263 s |
0.000011767759997383109 s |
2.06 |
Concat / IDefOpt / cpu / BothRev |
0.000024574 s |
0.000011008720011886908 s |
2.23 |
const_scatter / JaXPipe / cpu / Primal |
0.000006806519986639614 s |
0.000007006740024735336 s |
0.97 |
const_scatter / Jax / cpu / Primal |
0.000007078879998516641 s |
0.000006900340022184537 s |
1.03 |
const_scatter / HLOOpt / cpu / Primal |
0.000006793059965275461 s |
0.000006681180020677857 s |
1.02 |
const_scatter / PartOpt / cpu / Primal |
0.000006534359990837402 s |
0.000006747300003553391 s |
0.97 |
const_scatter / IPartOpt / cpu / Primal |
0.000006526719989778939 s |
0.000007156999990911572 s |
0.91 |
const_scatter / DefOpt / cpu / Primal |
0.000011030680007024784 s |
0.00001053997998496925 s |
1.05 |
const_scatter / IDefOpt / cpu / Primal |
0.000005976659958832897 s |
0.000006687559998681536 s |
0.89 |
const_scatter / JaXPipe / cpu / Forward |
0.000009184160035147215 s |
0.000009094260012716403 s |
1.01 |
const_scatter / Jax / cpu / Forward |
0.000010364099971411631 s |
0.000009302199969170032 s |
1.11 |
const_scatter / HLOOpt / cpu / Forward |
0.000014170279964673682 s |
0.000013041919974057237 s |
1.09 |
const_scatter / PartOpt / cpu / Forward |
0.000009670559993537609 s |
0.000013533999981518718 s |
0.71 |
const_scatter / IPartOpt / cpu / Forward |
0.000009104039991143507 s |
0.000009127520015681512 s |
1.00 |
const_scatter / DefOpt / cpu / Forward |
0.00001410830002896546 s |
0.000013344820044949302 s |
1.06 |
const_scatter / IDefOpt / cpu / Forward |
0.000009989999980462015 s |
0.000009275720003643071 s |
1.08 |
const_scatter / JaXPipe / cpu / PreRev |
0.0002999292600088 s |
0.0002991972399649 s |
1.00 |
const_scatter / JaXPipe / cpu / PostRev |
0.0002897506600311 s |
0.0003054082600374 s |
0.95 |
const_scatter / JaXPipe / cpu / BothRev |
0.0002848422400347 s |
0.0002823708999585 s |
1.01 |
const_scatter / Jax / cpu / BothRev |
0.0002833048200136 s |
0.0002835109200077 s |
1.00 |
const_scatter / HLOOpt / cpu / PreRev |
0.0002834453800096 s |
0.0002834082999379 s |
1.00 |
const_scatter / HLOOpt / cpu / PostRev |
0.0002878748599414 s |
0.0002892958200118 s |
1.00 |
const_scatter / HLOOpt / cpu / BothRev |
0.0002846289399894 s |
0.0002893734799545 s |
0.98 |
const_scatter / PartOpt / cpu / PreRev |
0.0003120051999576 s |
0.0002835973600394 s |
1.10 |
const_scatter / PartOpt / cpu / PostRev |
0.0002886159199533 s |
0.0002832099999977 s |
1.02 |
const_scatter / PartOpt / cpu / BothRev |
0.0002829703600036 s |
0.0002877302400247 s |
0.98 |
const_scatter / IPartOpt / cpu / PreRev |
0.0002898291000474 s |
0.0002864275199954 s |
1.01 |
const_scatter / IPartOpt / cpu / PostRev |
0.0002901076599391 s |
0.0002833985800225 s |
1.02 |
const_scatter / IPartOpt / cpu / BothRev |
0.0002826370599632 s |
0.0003071726200323 s |
0.92 |
const_scatter / DefOpt / cpu / PreRev |
0.0002907615200092 s |
0.0002995654199912 s |
0.97 |
const_scatter / DefOpt / cpu / PostRev |
0.0002898152199668 s |
0.0002848389600876 s |
1.02 |
const_scatter / DefOpt / cpu / BothRev |
0.0002832118999958 s |
0.0002836268600094 s |
1.00 |
const_scatter / IDefOpt / cpu / PreRev |
0.0002859992599951 s |
0.000284376260015 s |
1.01 |
const_scatter / IDefOpt / cpu / PostRev |
0.000304742219987 s |
0.0002873844400164 s |
1.06 |
const_scatter / IDefOpt / cpu / BothRev |
0.0002855401600209 s |
0.0002819903199906 s |
1.01 |
const_scatter / JaXPipe / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1.00 |
const_scatter / Jax / cuda / Primal |
0.000001919 s |
0.0000019200000000000003 s |
1.00 |
const_scatter / HLOOpt / cuda / Primal |
0.000001888 s |
0.000001888 s |
1.00 |
const_scatter / PartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1.00 |
const_scatter / IPartOpt / cuda / Primal |
0.000001919 s |
0.0000019200000000000003 s |
1.00 |
const_scatter / DefOpt / cuda / Primal |
0.000001888 s |
0.000001889 s |
1.00 |
const_scatter / IDefOpt / cuda / Primal |
0.000001888 s |
0.0000019200000000000003 s |
0.98 |
const_scatter / JaXPipe / cuda / Forward |
0.000009728 s |
0.000009759 s |
1.00 |
const_scatter / Jax / cuda / Forward |
0.000009793 s |
0.000009312000000000002 s |
1.05 |
const_scatter / HLOOpt / cuda / Forward |
0.00000976 s |
0.000009696 s |
1.01 |
const_scatter / PartOpt / cuda / Forward |
0.000009919 s |
0.000009792 s |
1.01 |
const_scatter / IPartOpt / cuda / Forward |
0.000009792 s |
0.000009632 s |
1.02 |
const_scatter / DefOpt / cuda / Forward |
0.000009696 s |
0.000009568 s |
1.01 |
const_scatter / IDefOpt / cuda / Forward |
0.000009888 s |
0.000009536 s |
1.04 |
const_scatter / JaXPipe / cuda / PreRev |
0.000013632 s |
0.00001264 s |
1.08 |
const_scatter / JaXPipe / cuda / PostRev |
0.000016864 s |
0.000016416 s |
1.03 |
const_scatter / JaXPipe / cuda / BothRev |
0.000012544 s |
0.000012448 s |
1.01 |
const_scatter / Jax / cuda / BothRev |
0.000016255999999999998 s |
0.000016 s |
1.02 |
const_scatter / HLOOpt / cuda / PreRev |
0.000012736 s |
0.000012256 s |
1.04 |
const_scatter / HLOOpt / cuda / PostRev |
0.00001296 s |
0.000012929 s |
1.00 |
const_scatter / HLOOpt / cuda / BothRev |
0.000013088 s |
0.00001264 s |
1.04 |
const_scatter / PartOpt / cuda / PreRev |
0.00001296 s |
0.000012832 s |
1.01 |
const_scatter / PartOpt / cuda / PostRev |
0.000015744 s |
0.000016351 s |
0.96 |
const_scatter / PartOpt / cuda / BothRev |
0.000013087 s |
0.000012736 s |
1.03 |
const_scatter / IPartOpt / cuda / PreRev |
0.000013952 s |
0.000012831 s |
1.09 |
const_scatter / IPartOpt / cuda / PostRev |
0.000018176 s |
0.000016672 s |
1.09 |
const_scatter / IPartOpt / cuda / BothRev |
0.000012512 s |
0.000012095 s |
1.03 |
const_scatter / DefOpt / cuda / PreRev |
0.000013792 s |
0.000012608 s |
1.09 |
const_scatter / DefOpt / cuda / PostRev |
0.000013344 s |
0.00001264 s |
1.06 |
const_scatter / DefOpt / cuda / BothRev |
0.0000136 s |
0.000012864 s |
1.06 |
const_scatter / IDefOpt / cuda / PreRev |
0.000013536 s |
0.000012992 s |
1.04 |
const_scatter / IDefOpt / cuda / PostRev |
0.000013664 s |
0.000012672 s |
1.08 |
const_scatter / IDefOpt / cuda / BothRev |
0.000012544 s |
0.000012768 s |
0.98 |
const_scatter / JaXPipe / tpu / Primal |
0.000003805825 s |
0.000003802525 s |
1.00 |
const_scatter / Jax / tpu / Primal |
0.000003811475 s |
0.000003814375 s |
1.00 |
const_scatter / HLOOpt / tpu / Primal |
9.52325e-7 s |
9.53775e-7 s |
1.00 |
const_scatter / PartOpt / tpu / Primal |
0.000003796525 s |
0.0000038103 s |
1.00 |
const_scatter / IPartOpt / tpu / Primal |
0.00000381055 s |
0.0000038021 s |
1.00 |
const_scatter / DefOpt / tpu / Primal |
9.70275e-7 s |
9.6515e-7 s |
1.01 |
const_scatter / IDefOpt / tpu / Primal |
9.60375e-7 s |
9.5495e-7 s |
1.01 |
const_scatter / JaXPipe / tpu / Forward |
0.000001917925 s |
0.000001930775 s |
0.99 |
const_scatter / Jax / tpu / Forward |
0.00000651695 s |
0.0000064954000000000005 s |
1.00 |
const_scatter / HLOOpt / tpu / Forward |
0.00000191005 s |
0.0000019141 s |
1.00 |
const_scatter / PartOpt / tpu / Forward |
0.000001947525 s |
0.000001929125 s |
1.01 |
const_scatter / IPartOpt / tpu / Forward |
0.00000191665 s |
0.00000192975 s |
0.99 |
const_scatter / DefOpt / tpu / Forward |
0.00000196345 s |
0.000001941575 s |
1.01 |
const_scatter / IDefOpt / tpu / Forward |
0.000001922025 s |
0.000001930225 s |
1.00 |
const_scatter / JaXPipe / tpu / PreRev |
0.0000043131 s |
0.000004328524999999999 s |
1.00 |
const_scatter / JaXPipe / tpu / PostRev |
0.000006664625 s |
0.00000667685 s |
1.00 |
const_scatter / JaXPipe / tpu / BothRev |
0.000004303675 s |
0.000004316825 s |
1.00 |
const_scatter / Jax / tpu / BothRev |
0.000006687975 s |
0.000006674625 s |
1.00 |
const_scatter / HLOOpt / tpu / PreRev |
0.00000430465 s |
0.0000043274 s |
0.99 |
const_scatter / HLOOpt / tpu / PostRev |
0.000004307225 s |
0.00000430925 s |
1.00 |
const_scatter / HLOOpt / tpu / BothRev |
0.00000429795 s |
0.000004318425 s |
1.00 |
const_scatter / PartOpt / tpu / PreRev |
0.000004303875 s |
0.0000043161 s |
1.00 |
const_scatter / PartOpt / tpu / PostRev |
0.00000666775 s |
0.00000667475 s |
1.00 |
const_scatter / PartOpt / tpu / BothRev |
0.00000428525 s |
0.0000043116 s |
0.99 |
const_scatter / IPartOpt / tpu / PreRev |
0.000004310625 s |
0.000004317900000000001 s |
1.00 |
const_scatter / IPartOpt / tpu / PostRev |
0.0000066649 s |
0.0000066586 s |
1.00 |
const_scatter / IPartOpt / tpu / BothRev |
0.000004301875 s |
0.0000042968 s |
1.00 |
const_scatter / DefOpt / tpu / PreRev |
0.000004304825 s |
0.000004312949999999999 s |
1.00 |
const_scatter / DefOpt / tpu / PostRev |
0.000004298575 s |
0.000004308525 s |
1.00 |
const_scatter / DefOpt / tpu / BothRev |
0.0000042970000000000005 s |
0.000004301725 s |
1.00 |
const_scatter / IDefOpt / tpu / PreRev |
0.000004303975 s |
0.0000043092 s |
1.00 |
const_scatter / IDefOpt / tpu / PostRev |
0.0000043128 s |
0.000004319875 s |
1.00 |
const_scatter / IDefOpt / tpu / BothRev |
0.000004294824999999999 s |
0.000004295625 s |
1.00 |
const_scatter / JaXPipe / cpu / Primal |
0.000015929 s |
0.000007006740024735336 s |
2.27 |
const_scatter / Jax / cpu / Primal |
0.000015819 s |
0.000006900340022184537 s |
2.29 |
const_scatter / HLOOpt / cpu / Primal |
0.000015668 s |
0.000006681180020677857 s |
2.35 |
const_scatter / PartOpt / cpu / Primal |
0.000015779000000000003 s |
0.000006747300003553391 s |
2.34 |
const_scatter / IPartOpt / cpu / Primal |
0.000015308 s |
0.000007156999990911572 s |
2.14 |
const_scatter / DefOpt / cpu / Primal |
0.000015705 s |
0.00001053997998496925 s |
1.49 |
const_scatter / IDefOpt / cpu / Primal |
0.000015815 s |
0.000006687559998681536 s |
2.36 |
const_scatter / JaXPipe / cpu / Forward |
0.000021005 s |
0.000009094260012716403 s |
2.31 |
const_scatter / Jax / cpu / Forward |
0.00002085 s |
0.000009302199969170032 s |
2.24 |
const_scatter / HLOOpt / cpu / Forward |
0.000020547 s |
0.000013041919974057237 s |
1.58 |
const_scatter / PartOpt / cpu / Forward |
0.00002043 s |
0.000013533999981518718 s |
1.51 |
const_scatter / IPartOpt / cpu / Forward |
0.000020848 s |
0.000009127520015681512 s |
2.28 |
const_scatter / DefOpt / cpu / Forward |
0.000020083 s |
0.000013344820044949302 s |
1.50 |
const_scatter / IDefOpt / cpu / Forward |
0.000021062 s |
0.000009275720003643071 s |
2.27 |
const_scatter / JaXPipe / cpu / PreRev |
0.000522884 s |
0.0002991972399649 s |
1.75 |
const_scatter / JaXPipe / cpu / PostRev |
0.000532339 s |
0.0003054082600374 s |
1.74 |
const_scatter / JaXPipe / cpu / BothRev |
0.000518626 s |
0.0002823708999585 s |
1.84 |
const_scatter / Jax / cpu / BothRev |
0.000533766 s |
0.0002835109200077 s |
1.88 |
const_scatter / HLOOpt / cpu / PreRev |
0.000529937 s |
0.0002834082999379 s |
1.87 |
const_scatter / HLOOpt / cpu / PostRev |
0.000530773 s |
0.0002892958200118 s |
1.83 |
const_scatter / HLOOpt / cpu / BothRev |
0.000526086 s |
0.0002893734799545 s |
1.82 |
const_scatter / PartOpt / cpu / PreRev |
0.000527366 s |
0.0002835973600394 s |
1.86 |
const_scatter / PartOpt / cpu / PostRev |
0.000535559 s |
0.0002832099999977 s |
1.89 |
const_scatter / PartOpt / cpu / BothRev |
0.000545703 s |
0.0002877302400247 s |
1.90 |
const_scatter / IPartOpt / cpu / PreRev |
0.000523297 s |
0.0002864275199954 s |
1.83 |
const_scatter / IPartOpt / cpu / PostRev |
0.0005376 s |
0.0002833985800225 s |
1.90 |
const_scatter / IPartOpt / cpu / BothRev |
0.00056088 s |
0.0003071726200323 s |
1.83 |
const_scatter / DefOpt / cpu / PreRev |
0.0005292869999999 s |
0.0002995654199912 s |
1.77 |
const_scatter / DefOpt / cpu / PostRev |
0.00052443 s |
0.0002848389600876 s |
1.84 |
const_scatter / DefOpt / cpu / BothRev |
0.000539118 s |
0.0002836268600094 s |
1.90 |
const_scatter / IDefOpt / cpu / PreRev |
0.000523262 s |
0.000284376260015 s |
1.84 |
const_scatter / IDefOpt / cpu / PostRev |
0.000529749 s |
0.0002873844400164 s |
1.84 |
const_scatter / IDefOpt / cpu / BothRev |
0.000529966 s |
0.0002819903199906 s |
1.88 |
GenDot / JaXPipe / cpu / Primal |
0.0000072258999898622274 s |
0.000006947120009499486 s |
1.04 |
GenDot / Jax / cpu / Primal |
0.000006873860029372736 s |
0.000006631260002905038 s |
1.04 |
GenDot / HLOOpt / cpu / Primal |
0.00001186787995720806 s |
0.000011154000021633693 s |
1.06 |
GenDot / PartOpt / cpu / Primal |
0.000006440200031647692 s |
0.000007032100011201692 s |
0.92 |
GenDot / IPartOpt / cpu / Primal |
0.000006469399995694402 s |
0.000007280660038304632 s |
0.89 |
GenDot / DefOpt / cpu / Primal |
0.000012264159931874018 s |
0.0000069018999965919646 s |
1.78 |
GenDot / IDefOpt / cpu / Primal |
0.000007042759989417391 s |
0.000007160159993873094 s |
0.98 |
GenDot / JaXPipe / cpu / Forward |
0.00001089339995814953 s |
0.000010356539960412192 s |
1.05 |
GenDot / Jax / cpu / Forward |
0.000010144619991478976 s |
0.000009965219987861929 s |
1.02 |
GenDot / HLOOpt / cpu / Forward |
0.000010850579965335782 s |
0.000015555780000795492 s |
0.70 |
GenDot / PartOpt / cpu / Forward |
0.000014091000002736107 s |
0.000015106039973034056 s |
0.93 |
GenDot / IPartOpt / cpu / Forward |
0.000009987120020014116 s |
0.000010579640002106316 s |
0.94 |
GenDot / DefOpt / cpu / Forward |
0.000015274380011760513 s |
0.000015775420024510822 s |
0.97 |
GenDot / IDefOpt / cpu / Forward |
0.000010764159978862154 s |
0.000010703139996621758 s |
1.01 |
GenDot / JaXPipe / cpu / PreRev |
0.000010639000010996825 s |
0.000010941340015051536 s |
0.97 |
GenDot / JaXPipe / cpu / PostRev |
0.000010367580025558708 s |
0.000009580099986123967 s |
1.08 |
GenDot / JaXPipe / cpu / BothRev |
0.000012410860026648152 s |
0.000015421240004798164 s |
0.80 |
GenDot / Jax / cpu / BothRev |
0.000010058579964606909 s |
0.00001034445996992872 s |
0.97 |
GenDot / HLOOpt / cpu / PreRev |
0.000010551619998295792 s |
0.000010394100036137388 s |
1.02 |
GenDot / HLOOpt / cpu / PostRev |
0.000010698859941840056 s |
0.00001084276000256068 s |
0.99 |
GenDot / HLOOpt / cpu / BothRev |
0.000011634260008577258 s |
0.0000126556399573019 s |
0.92 |
GenDot / PartOpt / cpu / PreRev |
0.000010222399978374596 s |
0.000010507800006962495 s |
0.97 |
GenDot / PartOpt / cpu / PostRev |
0.00000989210001534957 s |
0.000009735419971548254 s |
1.02 |
GenDot / PartOpt / cpu / BothRev |
0.000010970100020131212 s |
0.000010773899975902169 s |
1.02 |
GenDot / IPartOpt / cpu / PreRev |
0.000015729039996585926 s |
0.0000126061000264599 s |
1.25 |
GenDot / IPartOpt / cpu / PostRev |
0.00001016750002236222 s |
0.000009706900009405215 s |
1.05 |
GenDot / IPartOpt / cpu / BothRev |
0.000010763919999590145 s |
0.00001048992001415172 s |
1.03 |
GenDot / DefOpt / cpu / PreRev |
0.000010973160015055329 s |
0.0000105515800169087 s |
1.04 |
GenDot / DefOpt / cpu / PostRev |
0.000010253299960822914 s |
0.000010753359983937116 s |
0.95 |
GenDot / DefOpt / cpu / BothRev |
0.000010674860013750732 s |
0.000010645719949025078 s |
1.00 |
GenDot / IDefOpt / cpu / PreRev |
0.000010431959963170813 s |
0.000010760300001493308 s |
0.97 |
GenDot / IDefOpt / cpu / PostRev |
0.00001093580000087968 s |
0.00001044383999214915 s |
1.05 |
GenDot / IDefOpt / cpu / BothRev |
0.000011237580001761672 s |
0.000010543380021772464 s |
1.07 |
GenDot / JaXPipe / cuda / Primal |
0.000002016 s |
0.000002016 s |
1 |
GenDot / Jax / cuda / Primal |
0.000002016 s |
0.000002015 s |
1.00 |
GenDot / HLOOpt / cuda / Primal |
0.000002015 s |
0.000002015 s |
1 |
GenDot / PartOpt / cuda / Primal |
0.000002016 s |
0.000002015 s |
1.00 |
GenDot / IPartOpt / cuda / Primal |
0.000002016 s |
0.000002016 s |
1 |
GenDot / DefOpt / cuda / Primal |
0.000002015 s |
0.000002015 s |
1 |
GenDot / IDefOpt / cuda / Primal |
0.000002016 s |
0.000002015 s |
1.00 |
GenDot / JaXPipe / cuda / Forward |
0.000010112 s |
0.00000992 s |
1.02 |
GenDot / Jax / cuda / Forward |
0.00000944 s |
0.000009824 s |
0.96 |
GenDot / HLOOpt / cuda / Forward |
0.000009665 s |
0.000009727 s |
0.99 |
GenDot / PartOpt / cuda / Forward |
0.000009823 s |
0.000010047 s |
0.98 |
GenDot / IPartOpt / cuda / Forward |
0.000009824 s |
0.000010817 s |
0.91 |
GenDot / DefOpt / cuda / Forward |
0.000009792 s |
0.000009888 s |
0.99 |
GenDot / IDefOpt / cuda / Forward |
0.000009791 s |
0.000010144 s |
0.97 |
GenDot / JaXPipe / cuda / PreRev |
0.000014592 s |
0.000009696 s |
1.50 |
GenDot / JaXPipe / cuda / PostRev |
0.000009759 s |
0.000010144 s |
0.96 |
GenDot / JaXPipe / cuda / BothRev |
0.000010048 s |
0.00001168 s |
0.86 |
GenDot / Jax / cuda / BothRev |
0.000009664 s |
0.000010144 s |
0.95 |
GenDot / HLOOpt / cuda / PreRev |
0.000010464 s |
0.000010048 s |
1.04 |
GenDot / HLOOpt / cuda / PostRev |
0.000010464 s |
0.000010208 s |
1.03 |
GenDot / HLOOpt / cuda / BothRev |
0.000010368 s |
0.000009856 s |
1.05 |
GenDot / PartOpt / cuda / PreRev |
0.000009888 s |
0.00000976 s |
1.01 |
GenDot / PartOpt / cuda / PostRev |
0.000010176 s |
0.000010271 s |
0.99 |
GenDot / PartOpt / cuda / BothRev |
0.000010336 s |
0.000009983 s |
1.04 |
GenDot / IPartOpt / cuda / PreRev |
0.00000992 s |
0.000010145 s |
0.98 |
GenDot / IPartOpt / cuda / PostRev |
0.000010368 s |
0.000009888 s |
1.05 |
GenDot / IPartOpt / cuda / BothRev |
0.000010048 s |
0.000009984 s |
1.01 |
GenDot / DefOpt / cuda / PreRev |
0.000009984 s |
0.000010688 s |
0.93 |
GenDot / DefOpt / cuda / PostRev |
0.000010048 s |
0.000009889 s |
1.02 |
GenDot / DefOpt / cuda / BothRev |
0.000009792 s |
0.0000104 s |
0.94 |
GenDot / IDefOpt / cuda / PreRev |
0.000010144 s |
0.000009728 s |
1.04 |
GenDot / IDefOpt / cuda / PostRev |
0.000009888 s |
0.000010208 s |
0.97 |
GenDot / IDefOpt / cuda / BothRev |
0.000010176 s |
0.000009856 s |
1.03 |
GenDot / JaXPipe / tpu / Primal |
9.30175e-7 s |
9.30225e-7 s |
1.00 |
GenDot / Jax / tpu / Primal |
9.36125e-7 s |
9.36325e-7 s |
1.00 |
GenDot / HLOOpt / tpu / Primal |
0.00000157635 s |
0.000001582275 s |
1.00 |
GenDot / PartOpt / tpu / Primal |
9.36e-7 s |
9.367e-7 s |
1.00 |
GenDot / IPartOpt / tpu / Primal |
9.40075e-7 s |
9.4025e-7 s |
1.00 |
GenDot / DefOpt / tpu / Primal |
0.0000015000749999999998 s |
0.00000150015 s |
1.00 |
GenDot / IDefOpt / tpu / Primal |
0.0000015762 s |
0.00000157865 s |
1.00 |
GenDot / JaXPipe / tpu / Forward |
0.0000031579 s |
0.0000031637 s |
1.00 |
GenDot / Jax / tpu / Forward |
0.000002334675 s |
0.000002335475 s |
1.00 |
GenDot / HLOOpt / tpu / Forward |
0.0000031101750000000003 s |
0.000003120325 s |
1.00 |
GenDot / PartOpt / tpu / Forward |
0.00000321845 s |
0.0000032214749999999995 s |
1.00 |
GenDot / IPartOpt / tpu / Forward |
0.0000031199 s |
0.0000031289499999999995 s |
1.00 |
GenDot / DefOpt / tpu / Forward |
0.000003214625 s |
0.0000032214749999999995 s |
1.00 |
GenDot / IDefOpt / tpu / Forward |
0.0000031197000000000004 s |
0.0000031285 s |
1.00 |
GenDot / JaXPipe / tpu / PreRev |
0.00000297265 s |
0.000002972925 s |
1.00 |
GenDot / JaXPipe / tpu / PostRev |
0.00000240415 s |
0.000002405675 s |
1.00 |
GenDot / JaXPipe / tpu / BothRev |
0.000002969875 s |
0.00000296155 s |
1.00 |
GenDot / Jax / tpu / BothRev |
0.0000024108750000000004 s |
0.000002403525 s |
1.00 |
GenDot / HLOOpt / tpu / PreRev |
0.000002966125 s |
0.00000296425 s |
1.00 |
GenDot / HLOOpt / tpu / PostRev |
0.0000029226500000000003 s |
0.00000292975 s |
1.00 |
GenDot / HLOOpt / tpu / BothRev |
0.0000029594 s |
0.0000029618 s |
1.00 |
GenDot / PartOpt / tpu / PreRev |
0.000002939775 s |
0.0000029347 s |
1.00 |
GenDot / PartOpt / tpu / PostRev |
0.0000023864 s |
0.0000024008 s |
0.99 |
GenDot / PartOpt / tpu / BothRev |
0.000002940775 s |
0.000002934525 s |
1.00 |
GenDot / IPartOpt / tpu / PreRev |
0.000002965775 s |
0.000002959775 s |
1.00 |
GenDot / IPartOpt / tpu / PostRev |
0.0000024031 s |
0.000002402725 s |
1.00 |
GenDot / IPartOpt / tpu / BothRev |
0.000002962625 s |
0.00000295655 s |
1.00 |
GenDot / DefOpt / tpu / PreRev |
0.0000029423749999999995 s |
0.0000029418 s |
1.00 |
GenDot / DefOpt / tpu / PostRev |
0.000002965775 s |
0.000002963375 s |
1.00 |
GenDot / DefOpt / tpu / BothRev |
0.0000029407 s |
0.0000029399 s |
1.00 |
GenDot / IDefOpt / tpu / PreRev |
0.00000295815 s |
0.0000029657 s |
1.00 |
GenDot / IDefOpt / tpu / PostRev |
0.0000029338 s |
0.000002932425 s |
1.00 |
GenDot / IDefOpt / tpu / BothRev |
0.000002964075 s |
0.0000029619 s |
1.00 |
GenDot / JaXPipe / cpu / Primal |
0.000018145 s |
0.000006947120009499486 s |
2.61 |
GenDot / Jax / cpu / Primal |
0.000017959 s |
0.000006631260002905038 s |
2.71 |
GenDot / HLOOpt / cpu / Primal |
0.000017406000000000002 s |
0.000011154000021633693 s |
1.56 |
GenDot / PartOpt / cpu / Primal |
0.000018187 s |
0.000007032100011201692 s |
2.59 |
GenDot / IPartOpt / cpu / Primal |
0.000018488 s |
0.000007280660038304632 s |
2.54 |
GenDot / DefOpt / cpu / Primal |
0.00001746 s |
0.0000069018999965919646 s |
2.53 |
GenDot / IDefOpt / cpu / Primal |
0.000017147 s |
0.000007160159993873094 s |
2.39 |
GenDot / JaXPipe / cpu / Forward |
0.000025105 s |
0.000010356539960412192 s |
2.42 |
GenDot / Jax / cpu / Forward |
0.000024854 s |
0.000009965219987861929 s |
2.49 |
GenDot / HLOOpt / cpu / Forward |
0.000023496 s |
0.000015555780000795492 s |
1.51 |
GenDot / PartOpt / cpu / Forward |
0.000023468 s |
0.000015106039973034056 s |
1.55 |
GenDot / IPartOpt / cpu / Forward |
0.000023391 s |
0.000010579640002106316 s |
2.21 |
GenDot / DefOpt / cpu / Forward |
0.000023726 s |
0.000015775420024510822 s |
1.50 |
GenDot / IDefOpt / cpu / Forward |
0.000023155 s |
0.000010703139996621758 s |
2.16 |
GenDot / JaXPipe / cpu / PreRev |
0.000023872 s |
0.000010941340015051536 s |
2.18 |
GenDot / JaXPipe / cpu / PostRev |
0.000025339 s |
0.000009580099986123967 s |
2.64 |
GenDot / JaXPipe / cpu / BothRev |
0.000023873 s |
0.000015421240004798164 s |
1.55 |
GenDot / Jax / cpu / BothRev |
0.000025151 s |
0.00001034445996992872 s |
2.43 |
GenDot / HLOOpt / cpu / PreRev |
0.00002311 s |
0.000010394100036137388 s |
2.22 |
GenDot / HLOOpt / cpu / PostRev |
0.000023884 s |
0.00001084276000256068 s |
2.20 |
GenDot / HLOOpt / cpu / BothRev |
0.000023582 s |
0.0000126556399573019 s |
1.86 |
GenDot / PartOpt / cpu / PreRev |
0.000024048 s |
0.000010507800006962495 s |
2.29 |
GenDot / PartOpt / cpu / PostRev |
0.000025532 s |
0.000009735419971548254 s |
2.62 |
GenDot / PartOpt / cpu / BothRev |
0.000023629 s |
0.000010773899975902169 s |
2.19 |
GenDot / IPartOpt / cpu / PreRev |
0.000023398 s |
0.0000126061000264599 s |
1.86 |
GenDot / IPartOpt / cpu / PostRev |
0.0000248 s |
0.000009706900009405215 s |
2.55 |
GenDot / IPartOpt / cpu / BothRev |
0.000023735 s |
0.00001048992001415172 s |
2.26 |
GenDot / DefOpt / cpu / PreRev |
0.000024281 s |
0.0000105515800169087 s |
2.30 |
GenDot / DefOpt / cpu / PostRev |
0.000024015000000000003 s |
0.000010753359983937116 s |
2.23 |
GenDot / DefOpt / cpu / BothRev |
0.000023639 s |
0.000010645719949025078 s |
2.22 |
GenDot / IDefOpt / cpu / PreRev |
0.000023231 s |
0.000010760300001493308 s |
2.16 |
GenDot / IDefOpt / cpu / PostRev |
0.000023372 s |
0.00001044383999214915 s |
2.24 |
GenDot / IDefOpt / cpu / BothRev |
0.00002337 s |
0.000010543380021772464 s |
2.22 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000010553960000834197 s |
0.00001215756002238777 s |
0.87 |
hlo_ffi / Jax / cpu / Primal |
0.000010557359992162674 s |
0.000011048639989894584 s |
0.96 |
hlo_ffi / HLOOpt / cpu / Primal |
0.00001379296002596675 s |
0.000014333660019474336 s |
0.96 |
hlo_ffi / PartOpt / cpu / Primal |
0.000010393359980298556 s |
0.0000105436999729136 s |
0.99 |
hlo_ffi / IPartOpt / cpu / Primal |
0.00001040902003296651 s |
0.000011186599986103829 s |
0.93 |
hlo_ffi / DefOpt / cpu / Primal |
0.000014546019965564484 s |
0.000011220920032428696 s |
1.30 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000010131940025530638 s |
0.000010587520037006473 s |
0.96 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000014821639997535385 s |
0.000015703720000601605 s |
0.94 |
hlo_ffi / Jax / cpu / Forward |
0.000014716419927935933 s |
0.0000157695599955332 s |
0.93 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000015233600024657787 s |
0.00001574168001752696 s |
0.97 |
hlo_ffi / PartOpt / cpu / Forward |
0.00001550762000078976 s |
0.00001541290002023743 s |
1.01 |
hlo_ffi / IPartOpt / cpu / Forward |
0.00001527420000456914 s |
0.000016654940009175333 s |
0.92 |
hlo_ffi / DefOpt / cpu / Forward |
0.00001563409995469556 s |
0.000015555039981336448 s |
1.01 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000015012900003057438 s |
0.00001551384004415013 s |
0.97 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000014940260007278992 s |
0.000015577020049022396 s |
0.96 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.00001484106001953478 s |
0.00001549124001940072 s |
0.96 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000015342179976869376 s |
0.000018198480001956345 s |
0.84 |
hlo_ffi / Jax / cpu / BothRev |
0.00001493138001023908 s |
0.00001639481999518466 s |
0.91 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000015015779981695232 s |
0.00001555846000883321 s |
0.97 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000014814880041740252 s |
0.00001602827996975975 s |
0.92 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.00001689178000560787 s |
0.000017372659985994687 s |
0.97 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000015088300015122514 s |
0.000015273759972842528 s |
0.99 |
hlo_ffi / PartOpt / cpu / PostRev |
0.00001531143999272899 s |
0.000015502440010095598 s |
0.99 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000014818060017205425 s |
0.000015788059963597335 s |
0.94 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000014639300025010015 s |
0.000015949380012898474 s |
0.92 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000014955619972170098 s |
0.000016042460019889402 s |
0.93 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000014924800034350482 s |
0.00001584979996550828 s |
0.94 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000015131500022107505 s |
0.000015901400001894216 s |
0.95 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000014702599992233444 s |
0.000015517680012635537 s |
0.95 |
hlo_ffi / DefOpt / cpu / BothRev |
0.00001492728003540833 s |
0.000015647959990019445 s |
0.95 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.00001536083998871618 s |
0.000015500539984714123 s |
0.99 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.00001526971997009241 s |
0.00001546057996165473 s |
0.99 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000014659799990113243 s |
0.000015286819971151998 s |
0.96 |
hlo_ffi / JaXPipe / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / Jax / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / HLOOpt / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / PartOpt / cuda / Primal |
0.000001983 s |
0.000001984 s |
1.00 |
hlo_ffi / IPartOpt / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / DefOpt / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / IDefOpt / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / JaXPipe / cuda / Forward |
0.000002079 s |
0.00000208 s |
1.00 |
hlo_ffi / Jax / cuda / Forward |
0.00000208 s |
0.000002049 s |
1.02 |
hlo_ffi / HLOOpt / cuda / Forward |
0.000002079 s |
0.00000208 s |
1.00 |
hlo_ffi / PartOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / IPartOpt / cuda / Forward |
0.000002079 s |
0.00000208 s |
1.00 |
hlo_ffi / DefOpt / cuda / Forward |
0.000002048 s |
0.00000208 s |
0.98 |
hlo_ffi / IDefOpt / cuda / Forward |
0.00000208 s |
0.000002079 s |
1.00 |
hlo_ffi / JaXPipe / cuda / PreRev |
0.000002079 s |
0.000002048 s |
1.02 |
hlo_ffi / JaXPipe / cuda / PostRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / JaXPipe / cuda / BothRev |
0.00000208 s |
0.000002048 s |
1.02 |
hlo_ffi / Jax / cuda / BothRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / HLOOpt / cuda / PreRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / HLOOpt / cuda / PostRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / HLOOpt / cuda / BothRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / PartOpt / cuda / PreRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / PartOpt / cuda / PostRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / PartOpt / cuda / BothRev |
0.000002079 s |
0.000002048 s |
1.02 |
hlo_ffi / IPartOpt / cuda / PreRev |
0.000002079 s |
0.000002047 s |
1.02 |
hlo_ffi / IPartOpt / cuda / PostRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / IPartOpt / cuda / BothRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / DefOpt / cuda / PreRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / DefOpt / cuda / PostRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / DefOpt / cuda / BothRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / IDefOpt / cuda / PreRev |
0.000002079 s |
0.000002047 s |
1.02 |
hlo_ffi / IDefOpt / cuda / PostRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / IDefOpt / cuda / BothRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / JaXPipe / tpu / Primal |
9.19725e-7 s |
9.09975e-7 s |
1.01 |
hlo_ffi / Jax / tpu / Primal |
9.49775e-7 s |
9.754e-7 s |
0.97 |
hlo_ffi / HLOOpt / tpu / Primal |
8.9505e-7 s |
9.47675e-7 s |
0.94 |
hlo_ffi / PartOpt / tpu / Primal |
9.5035e-7 s |
9.80375e-7 s |
0.97 |
hlo_ffi / IPartOpt / tpu / Primal |
8.9825e-7 s |
9.451e-7 s |
0.95 |
hlo_ffi / DefOpt / tpu / Primal |
9.59225e-7 s |
9.7455e-7 s |
0.98 |
hlo_ffi / IDefOpt / tpu / Primal |
8.98775e-7 s |
9.5405e-7 s |
0.94 |
hlo_ffi / JaXPipe / tpu / Forward |
9.495e-7 s |
9.489e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Forward |
9.81375e-7 s |
9.817e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Forward |
9.74125e-7 s |
9.73525e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Forward |
9.345e-7 s |
9.5905e-7 s |
0.97 |
hlo_ffi / IPartOpt / tpu / Forward |
9.73625e-7 s |
9.73825e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Forward |
9.33925e-7 s |
9.5865e-7 s |
0.97 |
hlo_ffi / IDefOpt / tpu / Forward |
9.73875e-7 s |
9.7355e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PreRev |
9.32275e-7 s |
9.5385e-7 s |
0.98 |
hlo_ffi / JaXPipe / tpu / PostRev |
9.647e-7 s |
9.64375e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / BothRev |
9.5965e-7 s |
9.948e-7 s |
0.96 |
hlo_ffi / Jax / tpu / BothRev |
9.651e-7 s |
9.65025e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PreRev |
9.59825e-7 s |
9.9415e-7 s |
0.97 |
hlo_ffi / HLOOpt / tpu / PostRev |
9.651e-7 s |
9.642e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / BothRev |
9.59675e-7 s |
9.946e-7 s |
0.96 |
hlo_ffi / PartOpt / tpu / PreRev |
9.6465e-7 s |
9.64275e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PostRev |
9.598e-7 s |
9.9475e-7 s |
0.96 |
hlo_ffi / PartOpt / tpu / BothRev |
9.65175e-7 s |
9.645e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PreRev |
9.60125e-7 s |
9.946249999999998e-7 s |
0.97 |
hlo_ffi / IPartOpt / tpu / PostRev |
9.65575e-7 s |
9.646e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / BothRev |
9.60225e-7 s |
9.946e-7 s |
0.97 |
hlo_ffi / DefOpt / tpu / PreRev |
9.6525e-7 s |
9.6435e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PostRev |
9.60375e-7 s |
9.9435e-7 s |
0.97 |
hlo_ffi / DefOpt / tpu / BothRev |
9.650749999999998e-7 s |
9.64725e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PreRev |
9.60275e-7 s |
9.944000000000002e-7 s |
0.97 |
hlo_ffi / IDefOpt / tpu / PostRev |
9.64675e-7 s |
9.642e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / BothRev |
9.60075e-7 s |
9.94575e-7 s |
0.97 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000021682 s |
0.00001215756002238777 s |
1.78 |
hlo_ffi / Jax / cpu / Primal |
0.000021685 s |
0.000011048639989894584 s |
1.96 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000021221 s |
0.000014333660019474336 s |
1.48 |
hlo_ffi / PartOpt / cpu / Primal |
0.000021028000000000003 s |
0.0000105436999729136 s |
1.99 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000021547 s |
0.000011186599986103829 s |
1.93 |
hlo_ffi / DefOpt / cpu / Primal |
0.000021409 s |
0.000011220920032428696 s |
1.91 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000021307 s |
0.000010587520037006473 s |
2.01 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000030411 s |
0.000015703720000601605 s |
1.94 |
hlo_ffi / Jax / cpu / Forward |
0.000030104 s |
0.0000157695599955332 s |
1.91 |
hlo_ffi / HLOOpt / cpu / Forward |
0.00002972 s |
0.00001574168001752696 s |
1.89 |
hlo_ffi / PartOpt / cpu / Forward |
0.000029394 s |
0.00001541290002023743 s |
1.91 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000029744 s |
0.000016654940009175333 s |
1.79 |
hlo_ffi / DefOpt / cpu / Forward |
0.000029495 s |
0.000015555039981336448 s |
1.90 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000029736 s |
0.00001551384004415013 s |
1.92 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000030484 s |
0.000015577020049022396 s |
1.96 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000029783 s |
0.00001549124001940072 s |
1.92 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000030404 s |
0.000018198480001956345 s |
1.67 |
hlo_ffi / Jax / cpu / BothRev |
0.000029675 s |
0.00001639481999518466 s |
1.81 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000029893 s |
0.00001555846000883321 s |
1.92 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.00002977 s |
0.00001602827996975975 s |
1.86 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000030118 s |
0.000017372659985994687 s |
1.73 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000030551 s |
0.000015273759972842528 s |
2.00 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000030883 s |
0.000015502440010095598 s |
1.99 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000030171 s |
0.000015788059963597335 s |
1.91 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000030291 s |
0.000015949380012898474 s |
1.90 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000030817 s |
0.000016042460019889402 s |
1.92 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000030333 s |
0.00001584979996550828 s |
1.91 |
hlo_ffi / DefOpt / cpu / PreRev |
0.0000305 s |
0.000015901400001894216 s |
1.92 |
hlo_ffi / DefOpt / cpu / PostRev |
0.00003033 s |
0.000015517680012635537 s |
1.95 |
hlo_ffi / DefOpt / cpu / BothRev |
0.00003099 s |
0.000015647959990019445 s |
1.98 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000030658000000000004 s |
0.000015500539984714123 s |
1.98 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000051691 s |
0.00001546057996165473 s |
3.34 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.00003017 s |
0.000015286819971151998 s |
1.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.0011608180000621 s |
0.0012036823999551 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.000921300200116 s |
0.0009620957998777 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.000965019799878 s |
0.0009929958000611 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0009083613998882 s |
0.0009454152001126 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0009227689999534 s |
0.0009428865999325 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.0009837171998697 s |
0.0009812250000322 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.0009558567999192 s |
0.0009777559998838 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.0026824629999282 s |
0.002858737400038 s |
0.94 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.0023384537998936 s |
0.0023105184000087 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.0021713529999942 s |
0.0022813554001004 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.0022423364000133 s |
0.0021734424000896 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.0021497572000953 s |
0.0021884054000111 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.0022551081999154 s |
0.0024744702000134 s |
0.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.002412242399987 s |
0.0022505462000481 s |
1.07 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.0061717777999547 s |
0.006784250200053 s |
0.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.0052599124000153 s |
0.0058888162000585 s |
0.89 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.0054720860000998 s |
0.0056171007999182 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.0056609621999996 s |
0.0059114769999723 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.0055049397999027 s |
0.0058158163998086 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.005428001600103 s |
0.0045190218000243 s |
1.20 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.006919869199919 s |
0.0065032250000513 s |
1.06 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.0054702464000911 s |
0.0046092143998066 s |
1.19 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.0059115796000696 s |
0.0070172094000554 s |
0.84 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.0057619930000328 s |
0.0033629600000494 s |
1.71 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.0063755414000297 s |
0.0063722097999743 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.0057007753999641 s |
0.0054374347999328 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.0032509261998711 s |
0.0033442355998886 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.0054783122000117 s |
0.0050351783999758 s |
1.09 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.0031110091998925 s |
0.0075923654000689 s |
0.41 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.0052721056000336 s |
0.0051982864000819 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.006876614200064 s |
0.0033124989999123 s |
2.08 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.0058515341999736 s |
0.0049496529998577 s |
1.18 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.0061410750000504 s |
0.0033382535999407 s |
1.84 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal |
0.000281534 s |
0.000279551 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal |
0.000281278 s |
0.00028 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal |
0.00028931 s |
0.000287296 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal |
0.000282398 s |
0.000281216 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal |
0.000282879 s |
0.0002812149999999 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal |
0.000288798 s |
0.000287392 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal |
0.0002890539999999 s |
0.000287263 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward |
0.000559964 s |
0.000558686 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward |
0.000539837 s |
0.000539007 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward |
0.000559772 s |
0.000558366 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward |
0.000560508 s |
0.000559166 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward |
0.00056086 s |
0.000558559 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward |
0.00055974 s |
0.000558239 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward |
0.00056006 s |
0.000557855 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev |
0.001028856 s |
0.001026045 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev |
0.000987865 s |
0.000983869 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev |
0.001022745 s |
0.001019773 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev |
0.00098537 s |
0.000982365 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev |
0.00100956 s |
0.001006621 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev |
0.0010360249999999 s |
0.001031101 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev |
0.001009081 s |
0.001007485 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev |
0.001025818 s |
0.001023357 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev |
0.000974137 s |
0.000972957 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev |
0.001026009 s |
0.001019998 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev |
0.001024665 s |
0.001020509 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev |
0.000975097 s |
0.000971806 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev |
0.001029945 s |
0.0010250539999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev |
0.001019482 s |
0.001018141 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev |
0.000957914 s |
0.000955134 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev |
0.001020505 s |
0.001018557 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev |
0.0010201849999999 s |
0.001018461 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev |
0.001021593 s |
0.001017341 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev |
0.0010210809999999 s |
0.00101523 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal |
0.00012412975 s |
0.000130776 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal |
0.00012663325 s |
0.00012379575 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal |
0.000152624 s |
0.0001602895 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal |
0.00013437525 s |
0.00013092325 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal |
0.00013076625 s |
0.00013860425 s |
0.94 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal |
0.0001485145 s |
0.0001448835 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal |
0.0001508324999999 s |
0.000158363 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward |
0.0002121357499999 s |
0.00021344325 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward |
0.0002612459999999 s |
0.000262739 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward |
0.00021226825 s |
0.0002202455 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward |
0.00021842225 s |
0.0002149439999999 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward |
0.00021231325 s |
0.00021632625 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward |
0.00021834975 s |
0.0002179289999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward |
0.0002123635 s |
0.000215482 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev |
0.0003545575 s |
0.00035604775 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev |
0.000256988 s |
0.00025613975 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev |
0.0003549145 s |
0.0003558845 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev |
0.00025698375 s |
0.0002572075 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev |
0.000354719 s |
0.00035595475 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev |
0.00029080325 s |
0.00029121575 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev |
0.00035460975 s |
0.0003563505 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev |
0.00035542875 s |
0.0003559474999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev |
0.00027109 s |
0.0002721675 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev |
0.00035554475 s |
0.00035589125 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev |
0.0003546405 s |
0.0003560404999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev |
0.0002720775 s |
0.000272059 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev |
0.00035490025 s |
0.00035620475 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev |
0.00035800825 s |
0.00035833425 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev |
0.0002838684999999 s |
0.00028397 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev |
0.00035760675 s |
0.00035829275 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev |
0.0003568052499999 s |
0.0003583175 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev |
0.00030104975 s |
0.00030107325 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev |
0.000357122 s |
0.0003580094999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.0022880039999999 s |
0.0012036823999551 s |
1.90 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.0023439339999999 s |
0.0009620957998777 s |
2.44 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.00246858 s |
0.0009929958000611 s |
2.49 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.002261953 s |
0.0009454152001126 s |
2.39 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.002333813 s |
0.0009428865999325 s |
2.48 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.002340501 s |
0.0009812250000322 s |
2.39 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.00231937 s |
0.0009777559998838 s |
2.37 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.00589891 s |
0.002858737400038 s |
2.06 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.006237733 s |
0.0023105184000087 s |
2.70 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.0060525299999999 s |
0.0022813554001004 s |
2.65 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.006347947 s |
0.0021734424000896 s |
2.92 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.0062175 s |
0.0021884054000111 s |
2.84 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.005916849 s |
0.0024744702000134 s |
2.39 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.0061690389999999 s |
0.0022505462000481 s |
2.74 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.0123798969999999 s |
0.006784250200053 s |
1.82 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.009809592 s |
0.0058888162000585 s |
1.67 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.009828801 s |
0.0056171007999182 s |
1.75 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.010812429 s |
0.0059114769999723 s |
1.83 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.009623281 s |
0.0058158163998086 s |
1.65 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.010003285 s |
0.0045190218000243 s |
2.21 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.0117749439999999 s |
0.0065032250000513 s |
1.81 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.009447268 s |
0.0046092143998066 s |
2.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.012011901 s |
0.0070172094000554 s |
1.71 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.011384556 s |
0.0033629600000494 s |
3.39 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.010043784 s |
0.0063722097999743 s |
1.58 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.010663173 s |
0.0054374347999328 s |
1.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.010075192 s |
0.0033442355998886 s |
3.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.009827913 s |
0.0050351783999758 s |
1.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.0091100779999999 s |
0.0075923654000689 s |
1.20 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.009522662 s |
0.0051982864000819 s |
1.83 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.009607598 s |
0.0033124989999123 s |
2.90 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.01002465 s |
0.0049496529998577 s |
2.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.009256671 s |
0.0033382535999407 s |
2.77 |
scatter_sum / JaXPipe / cpu / Primal |
0.000008765140000832616 s |
0.0000080933799927152 s |
1.08 |
scatter_sum / Jax / cpu / Primal |
0.00000830382003186969 s |
0.00000773041997490509 s |
1.07 |
scatter_sum / HLOOpt / cpu / Primal |
0.00001165101999504259 s |
0.000010982679996232036 s |
1.06 |
scatter_sum / PartOpt / cpu / Primal |
0.00000729794000108086 s |
0.000007304119981199619 s |
1.00 |
scatter_sum / IPartOpt / cpu / Primal |
0.000007218680011646939 s |
0.000007588260004922631 s |
0.95 |
scatter_sum / DefOpt / cpu / Primal |
0.000007342320022871718 s |
0.000007340080019275774 s |
1.00 |
scatter_sum / IDefOpt / cpu / Primal |
0.000007587279960716842 s |
0.000007766280004943837 s |
0.98 |
scatter_sum / JaXPipe / cpu / Forward |
0.000011682040021696592 s |
0.000011774819968195516 s |
0.99 |
scatter_sum / Jax / cpu / Forward |
0.000011735579992091516 s |
0.00001104679997297353 s |
1.06 |
scatter_sum / HLOOpt / cpu / Forward |
0.000016802159971121 s |
0.000016549519978070747 s |
1.02 |
scatter_sum / PartOpt / cpu / Forward |
0.00001270353999643703 s |
0.000016693700017640368 s |
0.76 |
scatter_sum / IPartOpt / cpu / Forward |
0.000011650680035018013 s |
0.00001139787999818509 s |
1.02 |
scatter_sum / DefOpt / cpu / Forward |
0.000017205119966092753 s |
0.00001691736004431732 s |
1.02 |
scatter_sum / IDefOpt / cpu / Forward |
0.000011998819991276832 s |
0.000011772519965234096 s |
1.02 |
scatter_sum / JaXPipe / cpu / PreRev |
0.00001135266001256241 s |
0.000012000640008409392 s |
0.95 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000011807319997387824 s |
0.000011409239968998008 s |
1.03 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000011457300033725914 s |
0.000016204619996642577 s |
0.71 |
scatter_sum / Jax / cpu / BothRev |
0.000011159440018673197 s |
0.000011913580010514124 s |
0.94 |
scatter_sum / HLOOpt / cpu / PreRev |
0.00001145110003562877 s |
0.000011849580014313687 s |
0.97 |
scatter_sum / HLOOpt / cpu / PostRev |
0.00001612585995644622 s |
0.00001634619997275877 s |
0.99 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000012872859979324858 s |
0.000013083580033708132 s |
0.98 |
scatter_sum / PartOpt / cpu / PreRev |
0.000011292019980828628 s |
0.000011277140010861333 s |
1.00 |
scatter_sum / PartOpt / cpu / PostRev |
0.000011635459941317094 s |
0.000011436099994170943 s |
1.02 |
scatter_sum / PartOpt / cpu / BothRev |
0.000011186780011485096 s |
0.000012017060034850149 s |
0.93 |
scatter_sum / IPartOpt / cpu / PreRev |
0.000011781960020016412 s |
0.00001792121997823415 s |
0.66 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000011918680002054316 s |
0.000011673999988488504 s |
1.02 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000011082959990744713 s |
0.000011753560038414434 s |
0.94 |
scatter_sum / DefOpt / cpu / PreRev |
0.00001176368002234085 s |
0.000012247980012034531 s |
0.96 |
scatter_sum / DefOpt / cpu / PostRev |
0.000011209939966647653 s |
0.00001137746000495099 s |
0.99 |
scatter_sum / DefOpt / cpu / BothRev |
0.000010957640024571446 s |
0.000011400659977880424 s |
0.96 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000011308880038995994 s |
0.00001144964000559412 s |
0.99 |
scatter_sum / IDefOpt / cpu / PostRev |
0.0000114527999812708 s |
0.00001141109995842271 s |
1.00 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000011729199995897944 s |
0.00001149439997789159 s |
1.02 |
scatter_sum / JaXPipe / cuda / Primal |
0.000009824 s |
0.000010464 s |
0.94 |
scatter_sum / Jax / cuda / Primal |
0.000009504 s |
0.000009887 s |
0.96 |
scatter_sum / HLOOpt / cuda / Primal |
0.000009984 s |
0.000010272 s |
0.97 |
scatter_sum / PartOpt / cuda / Primal |
0.000009664 s |
0.000009887 s |
0.98 |
scatter_sum / IPartOpt / cuda / Primal |
0.000010144 s |
0.000010209 s |
0.99 |
scatter_sum / DefOpt / cuda / Primal |
0.000009919 s |
0.000009855 s |
1.01 |
scatter_sum / IDefOpt / cuda / Primal |
0.000009888 s |
0.000010176 s |
0.97 |
scatter_sum / JaXPipe / cuda / Forward |
0.000017024 s |
0.000017344 s |
0.98 |
scatter_sum / Jax / cuda / Forward |
0.00001664 s |
0.000016865000000000002 s |
0.99 |
scatter_sum / HLOOpt / cuda / Forward |
0.000016672 s |
0.000016576000000000002 s |
1.01 |
scatter_sum / PartOpt / cuda / Forward |
0.000016863 s |
0.00001728 s |
0.98 |
scatter_sum / IPartOpt / cuda / Forward |
0.000017023 s |
0.000017375999999999998 s |
0.98 |
scatter_sum / DefOpt / cuda / Forward |
0.0000168 s |
0.000016768000000000003 s |
1.00 |
scatter_sum / IDefOpt / cuda / Forward |
0.000016352 s |
0.00001712 s |
0.96 |
scatter_sum / JaXPipe / cuda / PreRev |
0.000016544 s |
0.0000168 s |
0.98 |
scatter_sum / JaXPipe / cuda / PostRev |
0.00001648 s |
0.00001744 s |
0.94 |
scatter_sum / JaXPipe / cuda / BothRev |
0.000021504 s |
0.000018048 s |
1.19 |
scatter_sum / Jax / cuda / BothRev |
0.000016608 s |
0.00001664 s |
1.00 |
scatter_sum / HLOOpt / cuda / PreRev |
0.000016768000000000003 s |
0.000017055000000000002 s |
0.98 |
scatter_sum / HLOOpt / cuda / PostRev |
0.000016831 s |
0.000016063999999999997 s |
1.05 |
scatter_sum / HLOOpt / cuda / BothRev |
0.000017472 s |
0.00001664 s |
1.05 |
scatter_sum / PartOpt / cuda / PreRev |
0.00001744 s |
0.000017152 s |
1.02 |
scatter_sum / PartOpt / cuda / PostRev |
0.000016736 s |
0.000016832 s |
0.99 |
scatter_sum / PartOpt / cuda / BothRev |
0.0000168 s |
0.000016927000000000002 s |
0.99 |
scatter_sum / IPartOpt / cuda / PreRev |
0.00001632 s |
0.00001696 s |
0.96 |
scatter_sum / IPartOpt / cuda / PostRev |
0.000016607 s |
0.000016768000000000003 s |
0.99 |
scatter_sum / IPartOpt / cuda / BothRev |
0.000016672 s |
0.000016608 s |
1.00 |
scatter_sum / DefOpt / cuda / PreRev |
0.000016703 s |
0.000017215 s |
0.97 |
scatter_sum / DefOpt / cuda / PostRev |
0.000016352 s |
0.000017024 s |
0.96 |
scatter_sum / DefOpt / cuda / BothRev |
0.000016736 s |
0.00001616 s |
1.04 |
scatter_sum / IDefOpt / cuda / PreRev |
0.00001696 s |
0.000016832 s |
1.01 |
scatter_sum / IDefOpt / cuda / PostRev |
0.000016128 s |
0.000016705 s |
0.97 |
scatter_sum / IDefOpt / cuda / BothRev |
0.000016353 s |
0.00001696 s |
0.96 |
scatter_sum / JaXPipe / tpu / Primal |
0.000001342825 s |
0.00000135085 s |
0.99 |
scatter_sum / Jax / tpu / Primal |
0.0000014135 s |
0.000001414475 s |
1.00 |
scatter_sum / HLOOpt / tpu / Primal |
0.000001352825 s |
0.00000136035 s |
0.99 |
scatter_sum / PartOpt / tpu / Primal |
0.0000014136 s |
0.000001414375 s |
1.00 |
scatter_sum / IPartOpt / tpu / Primal |
0.00000135215 s |
0.0000013597749999999998 s |
0.99 |
scatter_sum / DefOpt / tpu / Primal |
0.000001413525 s |
0.0000014149 s |
1.00 |
scatter_sum / IDefOpt / tpu / Primal |
0.000001351825 s |
0.0000013597 s |
0.99 |
scatter_sum / JaXPipe / tpu / Forward |
0.00000271625 s |
0.000002709275 s |
1.00 |
scatter_sum / Jax / tpu / Forward |
0.000002733775 s |
0.000002732525 s |
1.00 |
scatter_sum / HLOOpt / tpu / Forward |
0.0000027132 s |
0.00000270835 s |
1.00 |
scatter_sum / PartOpt / tpu / Forward |
0.0000027002 s |
0.00000269965 s |
1.00 |
scatter_sum / IPartOpt / tpu / Forward |
0.0000027129750000000005 s |
0.0000027152500000000004 s |
1.00 |
scatter_sum / DefOpt / tpu / Forward |
0.0000026986500000000004 s |
0.000002706425 s |
1.00 |
scatter_sum / IDefOpt / tpu / Forward |
0.0000027177 s |
0.0000027088250000000003 s |
1.00 |
scatter_sum / JaXPipe / tpu / PreRev |
0.000002688625 s |
0.000002704275 s |
0.99 |
scatter_sum / JaXPipe / tpu / PostRev |
0.0000026967000000000004 s |
0.000002697 s |
1.00 |
scatter_sum / JaXPipe / tpu / BothRev |
0.0000027138 s |
0.000002714475 s |
1.00 |
scatter_sum / Jax / tpu / BothRev |
0.0000027545 s |
0.00000274965 s |
1.00 |
scatter_sum / HLOOpt / tpu / PreRev |
0.000002710075 s |
0.00000272885 s |
0.99 |
scatter_sum / HLOOpt / tpu / PostRev |
0.0000027480500000000003 s |
0.000002756025 s |
1.00 |
scatter_sum / HLOOpt / tpu / BothRev |
0.00000271045 s |
0.00000272025 s |
1.00 |
scatter_sum / PartOpt / tpu / PreRev |
0.000002754775 s |
0.00000275605 s |
1.00 |
scatter_sum / PartOpt / tpu / PostRev |
0.0000027078 s |
0.0000027178 s |
1.00 |
scatter_sum / PartOpt / tpu / BothRev |
0.0000027567 s |
0.0000027538750000000004 s |
1.00 |
scatter_sum / IPartOpt / tpu / PreRev |
0.000002709025 s |
0.0000027201 s |
1.00 |
scatter_sum / IPartOpt / tpu / PostRev |
0.000002750975 s |
0.0000027556750000000005 s |
1.00 |
scatter_sum / IPartOpt / tpu / BothRev |
0.000002707225 s |
0.00000271685 s |
1.00 |
scatter_sum / DefOpt / tpu / PreRev |
0.0000027538750000000004 s |
0.00000274965 s |
1.00 |
scatter_sum / DefOpt / tpu / PostRev |
0.00000270455 s |
0.00000271375 s |
1.00 |
scatter_sum / DefOpt / tpu / BothRev |
0.000002749775 s |
0.000002750275 s |
1.00 |
scatter_sum / IDefOpt / tpu / PreRev |
0.00000270745 s |
0.0000027207 s |
1.00 |
scatter_sum / IDefOpt / tpu / PostRev |
0.000002752325 s |
0.0000027518 s |
1.00 |
scatter_sum / IDefOpt / tpu / BothRev |
0.000002710975 s |
0.000002717075 s |
1.00 |
scatter_sum / JaXPipe / cpu / Primal |
0.000019248000000000003 s |
0.0000080933799927152 s |
2.38 |
scatter_sum / Jax / cpu / Primal |
0.000018504 s |
0.00000773041997490509 s |
2.39 |
scatter_sum / HLOOpt / cpu / Primal |
0.000019428 s |
0.000010982679996232036 s |
1.77 |
scatter_sum / PartOpt / cpu / Primal |
0.000019185 s |
0.000007304119981199619 s |
2.63 |
scatter_sum / IPartOpt / cpu / Primal |
0.000018976 s |
0.000007588260004922631 s |
2.50 |
scatter_sum / DefOpt / cpu / Primal |
0.000019284 s |
0.000007340080019275774 s |
2.63 |
scatter_sum / IDefOpt / cpu / Primal |
0.00001875 s |
0.000007766280004943837 s |
2.41 |
scatter_sum / JaXPipe / cpu / Forward |
0.000027811 s |
0.000011774819968195516 s |
2.36 |
scatter_sum / Jax / cpu / Forward |
0.000026865 s |
0.00001104679997297353 s |
2.43 |
scatter_sum / HLOOpt / cpu / Forward |
0.000026818000000000003 s |
0.000016549519978070747 s |
1.62 |
scatter_sum / PartOpt / cpu / Forward |
0.000026831 s |
0.000016693700017640368 s |
1.61 |
scatter_sum / IPartOpt / cpu / Forward |
0.000026879 s |
0.00001139787999818509 s |
2.36 |
scatter_sum / DefOpt / cpu / Forward |
0.000026846 s |
0.00001691736004431732 s |
1.59 |
scatter_sum / IDefOpt / cpu / Forward |
0.000026784 s |
0.000011772519965234096 s |
2.28 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000027299 s |
0.000012000640008409392 s |
2.27 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000027858 s |
0.000011409239968998008 s |
2.44 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000027888 s |
0.000016204619996642577 s |
1.72 |
scatter_sum / Jax / cpu / BothRev |
0.000027651 s |
0.000011913580010514124 s |
2.32 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000027563 s |
0.000011849580014313687 s |
2.33 |
scatter_sum / HLOOpt / cpu / PostRev |
0.00002736 s |
0.00001634619997275877 s |
1.67 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000027457 s |
0.000013083580033708132 s |
2.10 |
scatter_sum / PartOpt / cpu / PreRev |
0.000026585 s |
0.000011277140010861333 s |
2.36 |
scatter_sum / PartOpt / cpu / PostRev |
0.000026972 s |
0.000011436099994170943 s |
2.36 |
scatter_sum / PartOpt / cpu / BothRev |
0.000027799 s |
0.000012017060034850149 s |
2.31 |
scatter_sum / IPartOpt / cpu / PreRev |
0.000027805 s |
0.00001792121997823415 s |
1.55 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000027316 s |
0.000011673999988488504 s |
2.34 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000027128 s |
0.000011753560038414434 s |
2.31 |
scatter_sum / DefOpt / cpu / PreRev |
0.000027201 s |
0.000012247980012034531 s |
2.22 |
scatter_sum / DefOpt / cpu / PostRev |
0.000027904000000000003 s |
0.00001137746000495099 s |
2.45 |
scatter_sum / DefOpt / cpu / BothRev |
0.000026927 s |
0.000011400659977880424 s |
2.36 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000027345 s |
0.00001144964000559412 s |
2.39 |
scatter_sum / IDefOpt / cpu / PostRev |
0.00002788 s |
0.00001141109995842271 s |
2.44 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000027385 s |
0.00001149439997789159 s |
2.38 |
slicing / JaXPipe / cpu / Primal |
0.000006716739999319544 s |
0.000007002580023254268 s |
0.96 |
slicing / Jax / cpu / Primal |
0.00000592898000832065 s |
0.00000600012003815209 s |
0.99 |
slicing / HLOOpt / cpu / Primal |
0.00001015080001707247 s |
0.000009939419978763908 s |
1.02 |
slicing / PartOpt / cpu / Primal |
0.000005991240013827337 s |
0.000006265100055315998 s |
0.96 |
slicing / IPartOpt / cpu / Primal |
0.000005967719998807297 s |
0.000006429720006053685 s |
0.93 |
slicing / DefOpt / cpu / Primal |
0.000010611260031510027 s |
0.000010525640009291236 s |
1.01 |
slicing / IDefOpt / cpu / Primal |
0.000006269440009418759 s |
0.000006683040001007612 s |
0.94 |
slicing / JaXPipe / cpu / Forward |
0.000009699180027382682 s |
0.00000964760001807008 s |
1.01 |
slicing / Jax / cpu / Forward |
0.000009746279965838769 s |
0.000010153560006074256 s |
0.96 |
slicing / HLOOpt / cpu / Forward |
0.000013895640013288355 s |
0.00001357187995381537 s |
1.02 |
slicing / PartOpt / cpu / Forward |
0.000014040380010555964 s |
0.000013793300013276166 s |
1.02 |
slicing / IPartOpt / cpu / Forward |
0.00000886831999196147 s |
0.000009253480029656205 s |
0.96 |
slicing / DefOpt / cpu / Forward |
0.000014229019980120938 s |
0.00001335547999588016 s |
1.07 |
slicing / IDefOpt / cpu / Forward |
0.000008901359960873378 s |
0.00000917934004064591 s |
0.97 |
slicing / JaXPipe / cpu / PreRev |
0.000010068740011774935 s |
0.000010112960035257857 s |
1.00 |
slicing / JaXPipe / cpu / PostRev |
0.000010061659950224566 s |
0.000009997219967772253 s |
1.01 |
slicing / JaXPipe / cpu / BothRev |
0.000013906099993619135 s |
0.000013674820002052 s |
1.02 |
slicing / Jax / cpu / BothRev |
0.000010186320014327068 s |
0.000010126220013262357 s |
1.01 |
slicing / HLOOpt / cpu / PreRev |
0.000009474439966652426 s |
0.00000982377998298034 s |
0.96 |
slicing / HLOOpt / cpu / PostRev |
0.00001028783995025151 s |
0.000010500840053282444 s |
0.98 |
slicing / HLOOpt / cpu / BothRev |
0.000011148860030516517 s |
0.00001144218002082198 s |
0.97 |
slicing / PartOpt / cpu / PreRev |
0.000009621040017009364 s |
0.00000988726000286988 s |
0.97 |
slicing / PartOpt / cpu / PostRev |
0.000010587219985609407 s |
0.000010319300008632126 s |
1.03 |
slicing / PartOpt / cpu / BothRev |
0.000010054600006697 s |
0.000009877820011752193 s |
1.02 |
slicing / IPartOpt / cpu / PreRev |
0.000014428679960474256 s |
0.000009871659958662347 s |
1.46 |
slicing / IPartOpt / cpu / PostRev |
0.000010026840018326764 s |
0.000010186019962930005 s |
0.98 |
slicing / IPartOpt / cpu / BothRev |
0.000010097339982166889 s |
0.00000961596000706777 s |
1.05 |
slicing / DefOpt / cpu / PreRev |
0.00000954265995460446 s |
0.000009731240015753427 s |
0.98 |
slicing / DefOpt / cpu / PostRev |
0.000010105219944307464 s |
0.00001040664000356628 s |
0.97 |
slicing / DefOpt / cpu / BothRev |
0.000010099879991685155 s |
0.00000972724000348535 s |
1.04 |
slicing / IDefOpt / cpu / PreRev |
0.00000962346000051184 s |
0.000009798860037335545 s |
0.98 |
slicing / IDefOpt / cpu / PostRev |
0.00000974014002167678 s |
0.000010772420000648709 s |
0.90 |
slicing / IDefOpt / cpu / BothRev |
0.00000984720002634276 s |
0.000010212100023636597 s |
0.96 |
slicing / JaXPipe / cuda / Primal |
0.000001888 s |
0.000001888 s |
1 |
slicing / Jax / cuda / Primal |
0.000001888 s |
0.000001888 s |
1 |
slicing / HLOOpt / cuda / Primal |
0.000001919 s |
0.000001919 s |
1 |
slicing / PartOpt / cuda / Primal |
0.000001889 s |
0.000001888 s |
1.00 |
slicing / IPartOpt / cuda / Primal |
0.000001888 s |
0.000001888 s |
1 |
slicing / DefOpt / cuda / Primal |
0.000001919 s |
0.0000019200000000000003 s |
1.00 |
slicing / IDefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
slicing / JaXPipe / cuda / Forward |
0.00000944 s |
0.000009376 s |
1.01 |
slicing / Jax / cuda / Forward |
0.000009856 s |
0.000009728 s |
1.01 |
slicing / HLOOpt / cuda / Forward |
0.00001008 s |
0.00000912 s |
1.11 |
slicing / PartOpt / cuda / Forward |
0.000009568 s |
0.00000976 s |
0.98 |
slicing / IPartOpt / cuda / Forward |
0.000009824 s |
0.00000944 s |
1.04 |
slicing / DefOpt / cuda / Forward |
0.000009568 s |
0.000009857 s |
0.97 |
slicing / IDefOpt / cuda / Forward |
0.000009024 s |
0.000009472 s |
0.95 |
slicing / JaXPipe / cuda / PreRev |
0.000009952 s |
0.000009633 s |
1.03 |
slicing / JaXPipe / cuda / PostRev |
0.00000992 s |
0.00001008 s |
0.98 |
slicing / JaXPipe / cuda / BothRev |
0.000010016 s |
0.00000992 s |
1.01 |
slicing / Jax / cuda / BothRev |
0.00000992 s |
0.000010016 s |
0.99 |
slicing / HLOOpt / cuda / PreRev |
0.000010272 s |
0.0000096 s |
1.07 |
slicing / HLOOpt / cuda / PostRev |
0.000009952 s |
0.000010081 s |
0.99 |
slicing / HLOOpt / cuda / BothRev |
0.000009888 s |
0.0000104 s |
0.95 |
slicing / PartOpt / cuda / PreRev |
0.000009984 s |
0.000010016 s |
1.00 |
slicing / PartOpt / cuda / PostRev |
0.000009888 s |
0.000009952 s |
0.99 |
slicing / PartOpt / cuda / BothRev |
0.000010208 s |
0.000010048 s |
1.02 |
slicing / IPartOpt / cuda / PreRev |
0.000009984 s |
0.00001008 s |
0.99 |
slicing / IPartOpt / cuda / PostRev |
0.000009888 s |
0.000009632 s |
1.03 |
slicing / IPartOpt / cuda / BothRev |
0.000010112 s |
0.000009889 s |
1.02 |
slicing / DefOpt / cuda / PreRev |
0.000009823 s |
0.000010016 s |
0.98 |
slicing / DefOpt / cuda / PostRev |
0.000009697 s |
0.000009536 s |
1.02 |
slicing / DefOpt / cuda / BothRev |
0.0000096 s |
0.000010112 s |
0.95 |
slicing / IDefOpt / cuda / PreRev |
0.000010208 s |
0.000010208 s |
1 |
slicing / IDefOpt / cuda / PostRev |
0.00000976 s |
0.000010048 s |
0.97 |
slicing / IDefOpt / cuda / BothRev |
0.00000992 s |
0.000011104 s |
0.89 |
slicing / JaXPipe / tpu / Primal |
0.000001024 s |
0.000001022125 s |
1.00 |
slicing / Jax / tpu / Primal |
9.74075e-7 s |
9.83425e-7 s |
0.99 |
slicing / HLOOpt / tpu / Primal |
0.0000010366 s |
0.000001026025 s |
1.01 |
slicing / PartOpt / tpu / Primal |
9.7105e-7 s |
9.68325e-7 s |
1.00 |
slicing / IPartOpt / tpu / Primal |
0.000001037825 s |
0.00000102485 s |
1.01 |
slicing / DefOpt / tpu / Primal |
9.72025e-7 s |
9.74625e-7 s |
1.00 |
slicing / IDefOpt / tpu / Primal |
0.000001028425 s |
0.000001027875 s |
1.00 |
slicing / JaXPipe / tpu / Forward |
0.00000141255 s |
0.000001408975 s |
1.00 |
slicing / Jax / tpu / Forward |
0.000001475275 s |
0.0000014854 s |
0.99 |
slicing / HLOOpt / tpu / Forward |
0.0000015242000000000002 s |
0.0000015198999999999998 s |
1.00 |
slicing / PartOpt / tpu / Forward |
0.0000015037749999999998 s |
0.000001496775 s |
1.00 |
slicing / IPartOpt / tpu / Forward |
0.000001523425 s |
0.0000015180250000000002 s |
1.00 |
slicing / DefOpt / tpu / Forward |
0.000001495875 s |
0.00000149955 s |
1.00 |
slicing / IDefOpt / tpu / Forward |
0.00000152015 s |
0.00000152035 s |
1.00 |
slicing / JaXPipe / tpu / PreRev |
0.0000025674 s |
0.000002570325 s |
1.00 |
slicing / JaXPipe / tpu / PostRev |
0.000002519825 s |
0.0000025251 s |
1.00 |
slicing / JaXPipe / tpu / BothRev |
0.000002578775 s |
0.00000258565 s |
1.00 |
slicing / Jax / tpu / BothRev |
0.000002530825 s |
0.00000254905 s |
0.99 |
slicing / HLOOpt / tpu / PreRev |
0.0000025881250000000003 s |
0.000002579625 s |
1.00 |
slicing / HLOOpt / tpu / PostRev |
0.0000025354500000000004 s |
0.000002538575 s |
1.00 |
slicing / HLOOpt / tpu / BothRev |
0.00000258575 s |
0.0000025944750000000003 s |
1.00 |
slicing / PartOpt / tpu / PreRev |
0.0000025413750000000003 s |
0.00000254365 s |
1.00 |
slicing / PartOpt / tpu / PostRev |
0.00000258345 s |
0.000002580025 s |
1.00 |
slicing / PartOpt / tpu / BothRev |
0.0000025367250000000003 s |
0.000002537975 s |
1.00 |
slicing / IPartOpt / tpu / PreRev |
0.0000025880750000000006 s |
0.00000258125 s |
1.00 |
slicing / IPartOpt / tpu / PostRev |
0.0000025395250000000005 s |
0.000002543975 s |
1.00 |
slicing / IPartOpt / tpu / BothRev |
0.0000025823 s |
0.0000025812 s |
1.00 |
slicing / DefOpt / tpu / PreRev |
0.000002547775 s |
0.000002536625 s |
1.00 |
slicing / DefOpt / tpu / PostRev |
0.000002575075 s |
0.00000258605 s |
1.00 |
slicing / DefOpt / tpu / BothRev |
0.00000253825 s |
0.000002543825 s |
1.00 |
slicing / IDefOpt / tpu / PreRev |
0.000002592275 s |
0.0000025917750000000003 s |
1.00 |
slicing / IDefOpt / tpu / PostRev |
0.000002537275 s |
0.000002549775 s |
1.00 |
slicing / IDefOpt / tpu / BothRev |
0.0000025813 s |
0.000002588275 s |
1.00 |
slicing / JaXPipe / cpu / Primal |
0.000015689000000000002 s |
0.000007002580023254268 s |
2.24 |
slicing / Jax / cpu / Primal |
0.000015109 s |
0.00000600012003815209 s |
2.52 |
slicing / HLOOpt / cpu / Primal |
0.000015185 s |
0.000009939419978763908 s |
1.53 |
slicing / PartOpt / cpu / Primal |
0.000015294 s |
0.000006265100055315998 s |
2.44 |
slicing / IPartOpt / cpu / Primal |
0.000015533 s |
0.000006429720006053685 s |
2.42 |
slicing / DefOpt / cpu / Primal |
0.000015465 s |
0.000010525640009291236 s |
1.47 |
slicing / IDefOpt / cpu / Primal |
0.000015288 s |
0.000006683040001007612 s |
2.29 |
slicing / JaXPipe / cpu / Forward |
0.00002067 s |
0.00000964760001807008 s |
2.14 |
slicing / Jax / cpu / Forward |
0.000020265 s |
0.000010153560006074256 s |
2.00 |
slicing / HLOOpt / cpu / Forward |
0.000020623 s |
0.00001357187995381537 s |
1.52 |
slicing / PartOpt / cpu / Forward |
0.000020228 s |
0.000013793300013276166 s |
1.47 |
slicing / IPartOpt / cpu / Forward |
0.000020157 s |
0.000009253480029656205 s |
2.18 |
slicing / DefOpt / cpu / Forward |
0.000020287 s |
0.00001335547999588016 s |
1.52 |
slicing / IDefOpt / cpu / Forward |
0.000020499 s |
0.00000917934004064591 s |
2.23 |
slicing / JaXPipe / cpu / PreRev |
0.000021722000000000003 s |
0.000010112960035257857 s |
2.15 |
slicing / JaXPipe / cpu / PostRev |
0.000021033 s |
0.000009997219967772253 s |
2.10 |
slicing / JaXPipe / cpu / BothRev |
0.000021313 s |
0.000013674820002052 s |
1.56 |
slicing / Jax / cpu / BothRev |
0.000021313 s |
0.000010126220013262357 s |
2.10 |
slicing / HLOOpt / cpu / PreRev |
0.00002168 s |
0.00000982377998298034 s |
2.21 |
slicing / HLOOpt / cpu / PostRev |
0.000021214 s |
0.000010500840053282444 s |
2.02 |
slicing / HLOOpt / cpu / BothRev |
0.000021834 s |
0.00001144218002082198 s |
1.91 |
slicing / PartOpt / cpu / PreRev |
0.000021129 s |
0.00000988726000286988 s |
2.14 |
slicing / PartOpt / cpu / PostRev |
0.000021519 s |
0.000010319300008632126 s |
2.09 |
slicing / PartOpt / cpu / BothRev |
0.000021453 s |
0.000009877820011752193 s |
2.17 |
slicing / IPartOpt / cpu / PreRev |
0.000021369 s |
0.000009871659958662347 s |
2.16 |
slicing / IPartOpt / cpu / PostRev |
0.000021425 s |
0.000010186019962930005 s |
2.10 |
slicing / IPartOpt / cpu / BothRev |
0.000021672 s |
0.00000961596000706777 s |
2.25 |
slicing / DefOpt / cpu / PreRev |
0.000021211 s |
0.000009731240015753427 s |
2.18 |
slicing / DefOpt / cpu / PostRev |
0.000021285 s |
0.00001040664000356628 s |
2.05 |
slicing / DefOpt / cpu / BothRev |
0.000033042 s |
0.00000972724000348535 s |
3.40 |
slicing / IDefOpt / cpu / PreRev |
0.000021184 s |
0.000009798860037335545 s |
2.16 |
slicing / IDefOpt / cpu / PostRev |
0.000021409 s |
0.000010772420000648709 s |
1.99 |
slicing / IDefOpt / cpu / BothRev |
0.000021161 s |
0.000010212100023636597 s |
2.07 |
sum / JaXPipe / cpu / Primal |
0.000008477319997837185 s |
0.00000784006000685622 s |
1.08 |
sum / Jax / cpu / Primal |
0.000007931499940241338 s |
0.000007316700002775179 s |
1.08 |
sum / HLOOpt / cpu / Primal |
0.000011924299969905406 s |
0.000010999159976563532 s |
1.08 |
sum / PartOpt / cpu / Primal |
0.000007570420020783786 s |
0.00000762464000217733 s |
0.99 |
sum / IPartOpt / cpu / Primal |
0.000007981140006450005 s |
0.00000783147999754874 s |
1.02 |
sum / DefOpt / cpu / Primal |
0.000012293260024307528 s |
0.000011770680039262516 s |
1.04 |
sum / IDefOpt / cpu / Primal |
0.000007645960022273356 s |
0.000007909559990366688 s |
0.97 |
sum / JaXPipe / cpu / Forward |
0.00001122850000683684 s |
0.000011585400006879352 s |
0.97 |
sum / Jax / cpu / Forward |
0.000010824320006577182 s |
0.0000113637000140443 s |
0.95 |
sum / HLOOpt / cpu / Forward |
0.000016411979995609727 s |
0.000016045620050135767 s |
1.02 |
sum / PartOpt / cpu / Forward |
0.00001554558001771511 s |
0.000016026719968067483 s |
0.97 |
sum / IPartOpt / cpu / Forward |
0.000011276080031166202 s |
0.000010967219986923738 s |
1.03 |
sum / DefOpt / cpu / Forward |
0.000016322839992426453 s |
0.000015576400019199356 s |
1.05 |
sum / IDefOpt / cpu / Forward |
0.000011488020036267698 s |
0.000011566979983399506 s |
0.99 |
sum / JaXPipe / cpu / PreRev |
0.000011544139997567982 s |
0.00001090174001546984 s |
1.06 |
sum / JaXPipe / cpu / PostRev |
0.000011101579948444853 s |
0.000011536099991644733 s |
0.96 |
sum / JaXPipe / cpu / BothRev |
0.00001122014000429772 s |
0.000010744080000222312 s |
1.04 |
sum / Jax / cpu / BothRev |
0.00001140031998147606 s |
0.000011325500045131776 s |
1.01 |
sum / HLOOpt / cpu / PreRev |
0.000010597039972708444 s |
0.000010893339995163842 s |
0.97 |
sum / HLOOpt / cpu / PostRev |
0.000015284199971574708 s |
0.000014673560008304775 s |
1.04 |
sum / HLOOpt / cpu / BothRev |
0.000012698499995167367 s |
0.000012468359982449328 s |
1.02 |
sum / PartOpt / cpu / PreRev |
0.000010363100000176927 s |
0.000010480200025995145 s |
0.99 |
sum / PartOpt / cpu / PostRev |
0.000010984059999827875 s |
0.000010896019994106607 s |
1.01 |
sum / PartOpt / cpu / BothRev |
0.000010877179984163376 s |
0.000010456680047354894 s |
1.04 |
sum / IPartOpt / cpu / PreRev |
0.000010818119953910354 s |
0.000010520199975871946 s |
1.03 |
sum / IPartOpt / cpu / PostRev |
0.000011132939989693114 s |
0.00001105634001760336 s |
1.01 |
sum / IPartOpt / cpu / BothRev |
0.000010372919996370913 s |
0.000010889280029005022 s |
0.95 |
sum / DefOpt / cpu / PreRev |
0.00001091736000489618 s |
0.000010331519979445149 s |
1.06 |
sum / DefOpt / cpu / PostRev |
0.000010614139964673088 s |
0.000010265839946441702 s |
1.03 |
sum / DefOpt / cpu / BothRev |
0.000010385520008640014 s |
0.000010974580018228153 s |
0.95 |
sum / IDefOpt / cpu / PreRev |
0.000010924219977823669 s |
0.000010357479986851103 s |
1.05 |
sum / IDefOpt / cpu / PostRev |
0.00001105566003388958 s |
0.000010391179985163035 s |
1.06 |
sum / IDefOpt / cpu / BothRev |
0.00001053876002515608 s |
0.00001077867999811133 s |
0.98 |
sum / JaXPipe / cuda / Primal |
0.00000208 s |
0.00000208 s |
1 |
sum / Jax / cuda / Primal |
0.00000208 s |
0.00000208 s |
1 |
sum / HLOOpt / cuda / Primal |
0.00000208 s |
0.00000208 s |
1 |
sum / PartOpt / cuda / Primal |
0.00000208 s |
0.00000208 s |
1 |
sum / IPartOpt / cuda / Primal |
0.00000208 s |
0.00000208 s |
1 |
sum / DefOpt / cuda / Primal |
0.00000208 s |
0.00000208 s |
1 |
sum / IDefOpt / cuda / Primal |
0.000002079 s |
0.00000208 s |
1.00 |
sum / JaXPipe / cuda / Forward |
0.000010048 s |
0.00001008 s |
1.00 |
sum / Jax / cuda / Forward |
0.000009952 s |
0.000009983 s |
1.00 |
sum / HLOOpt / cuda / Forward |
0.000009888 s |
0.000010368 s |
0.95 |
sum / PartOpt / cuda / Forward |
0.000010112 s |
0.000010176 s |
0.99 |
sum / IPartOpt / cuda / Forward |
0.000009471 s |
0.000010209 s |
0.93 |
sum / DefOpt / cuda / Forward |
0.000010016 s |
0.0000104 s |
0.96 |
sum / IDefOpt / cuda / Forward |
0.000012416 s |
0.000009856 s |
1.26 |
sum / JaXPipe / cuda / PreRev |
0.00000976 s |
0.00000976 s |
1 |
sum / JaXPipe / cuda / PostRev |
0.000009888 s |
0.000010048 s |
0.98 |
sum / JaXPipe / cuda / BothRev |
0.000009215 s |
0.00001008 s |
0.91 |
sum / Jax / cuda / BothRev |
0.000009984 s |
0.000009824 s |
1.02 |
sum / HLOOpt / cuda / PreRev |
0.000009568 s |
0.00000912 s |
1.05 |
sum / HLOOpt / cuda / PostRev |
0.000009472 s |
0.000009567 s |
0.99 |
sum / HLOOpt / cuda / BothRev |
0.000009376 s |
0.000010016 s |
0.94 |
sum / PartOpt / cuda / PreRev |
0.000009568 s |
0.000009664 s |
0.99 |
sum / PartOpt / cuda / PostRev |
0.000009536 s |
0.000010016 s |
0.95 |
sum / PartOpt / cuda / BothRev |
0.000009984 s |
0.00001008 s |
0.99 |
sum / IPartOpt / cuda / PreRev |
0.00000976 s |
0.000009729 s |
1.00 |
sum / IPartOpt / cuda / PostRev |
0.000009728 s |
0.000009984 s |
0.97 |
sum / IPartOpt / cuda / BothRev |
0.000009696 s |
0.00000944 s |
1.03 |
sum / DefOpt / cuda / PreRev |
0.00000944 s |
0.000009632 s |
0.98 |
sum / DefOpt / cuda / PostRev |
0.000009695 s |
0.00000992 s |
0.98 |
sum / DefOpt / cuda / BothRev |
0.000009376 s |
0.000009408 s |
1.00 |
sum / IDefOpt / cuda / PreRev |
0.000009504 s |
0.000009728 s |
0.98 |
sum / IDefOpt / cuda / PostRev |
0.000009792 s |
0.000009792 s |
1 |
sum / IDefOpt / cuda / BothRev |
0.000009472 s |
0.000010048 s |
0.94 |
sum / JaXPipe / tpu / Primal |
5.10075e-7 s |
5.10625e-7 s |
1.00 |
sum / Jax / tpu / Primal |
5.580749999999999e-7 s |
5.588000000000001e-7 s |
1.00 |
sum / HLOOpt / tpu / Primal |
5.210749999999999e-7 s |
5.2155e-7 s |
1.00 |
sum / PartOpt / tpu / Primal |
5.5785e-7 s |
5.58175e-7 s |
1.00 |
sum / IPartOpt / tpu / Primal |
5.24325e-7 s |
5.215e-7 s |
1.01 |
sum / DefOpt / tpu / Primal |
5.58175e-7 s |
5.5785e-7 s |
1.00 |
sum / IDefOpt / tpu / Primal |
5.212749999999999e-7 s |
5.21375e-7 s |
1.00 |
sum / JaXPipe / tpu / Forward |
0.0000015494250000000002 s |
0.000001560875 s |
0.99 |
sum / Jax / tpu / Forward |
0.000001495775 s |
0.0000015087 s |
0.99 |
sum / HLOOpt / tpu / Forward |
0.0000015278 s |
0.0000015303 s |
1.00 |
sum / PartOpt / tpu / Forward |
0.0000014937 s |
0.000001495275 s |
1.00 |
sum / IPartOpt / tpu / Forward |
0.0000015276999999999998 s |
0.00000153475 s |
1.00 |
sum / DefOpt / tpu / Forward |
0.000001492675 s |
0.000001499025 s |
1.00 |
sum / IDefOpt / tpu / Forward |
0.000001527575 s |
0.000001527325 s |
1.00 |
sum / JaXPipe / tpu / PreRev |
0.0000010494 s |
0.000001049075 s |
1.00 |
sum / JaXPipe / tpu / PostRev |
0.000001089375 s |
0.0000010885999999999998 s |
1.00 |
sum / JaXPipe / tpu / BothRev |
0.000001048975 s |
0.0000010480000000000002 s |
1.00 |
sum / Jax / tpu / BothRev |
0.0000010938 s |
0.000001091025 s |
1.00 |
sum / HLOOpt / tpu / PreRev |
0.0000010491999999999998 s |
0.00000105175 s |
1.00 |
sum / HLOOpt / tpu / PostRev |
0.00000108985 s |
0.00000108495 s |
1.00 |
sum / HLOOpt / tpu / BothRev |
0.0000010468 s |
0.000001052325 s |
0.99 |
sum / PartOpt / tpu / PreRev |
0.000001083725 s |
0.0000010978 s |
0.99 |
sum / PartOpt / tpu / PostRev |
0.00000105685 s |
0.000001062175 s |
0.99 |
sum / PartOpt / tpu / BothRev |
0.0000010948749999999998 s |
0.0000010918 s |
1.00 |
sum / IPartOpt / tpu / PreRev |
0.000001053325 s |
0.00000106675 s |
0.99 |
sum / IPartOpt / tpu / PostRev |
0.00000109635 s |
0.0000010903 s |
1.01 |
sum / IPartOpt / tpu / BothRev |
0.000001064525 s |
0.000001048225 s |
1.02 |
sum / DefOpt / tpu / PreRev |
0.000001094325 s |
0.00000110485 s |
0.99 |
sum / DefOpt / tpu / PostRev |
0.0000010501249999999998 s |
0.0000010486 s |
1.00 |
sum / DefOpt / tpu / BothRev |
0.00000109 s |
0.000001090675 s |
1.00 |
sum / IDefOpt / tpu / PreRev |
0.0000010482 s |
0.000001056775 s |
0.99 |
sum / IDefOpt / tpu / PostRev |
0.000001086775 s |
0.000001095125 s |
0.99 |
sum / IDefOpt / tpu / BothRev |
0.0000010505499999999998 s |
0.000001049075 s |
1.00 |
sum / JaXPipe / cpu / Primal |
0.000017732999999999998 s |
0.00000784006000685622 s |
2.26 |
sum / Jax / cpu / Primal |
0.000018069000000000003 s |
0.000007316700002775179 s |
2.47 |
sum / HLOOpt / cpu / Primal |
0.000017887 s |
0.000010999159976563532 s |
1.63 |
sum / PartOpt / cpu / Primal |
0.000017638 s |
0.00000762464000217733 s |
2.31 |
sum / IPartOpt / cpu / Primal |
0.00001797 s |
0.00000783147999754874 s |
2.29 |
sum / DefOpt / cpu / Primal |
0.000017839 s |
0.000011770680039262516 s |
1.52 |
sum / IDefOpt / cpu / Primal |
0.000017911 s |
0.000007909559990366688 s |
2.26 |
sum / JaXPipe / cpu / Forward |
0.000024515 s |
0.000011585400006879352 s |
2.12 |
sum / Jax / cpu / Forward |
0.000023883 s |
0.0000113637000140443 s |
2.10 |
sum / HLOOpt / cpu / Forward |
0.00002416 s |
0.000016045620050135767 s |
1.51 |
sum / PartOpt / cpu / Forward |
0.000024595 s |
0.000016026719968067483 s |
1.53 |
sum / IPartOpt / cpu / Forward |
0.000024258 s |
0.000010967219986923738 s |
2.21 |
sum / DefOpt / cpu / Forward |
0.000024408 s |
0.000015576400019199356 s |
1.57 |
sum / IDefOpt / cpu / Forward |
0.000024267 s |
0.000011566979983399506 s |
2.10 |
sum / JaXPipe / cpu / PreRev |
0.000023244 s |
0.00001090174001546984 s |
2.13 |
sum / JaXPipe / cpu / PostRev |
0.00002293 s |
0.000011536099991644733 s |
1.99 |
sum / JaXPipe / cpu / BothRev |
0.000023313 s |
0.000010744080000222312 s |
2.17 |
sum / Jax / cpu / BothRev |
0.000023085 s |
0.000011325500045131776 s |
2.04 |
sum / HLOOpt / cpu / PreRev |
0.000023427 s |
0.000010893339995163842 s |
2.15 |
sum / HLOOpt / cpu / PostRev |
0.000023423 s |
0.000014673560008304775 s |
1.60 |
sum / HLOOpt / cpu / BothRev |
0.000023585 s |
0.000012468359982449328 s |
1.89 |
sum / PartOpt / cpu / PreRev |
0.000023511 s |
0.000010480200025995145 s |
2.24 |
sum / PartOpt / cpu / PostRev |
0.000023625 s |
0.000010896019994106607 s |
2.17 |
sum / PartOpt / cpu / BothRev |
0.000023114 s |
0.000010456680047354894 s |
2.21 |
sum / IPartOpt / cpu / PreRev |
0.000023069 s |
0.000010520199975871946 s |
2.19 |
sum / IPartOpt / cpu / PostRev |
0.000023356 s |
0.00001105634001760336 s |
2.11 |
sum / IPartOpt / cpu / BothRev |
0.00002318 s |
0.000010889280029005022 s |
2.13 |
sum / DefOpt / cpu / PreRev |
0.000022857 s |
0.000010331519979445149 s |
2.21 |
sum / DefOpt / cpu / PostRev |
0.000023348 s |
0.000010265839946441702 s |
2.27 |
sum / DefOpt / cpu / BothRev |
0.000023691 s |
0.000010974580018228153 s |
2.16 |
sum / IDefOpt / cpu / PreRev |
0.000022476 s |
0.000010357479986851103 s |
2.17 |
sum / IDefOpt / cpu / PostRev |
0.000024012 s |
0.000010391179985163035 s |
2.31 |
sum / IDefOpt / cpu / BothRev |
0.00002375 s |
0.00001077867999811133 s |
2.20 |
value_and_grad / JaXPipe / cpu / Primal |
0.000014494239976556856 s |
0.000014128279972283053 s |
1.03 |
value_and_grad / Jax / cpu / Primal |
0.000014001580002513948 s |
0.000014739260004716923 s |
0.95 |
value_and_grad / HLOOpt / cpu / Primal |
0.000013488899958247202 s |
0.000013682799981324932 s |
0.99 |
value_and_grad / PartOpt / cpu / Primal |
0.00001410103997841361 s |
0.000013681440013897371 s |
1.03 |
value_and_grad / IPartOpt / cpu / Primal |
0.00001366767999570584 s |
0.000013595520003946147 s |
1.01 |
value_and_grad / DefOpt / cpu / Primal |
0.000013524939968192484 s |
0.000013901780012020026 s |
0.97 |
value_and_grad / IDefOpt / cpu / Primal |
0.00001371530001051724 s |
0.00001404642001944012 s |
0.98 |
value_and_grad / JaXPipe / cuda / Primal |
0.000033119000000000006 s |
0.000033249 s |
1.00 |
value_and_grad / Jax / cuda / Primal |
0.000032927 s |
0.000032832 s |
1.00 |
value_and_grad / HLOOpt / cuda / Primal |
0.000032576 s |
0.00003328 s |
0.98 |
value_and_grad / PartOpt / cuda / Primal |
0.000032351 s |
0.000032896000000000005 s |
0.98 |
value_and_grad / IPartOpt / cuda / Primal |
0.000033024 s |
0.000032992 s |
1.00 |
value_and_grad / DefOpt / cuda / Primal |
0.00003248 s |
0.000032512 s |
1.00 |
value_and_grad / IDefOpt / cuda / Primal |
0.000033055 s |
0.000032832 s |
1.01 |
value_and_grad / JaXPipe / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / Jax / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / HLOOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / PartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IPartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / DefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IDefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / JaXPipe / cpu / Primal |
0.000042013 s |
0.000014128279972283053 s |
2.97 |
value_and_grad / Jax / cpu / Primal |
0.000027322 s |
0.000014739260004716923 s |
1.85 |
value_and_grad / HLOOpt / cpu / Primal |
0.0000279 s |
0.000013682799981324932 s |
2.04 |
value_and_grad / PartOpt / cpu / Primal |
0.000027473 s |
0.000013681440013897371 s |
2.01 |
value_and_grad / IPartOpt / cpu / Primal |
0.000027779000000000003 s |
0.000013595520003946147 s |
2.04 |
value_and_grad / DefOpt / cpu / Primal |
0.000027589 s |
0.000013901780012020026 s |
1.98 |
value_and_grad / IDefOpt / cpu / Primal |
0.00002723 s |
0.00001404642001944012 s |
1.94 |
jaxmd20 / JaXPipe / cuda / Primal |
0.001468535 s |
0.0015381709999999 s |
0.95 |
jaxmd20 / Jax / cuda / Primal |
0.0014484379999999 s |
0.001580476 s |
0.92 |
jaxmd20 / HLOOpt / cuda / Primal |
0.001142072 s |
0.001078782 s |
1.06 |
jaxmd20 / PartOpt / cuda / Primal |
0.001283383 s |
0.001315996 s |
0.98 |
jaxmd20 / IPartOpt / cuda / Primal |
0.0013215269999999 s |
0.001333052 s |
0.99 |
jaxmd20 / DefOpt / cuda / Primal |
0.000520124 s |
0.000523038 s |
0.99 |
jaxmd20 / IDefOpt / cuda / Primal |
0.000514588 s |
0.000492254 s |
1.05 |
jaxmd20 / JaXPipe / cuda / Forward |
0.000828187 s |
0.000816734 s |
1.01 |
jaxmd20 / Jax / cuda / Forward |
0.001789269 s |
0.001811772 s |
0.99 |
jaxmd20 / HLOOpt / cuda / Forward |
0.000829979 s |
0.000823358 s |
1.01 |
jaxmd20 / PartOpt / cuda / Forward |
0.000817563 s |
0.0008217259999999 s |
0.99 |
jaxmd20 / IPartOpt / cuda / Forward |
0.000884282 s |
0.000826653 s |
1.07 |
jaxmd20 / DefOpt / cuda / Forward |
0.000834523 s |
0.000815997 s |
1.02 |
jaxmd20 / IDefOpt / cuda / Forward |
0.000815099 s |
0.000817886 s |
1.00 |
jaxmd20 / JaXPipe / cuda / PreRev |
0.00166495 s |
0.001678363 s |
0.99 |
jaxmd20 / JaXPipe / cuda / PostRev |
0.005326971 s |
0.005320914 s |
1.00 |
jaxmd20 / JaXPipe / cuda / BothRev |
0.001658197 s |
0.001663451 s |
1.00 |
jaxmd20 / Jax / cuda / BothRev |
0.005293977 s |
0.005268499 s |
1.00 |
jaxmd20 / HLOOpt / cuda / PreRev |
0.001711731 s |
0.001726523 s |
0.99 |
jaxmd20 / HLOOpt / cuda / PostRev |
0.005219067 s |
0.00516002 s |
1.01 |
jaxmd20 / HLOOpt / cuda / BothRev |
0.00165666 s |
0.001661531 s |
1.00 |
jaxmd20 / PartOpt / cuda / PreRev |
0.00175567 s |
0.001712668 s |
1.03 |
jaxmd20 / PartOpt / cuda / PostRev |
0.005410015 s |
0.005340304 s |
1.01 |
jaxmd20 / PartOpt / cuda / BothRev |
0.00166738 s |
0.001663068 s |
1.00 |
jaxmd20 / IPartOpt / cuda / PreRev |
0.0017363699999999 s |
0.001705212 s |
1.02 |
jaxmd20 / IPartOpt / cuda / PostRev |
0.0054033879999999 s |
0.0053764979999999 s |
1.01 |
jaxmd20 / IPartOpt / cuda / BothRev |
0.001664406 s |
0.001627164 s |
1.02 |
jaxmd20 / DefOpt / cuda / PreRev |
0.00174658 s |
0.0017215309999999 s |
1.01 |
jaxmd20 / DefOpt / cuda / PostRev |
0.002723662 s |
0.0027170809999999 s |
1.00 |
jaxmd20 / DefOpt / cuda / BothRev |
0.001649492 s |
0.001738428 s |
0.95 |
jaxmd20 / IDefOpt / cuda / PreRev |
0.001738804 s |
0.00173222 s |
1.00 |
jaxmd20 / IDefOpt / cuda / PostRev |
0.001977362 s |
0.001977914 s |
1.00 |
jaxmd20 / IDefOpt / cuda / BothRev |
0.00165698 s |
0.001666619 s |
0.99 |
jaxmd20 / JaXPipe / tpu / Primal |
0.009277995 s |
0.0092832325 s |
1.00 |
jaxmd20 / Jax / tpu / Primal |
0.00927855375 s |
0.00926823125 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Primal |
0.00917455625 s |
0.009165471875 s |
1.00 |
jaxmd20 / PartOpt / tpu / Primal |
0.0091971925 s |
0.009195720625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Primal |
0.0092001 s |
0.00920227 s |
1.00 |
jaxmd20 / DefOpt / tpu / Primal |
0.008749179375 s |
0.008745300625 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Primal |
0.00863421875 s |
0.008635025625 s |
1.00 |
jaxmd20 / JaXPipe / tpu / Forward |
0.01726184 s |
0.017267543125 s |
1.00 |
jaxmd20 / Jax / tpu / Forward |
0.01875432375 s |
0.018740548125 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Forward |
0.017236080625 s |
0.01723612625 s |
1.00 |
jaxmd20 / PartOpt / tpu / Forward |
0.017263274375 s |
0.01726744625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Forward |
0.017268051875 s |
0.017262624375 s |
1.00 |
jaxmd20 / DefOpt / tpu / Forward |
0.01725859 s |
0.01726469125 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Forward |
0.017266670625 s |
0.017266065625 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PreRev |
0.025350906875 s |
0.025341306875 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PostRev |
0.021560056875 s |
0.021869649375 s |
0.99 |
jaxmd20 / JaXPipe / tpu / BothRev |
0.025341431875 s |
0.02535814125 s |
1.00 |
jaxmd20 / Jax / tpu / BothRev |
0.021876835 s |
0.021872894375 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PreRev |
0.02533986625 s |
0.0253574406249999 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PostRev |
0.0209853275 s |
0.02097280625 s |
1.00 |
jaxmd20 / HLOOpt / tpu / BothRev |
0.025255084375 s |
0.0252753712499999 s |
1.00 |
jaxmd20 / PartOpt / tpu / PreRev |
0.0253586406249999 s |
0.0253516543749999 s |
1.00 |
jaxmd20 / PartOpt / tpu / PostRev |
0.021514848125 s |
0.02152932625 s |
1.00 |
jaxmd20 / PartOpt / tpu / BothRev |
0.025273721875 s |
0.025272335625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PreRev |
0.0253291575 s |
0.025349014375 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PostRev |
0.021520535 s |
0.021520705625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / BothRev |
0.025251193125 s |
0.025271966875 s |
1.00 |
jaxmd20 / DefOpt / tpu / PreRev |
0.0253626824999999 s |
0.025350971875 s |
1.00 |
jaxmd20 / DefOpt / tpu / PostRev |
0.0187649318749999 s |
0.018796156875 s |
1.00 |
jaxmd20 / DefOpt / tpu / BothRev |
0.025278385 s |
0.025270051875 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PreRev |
0.0253348375 s |
0.025350833125 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PostRev |
0.01809764875 s |
0.01810748375 s |
1.00 |
jaxmd20 / IDefOpt / tpu / BothRev |
0.02525376375 s |
0.02526988125 s |
1.00 |
jaxmd40 / JaXPipe / cpu / Primal |
0.0895589899999999 s |
0.064794319 s |
1.38 |
jaxmd40 / Jax / cpu / Primal |
0.070354388 s |
0.060799723 s |
1.16 |
jaxmd40 / HLOOpt / cpu / Primal |
0.0988557229999999 s |
0.084688236 s |
1.17 |
jaxmd40 / PartOpt / cpu / Primal |
0.089056278 s |
0.064421721 s |
1.38 |
jaxmd40 / IPartOpt / cpu / Primal |
0.0858872449999999 s |
0.070357754 s |
1.22 |
jaxmd40 / DefOpt / cpu / Primal |
0.1080280999999999 s |
0.086803284 s |
1.24 |
jaxmd40 / IDefOpt / cpu / Primal |
0.097163111 s |
0.079787198 s |
1.22 |
jaxmd40 / JaXPipe / cpu / Forward |
0.1999315599999999 s |
0.155879245 s |
1.28 |
jaxmd40 / Jax / cpu / Forward |
0.106891377 s |
0.08651161 s |
1.24 |
jaxmd40 / HLOOpt / cpu / Forward |
0.1949410939999999 s |
0.159918111 s |
1.22 |
jaxmd40 / PartOpt / cpu / Forward |
0.197680442 s |
0.157475402 s |
1.26 |
jaxmd40 / IPartOpt / cpu / Forward |
0.205484542 s |
0.156772428 s |
1.31 |
jaxmd40 / DefOpt / cpu / Forward |
0.202310212 s |
0.167871725 s |
1.21 |
jaxmd40 / IDefOpt / cpu / Forward |
0.20375368 s |
0.15935788 s |
1.28 |
jaxmd40 / JaXPipe / cpu / PreRev |
0.256641134 s |
0.229345345 s |
1.12 |
jaxmd40 / JaXPipe / cpu / PostRev |
0.155128748 s |
0.138166208 s |
1.12 |
jaxmd40 / JaXPipe / cpu / BothRev |
0.251759801 s |
0.220160343 s |
1.14 |
jaxmd40 / Jax / cpu / BothRev |
0.178771298 s |
0.134181835 s |
1.33 |
jaxmd40 / HLOOpt / cpu / PreRev |
0.248316298 s |
0.207645172 s |
1.20 |
jaxmd40 / HLOOpt / cpu / PostRev |
0.213463205 s |
0.174557981 s |
1.22 |
jaxmd40 / HLOOpt / cpu / BothRev |
0.276987482 s |
0.23236999 s |
1.19 |
jaxmd40 / PartOpt / cpu / PreRev |
0.2774425 s |
0.227327331 s |
1.22 |
jaxmd40 / PartOpt / cpu / PostRev |
0.148465212 s |
0.131031843 s |
1.13 |
jaxmd40 / PartOpt / cpu / BothRev |
0.299941758 s |
0.2489365919999999 s |
1.20 |
jaxmd40 / IPartOpt / cpu / PreRev |
0.258336119 s |
0.231753635 s |
1.11 |
jaxmd40 / IPartOpt / cpu / PostRev |
0.150197237 s |
0.139559405 s |
1.08 |
jaxmd40 / IPartOpt / cpu / BothRev |
0.296021058 s |
0.252466919 s |
1.17 |
jaxmd40 / DefOpt / cpu / PreRev |
0.255458953 s |
0.21722004 s |
1.18 |
jaxmd40 / DefOpt / cpu / PostRev |
0.197120848 s |
0.168523171 s |
1.17 |
jaxmd40 / DefOpt / cpu / BothRev |
0.268880939 s |
0.256271871 s |
1.05 |
jaxmd40 / IDefOpt / cpu / PreRev |
0.264546132 s |
0.230775248 s |
1.15 |
jaxmd40 / IDefOpt / cpu / PostRev |
0.204063954 s |
0.167811243 s |
1.22 |
jaxmd40 / IDefOpt / cpu / BothRev |
0.288346569 s |
0.2396261509999999 s |
1.20 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal |
1.7036897489999998 s |
1.704613862 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal |
1.706865788 s |
1.707337545 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal |
1.717634159 s |
1.71679985 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal |
1.69872664 s |
1.699223816 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal |
1.697451992 s |
1.697202868 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal |
1.666030468 s |
1.667867349 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal |
1.911652507 s |
1.913616855 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal |
3.92906063625 s |
4.01163404 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal |
3.03864475625 s |
3.03868473125 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal |
3.12119580875 s |
3.121042244375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal |
3.059036788125 s |
3.05902196125 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal |
3.05900937875 s |
3.059123505625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal |
2.102641335 s |
2.10262360875 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal |
4.35619512875 s |
4.356125429375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal |
6.880717086 s |
5.964431313 s |
1.15 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal |
6.784770517 s |
5.889115877 s |
1.15 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal |
6.813897442 s |
5.979979623999999 s |
1.14 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal |
6.817683051 s |
5.929690399 s |
1.15 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal |
6.936965362 s |
5.923135273 s |
1.17 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal |
2.755515989 s |
2.22214844 s |
1.24 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal |
7.511662121 s |
6.30221173 s |
1.19 |
This comment was automatically generated by workflow using github-action-benchmark.
@jumerckx sorry, this hit rebase nonsense again
ref: #1949