@jumerckx
Collaborator

ref: #1949

@jumerckx jumerckx self-assigned this Jan 21, 2026

// If no results are used, this should be handled by dead code elimination
if (usedCount == 0)
  return failure();
Member

also might as well do deletion

Member

same success comment

op.getLoc(), resultTypes, op.getOperand(), startIndices, limitIndices,
op.getStrides(), op.getDimension(), newLeftAmount, newRightAmount);

// Map old results to new results
Member

make sure to keep sharding

@jumerckx jumerckx force-pushed the jm/multislice_opt branch 2 times, most recently from a3a7c56 to 61ffc4c Compare January 22, 2026 10:30
Base automatically changed from jm/multirotate_and_multislice to main January 23, 2026 02:56
An error occurred while trying to automatically change base from jm/multirotate_and_multislice to main January 23, 2026 02:56
}

auto sliceOp = rewriter.create<stablehlo::SliceOp>(
op.getLoc(), op.getOperand(),
Member

same replaceOpWithNewOp comment [with the sharding comment] from rotate

Co-authored-by: William S. Moses <gh@wsmoses.com>

void mlir::transform::addMultiSliceOpt(RewritePatternSet &patterns,
                                       MLIRContext &context,
                                       PatternBenefit benefit) {
Member

Same comment here on do we need this defn?

MLIRContext &context, PatternBenefit benefit);
void addEnzymeHLOUnroll(RewritePatternSet &patterns, int64_t maxNumIterations,
MLIRContext &context, PatternBenefit benefit);
void addMultiSliceOpt(RewritePatternSet &patterns, MLIRContext &context,
Member

this still needs removing right?

Collaborator Author

yep, sorry!

@wsmoses
Member

wsmoses commented Jan 24, 2026

build fails:

//src/enzyme_ad/jax:EnzymeHLOPatternsIncGen_filegroup___gen_populate_patterns_func_decls_425041425_genrule) bazel-out/k8-opt-exec-ST-0465588ec812/bin/enzymexlamlir-tblgen -gen-populate-patterns-func-decls src/enzyme_ad/jax/TransformOps/TransformOps.td -I external/llvm-project/mlir/include -I ... (remaining 11 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
src/enzyme_ad/jax/TransformOps/TransformOps.td:2179:1: error: Unknown token when expecting a type
def ReduceUnusedMultiRotate : EnzymeHLOPatternOp<
^
[3,421 / 8,108] Compiling src/google/protobuf/compiler/java/helpers.cc [for tool]; 1s disk-cache, processwrapper-sandbox
[3,422 / 8,108] checking cached actions
Target //:enzymexlamlir-opt failed to build

Contributor

@github-actions github-actions bot left a comment

EnzymeJAX Benchmarks

Benchmark suite Current: 0d0f264 Previous: bb071ab Ratio
actmtch / JaXPipe / cpu / Primal 0.000006922680004208815 s 0.00000694008001119073 s 1.00
actmtch / Jax / cpu / Primal 0.000006707680031468044 s 0.000006830459988123039 s 0.98
actmtch / HLOOpt / cpu / Primal 0.000007436860023517511 s 0.000009913419989970864 s 0.75
actmtch / PartOpt / cpu / Primal 0.000006741900006090873 s 0.000006217799964360893 s 1.08
actmtch / IPartOpt / cpu / Primal 0.0000064510400079598185 s 0.000006514500037155813 s 0.99
actmtch / DefOpt / cpu / Primal 0.000011685319950629491 s 0.000011061580007662997 s 1.06
actmtch / IDefOpt / cpu / Primal 0.000007385560029433691 s 0.000006896759978189948 s 1.07
actmtch / JaXPipe / cpu / Forward 0.000010776579965749989 s 0.000010935439995591878 s 0.99
actmtch / Jax / cpu / Forward 0.000009474520002186182 s 0.000009495160047663376 s 1.00
actmtch / HLOOpt / cpu / Forward 0.000014807180014031472 s 0.000014464259993474116 s 1.02
actmtch / PartOpt / cpu / Forward 0.00001519465998171654 s 0.000015027659965198836 s 1.01
actmtch / IPartOpt / cpu / Forward 0.000010630659971866407 s 0.000010643459982020433 s 1.00
actmtch / DefOpt / cpu / Forward 0.000015256540000336828 s 0.000014413620010600426 s 1.06
actmtch / IDefOpt / cpu / Forward 0.000010763580012280728 s 0.00001057197997397452 s 1.02
actmtch / JaXPipe / cpu / PreRev 0.000011385760008124634 s 0.000010950099995170605 s 1.04
actmtch / JaXPipe / cpu / PostRev 0.000010079699986818014 s 0.000010357139954066952 s 0.97
actmtch / JaXPipe / cpu / BothRev 0.000010704240021368604 s 0.000014666399965790334 s 0.73
actmtch / Jax / cpu / BothRev 0.000009644120018492684 s 0.000009740360010255244 s 0.99
actmtch / HLOOpt / cpu / PreRev 0.000011047500020140431 s 0.000010884519979299513 s 1.01
actmtch / HLOOpt / cpu / PostRev 0.000015670159982619225 s 0.00001493965999543434 s 1.05
actmtch / HLOOpt / cpu / BothRev 0.000012228620007590508 s 0.000012820839965570484 s 0.95
actmtch / PartOpt / cpu / PreRev 0.000010227720003967989 s 0.000011093580005763216 s 0.92
actmtch / PartOpt / cpu / PostRev 0.00000987922003332642 s 0.000009808780005187144 s 1.01
actmtch / PartOpt / cpu / BothRev 0.000010955340003420134 s 0.00001122273999499157 s 0.98
actmtch / IPartOpt / cpu / PreRev 0.000014503379989037055 s 0.0000113531800070632 s 1.28
actmtch / IPartOpt / cpu / PostRev 0.000010034760025519065 s 0.000010120260067196797 s 0.99
actmtch / IPartOpt / cpu / BothRev 0.000010352080043958268 s 0.00001113555997108051 s 0.93
actmtch / DefOpt / cpu / PreRev 0.000011315520032439964 s 0.000010477780015207828 s 1.08
actmtch / DefOpt / cpu / PostRev 0.000011221400018257555 s 0.000010674739960450098 s 1.05
actmtch / DefOpt / cpu / BothRev 0.000010702859972298027 s 0.000010495660035303445 s 1.02
actmtch / IDefOpt / cpu / PreRev 0.00001132979998146766 s 0.000010619839986247826 s 1.07
actmtch / IDefOpt / cpu / PostRev 0.000010499039954083857 s 0.000011006340000676574 s 0.95
actmtch / IDefOpt / cpu / BothRev 0.000010313560014765244 s 0.000010521520025577046 s 0.98
actmtch / JaXPipe / cuda / Primal 0.000002015 s 0.000002047 s 0.98
actmtch / Jax / cuda / Primal 0.000002016 s 0.000002016 s 1
actmtch / HLOOpt / cuda / Primal 0.000002047 s 0.000002016 s 1.02
actmtch / PartOpt / cuda / Primal 0.000002016 s 0.000002016 s 1
actmtch / IPartOpt / cuda / Primal 0.0000020170000000000003 s 0.000002015 s 1.00
actmtch / DefOpt / cuda / Primal 0.000002015 s 0.000002047 s 0.98
actmtch / IDefOpt / cuda / Primal 0.000002016 s 0.000002016 s 1
actmtch / JaXPipe / cuda / Forward 0.000009984 s 0.000010847 s 0.92
actmtch / Jax / cuda / Forward 0.0000096 s 0.00001056 s 0.91
actmtch / HLOOpt / cuda / Forward 0.00001008 s 0.000010689 s 0.94
actmtch / PartOpt / cuda / Forward 0.00001008 s 0.000010304 s 0.98
actmtch / IPartOpt / cuda / Forward 0.000009983 s 0.000010592 s 0.94
actmtch / DefOpt / cuda / Forward 0.000009856 s 0.0000104 s 0.95
actmtch / IDefOpt / cuda / Forward 0.000010016 s 0.000010496 s 0.95
actmtch / JaXPipe / cuda / PreRev 0.000010559 s 0.000010815 s 0.98
actmtch / JaXPipe / cuda / PostRev 0.000009696 s 0.000010208 s 0.95
actmtch / JaXPipe / cuda / BothRev 0.000009727 s 0.00001008 s 0.96
actmtch / Jax / cuda / BothRev 0.000009632 s 0.00000976 s 0.99
actmtch / HLOOpt / cuda / PreRev 0.000009984 s 0.000010687 s 0.93
actmtch / HLOOpt / cuda / PostRev 0.000010144 s 0.000010016 s 1.01
actmtch / HLOOpt / cuda / BothRev 0.000009824 s 0.00001008 s 0.97
actmtch / PartOpt / cuda / PreRev 0.000010048 s 0.00001104 s 0.91
actmtch / PartOpt / cuda / PostRev 0.000010048 s 0.00001008 s 1.00
actmtch / PartOpt / cuda / BothRev 0.00000992 s 0.000009631 s 1.03
actmtch / IPartOpt / cuda / PreRev 0.000010048 s 0.000014688 s 0.68
actmtch / IPartOpt / cuda / PostRev 0.00001008 s 0.000010688 s 0.94
actmtch / IPartOpt / cuda / BothRev 0.00001056 s 0.00001056 s 1
actmtch / DefOpt / cuda / PreRev 0.000010208 s 0.000010464 s 0.98
actmtch / DefOpt / cuda / PostRev 0.000009344 s 0.0000104 s 0.90
actmtch / DefOpt / cuda / BothRev 0.000009408 s 0.00001008 s 0.93
actmtch / IDefOpt / cuda / PreRev 0.000009568 s 0.000010496 s 0.91
actmtch / IDefOpt / cuda / PostRev 0.000009824 s 0.0000104 s 0.94
actmtch / IDefOpt / cuda / BothRev 0.00001008 s 0.000010592 s 0.95
actmtch / JaXPipe / tpu / Primal 5.63025e-7 s 5.63975e-7 s 1.00
actmtch / Jax / tpu / Primal 6.070249999999999e-7 s 6.071e-7 s 1.00
actmtch / HLOOpt / tpu / Primal 0.000002103325 s 0.0000021071 s 1.00
actmtch / PartOpt / tpu / Primal 6.06525e-7 s 6.06825e-7 s 1.00
actmtch / IPartOpt / tpu / Primal 5.62525e-7 s 5.626e-7 s 1.00
actmtch / DefOpt / tpu / Primal 0.0000021585750000000003 s 0.0000021625 s 1.00
actmtch / IDefOpt / tpu / Primal 0.0000021019750000000003 s 0.000002092475 s 1.00
actmtch / JaXPipe / tpu / Forward 0.0000038218 s 0.00000382475 s 1.00
actmtch / Jax / tpu / Forward 0.000001230325 s 0.000001216875 s 1.01
actmtch / HLOOpt / tpu / Forward 0.0000039444 s 0.000003937474999999999 s 1.00
actmtch / PartOpt / tpu / Forward 0.0000039164 s 0.00000391355 s 1.00
actmtch / IPartOpt / tpu / Forward 0.000003944375 s 0.000003935775 s 1.00
actmtch / DefOpt / tpu / Forward 0.000003918225 s 0.000003911125 s 1.00
actmtch / IDefOpt / tpu / Forward 0.00000393035 s 0.000003934225 s 1.00
actmtch / JaXPipe / tpu / PreRev 0.000003477075 s 0.000003476425 s 1.00
actmtch / JaXPipe / tpu / PostRev 0.0000016477 s 0.000001637175 s 1.01
actmtch / JaXPipe / tpu / BothRev 0.0000034755000000000004 s 0.000003482925 s 1.00
actmtch / Jax / tpu / BothRev 0.00000164705 s 0.0000016424 s 1.00
actmtch / HLOOpt / tpu / PreRev 0.000003473925 s 0.0000034862500000000003 s 1.00
actmtch / HLOOpt / tpu / PostRev 0.000003417575 s 0.0000034163 s 1.00
actmtch / HLOOpt / tpu / BothRev 0.000003468225 s 0.000003487525 s 0.99
actmtch / PartOpt / tpu / PreRev 0.000003409225 s 0.000003413775 s 1.00
actmtch / PartOpt / tpu / PostRev 0.000001586025 s 0.0000015953999999999998 s 0.99
actmtch / PartOpt / tpu / BothRev 0.000003414275 s 0.000003413975 s 1.00
actmtch / IPartOpt / tpu / PreRev 0.000003464675 s 0.00000347355 s 1.00
actmtch / IPartOpt / tpu / PostRev 0.00000163885 s 0.000001643075 s 1.00
actmtch / IPartOpt / tpu / BothRev 0.000003475025 s 0.00000348125 s 1.00
actmtch / DefOpt / tpu / PreRev 0.0000034086 s 0.0000034240500000000003 s 1.00
actmtch / DefOpt / tpu / PostRev 0.0000034095 s 0.0000034275 s 0.99
actmtch / DefOpt / tpu / BothRev 0.0000034197 s 0.000003414375 s 1.00
actmtch / IDefOpt / tpu / PreRev 0.000003478075 s 0.000003470475 s 1.00
actmtch / IDefOpt / tpu / PostRev 0.000003399 s 0.000003411775 s 1.00
actmtch / IDefOpt / tpu / BothRev 0.00000346885 s 0.0000034695 s 1.00
actmtch / JaXPipe / cpu / Primal 0.00001585 s 0.00000694008001119073 s 2.28
actmtch / Jax / cpu / Primal 0.000016595 s 0.000006830459988123039 s 2.43
actmtch / HLOOpt / cpu / Primal 0.000017224 s 0.000009913419989970864 s 1.74
actmtch / PartOpt / cpu / Primal 0.000016223 s 0.000006217799964360893 s 2.61
actmtch / IPartOpt / cpu / Primal 0.000016725 s 0.000006514500037155813 s 2.57
actmtch / DefOpt / cpu / Primal 0.000017114 s 0.000011061580007662997 s 1.55
actmtch / IDefOpt / cpu / Primal 0.000017183 s 0.000006896759978189948 s 2.49
actmtch / JaXPipe / cpu / Forward 0.000023126 s 0.000010935439995591878 s 2.11
actmtch / Jax / cpu / Forward 0.000022368 s 0.000009495160047663376 s 2.36
actmtch / HLOOpt / cpu / Forward 0.000023513 s 0.000014464259993474116 s 1.63
actmtch / PartOpt / cpu / Forward 0.000022792 s 0.000015027659965198836 s 1.52
actmtch / IPartOpt / cpu / Forward 0.000023233 s 0.000010643459982020433 s 2.18
actmtch / DefOpt / cpu / Forward 0.000022817 s 0.000014413620010600426 s 1.58
actmtch / IDefOpt / cpu / Forward 0.000023443 s 0.00001057197997397452 s 2.22
actmtch / JaXPipe / cpu / PreRev 0.000023538 s 0.000010950099995170605 s 2.15
actmtch / JaXPipe / cpu / PostRev 0.000021261 s 0.000010357139954066952 s 2.05
actmtch / JaXPipe / cpu / BothRev 0.000023345 s 0.000014666399965790334 s 1.59
actmtch / Jax / cpu / BothRev 0.000021594 s 0.000009740360010255244 s 2.22
actmtch / HLOOpt / cpu / PreRev 0.000023555 s 0.000010884519979299513 s 2.16
actmtch / HLOOpt / cpu / PostRev 0.000023763 s 0.00001493965999543434 s 1.59
actmtch / HLOOpt / cpu / BothRev 0.000024015000000000003 s 0.000012820839965570484 s 1.87
actmtch / PartOpt / cpu / PreRev 0.000023667 s 0.000011093580005763216 s 2.13
actmtch / PartOpt / cpu / PostRev 0.000021309 s 0.000009808780005187144 s 2.17
actmtch / PartOpt / cpu / BothRev 0.00002394 s 0.00001122273999499157 s 2.13
actmtch / IPartOpt / cpu / PreRev 0.000023736 s 0.0000113531800070632 s 2.09
actmtch / IPartOpt / cpu / PostRev 0.000021884 s 0.000010120260067196797 s 2.16
actmtch / IPartOpt / cpu / BothRev 0.000023677 s 0.00001113555997108051 s 2.13
actmtch / DefOpt / cpu / PreRev 0.000023929 s 0.000010477780015207828 s 2.28
actmtch / DefOpt / cpu / PostRev 0.000024253 s 0.000010674739960450098 s 2.27
actmtch / DefOpt / cpu / BothRev 0.000023494 s 0.000010495660035303445 s 2.24
actmtch / IDefOpt / cpu / PreRev 0.000023168 s 0.000010619839986247826 s 2.18
actmtch / IDefOpt / cpu / PostRev 0.000023666 s 0.000011006340000676574 s 2.15
actmtch / IDefOpt / cpu / BothRev 0.00002373 s 0.000010521520025577046 s 2.26
add_one / JaXPipe / cpu / Primal 0.000007419180019496707 s 0.00000752472000385751 s 0.99
add_one / Jax / cpu / Primal 0.000006683779993181816 s 0.000007206639938885928 s 0.93
add_one / HLOOpt / cpu / Primal 0.000006441080022341339 s 0.00001020033998429426 s 0.63
add_one / PartOpt / cpu / Primal 0.000006262699980652542 s 0.000006483379984274506 s 0.97
add_one / IPartOpt / cpu / Primal 0.000006520199949591188 s 0.000007264080022650887 s 0.90
add_one / DefOpt / cpu / Primal 0.000010351119981351077 s 0.00001108921997001744 s 0.93
add_one / IDefOpt / cpu / Primal 0.000006974599991735886 s 0.000006771060034225229 s 1.03
add_one / JaXPipe / cpu / Forward 0.000010205639991909264 s 0.00001025116001983406 s 1.00
add_one / Jax / cpu / Forward 0.000011320160037939786 s 0.00000995673997749691 s 1.14
add_one / HLOOpt / cpu / Forward 0.000014209039973138717 s 0.000014886220023981876 s 0.95
add_one / PartOpt / cpu / Forward 0.00001368597997498 s 0.00001484262002122705 s 0.92
add_one / IPartOpt / cpu / Forward 0.000009566959988660528 s 0.00000993311999081925 s 0.96
add_one / DefOpt / cpu / Forward 0.000014696760035803891 s 0.000015093059992068448 s 0.97
add_one / IDefOpt / cpu / Forward 0.000009842939998634393 s 0.000010064300004160032 s 0.98
add_one / JaXPipe / cpu / PreRev 0.000011414900045565446 s 0.000011601280020840933 s 0.98
add_one / JaXPipe / cpu / PostRev 0.00001103010000406357 s 0.000011605040008362266 s 0.95
add_one / JaXPipe / cpu / BothRev 0.000013230260019554408 s 0.00001129558000684483 s 1.17
add_one / Jax / cpu / BothRev 0.000010707639985412244 s 0.00001170777999504935 s 0.91
add_one / HLOOpt / cpu / PreRev 0.000011337679989082972 s 0.000011194540029464406 s 1.01
add_one / HLOOpt / cpu / PostRev 0.000015331020003941377 s 0.000011465500001577311 s 1.34
add_one / HLOOpt / cpu / BothRev 0.0000165456600007019 s 0.000017417280014342394 s 0.95
add_one / PartOpt / cpu / PreRev 0.000010961900052279816 s 0.000011230060035813947 s 0.98
add_one / PartOpt / cpu / PostRev 0.000011039439987143852 s 0.000011293500028841665 s 0.98
add_one / PartOpt / cpu / BothRev 0.000011220940032217183 s 0.00001118008005505544 s 1.00
add_one / IPartOpt / cpu / PreRev 0.000015995380035747075 s 0.00001686254002379428 s 0.95
add_one / IPartOpt / cpu / PostRev 0.000010996059982062434 s 0.000011557100024219836 s 0.95
add_one / IPartOpt / cpu / BothRev 0.000011431580014686915 s 0.00001106198002162273 s 1.03
add_one / DefOpt / cpu / PreRev 0.00001102045996049128 s 0.000011062340045100429 s 1.00
add_one / DefOpt / cpu / PostRev 0.000011594799943850375 s 0.000010930259950328036 s 1.06
add_one / DefOpt / cpu / BothRev 0.000011351380017003976 s 0.000011320499997964362 s 1.00
add_one / IDefOpt / cpu / PreRev 0.000011180020001120283 s 0.0000110442599907401 s 1.01
add_one / IDefOpt / cpu / PostRev 0.000011913299977095448 s 0.00001119873999414267 s 1.06
add_one / IDefOpt / cpu / BothRev 0.000011332740004945665 s 0.00001130389996433223 s 1.00
add_one / JaXPipe / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / Jax / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / HLOOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / PartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / IPartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / DefOpt / cuda / Primal 0.000001951 s 0.0000019200000000000003 s 1.02
add_one / IDefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_one / JaXPipe / cuda / Forward 0.000010175 s 0.000010272 s 0.99
add_one / Jax / cuda / Forward 0.00001024 s 0.000010463 s 0.98
add_one / HLOOpt / cuda / Forward 0.00001008 s 0.000010112 s 1.00
add_one / PartOpt / cuda / Forward 0.000010016 s 0.000009887 s 1.01
add_one / IPartOpt / cuda / Forward 0.000009887 s 0.000010432 s 0.95
add_one / DefOpt / cuda / Forward 0.000010112 s 0.000010272 s 0.98
add_one / IDefOpt / cuda / Forward 0.000009792 s 0.000010208 s 0.96
add_one / JaXPipe / cuda / PreRev 0.000024063 s 0.000026208 s 0.92
add_one / JaXPipe / cuda / PostRev 0.000024479 s 0.000026816 s 0.91
add_one / JaXPipe / cuda / BothRev 0.000024607 s 0.000025248 s 0.97
add_one / Jax / cuda / BothRev 0.00002464 s 0.000025152 s 0.98
add_one / HLOOpt / cuda / PreRev 0.00002528 s 0.00002496 s 1.01
add_one / HLOOpt / cuda / PostRev 0.000025024 s 0.00002512 s 1.00
add_one / HLOOpt / cuda / BothRev 0.000024704 s 0.000025344 s 0.97
add_one / PartOpt / cuda / PreRev 0.000025024 s 0.000024992 s 1.00
add_one / PartOpt / cuda / PostRev 0.000025215 s 0.00002528 s 1.00
add_one / PartOpt / cuda / BothRev 0.000025312 s 0.000025439 s 1.00
add_one / IPartOpt / cuda / PreRev 0.000025056 s 0.000025632 s 0.98
add_one / IPartOpt / cuda / PostRev 0.000025248 s 0.00002528 s 1.00
add_one / IPartOpt / cuda / BothRev 0.000024863 s 0.000025312 s 0.98
add_one / DefOpt / cuda / PreRev 0.000026336 s 0.000025024 s 1.05
add_one / DefOpt / cuda / PostRev 0.000024448 s 0.00002496 s 0.98
add_one / DefOpt / cuda / BothRev 0.000025183 s 0.000025632 s 0.98
add_one / IDefOpt / cuda / PreRev 0.000024576 s 0.0000256 s 0.96
add_one / IDefOpt / cuda / PostRev 0.000024352 s 0.000024992 s 0.97
add_one / IDefOpt / cuda / BothRev 0.00002448 s 0.000024416 s 1.00
add_one / JaXPipe / tpu / Primal 0.00000143155 s 0.000001429075 s 1.00
add_one / Jax / tpu / Primal 0.000001401775 s 0.000001404775 s 1.00
add_one / HLOOpt / tpu / Primal 0.000001421175 s 0.000001432975 s 0.99
add_one / PartOpt / tpu / Primal 0.0000014000499999999998 s 0.000001405375 s 1.00
add_one / IPartOpt / tpu / Primal 0.0000014236750000000002 s 0.000001428 s 1.00
add_one / DefOpt / tpu / Primal 0.000001402575 s 0.0000014243749999999998 s 0.98
add_one / IDefOpt / tpu / Primal 0.000001423625 s 0.000001429225 s 1.00
add_one / JaXPipe / tpu / Forward 0.00000186065 s 0.0000018722 s 0.99
add_one / Jax / tpu / Forward 0.000001839975 s 0.00000185155 s 0.99
add_one / HLOOpt / tpu / Forward 0.00000185385 s 0.00000185065 s 1.00
add_one / PartOpt / tpu / Forward 0.000001837525 s 0.000001847 s 0.99
add_one / IPartOpt / tpu / Forward 0.000001854875 s 0.0000018569 s 1.00
add_one / DefOpt / tpu / Forward 0.00000184175 s 0.0000018493 s 1.00
add_one / IDefOpt / tpu / Forward 0.000001854 s 0.000001853475 s 1.00
add_one / JaXPipe / tpu / PreRev 0.000002245975 s 0.0000022365750000000003 s 1.00
add_one / JaXPipe / tpu / PostRev 0.00000223725 s 0.000002247325 s 1.00
add_one / JaXPipe / tpu / BothRev 0.0000022329 s 0.0000022439 s 1.00
add_one / Jax / tpu / BothRev 0.0000022441 s 0.0000022476 s 1.00
add_one / HLOOpt / tpu / PreRev 0.000002247225 s 0.0000022388 s 1.00
add_one / HLOOpt / tpu / PostRev 0.000002236175 s 0.000002238575 s 1.00
add_one / HLOOpt / tpu / BothRev 0.00000224165 s 0.000002246225 s 1.00
add_one / PartOpt / tpu / PreRev 0.00000223895 s 0.000002243125 s 1.00
add_one / PartOpt / tpu / PostRev 0.00000224895 s 0.000002234625 s 1.01
add_one / PartOpt / tpu / BothRev 0.0000022533750000000003 s 0.000002238475 s 1.01
add_one / IPartOpt / tpu / PreRev 0.00000223735 s 0.000002239425 s 1.00
add_one / IPartOpt / tpu / PostRev 0.00000224365 s 0.000002247275 s 1.00
add_one / IPartOpt / tpu / BothRev 0.00000223935 s 0.00000224215 s 1.00
add_one / DefOpt / tpu / PreRev 0.0000022373 s 0.0000022334 s 1.00
add_one / DefOpt / tpu / PostRev 0.0000022364 s 0.0000022361 s 1.00
add_one / DefOpt / tpu / BothRev 0.00000223275 s 0.000002245 s 0.99
add_one / IDefOpt / tpu / PreRev 0.00000223795 s 0.0000022311750000000003 s 1.00
add_one / IDefOpt / tpu / PostRev 0.000002237025 s 0.000002242125 s 1.00
add_one / IDefOpt / tpu / BothRev 0.000002233375 s 0.000002238475 s 1.00
add_one / JaXPipe / cpu / Primal 0.000015702 s 0.00000752472000385751 s 2.09
add_one / Jax / cpu / Primal 0.00001602 s 0.000007206639938885928 s 2.22
add_one / HLOOpt / cpu / Primal 0.000015842 s 0.00001020033998429426 s 1.55
add_one / PartOpt / cpu / Primal 0.000015823 s 0.000006483379984274506 s 2.44
add_one / IPartOpt / cpu / Primal 0.000015977000000000003 s 0.000007264080022650887 s 2.20
add_one / DefOpt / cpu / Primal 0.000015605 s 0.00001108921997001744 s 1.41
add_one / IDefOpt / cpu / Primal 0.000015808 s 0.000006771060034225229 s 2.33
add_one / JaXPipe / cpu / Forward 0.000021538000000000003 s 0.00001025116001983406 s 2.10
add_one / Jax / cpu / Forward 0.000021609 s 0.00000995673997749691 s 2.17
add_one / HLOOpt / cpu / Forward 0.000021216 s 0.000014886220023981876 s 1.43
add_one / PartOpt / cpu / Forward 0.000021656 s 0.00001484262002122705 s 1.46
add_one / IPartOpt / cpu / Forward 0.000021584 s 0.00000993311999081925 s 2.17
add_one / DefOpt / cpu / Forward 0.000021612 s 0.000015093059992068448 s 1.43
add_one / IDefOpt / cpu / Forward 0.00002145 s 0.000010064300004160032 s 2.13
add_one / JaXPipe / cpu / PreRev 0.000024185000000000003 s 0.000011601280020840933 s 2.08
add_one / JaXPipe / cpu / PostRev 0.000024051 s 0.000011605040008362266 s 2.07
add_one / JaXPipe / cpu / BothRev 0.00002393 s 0.00001129558000684483 s 2.12
add_one / Jax / cpu / BothRev 0.000023654 s 0.00001170777999504935 s 2.02
add_one / HLOOpt / cpu / PreRev 0.000023871 s 0.000011194540029464406 s 2.13
add_one / HLOOpt / cpu / PostRev 0.000023947 s 0.000011465500001577311 s 2.09
add_one / HLOOpt / cpu / BothRev 0.000023317 s 0.000017417280014342394 s 1.34
add_one / PartOpt / cpu / PreRev 0.000023662 s 0.000011230060035813947 s 2.11
add_one / PartOpt / cpu / PostRev 0.000024017 s 0.000011293500028841665 s 2.13
add_one / PartOpt / cpu / BothRev 0.000024302 s 0.00001118008005505544 s 2.17
add_one / IPartOpt / cpu / PreRev 0.000023848000000000003 s 0.00001686254002379428 s 1.41
add_one / IPartOpt / cpu / PostRev 0.000024048 s 0.000011557100024219836 s 2.08
add_one / IPartOpt / cpu / BothRev 0.000023498 s 0.00001106198002162273 s 2.12
add_one / DefOpt / cpu / PreRev 0.000024122 s 0.000011062340045100429 s 2.18
add_one / DefOpt / cpu / PostRev 0.000024427 s 0.000010930259950328036 s 2.23
add_one / DefOpt / cpu / BothRev 0.000023948 s 0.000011320499997964362 s 2.12
add_one / IDefOpt / cpu / PreRev 0.000023517 s 0.0000110442599907401 s 2.13
add_one / IDefOpt / cpu / PostRev 0.000024344000000000003 s 0.00001119873999414267 s 2.17
add_one / IDefOpt / cpu / BothRev 0.000023821 s 0.00001130389996433223 s 2.11
add_two / JaXPipe / cpu / Primal 0.000007554700005130144 s 0.00000779061997491226 s 0.97
add_two / Jax / cpu / Primal 0.000006578679958693101 s 0.0000071242400281334994 s 0.92
add_two / HLOOpt / cpu / Primal 0.000010488239995538606 s 0.000011331699979564292 s 0.93
add_two / PartOpt / cpu / Primal 0.000006971039993004524 s 0.000006988320010350435 s 1.00
add_two / IPartOpt / cpu / Primal 0.000006704200013700756 s 0.000007562239989056252 s 0.89
add_two / DefOpt / cpu / Primal 0.000010738399996625958 s 0.000007133199978852645 s 1.51
add_two / IDefOpt / cpu / Primal 0.000007055079986457713 s 0.000006988179975451203 s 1.01
add_two / JaXPipe / cpu / Forward 0.00001010052007586637 s 0.000010467719976077206 s 0.96
add_two / Jax / cpu / Forward 0.000010374140028943657 s 0.00001014771996779018 s 1.02
add_two / HLOOpt / cpu / Forward 0.000014734979968125116 s 0.000015272099981302746 s 0.96
add_two / PartOpt / cpu / Forward 0.0000145965599858755 s 0.000014978740009610192 s 0.97
add_two / IPartOpt / cpu / Forward 0.000010327779964427464 s 0.000010259620003125749 s 1.01
add_two / DefOpt / cpu / Forward 0.000014405180018002283 s 0.000015081880019351956 s 0.96
add_two / IDefOpt / cpu / Forward 0.000010043919983218075 s 0.000010221640013696745 s 0.98
add_two / JaXPipe / cpu / PreRev 0.000013687980053873617 s 0.00001410851999935403 s 0.97
add_two / JaXPipe / cpu / PostRev 0.00001318387996434467 s 0.000013603640009023365 s 0.97
add_two / JaXPipe / cpu / BothRev 0.000013730460022998158 s 0.00001366785998470732 s 1.00
add_two / Jax / cpu / BothRev 0.000013033659997745418 s 0.000013924240020060096 s 0.94
add_two / HLOOpt / cpu / PreRev 0.0000137636200452107 s 0.000013977280032122508 s 0.98
add_two / HLOOpt / cpu / PostRev 0.000013095559988869356 s 0.00001403646003382164 s 0.93
add_two / HLOOpt / cpu / BothRev 0.000015035820033517669 s 0.000015436239973496413 s 0.97
add_two / PartOpt / cpu / PreRev 0.00001374123996356502 s 0.000013534179988710091 s 1.02
add_two / PartOpt / cpu / PostRev 0.00001316482000220276 s 0.000013693600021724703 s 0.96
add_two / PartOpt / cpu / BothRev 0.000013651259996549924 s 0.000013705980018130504 s 1.00
add_two / IPartOpt / cpu / PreRev 0.000013624180000988416 s 0.000013599339972643063 s 1.00
add_two / IPartOpt / cpu / PostRev 0.000013394839970715112 s 0.00001350412002466328 s 0.99
add_two / IPartOpt / cpu / BothRev 0.000013487699989127578 s 0.000013644700020449818 s 0.99
add_two / DefOpt / cpu / PreRev 0.000013821899974573173 s 0.000014368500060299994 s 0.96
add_two / DefOpt / cpu / PostRev 0.000013729740003327606 s 0.000013709300001210067 s 1.00
add_two / DefOpt / cpu / BothRev 0.000013606080028694122 s 0.000013578719981524044 s 1.00
add_two / IDefOpt / cpu / PreRev 0.000014109220019236091 s 0.000013646060060636956 s 1.03
add_two / IDefOpt / cpu / PostRev 0.00001442523997866374 s 0.000014498359960271043 s 0.99
add_two / IDefOpt / cpu / BothRev 0.000013957880009911604 s 0.000013887720015191008 s 1.01
add_two / JaXPipe / cuda / Primal 0.0000019200000000000003 s 0.000001951 s 0.98
add_two / Jax / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_two / HLOOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_two / PartOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_two / IPartOpt / cuda / Primal 0.0000019200000000000003 s 0.000001951 s 0.98
add_two / DefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_two / IDefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1
add_two / JaXPipe / cuda / Forward 0.000009824 s 0.000010368 s 0.95
add_two / Jax / cuda / Forward 0.00000944 s 0.000009984 s 0.95
add_two / HLOOpt / cuda / Forward 0.000009631 s 0.000010176 s 0.95
add_two / PartOpt / cuda / Forward 0.000009215 s 0.000010048 s 0.92
add_two / IPartOpt / cuda / Forward 0.00000976 s 0.000010912 s 0.89
add_two / DefOpt / cuda / Forward 0.000009696 s 0.00001008 s 0.96
add_two / IDefOpt / cuda / Forward 0.000009183 s 0.000009312000000000002 s 0.99
add_two / JaXPipe / cuda / PreRev 0.000032416 s 0.000032127999999999995 s 1.01
add_two / JaXPipe / cuda / PostRev 0.000031967 s 0.000031776 s 1.01
add_two / JaXPipe / cuda / BothRev 0.000031904000000000005 s 0.000032 s 1.00
add_two / Jax / cuda / BothRev 0.000032352 s 0.000031967 s 1.01
add_two / HLOOpt / cuda / PreRev 0.000031584 s 0.000031936 s 0.99
add_two / HLOOpt / cuda / PostRev 0.000031968 s 0.000031488 s 1.02
add_two / HLOOpt / cuda / BothRev 0.0000312 s 0.000032032 s 0.97
add_two / PartOpt / cuda / PreRev 0.0000312 s 0.000031904000000000005 s 0.98
add_two / PartOpt / cuda / PostRev 0.000032416 s 0.000034656 s 0.94
add_two / PartOpt / cuda / BothRev 0.000032256 s 0.000033119999999999995 s 0.97
add_two / IPartOpt / cuda / PreRev 0.000033824 s 0.000031584 s 1.07
add_two / IPartOpt / cuda / PostRev 0.000030719 s 0.00003168 s 0.97
add_two / IPartOpt / cuda / BothRev 0.0000312 s 0.000031296 s 1.00
add_two / DefOpt / cuda / PreRev 0.000031712 s 0.000032032 s 0.99
add_two / DefOpt / cuda / PostRev 0.000032800000000000004 s 0.00003184 s 1.03
add_two / DefOpt / cuda / BothRev 0.000032159 s 0.000031711 s 1.01
add_two / IDefOpt / cuda / PreRev 0.000031808000000000004 s 0.000031584 s 1.01
add_two / IDefOpt / cuda / PostRev 0.000031264 s 0.000031456 s 0.99
add_two / IDefOpt / cuda / BothRev 0.000031168000000000004 s 0.000032384 s 0.96
add_two / JaXPipe / tpu / Primal 0.000001441775 s 0.000001423875 s 1.01
add_two / Jax / tpu / Primal 0.0000014679000000000002 s 0.000001472625 s 1.00
add_two / HLOOpt / tpu / Primal 0.000001429375 s 0.0000014379 s 0.99
add_two / PartOpt / tpu / Primal 0.000001469375 s 0.0000014742 s 1.00
add_two / IPartOpt / tpu / Primal 0.0000014317750000000002 s 0.0000014400250000000002 s 0.99
add_two / DefOpt / tpu / Primal 0.000001473875 s 0.000001485775 s 0.99
add_two / IDefOpt / tpu / Primal 0.000001431325 s 0.000001434175 s 1.00
add_two / JaXPipe / tpu / Forward 0.000001829075 s 0.000001822325 s 1.00
add_two / Jax / tpu / Forward 0.0000018201 s 0.0000018356 s 0.99
add_two / HLOOpt / tpu / Forward 0.0000018214250000000003 s 0.000001828375 s 1.00
add_two / PartOpt / tpu / Forward 0.000001821675 s 0.000001823575 s 1.00
add_two / IPartOpt / tpu / Forward 0.00000183405 s 0.0000018298 s 1.00
add_two / DefOpt / tpu / Forward 0.0000018325250000000003 s 0.0000018305 s 1.00
add_two / IDefOpt / tpu / Forward 0.0000018227 s 0.000001819425 s 1.00
add_two / JaXPipe / tpu / PreRev 0.000002850475 s 0.00000284435 s 1.00
add_two / JaXPipe / tpu / PostRev 0.000002757025 s 0.000002750925 s 1.00
add_two / JaXPipe / tpu / BothRev 0.0000028392 s 0.000002833275 s 1.00
add_two / Jax / tpu / BothRev 0.0000027611999999999995 s 0.000002747925 s 1.00
add_two / HLOOpt / tpu / PreRev 0.0000028403 s 0.0000028348250000000004 s 1.00
add_two / HLOOpt / tpu / PostRev 0.000002742125 s 0.000002747775 s 1.00
add_two / HLOOpt / tpu / BothRev 0.00000283065 s 0.0000028392250000000003 s 1.00
add_two / PartOpt / tpu / PreRev 0.000002747075 s 0.000002740275 s 1.00
add_two / PartOpt / tpu / PostRev 0.00000283455 s 0.0000028397 s 1.00
add_two / PartOpt / tpu / BothRev 0.0000027643 s 0.000002755125 s 1.00
add_two / IPartOpt / tpu / PreRev 0.000002848225 s 0.000002836275 s 1.00
add_two / IPartOpt / tpu / PostRev 0.000002744625 s 0.0000027511 s 1.00
add_two / IPartOpt / tpu / BothRev 0.000002835525 s 0.00000284145 s 1.00
add_two / DefOpt / tpu / PreRev 0.000002761025 s 0.000002759 s 1.00
add_two / DefOpt / tpu / PostRev 0.0000028406 s 0.000002839325 s 1.00
add_two / DefOpt / tpu / BothRev 0.000002740925 s 0.0000027486250000000003 s 1.00
add_two / IDefOpt / tpu / PreRev 0.0000028295500000000003 s 0.000002842575 s 1.00
add_two / IDefOpt / tpu / PostRev 0.0000027453 s 0.0000027472 s 1.00
add_two / IDefOpt / tpu / BothRev 0.000002830475 s 0.000002838375 s 1.00
add_two / JaXPipe / cpu / Primal 0.000016513 s 0.00000779061997491226 s 2.12
add_two / Jax / cpu / Primal 0.000016171 s 0.0000071242400281334994 s 2.27
add_two / HLOOpt / cpu / Primal 0.000016128 s 0.000011331699979564292 s 1.42
add_two / PartOpt / cpu / Primal 0.000016598999999999998 s 0.000006988320010350435 s 2.38
add_two / IPartOpt / cpu / Primal 0.000016444 s 0.000007562239989056252 s 2.17
add_two / DefOpt / cpu / Primal 0.000016295 s 0.000007133199978852645 s 2.28
add_two / IDefOpt / cpu / Primal 0.000016339 s 0.000006988179975451203 s 2.34
add_two / JaXPipe / cpu / Forward 0.000022985 s 0.000010467719976077206 s 2.20
add_two / Jax / cpu / Forward 0.000021674 s 0.00001014771996779018 s 2.14
add_two / HLOOpt / cpu / Forward 0.000022228 s 0.000015272099981302746 s 1.46
add_two / PartOpt / cpu / Forward 0.00002151 s 0.000014978740009610192 s 1.44
add_two / IPartOpt / cpu / Forward 0.000021884 s 0.000010259620003125749 s 2.13
add_two / DefOpt / cpu / Forward 0.000021894 s 0.000015081880019351956 s 1.45
add_two / IDefOpt / cpu / Forward 0.000022088 s 0.000010221640013696745 s 2.16
add_two / JaXPipe / cpu / PreRev 0.000027844 s 0.00001410851999935403 s 1.97
add_two / JaXPipe / cpu / PostRev 0.000028522 s 0.000013603640009023365 s 2.10
add_two / JaXPipe / cpu / BothRev 0.000028267 s 0.00001366785998470732 s 2.07
add_two / Jax / cpu / BothRev 0.000028512 s 0.000013924240020060096 s 2.05
add_two / HLOOpt / cpu / PreRev 0.000027731 s 0.000013977280032122508 s 1.98
add_two / HLOOpt / cpu / PostRev 0.000028146 s 0.00001403646003382164 s 2.01
add_two / HLOOpt / cpu / BothRev 0.000027908 s 0.000015436239973496413 s 1.81
add_two / PartOpt / cpu / PreRev 0.000027969 s 0.000013534179988710091 s 2.07
add_two / PartOpt / cpu / PostRev 0.000029152 s 0.000013693600021724703 s 2.13
add_two / PartOpt / cpu / BothRev 0.000028902 s 0.000013705980018130504 s 2.11
add_two / IPartOpt / cpu / PreRev 0.000027492 s 0.000013599339972643063 s 2.02
add_two / IPartOpt / cpu / PostRev 0.000029069 s 0.00001350412002466328 s 2.15
add_two / IPartOpt / cpu / BothRev 0.000028844 s 0.000013644700020449818 s 2.11
add_two / DefOpt / cpu / PreRev 0.000027635 s 0.000014368500060299994 s 1.92
add_two / DefOpt / cpu / PostRev 0.000028692 s 0.000013709300001210067 s 2.09
add_two / DefOpt / cpu / BothRev 0.00002829 s 0.000013578719981524044 s 2.08
add_two / IDefOpt / cpu / PreRev 0.000028371 s 0.000013646060060636956 s 2.08
add_two / IDefOpt / cpu / PostRev 0.000028766 s 0.000014498359960271043 s 1.98
add_two / IDefOpt / cpu / BothRev 0.000028685 s 0.000013887720015191008 s 2.07
cache / JaXPipe / cpu / Primal 0.000006798859994887607 s 0.0000067652400139195375 s 1.00
cache / Jax / cpu / Primal 0.000007038260055196588 s 0.000006858279984953697 s 1.03
cache / HLOOpt / cpu / Primal 0.000006787320025978261 s 0.00000677435999023146 s 1.00
cache / PartOpt / cpu / Primal 0.000006213520036908449 s 0.000006694500043522566 s 0.93
cache / IPartOpt / cpu / Primal 0.000006543520012201043 s 0.000006187839981066645 s 1.06
cache / DefOpt / cpu / Primal 0.00000620830000116257 s 0.000006312319965218194 s 0.98
cache / IDefOpt / cpu / Primal 0.0000068050199934077685 s 0.000006500580029751291 s 1.05
cache / JaXPipe / cpu / Forward 0.000014500920005957596 s 0.000015542040036962133 s 0.93
cache / Jax / cpu / Forward 0.000014496779986075126 s 0.000015676000011808355 s 0.92
cache / HLOOpt / cpu / Forward 0.00001931077998051478 s 0.00002061443994534784 s 0.94
cache / PartOpt / cpu / Forward 0.000019487320014377477 s 0.00002036323999163869 s 0.96
cache / IPartOpt / cpu / Forward 0.000015134740024222991 s 0.000015725520006526496 s 0.96
cache / DefOpt / cpu / Forward 0.000019329640017531347 s 0.000023495319992434817 s 0.82
cache / IDefOpt / cpu / Forward 0.000014706659985677106 s 0.000014389620027941418 s 1.02
cache / JaXPipe / cpu / PreRev 0.000016374439983337652 s 0.00001592889998391911 s 1.03
cache / JaXPipe / cpu / PostRev 0.00001944923998053128 s 0.00002100328002597962 s 0.93
cache / JaXPipe / cpu / BothRev 0.000015905099990050075 s 0.000016675520018907264 s 0.95
cache / Jax / cpu / BothRev 0.0000203493400294974 s 0.00002007403998504742 s 1.01
cache / HLOOpt / cpu / PreRev 0.000016112780022012883 s 0.000016019780005080975 s 1.01
cache / HLOOpt / cpu / PostRev 0.000016222400017795734 s 0.000020259680022718383 s 0.80
cache / HLOOpt / cpu / BothRev 0.000018553599938968547 s 0.000018096279964083804 s 1.03
cache / PartOpt / cpu / PreRev 0.00001563560004797182 s 0.000015796039979250053 s 0.99
cache / PartOpt / cpu / PostRev 0.00002128174000972649 s 0.00002046994002739666 s 1.04
cache / PartOpt / cpu / BothRev 0.00001607645996955398 s 0.000015548000001217587 s 1.03
cache / IPartOpt / cpu / PreRev 0.000021464520004883523 s 0.000015691099979449063 s 1.37
cache / IPartOpt / cpu / PostRev 0.000025047639992408223 s 0.000020764780028912355 s 1.21
cache / IPartOpt / cpu / BothRev 0.000015266539976437342 s 0.00001614888001313375 s 0.95
cache / DefOpt / cpu / PreRev 0.000015241980017890457 s 0.000015705839978181757 s 0.97
cache / DefOpt / cpu / PostRev 0.000015454540007340257 s 0.000015708819946667064 s 0.98
cache / DefOpt / cpu / BothRev 0.000015894719972493476 s 0.00001629423999474966 s 0.98
cache / IDefOpt / cpu / PreRev 0.00001601078000931011 s 0.000015533799987679232 s 1.03
cache / IDefOpt / cpu / PostRev 0.000021224579986665047 s 0.000016040699983932426 s 1.32
cache / IDefOpt / cpu / BothRev 0.00002138297991223226 s 0.000015954640039126387 s 1.34
cache / JaXPipe / cuda / Primal 0.000002336 s 0.000002304 s 1.01
cache / Jax / cuda / Primal 0.000002272 s 0.000002271 s 1.00
cache / HLOOpt / cuda / Primal 0.000002304 s 0.000002272 s 1.01
cache / PartOpt / cuda / Primal 0.000002304 s 0.000002271 s 1.01
cache / IPartOpt / cuda / Primal 0.000002272 s 0.000002272 s 1.00
cache / DefOpt / cuda / Primal 0.000002272 s 0.000002208 s 1.03
cache / IDefOpt / cuda / Primal 0.000002335 s 0.000002304 s 1.01
cache / JaXPipe / cuda / Forward 0.0000023670000000000004 s 0.000002336 s 1.01
cache / Jax / cuda / Forward 0.000002304 s 0.000002303 s 1.00
cache / HLOOpt / cuda / Forward 0.0000023670000000000004 s 0.000002335 s 1.01
cache / PartOpt / cuda / Forward 0.0000023670000000000004 s 0.000002336 s 1.01
cache / IPartOpt / cuda / Forward 0.000002337 s 0.000002304 s 1.01
cache / DefOpt / cuda / Forward 0.000002272 s 0.000002272 s 1.00
cache / IDefOpt / cuda / Forward 0.000002304 s 0.000002272 s 1.01
cache / JaXPipe / cuda / PreRev 0.00001168 s 0.000013248 s 0.88
cache / JaXPipe / cuda / PostRev 0.000011488 s 0.000012608 s 0.91
cache / JaXPipe / cuda / BothRev 0.000011231 s 0.000011295 s 0.99
cache / Jax / cuda / BothRev 0.00001152 s 0.00001152 s 1.00
cache / HLOOpt / cuda / PreRev 0.000013248 s 0.000013184 s 1.00
cache / HLOOpt / cuda / PostRev 0.000013184 s 0.000013152 s 1.00
cache / HLOOpt / cuda / BothRev 0.000013248 s 0.000013183 s 1.00
cache / PartOpt / cuda / PreRev 0.000011648 s 0.000011552 s 1.01
cache / PartOpt / cuda / PostRev 0.000011296 s 0.000011936 s 0.95
cache / PartOpt / cuda / BothRev 0.000011296 s 0.000012928 s 0.87
cache / IPartOpt / cuda / PreRev 0.000011296 s 0.000012128 s 0.93
cache / IPartOpt / cuda / PostRev 0.000011424 s 0.000011776 s 0.97
cache / IPartOpt / cuda / BothRev 0.00001168 s 0.000011744 s 0.99
cache / DefOpt / cuda / PreRev 0.000011424 s 0.000011103 s 1.03
cache / DefOpt / cuda / PostRev 0.000010976 s 0.000011264 s 0.97
cache / DefOpt / cuda / BothRev 0.000011455999999999998 s 0.000011776 s 0.97
cache / IDefOpt / cuda / PreRev 0.000011168 s 0.000012097 s 0.92
cache / IDefOpt / cuda / PostRev 0.000012704 s 0.00001184 s 1.07
cache / IDefOpt / cuda / BothRev 0.000011168 s 0.000012031 s 0.93
cache / JaXPipe / tpu / Primal 0.000002470375 s 0.000002464075 s 1.00
cache / Jax / tpu / Primal 0.000002454525 s 0.000002457125 s 1.00
cache / HLOOpt / tpu / Primal 0.00000246085 s 0.00000245415 s 1.00
cache / PartOpt / tpu / Primal 0.000002466325 s 0.0000024715 s 1.00
cache / IPartOpt / tpu / Primal 0.000002462975 s 0.0000024584 s 1.00
cache / DefOpt / tpu / Primal 0.000002457125 s 0.0000024666 s 1.00
cache / IDefOpt / tpu / Primal 0.0000024604 s 0.00000247435 s 0.99
cache / JaXPipe / tpu / Forward 0.000003562475 s 0.000003556175 s 1.00
cache / Jax / tpu / Forward 0.000003539 s 0.000003537225 s 1.00
cache / HLOOpt / tpu / Forward 0.000003565 s 0.0000035645250000000004 s 1.00
cache / PartOpt / tpu / Forward 0.000003538975 s 0.000003514975 s 1.01
cache / IPartOpt / tpu / Forward 0.00000356655 s 0.00000354975 s 1.00
cache / DefOpt / tpu / Forward 0.000003538075 s 0.0000035269 s 1.00
cache / IDefOpt / tpu / Forward 0.000003559175 s 0.000003550825 s 1.00
cache / JaXPipe / tpu / PreRev 0.000004946825 s 0.000004951825 s 1.00
cache / JaXPipe / tpu / PostRev 0.0000049474 s 0.0000049492 s 1.00
cache / JaXPipe / tpu / BothRev 0.000004991675 s 0.000004963575 s 1.01
cache / Jax / tpu / BothRev 0.000005008325 s 0.000004954825 s 1.01
cache / HLOOpt / tpu / PreRev 0.0000039637 s 0.000003940500000000001 s 1.01
cache / HLOOpt / tpu / PostRev 0.000004113724999999999 s 0.000004129325 s 1.00
cache / HLOOpt / tpu / BothRev 0.000003955025 s 0.000003954075000000001 s 1.00
cache / PartOpt / tpu / PreRev 0.0000050171 s 0.000004974225 s 1.01
cache / PartOpt / tpu / PostRev 0.000005007725 s 0.000004956625 s 1.01
cache / PartOpt / tpu / BothRev 0.00000497605 s 0.000004975175 s 1.00
cache / IPartOpt / tpu / PreRev 0.00000497825 s 0.000004960675000000001 s 1.00
cache / IPartOpt / tpu / PostRev 0.000004959425 s 0.0000049813 s 1.00
cache / IPartOpt / tpu / BothRev 0.000004974799999999999 s 0.000004991 s 1.00
cache / DefOpt / tpu / PreRev 0.000004974175 s 0.00000497925 s 1.00
cache / DefOpt / tpu / PostRev 0.000004989225 s 0.00000497525 s 1.00
cache / DefOpt / tpu / BothRev 0.000004982125 s 0.0000049921 s 1.00
cache / IDefOpt / tpu / PreRev 0.000004968475 s 0.0000049675750000000005 s 1.00
cache / IDefOpt / tpu / PostRev 0.000004972725 s 0.000004989175 s 1.00
cache / IDefOpt / tpu / BothRev 0.0000049797 s 0.000004989925 s 1.00
cache / JaXPipe / cpu / Primal 0.000017814 s 0.0000067652400139195375 s 2.63
cache / Jax / cpu / Primal 0.000018318 s 0.000006858279984953697 s 2.67
cache / HLOOpt / cpu / Primal 0.000018049 s 0.00000677435999023146 s 2.66
cache / PartOpt / cpu / Primal 0.000018749 s 0.000006694500043522566 s 2.80
cache / IPartOpt / cpu / Primal 0.000018119 s 0.000006187839981066645 s 2.93
cache / DefOpt / cpu / Primal 0.000018558 s 0.000006312319965218194 s 2.94
cache / IDefOpt / cpu / Primal 0.000018136 s 0.000006500580029751291 s 2.79
cache / JaXPipe / cpu / Forward 0.000021039 s 0.000015542040036962133 s 1.35
cache / Jax / cpu / Forward 0.000020866 s 0.000015676000011808355 s 1.33
cache / HLOOpt / cpu / Forward 0.000021683 s 0.00002061443994534784 s 1.05
cache / PartOpt / cpu / Forward 0.000020825 s 0.00002036323999163869 s 1.02
cache / IPartOpt / cpu / Forward 0.000021696 s 0.000015725520006526496 s 1.38
cache / DefOpt / cpu / Forward 0.000021148 s 0.000023495319992434817 s 0.90
cache / IDefOpt / cpu / Forward 0.000021219 s 0.000014389620027941418 s 1.47
cache / JaXPipe / cpu / PreRev 0.000022861 s 0.00001592889998391911 s 1.44
cache / JaXPipe / cpu / PostRev 0.000026608 s 0.00002100328002597962 s 1.27
cache / JaXPipe / cpu / BothRev 0.00002181 s 0.000016675520018907264 s 1.31
cache / Jax / cpu / BothRev 0.000026408 s 0.00002007403998504742 s 1.32
cache / HLOOpt / cpu / PreRev 0.000022897 s 0.000016019780005080975 s 1.43
cache / HLOOpt / cpu / PostRev 0.000021878000000000003 s 0.000020259680022718383 s 1.08
cache / HLOOpt / cpu / BothRev 0.000022923 s 0.000018096279964083804 s 1.27
cache / PartOpt / cpu / PreRev 0.000021707 s 0.000015796039979250053 s 1.37
cache / PartOpt / cpu / PostRev 0.000026948 s 0.00002046994002739666 s 1.32
cache / PartOpt / cpu / BothRev 0.000023061 s 0.000015548000001217587 s 1.48
cache / IPartOpt / cpu / PreRev 0.000022449 s 0.000015691099979449063 s 1.43
cache / IPartOpt / cpu / PostRev 0.000026661 s 0.000020764780028912355 s 1.28
cache / IPartOpt / cpu / BothRev 0.000022132 s 0.00001614888001313375 s 1.37
cache / DefOpt / cpu / PreRev 0.000021317 s 0.000015705839978181757 s 1.36
cache / DefOpt / cpu / PostRev 0.000022388000000000003 s 0.000015708819946667064 s 1.43
cache / DefOpt / cpu / BothRev 0.000022027 s 0.00001629423999474966 s 1.35
cache / IDefOpt / cpu / PreRev 0.000022102 s 0.000015533799987679232 s 1.42
cache / IDefOpt / cpu / PostRev 0.000022004 s 0.000016040699983932426 s 1.37
cache / IDefOpt / cpu / BothRev 0.000021734 s 0.000015954640039126387 s 1.36
Concat / JaXPipe / cpu / Primal 0.000006967280023673084 s 0.000007477999961338355 s 0.93
Concat / Jax / cpu / Primal 0.000006650620034633903 s 0.000007448159994964953 s 0.89
Concat / HLOOpt / cpu / Primal 0.000007009540031504002 s 0.00000979825999820605 s 0.72
Concat / PartOpt / cpu / Primal 0.000006336420001389342 s 0.00000708663998011616 s 0.89
Concat / IPartOpt / cpu / Primal 0.000006815859987909789 s 0.000006348000015350408 s 1.07
Concat / DefOpt / cpu / Primal 0.000010165939984290164 s 0.000010170200021093478 s 1.00
Concat / IDefOpt / cpu / Primal 0.00000638889998299419 s 0.000006924580011400394 s 0.92
Concat / JaXPipe / cpu / Forward 0.000009533020047456376 s 0.000009657939999669908 s 0.99
Concat / Jax / cpu / Forward 0.00000952329996835033 s 0.000010302459977538091 s 0.92
Concat / HLOOpt / cpu / Forward 0.000013221300005170631 s 0.000013886780016036937 s 0.95
Concat / PartOpt / cpu / Forward 0.00001386562001243874 s 0.00001402134001182276 s 0.99
Concat / IPartOpt / cpu / Forward 0.000010232379981971462 s 0.000009403939984622413 s 1.09
Concat / DefOpt / cpu / Forward 0.000014705979974678483 s 0.000014035200047146646 s 1.05
Concat / IDefOpt / cpu / Forward 0.000010126540009878228 s 0.000010131599992746487 s 1.00
Concat / JaXPipe / cpu / PreRev 0.000011960380033997351 s 0.000011709379987223655 s 1.02
Concat / JaXPipe / cpu / PostRev 0.000011302900029477314 s 0.00001162401999863505 s 0.97
Concat / JaXPipe / cpu / BothRev 0.000015279180006473324 s 0.000011647779965642256 s 1.31
Concat / Jax / cpu / BothRev 0.000011239319983360474 s 0.00001116463993639627 s 1.01
Concat / HLOOpt / cpu / PreRev 0.00001148808001744328 s 0.000011569120024432778 s 0.99
Concat / HLOOpt / cpu / PostRev 0.000015201919986793654 s 0.0000153801800297515 s 0.99
Concat / HLOOpt / cpu / BothRev 0.000012988759954168928 s 0.000012803999989046132 s 1.01
Concat / PartOpt / cpu / PreRev 0.00001113491996875382 s 0.000011782940000557573 s 0.95
Concat / PartOpt / cpu / PostRev 0.00001100297997254529 s 0.000011856719975185115 s 0.93
Concat / PartOpt / cpu / BothRev 0.000011534040013430056 s 0.00001157544000307098 s 1.00
Concat / IPartOpt / cpu / PreRev 0.000011605379986576736 s 0.000012226760018165806 s 0.95
Concat / IPartOpt / cpu / PostRev 0.000011655339994831592 s 0.000011645340000541182 s 1.00
Concat / IPartOpt / cpu / BothRev 0.000011621439998634742 s 0.00001154382000095211 s 1.01
Concat / DefOpt / cpu / PreRev 0.000011100419987997156 s 0.000011529539979164836 s 0.96
Concat / DefOpt / cpu / PostRev 0.00001140251998549502 s 0.00001154548002887168 s 0.99
Concat / DefOpt / cpu / BothRev 0.000011119960008727505 s 0.00001121641997997358 s 0.99
Concat / IDefOpt / cpu / PreRev 0.00001146574001722911 s 0.000011383780001779087 s 1.01
Concat / IDefOpt / cpu / PostRev 0.000011704199978339605 s 0.000011767759997383109 s 0.99
Concat / IDefOpt / cpu / BothRev 0.000011073660007241414 s 0.000011008720011886908 s 1.01
Concat / JaXPipe / cuda / Primal 0.000001951 s 0.000001951 s 1.00
Concat / Jax / cuda / Primal 0.000001951 s 0.000001951 s 1.00
Concat / HLOOpt / cuda / Primal 0.000001951 s 0.000001951 s 1.00
Concat / PartOpt / cuda / Primal 0.000001951 s 0.000001951 s 1.00
Concat / IPartOpt / cuda / Primal 0.000001951 s 0.000001951 s 1.00
Concat / DefOpt / cuda / Primal 0.000001951 s 0.000001951 s 1.00
Concat / IDefOpt / cuda / Primal 0.000001951 s 0.000001951 s 1.00
Concat / JaXPipe / cuda / Forward 0.000009823 s 0.00001008 s 0.97
Concat / Jax / cuda / Forward 0.000009568 s 0.000011328 s 0.84
Concat / HLOOpt / cuda / Forward 0.000009888 s 0.000010017 s 0.99
Concat / PartOpt / cuda / Forward 0.000010016 s 0.000010368 s 0.97
Concat / IPartOpt / cuda / Forward 0.00000976 s 0.000010464 s 0.93
Concat / DefOpt / cuda / Forward 0.000012416 s 0.00001008 s 1.23
Concat / IDefOpt / cuda / Forward 0.000009856 s 0.000009888 s 1.00
Concat / JaXPipe / cuda / PreRev 0.000016544 s 0.000016639 s 0.99
Concat / JaXPipe / cuda / PostRev 0.000015616 s 0.000016032 s 0.97
Concat / JaXPipe / cuda / BothRev 0.000015935999999999998 s 0.000015552 s 1.02
Concat / Jax / cuda / BothRev 0.000016096 s 0.000018657 s 0.86
Concat / HLOOpt / cuda / PreRev 0.000016063999999999997 s 0.000017824 s 0.90
Concat / HLOOpt / cuda / PostRev 0.000015712 s 0.000016512 s 0.95
Concat / HLOOpt / cuda / BothRev 0.000015776 s 0.000016768999999999998 s 0.94
Concat / PartOpt / cuda / PreRev 0.00001616 s 0.000017503999999999997 s 0.92
Concat / PartOpt / cuda / PostRev 0.000016224 s 0.000016447 s 0.99
Concat / PartOpt / cuda / BothRev 0.000016032 s 0.000016447 s 0.97
Concat / IPartOpt / cuda / PreRev 0.000019904 s 0.000016735 s 1.19
Concat / IPartOpt / cuda / PostRev 0.000016383999999999998 s 0.000016224 s 1.01
Concat / IPartOpt / cuda / BothRev 0.000016383999999999998 s 0.000016576000000000002 s 0.99
Concat / DefOpt / cuda / PreRev 0.000015904000000000002 s 0.000018144 s 0.88
Concat / DefOpt / cuda / PostRev 0.000016255 s 0.000016 s 1.02
Concat / DefOpt / cuda / BothRev 0.000015776 s 0.000018624000000000003 s 0.85
Concat / IDefOpt / cuda / PreRev 0.000016704 s 0.000016224 s 1.03
Concat / IDefOpt / cuda / PostRev 0.000016 s 0.000016416 s 0.97
Concat / IDefOpt / cuda / BothRev 0.000016448000000000002 s 0.000016927999999999998 s 0.97
Concat / JaXPipe / tpu / Primal 0.0000015182 s 0.0000015328 s 0.99
Concat / Jax / tpu / Primal 0.0000015384000000000002 s 0.000001532775 s 1.00
Concat / HLOOpt / tpu / Primal 0.0000015207 s 0.00000152725 s 1.00
Concat / PartOpt / tpu / Primal 0.0000015287 s 0.0000015307999999999998 s 1.00
Concat / IPartOpt / tpu / Primal 0.00000153195 s 0.000001521475 s 1.01
Concat / DefOpt / tpu / Primal 0.000001539625 s 0.000001526875 s 1.01
Concat / IDefOpt / tpu / Primal 0.0000015165 s 0.000001526525 s 0.99
Concat / JaXPipe / tpu / Forward 0.000001571925 s 0.0000015857750000000005 s 0.99
Concat / Jax / tpu / Forward 0.000001535425 s 0.00000155355 s 0.99
Concat / HLOOpt / tpu / Forward 0.000001580275 s 0.0000015830499999999998 s 1.00
Concat / PartOpt / tpu / Forward 0.00000154685 s 0.000001563175 s 0.99
Concat / IPartOpt / tpu / Forward 0.000001597975 s 0.0000015681 s 1.02
Concat / DefOpt / tpu / Forward 0.0000015478 s 0.00000154725 s 1.00
Concat / IDefOpt / tpu / Forward 0.0000015739 s 0.0000015760750000000002 s 1.00
Concat / JaXPipe / tpu / PreRev 0.0000020047 s 0.000001992425 s 1.01
Concat / JaXPipe / tpu / PostRev 0.000002085925 s 0.0000020814 s 1.00
Concat / JaXPipe / tpu / BothRev 0.0000019940000000000003 s 0.000002005825 s 0.99
Concat / Jax / tpu / BothRev 0.0000020712 s 0.0000020721 s 1.00
Concat / HLOOpt / tpu / PreRev 0.0000019976500000000003 s 0.0000020036 s 1.00
Concat / HLOOpt / tpu / PostRev 0.000002069275 s 0.00000207245 s 1.00
Concat / HLOOpt / tpu / BothRev 0.0000020058000000000003 s 0.0000020043 s 1.00
Concat / PartOpt / tpu / PreRev 0.0000020757000000000003 s 0.000002071125 s 1.00
Concat / PartOpt / tpu / PostRev 0.00000199385 s 0.00000199385 s 1.00
Concat / PartOpt / tpu / BothRev 0.00000206445 s 0.000002068375 s 1.00
Concat / IPartOpt / tpu / PreRev 0.0000019965250000000003 s 0.000001999575 s 1.00
Concat / IPartOpt / tpu / PostRev 0.000002072875 s 0.00000206675 s 1.00
Concat / IPartOpt / tpu / BothRev 0.000001993825 s 0.000001995025 s 1.00
Concat / DefOpt / tpu / PreRev 0.000002077275 s 0.000002065175 s 1.01
Concat / DefOpt / tpu / PostRev 0.000001996425 s 0.0000019980750000000003 s 1.00
Concat / DefOpt / tpu / BothRev 0.0000020737500000000003 s 0.000002064875 s 1.00
Concat / IDefOpt / tpu / PreRev 0.000001998575 s 0.000002001975 s 1.00
Concat / IDefOpt / tpu / PostRev 0.0000020645 s 0.0000020654 s 1.00
Concat / IDefOpt / tpu / BothRev 0.00000200805 s 0.00000199555 s 1.01
Concat / JaXPipe / cpu / Primal 0.000015684 s 0.000007477999961338355 s 2.10
Concat / Jax / cpu / Primal 0.000015889 s 0.000007448159994964953 s 2.13
Concat / HLOOpt / cpu / Primal 0.000015745 s 0.00000979825999820605 s 1.61
Concat / PartOpt / cpu / Primal 0.000015720000000000002 s 0.00000708663998011616 s 2.22
Concat / IPartOpt / cpu / Primal 0.000015756 s 0.000006348000015350408 s 2.48
Concat / DefOpt / cpu / Primal 0.000015685 s 0.000010170200021093478 s 1.54
Concat / IDefOpt / cpu / Primal 0.000015938999999999998 s 0.000006924580011400394 s 2.30
Concat / JaXPipe / cpu / Forward 0.000021655 s 0.000009657939999669908 s 2.24
Concat / Jax / cpu / Forward 0.000020761000000000003 s 0.000010302459977538091 s 2.02
Concat / HLOOpt / cpu / Forward 0.000021397 s 0.000013886780016036937 s 1.54
Concat / PartOpt / cpu / Forward 0.000021292 s 0.00001402134001182276 s 1.52
Concat / IPartOpt / cpu / Forward 0.00002129 s 0.000009403939984622413 s 2.26
Concat / DefOpt / cpu / Forward 0.00002141 s 0.000014035200047146646 s 1.53
Concat / IDefOpt / cpu / Forward 0.00002141 s 0.000010131599992746487 s 2.11
Concat / JaXPipe / cpu / PreRev 0.000024330000000000003 s 0.000011709379987223655 s 2.08
Concat / JaXPipe / cpu / PostRev 0.000024042 s 0.00001162401999863505 s 2.07
Concat / JaXPipe / cpu / BothRev 0.0000237 s 0.000011647779965642256 s 2.03
Concat / Jax / cpu / BothRev 0.000024021 s 0.00001116463993639627 s 2.15
Concat / HLOOpt / cpu / PreRev 0.00002413 s 0.000011569120024432778 s 2.09
Concat / HLOOpt / cpu / PostRev 0.000023686 s 0.0000153801800297515 s 1.54
Concat / HLOOpt / cpu / BothRev 0.000023358 s 0.000012803999989046132 s 1.82
Concat / PartOpt / cpu / PreRev 0.000023957 s 0.000011782940000557573 s 2.03
Concat / PartOpt / cpu / PostRev 0.00002407 s 0.000011856719975185115 s 2.03
Concat / PartOpt / cpu / BothRev 0.000023997 s 0.00001157544000307098 s 2.07
Concat / IPartOpt / cpu / PreRev 0.000023323 s 0.000012226760018165806 s 1.91
Concat / IPartOpt / cpu / PostRev 0.000023616 s 0.000011645340000541182 s 2.03
Concat / IPartOpt / cpu / BothRev 0.000024009 s 0.00001154382000095211 s 2.08
Concat / DefOpt / cpu / PreRev 0.000023931 s 0.000011529539979164836 s 2.08
Concat / DefOpt / cpu / PostRev 0.000024455 s 0.00001154548002887168 s 2.12
Concat / DefOpt / cpu / BothRev 0.0000244 s 0.00001121641997997358 s 2.18
Concat / IDefOpt / cpu / PreRev 0.000024365 s 0.000011383780001779087 s 2.14
Concat / IDefOpt / cpu / PostRev 0.000024263 s 0.000011767759997383109 s 2.06
Concat / IDefOpt / cpu / BothRev 0.000024574 s 0.000011008720011886908 s 2.23
const_scatter / JaXPipe / cpu / Primal 0.000006806519986639614 s 0.000007006740024735336 s 0.97
const_scatter / Jax / cpu / Primal 0.000007078879998516641 s 0.000006900340022184537 s 1.03
const_scatter / HLOOpt / cpu / Primal 0.000006793059965275461 s 0.000006681180020677857 s 1.02
const_scatter / PartOpt / cpu / Primal 0.000006534359990837402 s 0.000006747300003553391 s 0.97
const_scatter / IPartOpt / cpu / Primal 0.000006526719989778939 s 0.000007156999990911572 s 0.91
const_scatter / DefOpt / cpu / Primal 0.000011030680007024784 s 0.00001053997998496925 s 1.05
const_scatter / IDefOpt / cpu / Primal 0.000005976659958832897 s 0.000006687559998681536 s 0.89
const_scatter / JaXPipe / cpu / Forward 0.000009184160035147215 s 0.000009094260012716403 s 1.01
const_scatter / Jax / cpu / Forward 0.000010364099971411631 s 0.000009302199969170032 s 1.11
const_scatter / HLOOpt / cpu / Forward 0.000014170279964673682 s 0.000013041919974057237 s 1.09
const_scatter / PartOpt / cpu / Forward 0.000009670559993537609 s 0.000013533999981518718 s 0.71
const_scatter / IPartOpt / cpu / Forward 0.000009104039991143507 s 0.000009127520015681512 s 1.00
const_scatter / DefOpt / cpu / Forward 0.00001410830002896546 s 0.000013344820044949302 s 1.06
const_scatter / IDefOpt / cpu / Forward 0.000009989999980462015 s 0.000009275720003643071 s 1.08
const_scatter / JaXPipe / cpu / PreRev 0.0002999292600088 s 0.0002991972399649 s 1.00
const_scatter / JaXPipe / cpu / PostRev 0.0002897506600311 s 0.0003054082600374 s 0.95
const_scatter / JaXPipe / cpu / BothRev 0.0002848422400347 s 0.0002823708999585 s 1.01
const_scatter / Jax / cpu / BothRev 0.0002833048200136 s 0.0002835109200077 s 1.00
const_scatter / HLOOpt / cpu / PreRev 0.0002834453800096 s 0.0002834082999379 s 1.00
const_scatter / HLOOpt / cpu / PostRev 0.0002878748599414 s 0.0002892958200118 s 1.00
const_scatter / HLOOpt / cpu / BothRev 0.0002846289399894 s 0.0002893734799545 s 0.98
const_scatter / PartOpt / cpu / PreRev 0.0003120051999576 s 0.0002835973600394 s 1.10
const_scatter / PartOpt / cpu / PostRev 0.0002886159199533 s 0.0002832099999977 s 1.02
const_scatter / PartOpt / cpu / BothRev 0.0002829703600036 s 0.0002877302400247 s 0.98
const_scatter / IPartOpt / cpu / PreRev 0.0002898291000474 s 0.0002864275199954 s 1.01
const_scatter / IPartOpt / cpu / PostRev 0.0002901076599391 s 0.0002833985800225 s 1.02
const_scatter / IPartOpt / cpu / BothRev 0.0002826370599632 s 0.0003071726200323 s 0.92
const_scatter / DefOpt / cpu / PreRev 0.0002907615200092 s 0.0002995654199912 s 0.97
const_scatter / DefOpt / cpu / PostRev 0.0002898152199668 s 0.0002848389600876 s 1.02
const_scatter / DefOpt / cpu / BothRev 0.0002832118999958 s 0.0002836268600094 s 1.00
const_scatter / IDefOpt / cpu / PreRev 0.0002859992599951 s 0.000284376260015 s 1.01
const_scatter / IDefOpt / cpu / PostRev 0.000304742219987 s 0.0002873844400164 s 1.06
const_scatter / IDefOpt / cpu / BothRev 0.0002855401600209 s 0.0002819903199906 s 1.01
const_scatter / JaXPipe / cuda / Primal 0.00000192 s 0.00000192 s 1.00
const_scatter / Jax / cuda / Primal 0.000001919 s 0.0000019200000000000003 s 1.00
const_scatter / HLOOpt / cuda / Primal 0.000001888 s 0.000001888 s 1.00
const_scatter / PartOpt / cuda / Primal 0.00000192 s 0.00000192 s 1.00
const_scatter / IPartOpt / cuda / Primal 0.000001919 s 0.0000019200000000000003 s 1.00
const_scatter / DefOpt / cuda / Primal 0.000001888 s 0.000001889 s 1.00
const_scatter / IDefOpt / cuda / Primal 0.000001888 s 0.0000019200000000000003 s 0.98
const_scatter / JaXPipe / cuda / Forward 0.000009728 s 0.000009759 s 1.00
const_scatter / Jax / cuda / Forward 0.000009793 s 0.000009312000000000002 s 1.05
const_scatter / HLOOpt / cuda / Forward 0.00000976 s 0.000009696 s 1.01
const_scatter / PartOpt / cuda / Forward 0.000009919 s 0.000009792 s 1.01
const_scatter / IPartOpt / cuda / Forward 0.000009792 s 0.000009632 s 1.02
const_scatter / DefOpt / cuda / Forward 0.000009696 s 0.000009568 s 1.01
const_scatter / IDefOpt / cuda / Forward 0.000009888 s 0.000009536 s 1.04
const_scatter / JaXPipe / cuda / PreRev 0.000013632 s 0.00001264 s 1.08
const_scatter / JaXPipe / cuda / PostRev 0.000016864 s 0.000016416 s 1.03
const_scatter / JaXPipe / cuda / BothRev 0.000012544 s 0.000012448 s 1.01
const_scatter / Jax / cuda / BothRev 0.000016255999999999998 s 0.000016 s 1.02
const_scatter / HLOOpt / cuda / PreRev 0.000012736 s 0.000012256 s 1.04
const_scatter / HLOOpt / cuda / PostRev 0.00001296 s 0.000012929 s 1.00
const_scatter / HLOOpt / cuda / BothRev 0.000013088 s 0.00001264 s 1.04
const_scatter / PartOpt / cuda / PreRev 0.00001296 s 0.000012832 s 1.01
const_scatter / PartOpt / cuda / PostRev 0.000015744 s 0.000016351 s 0.96
const_scatter / PartOpt / cuda / BothRev 0.000013087 s 0.000012736 s 1.03
const_scatter / IPartOpt / cuda / PreRev 0.000013952 s 0.000012831 s 1.09
const_scatter / IPartOpt / cuda / PostRev 0.000018176 s 0.000016672 s 1.09
const_scatter / IPartOpt / cuda / BothRev 0.000012512 s 0.000012095 s 1.03
const_scatter / DefOpt / cuda / PreRev 0.000013792 s 0.000012608 s 1.09
const_scatter / DefOpt / cuda / PostRev 0.000013344 s 0.00001264 s 1.06
const_scatter / DefOpt / cuda / BothRev 0.0000136 s 0.000012864 s 1.06
const_scatter / IDefOpt / cuda / PreRev 0.000013536 s 0.000012992 s 1.04
const_scatter / IDefOpt / cuda / PostRev 0.000013664 s 0.000012672 s 1.08
const_scatter / IDefOpt / cuda / BothRev 0.000012544 s 0.000012768 s 0.98
const_scatter / JaXPipe / tpu / Primal 0.000003805825 s 0.000003802525 s 1.00
const_scatter / Jax / tpu / Primal 0.000003811475 s 0.000003814375 s 1.00
const_scatter / HLOOpt / tpu / Primal 9.52325e-7 s 9.53775e-7 s 1.00
const_scatter / PartOpt / tpu / Primal 0.000003796525 s 0.0000038103 s 1.00
const_scatter / IPartOpt / tpu / Primal 0.00000381055 s 0.0000038021 s 1.00
const_scatter / DefOpt / tpu / Primal 9.70275e-7 s 9.6515e-7 s 1.01
const_scatter / IDefOpt / tpu / Primal 9.60375e-7 s 9.5495e-7 s 1.01
const_scatter / JaXPipe / tpu / Forward 0.000001917925 s 0.000001930775 s 0.99
const_scatter / Jax / tpu / Forward 0.00000651695 s 0.0000064954000000000005 s 1.00
const_scatter / HLOOpt / tpu / Forward 0.00000191005 s 0.0000019141 s 1.00
const_scatter / PartOpt / tpu / Forward 0.000001947525 s 0.000001929125 s 1.01
const_scatter / IPartOpt / tpu / Forward 0.00000191665 s 0.00000192975 s 0.99
const_scatter / DefOpt / tpu / Forward 0.00000196345 s 0.000001941575 s 1.01
const_scatter / IDefOpt / tpu / Forward 0.000001922025 s 0.000001930225 s 1.00
const_scatter / JaXPipe / tpu / PreRev 0.0000043131 s 0.000004328524999999999 s 1.00
const_scatter / JaXPipe / tpu / PostRev 0.000006664625 s 0.00000667685 s 1.00
const_scatter / JaXPipe / tpu / BothRev 0.000004303675 s 0.000004316825 s 1.00
const_scatter / Jax / tpu / BothRev 0.000006687975 s 0.000006674625 s 1.00
const_scatter / HLOOpt / tpu / PreRev 0.00000430465 s 0.0000043274 s 0.99
const_scatter / HLOOpt / tpu / PostRev 0.000004307225 s 0.00000430925 s 1.00
const_scatter / HLOOpt / tpu / BothRev 0.00000429795 s 0.000004318425 s 1.00
const_scatter / PartOpt / tpu / PreRev 0.000004303875 s 0.0000043161 s 1.00
const_scatter / PartOpt / tpu / PostRev 0.00000666775 s 0.00000667475 s 1.00
const_scatter / PartOpt / tpu / BothRev 0.00000428525 s 0.0000043116 s 0.99
const_scatter / IPartOpt / tpu / PreRev 0.000004310625 s 0.000004317900000000001 s 1.00
const_scatter / IPartOpt / tpu / PostRev 0.0000066649 s 0.0000066586 s 1.00
const_scatter / IPartOpt / tpu / BothRev 0.000004301875 s 0.0000042968 s 1.00
const_scatter / DefOpt / tpu / PreRev 0.000004304825 s 0.000004312949999999999 s 1.00
const_scatter / DefOpt / tpu / PostRev 0.000004298575 s 0.000004308525 s 1.00
const_scatter / DefOpt / tpu / BothRev 0.0000042970000000000005 s 0.000004301725 s 1.00
const_scatter / IDefOpt / tpu / PreRev 0.000004303975 s 0.0000043092 s 1.00
const_scatter / IDefOpt / tpu / PostRev 0.0000043128 s 0.000004319875 s 1.00
const_scatter / IDefOpt / tpu / BothRev 0.000004294824999999999 s 0.000004295625 s 1.00
const_scatter / JaXPipe / cpu / Primal 0.000015929 s 0.000007006740024735336 s 2.27
const_scatter / Jax / cpu / Primal 0.000015819 s 0.000006900340022184537 s 2.29
const_scatter / HLOOpt / cpu / Primal 0.000015668 s 0.000006681180020677857 s 2.35
const_scatter / PartOpt / cpu / Primal 0.000015779000000000003 s 0.000006747300003553391 s 2.34
const_scatter / IPartOpt / cpu / Primal 0.000015308 s 0.000007156999990911572 s 2.14
const_scatter / DefOpt / cpu / Primal 0.000015705 s 0.00001053997998496925 s 1.49
const_scatter / IDefOpt / cpu / Primal 0.000015815 s 0.000006687559998681536 s 2.36
const_scatter / JaXPipe / cpu / Forward 0.000021005 s 0.000009094260012716403 s 2.31
const_scatter / Jax / cpu / Forward 0.00002085 s 0.000009302199969170032 s 2.24
const_scatter / HLOOpt / cpu / Forward 0.000020547 s 0.000013041919974057237 s 1.58
const_scatter / PartOpt / cpu / Forward 0.00002043 s 0.000013533999981518718 s 1.51
const_scatter / IPartOpt / cpu / Forward 0.000020848 s 0.000009127520015681512 s 2.28
const_scatter / DefOpt / cpu / Forward 0.000020083 s 0.000013344820044949302 s 1.50
const_scatter / IDefOpt / cpu / Forward 0.000021062 s 0.000009275720003643071 s 2.27
const_scatter / JaXPipe / cpu / PreRev 0.000522884 s 0.0002991972399649 s 1.75
const_scatter / JaXPipe / cpu / PostRev 0.000532339 s 0.0003054082600374 s 1.74
const_scatter / JaXPipe / cpu / BothRev 0.000518626 s 0.0002823708999585 s 1.84
const_scatter / Jax / cpu / BothRev 0.000533766 s 0.0002835109200077 s 1.88
const_scatter / HLOOpt / cpu / PreRev 0.000529937 s 0.0002834082999379 s 1.87
const_scatter / HLOOpt / cpu / PostRev 0.000530773 s 0.0002892958200118 s 1.83
const_scatter / HLOOpt / cpu / BothRev 0.000526086 s 0.0002893734799545 s 1.82
const_scatter / PartOpt / cpu / PreRev 0.000527366 s 0.0002835973600394 s 1.86
const_scatter / PartOpt / cpu / PostRev 0.000535559 s 0.0002832099999977 s 1.89
const_scatter / PartOpt / cpu / BothRev 0.000545703 s 0.0002877302400247 s 1.90
const_scatter / IPartOpt / cpu / PreRev 0.000523297 s 0.0002864275199954 s 1.83
const_scatter / IPartOpt / cpu / PostRev 0.0005376 s 0.0002833985800225 s 1.90
const_scatter / IPartOpt / cpu / BothRev 0.00056088 s 0.0003071726200323 s 1.83
const_scatter / DefOpt / cpu / PreRev 0.0005292869999999 s 0.0002995654199912 s 1.77
const_scatter / DefOpt / cpu / PostRev 0.00052443 s 0.0002848389600876 s 1.84
const_scatter / DefOpt / cpu / BothRev 0.000539118 s 0.0002836268600094 s 1.90
const_scatter / IDefOpt / cpu / PreRev 0.000523262 s 0.000284376260015 s 1.84
const_scatter / IDefOpt / cpu / PostRev 0.000529749 s 0.0002873844400164 s 1.84
const_scatter / IDefOpt / cpu / BothRev 0.000529966 s 0.0002819903199906 s 1.88
GenDot / JaXPipe / cpu / Primal 0.0000072258999898622274 s 0.000006947120009499486 s 1.04
GenDot / Jax / cpu / Primal 0.000006873860029372736 s 0.000006631260002905038 s 1.04
GenDot / HLOOpt / cpu / Primal 0.00001186787995720806 s 0.000011154000021633693 s 1.06
GenDot / PartOpt / cpu / Primal 0.000006440200031647692 s 0.000007032100011201692 s 0.92
GenDot / IPartOpt / cpu / Primal 0.000006469399995694402 s 0.000007280660038304632 s 0.89
GenDot / DefOpt / cpu / Primal 0.000012264159931874018 s 0.0000069018999965919646 s 1.78
GenDot / IDefOpt / cpu / Primal 0.000007042759989417391 s 0.000007160159993873094 s 0.98
GenDot / JaXPipe / cpu / Forward 0.00001089339995814953 s 0.000010356539960412192 s 1.05
GenDot / Jax / cpu / Forward 0.000010144619991478976 s 0.000009965219987861929 s 1.02
GenDot / HLOOpt / cpu / Forward 0.000010850579965335782 s 0.000015555780000795492 s 0.70
GenDot / PartOpt / cpu / Forward 0.000014091000002736107 s 0.000015106039973034056 s 0.93
GenDot / IPartOpt / cpu / Forward 0.000009987120020014116 s 0.000010579640002106316 s 0.94
GenDot / DefOpt / cpu / Forward 0.000015274380011760513 s 0.000015775420024510822 s 0.97
GenDot / IDefOpt / cpu / Forward 0.000010764159978862154 s 0.000010703139996621758 s 1.01
GenDot / JaXPipe / cpu / PreRev 0.000010639000010996825 s 0.000010941340015051536 s 0.97
GenDot / JaXPipe / cpu / PostRev 0.000010367580025558708 s 0.000009580099986123967 s 1.08
GenDot / JaXPipe / cpu / BothRev 0.000012410860026648152 s 0.000015421240004798164 s 0.80
GenDot / Jax / cpu / BothRev 0.000010058579964606909 s 0.00001034445996992872 s 0.97
GenDot / HLOOpt / cpu / PreRev 0.000010551619998295792 s 0.000010394100036137388 s 1.02
GenDot / HLOOpt / cpu / PostRev 0.000010698859941840056 s 0.00001084276000256068 s 0.99
GenDot / HLOOpt / cpu / BothRev 0.000011634260008577258 s 0.0000126556399573019 s 0.92
GenDot / PartOpt / cpu / PreRev 0.000010222399978374596 s 0.000010507800006962495 s 0.97
GenDot / PartOpt / cpu / PostRev 0.00000989210001534957 s 0.000009735419971548254 s 1.02
GenDot / PartOpt / cpu / BothRev 0.000010970100020131212 s 0.000010773899975902169 s 1.02
GenDot / IPartOpt / cpu / PreRev 0.000015729039996585926 s 0.0000126061000264599 s 1.25
GenDot / IPartOpt / cpu / PostRev 0.00001016750002236222 s 0.000009706900009405215 s 1.05
GenDot / IPartOpt / cpu / BothRev 0.000010763919999590145 s 0.00001048992001415172 s 1.03
GenDot / DefOpt / cpu / PreRev 0.000010973160015055329 s 0.0000105515800169087 s 1.04
GenDot / DefOpt / cpu / PostRev 0.000010253299960822914 s 0.000010753359983937116 s 0.95
GenDot / DefOpt / cpu / BothRev 0.000010674860013750732 s 0.000010645719949025078 s 1.00
GenDot / IDefOpt / cpu / PreRev 0.000010431959963170813 s 0.000010760300001493308 s 0.97
GenDot / IDefOpt / cpu / PostRev 0.00001093580000087968 s 0.00001044383999214915 s 1.05
GenDot / IDefOpt / cpu / BothRev 0.000011237580001761672 s 0.000010543380021772464 s 1.07
GenDot / JaXPipe / cuda / Primal 0.000002016 s 0.000002016 s 1.00
GenDot / Jax / cuda / Primal 0.000002016 s 0.000002015 s 1.00
GenDot / HLOOpt / cuda / Primal 0.000002015 s 0.000002015 s 1.00
GenDot / PartOpt / cuda / Primal 0.000002016 s 0.000002015 s 1.00
GenDot / IPartOpt / cuda / Primal 0.000002016 s 0.000002016 s 1.00
GenDot / DefOpt / cuda / Primal 0.000002015 s 0.000002015 s 1.00
GenDot / IDefOpt / cuda / Primal 0.000002016 s 0.000002015 s 1.00
GenDot / JaXPipe / cuda / Forward 0.000010112 s 0.00000992 s 1.02
GenDot / Jax / cuda / Forward 0.00000944 s 0.000009824 s 0.96
GenDot / HLOOpt / cuda / Forward 0.000009665 s 0.000009727 s 0.99
GenDot / PartOpt / cuda / Forward 0.000009823 s 0.000010047 s 0.98
GenDot / IPartOpt / cuda / Forward 0.000009824 s 0.000010817 s 0.91
GenDot / DefOpt / cuda / Forward 0.000009792 s 0.000009888 s 0.99
GenDot / IDefOpt / cuda / Forward 0.000009791 s 0.000010144 s 0.97
GenDot / JaXPipe / cuda / PreRev 0.000014592 s 0.000009696 s 1.50
GenDot / JaXPipe / cuda / PostRev 0.000009759 s 0.000010144 s 0.96
GenDot / JaXPipe / cuda / BothRev 0.000010048 s 0.00001168 s 0.86
GenDot / Jax / cuda / BothRev 0.000009664 s 0.000010144 s 0.95
GenDot / HLOOpt / cuda / PreRev 0.000010464 s 0.000010048 s 1.04
GenDot / HLOOpt / cuda / PostRev 0.000010464 s 0.000010208 s 1.03
GenDot / HLOOpt / cuda / BothRev 0.000010368 s 0.000009856 s 1.05
GenDot / PartOpt / cuda / PreRev 0.000009888 s 0.00000976 s 1.01
GenDot / PartOpt / cuda / PostRev 0.000010176 s 0.000010271 s 0.99
GenDot / PartOpt / cuda / BothRev 0.000010336 s 0.000009983 s 1.04
GenDot / IPartOpt / cuda / PreRev 0.00000992 s 0.000010145 s 0.98
GenDot / IPartOpt / cuda / PostRev 0.000010368 s 0.000009888 s 1.05
GenDot / IPartOpt / cuda / BothRev 0.000010048 s 0.000009984 s 1.01
GenDot / DefOpt / cuda / PreRev 0.000009984 s 0.000010688 s 0.93
GenDot / DefOpt / cuda / PostRev 0.000010048 s 0.000009889 s 1.02
GenDot / DefOpt / cuda / BothRev 0.000009792 s 0.0000104 s 0.94
GenDot / IDefOpt / cuda / PreRev 0.000010144 s 0.000009728 s 1.04
GenDot / IDefOpt / cuda / PostRev 0.000009888 s 0.000010208 s 0.97
GenDot / IDefOpt / cuda / BothRev 0.000010176 s 0.000009856 s 1.03
GenDot / JaXPipe / tpu / Primal 9.30175e-7 s 9.30225e-7 s 1.00
GenDot / Jax / tpu / Primal 9.36125e-7 s 9.36325e-7 s 1.00
GenDot / HLOOpt / tpu / Primal 0.00000157635 s 0.000001582275 s 1.00
GenDot / PartOpt / tpu / Primal 9.36e-7 s 9.367e-7 s 1.00
GenDot / IPartOpt / tpu / Primal 9.40075e-7 s 9.4025e-7 s 1.00
GenDot / DefOpt / tpu / Primal 0.0000015000749999999998 s 0.00000150015 s 1.00
GenDot / IDefOpt / tpu / Primal 0.0000015762 s 0.00000157865 s 1.00
GenDot / JaXPipe / tpu / Forward 0.0000031579 s 0.0000031637 s 1.00
GenDot / Jax / tpu / Forward 0.000002334675 s 0.000002335475 s 1.00
GenDot / HLOOpt / tpu / Forward 0.0000031101750000000003 s 0.000003120325 s 1.00
GenDot / PartOpt / tpu / Forward 0.00000321845 s 0.0000032214749999999995 s 1.00
GenDot / IPartOpt / tpu / Forward 0.0000031199 s 0.0000031289499999999995 s 1.00
GenDot / DefOpt / tpu / Forward 0.000003214625 s 0.0000032214749999999995 s 1.00
GenDot / IDefOpt / tpu / Forward 0.0000031197000000000004 s 0.0000031285 s 1.00
GenDot / JaXPipe / tpu / PreRev 0.00000297265 s 0.000002972925 s 1.00
GenDot / JaXPipe / tpu / PostRev 0.00000240415 s 0.000002405675 s 1.00
GenDot / JaXPipe / tpu / BothRev 0.000002969875 s 0.00000296155 s 1.00
GenDot / Jax / tpu / BothRev 0.0000024108750000000004 s 0.000002403525 s 1.00
GenDot / HLOOpt / tpu / PreRev 0.000002966125 s 0.00000296425 s 1.00
GenDot / HLOOpt / tpu / PostRev 0.0000029226500000000003 s 0.00000292975 s 1.00
GenDot / HLOOpt / tpu / BothRev 0.0000029594 s 0.0000029618 s 1.00
GenDot / PartOpt / tpu / PreRev 0.000002939775 s 0.0000029347 s 1.00
GenDot / PartOpt / tpu / PostRev 0.0000023864 s 0.0000024008 s 0.99
GenDot / PartOpt / tpu / BothRev 0.000002940775 s 0.000002934525 s 1.00
GenDot / IPartOpt / tpu / PreRev 0.000002965775 s 0.000002959775 s 1.00
GenDot / IPartOpt / tpu / PostRev 0.0000024031 s 0.000002402725 s 1.00
GenDot / IPartOpt / tpu / BothRev 0.000002962625 s 0.00000295655 s 1.00
GenDot / DefOpt / tpu / PreRev 0.0000029423749999999995 s 0.0000029418 s 1.00
GenDot / DefOpt / tpu / PostRev 0.000002965775 s 0.000002963375 s 1.00
GenDot / DefOpt / tpu / BothRev 0.0000029407 s 0.0000029399 s 1.00
GenDot / IDefOpt / tpu / PreRev 0.00000295815 s 0.0000029657 s 1.00
GenDot / IDefOpt / tpu / PostRev 0.0000029338 s 0.000002932425 s 1.00
GenDot / IDefOpt / tpu / BothRev 0.000002964075 s 0.0000029619 s 1.00
GenDot / JaXPipe / cpu / Primal 0.000018145 s 0.000006947120009499486 s 2.61
GenDot / Jax / cpu / Primal 0.000017959 s 0.000006631260002905038 s 2.71
GenDot / HLOOpt / cpu / Primal 0.000017406000000000002 s 0.000011154000021633693 s 1.56
GenDot / PartOpt / cpu / Primal 0.000018187 s 0.000007032100011201692 s 2.59
GenDot / IPartOpt / cpu / Primal 0.000018488 s 0.000007280660038304632 s 2.54
GenDot / DefOpt / cpu / Primal 0.00001746 s 0.0000069018999965919646 s 2.53
GenDot / IDefOpt / cpu / Primal 0.000017147 s 0.000007160159993873094 s 2.39
GenDot / JaXPipe / cpu / Forward 0.000025105 s 0.000010356539960412192 s 2.42
GenDot / Jax / cpu / Forward 0.000024854 s 0.000009965219987861929 s 2.49
GenDot / HLOOpt / cpu / Forward 0.000023496 s 0.000015555780000795492 s 1.51
GenDot / PartOpt / cpu / Forward 0.000023468 s 0.000015106039973034056 s 1.55
GenDot / IPartOpt / cpu / Forward 0.000023391 s 0.000010579640002106316 s 2.21
GenDot / DefOpt / cpu / Forward 0.000023726 s 0.000015775420024510822 s 1.50
GenDot / IDefOpt / cpu / Forward 0.000023155 s 0.000010703139996621758 s 2.16
GenDot / JaXPipe / cpu / PreRev 0.000023872 s 0.000010941340015051536 s 2.18
GenDot / JaXPipe / cpu / PostRev 0.000025339 s 0.000009580099986123967 s 2.64
GenDot / JaXPipe / cpu / BothRev 0.000023873 s 0.000015421240004798164 s 1.55
GenDot / Jax / cpu / BothRev 0.000025151 s 0.00001034445996992872 s 2.43
GenDot / HLOOpt / cpu / PreRev 0.00002311 s 0.000010394100036137388 s 2.22
GenDot / HLOOpt / cpu / PostRev 0.000023884 s 0.00001084276000256068 s 2.20
GenDot / HLOOpt / cpu / BothRev 0.000023582 s 0.0000126556399573019 s 1.86
GenDot / PartOpt / cpu / PreRev 0.000024048 s 0.000010507800006962495 s 2.29
GenDot / PartOpt / cpu / PostRev 0.000025532 s 0.000009735419971548254 s 2.62
GenDot / PartOpt / cpu / BothRev 0.000023629 s 0.000010773899975902169 s 2.19
GenDot / IPartOpt / cpu / PreRev 0.000023398 s 0.0000126061000264599 s 1.86
GenDot / IPartOpt / cpu / PostRev 0.0000248 s 0.000009706900009405215 s 2.55
GenDot / IPartOpt / cpu / BothRev 0.000023735 s 0.00001048992001415172 s 2.26
GenDot / DefOpt / cpu / PreRev 0.000024281 s 0.0000105515800169087 s 2.30
GenDot / DefOpt / cpu / PostRev 0.000024015000000000003 s 0.000010753359983937116 s 2.23
GenDot / DefOpt / cpu / BothRev 0.000023639 s 0.000010645719949025078 s 2.22
GenDot / IDefOpt / cpu / PreRev 0.000023231 s 0.000010760300001493308 s 2.16
GenDot / IDefOpt / cpu / PostRev 0.000023372 s 0.00001044383999214915 s 2.24
GenDot / IDefOpt / cpu / BothRev 0.00002337 s 0.000010543380021772464 s 2.22
hlo_ffi / JaXPipe / cpu / Primal 0.000010553960000834197 s 0.00001215756002238777 s 0.87
hlo_ffi / Jax / cpu / Primal 0.000010557359992162674 s 0.000011048639989894584 s 0.96
hlo_ffi / HLOOpt / cpu / Primal 0.00001379296002596675 s 0.000014333660019474336 s 0.96
hlo_ffi / PartOpt / cpu / Primal 0.000010393359980298556 s 0.0000105436999729136 s 0.99
hlo_ffi / IPartOpt / cpu / Primal 0.00001040902003296651 s 0.000011186599986103829 s 0.93
hlo_ffi / DefOpt / cpu / Primal 0.000014546019965564484 s 0.000011220920032428696 s 1.30
hlo_ffi / IDefOpt / cpu / Primal 0.000010131940025530638 s 0.000010587520037006473 s 0.96
hlo_ffi / JaXPipe / cpu / Forward 0.000014821639997535385 s 0.000015703720000601605 s 0.94
hlo_ffi / Jax / cpu / Forward 0.000014716419927935933 s 0.0000157695599955332 s 0.93
hlo_ffi / HLOOpt / cpu / Forward 0.000015233600024657787 s 0.00001574168001752696 s 0.97
hlo_ffi / PartOpt / cpu / Forward 0.00001550762000078976 s 0.00001541290002023743 s 1.01
hlo_ffi / IPartOpt / cpu / Forward 0.00001527420000456914 s 0.000016654940009175333 s 0.92
hlo_ffi / DefOpt / cpu / Forward 0.00001563409995469556 s 0.000015555039981336448 s 1.01
hlo_ffi / IDefOpt / cpu / Forward 0.000015012900003057438 s 0.00001551384004415013 s 0.97
hlo_ffi / JaXPipe / cpu / PreRev 0.000014940260007278992 s 0.000015577020049022396 s 0.96
hlo_ffi / JaXPipe / cpu / PostRev 0.00001484106001953478 s 0.00001549124001940072 s 0.96
hlo_ffi / JaXPipe / cpu / BothRev 0.000015342179976869376 s 0.000018198480001956345 s 0.84
hlo_ffi / Jax / cpu / BothRev 0.00001493138001023908 s 0.00001639481999518466 s 0.91
hlo_ffi / HLOOpt / cpu / PreRev 0.000015015779981695232 s 0.00001555846000883321 s 0.97
hlo_ffi / HLOOpt / cpu / PostRev 0.000014814880041740252 s 0.00001602827996975975 s 0.92
hlo_ffi / HLOOpt / cpu / BothRev 0.00001689178000560787 s 0.000017372659985994687 s 0.97
hlo_ffi / PartOpt / cpu / PreRev 0.000015088300015122514 s 0.000015273759972842528 s 0.99
hlo_ffi / PartOpt / cpu / PostRev 0.00001531143999272899 s 0.000015502440010095598 s 0.99
hlo_ffi / PartOpt / cpu / BothRev 0.000014818060017205425 s 0.000015788059963597335 s 0.94
hlo_ffi / IPartOpt / cpu / PreRev 0.000014639300025010015 s 0.000015949380012898474 s 0.92
hlo_ffi / IPartOpt / cpu / PostRev 0.000014955619972170098 s 0.000016042460019889402 s 0.93
hlo_ffi / IPartOpt / cpu / BothRev 0.000014924800034350482 s 0.00001584979996550828 s 0.94
hlo_ffi / DefOpt / cpu / PreRev 0.000015131500022107505 s 0.000015901400001894216 s 0.95
hlo_ffi / DefOpt / cpu / PostRev 0.000014702599992233444 s 0.000015517680012635537 s 0.95
hlo_ffi / DefOpt / cpu / BothRev 0.00001492728003540833 s 0.000015647959990019445 s 0.95
hlo_ffi / IDefOpt / cpu / PreRev 0.00001536083998871618 s 0.000015500539984714123 s 0.99
hlo_ffi / IDefOpt / cpu / PostRev 0.00001526971997009241 s 0.00001546057996165473 s 0.99
hlo_ffi / IDefOpt / cpu / BothRev 0.000014659799990113243 s 0.000015286819971151998 s 0.96
hlo_ffi / JaXPipe / cuda / Primal 0.000001984 s 0.000001983 s 1.00
hlo_ffi / Jax / cuda / Primal 0.000001984 s 0.000001984 s 1.00
hlo_ffi / HLOOpt / cuda / Primal 0.000001984 s 0.000001983 s 1.00
hlo_ffi / PartOpt / cuda / Primal 0.000001983 s 0.000001984 s 1.00
hlo_ffi / IPartOpt / cuda / Primal 0.000001984 s 0.000001984 s 1.00
hlo_ffi / DefOpt / cuda / Primal 0.000001984 s 0.000001983 s 1.00
hlo_ffi / IDefOpt / cuda / Primal 0.000001984 s 0.000001984 s 1.00
hlo_ffi / JaXPipe / cuda / Forward 0.000002079 s 0.00000208 s 1.00
hlo_ffi / Jax / cuda / Forward 0.00000208 s 0.000002049 s 1.02
hlo_ffi / HLOOpt / cuda / Forward 0.000002079 s 0.00000208 s 1.00
hlo_ffi / PartOpt / cuda / Forward 0.00000208 s 0.00000208 s 1.00
hlo_ffi / IPartOpt / cuda / Forward 0.000002079 s 0.00000208 s 1.00
hlo_ffi / DefOpt / cuda / Forward 0.000002048 s 0.00000208 s 0.98
hlo_ffi / IDefOpt / cuda / Forward 0.00000208 s 0.000002079 s 1.00
hlo_ffi / JaXPipe / cuda / PreRev 0.000002079 s 0.000002048 s 1.02
hlo_ffi / JaXPipe / cuda / PostRev 0.000002047 s 0.000002048 s 1.00
hlo_ffi / JaXPipe / cuda / BothRev 0.00000208 s 0.000002048 s 1.02
hlo_ffi / Jax / cuda / BothRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / HLOOpt / cuda / PreRev 0.000002047 s 0.000002048 s 1.00
hlo_ffi / HLOOpt / cuda / PostRev 0.000002047 s 0.000002048 s 1.00
hlo_ffi / HLOOpt / cuda / BothRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / PartOpt / cuda / PreRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / PartOpt / cuda / PostRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / PartOpt / cuda / BothRev 0.000002079 s 0.000002048 s 1.02
hlo_ffi / IPartOpt / cuda / PreRev 0.000002079 s 0.000002047 s 1.02
hlo_ffi / IPartOpt / cuda / PostRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / IPartOpt / cuda / BothRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / DefOpt / cuda / PreRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / DefOpt / cuda / PostRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / DefOpt / cuda / BothRev 0.000002047 s 0.000002048 s 1.00
hlo_ffi / IDefOpt / cuda / PreRev 0.000002079 s 0.000002047 s 1.02
hlo_ffi / IDefOpt / cuda / PostRev 0.000002048 s 0.000002048 s 1.00
hlo_ffi / IDefOpt / cuda / BothRev 0.000002048 s 0.000002047 s 1.00
hlo_ffi / JaXPipe / tpu / Primal 9.19725e-7 s 9.09975e-7 s 1.01
hlo_ffi / Jax / tpu / Primal 9.49775e-7 s 9.754e-7 s 0.97
hlo_ffi / HLOOpt / tpu / Primal 8.9505e-7 s 9.47675e-7 s 0.94
hlo_ffi / PartOpt / tpu / Primal 9.5035e-7 s 9.80375e-7 s 0.97
hlo_ffi / IPartOpt / tpu / Primal 8.9825e-7 s 9.451e-7 s 0.95
hlo_ffi / DefOpt / tpu / Primal 9.59225e-7 s 9.7455e-7 s 0.98
hlo_ffi / IDefOpt / tpu / Primal 8.98775e-7 s 9.5405e-7 s 0.94
hlo_ffi / JaXPipe / tpu / Forward 9.495e-7 s 9.489e-7 s 1.00
hlo_ffi / Jax / tpu / Forward 9.81375e-7 s 9.817e-7 s 1.00
hlo_ffi / HLOOpt / tpu / Forward 9.74125e-7 s 9.73525e-7 s 1.00
hlo_ffi / PartOpt / tpu / Forward 9.345e-7 s 9.5905e-7 s 0.97
hlo_ffi / IPartOpt / tpu / Forward 9.73625e-7 s 9.73825e-7 s 1.00
hlo_ffi / DefOpt / tpu / Forward 9.33925e-7 s 9.5865e-7 s 0.97
hlo_ffi / IDefOpt / tpu / Forward 9.73875e-7 s 9.7355e-7 s 1.00
hlo_ffi / JaXPipe / tpu / PreRev 9.32275e-7 s 9.5385e-7 s 0.98
hlo_ffi / JaXPipe / tpu / PostRev 9.647e-7 s 9.64375e-7 s 1.00
hlo_ffi / JaXPipe / tpu / BothRev 9.5965e-7 s 9.948e-7 s 0.96
hlo_ffi / Jax / tpu / BothRev 9.651e-7 s 9.65025e-7 s 1.00
hlo_ffi / HLOOpt / tpu / PreRev 9.59825e-7 s 9.9415e-7 s 0.97
hlo_ffi / HLOOpt / tpu / PostRev 9.651e-7 s 9.642e-7 s 1.00
hlo_ffi / HLOOpt / tpu / BothRev 9.59675e-7 s 9.946e-7 s 0.96
hlo_ffi / PartOpt / tpu / PreRev 9.6465e-7 s 9.64275e-7 s 1.00
hlo_ffi / PartOpt / tpu / PostRev 9.598e-7 s 9.9475e-7 s 0.96
hlo_ffi / PartOpt / tpu / BothRev 9.65175e-7 s 9.645e-7 s 1.00
hlo_ffi / IPartOpt / tpu / PreRev 9.60125e-7 s 9.946249999999998e-7 s 0.97
hlo_ffi / IPartOpt / tpu / PostRev 9.65575e-7 s 9.646e-7 s 1.00
hlo_ffi / IPartOpt / tpu / BothRev 9.60225e-7 s 9.946e-7 s 0.97
hlo_ffi / DefOpt / tpu / PreRev 9.6525e-7 s 9.6435e-7 s 1.00
hlo_ffi / DefOpt / tpu / PostRev 9.60375e-7 s 9.9435e-7 s 0.97
hlo_ffi / DefOpt / tpu / BothRev 9.650749999999998e-7 s 9.64725e-7 s 1.00
hlo_ffi / IDefOpt / tpu / PreRev 9.60275e-7 s 9.944000000000002e-7 s 0.97
hlo_ffi / IDefOpt / tpu / PostRev 9.64675e-7 s 9.642e-7 s 1.00
hlo_ffi / IDefOpt / tpu / BothRev 9.60075e-7 s 9.94575e-7 s 0.97
hlo_ffi / JaXPipe / cpu / Primal 0.000021682 s 0.00001215756002238777 s 1.78
hlo_ffi / Jax / cpu / Primal 0.000021685 s 0.000011048639989894584 s 1.96
hlo_ffi / HLOOpt / cpu / Primal 0.000021221 s 0.000014333660019474336 s 1.48
hlo_ffi / PartOpt / cpu / Primal 0.000021028000000000003 s 0.0000105436999729136 s 1.99
hlo_ffi / IPartOpt / cpu / Primal 0.000021547 s 0.000011186599986103829 s 1.93
hlo_ffi / DefOpt / cpu / Primal 0.000021409 s 0.000011220920032428696 s 1.91
hlo_ffi / IDefOpt / cpu / Primal 0.000021307 s 0.000010587520037006473 s 2.01
hlo_ffi / JaXPipe / cpu / Forward 0.000030411 s 0.000015703720000601605 s 1.94
hlo_ffi / Jax / cpu / Forward 0.000030104 s 0.0000157695599955332 s 1.91
hlo_ffi / HLOOpt / cpu / Forward 0.00002972 s 0.00001574168001752696 s 1.89
hlo_ffi / PartOpt / cpu / Forward 0.000029394 s 0.00001541290002023743 s 1.91
hlo_ffi / IPartOpt / cpu / Forward 0.000029744 s 0.000016654940009175333 s 1.79
hlo_ffi / DefOpt / cpu / Forward 0.000029495 s 0.000015555039981336448 s 1.90
hlo_ffi / IDefOpt / cpu / Forward 0.000029736 s 0.00001551384004415013 s 1.92
hlo_ffi / JaXPipe / cpu / PreRev 0.000030484 s 0.000015577020049022396 s 1.96
hlo_ffi / JaXPipe / cpu / PostRev 0.000029783 s 0.00001549124001940072 s 1.92
hlo_ffi / JaXPipe / cpu / BothRev 0.000030404 s 0.000018198480001956345 s 1.67
hlo_ffi / Jax / cpu / BothRev 0.000029675 s 0.00001639481999518466 s 1.81
hlo_ffi / HLOOpt / cpu / PreRev 0.000029893 s 0.00001555846000883321 s 1.92
hlo_ffi / HLOOpt / cpu / PostRev 0.00002977 s 0.00001602827996975975 s 1.86
hlo_ffi / HLOOpt / cpu / BothRev 0.000030118 s 0.000017372659985994687 s 1.73
hlo_ffi / PartOpt / cpu / PreRev 0.000030551 s 0.000015273759972842528 s 2.00
hlo_ffi / PartOpt / cpu / PostRev 0.000030883 s 0.000015502440010095598 s 1.99
hlo_ffi / PartOpt / cpu / BothRev 0.000030171 s 0.000015788059963597335 s 1.91
hlo_ffi / IPartOpt / cpu / PreRev 0.000030291 s 0.000015949380012898474 s 1.90
hlo_ffi / IPartOpt / cpu / PostRev 0.000030817 s 0.000016042460019889402 s 1.92
hlo_ffi / IPartOpt / cpu / BothRev 0.000030333 s 0.00001584979996550828 s 1.91
hlo_ffi / DefOpt / cpu / PreRev 0.0000305 s 0.000015901400001894216 s 1.92
hlo_ffi / DefOpt / cpu / PostRev 0.00003033 s 0.000015517680012635537 s 1.95
hlo_ffi / DefOpt / cpu / BothRev 0.00003099 s 0.000015647959990019445 s 1.98
hlo_ffi / IDefOpt / cpu / PreRev 0.000030658000000000004 s 0.000015500539984714123 s 1.98
hlo_ffi / IDefOpt / cpu / PostRev 0.000051691 s 0.00001546057996165473 s 3.34
hlo_ffi / IDefOpt / cpu / BothRev 0.00003017 s 0.000015286819971151998 s 1.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal 0.0011608180000621 s 0.0012036823999551 s 0.96
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal 0.000921300200116 s 0.0009620957998777 s 0.96
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal 0.000965019799878 s 0.0009929958000611 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal 0.0009083613998882 s 0.0009454152001126 s 0.96
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal 0.0009227689999534 s 0.0009428865999325 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal 0.0009837171998697 s 0.0009812250000322 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal 0.0009558567999192 s 0.0009777559998838 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward 0.0026824629999282 s 0.002858737400038 s 0.94
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward 0.0023384537998936 s 0.0023105184000087 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward 0.0021713529999942 s 0.0022813554001004 s 0.95
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward 0.0022423364000133 s 0.0021734424000896 s 1.03
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward 0.0021497572000953 s 0.0021884054000111 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward 0.0022551081999154 s 0.0024744702000134 s 0.91
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward 0.002412242399987 s 0.0022505462000481 s 1.07
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev 0.0061717777999547 s 0.006784250200053 s 0.91
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev 0.0052599124000153 s 0.0058888162000585 s 0.89
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev 0.0054720860000998 s 0.0056171007999182 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev 0.0056609621999996 s 0.0059114769999723 s 0.96
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev 0.0055049397999027 s 0.0058158163998086 s 0.95
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev 0.005428001600103 s 0.0045190218000243 s 1.20
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev 0.006919869199919 s 0.0065032250000513 s 1.06
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev 0.0054702464000911 s 0.0046092143998066 s 1.19
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev 0.0059115796000696 s 0.0070172094000554 s 0.84
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev 0.0057619930000328 s 0.0033629600000494 s 1.71
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev 0.0063755414000297 s 0.0063722097999743 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev 0.0057007753999641 s 0.0054374347999328 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev 0.0032509261998711 s 0.0033442355998886 s 0.97
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev 0.0054783122000117 s 0.0050351783999758 s 1.09
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev 0.0031110091998925 s 0.0075923654000689 s 0.41
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev 0.0052721056000336 s 0.0051982864000819 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev 0.006876614200064 s 0.0033124989999123 s 2.08
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev 0.0058515341999736 s 0.0049496529998577 s 1.18
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev 0.0061410750000504 s 0.0033382535999407 s 1.84
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal 0.000281534 s 0.000279551 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal 0.000281278 s 0.00028 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal 0.00028931 s 0.000287296 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal 0.000282398 s 0.000281216 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal 0.000282879 s 0.0002812149999999 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal 0.000288798 s 0.000287392 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal 0.0002890539999999 s 0.000287263 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward 0.000559964 s 0.000558686 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward 0.000539837 s 0.000539007 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward 0.000559772 s 0.000558366 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward 0.000560508 s 0.000559166 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward 0.00056086 s 0.000558559 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward 0.00055974 s 0.000558239 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward 0.00056006 s 0.000557855 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev 0.001028856 s 0.001026045 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev 0.000987865 s 0.000983869 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev 0.001022745 s 0.001019773 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev 0.00098537 s 0.000982365 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev 0.00100956 s 0.001006621 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev 0.0010360249999999 s 0.001031101 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev 0.001009081 s 0.001007485 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev 0.001025818 s 0.001023357 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev 0.000974137 s 0.000972957 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev 0.001026009 s 0.001019998 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev 0.001024665 s 0.001020509 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev 0.000975097 s 0.000971806 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev 0.001029945 s 0.0010250539999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev 0.001019482 s 0.001018141 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev 0.000957914 s 0.000955134 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev 0.001020505 s 0.001018557 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev 0.0010201849999999 s 0.001018461 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev 0.001021593 s 0.001017341 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev 0.0010210809999999 s 0.00101523 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal 0.00012412975 s 0.000130776 s 0.95
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal 0.00012663325 s 0.00012379575 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal 0.000152624 s 0.0001602895 s 0.95
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal 0.00013437525 s 0.00013092325 s 1.03
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal 0.00013076625 s 0.00013860425 s 0.94
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal 0.0001485145 s 0.0001448835 s 1.03
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal 0.0001508324999999 s 0.000158363 s 0.95
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward 0.0002121357499999 s 0.00021344325 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward 0.0002612459999999 s 0.000262739 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward 0.00021226825 s 0.0002202455 s 0.96
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward 0.00021842225 s 0.0002149439999999 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward 0.00021231325 s 0.00021632625 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward 0.00021834975 s 0.0002179289999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward 0.0002123635 s 0.000215482 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev 0.0003545575 s 0.00035604775 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev 0.000256988 s 0.00025613975 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev 0.0003549145 s 0.0003558845 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev 0.00025698375 s 0.0002572075 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev 0.000354719 s 0.00035595475 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev 0.00029080325 s 0.00029121575 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev 0.00035460975 s 0.0003563505 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev 0.00035542875 s 0.0003559474999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev 0.00027109 s 0.0002721675 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev 0.00035554475 s 0.00035589125 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev 0.0003546405 s 0.0003560404999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev 0.0002720775 s 0.000272059 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev 0.00035490025 s 0.00035620475 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev 0.00035800825 s 0.00035833425 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev 0.0002838684999999 s 0.00028397 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev 0.00035760675 s 0.00035829275 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev 0.0003568052499999 s 0.0003583175 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev 0.00030104975 s 0.00030107325 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev 0.000357122 s 0.0003580094999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal 0.0022880039999999 s 0.0012036823999551 s 1.90
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal 0.0023439339999999 s 0.0009620957998777 s 2.44
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal 0.00246858 s 0.0009929958000611 s 2.49
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal 0.002261953 s 0.0009454152001126 s 2.39
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal 0.002333813 s 0.0009428865999325 s 2.48
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal 0.002340501 s 0.0009812250000322 s 2.39
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal 0.00231937 s 0.0009777559998838 s 2.37
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward 0.00589891 s 0.002858737400038 s 2.06
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward 0.006237733 s 0.0023105184000087 s 2.70
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward 0.0060525299999999 s 0.0022813554001004 s 2.65
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward 0.006347947 s 0.0021734424000896 s 2.92
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward 0.0062175 s 0.0021884054000111 s 2.84
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward 0.005916849 s 0.0024744702000134 s 2.39
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward 0.0061690389999999 s 0.0022505462000481 s 2.74
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev 0.0123798969999999 s 0.006784250200053 s 1.82
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev 0.009809592 s 0.0058888162000585 s 1.67
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev 0.009828801 s 0.0056171007999182 s 1.75
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev 0.010812429 s 0.0059114769999723 s 1.83
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev 0.009623281 s 0.0058158163998086 s 1.65
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev 0.010003285 s 0.0045190218000243 s 2.21
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev 0.0117749439999999 s 0.0065032250000513 s 1.81
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev 0.009447268 s 0.0046092143998066 s 2.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev 0.012011901 s 0.0070172094000554 s 1.71
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev 0.011384556 s 0.0033629600000494 s 3.39
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev 0.010043784 s 0.0063722097999743 s 1.58
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev 0.010663173 s 0.0054374347999328 s 1.96
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev 0.010075192 s 0.0033442355998886 s 3.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev 0.009827913 s 0.0050351783999758 s 1.95
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev 0.0091100779999999 s 0.0075923654000689 s 1.20
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev 0.009522662 s 0.0051982864000819 s 1.83
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev 0.009607598 s 0.0033124989999123 s 2.90
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev 0.01002465 s 0.0049496529998577 s 2.03
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev 0.009256671 s 0.0033382535999407 s 2.77
scatter_sum / JaXPipe / cpu / Primal 0.000008765140000832616 s 0.0000080933799927152 s 1.08
scatter_sum / Jax / cpu / Primal 0.00000830382003186969 s 0.00000773041997490509 s 1.07
scatter_sum / HLOOpt / cpu / Primal 0.00001165101999504259 s 0.000010982679996232036 s 1.06
scatter_sum / PartOpt / cpu / Primal 0.00000729794000108086 s 0.000007304119981199619 s 1.00
scatter_sum / IPartOpt / cpu / Primal 0.000007218680011646939 s 0.000007588260004922631 s 0.95
scatter_sum / DefOpt / cpu / Primal 0.000007342320022871718 s 0.000007340080019275774 s 1.00
scatter_sum / IDefOpt / cpu / Primal 0.000007587279960716842 s 0.000007766280004943837 s 0.98
scatter_sum / JaXPipe / cpu / Forward 0.000011682040021696592 s 0.000011774819968195516 s 0.99
scatter_sum / Jax / cpu / Forward 0.000011735579992091516 s 0.00001104679997297353 s 1.06
scatter_sum / HLOOpt / cpu / Forward 0.000016802159971121 s 0.000016549519978070747 s 1.02
scatter_sum / PartOpt / cpu / Forward 0.00001270353999643703 s 0.000016693700017640368 s 0.76
scatter_sum / IPartOpt / cpu / Forward 0.000011650680035018013 s 0.00001139787999818509 s 1.02
scatter_sum / DefOpt / cpu / Forward 0.000017205119966092753 s 0.00001691736004431732 s 1.02
scatter_sum / IDefOpt / cpu / Forward 0.000011998819991276832 s 0.000011772519965234096 s 1.02
scatter_sum / JaXPipe / cpu / PreRev 0.00001135266001256241 s 0.000012000640008409392 s 0.95
scatter_sum / JaXPipe / cpu / PostRev 0.000011807319997387824 s 0.000011409239968998008 s 1.03
scatter_sum / JaXPipe / cpu / BothRev 0.000011457300033725914 s 0.000016204619996642577 s 0.71
scatter_sum / Jax / cpu / BothRev 0.000011159440018673197 s 0.000011913580010514124 s 0.94
scatter_sum / HLOOpt / cpu / PreRev 0.00001145110003562877 s 0.000011849580014313687 s 0.97
scatter_sum / HLOOpt / cpu / PostRev 0.00001612585995644622 s 0.00001634619997275877 s 0.99
scatter_sum / HLOOpt / cpu / BothRev 0.000012872859979324858 s 0.000013083580033708132 s 0.98
scatter_sum / PartOpt / cpu / PreRev 0.000011292019980828628 s 0.000011277140010861333 s 1.00
scatter_sum / PartOpt / cpu / PostRev 0.000011635459941317094 s 0.000011436099994170943 s 1.02
scatter_sum / PartOpt / cpu / BothRev 0.000011186780011485096 s 0.000012017060034850149 s 0.93
scatter_sum / IPartOpt / cpu / PreRev 0.000011781960020016412 s 0.00001792121997823415 s 0.66
scatter_sum / IPartOpt / cpu / PostRev 0.000011918680002054316 s 0.000011673999988488504 s 1.02
scatter_sum / IPartOpt / cpu / BothRev 0.000011082959990744713 s 0.000011753560038414434 s 0.94
scatter_sum / DefOpt / cpu / PreRev 0.00001176368002234085 s 0.000012247980012034531 s 0.96
scatter_sum / DefOpt / cpu / PostRev 0.000011209939966647653 s 0.00001137746000495099 s 0.99
scatter_sum / DefOpt / cpu / BothRev 0.000010957640024571446 s 0.000011400659977880424 s 0.96
scatter_sum / IDefOpt / cpu / PreRev 0.000011308880038995994 s 0.00001144964000559412 s 0.99
scatter_sum / IDefOpt / cpu / PostRev 0.0000114527999812708 s 0.00001141109995842271 s 1.00
scatter_sum / IDefOpt / cpu / BothRev 0.000011729199995897944 s 0.00001149439997789159 s 1.02
scatter_sum / JaXPipe / cuda / Primal 0.000009824 s 0.000010464 s 0.94
scatter_sum / Jax / cuda / Primal 0.000009504 s 0.000009887 s 0.96
scatter_sum / HLOOpt / cuda / Primal 0.000009984 s 0.000010272 s 0.97
scatter_sum / PartOpt / cuda / Primal 0.000009664 s 0.000009887 s 0.98
scatter_sum / IPartOpt / cuda / Primal 0.000010144 s 0.000010209 s 0.99
scatter_sum / DefOpt / cuda / Primal 0.000009919 s 0.000009855 s 1.01
scatter_sum / IDefOpt / cuda / Primal 0.000009888 s 0.000010176 s 0.97
scatter_sum / JaXPipe / cuda / Forward 0.000017024 s 0.000017344 s 0.98
scatter_sum / Jax / cuda / Forward 0.00001664 s 0.000016865000000000002 s 0.99
scatter_sum / HLOOpt / cuda / Forward 0.000016672 s 0.000016576000000000002 s 1.01
scatter_sum / PartOpt / cuda / Forward 0.000016863 s 0.00001728 s 0.98
scatter_sum / IPartOpt / cuda / Forward 0.000017023 s 0.000017375999999999998 s 0.98
scatter_sum / DefOpt / cuda / Forward 0.0000168 s 0.000016768000000000003 s 1.00
scatter_sum / IDefOpt / cuda / Forward 0.000016352 s 0.00001712 s 0.96
scatter_sum / JaXPipe / cuda / PreRev 0.000016544 s 0.0000168 s 0.98
scatter_sum / JaXPipe / cuda / PostRev 0.00001648 s 0.00001744 s 0.94
scatter_sum / JaXPipe / cuda / BothRev 0.000021504 s 0.000018048 s 1.19
scatter_sum / Jax / cuda / BothRev 0.000016608 s 0.00001664 s 1.00
scatter_sum / HLOOpt / cuda / PreRev 0.000016768000000000003 s 0.000017055000000000002 s 0.98
scatter_sum / HLOOpt / cuda / PostRev 0.000016831 s 0.000016063999999999997 s 1.05
scatter_sum / HLOOpt / cuda / BothRev 0.000017472 s 0.00001664 s 1.05
scatter_sum / PartOpt / cuda / PreRev 0.00001744 s 0.000017152 s 1.02
scatter_sum / PartOpt / cuda / PostRev 0.000016736 s 0.000016832 s 0.99
scatter_sum / PartOpt / cuda / BothRev 0.0000168 s 0.000016927000000000002 s 0.99
scatter_sum / IPartOpt / cuda / PreRev 0.00001632 s 0.00001696 s 0.96
scatter_sum / IPartOpt / cuda / PostRev 0.000016607 s 0.000016768000000000003 s 0.99
scatter_sum / IPartOpt / cuda / BothRev 0.000016672 s 0.000016608 s 1.00
scatter_sum / DefOpt / cuda / PreRev 0.000016703 s 0.000017215 s 0.97
scatter_sum / DefOpt / cuda / PostRev 0.000016352 s 0.000017024 s 0.96
scatter_sum / DefOpt / cuda / BothRev 0.000016736 s 0.00001616 s 1.04
scatter_sum / IDefOpt / cuda / PreRev 0.00001696 s 0.000016832 s 1.01
scatter_sum / IDefOpt / cuda / PostRev 0.000016128 s 0.000016705 s 0.97
scatter_sum / IDefOpt / cuda / BothRev 0.000016353 s 0.00001696 s 0.96
scatter_sum / JaXPipe / tpu / Primal 0.000001342825 s 0.00000135085 s 0.99
scatter_sum / Jax / tpu / Primal 0.0000014135 s 0.000001414475 s 1.00
scatter_sum / HLOOpt / tpu / Primal 0.000001352825 s 0.00000136035 s 0.99
scatter_sum / PartOpt / tpu / Primal 0.0000014136 s 0.000001414375 s 1.00
scatter_sum / IPartOpt / tpu / Primal 0.00000135215 s 0.0000013597749999999998 s 0.99
scatter_sum / DefOpt / tpu / Primal 0.000001413525 s 0.0000014149 s 1.00
scatter_sum / IDefOpt / tpu / Primal 0.000001351825 s 0.0000013597 s 0.99
scatter_sum / JaXPipe / tpu / Forward 0.00000271625 s 0.000002709275 s 1.00
scatter_sum / Jax / tpu / Forward 0.000002733775 s 0.000002732525 s 1.00
scatter_sum / HLOOpt / tpu / Forward 0.0000027132 s 0.00000270835 s 1.00
scatter_sum / PartOpt / tpu / Forward 0.0000027002 s 0.00000269965 s 1.00
scatter_sum / IPartOpt / tpu / Forward 0.0000027129750000000005 s 0.0000027152500000000004 s 1.00
scatter_sum / DefOpt / tpu / Forward 0.0000026986500000000004 s 0.000002706425 s 1.00
scatter_sum / IDefOpt / tpu / Forward 0.0000027177 s 0.0000027088250000000003 s 1.00
scatter_sum / JaXPipe / tpu / PreRev 0.000002688625 s 0.000002704275 s 0.99
scatter_sum / JaXPipe / tpu / PostRev 0.0000026967000000000004 s 0.000002697 s 1.00
scatter_sum / JaXPipe / tpu / BothRev 0.0000027138 s 0.000002714475 s 1.00
scatter_sum / Jax / tpu / BothRev 0.0000027545 s 0.00000274965 s 1.00
scatter_sum / HLOOpt / tpu / PreRev 0.000002710075 s 0.00000272885 s 0.99
scatter_sum / HLOOpt / tpu / PostRev 0.0000027480500000000003 s 0.000002756025 s 1.00
scatter_sum / HLOOpt / tpu / BothRev 0.00000271045 s 0.00000272025 s 1.00
scatter_sum / PartOpt / tpu / PreRev 0.000002754775 s 0.00000275605 s 1.00
scatter_sum / PartOpt / tpu / PostRev 0.0000027078 s 0.0000027178 s 1.00
scatter_sum / PartOpt / tpu / BothRev 0.0000027567 s 0.0000027538750000000004 s 1.00
scatter_sum / IPartOpt / tpu / PreRev 0.000002709025 s 0.0000027201 s 1.00
scatter_sum / IPartOpt / tpu / PostRev 0.000002750975 s 0.0000027556750000000005 s 1.00
scatter_sum / IPartOpt / tpu / BothRev 0.000002707225 s 0.00000271685 s 1.00
scatter_sum / DefOpt / tpu / PreRev 0.0000027538750000000004 s 0.00000274965 s 1.00
scatter_sum / DefOpt / tpu / PostRev 0.00000270455 s 0.00000271375 s 1.00
scatter_sum / DefOpt / tpu / BothRev 0.000002749775 s 0.000002750275 s 1.00
scatter_sum / IDefOpt / tpu / PreRev 0.00000270745 s 0.0000027207 s 1.00
scatter_sum / IDefOpt / tpu / PostRev 0.000002752325 s 0.0000027518 s 1.00
scatter_sum / IDefOpt / tpu / BothRev 0.000002710975 s 0.000002717075 s 1.00
scatter_sum / JaXPipe / cpu / Primal 0.000019248000000000003 s 0.0000080933799927152 s 2.38
scatter_sum / Jax / cpu / Primal 0.000018504 s 0.00000773041997490509 s 2.39
scatter_sum / HLOOpt / cpu / Primal 0.000019428 s 0.000010982679996232036 s 1.77
scatter_sum / PartOpt / cpu / Primal 0.000019185 s 0.000007304119981199619 s 2.63
scatter_sum / IPartOpt / cpu / Primal 0.000018976 s 0.000007588260004922631 s 2.50
scatter_sum / DefOpt / cpu / Primal 0.000019284 s 0.000007340080019275774 s 2.63
scatter_sum / IDefOpt / cpu / Primal 0.00001875 s 0.000007766280004943837 s 2.41
scatter_sum / JaXPipe / cpu / Forward 0.000027811 s 0.000011774819968195516 s 2.36
scatter_sum / Jax / cpu / Forward 0.000026865 s 0.00001104679997297353 s 2.43
scatter_sum / HLOOpt / cpu / Forward 0.000026818000000000003 s 0.000016549519978070747 s 1.62
scatter_sum / PartOpt / cpu / Forward 0.000026831 s 0.000016693700017640368 s 1.61
scatter_sum / IPartOpt / cpu / Forward 0.000026879 s 0.00001139787999818509 s 2.36
scatter_sum / DefOpt / cpu / Forward 0.000026846 s 0.00001691736004431732 s 1.59
scatter_sum / IDefOpt / cpu / Forward 0.000026784 s 0.000011772519965234096 s 2.28
scatter_sum / JaXPipe / cpu / PreRev 0.000027299 s 0.000012000640008409392 s 2.27
scatter_sum / JaXPipe / cpu / PostRev 0.000027858 s 0.000011409239968998008 s 2.44
scatter_sum / JaXPipe / cpu / BothRev 0.000027888 s 0.000016204619996642577 s 1.72
scatter_sum / Jax / cpu / BothRev 0.000027651 s 0.000011913580010514124 s 2.32
scatter_sum / HLOOpt / cpu / PreRev 0.000027563 s 0.000011849580014313687 s 2.33
scatter_sum / HLOOpt / cpu / PostRev 0.00002736 s 0.00001634619997275877 s 1.67
scatter_sum / HLOOpt / cpu / BothRev 0.000027457 s 0.000013083580033708132 s 2.10
scatter_sum / PartOpt / cpu / PreRev 0.000026585 s 0.000011277140010861333 s 2.36
scatter_sum / PartOpt / cpu / PostRev 0.000026972 s 0.000011436099994170943 s 2.36
scatter_sum / PartOpt / cpu / BothRev 0.000027799 s 0.000012017060034850149 s 2.31
scatter_sum / IPartOpt / cpu / PreRev 0.000027805 s 0.00001792121997823415 s 1.55
scatter_sum / IPartOpt / cpu / PostRev 0.000027316 s 0.000011673999988488504 s 2.34
scatter_sum / IPartOpt / cpu / BothRev 0.000027128 s 0.000011753560038414434 s 2.31
scatter_sum / DefOpt / cpu / PreRev 0.000027201 s 0.000012247980012034531 s 2.22
scatter_sum / DefOpt / cpu / PostRev 0.000027904000000000003 s 0.00001137746000495099 s 2.45
scatter_sum / DefOpt / cpu / BothRev 0.000026927 s 0.000011400659977880424 s 2.36
scatter_sum / IDefOpt / cpu / PreRev 0.000027345 s 0.00001144964000559412 s 2.39
scatter_sum / IDefOpt / cpu / PostRev 0.00002788 s 0.00001141109995842271 s 2.44
scatter_sum / IDefOpt / cpu / BothRev 0.000027385 s 0.00001149439997789159 s 2.38
slicing / JaXPipe / cpu / Primal 0.000006716739999319544 s 0.000007002580023254268 s 0.96
slicing / Jax / cpu / Primal 0.00000592898000832065 s 0.00000600012003815209 s 0.99
slicing / HLOOpt / cpu / Primal 0.00001015080001707247 s 0.000009939419978763908 s 1.02
slicing / PartOpt / cpu / Primal 0.000005991240013827337 s 0.000006265100055315998 s 0.96
slicing / IPartOpt / cpu / Primal 0.000005967719998807297 s 0.000006429720006053685 s 0.93
slicing / DefOpt / cpu / Primal 0.000010611260031510027 s 0.000010525640009291236 s 1.01
slicing / IDefOpt / cpu / Primal 0.000006269440009418759 s 0.000006683040001007612 s 0.94
slicing / JaXPipe / cpu / Forward 0.000009699180027382682 s 0.00000964760001807008 s 1.01
slicing / Jax / cpu / Forward 0.000009746279965838769 s 0.000010153560006074256 s 0.96
slicing / HLOOpt / cpu / Forward 0.000013895640013288355 s 0.00001357187995381537 s 1.02
slicing / PartOpt / cpu / Forward 0.000014040380010555964 s 0.000013793300013276166 s 1.02
slicing / IPartOpt / cpu / Forward 0.00000886831999196147 s 0.000009253480029656205 s 0.96
slicing / DefOpt / cpu / Forward 0.000014229019980120938 s 0.00001335547999588016 s 1.07
slicing / IDefOpt / cpu / Forward 0.000008901359960873378 s 0.00000917934004064591 s 0.97
slicing / JaXPipe / cpu / PreRev 0.000010068740011774935 s 0.000010112960035257857 s 1.00
slicing / JaXPipe / cpu / PostRev 0.000010061659950224566 s 0.000009997219967772253 s 1.01
slicing / JaXPipe / cpu / BothRev 0.000013906099993619135 s 0.000013674820002052 s 1.02
slicing / Jax / cpu / BothRev 0.000010186320014327068 s 0.000010126220013262357 s 1.01
slicing / HLOOpt / cpu / PreRev 0.000009474439966652426 s 0.00000982377998298034 s 0.96
slicing / HLOOpt / cpu / PostRev 0.00001028783995025151 s 0.000010500840053282444 s 0.98
slicing / HLOOpt / cpu / BothRev 0.000011148860030516517 s 0.00001144218002082198 s 0.97
slicing / PartOpt / cpu / PreRev 0.000009621040017009364 s 0.00000988726000286988 s 0.97
slicing / PartOpt / cpu / PostRev 0.000010587219985609407 s 0.000010319300008632126 s 1.03
slicing / PartOpt / cpu / BothRev 0.000010054600006697 s 0.000009877820011752193 s 1.02
slicing / IPartOpt / cpu / PreRev 0.000014428679960474256 s 0.000009871659958662347 s 1.46
slicing / IPartOpt / cpu / PostRev 0.000010026840018326764 s 0.000010186019962930005 s 0.98
slicing / IPartOpt / cpu / BothRev 0.000010097339982166889 s 0.00000961596000706777 s 1.05
slicing / DefOpt / cpu / PreRev 0.00000954265995460446 s 0.000009731240015753427 s 0.98
slicing / DefOpt / cpu / PostRev 0.000010105219944307464 s 0.00001040664000356628 s 0.97
slicing / DefOpt / cpu / BothRev 0.000010099879991685155 s 0.00000972724000348535 s 1.04
slicing / IDefOpt / cpu / PreRev 0.00000962346000051184 s 0.000009798860037335545 s 0.98
slicing / IDefOpt / cpu / PostRev 0.00000974014002167678 s 0.000010772420000648709 s 0.90
slicing / IDefOpt / cpu / BothRev 0.00000984720002634276 s 0.000010212100023636597 s 0.96
slicing / JaXPipe / cuda / Primal 0.000001888 s 0.000001888 s 1.00
slicing / Jax / cuda / Primal 0.000001888 s 0.000001888 s 1.00
slicing / HLOOpt / cuda / Primal 0.000001919 s 0.000001919 s 1.00
slicing / PartOpt / cuda / Primal 0.000001889 s 0.000001888 s 1.00
slicing / IPartOpt / cuda / Primal 0.000001888 s 0.000001888 s 1.00
slicing / DefOpt / cuda / Primal 0.000001919 s 0.0000019200000000000003 s 1.00
slicing / IDefOpt / cuda / Primal 0.0000019200000000000003 s 0.0000019200000000000003 s 1.00
slicing / JaXPipe / cuda / Forward 0.00000944 s 0.000009376 s 1.01
slicing / Jax / cuda / Forward 0.000009856 s 0.000009728 s 1.01
slicing / HLOOpt / cuda / Forward 0.00001008 s 0.00000912 s 1.11
slicing / PartOpt / cuda / Forward 0.000009568 s 0.00000976 s 0.98
slicing / IPartOpt / cuda / Forward 0.000009824 s 0.00000944 s 1.04
slicing / DefOpt / cuda / Forward 0.000009568 s 0.000009857 s 0.97
slicing / IDefOpt / cuda / Forward 0.000009024 s 0.000009472 s 0.95
slicing / JaXPipe / cuda / PreRev 0.000009952 s 0.000009633 s 1.03
slicing / JaXPipe / cuda / PostRev 0.00000992 s 0.00001008 s 0.98
slicing / JaXPipe / cuda / BothRev 0.000010016 s 0.00000992 s 1.01
slicing / Jax / cuda / BothRev 0.00000992 s 0.000010016 s 0.99
slicing / HLOOpt / cuda / PreRev 0.000010272 s 0.0000096 s 1.07
slicing / HLOOpt / cuda / PostRev 0.000009952 s 0.000010081 s 0.99
slicing / HLOOpt / cuda / BothRev 0.000009888 s 0.0000104 s 0.95
slicing / PartOpt / cuda / PreRev 0.000009984 s 0.000010016 s 1.00
slicing / PartOpt / cuda / PostRev 0.000009888 s 0.000009952 s 0.99
slicing / PartOpt / cuda / BothRev 0.000010208 s 0.000010048 s 1.02
slicing / IPartOpt / cuda / PreRev 0.000009984 s 0.00001008 s 0.99
slicing / IPartOpt / cuda / PostRev 0.000009888 s 0.000009632 s 1.03
slicing / IPartOpt / cuda / BothRev 0.000010112 s 0.000009889 s 1.02
slicing / DefOpt / cuda / PreRev 0.000009823 s 0.000010016 s 0.98
slicing / DefOpt / cuda / PostRev 0.000009697 s 0.000009536 s 1.02
slicing / DefOpt / cuda / BothRev 0.0000096 s 0.000010112 s 0.95
slicing / IDefOpt / cuda / PreRev 0.000010208 s 0.000010208 s 1.00
slicing / IDefOpt / cuda / PostRev 0.00000976 s 0.000010048 s 0.97
slicing / IDefOpt / cuda / BothRev 0.00000992 s 0.000011104 s 0.89
slicing / JaXPipe / tpu / Primal 0.000001024 s 0.000001022125 s 1.00
slicing / Jax / tpu / Primal 9.74075e-7 s 9.83425e-7 s 0.99
slicing / HLOOpt / tpu / Primal 0.0000010366 s 0.000001026025 s 1.01
slicing / PartOpt / tpu / Primal 9.7105e-7 s 9.68325e-7 s 1.00
slicing / IPartOpt / tpu / Primal 0.000001037825 s 0.00000102485 s 1.01
slicing / DefOpt / tpu / Primal 9.72025e-7 s 9.74625e-7 s 1.00
slicing / IDefOpt / tpu / Primal 0.000001028425 s 0.000001027875 s 1.00
slicing / JaXPipe / tpu / Forward 0.00000141255 s 0.000001408975 s 1.00
slicing / Jax / tpu / Forward 0.000001475275 s 0.0000014854 s 0.99
slicing / HLOOpt / tpu / Forward 0.0000015242000000000002 s 0.0000015198999999999998 s 1.00
slicing / PartOpt / tpu / Forward 0.0000015037749999999998 s 0.000001496775 s 1.00
slicing / IPartOpt / tpu / Forward 0.000001523425 s 0.0000015180250000000002 s 1.00
slicing / DefOpt / tpu / Forward 0.000001495875 s 0.00000149955 s 1.00
slicing / IDefOpt / tpu / Forward 0.00000152015 s 0.00000152035 s 1.00
slicing / JaXPipe / tpu / PreRev 0.0000025674 s 0.000002570325 s 1.00
slicing / JaXPipe / tpu / PostRev 0.000002519825 s 0.0000025251 s 1.00
slicing / JaXPipe / tpu / BothRev 0.000002578775 s 0.00000258565 s 1.00
slicing / Jax / tpu / BothRev 0.000002530825 s 0.00000254905 s 0.99
slicing / HLOOpt / tpu / PreRev 0.0000025881250000000003 s 0.000002579625 s 1.00
slicing / HLOOpt / tpu / PostRev 0.0000025354500000000004 s 0.000002538575 s 1.00
slicing / HLOOpt / tpu / BothRev 0.00000258575 s 0.0000025944750000000003 s 1.00
slicing / PartOpt / tpu / PreRev 0.0000025413750000000003 s 0.00000254365 s 1.00
slicing / PartOpt / tpu / PostRev 0.00000258345 s 0.000002580025 s 1.00
slicing / PartOpt / tpu / BothRev 0.0000025367250000000003 s 0.000002537975 s 1.00
slicing / IPartOpt / tpu / PreRev 0.0000025880750000000006 s 0.00000258125 s 1.00
slicing / IPartOpt / tpu / PostRev 0.0000025395250000000005 s 0.000002543975 s 1.00
slicing / IPartOpt / tpu / BothRev 0.0000025823 s 0.0000025812 s 1.00
slicing / DefOpt / tpu / PreRev 0.000002547775 s 0.000002536625 s 1.00
slicing / DefOpt / tpu / PostRev 0.000002575075 s 0.00000258605 s 1.00
slicing / DefOpt / tpu / BothRev 0.00000253825 s 0.000002543825 s 1.00
slicing / IDefOpt / tpu / PreRev 0.000002592275 s 0.0000025917750000000003 s 1.00
slicing / IDefOpt / tpu / PostRev 0.000002537275 s 0.000002549775 s 1.00
slicing / IDefOpt / tpu / BothRev 0.0000025813 s 0.000002588275 s 1.00
slicing / JaXPipe / cpu / Primal 0.000015689000000000002 s 0.000007002580023254268 s 2.24
slicing / Jax / cpu / Primal 0.000015109 s 0.00000600012003815209 s 2.52
slicing / HLOOpt / cpu / Primal 0.000015185 s 0.000009939419978763908 s 1.53
slicing / PartOpt / cpu / Primal 0.000015294 s 0.000006265100055315998 s 2.44
slicing / IPartOpt / cpu / Primal 0.000015533 s 0.000006429720006053685 s 2.42
slicing / DefOpt / cpu / Primal 0.000015465 s 0.000010525640009291236 s 1.47
slicing / IDefOpt / cpu / Primal 0.000015288 s 0.000006683040001007612 s 2.29
slicing / JaXPipe / cpu / Forward 0.00002067 s 0.00000964760001807008 s 2.14
slicing / Jax / cpu / Forward 0.000020265 s 0.000010153560006074256 s 2.00
slicing / HLOOpt / cpu / Forward 0.000020623 s 0.00001357187995381537 s 1.52
slicing / PartOpt / cpu / Forward 0.000020228 s 0.000013793300013276166 s 1.47
slicing / IPartOpt / cpu / Forward 0.000020157 s 0.000009253480029656205 s 2.18
slicing / DefOpt / cpu / Forward 0.000020287 s 0.00001335547999588016 s 1.52
slicing / IDefOpt / cpu / Forward 0.000020499 s 0.00000917934004064591 s 2.23
slicing / JaXPipe / cpu / PreRev 0.000021722000000000003 s 0.000010112960035257857 s 2.15
slicing / JaXPipe / cpu / PostRev 0.000021033 s 0.000009997219967772253 s 2.10
slicing / JaXPipe / cpu / BothRev 0.000021313 s 0.000013674820002052 s 1.56
slicing / Jax / cpu / BothRev 0.000021313 s 0.000010126220013262357 s 2.10
slicing / HLOOpt / cpu / PreRev 0.00002168 s 0.00000982377998298034 s 2.21
slicing / HLOOpt / cpu / PostRev 0.000021214 s 0.000010500840053282444 s 2.02
slicing / HLOOpt / cpu / BothRev 0.000021834 s 0.00001144218002082198 s 1.91
slicing / PartOpt / cpu / PreRev 0.000021129 s 0.00000988726000286988 s 2.14
slicing / PartOpt / cpu / PostRev 0.000021519 s 0.000010319300008632126 s 2.09
slicing / PartOpt / cpu / BothRev 0.000021453 s 0.000009877820011752193 s 2.17
slicing / IPartOpt / cpu / PreRev 0.000021369 s 0.000009871659958662347 s 2.16
slicing / IPartOpt / cpu / PostRev 0.000021425 s 0.000010186019962930005 s 2.10
slicing / IPartOpt / cpu / BothRev 0.000021672 s 0.00000961596000706777 s 2.25
slicing / DefOpt / cpu / PreRev 0.000021211 s 0.000009731240015753427 s 2.18
slicing / DefOpt / cpu / PostRev 0.000021285 s 0.00001040664000356628 s 2.05
slicing / DefOpt / cpu / BothRev 0.000033042 s 0.00000972724000348535 s 3.40
slicing / IDefOpt / cpu / PreRev 0.000021184 s 0.000009798860037335545 s 2.16
slicing / IDefOpt / cpu / PostRev 0.000021409 s 0.000010772420000648709 s 1.99
slicing / IDefOpt / cpu / BothRev 0.000021161 s 0.000010212100023636597 s 2.07
sum / JaXPipe / cpu / Primal 0.000008477319997837185 s 0.00000784006000685622 s 1.08
sum / Jax / cpu / Primal 0.000007931499940241338 s 0.000007316700002775179 s 1.08
sum / HLOOpt / cpu / Primal 0.000011924299969905406 s 0.000010999159976563532 s 1.08
sum / PartOpt / cpu / Primal 0.000007570420020783786 s 0.00000762464000217733 s 0.99
sum / IPartOpt / cpu / Primal 0.000007981140006450005 s 0.00000783147999754874 s 1.02
sum / DefOpt / cpu / Primal 0.000012293260024307528 s 0.000011770680039262516 s 1.04
sum / IDefOpt / cpu / Primal 0.000007645960022273356 s 0.000007909559990366688 s 0.97
sum / JaXPipe / cpu / Forward 0.00001122850000683684 s 0.000011585400006879352 s 0.97
sum / Jax / cpu / Forward 0.000010824320006577182 s 0.0000113637000140443 s 0.95
sum / HLOOpt / cpu / Forward 0.000016411979995609727 s 0.000016045620050135767 s 1.02
sum / PartOpt / cpu / Forward 0.00001554558001771511 s 0.000016026719968067483 s 0.97
sum / IPartOpt / cpu / Forward 0.000011276080031166202 s 0.000010967219986923738 s 1.03
sum / DefOpt / cpu / Forward 0.000016322839992426453 s 0.000015576400019199356 s 1.05
sum / IDefOpt / cpu / Forward 0.000011488020036267698 s 0.000011566979983399506 s 0.99
sum / JaXPipe / cpu / PreRev 0.000011544139997567982 s 0.00001090174001546984 s 1.06
sum / JaXPipe / cpu / PostRev 0.000011101579948444853 s 0.000011536099991644733 s 0.96
sum / JaXPipe / cpu / BothRev 0.00001122014000429772 s 0.000010744080000222312 s 1.04
sum / Jax / cpu / BothRev 0.00001140031998147606 s 0.000011325500045131776 s 1.01
sum / HLOOpt / cpu / PreRev 0.000010597039972708444 s 0.000010893339995163842 s 0.97
sum / HLOOpt / cpu / PostRev 0.000015284199971574708 s 0.000014673560008304775 s 1.04
sum / HLOOpt / cpu / BothRev 0.000012698499995167367 s 0.000012468359982449328 s 1.02
sum / PartOpt / cpu / PreRev 0.000010363100000176927 s 0.000010480200025995145 s 0.99
sum / PartOpt / cpu / PostRev 0.000010984059999827875 s 0.000010896019994106607 s 1.01
sum / PartOpt / cpu / BothRev 0.000010877179984163376 s 0.000010456680047354894 s 1.04
sum / IPartOpt / cpu / PreRev 0.000010818119953910354 s 0.000010520199975871946 s 1.03
sum / IPartOpt / cpu / PostRev 0.000011132939989693114 s 0.00001105634001760336 s 1.01
sum / IPartOpt / cpu / BothRev 0.000010372919996370913 s 0.000010889280029005022 s 0.95
sum / DefOpt / cpu / PreRev 0.00001091736000489618 s 0.000010331519979445149 s 1.06
sum / DefOpt / cpu / PostRev 0.000010614139964673088 s 0.000010265839946441702 s 1.03
sum / DefOpt / cpu / BothRev 0.000010385520008640014 s 0.000010974580018228153 s 0.95
sum / IDefOpt / cpu / PreRev 0.000010924219977823669 s 0.000010357479986851103 s 1.05
sum / IDefOpt / cpu / PostRev 0.00001105566003388958 s 0.000010391179985163035 s 1.06
sum / IDefOpt / cpu / BothRev 0.00001053876002515608 s 0.00001077867999811133 s 0.98
sum / JaXPipe / cuda / Primal 0.00000208 s 0.00000208 s 1
sum / Jax / cuda / Primal 0.00000208 s 0.00000208 s 1
sum / HLOOpt / cuda / Primal 0.00000208 s 0.00000208 s 1
sum / PartOpt / cuda / Primal 0.00000208 s 0.00000208 s 1
sum / IPartOpt / cuda / Primal 0.00000208 s 0.00000208 s 1
sum / DefOpt / cuda / Primal 0.00000208 s 0.00000208 s 1
sum / IDefOpt / cuda / Primal 0.000002079 s 0.00000208 s 1.00
sum / JaXPipe / cuda / Forward 0.000010048 s 0.00001008 s 1.00
sum / Jax / cuda / Forward 0.000009952 s 0.000009983 s 1.00
sum / HLOOpt / cuda / Forward 0.000009888 s 0.000010368 s 0.95
sum / PartOpt / cuda / Forward 0.000010112 s 0.000010176 s 0.99
sum / IPartOpt / cuda / Forward 0.000009471 s 0.000010209 s 0.93
sum / DefOpt / cuda / Forward 0.000010016 s 0.0000104 s 0.96
sum / IDefOpt / cuda / Forward 0.000012416 s 0.000009856 s 1.26
sum / JaXPipe / cuda / PreRev 0.00000976 s 0.00000976 s 1
sum / JaXPipe / cuda / PostRev 0.000009888 s 0.000010048 s 0.98
sum / JaXPipe / cuda / BothRev 0.000009215 s 0.00001008 s 0.91
sum / Jax / cuda / BothRev 0.000009984 s 0.000009824 s 1.02
sum / HLOOpt / cuda / PreRev 0.000009568 s 0.00000912 s 1.05
sum / HLOOpt / cuda / PostRev 0.000009472 s 0.000009567 s 0.99
sum / HLOOpt / cuda / BothRev 0.000009376 s 0.000010016 s 0.94
sum / PartOpt / cuda / PreRev 0.000009568 s 0.000009664 s 0.99
sum / PartOpt / cuda / PostRev 0.000009536 s 0.000010016 s 0.95
sum / PartOpt / cuda / BothRev 0.000009984 s 0.00001008 s 0.99
sum / IPartOpt / cuda / PreRev 0.00000976 s 0.000009729 s 1.00
sum / IPartOpt / cuda / PostRev 0.000009728 s 0.000009984 s 0.97
sum / IPartOpt / cuda / BothRev 0.000009696 s 0.00000944 s 1.03
sum / DefOpt / cuda / PreRev 0.00000944 s 0.000009632 s 0.98
sum / DefOpt / cuda / PostRev 0.000009695 s 0.00000992 s 0.98
sum / DefOpt / cuda / BothRev 0.000009376 s 0.000009408 s 1.00
sum / IDefOpt / cuda / PreRev 0.000009504 s 0.000009728 s 0.98
sum / IDefOpt / cuda / PostRev 0.000009792 s 0.000009792 s 1
sum / IDefOpt / cuda / BothRev 0.000009472 s 0.000010048 s 0.94
sum / JaXPipe / tpu / Primal 5.10075e-7 s 5.10625e-7 s 1.00
sum / Jax / tpu / Primal 5.580749999999999e-7 s 5.588000000000001e-7 s 1.00
sum / HLOOpt / tpu / Primal 5.210749999999999e-7 s 5.2155e-7 s 1.00
sum / PartOpt / tpu / Primal 5.5785e-7 s 5.58175e-7 s 1.00
sum / IPartOpt / tpu / Primal 5.24325e-7 s 5.215e-7 s 1.01
sum / DefOpt / tpu / Primal 5.58175e-7 s 5.5785e-7 s 1.00
sum / IDefOpt / tpu / Primal 5.212749999999999e-7 s 5.21375e-7 s 1.00
sum / JaXPipe / tpu / Forward 0.0000015494250000000002 s 0.000001560875 s 0.99
sum / Jax / tpu / Forward 0.000001495775 s 0.0000015087 s 0.99
sum / HLOOpt / tpu / Forward 0.0000015278 s 0.0000015303 s 1.00
sum / PartOpt / tpu / Forward 0.0000014937 s 0.000001495275 s 1.00
sum / IPartOpt / tpu / Forward 0.0000015276999999999998 s 0.00000153475 s 1.00
sum / DefOpt / tpu / Forward 0.000001492675 s 0.000001499025 s 1.00
sum / IDefOpt / tpu / Forward 0.000001527575 s 0.000001527325 s 1.00
sum / JaXPipe / tpu / PreRev 0.0000010494 s 0.000001049075 s 1.00
sum / JaXPipe / tpu / PostRev 0.000001089375 s 0.0000010885999999999998 s 1.00
sum / JaXPipe / tpu / BothRev 0.000001048975 s 0.0000010480000000000002 s 1.00
sum / Jax / tpu / BothRev 0.0000010938 s 0.000001091025 s 1.00
sum / HLOOpt / tpu / PreRev 0.0000010491999999999998 s 0.00000105175 s 1.00
sum / HLOOpt / tpu / PostRev 0.00000108985 s 0.00000108495 s 1.00
sum / HLOOpt / tpu / BothRev 0.0000010468 s 0.000001052325 s 0.99
sum / PartOpt / tpu / PreRev 0.000001083725 s 0.0000010978 s 0.99
sum / PartOpt / tpu / PostRev 0.00000105685 s 0.000001062175 s 0.99
sum / PartOpt / tpu / BothRev 0.0000010948749999999998 s 0.0000010918 s 1.00
sum / IPartOpt / tpu / PreRev 0.000001053325 s 0.00000106675 s 0.99
sum / IPartOpt / tpu / PostRev 0.00000109635 s 0.0000010903 s 1.01
sum / IPartOpt / tpu / BothRev 0.000001064525 s 0.000001048225 s 1.02
sum / DefOpt / tpu / PreRev 0.000001094325 s 0.00000110485 s 0.99
sum / DefOpt / tpu / PostRev 0.0000010501249999999998 s 0.0000010486 s 1.00
sum / DefOpt / tpu / BothRev 0.00000109 s 0.000001090675 s 1.00
sum / IDefOpt / tpu / PreRev 0.0000010482 s 0.000001056775 s 0.99
sum / IDefOpt / tpu / PostRev 0.000001086775 s 0.000001095125 s 0.99
sum / IDefOpt / tpu / BothRev 0.0000010505499999999998 s 0.000001049075 s 1.00
sum / JaXPipe / cpu / Primal 0.000017732999999999998 s 0.00000784006000685622 s 2.26
sum / Jax / cpu / Primal 0.000018069000000000003 s 0.000007316700002775179 s 2.47
sum / HLOOpt / cpu / Primal 0.000017887 s 0.000010999159976563532 s 1.63
sum / PartOpt / cpu / Primal 0.000017638 s 0.00000762464000217733 s 2.31
sum / IPartOpt / cpu / Primal 0.00001797 s 0.00000783147999754874 s 2.29
sum / DefOpt / cpu / Primal 0.000017839 s 0.000011770680039262516 s 1.52
sum / IDefOpt / cpu / Primal 0.000017911 s 0.000007909559990366688 s 2.26
sum / JaXPipe / cpu / Forward 0.000024515 s 0.000011585400006879352 s 2.12
sum / Jax / cpu / Forward 0.000023883 s 0.0000113637000140443 s 2.10
sum / HLOOpt / cpu / Forward 0.00002416 s 0.000016045620050135767 s 1.51
sum / PartOpt / cpu / Forward 0.000024595 s 0.000016026719968067483 s 1.53
sum / IPartOpt / cpu / Forward 0.000024258 s 0.000010967219986923738 s 2.21
sum / DefOpt / cpu / Forward 0.000024408 s 0.000015576400019199356 s 1.57
sum / IDefOpt / cpu / Forward 0.000024267 s 0.000011566979983399506 s 2.10
sum / JaXPipe / cpu / PreRev 0.000023244 s 0.00001090174001546984 s 2.13
sum / JaXPipe / cpu / PostRev 0.00002293 s 0.000011536099991644733 s 1.99
sum / JaXPipe / cpu / BothRev 0.000023313 s 0.000010744080000222312 s 2.17
sum / Jax / cpu / BothRev 0.000023085 s 0.000011325500045131776 s 2.04
sum / HLOOpt / cpu / PreRev 0.000023427 s 0.000010893339995163842 s 2.15
sum / HLOOpt / cpu / PostRev 0.000023423 s 0.000014673560008304775 s 1.60
sum / HLOOpt / cpu / BothRev 0.000023585 s 0.000012468359982449328 s 1.89
sum / PartOpt / cpu / PreRev 0.000023511 s 0.000010480200025995145 s 2.24
sum / PartOpt / cpu / PostRev 0.000023625 s 0.000010896019994106607 s 2.17
sum / PartOpt / cpu / BothRev 0.000023114 s 0.000010456680047354894 s 2.21
sum / IPartOpt / cpu / PreRev 0.000023069 s 0.000010520199975871946 s 2.19
sum / IPartOpt / cpu / PostRev 0.000023356 s 0.00001105634001760336 s 2.11
sum / IPartOpt / cpu / BothRev 0.00002318 s 0.000010889280029005022 s 2.13
sum / DefOpt / cpu / PreRev 0.000022857 s 0.000010331519979445149 s 2.21
sum / DefOpt / cpu / PostRev 0.000023348 s 0.000010265839946441702 s 2.27
sum / DefOpt / cpu / BothRev 0.000023691 s 0.000010974580018228153 s 2.16
sum / IDefOpt / cpu / PreRev 0.000022476 s 0.000010357479986851103 s 2.17
sum / IDefOpt / cpu / PostRev 0.000024012 s 0.000010391179985163035 s 2.31
sum / IDefOpt / cpu / BothRev 0.00002375 s 0.00001077867999811133 s 2.20
value_and_grad / JaXPipe / cpu / Primal 0.000014494239976556856 s 0.000014128279972283053 s 1.03
value_and_grad / Jax / cpu / Primal 0.000014001580002513948 s 0.000014739260004716923 s 0.95
value_and_grad / HLOOpt / cpu / Primal 0.000013488899958247202 s 0.000013682799981324932 s 0.99
value_and_grad / PartOpt / cpu / Primal 0.00001410103997841361 s 0.000013681440013897371 s 1.03
value_and_grad / IPartOpt / cpu / Primal 0.00001366767999570584 s 0.000013595520003946147 s 1.01
value_and_grad / DefOpt / cpu / Primal 0.000013524939968192484 s 0.000013901780012020026 s 0.97
value_and_grad / IDefOpt / cpu / Primal 0.00001371530001051724 s 0.00001404642001944012 s 0.98
value_and_grad / JaXPipe / cuda / Primal 0.000033119000000000006 s 0.000033249 s 1.00
value_and_grad / Jax / cuda / Primal 0.000032927 s 0.000032832 s 1.00
value_and_grad / HLOOpt / cuda / Primal 0.000032576 s 0.00003328 s 0.98
value_and_grad / PartOpt / cuda / Primal 0.000032351 s 0.000032896000000000005 s 0.98
value_and_grad / IPartOpt / cuda / Primal 0.000033024 s 0.000032992 s 1.00
value_and_grad / DefOpt / cuda / Primal 0.00003248 s 0.000032512 s 1.00
value_and_grad / IDefOpt / cuda / Primal 0.000033055 s 0.000032832 s 1.01
value_and_grad / JaXPipe / tpu / Primal 0 s 0 s 1
value_and_grad / Jax / tpu / Primal 0 s 0 s 1
value_and_grad / HLOOpt / tpu / Primal 0 s 0 s 1
value_and_grad / PartOpt / tpu / Primal 0 s 0 s 1
value_and_grad / IPartOpt / tpu / Primal 0 s 0 s 1
value_and_grad / DefOpt / tpu / Primal 0 s 0 s 1
value_and_grad / IDefOpt / tpu / Primal 0 s 0 s 1
value_and_grad / JaXPipe / cpu / Primal 0.000042013 s 0.000014128279972283053 s 2.97
value_and_grad / Jax / cpu / Primal 0.000027322 s 0.000014739260004716923 s 1.85
value_and_grad / HLOOpt / cpu / Primal 0.0000279 s 0.000013682799981324932 s 2.04
value_and_grad / PartOpt / cpu / Primal 0.000027473 s 0.000013681440013897371 s 2.01
value_and_grad / IPartOpt / cpu / Primal 0.000027779000000000003 s 0.000013595520003946147 s 2.04
value_and_grad / DefOpt / cpu / Primal 0.000027589 s 0.000013901780012020026 s 1.98
value_and_grad / IDefOpt / cpu / Primal 0.00002723 s 0.00001404642001944012 s 1.94
jaxmd20 / JaXPipe / cuda / Primal 0.001468535 s 0.0015381709999999 s 0.95
jaxmd20 / Jax / cuda / Primal 0.0014484379999999 s 0.001580476 s 0.92
jaxmd20 / HLOOpt / cuda / Primal 0.001142072 s 0.001078782 s 1.06
jaxmd20 / PartOpt / cuda / Primal 0.001283383 s 0.001315996 s 0.98
jaxmd20 / IPartOpt / cuda / Primal 0.0013215269999999 s 0.001333052 s 0.99
jaxmd20 / DefOpt / cuda / Primal 0.000520124 s 0.000523038 s 0.99
jaxmd20 / IDefOpt / cuda / Primal 0.000514588 s 0.000492254 s 1.05
jaxmd20 / JaXPipe / cuda / Forward 0.000828187 s 0.000816734 s 1.01
jaxmd20 / Jax / cuda / Forward 0.001789269 s 0.001811772 s 0.99
jaxmd20 / HLOOpt / cuda / Forward 0.000829979 s 0.000823358 s 1.01
jaxmd20 / PartOpt / cuda / Forward 0.000817563 s 0.0008217259999999 s 0.99
jaxmd20 / IPartOpt / cuda / Forward 0.000884282 s 0.000826653 s 1.07
jaxmd20 / DefOpt / cuda / Forward 0.000834523 s 0.000815997 s 1.02
jaxmd20 / IDefOpt / cuda / Forward 0.000815099 s 0.000817886 s 1.00
jaxmd20 / JaXPipe / cuda / PreRev 0.00166495 s 0.001678363 s 0.99
jaxmd20 / JaXPipe / cuda / PostRev 0.005326971 s 0.005320914 s 1.00
jaxmd20 / JaXPipe / cuda / BothRev 0.001658197 s 0.001663451 s 1.00
jaxmd20 / Jax / cuda / BothRev 0.005293977 s 0.005268499 s 1.00
jaxmd20 / HLOOpt / cuda / PreRev 0.001711731 s 0.001726523 s 0.99
jaxmd20 / HLOOpt / cuda / PostRev 0.005219067 s 0.00516002 s 1.01
jaxmd20 / HLOOpt / cuda / BothRev 0.00165666 s 0.001661531 s 1.00
jaxmd20 / PartOpt / cuda / PreRev 0.00175567 s 0.001712668 s 1.03
jaxmd20 / PartOpt / cuda / PostRev 0.005410015 s 0.005340304 s 1.01
jaxmd20 / PartOpt / cuda / BothRev 0.00166738 s 0.001663068 s 1.00
jaxmd20 / IPartOpt / cuda / PreRev 0.0017363699999999 s 0.001705212 s 1.02
jaxmd20 / IPartOpt / cuda / PostRev 0.0054033879999999 s 0.0053764979999999 s 1.01
jaxmd20 / IPartOpt / cuda / BothRev 0.001664406 s 0.001627164 s 1.02
jaxmd20 / DefOpt / cuda / PreRev 0.00174658 s 0.0017215309999999 s 1.01
jaxmd20 / DefOpt / cuda / PostRev 0.002723662 s 0.0027170809999999 s 1.00
jaxmd20 / DefOpt / cuda / BothRev 0.001649492 s 0.001738428 s 0.95
jaxmd20 / IDefOpt / cuda / PreRev 0.001738804 s 0.00173222 s 1.00
jaxmd20 / IDefOpt / cuda / PostRev 0.001977362 s 0.001977914 s 1.00
jaxmd20 / IDefOpt / cuda / BothRev 0.00165698 s 0.001666619 s 0.99
jaxmd20 / JaXPipe / tpu / Primal 0.009277995 s 0.0092832325 s 1.00
jaxmd20 / Jax / tpu / Primal 0.00927855375 s 0.00926823125 s 1.00
jaxmd20 / HLOOpt / tpu / Primal 0.00917455625 s 0.009165471875 s 1.00
jaxmd20 / PartOpt / tpu / Primal 0.0091971925 s 0.009195720625 s 1.00
jaxmd20 / IPartOpt / tpu / Primal 0.0092001 s 0.00920227 s 1.00
jaxmd20 / DefOpt / tpu / Primal 0.008749179375 s 0.008745300625 s 1.00
jaxmd20 / IDefOpt / tpu / Primal 0.00863421875 s 0.008635025625 s 1.00
jaxmd20 / JaXPipe / tpu / Forward 0.01726184 s 0.017267543125 s 1.00
jaxmd20 / Jax / tpu / Forward 0.01875432375 s 0.018740548125 s 1.00
jaxmd20 / HLOOpt / tpu / Forward 0.017236080625 s 0.01723612625 s 1.00
jaxmd20 / PartOpt / tpu / Forward 0.017263274375 s 0.01726744625 s 1.00
jaxmd20 / IPartOpt / tpu / Forward 0.017268051875 s 0.017262624375 s 1.00
jaxmd20 / DefOpt / tpu / Forward 0.01725859 s 0.01726469125 s 1.00
jaxmd20 / IDefOpt / tpu / Forward 0.017266670625 s 0.017266065625 s 1.00
jaxmd20 / JaXPipe / tpu / PreRev 0.025350906875 s 0.025341306875 s 1.00
jaxmd20 / JaXPipe / tpu / PostRev 0.021560056875 s 0.021869649375 s 0.99
jaxmd20 / JaXPipe / tpu / BothRev 0.025341431875 s 0.02535814125 s 1.00
jaxmd20 / Jax / tpu / BothRev 0.021876835 s 0.021872894375 s 1.00
jaxmd20 / HLOOpt / tpu / PreRev 0.02533986625 s 0.0253574406249999 s 1.00
jaxmd20 / HLOOpt / tpu / PostRev 0.0209853275 s 0.02097280625 s 1.00
jaxmd20 / HLOOpt / tpu / BothRev 0.025255084375 s 0.0252753712499999 s 1.00
jaxmd20 / PartOpt / tpu / PreRev 0.0253586406249999 s 0.0253516543749999 s 1.00
jaxmd20 / PartOpt / tpu / PostRev 0.021514848125 s 0.02152932625 s 1.00
jaxmd20 / PartOpt / tpu / BothRev 0.025273721875 s 0.025272335625 s 1.00
jaxmd20 / IPartOpt / tpu / PreRev 0.0253291575 s 0.025349014375 s 1.00
jaxmd20 / IPartOpt / tpu / PostRev 0.021520535 s 0.021520705625 s 1.00
jaxmd20 / IPartOpt / tpu / BothRev 0.025251193125 s 0.025271966875 s 1.00
jaxmd20 / DefOpt / tpu / PreRev 0.0253626824999999 s 0.025350971875 s 1.00
jaxmd20 / DefOpt / tpu / PostRev 0.0187649318749999 s 0.018796156875 s 1.00
jaxmd20 / DefOpt / tpu / BothRev 0.025278385 s 0.025270051875 s 1.00
jaxmd20 / IDefOpt / tpu / PreRev 0.0253348375 s 0.025350833125 s 1.00
jaxmd20 / IDefOpt / tpu / PostRev 0.01809764875 s 0.01810748375 s 1.00
jaxmd20 / IDefOpt / tpu / BothRev 0.02525376375 s 0.02526988125 s 1.00
jaxmd40 / JaXPipe / cpu / Primal 0.0895589899999999 s 0.064794319 s 1.38
jaxmd40 / Jax / cpu / Primal 0.070354388 s 0.060799723 s 1.16
jaxmd40 / HLOOpt / cpu / Primal 0.0988557229999999 s 0.084688236 s 1.17
jaxmd40 / PartOpt / cpu / Primal 0.089056278 s 0.064421721 s 1.38
jaxmd40 / IPartOpt / cpu / Primal 0.0858872449999999 s 0.070357754 s 1.22
jaxmd40 / DefOpt / cpu / Primal 0.1080280999999999 s 0.086803284 s 1.24
jaxmd40 / IDefOpt / cpu / Primal 0.097163111 s 0.079787198 s 1.22
jaxmd40 / JaXPipe / cpu / Forward 0.1999315599999999 s 0.155879245 s 1.28
jaxmd40 / Jax / cpu / Forward 0.106891377 s 0.08651161 s 1.24
jaxmd40 / HLOOpt / cpu / Forward 0.1949410939999999 s 0.159918111 s 1.22
jaxmd40 / PartOpt / cpu / Forward 0.197680442 s 0.157475402 s 1.26
jaxmd40 / IPartOpt / cpu / Forward 0.205484542 s 0.156772428 s 1.31
jaxmd40 / DefOpt / cpu / Forward 0.202310212 s 0.167871725 s 1.21
jaxmd40 / IDefOpt / cpu / Forward 0.20375368 s 0.15935788 s 1.28
jaxmd40 / JaXPipe / cpu / PreRev 0.256641134 s 0.229345345 s 1.12
jaxmd40 / JaXPipe / cpu / PostRev 0.155128748 s 0.138166208 s 1.12
jaxmd40 / JaXPipe / cpu / BothRev 0.251759801 s 0.220160343 s 1.14
jaxmd40 / Jax / cpu / BothRev 0.178771298 s 0.134181835 s 1.33
jaxmd40 / HLOOpt / cpu / PreRev 0.248316298 s 0.207645172 s 1.20
jaxmd40 / HLOOpt / cpu / PostRev 0.213463205 s 0.174557981 s 1.22
jaxmd40 / HLOOpt / cpu / BothRev 0.276987482 s 0.23236999 s 1.19
jaxmd40 / PartOpt / cpu / PreRev 0.2774425 s 0.227327331 s 1.22
jaxmd40 / PartOpt / cpu / PostRev 0.148465212 s 0.131031843 s 1.13
jaxmd40 / PartOpt / cpu / BothRev 0.299941758 s 0.2489365919999999 s 1.20
jaxmd40 / IPartOpt / cpu / PreRev 0.258336119 s 0.231753635 s 1.11
jaxmd40 / IPartOpt / cpu / PostRev 0.150197237 s 0.139559405 s 1.08
jaxmd40 / IPartOpt / cpu / BothRev 0.296021058 s 0.252466919 s 1.17
jaxmd40 / DefOpt / cpu / PreRev 0.255458953 s 0.21722004 s 1.18
jaxmd40 / DefOpt / cpu / PostRev 0.197120848 s 0.168523171 s 1.17
jaxmd40 / DefOpt / cpu / BothRev 0.268880939 s 0.256271871 s 1.05
jaxmd40 / IDefOpt / cpu / PreRev 0.264546132 s 0.230775248 s 1.15
jaxmd40 / IDefOpt / cpu / PostRev 0.204063954 s 0.167811243 s 1.22
jaxmd40 / IDefOpt / cpu / BothRev 0.288346569 s 0.2396261509999999 s 1.20
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal 1.7036897489999998 s 1.704613862 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal 1.706865788 s 1.707337545 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal 1.717634159 s 1.71679985 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal 1.69872664 s 1.699223816 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal 1.697451992 s 1.697202868 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal 1.666030468 s 1.667867349 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal 1.911652507 s 1.913616855 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal 3.92906063625 s 4.01163404 s 0.98
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal 3.03864475625 s 3.03868473125 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal 3.12119580875 s 3.121042244375 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal 3.059036788125 s 3.05902196125 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal 3.05900937875 s 3.059123505625 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal 2.102641335 s 2.10262360875 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal 4.35619512875 s 4.356125429375 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal 6.880717086 s 5.964431313 s 1.15
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal 6.784770517 s 5.889115877 s 1.15
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal 6.813897442 s 5.979979623999999 s 1.14
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal 6.817683051 s 5.929690399 s 1.15
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal 6.936965362 s 5.923135273 s 1.17
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal 2.755515989 s 2.22214844 s 1.24
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal 7.511662121 s 6.30221173 s 1.19

This comment was automatically generated by workflow using github-action-benchmark.

@wsmoses
Member

wsmoses commented Jan 26, 2026

@jumerckx sorry, this hit rebase nonsense again

@wsmoses wsmoses merged commit bdc475a into main Jan 26, 2026
22 of 25 checks passed
@wsmoses wsmoses deleted the jm/multislice_opt branch January 26, 2026 15:31