@avik-pal
Collaborator

@avik-pal avik-pal commented Jan 7, 2026

No description provided.

@avik-pal avik-pal force-pushed the ap/neuro_benchmark branch 7 times, most recently from 94c8aa8 to 9f60d0f, January 7, 2026 05:32
Contributor

@github-actions github-actions bot left a comment

EnzymeJAX Benchmarks

Details
Benchmark suite Current: c50080b Previous: 7a7d4f5 Ratio (Current / Previous)
actmtch / JaXPipe / cuda / Primal 0.0000024 s 0.000002015 s 1.19
actmtch / Jax / cuda / Primal 0.0000024 s 0.000002015 s 1.19
actmtch / HLOOpt / cuda / Primal 0.0000024 s 0.000001984 s 1.21
actmtch / PartOpt / cuda / Primal 0.000002399 s 0.000002015 s 1.19
actmtch / IPartOpt / cuda / Primal 0.0000024 s 0.000002015 s 1.19
actmtch / DefOpt / cuda / Primal 0.0000024 s 0.000002015 s 1.19
actmtch / IDefOpt / cuda / Primal 0.0000024 s 0.000002015 s 1.19
actmtch / JaXPipe / cuda / Forward 0.000010528 s 0.00000944 s 1.12
actmtch / Jax / cuda / Forward 0.000010272 s 0.0000096 s 1.07
actmtch / HLOOpt / cuda / Forward 0.000010688 s 0.000009664 s 1.11
actmtch / PartOpt / cuda / Forward 0.000010495 s 0.000009568 s 1.10
actmtch / IPartOpt / cuda / Forward 0.000010656 s 0.000009888 s 1.08
actmtch / DefOpt / cuda / Forward 0.000010431 s 0.000009888 s 1.05
actmtch / IDefOpt / cuda / Forward 0.000010272 s 0.000009824 s 1.05
actmtch / JaXPipe / cuda / PreRev 0.000010432 s 0.000009247 s 1.13
actmtch / JaXPipe / cuda / PostRev 0.00001088 s 0.0000096 s 1.13
actmtch / JaXPipe / cuda / BothRev 0.000010592 s 0.000009664 s 1.10
actmtch / Jax / cuda / BothRev 0.00001072 s 0.000009888 s 1.08
actmtch / HLOOpt / cuda / PreRev 0.000010368 s 0.000010208 s 1.02
actmtch / HLOOpt / cuda / PostRev 0.000010432 s 0.000009984 s 1.04
actmtch / HLOOpt / cuda / BothRev 0.000010687 s 0.000010144 s 1.05
actmtch / PartOpt / cuda / PreRev 0.0000104 s 0.000009823 s 1.06
actmtch / PartOpt / cuda / PostRev 0.000010656 s 0.000009952 s 1.07
actmtch / PartOpt / cuda / BothRev 0.000010784 s 0.00000976 s 1.10
actmtch / IPartOpt / cuda / PreRev 0.000010688 s 0.000010176 s 1.05
actmtch / IPartOpt / cuda / PostRev 0.000012928 s 0.000009696 s 1.33
actmtch / IPartOpt / cuda / BothRev 0.000010783 s 0.000009568 s 1.13
actmtch / DefOpt / cuda / PreRev 0.000010528 s 0.000009792 s 1.08
actmtch / DefOpt / cuda / PostRev 0.00001072 s 0.00000992 s 1.08
actmtch / DefOpt / cuda / BothRev 0.000010944 s 0.000010016 s 1.09
actmtch / IDefOpt / cuda / PreRev 0.00001072 s 0.000010112 s 1.06
actmtch / IDefOpt / cuda / PostRev 0.000010464 s 0.000009696 s 1.08
actmtch / IDefOpt / cuda / BothRev 0.000010304 s 0.000009824 s 1.05
actmtch / JaXPipe / tpu / Primal 5.825e-7 s 5.64025e-7 s 1.03
actmtch / Jax / tpu / Primal 5.63075e-7 s 5.96725e-7 s 0.94
actmtch / HLOOpt / tpu / Primal 0.0000021651 s 0.00000209535 s 1.03
actmtch / PartOpt / tpu / Primal 5.63225e-7 s 5.96875e-7 s 0.94
actmtch / IPartOpt / tpu / Primal 5.752999999999999e-7 s 5.527750000000001e-7 s 1.04
actmtch / DefOpt / tpu / Primal 0.000002062275 s 0.0000021667 s 0.95
actmtch / IDefOpt / tpu / Primal 0.00000217415 s 0.000002095325 s 1.04
actmtch / JaXPipe / tpu / Forward 0.000003857625 s 0.000003823775 s 1.01
actmtch / Jax / tpu / Forward 0.0000012321749999999998 s 0.000001206075 s 1.02
actmtch / HLOOpt / tpu / Forward 0.000003682525 s 0.000003932450000000001 s 0.94
actmtch / PartOpt / tpu / Forward 0.000003892375 s 0.000003922824999999999 s 0.99
actmtch / IPartOpt / tpu / Forward 0.0000036728 s 0.00000393125 s 0.93
actmtch / DefOpt / tpu / Forward 0.000003901075 s 0.00000391195 s 1.00
actmtch / IDefOpt / tpu / Forward 0.00000367745 s 0.00000394 s 0.93
actmtch / JaXPipe / tpu / PreRev 0.000003750075 s 0.000003474025 s 1.08
actmtch / JaXPipe / tpu / PostRev 0.0000016241749999999998 s 0.000001645425 s 0.99
actmtch / JaXPipe / tpu / BothRev 0.000003748425 s 0.000003475325 s 1.08
actmtch / Jax / tpu / BothRev 0.000001617025 s 0.000001642275 s 0.98
actmtch / HLOOpt / tpu / PreRev 0.000003748775 s 0.000003480675 s 1.08
actmtch / HLOOpt / tpu / PostRev 0.000003434325 s 0.00000340625 s 1.01
actmtch / HLOOpt / tpu / BothRev 0.00000374765 s 0.000003476675 s 1.08
actmtch / PartOpt / tpu / PreRev 0.0000034422750000000004 s 0.000003408575 s 1.01
actmtch / PartOpt / tpu / PostRev 0.0000016636 s 0.00000159245 s 1.04
actmtch / PartOpt / tpu / BothRev 0.00000343615 s 0.0000034163500000000003 s 1.01
actmtch / IPartOpt / tpu / PreRev 0.0000037365 s 0.000003466875 s 1.08
actmtch / IPartOpt / tpu / PostRev 0.00000162335 s 0.0000016414 s 0.99
actmtch / IPartOpt / tpu / BothRev 0.00000374855 s 0.00000348575 s 1.08
actmtch / DefOpt / tpu / PreRev 0.000003451725 s 0.00000341775 s 1.01
actmtch / DefOpt / tpu / PostRev 0.0000036698 s 0.000003414525 s 1.07
actmtch / DefOpt / tpu / BothRev 0.000003453625 s 0.00000341335 s 1.01
actmtch / IDefOpt / tpu / PreRev 0.000003749975 s 0.000003470725 s 1.08
actmtch / IDefOpt / tpu / PostRev 0.0000034603 s 0.0000034141 s 1.01
actmtch / IDefOpt / tpu / BothRev 0.0000037466 s 0.000003471675 s 1.08
actmtch / JaXPipe / cpu / Primal 0.000013289 s 0.000006552539998665452 s 2.03
actmtch / Jax / cpu / Primal 0.000013356 s 0.000006462679994001519 s 2.07
actmtch / HLOOpt / cpu / Primal 0.000014082 s 0.000007385220033029327 s 1.91
actmtch / PartOpt / cpu / Primal 0.000013368 s 0.000006657979920419166 s 2.01
actmtch / IPartOpt / cpu / Primal 0.000013363 s 0.0000066309799694863614 s 2.02
actmtch / DefOpt / cpu / Primal 0.000014032 s 0.000007643220014870166 s 1.84
actmtch / IDefOpt / cpu / Primal 0.000013959 s 0.000006935199999134056 s 2.01
actmtch / JaXPipe / cpu / Forward 0.000019163 s 0.000010832880034286064 s 1.77
actmtch / Jax / cpu / Forward 0.000018009 s 0.00000932088001718512 s 1.93
actmtch / HLOOpt / cpu / Forward 0.000018973 s 0.000010819079961947865 s 1.75
actmtch / PartOpt / cpu / Forward 0.000019031 s 0.000010567340013949434 s 1.80
actmtch / IPartOpt / cpu / Forward 0.000018797 s 0.000010711440027080245 s 1.75
actmtch / DefOpt / cpu / Forward 0.000019135 s 0.000011212999943381874 s 1.71
actmtch / IDefOpt / cpu / Forward 0.000019065 s 0.000010483019959792727 s 1.82
actmtch / JaXPipe / cpu / PreRev 0.000019475 s 0.00001079392011888558 s 1.80
actmtch / JaXPipe / cpu / PostRev 0.000017492 s 0.00001015964004182024 s 1.72
actmtch / JaXPipe / cpu / BothRev 0.000018777 s 0.000011096060025010956 s 1.69
actmtch / Jax / cpu / BothRev 0.000017638 s 0.000009553080035402672 s 1.85
actmtch / HLOOpt / cpu / PreRev 0.0000193 s 0.000011104160057584522 s 1.74
actmtch / HLOOpt / cpu / PostRev 0.000019129 s 0.000013026439937675604 s 1.47
actmtch / HLOOpt / cpu / BothRev 0.000019211 s 0.000010503799985599472 s 1.83
actmtch / PartOpt / cpu / PreRev 0.000019094 s 0.00001085412002794328 s 1.76
actmtch / PartOpt / cpu / PostRev 0.000017708 s 0.000009770079923328013 s 1.81
actmtch / PartOpt / cpu / BothRev 0.000019192 s 0.00001131148008425953 s 1.70
actmtch / IPartOpt / cpu / PreRev 0.000019211 s 0.000010512220051168697 s 1.83
actmtch / IPartOpt / cpu / PostRev 0.000017601 s 0.000009670899926277345 s 1.82
actmtch / IPartOpt / cpu / BothRev 0.000019384 s 0.00001107394004066009 s 1.75
actmtch / DefOpt / cpu / PreRev 0.000019259 s 0.000010589979974611196 s 1.82
actmtch / DefOpt / cpu / PostRev 0.000018828 s 0.000011556600020412588 s 1.63
actmtch / DefOpt / cpu / BothRev 0.000019082 s 0.000010426920071040512 s 1.83
actmtch / IDefOpt / cpu / PreRev 0.000019419 s 0.000010652239907358308 s 1.82
actmtch / IDefOpt / cpu / PostRev 0.000018921 s 0.00001113797998186783 s 1.70
actmtch / IDefOpt / cpu / BothRev 0.00001903 s 0.00001050803997713956 s 1.81
add_one / JaXPipe / cuda / Primal 0.000002335 s 0.000001919 s 1.22
add_one / Jax / cuda / Primal 0.000002335 s 0.0000019200000000000003 s 1.22
add_one / HLOOpt / cuda / Primal 0.000002304 s 0.0000019200000000000003 s 1.20
add_one / PartOpt / cuda / Primal 0.000002335 s 0.000001919 s 1.22
add_one / IPartOpt / cuda / Primal 0.000002304 s 0.0000019200000000000003 s 1.20
add_one / DefOpt / cuda / Primal 0.000002335 s 0.000001919 s 1.22
add_one / IDefOpt / cuda / Primal 0.000002335 s 0.0000019200000000000003 s 1.22
add_one / JaXPipe / cuda / Forward 0.000010784 s 0.0000096 s 1.12
add_one / Jax / cuda / Forward 0.000010592 s 0.000010303 s 1.03
add_one / HLOOpt / cuda / Forward 0.00001056 s 0.000010016 s 1.05
add_one / PartOpt / cuda / Forward 0.000010592 s 0.000009248 s 1.15
add_one / IPartOpt / cuda / Forward 0.000010656 s 0.000009856 s 1.08
add_one / DefOpt / cuda / Forward 0.000010656 s 0.00000944 s 1.13
add_one / IDefOpt / cuda / Forward 0.000010529 s 0.000009536 s 1.10
add_one / JaXPipe / cuda / PreRev 0.000024511000000000003 s 0.000024704 s 0.99
add_one / JaXPipe / cuda / PostRev 0.0000248 s 0.000024415 s 1.02
add_one / JaXPipe / cuda / BothRev 0.000024672 s 0.000024576 s 1.00
add_one / Jax / cuda / BothRev 0.000025407 s 0.000024255 s 1.05
add_one / HLOOpt / cuda / PreRev 0.000025664 s 0.000024896 s 1.03
add_one / HLOOpt / cuda / PostRev 0.000025151 s 0.000023936 s 1.05
add_one / HLOOpt / cuda / BothRev 0.00002512 s 0.00002448 s 1.03
add_one / PartOpt / cuda / PreRev 0.000025248 s 0.000024159 s 1.05
add_one / PartOpt / cuda / PostRev 0.000025472000000000003 s 0.000024448 s 1.04
add_one / PartOpt / cuda / BothRev 0.000029152 s 0.000024479 s 1.19
add_one / IPartOpt / cuda / PreRev 0.000025567 s 0.000024864 s 1.03
add_one / IPartOpt / cuda / PostRev 0.000025312 s 0.00002464 s 1.03
add_one / IPartOpt / cuda / BothRev 0.000028608 s 0.000025312 s 1.13
add_one / DefOpt / cuda / PreRev 0.000025247 s 0.000024768 s 1.02
add_one / DefOpt / cuda / PostRev 0.000024832 s 0.000024704 s 1.01
add_one / DefOpt / cuda / BothRev 0.000025345 s 0.000024575 s 1.03
add_one / IDefOpt / cuda / PreRev 0.000024608 s 0.000025024 s 0.98
add_one / IDefOpt / cuda / PostRev 0.000025152 s 0.000024352 s 1.03
add_one / IDefOpt / cuda / BothRev 0.000025056 s 0.000024416 s 1.03
add_one / JaXPipe / tpu / Primal 0.0000014445250000000002 s 0.0000014285000000000002 s 1.01
add_one / Jax / tpu / Primal 0.000001445025 s 0.00000140035 s 1.03
add_one / HLOOpt / tpu / Primal 0.0000014499 s 0.0000014297499999999995 s 1.01
add_one / PartOpt / tpu / Primal 0.0000014558499999999998 s 0.0000014069999999999998 s 1.03
add_one / IPartOpt / tpu / Primal 0.0000014522999999999998 s 0.00000142595 s 1.02
add_one / DefOpt / tpu / Primal 0.0000014532 s 0.000001403875 s 1.04
add_one / IDefOpt / tpu / Primal 0.000001451075 s 0.00000142315 s 1.02
add_one / JaXPipe / tpu / Forward 0.0000019067 s 0.000001847025 s 1.03
add_one / Jax / tpu / Forward 0.00000186305 s 0.000001833725 s 1.02
add_one / HLOOpt / tpu / Forward 0.000001909625 s 0.000001849325 s 1.03
add_one / PartOpt / tpu / Forward 0.000001876975 s 0.00000184025 s 1.02
add_one / IPartOpt / tpu / Forward 0.000001903575 s 0.0000018542 s 1.03
add_one / DefOpt / tpu / Forward 0.000001864725 s 0.000001840125 s 1.01
add_one / IDefOpt / tpu / Forward 0.000001904225 s 0.0000018431 s 1.03
add_one / JaXPipe / tpu / PreRev 0.0000022687 s 0.000002231 s 1.02
add_one / JaXPipe / tpu / PostRev 0.0000022959 s 0.0000022341000000000003 s 1.03
add_one / JaXPipe / tpu / BothRev 0.000002259 s 0.00000223525 s 1.01
add_one / Jax / tpu / BothRev 0.00000230265 s 0.000002241375 s 1.03
add_one / HLOOpt / tpu / PreRev 0.0000022548 s 0.000002234475 s 1.01
add_one / HLOOpt / tpu / PostRev 0.0000022925 s 0.0000022342 s 1.03
add_one / HLOOpt / tpu / BothRev 0.0000022592000000000003 s 0.000002238875 s 1.01
add_one / PartOpt / tpu / PreRev 0.000002293675 s 0.000002242175 s 1.02
add_one / PartOpt / tpu / PostRev 0.000002279675 s 0.00000223735 s 1.02
add_one / PartOpt / tpu / BothRev 0.000002292 s 0.000002239825 s 1.02
add_one / IPartOpt / tpu / PreRev 0.000002259675 s 0.0000022381750000000004 s 1.01
add_one / IPartOpt / tpu / PostRev 0.00000228755 s 0.0000022416 s 1.02
add_one / IPartOpt / tpu / BothRev 0.00000225615 s 0.000002238225 s 1.01
add_one / DefOpt / tpu / PreRev 0.000002300375 s 0.000002236725 s 1.03
add_one / DefOpt / tpu / PostRev 0.0000022613 s 0.0000022327 s 1.01
add_one / DefOpt / tpu / BothRev 0.0000022898 s 0.00000224765 s 1.02
add_one / IDefOpt / tpu / PreRev 0.0000022613 s 0.00000223625 s 1.01
add_one / IDefOpt / tpu / PostRev 0.000002292575 s 0.0000022397 s 1.02
add_one / IDefOpt / tpu / BothRev 0.00000226415 s 0.000002248125 s 1.01
add_one / JaXPipe / cpu / Primal 0.000013037 s 0.000006873860002087895 s 1.90
add_one / Jax / cpu / Primal 0.000013089 s 0.000006641540094278752 s 1.97
add_one / HLOOpt / cpu / Primal 0.000012676 s 0.000006734419948770665 s 1.88
add_one / PartOpt / cpu / Primal 0.000012891 s 0.000006703060043946607 s 1.92
add_one / IPartOpt / cpu / Primal 0.000012876 s 0.000007102279996615835 s 1.81
add_one / DefOpt / cpu / Primal 0.000012823 s 0.000006567140026163543 s 1.95
add_one / IDefOpt / cpu / Primal 0.000012738 s 0.000006652760075667174 s 1.91
add_one / JaXPipe / cpu / Forward 0.000017662 s 0.000009910600037983386 s 1.78
add_one / Jax / cpu / Forward 0.000017507 s 0.0000099407600282575 s 1.76
add_one / HLOOpt / cpu / Forward 0.000017684 s 0.000010102879969053902 s 1.75
add_one / PartOpt / cpu / Forward 0.000017622 s 0.000010000159909395734 s 1.76
add_one / IPartOpt / cpu / Forward 0.000017911 s 0.000010069060044770594 s 1.78
add_one / DefOpt / cpu / Forward 0.000017597 s 0.000010091080039273947 s 1.74
add_one / IDefOpt / cpu / Forward 0.000017406999999999998 s 0.000010010940022766593 s 1.74
add_one / JaXPipe / cpu / PreRev 0.000020017 s 0.00001154700001279707 s 1.73
add_one / JaXPipe / cpu / PostRev 0.00001979 s 0.000010986779925588052 s 1.80
add_one / JaXPipe / cpu / BothRev 0.000019905 s 0.00001186152005175245 s 1.68
add_one / Jax / cpu / BothRev 0.000019558 s 0.000011301159975118936 s 1.73
add_one / HLOOpt / cpu / PreRev 0.000019616 s 0.000011737559998437065 s 1.67
add_one / HLOOpt / cpu / PostRev 0.000019794 s 0.000013497480031219311 s 1.47
add_one / HLOOpt / cpu / BothRev 0.000019735 s 0.000011520519910845904 s 1.71
add_one / PartOpt / cpu / PreRev 0.000019721 s 0.000011383839992049615 s 1.73
add_one / PartOpt / cpu / PostRev 0.000019936 s 0.0000113441400208103 s 1.76
add_one / PartOpt / cpu / BothRev 0.000019643 s 0.000011659819992928531 s 1.68
add_one / IPartOpt / cpu / PreRev 0.000019797 s 0.00001109857996198116 s 1.78
add_one / IPartOpt / cpu / PostRev 0.000019677 s 0.000011195859951840247 s 1.76
add_one / IPartOpt / cpu / BothRev 0.000019518 s 0.000011110880095657192 s 1.76
add_one / DefOpt / cpu / PreRev 0.000019648 s 0.00001151215998106636 s 1.71
add_one / DefOpt / cpu / PostRev 0.000019787 s 0.00001111472003685776 s 1.78
add_one / DefOpt / cpu / BothRev 0.000019901 s 0.00001123659993027104 s 1.77
add_one / IDefOpt / cpu / PreRev 0.000019775 s 0.000011117619960714364 s 1.78
add_one / IDefOpt / cpu / PostRev 0.000019846 s 0.000012493279918999178 s 1.59
add_one / IDefOpt / cpu / BothRev 0.000019596 s 0.00001183342010335764 s 1.66
add_two / JaXPipe / cuda / Primal 0.000002431 s 0.000001887 s 1.29
add_two / Jax / cuda / Primal 0.000002432 s 0.000001887 s 1.29
add_two / HLOOpt / cuda / Primal 0.000002431 s 0.000001887 s 1.29
add_two / PartOpt / cuda / Primal 0.000002431 s 0.000001887 s 1.29
add_two / IPartOpt / cuda / Primal 0.000002432 s 0.000001887 s 1.29
add_two / DefOpt / cuda / Primal 0.000002431 s 0.000001887 s 1.29
add_two / IDefOpt / cuda / Primal 0.000002431 s 0.000001887 s 1.29
add_two / JaXPipe / cuda / Forward 0.000011329 s 0.00000992 s 1.14
add_two / Jax / cuda / Forward 0.000011104 s 0.000009472 s 1.17
add_two / HLOOpt / cuda / Forward 0.000011328 s 0.000009696 s 1.17
add_two / PartOpt / cuda / Forward 0.000010592 s 0.00000976 s 1.09
add_two / IPartOpt / cuda / Forward 0.000010272 s 0.00000944 s 1.09
add_two / DefOpt / cuda / Forward 0.000010528 s 0.000009791 s 1.08
add_two / IDefOpt / cuda / Forward 0.000012864 s 0.000009408 s 1.37
add_two / JaXPipe / cuda / PreRev 0.000031743 s 0.000032063 s 0.99
add_two / JaXPipe / cuda / PostRev 0.000031999 s 0.000031231 s 1.02
add_two / JaXPipe / cuda / BothRev 0.000031904000000000005 s 0.000031903 s 1.00
add_two / Jax / cuda / BothRev 0.000032192 s 0.00003184 s 1.01
add_two / HLOOpt / cuda / PreRev 0.000032672 s 0.000032352 s 1.01
add_two / HLOOpt / cuda / PostRev 0.000032160000000000004 s 0.000031456 s 1.02
add_two / HLOOpt / cuda / BothRev 0.000031967 s 0.000032928 s 0.97
add_two / PartOpt / cuda / PreRev 0.000032704 s 0.000032864 s 1.00
add_two / PartOpt / cuda / PostRev 0.000031487 s 0.000032544 s 0.97
add_two / PartOpt / cuda / BothRev 0.000032095 s 0.000031743 s 1.01
add_two / IPartOpt / cuda / PreRev 0.000033087 s 0.000031968 s 1.04
add_two / IPartOpt / cuda / PostRev 0.000031936 s 0.000032672 s 0.98
add_two / IPartOpt / cuda / BothRev 0.000032384 s 0.00003648 s 0.89
add_two / DefOpt / cuda / PreRev 0.000032416 s 0.000036671 s 0.88
add_two / DefOpt / cuda / PostRev 0.000031776 s 0.000035839 s 0.89
add_two / DefOpt / cuda / BothRev 0.000031839 s 0.000035776000000000004 s 0.89
add_two / IDefOpt / cuda / PreRev 0.000032672 s 0.000036064 s 0.91
add_two / IDefOpt / cuda / PostRev 0.00003184 s 0.000036415 s 0.87
add_two / IDefOpt / cuda / BothRev 0.000032351 s 0.000032096 s 1.01
add_two / JaXPipe / tpu / Primal 0.0000013981 s 0.00000143495 s 0.97
add_two / Jax / tpu / Primal 0.000001456275 s 0.000001478325 s 0.99
add_two / HLOOpt / tpu / Primal 0.00000139505 s 0.000001433025 s 0.97
add_two / PartOpt / tpu / Primal 0.0000014438999999999998 s 0.00000147025 s 0.98
add_two / IPartOpt / tpu / Primal 0.000001387125 s 0.0000014277500000000002 s 0.97
add_two / DefOpt / tpu / Primal 0.0000014413 s 0.00000147415 s 0.98
add_two / IDefOpt / tpu / Primal 0.0000013894249999999998 s 0.0000014370250000000002 s 0.97
add_two / JaXPipe / tpu / Forward 0.00000180235 s 0.00000183055 s 0.98
add_two / Jax / tpu / Forward 0.0000017926 s 0.000001831125 s 0.98
add_two / HLOOpt / tpu / Forward 0.000001808325 s 0.00000182945 s 0.99
add_two / PartOpt / tpu / Forward 0.0000017868249999999998 s 0.0000018309 s 0.98
add_two / IPartOpt / tpu / Forward 0.000001808875 s 0.000001826875 s 0.99
add_two / DefOpt / tpu / Forward 0.00000179785 s 0.00000182775 s 0.98
add_two / IDefOpt / tpu / Forward 0.00000181895 s 0.000001824025 s 1.00
add_two / JaXPipe / tpu / PreRev 0.000002802125 s 0.00000284975 s 0.98
add_two / JaXPipe / tpu / PostRev 0.00000272 s 0.0000027693000000000003 s 0.98
add_two / JaXPipe / tpu / BothRev 0.000002796175 s 0.0000028433000000000004 s 0.98
add_two / Jax / tpu / BothRev 0.00000273065 s 0.0000027424 s 1.00
add_two / HLOOpt / tpu / PreRev 0.000002800275 s 0.000002847575 s 0.98
add_two / HLOOpt / tpu / PostRev 0.000002728175 s 0.0000027530500000000003 s 0.99
add_two / HLOOpt / tpu / BothRev 0.0000027923500000000005 s 0.000002843875 s 0.98
add_two / PartOpt / tpu / PreRev 0.000002728075 s 0.000002747275 s 0.99
add_two / PartOpt / tpu / PostRev 0.000002802025 s 0.0000028402500000000005 s 0.99
add_two / PartOpt / tpu / BothRev 0.000002723475 s 0.00000274815 s 0.99
add_two / IPartOpt / tpu / PreRev 0.0000028019 s 0.00000284545 s 0.98
add_two / IPartOpt / tpu / PostRev 0.0000027226 s 0.000002744725 s 0.99
add_two / IPartOpt / tpu / BothRev 0.000002793775 s 0.000002844925 s 0.98
add_two / DefOpt / tpu / PreRev 0.0000027136500000000003 s 0.000002748175 s 0.99
add_two / DefOpt / tpu / PostRev 0.00000281775 s 0.000002833925 s 0.99
add_two / DefOpt / tpu / BothRev 0.000002731175 s 0.000002753225 s 0.99
add_two / IDefOpt / tpu / PreRev 0.0000027999 s 0.00000283975 s 0.99
add_two / IDefOpt / tpu / PostRev 0.0000027187 s 0.000002749175 s 0.99
add_two / IDefOpt / tpu / BothRev 0.000002792825 s 0.0000028428750000000003 s 0.98
add_two / JaXPipe / cpu / Primal 0.000013538 s 0.000006639200055360561 s 2.04
add_two / Jax / cpu / Primal 0.000013323 s 0.000006919120005477453 s 1.93
add_two / HLOOpt / cpu / Primal 0.000013342 s 0.000007400899976346409 s 1.80
add_two / PartOpt / cpu / Primal 0.000013227 s 0.000007223580032587052 s 1.83
add_two / IPartOpt / cpu / Primal 0.000013258 s 0.000007824360091035487 s 1.69
add_two / DefOpt / cpu / Primal 0.000013431 s 0.000006683000028715469 s 2.01
add_two / IDefOpt / cpu / Primal 0.000013081 s 0.00000732392003556015 s 1.79
add_two / JaXPipe / cpu / Forward 0.000017773 s 0.00001016425998386694 s 1.75
add_two / Jax / cpu / Forward 0.000017926 s 0.000010257819976686733 s 1.75
add_two / HLOOpt / cpu / Forward 0.000017763000000000003 s 0.000010419079990242608 s 1.70
add_two / PartOpt / cpu / Forward 0.000017887 s 0.00001018144001136534 s 1.76
add_two / IPartOpt / cpu / Forward 0.000017976 s 0.000010250659961457132 s 1.75
add_two / DefOpt / cpu / Forward 0.000018078 s 0.00001075334002962336 s 1.68
add_two / IDefOpt / cpu / Forward 0.000017896 s 0.000010367140021116938 s 1.73
add_two / JaXPipe / cpu / PreRev 0.000023586 s 0.000013726959950872695 s 1.72
add_two / JaXPipe / cpu / PostRev 0.000023318 s 0.000013312379996932575 s 1.75
add_two / JaXPipe / cpu / BothRev 0.000022992 s 0.000013630859866680112 s 1.69
add_two / Jax / cpu / BothRev 0.000022975 s 0.00001391522002450074 s 1.65
add_two / HLOOpt / cpu / PreRev 0.000023049 s 0.000014176519925968025 s 1.63
add_two / HLOOpt / cpu / PostRev 0.000022574 s 0.00001600007994056796 s 1.41
add_two / HLOOpt / cpu / BothRev 0.000023493 s 0.000013237259991001335 s 1.77
add_two / PartOpt / cpu / PreRev 0.000022773 s 0.000013702699980058242 s 1.66
add_two / PartOpt / cpu / PostRev 0.000023463 s 0.000013647719970322216 s 1.72
add_two / PartOpt / cpu / BothRev 0.000023389 s 0.000013649460015585646 s 1.71
add_two / IPartOpt / cpu / PreRev 0.000023527 s 0.00001414162001310615 s 1.66
add_two / IPartOpt / cpu / PostRev 0.00002308 s 0.000013527139999496284 s 1.71
add_two / IPartOpt / cpu / BothRev 0.000023363 s 0.000013481519945344189 s 1.73
add_two / DefOpt / cpu / PreRev 0.000023199 s 0.000013913919956394238 s 1.67
add_two / DefOpt / cpu / PostRev 0.000023264 s 0.00001358887988317292 s 1.71
add_two / DefOpt / cpu / BothRev 0.000023274 s 0.000013515859882318182 s 1.72
add_two / IDefOpt / cpu / PreRev 0.000023478 s 0.000013658319894602754 s 1.72
add_two / IDefOpt / cpu / PostRev 0.000023484 s 0.000013398179926298323 s 1.75
add_two / IDefOpt / cpu / BothRev 0.000023215 s 0.000013483259972417726 s 1.72
cache / JaXPipe / cuda / Primal 0.000002335 s 0.000002335 s 1
cache / Jax / cuda / Primal 0.000002336 s 0.000002336 s 1
cache / HLOOpt / cuda / Primal 0.000002335 s 0.00000224 s 1.04
cache / PartOpt / cuda / Primal 0.000002335 s 0.000002304 s 1.01
cache / IPartOpt / cuda / Primal 0.000002335 s 0.000002335 s 1
cache / DefOpt / cuda / Primal 0.000002335 s 0.000002273 s 1.03
cache / IDefOpt / cuda / Primal 0.000002335 s 0.000002272 s 1.03
cache / JaXPipe / cuda / Forward 0.0000023670000000000004 s 0.000002336 s 1.01
cache / Jax / cuda / Forward 0.000002336 s 0.0000023670000000000004 s 0.99
cache / HLOOpt / cuda / Forward 0.0000023670000000000004 s 0.0000023670000000000004 s 1
cache / PartOpt / cuda / Forward 0.000002336 s 0.0000023670000000000004 s 0.99
cache / IPartOpt / cuda / Forward 0.0000023670000000000004 s 0.0000023670000000000004 s 1
cache / DefOpt / cuda / Forward 0.0000023670000000000004 s 0.000002272 s 1.04
cache / IDefOpt / cuda / Forward 0.0000023670000000000004 s 0.0000023670000000000004 s 1
cache / JaXPipe / cuda / PreRev 0.000010816 s 0.000010144 s 1.07
cache / JaXPipe / cuda / PostRev 0.0000104 s 0.000010272 s 1.01
cache / JaXPipe / cuda / BothRev 0.0000112 s 0.00000992 s 1.13
cache / Jax / cuda / BothRev 0.00001072 s 0.000010176 s 1.05
cache / HLOOpt / cuda / PreRev 0.00001376 s 0.000013183 s 1.04
cache / HLOOpt / cuda / PostRev 0.000013696 s 0.00001312 s 1.04
cache / HLOOpt / cuda / BothRev 0.000013728 s 0.000013184 s 1.04
cache / PartOpt / cuda / PreRev 0.000010816 s 0.000010623 s 1.02
cache / PartOpt / cuda / PostRev 0.000010688 s 0.00001024 s 1.04
cache / PartOpt / cuda / BothRev 0.000010656 s 0.000011647 s 0.91
cache / IPartOpt / cuda / PreRev 0.000010752 s 0.00001072 s 1.00
cache / IPartOpt / cuda / PostRev 0.000011008 s 0.000011936 s 0.92
cache / IPartOpt / cuda / BothRev 0.0000112 s 0.000010464 s 1.07
cache / DefOpt / cuda / PreRev 0.000010784 s 0.000011775 s 0.92
cache / DefOpt / cuda / PostRev 0.000010752 s 0.000011327 s 0.95
cache / DefOpt / cuda / BothRev 0.000010623 s 0.000010527 s 1.01
cache / IDefOpt / cuda / PreRev 0.000010816 s 0.000010112 s 1.07
cache / IDefOpt / cuda / PostRev 0.000010593 s 0.000010304 s 1.03
cache / IDefOpt / cuda / BothRev 0.000011520000000000002 s 0.00001056 s 1.09
cache / JaXPipe / tpu / Primal 0.0000024534 s 0.000002465925 s 0.99
cache / Jax / tpu / Primal 0.000002475825 s 0.000002454575 s 1.01
cache / HLOOpt / tpu / Primal 0.000002467125 s 0.00000246235 s 1.00
cache / PartOpt / tpu / Primal 0.0000024798 s 0.000002465875 s 1.01
cache / IPartOpt / tpu / Primal 0.000002467675 s 0.0000024584 s 1.00
cache / DefOpt / tpu / Primal 0.0000024774750000000004 s 0.0000024616250000000003 s 1.01
cache / IDefOpt / tpu / Primal 0.000002471725 s 0.000002461 s 1.00
cache / JaXPipe / tpu / Forward 0.000003537575 s 0.000003532375 s 1.00
cache / Jax / tpu / Forward 0.000003540275 s 0.0000035432750000000004 s 1.00
cache / HLOOpt / tpu / Forward 0.0000035617749999999995 s 0.000003569075 s 1.00
cache / PartOpt / tpu / Forward 0.000003529175 s 0.00000353285 s 1.00
cache / IPartOpt / tpu / Forward 0.00000355885 s 0.0000035633 s 1.00
cache / DefOpt / tpu / Forward 0.000003523725 s 0.000003526175 s 1.00
cache / IDefOpt / tpu / Forward 0.0000035503750000000003 s 0.000003552625 s 1.00
cache / JaXPipe / tpu / PreRev 0.000004943874999999999 s 0.00000494295 s 1.00
cache / JaXPipe / tpu / PostRev 0.0000050378 s 0.000004941975 s 1.02
cache / JaXPipe / tpu / BothRev 0.00000498595 s 0.000004965924999999999 s 1.00
cache / Jax / tpu / BothRev 0.000005028175 s 0.000004960275 s 1.01
cache / HLOOpt / tpu / PreRev 0.0000041347250000000005 s 0.00000392735 s 1.05
cache / HLOOpt / tpu / PostRev 0.000004146525 s 0.000004120599999999999 s 1.01
cache / HLOOpt / tpu / BothRev 0.000004134025000000001 s 0.000003920175000000001 s 1.05
cache / PartOpt / tpu / PreRev 0.000005009475 s 0.000004993325 s 1.00
cache / PartOpt / tpu / PostRev 0.000004989075 s 0.00000495125 s 1.01
cache / PartOpt / tpu / BothRev 0.000005021025 s 0.000004992050000000001 s 1.01
cache / IPartOpt / tpu / PreRev 0.00000501345 s 0.00000496495 s 1.01
cache / IPartOpt / tpu / PostRev 0.0000049997 s 0.000004982175 s 1.00
cache / IPartOpt / tpu / BothRev 0.000005003675 s 0.000004973575 s 1.01
cache / DefOpt / tpu / PreRev 0.0000050145 s 0.00000498995 s 1.00
cache / DefOpt / tpu / PostRev 0.000004980425 s 0.000004962525 s 1.00
cache / DefOpt / tpu / BothRev 0.000005020274999999999 s 0.000004980075 s 1.01
cache / IDefOpt / tpu / PreRev 0.000004991575 s 0.0000049623 s 1.01
cache / IDefOpt / tpu / PostRev 0.00000499815 s 0.00000495285 s 1.01
cache / IDefOpt / tpu / BothRev 0.000004972 s 0.00000495055 s 1.00
cache / JaXPipe / cpu / Primal 0.000012717 s 0.000006346800018945942 s 2.00
cache / Jax / cpu / Primal 0.000012586 s 0.0000064703600401117 s 1.95
cache / HLOOpt / cpu / Primal 0.000012348 s 0.000006257859986362746 s 1.97
cache / PartOpt / cpu / Primal 0.00001285 s 0.000006558360055350931 s 1.96
cache / IPartOpt / cpu / Primal 0.00001273 s 0.0000060225199558772144 s 2.11
cache / DefOpt / cpu / Primal 0.000012586 s 0.0000061135000214562754 s 2.06
cache / IDefOpt / cpu / Primal 0.000012676 s 0.000005915639994782396 s 2.14
cache / JaXPipe / cpu / Forward 0.000017131 s 0.000014798280026298015 s 1.16
cache / Jax / cpu / Forward 0.000017222000000000002 s 0.000014563039967470103 s 1.18
cache / HLOOpt / cpu / Forward 0.000016896999999999998 s 0.00001559456002723891 s 1.08
cache / PartOpt / cpu / Forward 0.000017187 s 0.000015037659995869034 s 1.14
cache / IPartOpt / cpu / Forward 0.000017068 s 0.000015018099966255249 s 1.14
cache / DefOpt / cpu / Forward 0.000016887 s 0.000015332019993365977 s 1.10
cache / IDefOpt / cpu / Forward 0.000017208000000000002 s 0.000014673120003863004 s 1.17
cache / JaXPipe / cpu / PreRev 0.000018005 s 0.000016887860037968494 s 1.07
cache / JaXPipe / cpu / PostRev 0.000020005 s 0.0000211999999373802 s 0.94
cache / JaXPipe / cpu / BothRev 0.000017589 s 0.000016905980064620963 s 1.04
cache / Jax / cpu / BothRev 0.000019737 s 0.000021262679965730057 s 0.93
cache / HLOOpt / cpu / PreRev 0.00001749 s 0.000016826700029923813 s 1.04
cache / HLOOpt / cpu / PostRev 0.000017494 s 0.000019113939961243885 s 0.92
cache / HLOOpt / cpu / BothRev 0.000017955 s 0.000016683340036252047 s 1.08
cache / PartOpt / cpu / PreRev 0.000017956 s 0.00001720952001051046 s 1.04
cache / PartOpt / cpu / PostRev 0.000018847 s 0.00002065606000542175 s 0.91
cache / PartOpt / cpu / BothRev 0.000017697 s 0.000016254839956673094 s 1.09
cache / IPartOpt / cpu / PreRev 0.000017590000000000003 s 0.000016843620032886973 s 1.04
cache / IPartOpt / cpu / PostRev 0.00001905 s 0.000021748199960711643 s 0.88
cache / IPartOpt / cpu / BothRev 0.000017755 s 0.00001659785995798302 s 1.07
cache / DefOpt / cpu / PreRev 0.000017371 s 0.000016926499993132892 s 1.03
cache / DefOpt / cpu / PostRev 0.000017179 s 0.00001647445998969488 s 1.04
cache / DefOpt / cpu / BothRev 0.000017275 s 0.000016036959987104638 s 1.08
cache / IDefOpt / cpu / PreRev 0.000017738000000000002 s 0.000016723539974918822 s 1.06
cache / IDefOpt / cpu / PostRev 0.000017343 s 0.000016724459928809665 s 1.04
cache / IDefOpt / cpu / BothRev 0.000017718000000000002 s 0.00001608033997399616 s 1.10
Concat / JaXPipe / cuda / Primal 0.000002463 s 0.000001919 s 1.28
Concat / Jax / cuda / Primal 0.000002463 s 0.0000019200000000000003 s 1.28
Concat / HLOOpt / cuda / Primal 0.000002463 s 0.000001919 s 1.28
Concat / PartOpt / cuda / Primal 0.000002463 s 0.0000019200000000000003 s 1.28
Concat / IPartOpt / cuda / Primal 0.000002463 s 0.000001919 s 1.28
Concat / DefOpt / cuda / Primal 0.000002463 s 0.0000019200000000000003 s 1.28
Concat / IDefOpt / cuda / Primal 0.000002463 s 0.000001919 s 1.28
Concat / JaXPipe / cuda / Forward 0.000010655 s 0.000009696 s 1.10
Concat / Jax / cuda / Forward 0.000010592 s 0.00000992 s 1.07
Concat / HLOOpt / cuda / Forward 0.000010656 s 0.000009664 s 1.10
Concat / PartOpt / cuda / Forward 0.000010464 s 0.000009536 s 1.10
Concat / IPartOpt / cuda / Forward 0.000010592 s 0.000009824 s 1.08
Concat / DefOpt / cuda / Forward 0.00001072 s 0.000009951 s 1.08
Concat / IDefOpt / cuda / Forward 0.000010464 s 0.0000096 s 1.09
Concat / JaXPipe / cuda / PreRev 0.000017119 s 0.000015487 s 1.11
Concat / JaXPipe / cuda / PostRev 0.00001696 s 0.00001632 s 1.04
Concat / JaXPipe / cuda / BothRev 0.000016864 s 0.000016576000000000002 s 1.02
Concat / Jax / cuda / BothRev 0.000016895 s 0.000015968 s 1.06
Concat / HLOOpt / cuda / PreRev 0.000016832 s 0.000016128 s 1.04
Concat / HLOOpt / cuda / PostRev 0.000017184 s 0.000015744 s 1.09
Concat / HLOOpt / cuda / BothRev 0.000016832 s 0.000016416 s 1.03
Concat / PartOpt / cuda / PreRev 0.000017216 s 0.000016288 s 1.06
Concat / PartOpt / cuda / PostRev 0.00001696 s 0.000015935999999999998 s 1.06
Concat / PartOpt / cuda / BothRev 0.000016864 s 0.000016063999999999997 s 1.05
Concat / IPartOpt / cuda / PreRev 0.000017184 s 0.000016768000000000003 s 1.02
Concat / IPartOpt / cuda / PostRev 0.000017024 s 0.00001616 s 1.05
Concat / IPartOpt / cuda / BothRev 0.000016864 s 0.000015711 s 1.07
Concat / DefOpt / cuda / PreRev 0.000016512 s 0.000015649 s 1.06
Concat / DefOpt / cuda / PostRev 0.000016544 s 0.000016383999999999998 s 1.01
Concat / DefOpt / cuda / BothRev 0.000016864 s 0.00001616 s 1.04
Concat / IDefOpt / cuda / PreRev 0.000016705 s 0.000015999 s 1.04
Concat / IDefOpt / cuda / PostRev 0.000016704 s 0.000016448000000000002 s 1.02
Concat / IDefOpt / cuda / BothRev 0.000018176 s 0.000016288 s 1.12
Concat / JaXPipe / tpu / Primal 0.0000015104 s 0.000001534825 s 0.98
Concat / Jax / tpu / Primal 0.000001526125 s 0.000001540325 s 0.99
Concat / HLOOpt / tpu / Primal 0.0000015064500000000002 s 0.000001529325 s 0.99
Concat / PartOpt / tpu / Primal 0.0000015173 s 0.000001534825 s 0.99
Concat / IPartOpt / tpu / Primal 0.00000151935 s 0.000001545625 s 0.98
Concat / DefOpt / tpu / Primal 0.00000152375 s 0.000001534925 s 0.99
Concat / IDefOpt / tpu / Primal 0.0000015082 s 0.00000153315 s 0.98
Concat / JaXPipe / tpu / Forward 0.000001560725 s 0.000001576825 s 0.99
Concat / Jax / tpu / Forward 0.000001566675 s 0.00000155415 s 1.01
Concat / HLOOpt / tpu / Forward 0.00000153905 s 0.0000015822 s 0.97
Concat / PartOpt / tpu / Forward 0.00000155415 s 0.00000154945 s 1.00
Concat / IPartOpt / tpu / Forward 0.000001567825 s 0.0000015807 s 0.99
Concat / DefOpt / tpu / Forward 0.000001556375 s 0.000001553025 s 1.00
Concat / IDefOpt / tpu / Forward 0.00000156495 s 0.000001589775 s 0.98
Concat / JaXPipe / tpu / PreRev 0.0000020222 s 0.0000020067 s 1.01
Concat / JaXPipe / tpu / PostRev 0.000001998875 s 0.000002110625 s 0.95
Concat / JaXPipe / tpu / BothRev 0.000002041 s 0.00000201985 s 1.01
Concat / Jax / tpu / BothRev 0.00000201195 s 0.0000020861 s 0.96
Concat / HLOOpt / tpu / PreRev 0.00000201785 s 0.000002007925 s 1.00
Concat / HLOOpt / tpu / PostRev 0.00000199085 s 0.0000020871 s 0.95
Concat / HLOOpt / tpu / BothRev 0.000002012625 s 0.00000200515 s 1.00
Concat / PartOpt / tpu / PreRev 0.00000199345 s 0.0000020804 s 0.96
Concat / PartOpt / tpu / PostRev 0.000002009125 s 0.00000200155 s 1.00
Concat / PartOpt / tpu / BothRev 0.000001997625 s 0.0000020847 s 0.96
Concat / IPartOpt / tpu / PreRev 0.00000201335 s 0.000002007975 s 1.00
Concat / IPartOpt / tpu / PostRev 0.000001996425 s 0.000002086725 s 0.96
Concat / IPartOpt / tpu / BothRev 0.0000020108 s 0.0000020111 s 1.00
Concat / DefOpt / tpu / PreRev 0.0000019943 s 0.0000020867 s 0.96
Concat / DefOpt / tpu / PostRev 0.000002015175 s 0.000002003125 s 1.01
Concat / DefOpt / tpu / BothRev 0.000001990125 s 0.00000209595 s 0.95
Concat / IDefOpt / tpu / PreRev 0.00000200905 s 0.000002009925 s 1.00
Concat / IDefOpt / tpu / PostRev 0.000001993125 s 0.0000020858 s 0.96
Concat / IDefOpt / tpu / BothRev 0.0000020124 s 0.000002015175 s 1.00
Concat / JaXPipe / cpu / Primal 0.000013034 s 0.000006445700018957723 s 2.02
Concat / Jax / cpu / Primal 0.000012749 s 0.0000068173200452292805 s 1.87
Concat / HLOOpt / cpu / Primal 0.000012753 s 0.000006353679873427609 s 2.01
Concat / PartOpt / cpu / Primal 0.000012671 s 0.00000617892001173459 s 2.05
Concat / IPartOpt / cpu / Primal 0.000012678 s 0.00000728348004486179 s 1.74
Concat / DefOpt / cpu / Primal 0.000012904 s 0.000006313840094662737 s 2.04
Concat / IDefOpt / cpu / Primal 0.000012828 s 0.000006743359972460894 s 1.90
Concat / JaXPipe / cpu / Forward 0.000017791 s 0.000009786760056158528 s 1.82
Concat / Jax / cpu / Forward 0.000017551 s 0.000009296239986724688 s 1.89
Concat / HLOOpt / cpu / Forward 0.000017335 s 0.000009474220005358804 s 1.83
Concat / PartOpt / cpu / Forward 0.000017282 s 0.000009963219999917785 s 1.73
Concat / IPartOpt / cpu / Forward 0.000017193 s 0.000009927740084094694 s 1.73
Concat / DefOpt / cpu / Forward 0.000017605 s 0.000010052360012196003 s 1.75
Concat / IDefOpt / cpu / Forward 0.000017515 s 0.000009728680070111296 s 1.80
Concat / JaXPipe / cpu / PreRev 0.000020419 s 0.000011505700040288504 s 1.77
Concat / JaXPipe / cpu / PostRev 0.00001972 s 0.000011536420024640392 s 1.71
Concat / JaXPipe / cpu / BothRev 0.00001972 s 0.000011523979974299437 s 1.71
Concat / Jax / cpu / BothRev 0.000020046 s 0.00001130902002842049 s 1.77
Concat / HLOOpt / cpu / PreRev 0.000019767 s 0.000011424400054238505 s 1.73
Concat / HLOOpt / cpu / PostRev 0.000019802 s 0.000013188979974074756 s 1.50
Concat / HLOOpt / cpu / BothRev 0.000020002 s 0.000011646559942164458 s 1.72
Concat / PartOpt / cpu / PreRev 0.000019935 s 0.000011037480016966582 s 1.81
Concat / PartOpt / cpu / PostRev 0.000019641 s 0.000011795680056820855 s 1.67
Concat / PartOpt / cpu / BothRev 0.000019886 s 0.000011574739964999024 s 1.72
Concat / IPartOpt / cpu / PreRev 0.00001985 s 0.000011574820055102464 s 1.71
Concat / IPartOpt / cpu / PostRev 0.000019982 s 0.000011695620014506855 s 1.71
Concat / IPartOpt / cpu / BothRev 0.000020201 s 0.00001153209996118676 s 1.75
Concat / DefOpt / cpu / PreRev 0.000020038 s 0.000011297619967081118 s 1.77
Concat / DefOpt / cpu / PostRev 0.00002008 s 0.000011271200000919635 s 1.78
Concat / DefOpt / cpu / BothRev 0.000019707 s 0.0000117719400623173 s 1.67
Concat / IDefOpt / cpu / PreRev 0.000020147 s 0.00001126107988966396 s 1.79
Concat / IDefOpt / cpu / PostRev 0.000019981 s 0.00001109713999539963 s 1.80
Concat / IDefOpt / cpu / BothRev 0.000019595 s 0.000011643640009424416 s 1.68
const_scatter / JaXPipe / cuda / Primal 0.000002464 s 0.000001887 s 1.31
const_scatter / Jax / cuda / Primal 0.000002463 s 0.000001887 s 1.31
const_scatter / HLOOpt / cuda / Primal 0.000002463 s 0.000001887 s 1.31
const_scatter / PartOpt / cuda / Primal 0.000002463 s 0.000001887 s 1.31
const_scatter / IPartOpt / cuda / Primal 0.000002464 s 0.000001888 s 1.31
const_scatter / DefOpt / cuda / Primal 0.000002464 s 0.000001888 s 1.31
const_scatter / IDefOpt / cuda / Primal 0.000002463 s 0.000001887 s 1.31
const_scatter / JaXPipe / cuda / Forward 0.000010913 s 0.000010112 s 1.08
const_scatter / Jax / cuda / Forward 0.000010368 s 0.000009856 s 1.05
const_scatter / HLOOpt / cuda / Forward 0.000010592 s 0.000009696 s 1.09
const_scatter / PartOpt / cuda / Forward 0.00001072 s 0.000009919 s 1.08
const_scatter / IPartOpt / cuda / Forward 0.000011328 s 0.000009568 s 1.18
const_scatter / DefOpt / cuda / Forward 0.000011776 s 0.000009824 s 1.20
const_scatter / IDefOpt / cuda / Forward 0.000010623 s 0.000009984 s 1.06
const_scatter / JaXPipe / cuda / PreRev 0.000017568 s 0.000016416 s 1.07
const_scatter / JaXPipe / cuda / PostRev 0.0000192 s 0.000016064 s 1.20
const_scatter / JaXPipe / cuda / BothRev 0.000016704 s 0.000015871 s 1.05
const_scatter / Jax / cuda / BothRev 0.000017024 s 0.000016576 s 1.03
const_scatter / HLOOpt / cuda / PreRev 0.000018176 s 0.000015872 s 1.15
const_scatter / HLOOpt / cuda / PostRev 0.000018816 s 0.000016352 s 1.15
const_scatter / HLOOpt / cuda / BothRev 0.000018528 s 0.000015648 s 1.18
const_scatter / PartOpt / cuda / PreRev 0.000018752 s 0.000015519 s 1.21
const_scatter / PartOpt / cuda / PostRev 0.000016799 s 0.00001568 s 1.07
const_scatter / PartOpt / cuda / BothRev 0.000017344 s 0.000015553 s 1.12
const_scatter / IPartOpt / cuda / PreRev 0.000016576 s 0.000016031 s 1.03
const_scatter / IPartOpt / cuda / PostRev 0.000016831 s 0.000016544 s 1.02
const_scatter / IPartOpt / cuda / BothRev 0.0000168 s 0.000015968 s 1.05
const_scatter / DefOpt / cuda / PreRev 0.000017088 s 0.000016384 s 1.04
const_scatter / DefOpt / cuda / PostRev 0.000016671 s 0.000015904 s 1.05
const_scatter / DefOpt / cuda / BothRev 0.000016736 s 0.00001632 s 1.03
const_scatter / IDefOpt / cuda / PreRev 0.000016832 s 0.000016192 s 1.04
const_scatter / IDefOpt / cuda / PostRev 0.000017248 s 0.000015743 s 1.10
const_scatter / IDefOpt / cuda / BothRev 0.000017024 s 0.000015711 s 1.08
const_scatter / JaXPipe / tpu / Primal 0.000003827 s 0.0000038184 s 1.00
const_scatter / Jax / tpu / Primal 0.000003841125 s 0.000003835175 s 1.00
const_scatter / HLOOpt / tpu / Primal 0.000003819125 s 0.000003798625 s 1.01
const_scatter / PartOpt / tpu / Primal 0.000003849175 s 0.000003819275 s 1.01
const_scatter / IPartOpt / tpu / Primal 0.00000379835 s 0.000003795925 s 1.00
const_scatter / DefOpt / tpu / Primal 0.00000381755 s 0.000003832675 s 1.00
const_scatter / IDefOpt / tpu / Primal 0.000003792775 s 0.00000380515 s 1.00
const_scatter / JaXPipe / tpu / Forward 0.000006495 s 0.000006451825 s 1.01
const_scatter / Jax / tpu / Forward 0.00000645745 s 0.000006502525 s 0.99
const_scatter / HLOOpt / tpu / Forward 0.000006493 s 0.00000645655 s 1.01
const_scatter / PartOpt / tpu / Forward 0.000006487825 s 0.0000065193 s 1.00
const_scatter / IPartOpt / tpu / Forward 0.000006495925 s 0.0000064571 s 1.01
const_scatter / DefOpt / tpu / Forward 0.0000064614 s 0.000006503475 s 0.99
const_scatter / IDefOpt / tpu / Forward 0.000006489525 s 0.000006458125 s 1.00
const_scatter / JaXPipe / tpu / PreRev 0.000006703475 s 0.000006637475 s 1.01
const_scatter / JaXPipe / tpu / PostRev 0.0000067038 s 0.00000662275 s 1.01
const_scatter / JaXPipe / tpu / BothRev 0.000006667825 s 0.000006632525 s 1.01
const_scatter / Jax / tpu / BothRev 0.000006706475 s 0.00000662195 s 1.01
const_scatter / HLOOpt / tpu / PreRev 0.000006680475 s 0.00000664045 s 1.01
const_scatter / HLOOpt / tpu / PostRev 0.000006698575 s 0.000006628875 s 1.01
const_scatter / HLOOpt / tpu / BothRev 0.000006672225 s 0.000006645125 s 1.00
const_scatter / PartOpt / tpu / PreRev 0.00000668665 s 0.000006635125 s 1.01
const_scatter / PartOpt / tpu / PostRev 0.0000066813 s 0.000006629475 s 1.01
const_scatter / PartOpt / tpu / BothRev 0.0000066946 s 0.000006631925 s 1.01
const_scatter / IPartOpt / tpu / PreRev 0.000006676225 s 0.0000066217 s 1.01
const_scatter / IPartOpt / tpu / PostRev 0.000006686025 s 0.000006610425 s 1.01
const_scatter / IPartOpt / tpu / BothRev 0.0000066669 s 0.0000066376 s 1.00
const_scatter / DefOpt / tpu / PreRev 0.000006677125 s 0.0000066379 s 1.01
const_scatter / DefOpt / tpu / PostRev 0.000006686675 s 0.000006617725 s 1.01
const_scatter / DefOpt / tpu / BothRev 0.0000066902 s 0.00000663095 s 1.01
const_scatter / IDefOpt / tpu / PreRev 0.000006698225 s 0.000006621325 s 1.01
const_scatter / IDefOpt / tpu / PostRev 0.000006685175 s 0.000006636525 s 1.01
const_scatter / IDefOpt / tpu / BothRev 0.00000666645 s 0.00000660165 s 1.01
const_scatter / JaXPipe / cpu / Primal 0.000012946 s 0.000006296179999480955 s 2.06
const_scatter / Jax / cpu / Primal 0.000012615 s 0.000006478699924628018 s 1.95
const_scatter / HLOOpt / cpu / Primal 0.000013249 s 0.000007021259953035042 s 1.89
const_scatter / PartOpt / cpu / Primal 0.000012846 s 0.00000653694001812255 s 1.97
const_scatter / IPartOpt / cpu / Primal 0.000012794 s 0.000006497859976661857 s 1.97
const_scatter / DefOpt / cpu / Primal 0.000013355 s 0.000006806539895478636 s 1.96
const_scatter / IDefOpt / cpu / Primal 0.000013262 s 0.000006674100004602223 s 1.99
const_scatter / JaXPipe / cpu / Forward 0.000018088 s 0.000010450200006744126 s 1.73
const_scatter / Jax / cpu / Forward 0.000016955 s 0.000009259560010832502 s 1.83
const_scatter / HLOOpt / cpu / Forward 0.000017845 s 0.000010598320095596136 s 1.68
const_scatter / PartOpt / cpu / Forward 0.000017654 s 0.000010302739956387088 s 1.71
const_scatter / IPartOpt / cpu / Forward 0.000017745 s 0.000010669560033420568 s 1.66
const_scatter / DefOpt / cpu / Forward 0.000017854 s 0.000010220859985565769 s 1.75
const_scatter / IDefOpt / cpu / Forward 0.000017997 s 0.0000100798799576296 s 1.79
const_scatter / JaXPipe / cpu / PreRev 0.000486876 s 0.0002859397599968 s 1.70
const_scatter / JaXPipe / cpu / PostRev 0.00050926 s 0.0002769920399259 s 1.84
const_scatter / JaXPipe / cpu / BothRev 0.000507781 s 0.0002800957999716 s 1.81
const_scatter / Jax / cpu / BothRev 0.00049152 s 0.0002796387400485 s 1.76
const_scatter / HLOOpt / cpu / PreRev 0.000487235 s 0.0002858592999837 s 1.70
const_scatter / HLOOpt / cpu / PostRev 0.000512209 s 0.0002851575600652 s 1.80
const_scatter / HLOOpt / cpu / BothRev 0.000508304 s 0.0002833786999508 s 1.79
const_scatter / PartOpt / cpu / PreRev 0.000502916 s 0.0002815988000656 s 1.79
const_scatter / PartOpt / cpu / PostRev 0.000501018 s 0.0002802312200219 s 1.79
const_scatter / PartOpt / cpu / BothRev 0.000495407 s 0.0002805860599619 s 1.77
const_scatter / IPartOpt / cpu / PreRev 0.000509577 s 0.0002813582600538 s 1.81
const_scatter / IPartOpt / cpu / PostRev 0.000501245 s 0.0002807098400262 s 1.79
const_scatter / IPartOpt / cpu / BothRev 0.000493155 s 0.0002834572800384 s 1.74
const_scatter / DefOpt / cpu / PreRev 0.000491804 s 0.0002803923399551 s 1.75
const_scatter / DefOpt / cpu / PostRev 0.000510077 s 0.0002814791199307 s 1.81
const_scatter / DefOpt / cpu / BothRev 0.000493616 s 0.000281565200039 s 1.75
const_scatter / IDefOpt / cpu / PreRev 0.000511866 s 0.0002804958800152 s 1.82
const_scatter / IDefOpt / cpu / PostRev 0.000490943 s 0.0002823369399993 s 1.74
const_scatter / IDefOpt / cpu / BothRev 0.000511223 s 0.0002790884600472 s 1.83
GenDot / JaXPipe / cuda / Primal 0.000002528 s 0.000002016 s 1.25
GenDot / Jax / cuda / Primal 0.000002528 s 0.000002016 s 1.25
GenDot / HLOOpt / cuda / Primal 0.000002527 s 0.000002016 s 1.25
GenDot / PartOpt / cuda / Primal 0.00000256 s 0.000002016 s 1.27
GenDot / IPartOpt / cuda / Primal 0.000002559 s 0.000002015 s 1.27
GenDot / DefOpt / cuda / Primal 0.000002527 s 0.000002015 s 1.25
GenDot / IDefOpt / cuda / Primal 0.00000256 s 0.000002015 s 1.27
GenDot / JaXPipe / cuda / Forward 0.000010464 s 0.000009824 s 1.07
GenDot / Jax / cuda / Forward 0.00001056 s 0.000009344 s 1.13
GenDot / HLOOpt / cuda / Forward 0.000010688 s 0.00000976 s 1.10
GenDot / PartOpt / cuda / Forward 0.000010752 s 0.000009695 s 1.11
GenDot / IPartOpt / cuda / Forward 0.000010625 s 0.00000976 s 1.09
GenDot / DefOpt / cuda / Forward 0.000010752 s 0.00000992 s 1.08
GenDot / IDefOpt / cuda / Forward 0.00001072 s 0.0000096 s 1.12
GenDot / JaXPipe / cuda / PreRev 0.000010272 s 0.000009408 s 1.09
GenDot / JaXPipe / cuda / PostRev 0.000010656 s 0.000009569 s 1.11
GenDot / JaXPipe / cuda / BothRev 0.00001056 s 0.000009696 s 1.09
GenDot / Jax / cuda / BothRev 0.000010591 s 0.000009568 s 1.11
GenDot / HLOOpt / cuda / PreRev 0.00001056 s 0.000009632 s 1.10
GenDot / HLOOpt / cuda / PostRev 0.000010528 s 0.000010016 s 1.05
GenDot / HLOOpt / cuda / BothRev 0.000010527 s 0.000009567 s 1.10
GenDot / PartOpt / cuda / PreRev 0.000010592 s 0.000009344 s 1.13
GenDot / PartOpt / cuda / PostRev 0.000010528 s 0.000009504 s 1.11
GenDot / PartOpt / cuda / BothRev 0.000010624 s 0.000010592 s 1.00
GenDot / IPartOpt / cuda / PreRev 0.000010464 s 0.000010112 s 1.03
GenDot / IPartOpt / cuda / PostRev 0.000010592 s 0.000009727 s 1.09
GenDot / IPartOpt / cuda / BothRev 0.000010304 s 0.000009696 s 1.06
GenDot / DefOpt / cuda / PreRev 0.00001056 s 0.000009696 s 1.09
GenDot / DefOpt / cuda / PostRev 0.000010496 s 0.000009696 s 1.08
GenDot / DefOpt / cuda / BothRev 0.000010752 s 0.000009632 s 1.12
GenDot / IDefOpt / cuda / PreRev 0.000010624 s 0.000010048 s 1.06
GenDot / IDefOpt / cuda / PostRev 0.000010816 s 0.0000096 s 1.13
GenDot / IDefOpt / cuda / BothRev 0.000010592 s 0.00000944 s 1.12
GenDot / JaXPipe / tpu / Primal 9.435e-7 s 9.3015e-7 s 1.01
GenDot / Jax / tpu / Primal 9.30275e-7 s 9.25925e-7 s 1.00
GenDot / HLOOpt / tpu / Primal 0.0000016002 s 0.00000158015 s 1.01
GenDot / PartOpt / tpu / Primal 9.298e-7 s 9.25925e-7 s 1.00
GenDot / IPartOpt / tpu / Primal 9.434e-7 s 9.30225e-7 s 1.01
GenDot / DefOpt / tpu / Primal 0.000001502175 s 0.000001514625 s 0.99
GenDot / IDefOpt / tpu / Primal 0.00000159905 s 0.000001579175 s 1.01
GenDot / JaXPipe / tpu / Forward 0.000003048925 s 0.000003185025 s 0.96
GenDot / Jax / tpu / Forward 0.00000227975 s 0.00000232695 s 0.98
GenDot / HLOOpt / tpu / Forward 0.000003114175 s 0.000003133375 s 0.99
GenDot / PartOpt / tpu / Forward 0.000003135425 s 0.000003229475 s 0.97
GenDot / IPartOpt / tpu / Forward 0.00000312275 s 0.0000031208 s 1.00
GenDot / DefOpt / tpu / Forward 0.0000031314 s 0.000003228025 s 0.97
GenDot / IDefOpt / tpu / Forward 0.0000031149 s 0.0000031296 s 1.00
GenDot / JaXPipe / tpu / PreRev 0.000003027025 s 0.00000298095 s 1.02
GenDot / JaXPipe / tpu / PostRev 0.00000237295 s 0.000002402975 s 0.99
GenDot / JaXPipe / tpu / BothRev 0.000003012025 s 0.000002991025 s 1.01
GenDot / Jax / tpu / BothRev 0.000002379625 s 0.000002403 s 0.99
GenDot / HLOOpt / tpu / PreRev 0.0000030072 s 0.000002973925 s 1.01
GenDot / HLOOpt / tpu / PostRev 0.0000029346 s 0.000002938625 s 1.00
GenDot / HLOOpt / tpu / BothRev 0.000003007175 s 0.000002979075 s 1.01
GenDot / PartOpt / tpu / PreRev 0.000002934175 s 0.0000029275 s 1.00
GenDot / PartOpt / tpu / PostRev 0.000002416 s 0.0000023966 s 1.01
GenDot / PartOpt / tpu / BothRev 0.000002935625 s 0.00000292885 s 1.00
GenDot / IPartOpt / tpu / PreRev 0.00000301045 s 0.000002976525 s 1.01
GenDot / IPartOpt / tpu / PostRev 0.00000237535 s 0.00000240625 s 0.99
GenDot / IPartOpt / tpu / BothRev 0.0000030095 s 0.0000029808 s 1.01
GenDot / DefOpt / tpu / PreRev 0.000002938475 s 0.0000029376 s 1.00
GenDot / DefOpt / tpu / PostRev 0.00000301595 s 0.000002976 s 1.01
GenDot / DefOpt / tpu / BothRev 0.0000029443 s 0.000002935325 s 1.00
GenDot / IDefOpt / tpu / PreRev 0.000003014825 s 0.000002978425 s 1.01
GenDot / IDefOpt / tpu / PostRev 0.0000029408 s 0.00000292695 s 1.00
GenDot / IDefOpt / tpu / BothRev 0.00000300985 s 0.000002981925 s 1.01
GenDot / JaXPipe / cpu / Primal 0.000014426 s 0.000007302779995370656 s 1.98
GenDot / Jax / cpu / Primal 0.000014649 s 0.000006674979922536295 s 2.19
GenDot / HLOOpt / cpu / Primal 0.000013994 s 0.000007559800014860229 s 1.85
GenDot / PartOpt / cpu / Primal 0.000014818 s 0.000006457319959736196 s 2.29
GenDot / IPartOpt / cpu / Primal 0.000014749 s 0.0000068904400359315336 s 2.14
GenDot / DefOpt / cpu / Primal 0.000014132 s 0.0000070318200778274335 s 2.01
GenDot / IDefOpt / cpu / Primal 0.000014077 s 0.000007177160059654853 s 1.96
GenDot / JaXPipe / cpu / Forward 0.000019369 s 0.00001050371996825561 s 1.84
GenDot / Jax / cpu / Forward 0.000020224 s 0.000010276840002916288 s 1.97
GenDot / HLOOpt / cpu / Forward 0.000019254 s 0.000010662299937393982 s 1.81
GenDot / PartOpt / cpu / Forward 0.000019312 s 0.000010708300105761735 s 1.80
GenDot / IPartOpt / cpu / Forward 0.000019625 s 0.00001087337996068527 s 1.80
GenDot / DefOpt / cpu / Forward 0.000019033 s 0.0000105288599661435 s 1.81
GenDot / IDefOpt / cpu / Forward 0.000018991 s 0.000010731519978435244 s 1.77
GenDot / JaXPipe / cpu / PreRev 0.000019282 s 0.000010836680030479328 s 1.78
GenDot / JaXPipe / cpu / PostRev 0.00002047 s 0.000009931940003298224 s 2.06
GenDot / JaXPipe / cpu / BothRev 0.000019124 s 0.000011299920006422324 s 1.69
GenDot / Jax / cpu / BothRev 0.000019995 s 0.000010131740000360878 s 1.97
GenDot / HLOOpt / cpu / PreRev 0.000019051 s 0.000010890920038946206 s 1.75
GenDot / HLOOpt / cpu / PostRev 0.000019553 s 0.000012772999980370514 s 1.53
GenDot / HLOOpt / cpu / BothRev 0.000019489 s 0.000010551199993642513 s 1.85
GenDot / PartOpt / cpu / PreRev 0.000019056 s 0.00001108557997213211 s 1.72
GenDot / PartOpt / cpu / PostRev 0.000020461 s 0.00000987782004813198 s 2.07
GenDot / PartOpt / cpu / BothRev 0.000019325 s 0.000011596880067372697 s 1.67
GenDot / IPartOpt / cpu / PreRev 0.000019075 s 0.000010663179964467415 s 1.79
GenDot / IPartOpt / cpu / PostRev 0.000020707 s 0.00000973392006926588 s 2.13
GenDot / IPartOpt / cpu / BothRev 0.000019336 s 0.000010919259948423132 s 1.77
GenDot / DefOpt / cpu / PreRev 0.000019382 s 0.000010918499992840224 s 1.78
GenDot / DefOpt / cpu / PostRev 0.00001962 s 0.00001102416010326124 s 1.78
GenDot / DefOpt / cpu / BothRev 0.000019734 s 0.000010676539968699217 s 1.85
GenDot / IDefOpt / cpu / PreRev 0.000018944 s 0.000010241140007565263 s 1.85
GenDot / IDefOpt / cpu / PostRev 0.000019565 s 0.000010610380031721434 s 1.84
GenDot / IDefOpt / cpu / BothRev 0.000019067 s 0.000010912239995377604 s 1.75
hlo_ffi / JaXPipe / cuda / Primal 0.000002367 s 0.000001983 s 1.19
hlo_ffi / Jax / cuda / Primal 0.000002367 s 0.000001983 s 1.19
hlo_ffi / HLOOpt / cuda / Primal 0.000002368 s 0.000001952 s 1.21
hlo_ffi / PartOpt / cuda / Primal 0.000002367 s 0.000001952 s 1.21
hlo_ffi / IPartOpt / cuda / Primal 0.000002367 s 0.000001983 s 1.19
hlo_ffi / DefOpt / cuda / Primal 0.000002367 s 0.000001952 s 1.21
hlo_ffi / IDefOpt / cuda / Primal 0.000002367 s 0.000001952 s 1.21
hlo_ffi / JaXPipe / cuda / Forward 0.000002463 s 0.000002047 s 1.20
hlo_ffi / Jax / cuda / Forward 0.000002463 s 0.000002047 s 1.20
hlo_ffi / HLOOpt / cuda / Forward 0.000002463 s 0.000002047 s 1.20
hlo_ffi / PartOpt / cuda / Forward 0.000002464 s 0.000002047 s 1.20
hlo_ffi / IPartOpt / cuda / Forward 0.000002464 s 0.000002047 s 1.20
hlo_ffi / DefOpt / cuda / Forward 0.000002463 s 0.000002047 s 1.20
hlo_ffi / IDefOpt / cuda / Forward 0.000002464 s 0.000002047 s 1.20
hlo_ffi / JaXPipe / cuda / PreRev 0.000002432 s 0.000002047 s 1.19
hlo_ffi / JaXPipe / cuda / PostRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / JaXPipe / cuda / BothRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / Jax / cuda / BothRev 0.000002433 s 0.000002047 s 1.19
hlo_ffi / HLOOpt / cuda / PreRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / HLOOpt / cuda / PostRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / HLOOpt / cuda / BothRev 0.000002432 s 0.000002047 s 1.19
hlo_ffi / PartOpt / cuda / PreRev 0.000002463 s 0.000002016 s 1.22
hlo_ffi / PartOpt / cuda / PostRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / PartOpt / cuda / BothRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / IPartOpt / cuda / PreRev 0.000002432 s 0.000002047 s 1.19
hlo_ffi / IPartOpt / cuda / PostRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / IPartOpt / cuda / BothRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / DefOpt / cuda / PreRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / DefOpt / cuda / PostRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / DefOpt / cuda / BothRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / IDefOpt / cuda / PreRev 0.000002432 s 0.000002047 s 1.19
hlo_ffi / IDefOpt / cuda / PostRev 0.000002463 s 0.000002047 s 1.20
hlo_ffi / IDefOpt / cuda / BothRev 0.000002432 s 0.000002047 s 1.19
hlo_ffi / JaXPipe / tpu / Primal 9.198e-7 s 9.28675e-7 s 0.99
hlo_ffi / Jax / tpu / Primal 9.5005e-7 s 9.53925e-7 s 1.00
hlo_ffi / HLOOpt / tpu / Primal 8.95825e-7 s 9.07475e-7 s 0.99
hlo_ffi / PartOpt / tpu / Primal 9.50575e-7 s 9.4965e-7 s 1.00
hlo_ffi / IPartOpt / tpu / Primal 9.00825e-7 s 9.09325e-7 s 0.99
hlo_ffi / DefOpt / tpu / Primal 9.49675e-7 s 9.52175e-7 s 1.00
hlo_ffi / IDefOpt / tpu / Primal 8.94775e-7 s 9.059e-7 s 0.99
hlo_ffi / JaXPipe / tpu / Forward 9.49475e-7 s 9.49325e-7 s 1.00
hlo_ffi / Jax / tpu / Forward 9.81825e-7 s 9.81775e-7 s 1.00
hlo_ffi / HLOOpt / tpu / Forward 9.74e-7 s 9.7415e-7 s 1.00
hlo_ffi / PartOpt / tpu / Forward 9.345e-7 s 9.33875e-7 s 1.00
hlo_ffi / IPartOpt / tpu / Forward 9.74675e-7 s 9.7395e-7 s 1.00
hlo_ffi / DefOpt / tpu / Forward 9.34125e-7 s 9.33825e-7 s 1.00
hlo_ffi / IDefOpt / tpu / Forward 9.74375e-7 s 9.74475e-7 s 1.00
hlo_ffi / JaXPipe / tpu / PreRev 9.3225e-7 s 9.3735e-7 s 0.99
hlo_ffi / JaXPipe / tpu / PostRev 9.648e-7 s 9.65e-7 s 1.00
hlo_ffi / JaXPipe / tpu / BothRev 9.6195e-7 s 9.61925e-7 s 1.00
hlo_ffi / Jax / tpu / BothRev 9.653e-7 s 9.643e-7 s 1.00
hlo_ffi / HLOOpt / tpu / PreRev 9.62175e-7 s 9.616e-7 s 1.00
hlo_ffi / HLOOpt / tpu / PostRev 9.65125e-7 s 9.6495e-7 s 1.00
hlo_ffi / HLOOpt / tpu / BothRev 9.61375e-7 s 9.6135e-7 s 1.00
hlo_ffi / PartOpt / tpu / PreRev 9.65225e-7 s 9.64825e-7 s 1.00
hlo_ffi / PartOpt / tpu / PostRev 9.61625e-7 s 9.6165e-7 s 1.00
hlo_ffi / PartOpt / tpu / BothRev 9.647e-7 s 9.64875e-7 s 1.00
hlo_ffi / IPartOpt / tpu / PreRev 9.61675e-7 s 9.619e-7 s 1.00
hlo_ffi / IPartOpt / tpu / PostRev 9.65275e-7 s 9.646e-7 s 1.00
hlo_ffi / IPartOpt / tpu / BothRev 9.6175e-7 s 9.61775e-7 s 1.00
hlo_ffi / DefOpt / tpu / PreRev 9.6505e-7 s 9.647e-7 s 1.00
hlo_ffi / DefOpt / tpu / PostRev 9.624e-7 s 9.62225e-7 s 1.00
hlo_ffi / DefOpt / tpu / BothRev 9.6505e-7 s 9.645e-7 s 1.00
hlo_ffi / IDefOpt / tpu / PreRev 9.61175e-7 s 9.617e-7 s 1.00
hlo_ffi / IDefOpt / tpu / PostRev 9.6525e-7 s 9.6485e-7 s 1.00
hlo_ffi / IDefOpt / tpu / BothRev 9.62125e-7 s 9.62e-7 s 1.00
hlo_ffi / JaXPipe / cpu / Primal 0.000016992 s 0.000011233720051677663 s 1.51
hlo_ffi / Jax / cpu / Primal 0.000017114 s 0.000010917700001300543 s 1.57
hlo_ffi / HLOOpt / cpu / Primal 0.000016933 s 0.00001101467994885752 s 1.54
hlo_ffi / PartOpt / cpu / Primal 0.000017257 s 0.000010988760004693175 s 1.57
hlo_ffi / IPartOpt / cpu / Primal 0.000017173 s 0.000011737320073734736 s 1.46
hlo_ffi / DefOpt / cpu / Primal 0.000017196 s 0.000010840160084626403 s 1.59
hlo_ffi / IDefOpt / cpu / Primal 0.000016784 s 0.00001093057993784896 s 1.54
hlo_ffi / JaXPipe / cpu / Forward 0.000024274 s 0.000015760040041641333 s 1.54
hlo_ffi / Jax / cpu / Forward 0.000023215 s 0.00001529287999801454 s 1.52
hlo_ffi / HLOOpt / cpu / Forward 0.000023543 s 0.00001593963992490899 s 1.48
hlo_ffi / PartOpt / cpu / Forward 0.000023531 s 0.000015218719981930916 s 1.55
hlo_ffi / IPartOpt / cpu / Forward 0.000023047 s 0.00001521984002465615 s 1.51
hlo_ffi / DefOpt / cpu / Forward 0.000023179 s 0.000015428400020027765 s 1.50
hlo_ffi / IDefOpt / cpu / Forward 0.00002318 s 0.000015707859984104288 s 1.48
hlo_ffi / JaXPipe / cpu / PreRev 0.000023691 s 0.000015456880028068554 s 1.53
hlo_ffi / JaXPipe / cpu / PostRev 0.000023497 s 0.00001708776000668877 s 1.38
hlo_ffi / JaXPipe / cpu / BothRev 0.000022992 s 0.000014944239974283844 s 1.54
hlo_ffi / Jax / cpu / BothRev 0.000023301 s 0.000015678040035709273 s 1.49
hlo_ffi / HLOOpt / cpu / PreRev 0.000023554 s 0.000016419940020568902 s 1.43
hlo_ffi / HLOOpt / cpu / PostRev 0.00002294 s 0.00001703794001514325 s 1.35
hlo_ffi / HLOOpt / cpu / BothRev 0.000023581 s 0.000014703899996675318 s 1.60
hlo_ffi / PartOpt / cpu / PreRev 0.000023766 s 0.000015609500005666634 s 1.52
hlo_ffi / PartOpt / cpu / PostRev 0.000023075 s 0.000015246840011968744 s 1.51
hlo_ffi / PartOpt / cpu / BothRev 0.00002294 s 0.000015681820041208994 s 1.46
hlo_ffi / IPartOpt / cpu / PreRev 0.000023703 s 0.000015859059949434595 s 1.49
hlo_ffi / IPartOpt / cpu / PostRev 0.000023015 s 0.00001532091999251861 s 1.50
hlo_ffi / IPartOpt / cpu / BothRev 0.000023025 s 0.000014792100046179256 s 1.56
hlo_ffi / DefOpt / cpu / PreRev 0.000023265 s 0.00001618226009668433 s 1.44
hlo_ffi / DefOpt / cpu / PostRev 0.000023557 s 0.000015356060011981753 s 1.53
hlo_ffi / DefOpt / cpu / BothRev 0.000023214 s 0.000015247140036080964 s 1.52
hlo_ffi / IDefOpt / cpu / PreRev 0.000023432 s 0.00001594363999174675 s 1.47
hlo_ffi / IDefOpt / cpu / PostRev 0.000023152 s 0.000015076339986990204 s 1.54
hlo_ffi / IDefOpt / cpu / BothRev 0.000023514 s 0.000015262999950209634 s 1.54
jaxmd20 / JaXPipe / cuda / Primal 0.00149871 s 0.001487797 s 1.01
jaxmd20 / Jax / cuda / Primal 0.001515256 s 0.001502389 s 1.01
jaxmd20 / HLOOpt / cuda / Primal 0.001406967 s 0.001337142 s 1.05
jaxmd20 / PartOpt / cuda / Primal 0.001371705 s 0.001336182 s 1.03
jaxmd20 / IPartOpt / cuda / Primal 0.001437721 s 0.001332982 s 1.08
jaxmd20 / DefOpt / cuda / Primal 0.00095033 s 0.000924473 s 1.03
jaxmd20 / IDefOpt / cuda / Primal 0.000964795 s 0.000986297 s 0.98
jaxmd20 / JaXPipe / cuda / Forward 0.001645912 s 0.001555061 s 1.06
jaxmd20 / Jax / cuda / Forward 0.00189711 s 0.001782387 s 1.06
jaxmd20 / HLOOpt / cuda / Forward 0.001745366 s 0.0016205 s 1.08
jaxmd20 / PartOpt / cuda / Forward 0.001725943 s 0.001677841 s 1.03
jaxmd20 / IPartOpt / cuda / Forward 0.001724663 s 0.001613684 s 1.07
jaxmd20 / DefOpt / cuda / Forward 0.001739703 s 0.001646644 s 1.06
jaxmd20 / IDefOpt / cuda / Forward 0.001743126 s 0.001617236 s 1.08
jaxmd20 / JaXPipe / cuda / PreRev 0.002767599 s 0.002683692 s 1.03
jaxmd20 / JaXPipe / cuda / PostRev 0.005485442 s 0.005353332 s 1.02
jaxmd20 / JaXPipe / cuda / BothRev 0.002759665 s 0.002678123 s 1.03
jaxmd20 / Jax / cuda / BothRev 0.005496675 s 0.005319223 s 1.03
jaxmd20 / HLOOpt / cuda / PreRev 0.002865872 s 0.002741707 s 1.05
jaxmd20 / HLOOpt / cuda / PostRev 0.005489508 s 0.00534098 s 1.03
jaxmd20 / HLOOpt / cuda / BothRev 0.002834129 s 0.002719467 s 1.04
jaxmd20 / PartOpt / cuda / PreRev 0.002903921 s 0.002839114 s 1.02
jaxmd20 / PartOpt / cuda / PostRev 0.005585346 s 0.005380503 s 1.04
jaxmd20 / PartOpt / cuda / BothRev 0.002809936 s 0.002758475 s 1.02
jaxmd20 / IPartOpt / cuda / PreRev 0.002940816 s 0.002806923 s 1.05
jaxmd20 / IPartOpt / cuda / PostRev 0.005604355 s 0.005431319 s 1.03
jaxmd20 / IPartOpt / cuda / BothRev 0.002861937 s 0.002746092 s 1.04
jaxmd20 / DefOpt / cuda / PreRev 0.002919311 s 0.002836201 s 1.03
jaxmd20 / DefOpt / cuda / PostRev 0.002856753 s 0.002780043 s 1.03
jaxmd20 / DefOpt / cuda / BothRev 0.00298264 s 0.002767851 s 1.08
jaxmd20 / IDefOpt / cuda / PreRev 0.002914193 s 0.002789162 s 1.04
jaxmd20 / IDefOpt / cuda / PostRev 0.002356563 s 0.002326062 s 1.01
jaxmd20 / IDefOpt / cuda / BothRev 0.002888016 s 0.002757003 s 1.05
jaxmd20 / JaXPipe / tpu / Primal 0.0092835 s 0.009271920625 s 1.00
jaxmd20 / Jax / tpu / Primal 0.00926289125 s 0.00926478375 s 1.00
jaxmd20 / HLOOpt / tpu / Primal 0.009153585625 s 0.009154375 s 1.00
jaxmd20 / PartOpt / tpu / Primal 0.00919604125 s 0.0091968425 s 1.00
jaxmd20 / IPartOpt / tpu / Primal 0.009199874375 s 0.00920241 s 1.00
jaxmd20 / DefOpt / tpu / Primal 0.008794388125 s 0.00879217375 s 1.00
jaxmd20 / IDefOpt / tpu / Primal 0.00870145125 s 0.008697745625 s 1.00
jaxmd20 / JaXPipe / tpu / Forward 0.017430190625 s 0.01741725375 s 1.00
jaxmd20 / Jax / tpu / Forward 0.01873101125 s 0.01872633625 s 1.00
jaxmd20 / HLOOpt / tpu / Forward 0.017404078125 s 0.017394088125 s 1.00
jaxmd20 / PartOpt / tpu / Forward 0.017422699375 s 0.01740757375 s 1.00
jaxmd20 / IPartOpt / tpu / Forward 0.01742570875 s 0.0174110075 s 1.00
jaxmd20 / DefOpt / tpu / Forward 0.01742289375 s 0.01741526125 s 1.00
jaxmd20 / IDefOpt / tpu / Forward 0.01742419125 s 0.017414086875 s 1.00
jaxmd20 / JaXPipe / tpu / PreRev 0.02544791 s 0.0254551675 s 1.00
jaxmd20 / JaXPipe / tpu / PostRev 0.021853091875 s 0.021894850625 s 1.00
jaxmd20 / JaXPipe / tpu / BothRev 0.0254696 s 0.02547269 s 1.00
jaxmd20 / Jax / tpu / BothRev 0.02186081625 s 0.021891873125 s 1.00
jaxmd20 / HLOOpt / tpu / PreRev 0.02558189375 s 0.02558601875 s 1.00
jaxmd20 / HLOOpt / tpu / PostRev 0.02070597125 s 0.020728129375 s 1.00
jaxmd20 / HLOOpt / tpu / BothRev 0.025678291875 s 0.02567900375 s 1.00
jaxmd20 / PartOpt / tpu / PreRev 0.02547572625 s 0.025504339375 s 1.00
jaxmd20 / PartOpt / tpu / PostRev 0.02151574875 s 0.021511336875 s 1.00
jaxmd20 / PartOpt / tpu / BothRev 0.025555076875 s 0.025595189375 s 1.00
jaxmd20 / IPartOpt / tpu / PreRev 0.025471639375 s 0.025476863125 s 1.00
jaxmd20 / IPartOpt / tpu / PostRev 0.0215189925 s 0.021535506875 s 1.00
jaxmd20 / IPartOpt / tpu / BothRev 0.025553548125 s 0.025550848125 s 1.00
jaxmd20 / DefOpt / tpu / PreRev 0.025477053125 s 0.025503059375 s 1.00
jaxmd20 / DefOpt / tpu / PostRev 0.018808345 s 0.01880534625 s 1.00
jaxmd20 / DefOpt / tpu / BothRev 0.025561075625 s 0.025599089375 s 1.00
jaxmd20 / IDefOpt / tpu / PreRev 0.025476626875 s 0.02547722375 s 1.00
jaxmd20 / IDefOpt / tpu / PostRev 0.01833314125 s 0.018347999375 s 1.00
jaxmd20 / IDefOpt / tpu / BothRev 0.025554355625 s 0.025558546875 s 1.00
jaxmd40 / JaXPipe / cpu / Primal 0.069864041 s 0.070455611 s 0.99
jaxmd40 / Jax / cpu / Primal 0.070985658 s 0.070977207 s 1.00
jaxmd40 / HLOOpt / cpu / Primal 0.090561247 s 0.090696345 s 1.00
jaxmd40 / PartOpt / cpu / Primal 0.072999064 s 0.071462105 s 1.02
jaxmd40 / IPartOpt / cpu / Primal 0.071307804 s 0.069752957 s 1.02
jaxmd40 / DefOpt / cpu / Primal 0.085994243 s 0.090211465 s 0.95
jaxmd40 / IDefOpt / cpu / Primal 0.089946447 s 0.091006028 s 0.99
jaxmd40 / JaXPipe / cpu / Forward 0.157225393 s 0.160325353 s 0.98
jaxmd40 / Jax / cpu / Forward 0.090850408 s 0.086059653 s 1.06
jaxmd40 / HLOOpt / cpu / Forward 0.163614447 s 0.163041726 s 1.00
jaxmd40 / PartOpt / cpu / Forward 0.157494823 s 0.158902308 s 0.99
jaxmd40 / IPartOpt / cpu / Forward 0.156424391 s 0.156841914 s 1.00
jaxmd40 / DefOpt / cpu / Forward 0.159538319 s 0.150396513 s 1.06
jaxmd40 / IDefOpt / cpu / Forward 0.158084739 s 0.159234883 s 0.99
jaxmd40 / JaXPipe / cpu / PreRev 0.220397539 s 0.2329717379999999 s 0.95
jaxmd40 / JaXPipe / cpu / PostRev 0.133936695 s 0.136318554 s 0.98
jaxmd40 / JaXPipe / cpu / BothRev 0.220050767 s 0.2388221069999999 s 0.92
jaxmd40 / Jax / cpu / BothRev 0.123031299 s 0.139153379 s 0.88
jaxmd40 / HLOOpt / cpu / PreRev 0.223328987 s 0.21605981 s 1.03
jaxmd40 / HLOOpt / cpu / PostRev 0.176871989 s 0.174469936 s 1.01
jaxmd40 / HLOOpt / cpu / BothRev 0.256407197 s 0.246942098 s 1.04
jaxmd40 / PartOpt / cpu / PreRev 0.210923072 s 0.2335476779999999 s 0.90
jaxmd40 / PartOpt / cpu / PostRev 0.1298896269999999 s 0.1233247329999999 s 1.05
jaxmd40 / PartOpt / cpu / BothRev 0.249815782 s 0.265456492 s 0.94
jaxmd40 / IPartOpt / cpu / PreRev 0.216901774 s 0.219451872 s 0.99
jaxmd40 / IPartOpt / cpu / PostRev 0.125544352 s 0.128084786 s 0.98
jaxmd40 / IPartOpt / cpu / BothRev 0.240492003 s 0.243523231 s 0.99
jaxmd40 / DefOpt / cpu / PreRev 0.227645213 s 0.217161572 s 1.05
jaxmd40 / DefOpt / cpu / PostRev 0.166643279 s 0.191683424 s 0.87
jaxmd40 / DefOpt / cpu / BothRev 0.232830793 s 0.229506267 s 1.01
jaxmd40 / IDefOpt / cpu / PreRev 0.221408361 s 0.204105072 s 1.08
jaxmd40 / IDefOpt / cpu / PostRev 0.169468681 s 0.167389919 s 1.01
jaxmd40 / IDefOpt / cpu / BothRev 0.236044521 s 0.263771102 s 0.89
jaxley_l5pc / JaXPipe / cuda / Primal 3.0925719955048407 s
jaxley_l5pc / Jax / cuda / Primal 3.091246781499649 s
jaxley_l5pc / HLOOpt / cuda / Primal 3.641045484997449 s
jaxley_l5pc / PartOpt / cuda / Primal 3.4375655574986013 s
jaxley_l5pc / IPartOpt / cuda / Primal 3.4381210490028025 s
jaxley_l5pc / DefOpt / cuda / Primal 3.1555231930033187 s
jaxley_l5pc / IDefOpt / cuda / Primal 3.3513392664972343 s
jaxley_l5pc / JaXPipe / cuda / Forward 6.33260042549955 s
jaxley_l5pc / Jax / cuda / Forward 5.855309637503524 s
jaxley_l5pc / HLOOpt / cuda / Forward 6.116221485499409 s
jaxley_l5pc / PartOpt / cuda / Forward 6.331197531995713 s
jaxley_l5pc / IPartOpt / cuda / Forward 6.331824204004079 s
jaxley_l5pc / DefOpt / cuda / Forward 6.331567016495683 s
jaxley_l5pc / IDefOpt / cuda / Forward 6.331120531001943 s
jaxley_l5pc / JaXPipe / cpu / Primal 1.0277289004998238 s
jaxley_l5pc / Jax / cpu / Primal 1.0022906139997758 s
jaxley_l5pc / HLOOpt / cpu / Primal 1.052065173500523 s
jaxley_l5pc / PartOpt / cpu / Primal 0.8358459615001266 s
jaxley_l5pc / IPartOpt / cpu / Primal 0.987979125000038 s
jaxley_l5pc / DefOpt / cpu / Primal 0.9394538745000318 s
jaxley_l5pc / IDefOpt / cpu / Primal 0.980509082000026 s
jaxley_l5pc / JaXPipe / cpu / Forward 21.198061596499883 s
jaxley_l5pc / Jax / cpu / Forward 26.150097345000177 s
jaxley_l5pc / HLOOpt / cpu / Forward 21.25610011299977 s
jaxley_l5pc / PartOpt / cpu / Forward 21.31876329099987 s
jaxley_l5pc / IPartOpt / cpu / Forward 21.608647604 s
jaxley_l5pc / DefOpt / cpu / Forward 21.42443357550019 s
jaxley_l5pc / IDefOpt / cpu / Forward 21.19515658599994 s
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal 0.000295582 s 0.000283262 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal 0.000295518 s 0.000282558 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal 0.000302238 s 0.000290237 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal 0.000295359 s 0.000282238 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal 0.000295263 s 0.000283134 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal 0.000303422 s 0.00029011 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal 0.0003018859999999 s 0.000290237 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward 0.000582461 s 0.000559131 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward 0.000566589 s 0.0005419159999999 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward 0.000583261 s 0.000559963 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward 0.00058278 s 0.000558619 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward 0.000582365 s 0.0005594829999999 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward 0.000582109 s 0.000558172 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward 0.0005830039999999 s 0.0005583949999999 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev 0.001054362 s 0.001028888 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev 0.001011802 s 0.000987672 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev 0.001048538 s 0.001027992 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev 0.001004026 s 0.000991256 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev 0.001035002 s 0.00101852 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev 0.001060699 s 0.001043415 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev 0.001033626 s 0.001014904 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev 0.0010476749999999 s 0.001031736 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev 0.000998715 s 0.000982233 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev 0.001048186 s 0.001030904 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev 0.001050107 s 0.00103148 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev 0.0009986819999999 s 0.000977753 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev 0.00105081 s 0.001031 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev 0.0010502979999999 s 0.00102556 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev 0.000984539 s 0.000965176 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev 0.0010487299999999 s 0.001027641 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev 0.001048762 s 0.0010238 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev 0.001051802 s 0.001025368 s 1.03
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev 0.001051995 s 0.001026904 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal 0.00012932075 s 0.00012361825 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal 0.00012362925 s 0.00012670775 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal 0.000158851 s 0.00015277825 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal 0.00013093625 s 0.00013410125 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal 0.00013707525 s 0.00013109675 s 1.05
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal 0.00014527725 s 0.00014807925 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal 0.00015678025 s 0.00015087025 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward 0.000213655 s 0.0002119264999999 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward 0.00026234575 s 0.00026120825 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward 0.00021999375 s 0.00021193525 s 1.04
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward 0.0002138494999999 s 0.000218387 s 0.98
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward 0.00021629575 s 0.00021170925 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward 0.0002179435 s 0.0002185024999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward 0.00021621525 s 0.0002119279999999 s 1.02
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev 0.00035764825 s 0.0003552015 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev 0.0002559782499999 s 0.00025893075 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev 0.0003571572499999 s 0.00035548025 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev 0.00025769525 s 0.00025925425 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev 0.00035793225 s 0.0003552615 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev 0.0002910579999999 s 0.00029143275 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev 0.00035788775 s 0.000355109 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev 0.00035559275 s 0.000356625 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev 0.00027471125 s 0.0002719432499999 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev 0.0003556765 s 0.000356626 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev 0.00035784925 s 0.0003552165 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev 0.0002721227499999 s 0.00027480525 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev 0.0003575155 s 0.000355203 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev 0.0003577875 s 0.0003589354999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev 0.00028508275 s 0.000283826 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev 0.0003575419999999 s 0.000359149 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev 0.00035971475 s 0.000357337 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev 0.000301159 s 0.0003017189999999 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev 0.000360002 s 0.0003577595 s 1.01
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal 0.00139711 s 0.0009588552000423 s 1.46
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal 0.001357834 s 0.0009516709998933 s 1.43
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal 0.001523989 s 0.0010057729999971 s 1.52
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal 0.0014595629999999 s 0.0009291818001656 s 1.57
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal 0.0013188599999999 s 0.0009374562001539 s 1.41
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal 0.001504152 s 0.0009786819997316 s 1.54
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal 0.001549951 s 0.0009964631999537 s 1.56
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward 0.0042980219999999 s 0.0022665190001134 s 1.90
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward 0.004218632 s 0.002419118599937 s 1.74
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward 0.004227222 s 0.0022202353999091 s 1.90
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward 0.003932014 s 0.002213902999938 s 1.78
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward 0.00384483 s 0.0022291920000498 s 1.72
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward 0.004096659 s 0.0022840097997686 s 1.79
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward 0.004299596 s 0.0023154484002589 s 1.86
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev 0.0075002969999999 s 0.0059770792000563 s 1.25
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev 0.00869369 s 0.006564533799974 s 1.32
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev 0.007583365 s 0.0062062592000074 s 1.22
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev 0.006975437 s 0.0070674812001016 s 0.99
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev 0.007455173 s 0.0052356914000483 s 1.42
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev 0.005981815 s 0.0062103693997414 s 0.96
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev 0.007810482 s 0.006651248199887 s 1.17
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev 0.007870404 s 0.0060678497999106 s 1.30
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev 0.008260557 s 0.0062865141995644 s 1.31
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev 0.007379295 s 0.005630325400125 s 1.31
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev 0.008052593 s 0.005550784800107 s 1.45
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev 0.008406844 s 0.0056538740001997 s 1.49
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev 0.00782498 s 0.0065768796001066 s 1.19
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev 0.008337991 s 0.0056750671999907 s 1.47
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev 0.00687821 s 0.0068473344001176 s 1.00
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev 0.008554965 s 0.0057122905998767 s 1.50
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev 0.007913134 s 0.005576557400127 s 1.42
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev 0.008546929 s 0.0068063960001381 s 1.26
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev 0.008348635 s 0.0055609175997233 s 1.50
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal 1.702040499 s 1.702951013 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal 1.705016895 s 1.705186629 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal 1.71504903 s 1.717234453 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal 1.696890089 s 1.696901907 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal 1.694657655 s 1.694897228 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal 1.66591526 s 1.66528891 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal 1.911819131 s 1.914906271 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal 3.03812712875 s 3.038658345625 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal 3.038645011875 s 3.03917762375 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal 3.12088705875 s 3.12155321875 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal 3.0594905100000003 s 3.0599546075 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal 3.0596811125 s 3.060167825 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal 2.102192178125 s 2.10236119125 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal 2.94652327875 s 2.948083710625 s 1.00
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal 5.7309540740000005 s 5.872614181 s 0.98
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal 5.873826306 s 6.02566513 s 0.97
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal 5.945230124 s 6.010850407 s 0.99
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal 5.813805289 s 5.953050324 s 0.98
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal 5.916455158 s 6.100518878 s 0.97
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal 2.1836341860000004 s 2.279712861 s 0.96
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal 6.294406859 s 6.451860342 s 0.98
scatter_sum / JaXPipe / cuda / Primal 0.000010688 s 0.000010016 s 1.07
scatter_sum / Jax / cuda / Primal 0.000010432 s 0.000009568 s 1.09
scatter_sum / HLOOpt / cuda / Primal 0.000010912 s 0.0000096 s 1.14
scatter_sum / PartOpt / cuda / Primal 0.0000104 s 0.000009856 s 1.06
scatter_sum / IPartOpt / cuda / Primal 0.00001072 s 0.00000992 s 1.08
scatter_sum / DefOpt / cuda / Primal 0.000010752 s 0.000009504 s 1.13
scatter_sum / IDefOpt / cuda / Primal 0.0000104 s 0.000009696 s 1.07
scatter_sum / JaXPipe / cuda / Forward 0.000017343 s 0.00001664 s 1.04
scatter_sum / Jax / cuda / Forward 0.000017311 s 0.000016609 s 1.04
scatter_sum / HLOOpt / cuda / Forward 0.000017760000000000003 s 0.000016607 s 1.07
scatter_sum / PartOpt / cuda / Forward 0.000017472 s 0.00001728 s 1.01
scatter_sum / IPartOpt / cuda / Forward 0.000017439 s 0.000016385 s 1.06
scatter_sum / DefOpt / cuda / Forward 0.0000176 s 0.000017056 s 1.03
scatter_sum / IDefOpt / cuda / Forward 0.000017568000000000002 s 0.000016927999999999998 s 1.04
scatter_sum / JaXPipe / cuda / PreRev 0.000017534999999999997 s 0.000016768000000000003 s 1.05
scatter_sum / JaXPipe / cuda / PostRev 0.000017408 s 0.000016223 s 1.07
scatter_sum / JaXPipe / cuda / BothRev 0.000017152 s 0.000016383999999999998 s 1.05
scatter_sum / Jax / cuda / BothRev 0.000017406999999999998 s 0.000016288 s 1.07
scatter_sum / HLOOpt / cuda / PreRev 0.00001744 s 0.00001696 s 1.03
scatter_sum / HLOOpt / cuda / PostRev 0.000017215 s 0.000016128 s 1.07
scatter_sum / HLOOpt / cuda / BothRev 0.000017536 s 0.000016576000000000002 s 1.06
scatter_sum / PartOpt / cuda / PreRev 0.000017919999999999998 s 0.000017184 s 1.04
scatter_sum / PartOpt / cuda / PostRev 0.000016992 s 0.000016832 s 1.01
scatter_sum / PartOpt / cuda / BothRev 0.0000176 s 0.000016896000000000002 s 1.04
scatter_sum / IPartOpt / cuda / PreRev 0.000017056 s 0.000017152 s 0.99
scatter_sum / IPartOpt / cuda / PostRev 0.00001728 s 0.000016704 s 1.03
scatter_sum / IPartOpt / cuda / BothRev 0.000017856 s 0.00001648 s 1.08
scatter_sum / DefOpt / cuda / PreRev 0.000017887 s 0.000017088 s 1.05
scatter_sum / DefOpt / cuda / PostRev 0.00001632 s 0.000016768000000000003 s 0.97
scatter_sum / DefOpt / cuda / BothRev 0.00001712 s 0.000016768000000000003 s 1.02
scatter_sum / IDefOpt / cuda / PreRev 0.00001744 s 0.000016767 s 1.04
scatter_sum / IDefOpt / cuda / PostRev 0.000017247999999999998 s 0.000016991 s 1.02
scatter_sum / IDefOpt / cuda / BothRev 0.00001728 s 0.00001664 s 1.04
scatter_sum / JaXPipe / tpu / Primal 0.000001343125 s 0.0000013433999999999995 s 1.00
scatter_sum / Jax / tpu / Primal 0.000001357475 s 0.000001403875 s 0.97
scatter_sum / HLOOpt / tpu / Primal 0.00000134345 s 0.00000134345 s 1.00
scatter_sum / PartOpt / tpu / Primal 0.00000135735 s 0.000001404025 s 0.97
scatter_sum / IPartOpt / tpu / Primal 0.0000013437 s 0.0000013430500000000002 s 1.00
scatter_sum / DefOpt / tpu / Primal 0.0000013573250000000002 s 0.000001404625 s 0.97
scatter_sum / IDefOpt / tpu / Primal 0.000001343975 s 0.000001343275 s 1.00
scatter_sum / JaXPipe / tpu / Forward 0.0000027404 s 0.0000027061 s 1.01
scatter_sum / Jax / tpu / Forward 0.00000276875 s 0.0000027175250000000003 s 1.02
scatter_sum / HLOOpt / tpu / Forward 0.0000027389999999999995 s 0.000002700425 s 1.01
scatter_sum / PartOpt / tpu / Forward 0.00000272055 s 0.000002684175 s 1.01
scatter_sum / IPartOpt / tpu / Forward 0.000002738325 s 0.0000027013 s 1.01
scatter_sum / DefOpt / tpu / Forward 0.000002717875 s 0.000002689025 s 1.01
scatter_sum / IDefOpt / tpu / Forward 0.0000027376000000000003 s 0.00000271025 s 1.01
scatter_sum / JaXPipe / tpu / PreRev 0.000002712225 s 0.000002679725 s 1.01
scatter_sum / JaXPipe / tpu / PostRev 0.000002740725 s 0.00000268195 s 1.02
scatter_sum / JaXPipe / tpu / BothRev 0.000002730375 s 0.00000269985 s 1.01
scatter_sum / Jax / tpu / BothRev 0.000002790875 s 0.0000027357 s 1.02
scatter_sum / HLOOpt / tpu / PreRev 0.000002729475 s 0.00000269875 s 1.01
scatter_sum / HLOOpt / tpu / PostRev 0.00000278675 s 0.000002735775 s 1.02
scatter_sum / HLOOpt / tpu / BothRev 0.0000027249 s 0.0000027005249999999994 s 1.01
scatter_sum / PartOpt / tpu / PreRev 0.00000278905 s 0.00000274035 s 1.02
scatter_sum / PartOpt / tpu / PostRev 0.0000027276000000000004 s 0.0000026911 s 1.01
scatter_sum / PartOpt / tpu / BothRev 0.00000279165 s 0.000002741575 s 1.02
scatter_sum / IPartOpt / tpu / PreRev 0.00000273165 s 0.0000026922 s 1.01
scatter_sum / IPartOpt / tpu / PostRev 0.00000278895 s 0.0000027361 s 1.02
scatter_sum / IPartOpt / tpu / BothRev 0.000002732925 s 0.000002695175 s 1.01
scatter_sum / DefOpt / tpu / PreRev 0.0000027963500000000003 s 0.000002741475 s 1.02
scatter_sum / DefOpt / tpu / PostRev 0.00000273155 s 0.000002697025 s 1.01
scatter_sum / DefOpt / tpu / BothRev 0.000002787375000000001 s 0.00000274035 s 1.02
scatter_sum / IDefOpt / tpu / PreRev 0.0000027260250000000003 s 0.000002692925 s 1.01
scatter_sum / IDefOpt / tpu / PostRev 0.000002785325 s 0.00000273545 s 1.02
scatter_sum / IDefOpt / tpu / BothRev 0.0000027261000000000005 s 0.0000026958 s 1.01
scatter_sum / JaXPipe / cpu / Primal 0.000015418 s 0.000007860860059736296 s 1.96
scatter_sum / Jax / cpu / Primal 0.000015472 s 0.000007609080003021518 s 2.03
scatter_sum / HLOOpt / cpu / Primal 0.000015353 s 0.000007795760029694066 s 1.97
scatter_sum / PartOpt / cpu / Primal 0.000015461 s 0.000007391420022031525 s 2.09
scatter_sum / IPartOpt / cpu / Primal 0.000015388 s 0.000007779839907016139 s 1.98
scatter_sum / DefOpt / cpu / Primal 0.0000155 s 0.000007508239996241171 s 2.06
scatter_sum / IDefOpt / cpu / Primal 0.000015673 s 0.000007569679928565165 s 2.07
scatter_sum / JaXPipe / cpu / Forward 0.000022455 s 0.000011678260034386766 s 1.92
scatter_sum / Jax / cpu / Forward 0.000022273 s 0.000011107439968327526 s 2.01
scatter_sum / HLOOpt / cpu / Forward 0.000022358 s 0.000011391500029276358 s 1.96
scatter_sum / PartOpt / cpu / Forward 0.000022228 s 0.000011152620045322692 s 1.99
scatter_sum / IPartOpt / cpu / Forward 0.000022412 s 0.000012152039926149882 s 1.84
scatter_sum / DefOpt / cpu / Forward 0.000022413 s 0.000011344800022925482 s 1.98
scatter_sum / IDefOpt / cpu / Forward 0.000022692 s 0.000011099660077888985 s 2.04
scatter_sum / JaXPipe / cpu / PreRev 0.000022773 s 0.000011580820009839954 s 1.97
scatter_sum / JaXPipe / cpu / PostRev 0.000022632 s 0.000011251139967498605 s 2.01
scatter_sum / JaXPipe / cpu / BothRev 0.000022338 s 0.000011530240008141846 s 1.94
scatter_sum / Jax / cpu / BothRev 0.000022398 s 0.000011377560058463132 s 1.97
scatter_sum / HLOOpt / cpu / PreRev 0.000022701 s 0.000011670840012811823 s 1.95
scatter_sum / HLOOpt / cpu / PostRev 0.000022489 s 0.000013719059916184051 s 1.64
scatter_sum / HLOOpt / cpu / BothRev 0.000022015 s 0.000011614680006459822 s 1.90
scatter_sum / PartOpt / cpu / PreRev 0.000022609 s 0.000011115040051663528 s 2.03
scatter_sum / PartOpt / cpu / PostRev 0.000022607 s 0.000011269699989497894 s 2.01
scatter_sum / PartOpt / cpu / BothRev 0.000022302 s 0.0000121184199815616 s 1.84
scatter_sum / IPartOpt / cpu / PreRev 0.000022283 s 0.000011329860008117976 s 1.97
scatter_sum / IPartOpt / cpu / PostRev 0.000022752 s 0.000011477400021249197 s 1.98
scatter_sum / IPartOpt / cpu / BothRev 0.000022606 s 0.00001145995996921556 s 1.97
scatter_sum / DefOpt / cpu / PreRev 0.000022718 s 0.00001141110002208734 s 1.99
scatter_sum / DefOpt / cpu / PostRev 0.000022317 s 0.000011709120044542944 s 1.91
scatter_sum / DefOpt / cpu / BothRev 0.000022342 s 0.000011378699946362758 s 1.96
scatter_sum / IDefOpt / cpu / PreRev 0.000022687 s 0.000011255140034336364 s 2.02
scatter_sum / IDefOpt / cpu / PostRev 0.000021917 s 0.000011675160167214926 s 1.88
scatter_sum / IDefOpt / cpu / BothRev 0.000022432 s 0.00001117430001613684 s 2.01
slicing / JaXPipe / cuda / Primal 0.000002303 s 0.000001887 s 1.22
slicing / Jax / cuda / Primal 0.000002303 s 0.000001887 s 1.22
slicing / HLOOpt / cuda / Primal 0.000002303 s 0.000001887 s 1.22
slicing / PartOpt / cuda / Primal 0.000002304 s 0.000001887 s 1.22
slicing / IPartOpt / cuda / Primal 0.000002303 s 0.000001887 s 1.22
slicing / DefOpt / cuda / Primal 0.000002303 s 0.000001887 s 1.22
slicing / IDefOpt / cuda / Primal 0.000002304 s 0.000001887 s 1.22
slicing / JaXPipe / cuda / Forward 0.00001008 s 0.0000112 s 0.90
slicing / Jax / cuda / Forward 0.00001072 s 0.000010784 s 0.99
slicing / HLOOpt / cuda / Forward 0.000010559 s 0.000009696 s 1.09
slicing / PartOpt / cuda / Forward 0.000011136 s 0.0000096 s 1.16
slicing / IPartOpt / cuda / Forward 0.000011231 s 0.0000096 s 1.17
slicing / DefOpt / cuda / Forward 0.000011712 s 0.000009344 s 1.25
slicing / IDefOpt / cuda / Forward 0.0000104 s 0.000009888 s 1.05
slicing / JaXPipe / cuda / PreRev 0.00001024 s 0.00000976 s 1.05
slicing / JaXPipe / cuda / PostRev 0.000010336 s 0.000009248 s 1.12
slicing / JaXPipe / cuda / BothRev 0.000010816 s 0.000009408 s 1.15
slicing / Jax / cuda / BothRev 0.00001024 s 0.00000976 s 1.05
slicing / HLOOpt / cuda / PreRev 0.0000104 s 0.000009311 s 1.12
slicing / HLOOpt / cuda / PostRev 0.000010208 s 0.000009184 s 1.11
slicing / HLOOpt / cuda / BothRev 0.00001008 s 0.000009407 s 1.07
slicing / PartOpt / cuda / PreRev 0.000010337 s 0.000009632 s 1.07
slicing / PartOpt / cuda / PostRev 0.000010209 s 0.000009536 s 1.07
slicing / PartOpt / cuda / BothRev 0.0000104 s 0.000009696 s 1.07
slicing / IPartOpt / cuda / PreRev 0.000010528 s 0.000010176 s 1.03
slicing / IPartOpt / cuda / PostRev 0.000009856 s 0.000009152 s 1.08
slicing / IPartOpt / cuda / BothRev 0.000010304 s 0.000009376 s 1.10
slicing / DefOpt / cuda / PreRev 0.000010048 s 0.000009792 s 1.03
slicing / DefOpt / cuda / PostRev 0.000010335 s 0.000009536 s 1.08
slicing / DefOpt / cuda / BothRev 0.00001024 s 0.000009824 s 1.04
slicing / IDefOpt / cuda / PreRev 0.000010336 s 0.000009664 s 1.07
slicing / IDefOpt / cuda / PostRev 0.000010176 s 0.00000928 s 1.10
slicing / IDefOpt / cuda / BothRev 0.00001024 s 0.000009407 s 1.09
slicing / JaXPipe / tpu / Primal 9.46775e-7 s 0.0000010327749999999998 s 0.92
slicing / Jax / tpu / Primal 9.6475e-7 s 9.6835e-7 s 1.00
slicing / HLOOpt / tpu / Primal 9.49225e-7 s 0.000001028125 s 0.92
slicing / PartOpt / tpu / Primal 9.62925e-7 s 9.7225e-7 s 0.99
slicing / IPartOpt / tpu / Primal 9.53675e-7 s 0.00000102785 s 0.93
slicing / DefOpt / tpu / Primal 9.6465e-7 s 9.73525e-7 s 0.99
slicing / IDefOpt / tpu / Primal 9.52575e-7 s 0.00000102885 s 0.93
slicing / JaXPipe / tpu / Forward 0.00000140235 s 0.000001412225 s 0.99
slicing / Jax / tpu / Forward 0.00000139985 s 0.0000014785 s 0.95
slicing / HLOOpt / tpu / Forward 0.0000015031 s 0.00000152365 s 0.99
slicing / PartOpt / tpu / Forward 0.00000141785 s 0.0000014948749999999998 s 0.95
slicing / IPartOpt / tpu / Forward 0.000001504775 s 0.000001518125 s 0.99
slicing / DefOpt / tpu / Forward 0.00000142385 s 0.00000149345 s 0.95
slicing / IDefOpt / tpu / Forward 0.0000015093 s 0.000001518 s 0.99
slicing / JaXPipe / tpu / PreRev 0.000002337975 s 0.000002574175 s 0.91
slicing / JaXPipe / tpu / PostRev 0.0000025166500000000003 s 0.00000252945 s 0.99
slicing / JaXPipe / tpu / BothRev 0.0000023508 s 0.00000258505 s 0.91
slicing / Jax / tpu / BothRev 0.00000252575 s 0.000002541075 s 0.99
slicing / HLOOpt / tpu / PreRev 0.000002357475 s 0.000002582225 s 0.91
slicing / HLOOpt / tpu / PostRev 0.0000025261250000000003 s 0.000002545775 s 0.99
slicing / HLOOpt / tpu / BothRev 0.00000236105 s 0.000002588025 s 0.91
slicing / PartOpt / tpu / PreRev 0.0000025314 s 0.000002544625 s 0.99
slicing / PartOpt / tpu / PostRev 0.000002341825 s 0.000002578375 s 0.91
slicing / PartOpt / tpu / BothRev 0.0000025297 s 0.00000254555 s 0.99
slicing / IPartOpt / tpu / PreRev 0.00000234325 s 0.0000025965 s 0.90
slicing / IPartOpt / tpu / PostRev 0.0000025286 s 0.0000025475999999999995 s 0.99
slicing / IPartOpt / tpu / BothRev 0.0000023469000000000003 s 0.000002588025 s 0.91
slicing / DefOpt / tpu / PreRev 0.00000253315 s 0.0000025353750000000003 s 1.00
slicing / DefOpt / tpu / PostRev 0.00000234965 s 0.000002575875 s 0.91
slicing / DefOpt / tpu / BothRev 0.0000025242249999999995 s 0.0000025388 s 0.99
slicing / IDefOpt / tpu / PreRev 0.00000234065 s 0.00000258995 s 0.90
slicing / IDefOpt / tpu / PostRev 0.000002533925 s 0.000002540325 s 1.00
slicing / IDefOpt / tpu / BothRev 0.0000023498250000000004 s 0.0000025786 s 0.91
slicing / JaXPipe / cpu / Primal 0.000012667 s 0.000006055780013412004 s 2.09
slicing / Jax / cpu / Primal 0.00001259 s 0.000006401219998224406 s 1.97
slicing / HLOOpt / cpu / Primal 0.000012482 s 0.000006648099879384972 s 1.88
slicing / PartOpt / cpu / Primal 0.000012418 s 0.00000627523999355617 s 1.98
slicing / IPartOpt / cpu / Primal 0.000012505 s 0.000006483800007117679 s 1.93
slicing / DefOpt / cpu / Primal 0.000012631 s 0.000006234299999050563 s 2.03
slicing / IDefOpt / cpu / Primal 0.000012385 s 0.00000625420003416366 s 1.98
slicing / JaXPipe / cpu / Forward 0.000017035999999999997 s 0.00000923432007766678 s 1.84
slicing / Jax / cpu / Forward 0.000016667 s 0.00000918981999348034 s 1.81
slicing / HLOOpt / cpu / Forward 0.000016545 s 0.000009896999945340212 s 1.67
slicing / PartOpt / cpu / Forward 0.000016739 s 0.000009109320035349811 s 1.84
slicing / IPartOpt / cpu / Forward 0.000016833 s 0.000009506240094196982 s 1.77
slicing / DefOpt / cpu / Forward 0.000016561 s 0.000009059960066224448 s 1.83
slicing / IDefOpt / cpu / Forward 0.000016723 s 0.000009192240067932287 s 1.82
slicing / JaXPipe / cpu / PreRev 0.000017347000000000002 s 0.000010620739940350176 s 1.63
slicing / JaXPipe / cpu / PostRev 0.000017295000000000003 s 0.000010175260104006155 s 1.70
slicing / JaXPipe / cpu / BothRev 0.000017391 s 0.00001053578003848088 s 1.65
slicing / Jax / cpu / BothRev 0.00001717 s 0.000010063680038001622 s 1.71
slicing / HLOOpt / cpu / PreRev 0.000017249 s 0.000010681059993657982 s 1.61
slicing / HLOOpt / cpu / PostRev 0.00001737 s 0.000011751160000130767 s 1.48
slicing / HLOOpt / cpu / BothRev 0.000017427 s 0.000009757120024005417 s 1.79
slicing / PartOpt / cpu / PreRev 0.000017458 s 0.000009965299923351268 s 1.75
slicing / PartOpt / cpu / PostRev 0.000017454999999999998 s 0.000009642119948694017 s 1.81
slicing / PartOpt / cpu / BothRev 0.000017281000000000003 s 0.00001038146005157614 s 1.66
slicing / IPartOpt / cpu / PreRev 0.000017149 s 0.00001028291997499764 s 1.67
slicing / IPartOpt / cpu / PostRev 0.000017128999999999998 s 0.000009681240007921588 s 1.77
slicing / IPartOpt / cpu / BothRev 0.000017454 s 0.000009983480067603525 s 1.75
slicing / DefOpt / cpu / PreRev 0.000017446 s 0.000009870160047285026 s 1.77
slicing / DefOpt / cpu / PostRev 0.000017372 s 0.000009663760047260438 s 1.80
slicing / DefOpt / cpu / BothRev 0.000017346 s 0.00000990344004094368 s 1.75
slicing / IDefOpt / cpu / PreRev 0.000017514 s 0.00000993309995465097 s 1.76
slicing / IDefOpt / cpu / PostRev 0.000017101 s 0.000010138960005861008 s 1.69
slicing / IDefOpt / cpu / BothRev 0.000017193 s 0.000009660899995651564 s 1.78
sum / JaXPipe / cuda / Primal 0.000002496 s 0.00000208 s 1.20
sum / Jax / cuda / Primal 0.000002496 s 0.00000208 s 1.20
sum / HLOOpt / cuda / Primal 0.000002496 s 0.00000208 s 1.20
sum / PartOpt / cuda / Primal 0.000002496 s 0.00000208 s 1.20
sum / IPartOpt / cuda / Primal 0.000002496 s 0.000002079 s 1.20
sum / DefOpt / cuda / Primal 0.000002496 s 0.00000208 s 1.20
sum / IDefOpt / cuda / Primal 0.000002496 s 0.00000208 s 1.20
sum / JaXPipe / cuda / Forward 0.000010943 s 0.00001024 s 1.07
sum / Jax / cuda / Forward 0.00001072 s 0.000010015 s 1.07
sum / HLOOpt / cuda / Forward 0.000010592 s 0.000009791 s 1.08
sum / PartOpt / cuda / Forward 0.0000104 s 0.00001024 s 1.02
sum / IPartOpt / cuda / Forward 0.00001056 s 0.000010144 s 1.04
sum / DefOpt / cuda / Forward 0.00001072 s 0.000010144 s 1.06
sum / IDefOpt / cuda / Forward 0.000010815 s 0.000009952 s 1.09
sum / JaXPipe / cuda / PreRev 0.000010112 s 0.000009504 s 1.06
sum / JaXPipe / cuda / PostRev 0.000010272 s 0.000009312000000000002 s 1.10
sum / JaXPipe / cuda / BothRev 0.0000104 s 0.000009824 s 1.06
sum / Jax / cuda / BothRev 0.000010304 s 0.000009504 s 1.08
sum / HLOOpt / cuda / PreRev 0.000010752 s 0.000009472 s 1.14
sum / HLOOpt / cuda / PostRev 0.000010272 s 0.00000944 s 1.09
sum / HLOOpt / cuda / BothRev 0.000010207 s 0.000009472 s 1.08
sum / PartOpt / cuda / PreRev 0.000008959999999999999 s 0.000009824 s 0.91
sum / PartOpt / cuda / PostRev 0.000010336 s 0.0000096 s 1.08
sum / PartOpt / cuda / BothRev 0.000010463 s 0.00000976 s 1.07
sum / IPartOpt / cuda / PreRev 0.000010399 s 0.000010016 s 1.04
sum / IPartOpt / cuda / PostRev 0.000009984 s 0.000009376 s 1.06
sum / IPartOpt / cuda / BothRev 0.00001008 s 0.000009408 s 1.07
sum / DefOpt / cuda / PreRev 0.000010144 s 0.000010048 s 1.01
sum / DefOpt / cuda / PostRev 0.0000104 s 0.000009248 s 1.12
sum / DefOpt / cuda / BothRev 0.000010367 s 0.000009472 s 1.09
sum / IDefOpt / cuda / PreRev 0.000010432 s 0.000009729 s 1.07
sum / IDefOpt / cuda / PostRev 0.000010336 s 0.00000976 s 1.06
sum / IDefOpt / cuda / BothRev 0.00001056 s 0.000009376 s 1.13
sum / JaXPipe / tpu / Primal 5.170999999999999e-7 s 5.104e-7 s 1.01
sum / Jax / tpu / Primal 5.468e-7 s 5.4715e-7 s 1.00
sum / HLOOpt / tpu / Primal 5.1685e-7 s 5.106499999999999e-7 s 1.01
sum / PartOpt / tpu / Primal 5.47125e-7 s 5.47325e-7 s 1.00
sum / IPartOpt / tpu / Primal 5.1715e-7 s 5.10375e-7 s 1.01
sum / DefOpt / tpu / Primal 5.4755e-7 s 5.4725e-7 s 1.00
sum / IDefOpt / tpu / Primal 5.1685e-7 s 5.10875e-7 s 1.01
sum / JaXPipe / tpu / Forward 0.00000155165 s 0.00000154625 s 1.00
sum / Jax / tpu / Forward 0.0000015031 s 0.000001501025 s 1.00
sum / HLOOpt / tpu / Forward 0.000001527925 s 0.000001532 s 1.00
sum / PartOpt / tpu / Forward 0.000001499825 s 0.000001497225 s 1.00
sum / IPartOpt / tpu / Forward 0.000001528325 s 0.000001537975 s 0.99
sum / DefOpt / tpu / Forward 0.000001504075 s 0.00000150195 s 1.00
sum / IDefOpt / tpu / Forward 0.0000015416750000000002 s 0.000001539375 s 1.00
sum / JaXPipe / tpu / PreRev 0.0000010075749999999998 s 0.000001047525 s 0.96
sum / JaXPipe / tpu / PostRev 0.0000010301749999999998 s 0.000001085425 s 0.95
sum / JaXPipe / tpu / BothRev 0.000001004125 s 0.0000010469749999999998 s 0.96
sum / Jax / tpu / BothRev 0.000001030775 s 0.00000108705 s 0.95
sum / HLOOpt / tpu / PreRev 0.00000100665 s 0.0000010459 s 0.96
sum / HLOOpt / tpu / PostRev 0.000001034025 s 0.000001083625 s 0.95
sum / HLOOpt / tpu / BothRev 0.0000010108250000000002 s 0.000001051 s 0.96
sum / PartOpt / tpu / PreRev 0.000001037825 s 0.000001084875 s 0.96
sum / PartOpt / tpu / PostRev 0.00000100115 s 0.00000105225 s 0.95
sum / PartOpt / tpu / BothRev 0.000001029925 s 0.000001105025 s 0.93
sum / IPartOpt / tpu / PreRev 0.0000010099 s 0.000001056775 s 0.96
sum / IPartOpt / tpu / PostRev 0.000001037725 s 0.0000011002 s 0.94
sum / IPartOpt / tpu / BothRev 9.9985e-7 s 0.000001069775 s 0.93
sum / DefOpt / tpu / PreRev 0.0000010390499999999998 s 0.000001087875 s 0.96
sum / DefOpt / tpu / PostRev 9.996249999999998e-7 s 0.000001063325 s 0.94
sum / DefOpt / tpu / BothRev 0.0000010315 s 0.0000010876 s 0.95
sum / IDefOpt / tpu / PreRev 0.000001011875 s 0.00000105495 s 0.96
sum / IDefOpt / tpu / PostRev 0.000001029875 s 0.0000010968 s 0.94
sum / IDefOpt / tpu / BothRev 0.0000010097 s 0.000001060625 s 0.95
sum / JaXPipe / cpu / Primal 0.000014758 s 0.000007892679968790617 s 1.87
sum / Jax / cpu / Primal 0.000014287 s 0.00000730996003767359 s 1.95
sum / HLOOpt / cpu / Primal 0.000014439 s 0.000008103620002657408 s 1.78
sum / PartOpt / cpu / Primal 0.00001451 s 0.000007371400042757159 s 1.97
sum / IPartOpt / cpu / Primal 0.00001448 s 0.000007986760138010141 s 1.81
sum / DefOpt / cpu / Primal 0.000014525 s 0.00000800075993538485 s 1.82
sum / IDefOpt / cpu / Primal 0.000014306 s 0.000007718659908277914 s 1.85
sum / JaXPipe / cpu / Forward 0.000020017 s 0.000011231939952267568 s 1.78
sum / Jax / cpu / Forward 0.000020133 s 0.000011011899896402613 s 1.83
sum / HLOOpt / cpu / Forward 0.000019511 s 0.000011633680114755407 s 1.68
sum / PartOpt / cpu / Forward 0.000019969 s 0.000010882500027946662 s 1.83
sum / IPartOpt / cpu / Forward 0.00001986 s 0.000011480140037747334 s 1.73
sum / DefOpt / cpu / Forward 0.000019588000000000003 s 0.00001102809999792953 s 1.78
sum / IDefOpt / cpu / Forward 0.000019379 s 0.000010877019994950388 s 1.78
sum / JaXPipe / cpu / PreRev 0.000019235 s 0.000011482460013212404 s 1.68
sum / JaXPipe / cpu / PostRev 0.000018746 s 0.000010741920032160124 s 1.75
sum / JaXPipe / cpu / BothRev 0.000018625 s 0.000010569679961918154 s 1.76
sum / Jax / cpu / BothRev 0.000018579 s 0.000010753660044429123 s 1.73
sum / HLOOpt / cpu / PreRev 0.000019088 s 0.00001103539996620384 s 1.73
sum / HLOOpt / cpu / PostRev 0.000018509 s 0.000012518980001914317 s 1.48
sum / HLOOpt / cpu / BothRev 0.000018686 s 0.00001066204007656779 s 1.75
sum / PartOpt / cpu / PreRev 0.000018904 s 0.00001123383997764904 s 1.68
sum / PartOpt / cpu / PostRev 0.000018763 s 0.000010826800044014816 s 1.73
sum / PartOpt / cpu / BothRev 0.000018801 s 0.000010874180097744102 s 1.73
sum / IPartOpt / cpu / PreRev 0.000018903 s 0.00001092020002033678 s 1.73
sum / IPartOpt / cpu / PostRev 0.000018672 s 0.000010615059964038664 s 1.76
sum / IPartOpt / cpu / BothRev 0.000018671 s 0.000010592939961497903 s 1.76
sum / DefOpt / cpu / PreRev 0.000018645 s 0.00001073278008334455 s 1.74
sum / DefOpt / cpu / PostRev 0.00001915 s 0.000010550599963607963 s 1.82
sum / DefOpt / cpu / BothRev 0.000018634 s 0.00001068278010279755 s 1.74
sum / IDefOpt / cpu / PreRev 0.000018907 s 0.00001057109999237582 s 1.79
sum / IDefOpt / cpu / PostRev 0.00001902 s 0.000010405599969089964 s 1.83
sum / IDefOpt / cpu / BothRev 0.000018892 s 0.000010484559952601558 s 1.80
value_and_grad / JaXPipe / cuda / Primal 0.000032896000000000005 s 0.000033792000000000004 s 0.97
value_and_grad / Jax / cuda / Primal 0.000032 s 0.000033184 s 0.96
value_and_grad / HLOOpt / cuda / Primal 0.000032096 s 0.000032767999999999995 s 0.98
value_and_grad / PartOpt / cuda / Primal 0.000032096 s 0.00003232 s 0.99
value_and_grad / IPartOpt / cuda / Primal 0.000032063 s 0.000033152000000000004 s 0.97
value_and_grad / DefOpt / cuda / Primal 0.000032992 s 0.000033343 s 0.99
value_and_grad / IDefOpt / cuda / Primal 0.000032672 s 0.000032992 s 0.99
value_and_grad / JaXPipe / tpu / Primal 0 s 0 s 1
value_and_grad / Jax / tpu / Primal 0 s 0 s 1
value_and_grad / HLOOpt / tpu / Primal 0 s 0 s 1
value_and_grad / PartOpt / tpu / Primal 0 s 0 s 1
value_and_grad / IPartOpt / tpu / Primal 0 s 0 s 1
value_and_grad / DefOpt / tpu / Primal 0 s 0 s 1
value_and_grad / IDefOpt / tpu / Primal 0 s 0 s 1
value_and_grad / JaXPipe / cpu / Primal 0.000023551 s 0.000013518900032067904 s 1.74
value_and_grad / Jax / cpu / Primal 0.000022913 s 0.000013619000092148782 s 1.68
value_and_grad / HLOOpt / cpu / Primal 0.000023013 s 0.00001315354005782865 s 1.75
value_and_grad / PartOpt / cpu / Primal 0.000023037 s 0.000013153380095900502 s 1.75
value_and_grad / IPartOpt / cpu / Primal 0.000023052 s 0.00001324175998888677 s 1.74
value_and_grad / DefOpt / cpu / Primal 0.000023083 s 0.000013427060039248318 s 1.72
value_and_grad / IDefOpt / cpu / Primal 0.000022897 s 0.00001322780008194968 s 1.73

This comment was automatically generated by workflow using github-action-benchmark.

@avik-pal avik-pal force-pushed the ap/neuro_benchmark branch from c50080b to 37aa1e7 Compare January 8, 2026 04:44