feat: transpose scatter to scatter transpose #1887
Draft: avik-pal wants to merge 1 commit into main from ap/transpose_opts2
b74d2c3 to 9fffac3
9fffac3 to c2dfd55
EnzymeJAX Benchmarks
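How the Ratio column relates to the two timing columns is not stated on this page. The rows are consistent with current time divided by previous time, so a value above 1 would mean the current commit is slower. A minimal Python sketch under that assumption (the `ratio` helper is hypothetical and not part of the benchmark tooling):

```python
def ratio(current_s: float, previous_s: float) -> float:
    """Assumed definition: current time divided by previous time.

    Values above 1 mean the current commit took longer than the previous one.
    """
    if previous_s <= 0:
        raise ValueError("previous time must be positive")
    return current_s / previous_s


# Timings copied from the row "actmtch / JaXPipe / cpu / Primal".
r = ratio(0.000007214840015876689, 0.000006562319977092557)
print(f"{r:.2f}")  # prints 1.10, matching the Ratio cell of that row
```

Differences of a few percent on microsecond-scale timings may well be measurement noise, so single rows are best read with that in mind.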
| Benchmark suite | Current: c2dfd55 | Previous: 972c249 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cpu / Primal | 0.000007214840015876689 s | 0.000006562319977092557 s | 1.10 |
| actmtch / Jax / cpu / Primal | 0.000006246880038816016 s | 0.0000062876599804440046 s | 0.99 |
| actmtch / HLOOpt / cpu / Primal | 0.000007130059939299826 s | 0.000007609099993715063 s | 0.94 |
| actmtch / PartOpt / cpu / Primal | 0.000006957740015423042 s | 0.000006221880139491986 s | 1.12 |
| actmtch / IPartOpt / cpu / Primal | 0.000006426199988709414 s | 0.000006396459939423949 s | 1.00 |
| actmtch / DefOpt / cpu / Primal | 0.000007394500034934026 s | 0.000007537180008512223 s | 0.98 |
| actmtch / IDefOpt / cpu / Primal | 0.00000781218002884998 s | 0.000006963640062167542 s | 1.12 |
| actmtch / JaXPipe / cpu / Forward | 0.000011777439976867752 s | 0.00001072302004104131 s | 1.10 |
| actmtch / Jax / cpu / Forward | 0.00000898618001883733 s | 0.000010128840040124487 s | 0.89 |
| actmtch / HLOOpt / cpu / Forward | 0.000011303360015517682 s | 0.000010821760133694624 s | 1.04 |
| actmtch / PartOpt / cpu / Forward | 0.000010601879976093187 s | 0.00001043737996951677 s | 1.02 |
| actmtch / IPartOpt / cpu / Forward | 0.0000109303199860733 s | 0.000010789780044433427 s | 1.01 |
| actmtch / DefOpt / cpu / Forward | 0.000010948400040433626 s | 0.0000100856801327609 s | 1.09 |
| actmtch / IDefOpt / cpu / Forward | 0.00001043956002831692 s | 0.00001044433998686145 s | 1.00 |
| actmtch / JaXPipe / cpu / PreRev | 0.000010523820037633412 s | 0.00001114240005335887 s | 0.94 |
| actmtch / JaXPipe / cpu / PostRev | 0.000010111800002050586 s | 0.000009797399943636265 s | 1.03 |
| actmtch / JaXPipe / cpu / BothRev | 0.000011244080005781144 s | 0.00001141130011092173 s | 0.99 |
| actmtch / Jax / cpu / BothRev | 0.000010050399996544002 s | 0.000009590319968992844 s | 1.05 |
| actmtch / HLOOpt / cpu / PreRev | 0.000010932359991784324 s | 0.00001094651990570128 s | 1.00 |
| actmtch / HLOOpt / cpu / PostRev | 0.000012235020049047308 s | 0.000012295760061533656 s | 1.00 |
| actmtch / HLOOpt / cpu / BothRev | 0.000010914080066868336 s | 0.000010983460069837748 s | 0.99 |
| actmtch / PartOpt / cpu / PreRev | 0.000011122439991595456 s | 0.000010270799921272557 s | 1.08 |
| actmtch / PartOpt / cpu / PostRev | 0.000009891100035019916 s | 0.000009451600017200687 s | 1.05 |
| actmtch / PartOpt / cpu / BothRev | 0.0000109029999566701 s | 0.000011059059979743323 s | 0.99 |
| actmtch / IPartOpt / cpu / PreRev | 0.000010271739974996309 s | 0.000010759480028355029 s | 0.95 |
| actmtch / IPartOpt / cpu / PostRev | 0.000010339620002923766 s | 0.000009635619935579598 s | 1.07 |
| actmtch / IPartOpt / cpu / BothRev | 0.000010751439995146938 s | 0.000010833199994522146 s | 0.99 |
| actmtch / DefOpt / cpu / PreRev | 0.000011190080012966064 s | 0.000010563840023678494 s | 1.06 |
| actmtch / DefOpt / cpu / PostRev | 0.000010951019976346289 s | 0.000010677940026653233 s | 1.03 |
| actmtch / DefOpt / cpu / BothRev | 0.000011463900018497952 s | 0.000011130339989904314 s | 1.03 |
| actmtch / IDefOpt / cpu / PreRev | 0.000011050059965782566 s | 0.000010484219928912352 s | 1.05 |
| actmtch / IDefOpt / cpu / PostRev | 0.000011233879995415918 s | 0.00001111930003389716 s | 1.01 |
| actmtch / IDefOpt / cpu / BothRev | 0.000010483420037417093 s | 0.000010797219929372658 s | 0.97 |
| actmtch / JaXPipe / cuda / Primal | 0.000002015 s | 0.0000024 s | 0.84 |
| actmtch / Jax / cuda / Primal | 0.000002016 s | 0.000002399 s | 0.84 |
| actmtch / HLOOpt / cuda / Primal | 0.000002016 s | 0.0000024 s | 0.84 |
| actmtch / PartOpt / cuda / Primal | 0.000002015 s | 0.0000024 s | 0.84 |
| actmtch / IPartOpt / cuda / Primal | 0.000002016 s | 0.0000024 s | 0.84 |
| actmtch / DefOpt / cuda / Primal | 0.000002016 s | 0.0000024 s | 0.84 |
| actmtch / IDefOpt / cuda / Primal | 0.000002016 s | 0.000002399 s | 0.84 |
| actmtch / JaXPipe / cuda / Forward | 0.000009504 s | 0.00001024 s | 0.93 |
| actmtch / Jax / cuda / Forward | 0.000010624 s | 0.000010432 s | 1.02 |
| actmtch / HLOOpt / cuda / Forward | 0.000011296 s | 0.000010176 s | 1.11 |
| actmtch / PartOpt / cuda / Forward | 0.000010144 s | 0.000010592 s | 0.96 |
| actmtch / IPartOpt / cuda / Forward | 0.000009792 s | 0.000010303 s | 0.95 |
| actmtch / DefOpt / cuda / Forward | 0.000010336 s | 0.000010496 s | 0.98 |
| actmtch / IDefOpt / cuda / Forward | 0.000010016 s | 0.000010369 s | 0.97 |
| actmtch / JaXPipe / cuda / PreRev | 0.00000976 s | 0.000010784 s | 0.91 |
| actmtch / JaXPipe / cuda / PostRev | 0.00001008 s | 0.000010624 s | 0.95 |
| actmtch / JaXPipe / cuda / BothRev | 0.000011488 s | 0.000010368 s | 1.11 |
| actmtch / Jax / cuda / BothRev | 0.00001008 s | 0.000010624 s | 0.95 |
| actmtch / HLOOpt / cuda / PreRev | 0.000010112 s | 0.0000104 s | 0.97 |
| actmtch / HLOOpt / cuda / PostRev | 0.000010081 s | 0.0000104 s | 0.97 |
| actmtch / HLOOpt / cuda / BothRev | 0.000010017 s | 0.000010336 s | 0.97 |
| actmtch / PartOpt / cuda / PreRev | 0.000010016 s | 0.000010463 s | 0.96 |
| actmtch / PartOpt / cuda / PostRev | 0.000010272 s | 0.00001056 s | 0.97 |
| actmtch / PartOpt / cuda / BothRev | 0.000010208 s | 0.000010592 s | 0.96 |
| actmtch / IPartOpt / cuda / PreRev | 0.000009952 s | 0.0000104 s | 0.96 |
| actmtch / IPartOpt / cuda / PostRev | 0.00001008 s | 0.000010369 s | 0.97 |
| actmtch / IPartOpt / cuda / BothRev | 0.000010336 s | 0.000011232 s | 0.92 |
| actmtch / DefOpt / cuda / PreRev | 0.000010144 s | 0.000011616 s | 0.87 |
| actmtch / DefOpt / cuda / PostRev | 0.000010048 s | 0.00001136 s | 0.88 |
| actmtch / DefOpt / cuda / BothRev | 0.000010016 s | 0.00001072 s | 0.93 |
| actmtch / IDefOpt / cuda / PreRev | 0.000010336 s | 0.000010432 s | 0.99 |
| actmtch / IDefOpt / cuda / PostRev | 0.00001008 s | 0.00001072 s | 0.94 |
| actmtch / IDefOpt / cuda / BothRev | 0.00001008 s | 0.000010368 s | 0.97 |
| actmtch / JaXPipe / tpu / Primal | 5.633749999999999e-7 s | 5.633749999999999e-7 s | 1 |
| actmtch / Jax / tpu / Primal | 5.967e-7 s | 5.973e-7 s | 1.00 |
| actmtch / HLOOpt / tpu / Primal | 0.000002102125 s | 0.000002102425 s | 1.00 |
| actmtch / PartOpt / tpu / Primal | 5.969000000000001e-7 s | 5.96875e-7 s | 1.00 |
| actmtch / IPartOpt / tpu / Primal | 5.52375e-7 s | 5.5285e-7 s | 1.00 |
| actmtch / DefOpt / tpu / Primal | 0.0000021595 s | 0.00000215745 s | 1.00 |
| actmtch / IDefOpt / tpu / Primal | 0.000002108025 s | 0.000002101375 s | 1.00 |
| actmtch / JaXPipe / tpu / Forward | 0.000003828225 s | 0.000003832575 s | 1.00 |
| actmtch / Jax / tpu / Forward | 0.000001211575 s | 0.000001213375 s | 1.00 |
| actmtch / HLOOpt / tpu / Forward | 0.00000393725 s | 0.0000039319250000000005 s | 1.00 |
| actmtch / PartOpt / tpu / Forward | 0.000003916275 s | 0.00000391155 s | 1.00 |
| actmtch / IPartOpt / tpu / Forward | 0.0000039354000000000005 s | 0.000003945575 s | 1.00 |
| actmtch / DefOpt / tpu / Forward | 0.0000039282 s | 0.000003906125 s | 1.01 |
| actmtch / IDefOpt / tpu / Forward | 0.0000039336 s | 0.0000039276 s | 1.00 |
| actmtch / JaXPipe / tpu / PreRev | 0.000003468325 s | 0.000003482075 s | 1.00 |
| actmtch / JaXPipe / tpu / PostRev | 0.000001651075 s | 0.0000016388 s | 1.01 |
| actmtch / JaXPipe / tpu / BothRev | 0.00000348735 s | 0.00000348365 s | 1.00 |
| actmtch / Jax / tpu / BothRev | 0.0000016338 s | 0.0000016362 s | 1.00 |
| actmtch / HLOOpt / tpu / PreRev | 0.00000347085 s | 0.0000034763 s | 1.00 |
| actmtch / HLOOpt / tpu / PostRev | 0.000003412625 s | 0.0000034209750000000003 s | 1.00 |
| actmtch / HLOOpt / tpu / BothRev | 0.000003477675 s | 0.000003486025 s | 1.00 |
| actmtch / PartOpt / tpu / PreRev | 0.0000034107500000000003 s | 0.000003394725 s | 1.00 |
| actmtch / PartOpt / tpu / PostRev | 0.000001594525 s | 0.0000015865000000000002 s | 1.01 |
| actmtch / PartOpt / tpu / BothRev | 0.0000034103 s | 0.000003427625 s | 0.99 |
| actmtch / IPartOpt / tpu / PreRev | 0.000003470925 s | 0.000003497 s | 0.99 |
| actmtch / IPartOpt / tpu / PostRev | 0.0000016547999999999998 s | 0.0000016344 s | 1.01 |
| actmtch / IPartOpt / tpu / BothRev | 0.0000034697 s | 0.0000034946 s | 0.99 |
| actmtch / DefOpt / tpu / PreRev | 0.000003404875 s | 0.000003425075 s | 0.99 |
| actmtch / DefOpt / tpu / PostRev | 0.000003431125 s | 0.000003411675 s | 1.01 |
| actmtch / DefOpt / tpu / BothRev | 0.000003412775 s | 0.000003417575 s | 1.00 |
| actmtch / IDefOpt / tpu / PreRev | 0.000003492225 s | 0.00000347555 s | 1.00 |
| actmtch / IDefOpt / tpu / PostRev | 0.00000341065 s | 0.000003417775 s | 1.00 |
| actmtch / IDefOpt / tpu / BothRev | 0.00000346825 s | 0.0000034766249999999995 s | 1.00 |
| actmtch / JaXPipe / cpu / Primal | 0.000012988 s | 0.000006562319977092557 s | 1.98 |
| actmtch / Jax / cpu / Primal | 0.000013129 s | 0.0000062876599804440046 s | 2.09 |
| actmtch / HLOOpt / cpu / Primal | 0.000014042 s | 0.000007609099993715063 s | 1.85 |
| actmtch / PartOpt / cpu / Primal | 0.000012956 s | 0.000006221880139491986 s | 2.08 |
| actmtch / IPartOpt / cpu / Primal | 0.000013295 s | 0.000006396459939423949 s | 2.08 |
| actmtch / DefOpt / cpu / Primal | 0.000013726 s | 0.000007537180008512223 s | 1.82 |
| actmtch / IDefOpt / cpu / Primal | 0.000013753 s | 0.000006963640062167542 s | 1.97 |
| actmtch / JaXPipe / cpu / Forward | 0.000018612 s | 0.00001072302004104131 s | 1.74 |
| actmtch / Jax / cpu / Forward | 0.000017637 s | 0.000010128840040124487 s | 1.74 |
| actmtch / HLOOpt / cpu / Forward | 0.000018914 s | 0.000010821760133694624 s | 1.75 |
| actmtch / PartOpt / cpu / Forward | 0.000018427 s | 0.00001043737996951677 s | 1.77 |
| actmtch / IPartOpt / cpu / Forward | 0.000018908000000000003 s | 0.000010789780044433427 s | 1.75 |
| actmtch / DefOpt / cpu / Forward | 0.000018862 s | 0.0000100856801327609 s | 1.87 |
| actmtch / IDefOpt / cpu / Forward | 0.000019001 s | 0.00001044433998686145 s | 1.82 |
| actmtch / JaXPipe / cpu / PreRev | 0.000019666 s | 0.00001114240005335887 s | 1.76 |
| actmtch / JaXPipe / cpu / PostRev | 0.000017236000000000002 s | 0.000009797399943636265 s | 1.76 |
| actmtch / JaXPipe / cpu / BothRev | 0.000019368 s | 0.00001141130011092173 s | 1.70 |
| actmtch / Jax / cpu / BothRev | 0.000017175 s | 0.000009590319968992844 s | 1.79 |
| actmtch / HLOOpt / cpu / PreRev | 0.000019002 s | 0.00001094651990570128 s | 1.74 |
| actmtch / HLOOpt / cpu / PostRev | 0.000018688 s | 0.000012295760061533656 s | 1.52 |
| actmtch / HLOOpt / cpu / BothRev | 0.000019291 s | 0.000010983460069837748 s | 1.76 |
| actmtch / PartOpt / cpu / PreRev | 0.000018869 s | 0.000010270799921272557 s | 1.84 |
| actmtch / PartOpt / cpu / PostRev | 0.000017537 s | 0.000009451600017200687 s | 1.86 |
| actmtch / PartOpt / cpu / BothRev | 0.000018727 s | 0.000011059059979743323 s | 1.69 |
| actmtch / IPartOpt / cpu / PreRev | 0.000019275 s | 0.000010759480028355029 s | 1.79 |
| actmtch / IPartOpt / cpu / PostRev | 0.000017324 s | 0.000009635619935579598 s | 1.80 |
| actmtch / IPartOpt / cpu / BothRev | 0.000019117 s | 0.000010833199994522146 s | 1.76 |
| actmtch / DefOpt / cpu / PreRev | 0.000019261 s | 0.000010563840023678494 s | 1.82 |
| actmtch / DefOpt / cpu / PostRev | 0.000019041 s | 0.000010677940026653233 s | 1.78 |
| actmtch / DefOpt / cpu / BothRev | 0.000018981 s | 0.000011130339989904314 s | 1.71 |
| actmtch / IDefOpt / cpu / PreRev | 0.000018751 s | 0.000010484219928912352 s | 1.79 |
| actmtch / IDefOpt / cpu / PostRev | 0.000019488 s | 0.00001111930003389716 s | 1.75 |
| actmtch / IDefOpt / cpu / BothRev | 0.000019273 s | 0.000010797219929372658 s | 1.78 |
| add_one / JaXPipe / cpu / Primal | 0.000006452600027841981 s | 0.000006647580121352803 s | 0.97 |
| add_one / Jax / cpu / Primal | 0.000006513920006909757 s | 0.000007019260065135313 s | 0.93 |
| add_one / HLOOpt / cpu / Primal | 0.000006593100006284658 s | 0.000006855379924672889 s | 0.96 |
| add_one / PartOpt / cpu / Primal | 0.000006480499987446819 s | 0.000006245519980438985 s | 1.04 |
| add_one / IPartOpt / cpu / Primal | 0.000006900440030221944 s | 0.000006942599975445774 s | 0.99 |
| add_one / DefOpt / cpu / Primal | 0.000006850679992567165 s | 0.0000069157200050540265 s | 0.99 |
| add_one / IDefOpt / cpu / Primal | 0.000006841899967184873 s | 0.00000644126001134282 s | 1.06 |
| add_one / JaXPipe / cpu / Forward | 0.000009777499972187798 s | 0.000010058619955088945 s | 0.97 |
| add_one / Jax / cpu / Forward | 0.000010239560006084502 s | 0.000009927860028255965 s | 1.03 |
| add_one / HLOOpt / cpu / Forward | 0.000010132299967153814 s | 0.000010169480028707766 s | 1.00 |
| add_one / PartOpt / cpu / Forward | 0.000009659980050855666 s | 0.000010235980043944435 s | 0.94 |
| add_one / IPartOpt / cpu / Forward | 0.000009985979977500392 s | 0.00000981865994617692 s | 1.02 |
| add_one / DefOpt / cpu / Forward | 0.00000960511997618596 s | 0.000010122280018549644 s | 0.95 |
| add_one / IDefOpt / cpu / Forward | 0.000009439319983357564 s | 0.000009593060021870769 s | 0.98 |
| add_one / JaXPipe / cpu / PreRev | 0.000011875420013893744 s | 0.000011487320007290693 s | 1.03 |
| add_one / JaXPipe / cpu / PostRev | 0.000011418859976402018 s | 0.000011399719842302149 s | 1.00 |
| add_one / JaXPipe / cpu / BothRev | 0.000011376499978723586 s | 0.000011612500002229353 s | 0.98 |
| add_one / Jax / cpu / BothRev | 0.000010846159975699264 s | 0.000011088899955211672 s | 0.98 |
| add_one / HLOOpt / cpu / PreRev | 0.00001157820001935761 s | 0.000011683019965857968 s | 0.99 |
| add_one / HLOOpt / cpu / PostRev | 0.000013604060013676644 s | 0.000014739599992026342 s | 0.92 |
| add_one / HLOOpt / cpu / BothRev | 0.000011021959962818074 s | 0.000011134360011055832 s | 0.99 |
| add_one / PartOpt / cpu / PreRev | 0.000011507000008350588 s | 0.0000115139000081399 s | 1.00 |
| add_one / PartOpt / cpu / PostRev | 0.000011434039997766376 s | 0.000011024920004274463 s | 1.04 |
| add_one / PartOpt / cpu / BothRev | 0.000011449060029917748 s | 0.00001160763993539149 s | 0.99 |
| add_one / IPartOpt / cpu / PreRev | 0.000010801560001709732 s | 0.000011384020108380355 s | 0.95 |
| add_one / IPartOpt / cpu / PostRev | 0.000011328940008752395 s | 0.000011333639959048014 s | 1.00 |
| add_one / IPartOpt / cpu / BothRev | 0.000011806399998022243 s | 0.000010866900011023972 s | 1.09 |
| add_one / DefOpt / cpu / PreRev | 0.000011295460017208826 s | 0.00001100651994420332 s | 1.03 |
| add_one / DefOpt / cpu / PostRev | 0.000011590360018089995 s | 0.000010951919994113267 s | 1.06 |
| add_one / DefOpt / cpu / BothRev | 0.000011139119960716924 s | 0.000010872719994949876 s | 1.02 |
| add_one / IDefOpt / cpu / PreRev | 0.000011444520023360384 s | 0.000011335459948895732 s | 1.01 |
| add_one / IDefOpt / cpu / PostRev | 0.000011111999965578434 s | 0.000010873259907384635 s | 1.02 |
| add_one / IDefOpt / cpu / BothRev | 0.00001165393999144726 s | 0.000010865979947993764 s | 1.07 |
| add_one / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.000002304 s | 0.83 |
| add_one / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.000002335 s | 0.82 |
| add_one / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002303 s | 0.83 |
| add_one / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002335 s | 0.82 |
| add_one / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002335 s | 0.82 |
| add_one / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002335 s | 0.82 |
| add_one / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002335 s | 0.82 |
| add_one / JaXPipe / cuda / Forward | 0.00001008 s | 0.000010752 s | 0.94 |
| add_one / Jax / cuda / Forward | 0.000010112 s | 0.000010432 s | 0.97 |
| add_one / HLOOpt / cuda / Forward | 0.000009472 s | 0.000010592 s | 0.89 |
| add_one / PartOpt / cuda / Forward | 0.000009984 s | 0.000013215 s | 0.76 |
| add_one / IPartOpt / cuda / Forward | 0.000009984 s | 0.000010304 s | 0.97 |
| add_one / DefOpt / cuda / Forward | 0.00001008 s | 0.000010368 s | 0.97 |
| add_one / IDefOpt / cuda / Forward | 0.000010017 s | 0.000010175 s | 0.98 |
| add_one / JaXPipe / cuda / PreRev | 0.000024832 s | 0.000025792 s | 0.96 |
| add_one / JaXPipe / cuda / PostRev | 0.00002448 s | 0.000025536 s | 0.96 |
| add_one / JaXPipe / cuda / BothRev | 0.000025152 s | 0.000025696 s | 0.98 |
| add_one / Jax / cuda / BothRev | 0.000024896 s | 0.000025792 s | 0.97 |
| add_one / HLOOpt / cuda / PreRev | 0.00002464 s | 0.000025568 s | 0.96 |
| add_one / HLOOpt / cuda / PostRev | 0.000024832 s | 0.000024256 s | 1.02 |
| add_one / HLOOpt / cuda / BothRev | 0.000024768 s | 0.000025568 s | 0.97 |
| add_one / PartOpt / cuda / PreRev | 0.00002496 s | 0.000025376 s | 0.98 |
| add_one / PartOpt / cuda / PostRev | 0.000024864 s | 0.00002576 s | 0.97 |
| add_one / PartOpt / cuda / BothRev | 0.000024288 s | 0.000025632 s | 0.95 |
| add_one / IPartOpt / cuda / PreRev | 0.000025152 s | 0.000026272 s | 0.96 |
| add_one / IPartOpt / cuda / PostRev | 0.000024673 s | 0.000025409 s | 0.97 |
| add_one / IPartOpt / cuda / BothRev | 0.00002512 s | 0.000025919 s | 0.97 |
| add_one / DefOpt / cuda / PreRev | 0.000024928 s | 0.00002608 s | 0.96 |
| add_one / DefOpt / cuda / PostRev | 0.00002496 s | 0.000025695 s | 0.97 |
| add_one / DefOpt / cuda / BothRev | 0.00002448 s | 0.000026016 s | 0.94 |
| add_one / IDefOpt / cuda / PreRev | 0.000025375 s | 0.0000248 s | 1.02 |
| add_one / IDefOpt / cuda / PostRev | 0.000025376 s | 0.000025888 s | 0.98 |
| add_one / IDefOpt / cuda / BothRev | 0.000024992 s | 0.000025504 s | 0.98 |
| add_one / JaXPipe / tpu / Primal | 0.0000014294000000000002 s | 0.00000142345 s | 1.00 |
| add_one / Jax / tpu / Primal | 0.0000014027249999999998 s | 0.000001404425 s | 1.00 |
| add_one / HLOOpt / tpu / Primal | 0.000001430825 s | 0.0000014319249999999998 s | 1.00 |
| add_one / PartOpt / tpu / Primal | 0.0000014064750000000002 s | 0.000001407725 s | 1.00 |
| add_one / IPartOpt / tpu / Primal | 0.0000014301 s | 0.0000014238 s | 1.00 |
| add_one / DefOpt / tpu / Primal | 0.0000014045750000000002 s | 0.0000014118 s | 0.99 |
| add_one / IDefOpt / tpu / Primal | 0.00000142725 s | 0.0000014299 s | 1.00 |
| add_one / JaXPipe / tpu / Forward | 0.00000184715 s | 0.000001843975 s | 1.00 |
| add_one / Jax / tpu / Forward | 0.00000186055 s | 0.000001847625 s | 1.01 |
| add_one / HLOOpt / tpu / Forward | 0.0000018466 s | 0.000001858425 s | 0.99 |
| add_one / PartOpt / tpu / Forward | 0.000001840125 s | 0.00000184495 s | 1.00 |
| add_one / IPartOpt / tpu / Forward | 0.000001861925 s | 0.000001852075 s | 1.01 |
| add_one / DefOpt / tpu / Forward | 0.000001849675 s | 0.000001841675 s | 1.00 |
| add_one / IDefOpt / tpu / Forward | 0.00000185315 s | 0.00000186845 s | 0.99 |
| add_one / JaXPipe / tpu / PreRev | 0.000002242675 s | 0.00000224345 s | 1.00 |
| add_one / JaXPipe / tpu / PostRev | 0.000002242225 s | 0.00000225025 s | 1.00 |
| add_one / JaXPipe / tpu / BothRev | 0.0000022388 s | 0.000002233775 s | 1.00 |
| add_one / Jax / tpu / BothRev | 0.0000022378 s | 0.000002238575 s | 1.00 |
| add_one / HLOOpt / tpu / PreRev | 0.000002236825 s | 0.00000223345 s | 1.00 |
| add_one / HLOOpt / tpu / PostRev | 0.0000022394 s | 0.000002241375 s | 1.00 |
| add_one / HLOOpt / tpu / BothRev | 0.0000022341000000000003 s | 0.000002236875 s | 1.00 |
| add_one / PartOpt / tpu / PreRev | 0.00000223765 s | 0.00000223495 s | 1.00 |
| add_one / PartOpt / tpu / PostRev | 0.000002234475 s | 0.000002233225 s | 1.00 |
| add_one / PartOpt / tpu / BothRev | 0.00000224045 s | 0.000002241225 s | 1.00 |
| add_one / IPartOpt / tpu / PreRev | 0.00000223955 s | 0.000002242525 s | 1.00 |
| add_one / IPartOpt / tpu / PostRev | 0.000002236175 s | 0.00000223895 s | 1.00 |
| add_one / IPartOpt / tpu / BothRev | 0.0000022408 s | 0.000002238 s | 1.00 |
| add_one / DefOpt / tpu / PreRev | 0.000002237925 s | 0.0000022502 s | 0.99 |
| add_one / DefOpt / tpu / PostRev | 0.000002245275 s | 0.000002244275 s | 1.00 |
| add_one / DefOpt / tpu / BothRev | 0.0000022398 s | 0.0000022360000000000003 s | 1.00 |
| add_one / IDefOpt / tpu / PreRev | 0.0000022451 s | 0.000002237875 s | 1.00 |
| add_one / IDefOpt / tpu / PostRev | 0.0000022443 s | 0.000002241175 s | 1.00 |
| add_one / IDefOpt / tpu / BothRev | 0.0000022466 s | 0.000002246625 s | 1.00 |
| add_one / JaXPipe / cpu / Primal | 0.000013041 s | 0.000006647580121352803 s | 1.96 |
| add_one / Jax / cpu / Primal | 0.000012939 s | 0.000007019260065135313 s | 1.84 |
| add_one / HLOOpt / cpu / Primal | 0.000012559 s | 0.000006855379924672889 s | 1.83 |
| add_one / PartOpt / cpu / Primal | 0.000012924 s | 0.000006245519980438985 s | 2.07 |
| add_one / IPartOpt / cpu / Primal | 0.000012524 s | 0.000006942599975445774 s | 1.80 |
| add_one / DefOpt / cpu / Primal | 0.000012793 s | 0.0000069157200050540265 s | 1.85 |
| add_one / IDefOpt / cpu / Primal | 0.000012802 s | 0.00000644126001134282 s | 1.99 |
| add_one / JaXPipe / cpu / Forward | 0.0000174 s | 0.000010058619955088945 s | 1.73 |
| add_one / Jax / cpu / Forward | 0.000017094000000000003 s | 0.000009927860028255965 s | 1.72 |
| add_one / HLOOpt / cpu / Forward | 0.000017151 s | 0.000010169480028707766 s | 1.69 |
| add_one / PartOpt / cpu / Forward | 0.00001715 s | 0.000010235980043944435 s | 1.68 |
| add_one / IPartOpt / cpu / Forward | 0.000017332 s | 0.00000981865994617692 s | 1.77 |
| add_one / DefOpt / cpu / Forward | 0.000017462 s | 0.000010122280018549644 s | 1.73 |
| add_one / IDefOpt / cpu / Forward | 0.000017305 s | 0.000009593060021870769 s | 1.80 |
| add_one / JaXPipe / cpu / PreRev | 0.000019639 s | 0.000011487320007290693 s | 1.71 |
| add_one / JaXPipe / cpu / PostRev | 0.00001949 s | 0.000011399719842302149 s | 1.71 |
| add_one / JaXPipe / cpu / BothRev | 0.000019433 s | 0.000011612500002229353 s | 1.67 |
| add_one / Jax / cpu / BothRev | 0.000019185 s | 0.000011088899955211672 s | 1.73 |
| add_one / HLOOpt / cpu / PreRev | 0.000019004 s | 0.000011683019965857968 s | 1.63 |
| add_one / HLOOpt / cpu / PostRev | 0.000019333 s | 0.000014739599992026342 s | 1.31 |
| add_one / HLOOpt / cpu / BothRev | 0.000019239 s | 0.000011134360011055832 s | 1.73 |
| add_one / PartOpt / cpu / PreRev | 0.000019478 s | 0.0000115139000081399 s | 1.69 |
| add_one / PartOpt / cpu / PostRev | 0.000019188 s | 0.000011024920004274463 s | 1.74 |
| add_one / PartOpt / cpu / BothRev | 0.000019272 s | 0.00001160763993539149 s | 1.66 |
| add_one / IPartOpt / cpu / PreRev | 0.00001881 s | 0.000011384020108380355 s | 1.65 |
| add_one / IPartOpt / cpu / PostRev | 0.000019757 s | 0.000011333639959048014 s | 1.74 |
| add_one / IPartOpt / cpu / BothRev | 0.000019488 s | 0.000010866900011023972 s | 1.79 |
| add_one / DefOpt / cpu / PreRev | 0.000019348 s | 0.00001100651994420332 s | 1.76 |
| add_one / DefOpt / cpu / PostRev | 0.000019679 s | 0.000010951919994113267 s | 1.80 |
| add_one / DefOpt / cpu / BothRev | 0.000019611 s | 0.000010872719994949876 s | 1.80 |
| add_one / IDefOpt / cpu / PreRev | 0.000019184 s | 0.000011335459948895732 s | 1.69 |
| add_one / IDefOpt / cpu / PostRev | 0.000019581 s | 0.000010873259907384635 s | 1.80 |
| add_one / IDefOpt / cpu / BothRev | 0.000019573 s | 0.000010865979947993764 s | 1.80 |
| add_two / JaXPipe / cpu / Primal | 0.000006571939975401619 s | 0.000006799959974159719 s | 0.97 |
| add_two / Jax / cpu / Primal | 0.000006666879999102093 s | 0.000006684760064672446 s | 1.00 |
| add_two / HLOOpt / cpu / Primal | 0.0000068448800175247015 s | 0.000006848320026620058 s | 1.00 |
| add_two / PartOpt / cpu / Primal | 0.000006629580002481816 s | 0.000007211239972093608 s | 0.92 |
| add_two / IPartOpt / cpu / Primal | 0.000006899180016262108 s | 0.0000069097999767109285 s | 1.00 |
| add_two / DefOpt / cpu / Primal | 0.000006695560032312642 s | 0.000007047919862088747 s | 0.95 |
| add_two / IDefOpt / cpu / Primal | 0.000006662800024059834 s | 0.0000069972399978723845 s | 0.95 |
| add_two / JaXPipe / cpu / Forward | 0.000009844259984674864 s | 0.00000989261992799584 s | 1.00 |
| add_two / Jax / cpu / Forward | 0.000010339820009903631 s | 0.000009906680061249062 s | 1.04 |
| add_two / HLOOpt / cpu / Forward | 0.000010109280010510702 s | 0.000010638600033416878 s | 0.95 |
| add_two / PartOpt / cpu / Forward | 0.000009904720009217272 s | 0.000010299079913238528 s | 0.96 |
| add_two / IPartOpt / cpu / Forward | 0.00000989731998743082 s | 0.000010367220074840588 s | 0.95 |
| add_two / DefOpt / cpu / Forward | 0.000010397079950053013 s | 0.000010392959993623662 s | 1.00 |
| add_two / IDefOpt / cpu / Forward | 0.000010397659998488962 s | 0.000010211160042672416 s | 1.02 |
| add_two / JaXPipe / cpu / PreRev | 0.00001435203998880752 s | 0.000013334040031622865 s | 1.08 |
| add_two / JaXPipe / cpu / PostRev | 0.000013896679984100049 s | 0.000013018559948250183 s | 1.07 |
| add_two / JaXPipe / cpu / BothRev | 0.000013280660032251036 s | 0.00001362989989502239 s | 0.97 |
| add_two / Jax / cpu / BothRev | 0.000013912559988966678 s | 0.00001345006005067262 s | 1.03 |
| add_two / HLOOpt / cpu / PreRev | 0.000014072480007598644 s | 0.000013947060087957652 s | 1.01 |
| add_two / HLOOpt / cpu / PostRev | 0.00001602999996066501 s | 0.000015459519945579815 s | 1.04 |
| add_two / HLOOpt / cpu / BothRev | 0.000013738679972448154 s | 0.000013334459890756988 s | 1.03 |
| add_two / PartOpt / cpu / PreRev | 0.000013722820031034643 s | 0.00001374192004732322 s | 1.00 |
| add_two / PartOpt / cpu / PostRev | 0.000013932160054537235 s | 0.000013714780016016448 s | 1.02 |
| add_two / PartOpt / cpu / BothRev | 0.000013922559992352036 s | 0.000014078620006330312 s | 0.99 |
| add_two / IPartOpt / cpu / PreRev | 0.000013527760020224375 s | 0.000013926319952588527 s | 0.97 |
| add_two / IPartOpt / cpu / PostRev | 0.0000137006600289169 s | 0.000013750919915764824 s | 1.00 |
| add_two / IPartOpt / cpu / BothRev | 0.0000139493599999696 s | 0.000012942820067110006 s | 1.08 |
| add_two / DefOpt / cpu / PreRev | 0.00001445802001398988 s | 0.000013346799914870643 s | 1.08 |
| add_two / DefOpt / cpu / PostRev | 0.00001425317996108788 s | 0.000013584420030383624 s | 1.05 |
| add_two / DefOpt / cpu / BothRev | 0.00001393896000990935 s | 0.000013460420032060938 s | 1.04 |
| add_two / IDefOpt / cpu / PreRev | 0.00001413516000866366 s | 0.000013556180074374424 s | 1.04 |
| add_two / IDefOpt / cpu / PostRev | 0.00001347014000202762 s | 0.0000135386000147264 s | 0.99 |
| add_two / IDefOpt / cpu / BothRev | 0.00001330805997895368 s | 0.0000137441399601812 s | 0.97 |
| add_two / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.000002432 s | 0.79 |
| add_two / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002432 s | 0.79 |
| add_two / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002431 s | 0.79 |
| add_two / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000002432 s | 0.79 |
| add_two / JaXPipe / cuda / Forward | 0.000009728 s | 0.000010688 s | 0.91 |
| add_two / Jax / cuda / Forward | 0.00001008 s | 0.0000104 s | 0.97 |
| add_two / HLOOpt / cuda / Forward | 0.00000976 s | 0.000010432 s | 0.94 |
| add_two / PartOpt / cuda / Forward | 0.000009888 s | 0.000010432 s | 0.95 |
| add_two / IPartOpt / cuda / Forward | 0.000009824 s | 0.000010432 s | 0.94 |
| add_two / DefOpt / cuda / Forward | 0.000009728 s | 0.000010495 s | 0.93 |
| add_two / IDefOpt / cuda / Forward | 0.00000992 s | 0.000010432 s | 0.95 |
| add_two / JaXPipe / cuda / PreRev | 0.000033121 s | 0.000031904000000000005 s | 1.04 |
| add_two / JaXPipe / cuda / PostRev | 0.000032096 s | 0.000032767999999999995 s | 0.98 |
| add_two / JaXPipe / cuda / BothRev | 0.000033057000000000006 s | 0.000032256 s | 1.02 |
| add_two / Jax / cuda / BothRev | 0.000032191 s | 0.000032704 s | 0.98 |
| add_two / HLOOpt / cuda / PreRev | 0.000033248 s | 0.000032800000000000004 s | 1.01 |
| add_two / HLOOpt / cuda / PostRev | 0.000031456 s | 0.000032288 s | 0.97 |
| add_two / HLOOpt / cuda / BothRev | 0.000032672 s | 0.000033344 s | 0.98 |
| add_two / PartOpt / cuda / PreRev | 0.000033313000000000004 s | 0.000033344 s | 1.00 |
| add_two / PartOpt / cuda / PostRev | 0.000032288 s | 0.00003248 s | 0.99 |
| add_two / PartOpt / cuda / BothRev | 0.00003248 s | 0.00003248 s | 1 |
| add_two / IPartOpt / cuda / PreRev | 0.000032800000000000004 s | 0.000033152000000000004 s | 0.99 |
| add_two / IPartOpt / cuda / PostRev | 0.000031584 s | 0.000032992 s | 0.96 |
| add_two / IPartOpt / cuda / BothRev | 0.000032769 s | 0.000032416 s | 1.01 |
| add_two / DefOpt / cuda / PreRev | 0.000033568 s | 0.000032896000000000005 s | 1.02 |
| add_two / DefOpt / cuda / PostRev | 0.00003808 s | 0.000032543 s | 1.17 |
| add_two / DefOpt / cuda / BothRev | 0.000032224 s | 0.00003296 s | 0.98 |
| add_two / IDefOpt / cuda / PreRev | 0.000032897 s | 0.000033472 s | 0.98 |
| add_two / IDefOpt / cuda / PostRev | 0.000032704 s | 0.00003264 s | 1.00 |
| add_two / IDefOpt / cuda / BothRev | 0.00003712 s | 0.000033472 s | 1.11 |
| add_two / JaXPipe / tpu / Primal | 0.0000014740249999999998 s | 0.0000014749499999999995 s | 1.00 |
| add_two / Jax / tpu / Primal | 0.000001441325 s | 0.000001439825 s | 1.00 |
| add_two / HLOOpt / tpu / Primal | 0.000001484325 s | 0.0000014717749999999998 s | 1.01 |
| add_two / PartOpt / tpu / Primal | 0.000001431275 s | 0.000001445775 s | 0.99 |
| add_two / IPartOpt / tpu / Primal | 0.0000014793999999999998 s | 0.00000146895 s | 1.01 |
| add_two / DefOpt / tpu / Primal | 0.0000014428 s | 0.0000014387 s | 1.00 |
| add_two / IDefOpt / tpu / Primal | 0.0000014824249999999998 s | 0.00000147485 s | 1.01 |
| add_two / JaXPipe / tpu / Forward | 0.000001819675 s | 0.00000181685 s | 1.00 |
| add_two / Jax / tpu / Forward | 0.00000191125 s | 0.000001905375 s | 1.00 |
| add_two / HLOOpt / tpu / Forward | 0.0000018207 s | 0.000001819725 s | 1.00 |
| add_two / PartOpt / tpu / Forward | 0.000001911025 s | 0.00000192145 s | 0.99 |
| add_two / IPartOpt / tpu / Forward | 0.000001842925 s | 0.000001815275 s | 1.02 |
| add_two / DefOpt / tpu / Forward | 0.00000191755 s | 0.00000190725 s | 1.01 |
| add_two / IDefOpt / tpu / Forward | 0.00000181875 s | 0.000001819325 s | 1.00 |
| add_two / JaXPipe / tpu / PreRev | 0.000002863925 s | 0.000002864375 s | 1.00 |
| add_two / JaXPipe / tpu / PostRev | 0.000002731275 s | 0.0000027442250000000004 s | 1.00 |
| add_two / JaXPipe / tpu / BothRev | 0.0000028567 s | 0.0000028623 s | 1.00 |
| add_two / Jax / tpu / BothRev | 0.0000027282 s | 0.000002725525 s | 1.00 |
| add_two / HLOOpt / tpu / PreRev | 0.000002864475 s | 0.0000028695500000000003 s | 1.00 |
| add_two / HLOOpt / tpu / PostRev | 0.0000027329750000000003 s | 0.0000027291 s | 1.00 |
| add_two / HLOOpt / tpu / BothRev | 0.000002861975 s | 0.0000028571 s | 1.00 |
| add_two / PartOpt / tpu / PreRev | 0.0000027194750000000003 s | 0.0000027297500000000004 s | 1.00 |
| add_two / PartOpt / tpu / PostRev | 0.000002858875 s | 0.000002863175 s | 1.00 |
| add_two / PartOpt / tpu / BothRev | 0.000002727075 s | 0.0000027407 s | 1.00 |
| add_two / IPartOpt / tpu / PreRev | 0.00000286535 s | 0.000002863175 s | 1.00 |
| add_two / IPartOpt / tpu / PostRev | 0.000002722475 s | 0.0000027239 s | 1.00 |
| add_two / IPartOpt / tpu / BothRev | 0.000002871925 s | 0.00000286345 s | 1.00 |
| add_two / DefOpt / tpu / PreRev | 0.00000272725 s | 0.0000027285749999999995 s | 1.00 |
| add_two / DefOpt / tpu / PostRev | 0.00000286085 s | 0.000002867725 s | 1.00 |
| add_two / DefOpt / tpu / BothRev | 0.00000273385 s | 0.00000272025 s | 1.00 |
| add_two / IDefOpt / tpu / PreRev | 0.00000287595 s | 0.000002858225 s | 1.01 |
| add_two / IDefOpt / tpu / PostRev | 0.000002723925 s | 0.00000273385 s | 1.00 |
| add_two / IDefOpt / tpu / BothRev | 0.000002866025 s | 0.0000028672 s | 1.00 |
add_two / JaXPipe / cpu / Primal |
0.000013621 s |
0.000006799959974159719 s |
2.00 |
add_two / Jax / cpu / Primal |
0.000013268 s |
0.000006684760064672446 s |
1.98 |
add_two / HLOOpt / cpu / Primal |
0.000013457 s |
0.000006848320026620058 s |
1.97 |
add_two / PartOpt / cpu / Primal |
0.000013202 s |
0.000007211239972093608 s |
1.83 |
add_two / IPartOpt / cpu / Primal |
0.000013277 s |
0.0000069097999767109285 s |
1.92 |
add_two / DefOpt / cpu / Primal |
0.000013161000000000002 s |
0.000007047919862088747 s |
1.87 |
add_two / IDefOpt / cpu / Primal |
0.000013279 s |
0.0000069972399978723845 s |
1.90 |
add_two / JaXPipe / cpu / Forward |
0.000017928 s |
0.00000989261992799584 s |
1.81 |
add_two / Jax / cpu / Forward |
0.000017963 s |
0.000009906680061249062 s |
1.81 |
add_two / HLOOpt / cpu / Forward |
0.000017742 s |
0.000010638600033416878 s |
1.67 |
add_two / PartOpt / cpu / Forward |
0.000017703 s |
0.000010299079913238528 s |
1.72 |
add_two / IPartOpt / cpu / Forward |
0.000017624 s |
0.000010367220074840588 s |
1.70 |
add_two / DefOpt / cpu / Forward |
0.000017825 s |
0.000010392959993623662 s |
1.72 |
add_two / IDefOpt / cpu / Forward |
0.00001918 s |
0.000010211160042672416 s |
1.88 |
add_two / JaXPipe / cpu / PreRev |
0.000023358 s |
0.000013334040031622865 s |
1.75 |
add_two / JaXPipe / cpu / PostRev |
0.000023026000000000003 s |
0.000013018559948250183 s |
1.77 |
add_two / JaXPipe / cpu / BothRev |
0.000022731 s |
0.00001362989989502239 s |
1.67 |
add_two / Jax / cpu / BothRev |
0.000022369 s |
0.00001345006005067262 s |
1.66 |
add_two / HLOOpt / cpu / PreRev |
0.000022364 s |
0.000013947060087957652 s |
1.60 |
add_two / HLOOpt / cpu / PostRev |
0.000022812 s |
0.000015459519945579815 s |
1.48 |
add_two / HLOOpt / cpu / BothRev |
0.000022732 s |
0.000013334459890756988 s |
1.70 |
add_two / PartOpt / cpu / PreRev |
0.000022923 s |
0.00001374192004732322 s |
1.67 |
add_two / PartOpt / cpu / PostRev |
0.000022887 s |
0.000013714780016016448 s |
1.67 |
add_two / PartOpt / cpu / BothRev |
0.000023088 s |
0.000014078620006330312 s |
1.64 |
add_two / IPartOpt / cpu / PreRev |
0.000022552 s |
0.000013926319952588527 s |
1.62 |
add_two / IPartOpt / cpu / PostRev |
0.000023084 s |
0.000013750919915764824 s |
1.68 |
add_two / IPartOpt / cpu / BothRev |
0.000022838 s |
0.000012942820067110006 s |
1.76 |
add_two / DefOpt / cpu / PreRev |
0.000022471 s |
0.000013346799914870643 s |
1.68 |
add_two / DefOpt / cpu / PostRev |
0.000022888 s |
0.000013584420030383624 s |
1.68 |
add_two / DefOpt / cpu / BothRev |
0.000022673 s |
0.000013460420032060938 s |
1.68 |
add_two / IDefOpt / cpu / PreRev |
0.000022855 s |
0.000013556180074374424 s |
1.69 |
add_two / IDefOpt / cpu / PostRev |
0.000022803 s |
0.0000135386000147264 s |
1.68 |
add_two / IDefOpt / cpu / BothRev |
0.00002285 s |
0.0000137441399601812 s |
1.66 |
cache / JaXPipe / cpu / Primal |
0.00000605527997322497 s |
0.000006199420040502446 s |
0.98 |
cache / Jax / cpu / Primal |
0.000006095900007494493 s |
0.000006489560000773053 s |
0.94 |
cache / HLOOpt / cpu / Primal |
0.000005971119962850935 s |
0.0000061433001064870044 s |
0.97 |
cache / PartOpt / cpu / Primal |
0.000006003539992889273 s |
0.0000059924800007138405 s |
1.00 |
cache / IPartOpt / cpu / Primal |
0.000006608739977309597 s |
0.000006026699920766987 s |
1.10 |
cache / DefOpt / cpu / Primal |
0.000006044019992259564 s |
0.000006346639984258218 s |
0.95 |
cache / IDefOpt / cpu / Primal |
0.000006449760012401384 s |
0.000006013300044287462 s |
1.07 |
cache / JaXPipe / cpu / Forward |
0.000015492340025957674 s |
0.000014293200001702644 s |
1.08 |
cache / Jax / cpu / Forward |
0.000015538520010522916 s |
0.000018218660061393164 s |
0.85 |
cache / HLOOpt / cpu / Forward |
0.000015981579954313928 s |
0.000015171059985732426 s |
1.05 |
cache / PartOpt / cpu / Forward |
0.000015863900016483968 s |
0.000014781880017835648 s |
1.07 |
cache / IPartOpt / cpu / Forward |
0.000015939359973344835 s |
0.000015402959925268077 s |
1.03 |
cache / DefOpt / cpu / Forward |
0.000015242140034388284 s |
0.000015190559988695895 s |
1.00 |
cache / IDefOpt / cpu / Forward |
0.000014601960019717808 s |
0.000015089259977685288 s |
0.97 |
cache / JaXPipe / cpu / PreRev |
0.000016125479996844662 s |
0.00001638670000829734 s |
0.98 |
cache / JaXPipe / cpu / PostRev |
0.000020882940043520647 s |
0.00002091621987347025 s |
1.00 |
cache / JaXPipe / cpu / BothRev |
0.00001707430000351451 s |
0.0000167550800506433 s |
1.02 |
cache / Jax / cpu / BothRev |
0.00002115883999067592 s |
0.000020225059906806563 s |
1.05 |
cache / HLOOpt / cpu / PreRev |
0.00001634177998312225 s |
0.000016973180026980118 s |
0.96 |
cache / HLOOpt / cpu / PostRev |
0.000019833300029858947 s |
0.00001933609994011931 s |
1.03 |
cache / HLOOpt / cpu / BothRev |
0.000015939240020088617 s |
0.00001763584001309937 s |
0.90 |
cache / PartOpt / cpu / PreRev |
0.00001715600003990403 s |
0.00001631077997444663 s |
1.05 |
cache / PartOpt / cpu / PostRev |
0.00002153580001504452 s |
0.00002172737995351781 s |
0.99 |
cache / PartOpt / cpu / BothRev |
0.000016333320036210353 s |
0.000016777699966041836 s |
0.97 |
cache / IPartOpt / cpu / PreRev |
0.000016299940007229452 s |
0.000016093579997686902 s |
1.01 |
cache / IPartOpt / cpu / PostRev |
0.000019703839961948687 s |
0.00002099820003422792 s |
0.94 |
cache / IPartOpt / cpu / BothRev |
0.000016726740050216904 s |
0.000016570839961786986 s |
1.01 |
cache / DefOpt / cpu / PreRev |
0.000015931619964248965 s |
0.00001723014000162948 s |
0.92 |
cache / DefOpt / cpu / PostRev |
0.00001641246000872343 s |
0.000017786020052881212 s |
0.92 |
cache / DefOpt / cpu / BothRev |
0.00001655479999499221 s |
0.00001701712002613931 s |
0.97 |
cache / IDefOpt / cpu / PreRev |
0.000016958959986368426 s |
0.000016709280007489724 s |
1.01 |
cache / IDefOpt / cpu / PostRev |
0.000017671479999989968 s |
0.000017414860012650023 s |
1.01 |
cache / IDefOpt / cpu / BothRev |
0.00001709716000732442 s |
0.000015970800013747067 s |
1.07 |
cache / JaXPipe / cuda / Primal |
0.000002304 s |
0.000002335 s |
0.99 |
cache / Jax / cuda / Primal |
0.000002304 s |
0.000002335 s |
0.99 |
cache / HLOOpt / cuda / Primal |
0.00000224 s |
0.000002335 s |
0.96 |
cache / PartOpt / cuda / Primal |
0.000002271 s |
0.000002335 s |
0.97 |
cache / IPartOpt / cuda / Primal |
0.000002335 s |
0.000002335 s |
1.00 |
cache / DefOpt / cuda / Primal |
0.00000224 s |
0.000002335 s |
0.96 |
cache / IDefOpt / cuda / Primal |
0.00000224 s |
0.000002335 s |
0.96 |
cache / JaXPipe / cuda / Forward |
0.000002336 s |
0.0000023670000000000004 s |
0.99 |
cache / Jax / cuda / Forward |
0.000002337 s |
0.000002336 s |
1.00 |
cache / HLOOpt / cuda / Forward |
0.000002335 s |
0.0000023670000000000004 s |
0.99 |
cache / PartOpt / cuda / Forward |
0.0000023670000000000004 s |
0.0000023670000000000004 s |
1.00 |
cache / IPartOpt / cuda / Forward |
0.000002335 s |
0.0000023670000000000004 s |
0.99 |
cache / DefOpt / cuda / Forward |
0.000002304 s |
0.0000023670000000000004 s |
0.97 |
cache / IDefOpt / cuda / Forward |
0.000002335 s |
0.0000023670000000000004 s |
0.99 |
cache / JaXPipe / cuda / PreRev |
0.000010912 s |
0.000011231 s |
0.97 |
cache / JaXPipe / cuda / PostRev |
0.000010528 s |
0.000010944 s |
0.96 |
cache / JaXPipe / cuda / BothRev |
0.000010848 s |
0.00001072 s |
1.01 |
cache / Jax / cuda / BothRev |
0.000010655 s |
0.000010687 s |
1.00 |
cache / HLOOpt / cuda / PreRev |
0.000013472 s |
0.000013696 s |
0.98 |
cache / HLOOpt / cuda / PostRev |
0.000013472 s |
0.0000136 s |
0.99 |
cache / HLOOpt / cuda / BothRev |
0.000013472 s |
0.000013663 s |
0.99 |
cache / PartOpt / cuda / PreRev |
0.000010752 s |
0.00001072 s |
1.00 |
cache / PartOpt / cuda / PostRev |
0.000010624 s |
0.000010688 s |
0.99 |
cache / PartOpt / cuda / BothRev |
0.000010688 s |
0.000010784 s |
0.99 |
cache / IPartOpt / cuda / PreRev |
0.000010304 s |
0.000010816 s |
0.95 |
cache / IPartOpt / cuda / PostRev |
0.00001024 s |
0.000010656 s |
0.96 |
cache / IPartOpt / cuda / BothRev |
0.000010464 s |
0.000010879 s |
0.96 |
cache / DefOpt / cuda / PreRev |
0.000010625 s |
0.000011615 s |
0.91 |
cache / DefOpt / cuda / PostRev |
0.000010624 s |
0.000010528 s |
1.01 |
cache / DefOpt / cuda / BothRev |
0.000010784 s |
0.000010592 s |
1.02 |
cache / IDefOpt / cuda / PreRev |
0.000010656 s |
0.000010624 s |
1.00 |
cache / IDefOpt / cuda / PostRev |
0.000010368 s |
0.000010912 s |
0.95 |
cache / IDefOpt / cuda / BothRev |
0.000010689 s |
0.000010752 s |
0.99 |
cache / JaXPipe / tpu / Primal |
0.000002455 s |
0.000002460175 s |
1.00 |
cache / Jax / tpu / Primal |
0.0000024693 s |
0.00000246785 s |
1.00 |
cache / HLOOpt / tpu / Primal |
0.000002452925 s |
0.00000248385 s |
0.99 |
cache / PartOpt / tpu / Primal |
0.00000246115 s |
0.0000024633 s |
1.00 |
cache / IPartOpt / tpu / Primal |
0.000002443225 s |
0.000002465975 s |
0.99 |
cache / DefOpt / tpu / Primal |
0.0000024657 s |
0.000002450575 s |
1.01 |
cache / IDefOpt / tpu / Primal |
0.000002446675 s |
0.0000024649 s |
0.99 |
cache / JaXPipe / tpu / Forward |
0.000003569475 s |
0.000003556525 s |
1.00 |
cache / Jax / tpu / Forward |
0.0000035581000000000003 s |
0.000003554925 s |
1.00 |
cache / HLOOpt / tpu / Forward |
0.000003582025 s |
0.000003565225 s |
1.00 |
cache / PartOpt / tpu / Forward |
0.00000354395 s |
0.00000356175 s |
1.00 |
cache / IPartOpt / tpu / Forward |
0.00000357635 s |
0.00000357025 s |
1.00 |
cache / DefOpt / tpu / Forward |
0.000003542025 s |
0.000003556825 s |
1.00 |
cache / IDefOpt / tpu / Forward |
0.000003582975 s |
0.000003585375 s |
1.00 |
cache / JaXPipe / tpu / PreRev |
0.0000049172 s |
0.000004975425000000001 s |
0.99 |
cache / JaXPipe / tpu / PostRev |
0.0000049464 s |
0.000004970025 s |
1.00 |
cache / JaXPipe / tpu / BothRev |
0.000004930325 s |
0.000004990425 s |
0.99 |
cache / Jax / tpu / BothRev |
0.00000496445 s |
0.000004974374999999999 s |
1.00 |
cache / HLOOpt / tpu / PreRev |
0.000003912475 s |
0.0000039449 s |
0.99 |
cache / HLOOpt / tpu / PostRev |
0.000004121975 s |
0.0000041285 s |
1.00 |
cache / HLOOpt / tpu / BothRev |
0.000003907375 s |
0.00000395235 s |
0.99 |
cache / PartOpt / tpu / PreRev |
0.0000049451 s |
0.000004977275 s |
0.99 |
cache / PartOpt / tpu / PostRev |
0.000004944425 s |
0.0000049768750000000005 s |
0.99 |
cache / PartOpt / tpu / BothRev |
0.000004951425 s |
0.000004991225 s |
0.99 |
cache / IPartOpt / tpu / PreRev |
0.0000049229 s |
0.00000500825 s |
0.98 |
cache / IPartOpt / tpu / PostRev |
0.000004953325 s |
0.00000496675 s |
1.00 |
cache / IPartOpt / tpu / BothRev |
0.000004935825 s |
0.0000049771 s |
0.99 |
cache / DefOpt / tpu / PreRev |
0.000004970175 s |
0.0000049878 s |
1.00 |
cache / DefOpt / tpu / PostRev |
0.0000049409 s |
0.0000049536 s |
1.00 |
cache / DefOpt / tpu / BothRev |
0.0000049459 s |
0.000004968224999999999 s |
1.00 |
cache / IDefOpt / tpu / PreRev |
0.000004943725 s |
0.0000049741 s |
0.99 |
cache / IDefOpt / tpu / PostRev |
0.0000049496 s |
0.000004959374999999999 s |
1.00 |
cache / IDefOpt / tpu / BothRev |
0.0000049101 s |
0.0000049691500000000005 s |
0.99 |
cache / JaXPipe / cpu / Primal |
0.000012616 s |
0.000006199420040502446 s |
2.04 |
cache / Jax / cpu / Primal |
0.000012149 s |
0.000006489560000773053 s |
1.87 |
cache / HLOOpt / cpu / Primal |
0.000012649 s |
0.0000061433001064870044 s |
2.06 |
cache / PartOpt / cpu / Primal |
0.000012414 s |
0.0000059924800007138405 s |
2.07 |
cache / IPartOpt / cpu / Primal |
0.00001277 s |
0.000006026699920766987 s |
2.12 |
cache / DefOpt / cpu / Primal |
0.000012523 s |
0.000006346639984258218 s |
1.97 |
cache / IDefOpt / cpu / Primal |
0.000012545 s |
0.000006013300044287462 s |
2.09 |
cache / JaXPipe / cpu / Forward |
0.000016778 s |
0.000014293200001702644 s |
1.17 |
cache / Jax / cpu / Forward |
0.000017183 s |
0.000018218660061393164 s |
0.94 |
cache / HLOOpt / cpu / Forward |
0.00001685 s |
0.000015171059985732426 s |
1.11 |
cache / PartOpt / cpu / Forward |
0.000017602 s |
0.000014781880017835648 s |
1.19 |
cache / IPartOpt / cpu / Forward |
0.000017054 s |
0.000015402959925268077 s |
1.11 |
cache / DefOpt / cpu / Forward |
0.000016979 s |
0.000015190559988695895 s |
1.12 |
cache / IDefOpt / cpu / Forward |
0.000016795 s |
0.000015089259977685288 s |
1.11 |
cache / JaXPipe / cpu / PreRev |
0.000017382 s |
0.00001638670000829734 s |
1.06 |
cache / JaXPipe / cpu / PostRev |
0.000020942 s |
0.00002091621987347025 s |
1.00 |
cache / JaXPipe / cpu / BothRev |
0.000017468999999999998 s |
0.0000167550800506433 s |
1.04 |
cache / Jax / cpu / BothRev |
0.000020361 s |
0.000020225059906806563 s |
1.01 |
cache / HLOOpt / cpu / PreRev |
0.000017488 s |
0.000016973180026980118 s |
1.03 |
cache / HLOOpt / cpu / PostRev |
0.000017558 s |
0.00001933609994011931 s |
0.91 |
cache / HLOOpt / cpu / BothRev |
0.000016881 s |
0.00001763584001309937 s |
0.96 |
cache / PartOpt / cpu / PreRev |
0.000018198 s |
0.00001631077997444663 s |
1.12 |
cache / PartOpt / cpu / PostRev |
0.000020167 s |
0.00002172737995351781 s |
0.93 |
cache / PartOpt / cpu / BothRev |
0.000017933000000000003 s |
0.000016777699966041836 s |
1.07 |
cache / IPartOpt / cpu / PreRev |
0.000017757 s |
0.000016093579997686902 s |
1.10 |
cache / IPartOpt / cpu / PostRev |
0.000020432 s |
0.00002099820003422792 s |
0.97 |
cache / IPartOpt / cpu / BothRev |
0.000017538 s |
0.000016570839961786986 s |
1.06 |
cache / DefOpt / cpu / PreRev |
0.000017137 s |
0.00001723014000162948 s |
0.99 |
cache / DefOpt / cpu / PostRev |
0.000017686 s |
0.000017786020052881212 s |
0.99 |
cache / DefOpt / cpu / BothRev |
0.000017124 s |
0.00001701712002613931 s |
1.01 |
cache / IDefOpt / cpu / PreRev |
0.000018087 s |
0.000016709280007489724 s |
1.08 |
cache / IDefOpt / cpu / PostRev |
0.000017704999999999997 s |
0.000017414860012650023 s |
1.02 |
cache / IDefOpt / cpu / BothRev |
0.000017607000000000003 s |
0.000015970800013747067 s |
1.10 |
Concat / JaXPipe / cpu / Primal |
0.000007328719975703279 s |
0.000006435699961002684 s |
1.14 |
Concat / Jax / cpu / Primal |
0.000006753079978807364 s |
0.000006501879979623481 s |
1.04 |
Concat / HLOOpt / cpu / Primal |
0.000006777620001230389 s |
0.000006443000111175934 s |
1.05 |
Concat / PartOpt / cpu / Primal |
0.000006483779970949399 s |
0.00000656270001854864 s |
0.99 |
Concat / IPartOpt / cpu / Primal |
0.000006480600022769068 s |
0.000006713340062560746 s |
0.97 |
Concat / DefOpt / cpu / Primal |
0.0000062979400172480384 s |
0.000006703879953420255 s |
0.94 |
Concat / IDefOpt / cpu / Primal |
0.000006705340001644799 s |
0.000006299739998212317 s |
1.06 |
Concat / JaXPipe / cpu / Forward |
0.000009972340003514543 s |
0.000009630539989302633 s |
1.04 |
Concat / Jax / cpu / Forward |
0.000009642540007916975 s |
0.000010102839951287024 s |
0.95 |
Concat / HLOOpt / cpu / Forward |
0.000010232760023427546 s |
0.000009807360020204217 s |
1.04 |
Concat / PartOpt / cpu / Forward |
0.00000935677998313622 s |
0.000009883120037557092 s |
0.95 |
Concat / IPartOpt / cpu / Forward |
0.000009250719949704944 s |
0.00001014442004816374 s |
0.91 |
Concat / DefOpt / cpu / Forward |
0.00000976240003183193 s |
0.000009600979974493384 s |
1.02 |
Concat / IDefOpt / cpu / Forward |
0.000010346760027459823 s |
0.000009490459960943554 s |
1.09 |
Concat / JaXPipe / cpu / PreRev |
0.000011961120026171556 s |
0.00001145785996413906 s |
1.04 |
Concat / JaXPipe / cpu / PostRev |
0.000011661260004984798 s |
0.00001061990007656277 s |
1.10 |
Concat / JaXPipe / cpu / BothRev |
0.000010954140034300508 s |
0.00001069060004738276 s |
1.02 |
Concat / Jax / cpu / BothRev |
0.000011529319990586374 s |
0.00001097654005207005 s |
1.05 |
Concat / HLOOpt / cpu / PreRev |
0.000011624299968389096 s |
0.000011447080014477251 s |
1.02 |
Concat / HLOOpt / cpu / PostRev |
0.000012840659956054878 s |
0.000013339120014279616 s |
0.96 |
Concat / HLOOpt / cpu / BothRev |
0.000010970800012728432 s |
0.000011410279857955175 s |
0.96 |
Concat / PartOpt / cpu / PreRev |
0.000010970119965350023 s |
0.000010918520038103452 s |
1.00 |
Concat / PartOpt / cpu / PostRev |
0.000011574259970075218 s |
0.000011513339977682337 s |
1.01 |
Concat / PartOpt / cpu / BothRev |
0.000011646959983409031 s |
0.000011210639950149926 s |
1.04 |
Concat / IPartOpt / cpu / PreRev |
0.000011494420004964922 s |
0.000011497539999254512 s |
1.00 |
Concat / IPartOpt / cpu / PostRev |
0.000011627539979599533 s |
0.00001105878000089433 s |
1.05 |
Concat / IPartOpt / cpu / BothRev |
0.00001141059999099525 s |
0.000011240539970458486 s |
1.02 |
Concat / DefOpt / cpu / PreRev |
0.000011296919992673792 s |
0.000011512620076246096 s |
0.98 |
Concat / DefOpt / cpu / PostRev |
0.000011199139980817564 s |
0.00001148805995399016 s |
0.97 |
Concat / DefOpt / cpu / BothRev |
0.000011264199983997968 s |
0.000010975040077028096 s |
1.03 |
Concat / IDefOpt / cpu / PreRev |
0.000011677459997372352 s |
0.000011794299989560388 s |
0.99 |
Concat / IDefOpt / cpu / PostRev |
0.000011167240045324431 s |
0.00001099843995689298 s |
1.02 |
Concat / IDefOpt / cpu / BothRev |
0.00001162450004812854 s |
0.00001133895997554646 s |
1.03 |
Concat / JaXPipe / cuda / Primal |
0.0000019200000000000003 s |
0.000002463 s |
0.78 |
Concat / Jax / cuda / Primal |
0.0000019200000000000003 s |
0.000002463 s |
0.78 |
Concat / HLOOpt / cuda / Primal |
0.0000019200000000000003 s |
0.000002463 s |
0.78 |
Concat / PartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.000002463 s |
0.78 |
Concat / IPartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.000002463 s |
0.78 |
Concat / DefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.000002464 s |
0.78 |
Concat / IDefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.000002463 s |
0.78 |
Concat / JaXPipe / cuda / Forward |
0.000009856 s |
0.000010688 s |
0.92 |
Concat / Jax / cuda / Forward |
0.00000992 s |
0.000010592 s |
0.94 |
Concat / HLOOpt / cuda / Forward |
0.00000992 s |
0.000010752 s |
0.92 |
Concat / PartOpt / cuda / Forward |
0.000009664 s |
0.000010592 s |
0.91 |
Concat / IPartOpt / cuda / Forward |
0.000009696 s |
0.000010528 s |
0.92 |
Concat / DefOpt / cuda / Forward |
0.000009856 s |
0.000010688 s |
0.92 |
Concat / IDefOpt / cuda / Forward |
0.000009824 s |
0.00001072 s |
0.92 |
Concat / JaXPipe / cuda / PreRev |
0.00001664 s |
0.000017216 s |
0.97 |
Concat / JaXPipe / cuda / PostRev |
0.000015904000000000002 s |
0.000016896000000000002 s |
0.94 |
Concat / JaXPipe / cuda / BothRev |
0.000016609 s |
0.000017152 s |
0.97 |
Concat / Jax / cuda / BothRev |
0.000016192 s |
0.000017088 s |
0.95 |
Concat / HLOOpt / cuda / PreRev |
0.000016 s |
0.000016864 s |
0.95 |
Concat / HLOOpt / cuda / PostRev |
0.000015904000000000002 s |
0.00001728 s |
0.92 |
Concat / HLOOpt / cuda / BothRev |
0.00001664 s |
0.000017247999999999998 s |
0.96 |
Concat / PartOpt / cuda / PreRev |
0.000016224 s |
0.000017088 s |
0.95 |
Concat / PartOpt / cuda / PostRev |
0.000016255999999999998 s |
0.000017215 s |
0.94 |
Concat / PartOpt / cuda / BothRev |
0.000016383999999999998 s |
0.000017152 s |
0.96 |
Concat / IPartOpt / cuda / PreRev |
0.000015616 s |
0.000017024 s |
0.92 |
Concat / IPartOpt / cuda / PostRev |
0.000015935999999999998 s |
0.000016768000000000003 s |
0.95 |
Concat / IPartOpt / cuda / BothRev |
0.000016128 s |
0.000016768999999999998 s |
0.96 |
Concat / DefOpt / cuda / PreRev |
0.000016128 s |
0.00001728 s |
0.93 |
Concat / DefOpt / cuda / PostRev |
0.000015872 s |
0.00001712 s |
0.93 |
Concat / DefOpt / cuda / BothRev |
0.000016096 s |
0.000016927999999999998 s |
0.95 |
Concat / IDefOpt / cuda / PreRev |
0.000016544 s |
0.0000176 s |
0.94 |
Concat / IDefOpt / cuda / PostRev |
0.000016193 s |
0.000016831 s |
0.96 |
Concat / IDefOpt / cuda / BothRev |
0.000016768000000000003 s |
0.000017152 s |
0.98 |
Concat / JaXPipe / tpu / Primal |
0.000001562625 s |
0.00000152845 s |
1.02 |
Concat / Jax / tpu / Primal |
0.000001532775 s |
0.000001536025 s |
1.00 |
Concat / HLOOpt / tpu / Primal |
0.000001561825 s |
0.000001532375 s |
1.02 |
Concat / PartOpt / tpu / Primal |
0.00000154775 s |
0.0000015328249999999998 s |
1.01 |
Concat / IPartOpt / tpu / Primal |
0.0000015866250000000002 s |
0.00000152755 s |
1.04 |
Concat / DefOpt / tpu / Primal |
0.0000015294749999999997 s |
0.000001544 s |
0.99 |
Concat / IDefOpt / tpu / Primal |
0.000001582825 s |
0.00000153295 s |
1.03 |
Concat / JaXPipe / tpu / Forward |
0.0000015977250000000002 s |
0.000001575425 s |
1.01 |
Concat / Jax / tpu / Forward |
0.000001606175 s |
0.0000015991 s |
1.00 |
Concat / HLOOpt / tpu / Forward |
0.0000015998 s |
0.00000158675 s |
1.01 |
Concat / PartOpt / tpu / Forward |
0.00000159675 s |
0.0000015972500000000005 s |
1.00 |
Concat / IPartOpt / tpu / Forward |
0.0000016027250000000002 s |
0.000001587375 s |
1.01 |
Concat / DefOpt / tpu / Forward |
0.00000159355 s |
0.00000161155 s |
0.99 |
Concat / IDefOpt / tpu / Forward |
0.0000015889000000000002 s |
0.0000015929 s |
1.00 |
Concat / JaXPipe / tpu / PreRev |
0.0000020494250000000003 s |
0.000002013 s |
1.02 |
Concat / JaXPipe / tpu / PostRev |
0.000002097125 s |
0.0000020712 s |
1.01 |
Concat / JaXPipe / tpu / BothRev |
0.000002048475 s |
0.00000201835 s |
1.01 |
Concat / Jax / tpu / BothRev |
0.0000020887 s |
0.000002055325 s |
1.02 |
Concat / HLOOpt / tpu / PreRev |
0.0000020600750000000003 s |
0.000002012675 s |
1.02 |
Concat / HLOOpt / tpu / PostRev |
0.0000020822750000000003 s |
0.000002068325 s |
1.01 |
Concat / HLOOpt / tpu / BothRev |
0.0000020450250000000004 s |
0.000002007525 s |
1.02 |
Concat / PartOpt / tpu / PreRev |
0.000002083225 s |
0.000002057425 s |
1.01 |
Concat / PartOpt / tpu / PostRev |
0.000002051725 s |
0.0000020110000000000003 s |
1.02 |
Concat / PartOpt / tpu / BothRev |
0.000002091 s |
0.0000020543000000000004 s |
1.02 |
Concat / IPartOpt / tpu / PreRev |
0.000002054525 s |
0.00000201685 s |
1.02 |
Concat / IPartOpt / tpu / PostRev |
0.000002087925 s |
0.0000020674 s |
1.01 |
Concat / IPartOpt / tpu / BothRev |
0.0000020478750000000003 s |
0.00000201065 s |
1.02 |
Concat / DefOpt / tpu / PreRev |
0.0000020834250000000003 s |
0.000002056325 s |
1.01 |
Concat / DefOpt / tpu / PostRev |
0.0000020520250000000005 s |
0.0000020085 s |
1.02 |
Concat / DefOpt / tpu / BothRev |
0.000002081125 s |
0.000002056375 s |
1.01 |
Concat / IDefOpt / tpu / PreRev |
0.000002048175 s |
0.000002020325 s |
1.01 |
Concat / IDefOpt / tpu / PostRev |
0.00000208605 s |
0.000002062825 s |
1.01 |
Concat / IDefOpt / tpu / BothRev |
0.0000020479 s |
0.000002019675 s |
1.01 |
Concat / JaXPipe / cpu / Primal |
0.000012551 s |
0.000006435699961002684 s |
1.95 |
Concat / Jax / cpu / Primal |
0.000012694 s |
0.000006501879979623481 s |
1.95 |
Concat / HLOOpt / cpu / Primal |
0.000012536 s |
0.000006443000111175934 s |
1.95 |
Concat / PartOpt / cpu / Primal |
0.00001292 s |
0.00000656270001854864 s |
1.97 |
Concat / IPartOpt / cpu / Primal |
0.000012396 s |
0.000006713340062560746 s |
1.85 |
Concat / DefOpt / cpu / Primal |
0.000012933 s |
0.000006703879953420255 s |
1.93 |
Concat / IDefOpt / cpu / Primal |
0.000012514 s |
0.000006299739998212317 s |
1.99 |
Concat / JaXPipe / cpu / Forward |
0.000017378999999999997 s |
0.000009630539989302633 s |
1.80 |
Concat / Jax / cpu / Forward |
0.000017204 s |
0.000010102839951287024 s |
1.70 |
Concat / HLOOpt / cpu / Forward |
0.000017114 s |
0.000009807360020204217 s |
1.75 |
Concat / PartOpt / cpu / Forward |
0.000017524 s |
0.000009883120037557092 s |
1.77 |
Concat / IPartOpt / cpu / Forward |
0.000017131 s |
0.00001014442004816374 s |
1.69 |
Concat / DefOpt / cpu / Forward |
0.000017233 s |
0.000009600979974493384 s |
1.79 |
Concat / IDefOpt / cpu / Forward |
0.000017311 s |
0.000009490459960943554 s |
1.82 |
Concat / JaXPipe / cpu / PreRev |
0.000019797 s |
0.00001145785996413906 s |
1.73 |
Concat / JaXPipe / cpu / PostRev |
0.000018857 s |
0.00001061990007656277 s |
1.78 |
Concat / JaXPipe / cpu / BothRev |
0.000019556 s |
0.00001069060004738276 s |
1.83 |
Concat / Jax / cpu / BothRev |
0.000019735 s |
0.00001097654005207005 s |
1.80 |
Concat / HLOOpt / cpu / PreRev |
0.000019711 s |
0.000011447080014477251 s |
1.72 |
Concat / HLOOpt / cpu / PostRev |
0.0000194 s |
0.000013339120014279616 s |
1.45 |
Concat / HLOOpt / cpu / BothRev |
0.000019434 s |
0.000011410279857955175 s |
1.70 |
Concat / PartOpt / cpu / PreRev |
0.000019716 s |
0.000010918520038103452 s |
1.81 |
Concat / PartOpt / cpu / PostRev |
0.000019388 s |
0.000011513339977682337 s |
1.68 |
Concat / PartOpt / cpu / BothRev |
0.000019106000000000003 s |
0.000011210639950149926 s |
1.70 |
Concat / IPartOpt / cpu / PreRev |
0.000019581 s |
0.000011497539999254512 s |
1.70 |
Concat / IPartOpt / cpu / PostRev |
0.000019156 s |
0.00001105878000089433 s |
1.73 |
Concat / IPartOpt / cpu / BothRev |
0.000019254 s |
0.000011240539970458486 s |
1.71 |
Concat / DefOpt / cpu / PreRev |
0.000019398 s |
0.000011512620076246096 s |
1.68 |
Concat / DefOpt / cpu / PostRev |
0.000019791 s |
0.00001148805995399016 s |
1.72 |
Concat / DefOpt / cpu / BothRev |
0.000019103 s |
0.000010975040077028096 s |
1.74 |
Concat / IDefOpt / cpu / PreRev |
0.000019564 s |
0.000011794299989560388 s |
1.66 |
Concat / IDefOpt / cpu / PostRev |
0.00001938 s |
0.00001099843995689298 s |
1.76 |
Concat / IDefOpt / cpu / BothRev |
0.000019074 s |
0.00001133895997554646 s |
1.68 |
const_scatter / JaXPipe / cpu / Primal |
0.0000063139999747363615 s |
0.000006160580014693551 s |
1.02 |
const_scatter / Jax / cpu / Primal |
0.000006549839972649352 s |
0.0000061382400235743264 s |
1.07 |
const_scatter / HLOOpt / cpu / Primal |
0.000006749499998477404 s |
0.000007276999967871234 s |
0.93 |
const_scatter / PartOpt / cpu / Primal |
0.000006794859982619527 s |
0.000006049840012565256 s |
1.12 |
const_scatter / IPartOpt / cpu / Primal |
0.000006164379983601975 s |
0.000006259759975364432 s |
0.98 |
const_scatter / DefOpt / cpu / Primal |
0.000007027399951766711 s |
0.000007348239942075452 s |
0.96 |
const_scatter / IDefOpt / cpu / Primal |
0.000006658000020252075 s |
0.00000687206003931351 s |
0.97 |
const_scatter / JaXPipe / cpu / Forward |
0.000010415480046503944 s |
0.000010107159941981082 s |
1.03 |
const_scatter / Jax / cpu / Forward |
0.000008943600014390541 s |
0.000008885880015441217 s |
1.01 |
const_scatter / HLOOpt / cpu / Forward |
0.000011069639986089895 s |
0.000010267419947922462 s |
1.08 |
const_scatter / PartOpt / cpu / Forward |
0.000010242500011372612 s |
0.0000099613599559234 s |
1.03 |
const_scatter / IPartOpt / cpu / Forward |
0.000011293380002825873 s |
0.000010300219983037096 s |
1.10 |
const_scatter / DefOpt / cpu / Forward |
0.000010829319990079966 s |
0.000009794319976208498 s |
1.11 |
const_scatter / IDefOpt / cpu / Forward |
0.00001050596000823134 s |
0.00001040247994751553 s |
1.01 |
const_scatter / JaXPipe / cpu / PreRev |
0.0002898678199744 s |
0.0002957961599895 s |
0.98 |
const_scatter / JaXPipe / cpu / PostRev |
0.0002833398800339 s |
0.0002822270399155 s |
1.00 |
const_scatter / JaXPipe / cpu / BothRev |
0.0002852521600289 s |
0.0002834643999995 s |
1.01 |
const_scatter / Jax / cpu / BothRev |
0.0002840498600744 s |
0.0002829939000366 s |
1.00 |
const_scatter / HLOOpt / cpu / PreRev |
0.0002841759799866 s |
0.0002844214600008 s |
1.00 |
const_scatter / HLOOpt / cpu / PostRev |
0.0002875408599538 s |
0.0002877555599661 s |
1.00 |
const_scatter / HLOOpt / cpu / BothRev |
0.0002824972599864 s |
0.0002840586799902 s |
0.99 |
const_scatter / PartOpt / cpu / PreRev |
0.0002844311199896 s |
0.0002818456199747 s |
1.01 |
const_scatter / PartOpt / cpu / PostRev |
0.0002828918600152 s |
0.0002807304599446 s |
1.01 |
const_scatter / PartOpt / cpu / BothRev |
0.0002827749399602 s |
0.0002830758599884 s |
1.00 |
const_scatter / IPartOpt / cpu / PreRev |
0.0002840008200109 s |
0.000283077799977 s |
1.00 |
const_scatter / IPartOpt / cpu / PostRev |
0.0002823651200196 s |
0.0002849672000593 s |
0.99 |
const_scatter / IPartOpt / cpu / BothRev |
0.0002854526600003 s |
0.0002838028199948 s |
1.01 |
const_scatter / DefOpt / cpu / PreRev |
0.0002820905799762 s |
0.0002814923199548 s |
1.00 |
const_scatter / DefOpt / cpu / PostRev |
0.0002844695799285 s |
0.0002810001200305 s |
1.01 |
const_scatter / DefOpt / cpu / BothRev |
0.0002851977999762 s |
0.000283605760087 s |
1.01 |
const_scatter / IDefOpt / cpu / PreRev |
0.0002849493600206 s |
0.0002903664798395 s |
0.98 |
const_scatter / IDefOpt / cpu / PostRev |
0.0002909237799576 s |
0.0002862955600539 s |
1.02 |
const_scatter / IDefOpt / cpu / BothRev |
0.0002838612199684 s |
0.0002827165999951 s |
1.00 |
const_scatter / JaXPipe / cuda / Primal |
0.000001887 s |
0.000002463 s |
0.77 |
const_scatter / Jax / cuda / Primal |
0.000001887 s |
0.000002464 s |
0.77 |
const_scatter / HLOOpt / cuda / Primal |
0.000001887 s |
0.000002463 s |
0.77 |
const_scatter / PartOpt / cuda / Primal |
0.000001887 s |
0.000002464 s |
0.77 |
const_scatter / IPartOpt / cuda / Primal |
0.000001887 s |
0.000002463 s |
0.77 |
const_scatter / DefOpt / cuda / Primal |
0.000001887 s |
0.000002463 s |
0.77 |
const_scatter / IDefOpt / cuda / Primal |
0.000001887 s |
0.000002463 s |
0.77 |
const_scatter / JaXPipe / cuda / Forward |
0.000009472 s |
0.00001056 s |
0.90 |
const_scatter / Jax / cuda / Forward |
0.000009728 s |
0.00001056 s |
0.92 |
const_scatter / HLOOpt / cuda / Forward |
0.000009856 s |
0.000010496 s |
0.94 |
const_scatter / PartOpt / cuda / Forward |
0.000010015 s |
0.00001056 s |
0.95 |
const_scatter / IPartOpt / cuda / Forward |
0.0000096 s |
0.000010624 s |
0.90 |
const_scatter / DefOpt / cuda / Forward |
0.000009792 s |
0.000010687 s |
0.92 |
const_scatter / IDefOpt / cuda / Forward |
0.000010048 s |
0.000010879 s |
0.92 |
const_scatter / JaXPipe / cuda / PreRev |
0.000016704 s |
0.00001664 s |
1.00 |
const_scatter / JaXPipe / cuda / PostRev |
0.000016192 s |
0.000017184 s |
0.94 |
const_scatter / JaXPipe / cuda / BothRev |
0.000016576000000000002 s |
0.000017375999999999998 s |
0.95 |
const_scatter / Jax / cuda / BothRev |
0.00001712 s |
0.000017695 s |
0.97 |
const_scatter / HLOOpt / cuda / PreRev |
0.000016736 s |
0.000016513 s |
1.01 |
const_scatter / HLOOpt / cuda / PostRev |
0.000015904000000000002 s |
0.00001696 s |
0.94 |
const_scatter / HLOOpt / cuda / BothRev |
0.000015935999999999998 s |
0.000017375000000000002 s |
0.92 |
const_scatter / PartOpt / cuda / PreRev |
0.000016672 s |
0.000017375999999999998 s |
0.96 |
const_scatter / PartOpt / cuda / PostRev |
0.000016542999999999997 s |
0.0000168 s |
0.98 |
const_scatter / PartOpt / cuda / BothRev |
0.000016352 s |
0.000017152 s |
0.95 |
const_scatter / IPartOpt / cuda / PreRev |
0.00001584 s |
0.000017056 s |
0.93 |
const_scatter / IPartOpt / cuda / PostRev |
0.000015935999999999998 s |
0.000016927999999999998 s |
0.94 |
const_scatter / IPartOpt / cuda / BothRev |
0.000016224 s |
0.000016416 s |
0.99 |
const_scatter / DefOpt / cuda / PreRev |
0.000016576000000000002 s |
0.000017183 s |
0.96 |
const_scatter / DefOpt / cuda / PostRev |
0.000024704 s |
0.000016736 s |
1.48 |
const_scatter / DefOpt / cuda / BothRev |
0.000016736 s |
0.000016768000000000003 s |
1.00 |
const_scatter / IDefOpt / cuda / PreRev |
0.000017056 s |
0.000016224 s |
1.05 |
const_scatter / IDefOpt / cuda / PostRev |
0.000017183 s |
0.000016575 s |
1.04 |
const_scatter / IDefOpt / cuda / BothRev |
0.000016608 s |
0.000016864 s |
0.98 |
const_scatter / JaXPipe / tpu / Primal |
0.0000038059 s |
0.00000379245 s |
1.00 |
const_scatter / Jax / tpu / Primal |
0.000003796925 s |
0.000003796425 s |
1.00 |
const_scatter / HLOOpt / tpu / Primal |
0.0000037845 s |
0.000003792775 s |
1.00 |
const_scatter / PartOpt / tpu / Primal |
0.000003786 s |
0.00000380505 s |
0.99 |
const_scatter / IPartOpt / tpu / Primal |
0.000003822725000000001 s |
0.00000382235 s |
1.00 |
const_scatter / DefOpt / tpu / Primal |
0.0000037818 s |
0.000003791025 s |
1.00 |
const_scatter / IDefOpt / tpu / Primal |
0.000003814125 s |
0.000003789375 s |
1.01 |
const_scatter / JaXPipe / tpu / Forward |
0.000006452325 s |
0.0000064895 s |
0.99 |
const_scatter / Jax / tpu / Forward |
0.0000064509750000000005 s |
0.00000645865 s |
1.00 |
const_scatter / HLOOpt / tpu / Forward |
0.000006459125 s |
0.000006481249999999999 s |
1.00 |
const_scatter / PartOpt / tpu / Forward |
0.000006440325 s |
0.000006462725 s |
1.00 |
const_scatter / IPartOpt / tpu / Forward |
0.000006457475 s |
0.000006488175000000001 s |
1.00 |
const_scatter / DefOpt / tpu / Forward |
0.000006454100000000001 s |
0.0000064513 s |
1.00 |
const_scatter / IDefOpt / tpu / Forward |
0.0000064695250000000005 s |
0.000006512999999999999 s |
0.99 |
const_scatter / JaXPipe / tpu / PreRev |
0.0000065497 s |
0.00000668195 s |
0.98 |
const_scatter / JaXPipe / tpu / PostRev |
0.000006569675000000001 s |
0.000006673599999999999 s |
0.98 |
const_scatter / JaXPipe / tpu / BothRev |
0.000006571825 s |
0.000006703925 s |
0.98 |
const_scatter / Jax / tpu / BothRev |
0.0000065927250000000005 s |
0.0000066635 s |
0.99 |
const_scatter / HLOOpt / tpu / PreRev |
0.000006540425 s |
0.00000667115 s |
0.98 |
const_scatter / HLOOpt / tpu / PostRev |
0.000006565 s |
0.000006664799999999999 s |
0.99 |
const_scatter / HLOOpt / tpu / BothRev |
0.00000655205 s |
0.000006675999999999999 s |
0.98 |
const_scatter / PartOpt / tpu / PreRev |
0.00000658395 s |
0.0000066796750000000006 s |
0.99 |
const_scatter / PartOpt / tpu / PostRev |
0.000006569625 s |
0.0000066963 s |
0.98 |
const_scatter / PartOpt / tpu / BothRev |
0.000006563575 s |
0.0000066842 s |
0.98 |
const_scatter / IPartOpt / tpu / PreRev |
0.000006585675 s |
0.000006679924999999999 s |
0.99 |
const_scatter / IPartOpt / tpu / PostRev |
0.0000065674 s |
0.00000668675 s |
0.98 |
const_scatter / IPartOpt / tpu / BothRev |
0.00000657165 s |
0.0000066709 s |
0.99 |
const_scatter / DefOpt / tpu / PreRev |
0.000006596225 s |
0.00000668525 s |
0.99 |
const_scatter / DefOpt / tpu / PostRev |
0.000006538850000000001 s |
0.000006678425 s |
0.98 |
const_scatter / DefOpt / tpu / BothRev |
0.00000657835 s |
0.00000669215 s |
0.98 |
const_scatter / IDefOpt / tpu / PreRev |
0.0000065547 s |
0.0000066717 s |
0.98 |
const_scatter / IDefOpt / tpu / PostRev |
0.000006583675 s |
0.00000667215 s |
0.99 |
const_scatter / IDefOpt / tpu / BothRev |
0.0000065594 s |
0.00000666985 s |
0.98 |
const_scatter / JaXPipe / cpu / Primal |
0.000012608 s |
0.000006160580014693551 s |
2.05 |
const_scatter / Jax / cpu / Primal |
0.000012463 s |
0.0000061382400235743264 s |
2.03 |
const_scatter / HLOOpt / cpu / Primal |
0.000013215 s |
0.000007276999967871234 s |
1.82 |
const_scatter / PartOpt / cpu / Primal |
0.000012637 s |
0.000006049840012565256 s |
2.09 |
const_scatter / IPartOpt / cpu / Primal |
0.00001257 s |
0.000006259759975364432 s |
2.01 |
const_scatter / DefOpt / cpu / Primal |
0.000013358 s |
0.000007348239942075452 s |
1.82 |
const_scatter / IDefOpt / cpu / Primal |
0.00001325 s |
0.00000687206003931351 s |
1.93 |
const_scatter / JaXPipe / cpu / Forward |
0.000018008 s |
0.000010107159941981082 s |
1.78 |
const_scatter / Jax / cpu / Forward |
0.000016551 s |
0.000008885880015441217 s |
1.86 |
const_scatter / HLOOpt / cpu / Forward |
0.000017665 s |
0.000010267419947922462 s |
1.72 |
const_scatter / PartOpt / cpu / Forward |
0.000017836 s |
0.0000099613599559234 s |
1.79 |
const_scatter / IPartOpt / cpu / Forward |
0.000017473 s |
0.000010300219983037096 s |
1.70 |
const_scatter / DefOpt / cpu / Forward |
0.000017863 s |
0.000009794319976208498 s |
1.82 |
const_scatter / IDefOpt / cpu / Forward |
0.000017785 s |
0.00001040247994751553 s |
1.71 |
const_scatter / JaXPipe / cpu / PreRev |
0.000493533 s |
0.0002957961599895 s |
1.67 |
const_scatter / JaXPipe / cpu / PostRev |
0.000516068 s |
0.0002822270399155 s |
1.83 |
const_scatter / JaXPipe / cpu / BothRev |
0.000494024 s |
0.0002834643999995 s |
1.74 |
const_scatter / Jax / cpu / BothRev |
0.000495967 s |
0.0002829939000366 s |
1.75 |
const_scatter / HLOOpt / cpu / PreRev |
0.0005052149999999 s |
0.0002844214600008 s |
1.78 |
const_scatter / HLOOpt / cpu / PostRev |
0.000492137 s |
0.0002877555599661 s |
1.71 |
const_scatter / HLOOpt / cpu / BothRev |
0.000495821 s |
0.0002840586799902 s |
1.75 |
const_scatter / PartOpt / cpu / PreRev |
0.000490004 s |
0.0002818456199747 s |
1.74 |
const_scatter / PartOpt / cpu / PostRev |
0.000512668 s |
0.0002807304599446 s |
1.83 |
const_scatter / PartOpt / cpu / BothRev |
0.000495419 s |
0.0002830758599884 s |
1.75 |
const_scatter / IPartOpt / cpu / PreRev |
0.000491679 s |
0.000283077799977 s |
1.74 |
const_scatter / IPartOpt / cpu / PostRev |
0.000505893 s |
0.0002849672000593 s |
1.78 |
const_scatter / IPartOpt / cpu / BothRev |
0.00052582 s |
0.0002838028199948 s |
1.85 |
const_scatter / DefOpt / cpu / PreRev |
0.000516704 s |
0.0002814923199548 s |
1.84 |
const_scatter / DefOpt / cpu / PostRev |
0.000516289 s |
0.0002810001200305 s |
1.84 |
const_scatter / DefOpt / cpu / BothRev |
0.000505551 s |
0.000283605760087 s |
1.78 |
const_scatter / IDefOpt / cpu / PreRev |
0.000529566 s |
0.0002903664798395 s |
1.82 |
const_scatter / IDefOpt / cpu / PostRev |
0.000507612 s |
0.0002862955600539 s |
1.77 |
const_scatter / IDefOpt / cpu / BothRev |
0.0005055389999999 s |
0.0002827165999951 s |
1.79 |
GenDot / JaXPipe / cpu / Primal |
0.000007588860016767285 s |
0.000007377620004263008 s |
1.03 |
GenDot / Jax / cpu / Primal |
0.000006582260020877584 s |
0.000007222200038086157 s |
0.91 |
GenDot / HLOOpt / cpu / Primal |
0.000007507099971917341 s |
0.000007422399994538864 s |
1.01 |
GenDot / PartOpt / cpu / Primal |
0.000006861680003567017 s |
0.000006913960078236414 s |
0.99 |
GenDot / IPartOpt / cpu / Primal |
0.00000700878003044636 s |
0.0000068490199737425424 s |
1.02 |
GenDot / DefOpt / cpu / Primal |
0.000007734579949101317 s |
0.000007118580106180161 s |
1.09 |
GenDot / IDefOpt / cpu / Primal |
0.000007254859983731876 s |
0.000007142399899748853 s |
1.02 |
GenDot / JaXPipe / cpu / Forward |
0.00001094669998565223 s |
0.000010665599984349682 s |
1.03 |
GenDot / Jax / cpu / Forward |
0.00001037962000737025 s |
0.000010504599977139153 s |
0.99 |
GenDot / HLOOpt / cpu / Forward |
0.000010719319998315769 s |
0.000011046119925595122 s |
0.97 |
GenDot / PartOpt / cpu / Forward |
0.00001085674000933068 s |
0.00001006010003038682 s |
1.08 |
GenDot / IPartOpt / cpu / Forward |
0.000011309839992463822 s |
0.00001120744003856089 s |
1.01 |
GenDot / DefOpt / cpu / Forward |
0.00001015621998703864 s |
0.000010274079904775135 s |
0.99 |
GenDot / IDefOpt / cpu / Forward |
0.000010401079989605931 s |
0.000010486480096005835 s |
0.99 |
GenDot / JaXPipe / cpu / PreRev |
0.00001138493999860657 s |
0.000010926600007223896 s |
1.04 |
GenDot / JaXPipe / cpu / PostRev |
0.000009626620021663256 s |
0.00001013830002193572 s |
0.95 |
GenDot / JaXPipe / cpu / BothRev |
0.000011501980006869416 s |
0.000010925779952231096 s |
1.05 |
GenDot / Jax / cpu / BothRev |
0.00001076947995898081 s |
0.000009941319967765594 s |
1.08 |
GenDot / HLOOpt / cpu / PreRev |
0.000011957280003116466 s |
0.000011691660110955128 s |
1.02 |
GenDot / HLOOpt / cpu / PostRev |
0.00001259118004782067 s |
0.000012876239925390107 s |
0.98 |
GenDot / HLOOpt / cpu / BothRev |
0.000011168000010002287 s |
0.00001098990001992206 s |
1.02 |
GenDot / PartOpt / cpu / PreRev |
0.000011043379936381826 s |
0.000010831399958988184 s |
1.02 |
GenDot / PartOpt / cpu / PostRev |
0.00001019813995299046 s |
0.000010444540021126158 s |
0.98 |
GenDot / PartOpt / cpu / BothRev |
0.00001146243999755825 s |
0.000011348319931130391 s |
1.01 |
GenDot / IPartOpt / cpu / PreRev |
0.0000111389199810219 s |
0.000010319640023226383 s |
1.08 |
GenDot / IPartOpt / cpu / PostRev |
0.00001002130000415491 s |
0.000010269659978803248 s |
0.98 |
GenDot / IPartOpt / cpu / BothRev |
0.000011088120027125117 s |
0.000011077979925175896 s |
1.00 |
GenDot / DefOpt / cpu / PreRev |
0.000011281059996690602 s |
0.000010820759998750872 s |
1.04 |
GenDot / DefOpt / cpu / PostRev |
0.000011060139977416838 s |
0.000010590620040602515 s |
1.04 |
GenDot / DefOpt / cpu / BothRev |
0.00001100796002901916 s |
0.000010781660021166318 s |
1.02 |
GenDot / IDefOpt / cpu / PreRev |
0.000011087160019087605 s |
0.00001114787997721578 s |
0.99 |
GenDot / IDefOpt / cpu / PostRev |
0.000010777519983093952 s |
0.000011181120007677236 s |
0.96 |
GenDot / IDefOpt / cpu / BothRev |
0.000010418959964226816 s |
0.000010971699994115623 s |
0.95 |
GenDot / JaXPipe / cuda / Primal |
0.000002015 s |
0.000002528 s |
0.80 |
GenDot / Jax / cuda / Primal |
0.000002015 s |
0.000002528 s |
0.80 |
GenDot / HLOOpt / cuda / Primal |
0.000001984 s |
0.000002527 s |
0.79 |
GenDot / PartOpt / cuda / Primal |
0.000002015 s |
0.00000256 s |
0.79 |
GenDot / IPartOpt / cuda / Primal |
0.000002015 s |
0.00000256 s |
0.79 |
GenDot / DefOpt / cuda / Primal |
0.000001984 s |
0.000002528 s |
0.78 |
GenDot / IDefOpt / cuda / Primal |
0.000001984 s |
0.000002528 s |
0.78 |
GenDot / JaXPipe / cuda / Forward |
0.000010112 s |
0.00001088 s |
0.93 |
GenDot / Jax / cuda / Forward |
0.000009952 s |
0.000011776 s |
0.85 |
GenDot / HLOOpt / cuda / Forward |
0.000010144 s |
0.000011872 s |
0.85 |
GenDot / PartOpt / cuda / Forward |
0.000010305 s |
0.000010784 s |
0.96 |
GenDot / IPartOpt / cuda / Forward |
0.00000992 s |
0.000010816 s |
0.92 |
GenDot / DefOpt / cuda / Forward |
0.000010208 s |
0.000010656 s |
0.96 |
GenDot / IDefOpt / cuda / Forward |
0.00000992 s |
0.000010624 s |
0.93 |
GenDot / JaXPipe / cuda / PreRev |
0.000009888 s |
0.00001072 s |
0.92 |
GenDot / JaXPipe / cuda / PostRev |
0.00001024 s |
0.000010656 s |
0.96 |
GenDot / JaXPipe / cuda / BothRev |
0.000010112 s |
0.00001104 s |
0.92 |
GenDot / Jax / cuda / BothRev |
0.000010368 s |
0.000011039 s |
0.94 |
GenDot / HLOOpt / cuda / PreRev |
0.000010176 s |
0.000010753 s |
0.95 |
GenDot / HLOOpt / cuda / PostRev |
0.000010144 s |
0.000010753 s |
0.94 |
GenDot / HLOOpt / cuda / BothRev |
0.000009984 s |
0.000010752 s |
0.93 |
GenDot / PartOpt / cuda / PreRev |
0.000010112 s |
0.000010912 s |
0.93 |
GenDot / PartOpt / cuda / PostRev |
0.000011136 s |
0.000010816 s |
1.03 |
GenDot / PartOpt / cuda / BothRev |
0.000014944 s |
0.000010752 s |
1.39 |
GenDot / IPartOpt / cuda / PreRev |
0.000010048 s |
0.000010656 s |
0.94 |
GenDot / IPartOpt / cuda / PostRev |
0.0000112 s |
0.000010912 s |
1.03 |
GenDot / IPartOpt / cuda / BothRev |
0.00001024 s |
0.000010688 s |
0.96 |
GenDot / DefOpt / cuda / PreRev |
0.000010144 s |
0.00001104 s |
0.92 |
GenDot / DefOpt / cuda / PostRev |
0.000010144 s |
0.00001072 s |
0.95 |
GenDot / DefOpt / cuda / BothRev |
0.000010209 s |
0.000010784 s |
0.95 |
GenDot / IDefOpt / cuda / PreRev |
0.000010368 s |
0.000010783 s |
0.96 |
GenDot / IDefOpt / cuda / PostRev |
0.000010143 s |
0.000010752 s |
0.94 |
GenDot / IDefOpt / cuda / BothRev |
0.000010016 s |
0.000014816 s |
0.68 |
GenDot / JaXPipe / tpu / Primal |
9.30575e-7 s |
9.20625e-7 s |
1.01 |
GenDot / Jax / tpu / Primal |
9.2535e-7 s |
9.299e-7 s |
1.00 |
GenDot / HLOOpt / tpu / Primal |
0.0000015741 s |
0.0000016143 s |
0.98 |
GenDot / PartOpt / tpu / Primal |
9.255e-7 s |
9.30425e-7 s |
0.99 |
GenDot / IPartOpt / tpu / Primal |
9.30125e-7 s |
9.770499999999998e-7 s |
0.95 |
GenDot / DefOpt / tpu / Primal |
0.000001491625 s |
0.000001497 s |
1.00 |
GenDot / IDefOpt / tpu / Primal |
0.0000015746499999999998 s |
0.0000016151 s |
0.97 |
GenDot / JaXPipe / tpu / Forward |
0.000003169225 s |
0.000003060275 s |
1.04 |
GenDot / Jax / tpu / Forward |
0.000002325725 s |
0.00000231795 s |
1.00 |
GenDot / HLOOpt / tpu / Forward |
0.0000031247000000000003 s |
0.0000031134 s |
1.00 |
GenDot / PartOpt / tpu / Forward |
0.00000322485 s |
0.000003119825 s |
1.03 |
GenDot / IPartOpt / tpu / Forward |
0.0000031294 s |
0.00000311875 s |
1.00 |
GenDot / DefOpt / tpu / Forward |
0.00000323135 s |
0.000003119 s |
1.04 |
GenDot / IDefOpt / tpu / Forward |
0.0000031166 s |
0.0000031241750000000005 s |
1.00 |
GenDot / JaXPipe / tpu / PreRev |
0.0000028691 s |
0.000002942475 s |
0.98 |
GenDot / JaXPipe / tpu / PostRev |
0.00000238215 s |
0.0000023475 s |
1.01 |
GenDot / JaXPipe / tpu / BothRev |
0.000002862475 s |
0.000002937775 s |
0.97 |
GenDot / Jax / tpu / BothRev |
0.000002381275 s |
0.00000235035 s |
1.01 |
GenDot / HLOOpt / tpu / PreRev |
0.0000028965750000000003 s |
0.0000029276999999999995 s |
0.99 |
GenDot / HLOOpt / tpu / PostRev |
0.000002913875 s |
0.000002871875 s |
1.01 |
GenDot / HLOOpt / tpu / BothRev |
0.000002876575 s |
0.0000029336750000000003 s |
0.98 |
GenDot / PartOpt / tpu / PreRev |
0.000002916825 s |
0.000002884175 s |
1.01 |
GenDot / PartOpt / tpu / PostRev |
0.00000235815 s |
0.000002411675 s |
0.98 |
GenDot / PartOpt / tpu / BothRev |
0.0000029232 s |
0.0000028810000000000005 s |
1.01 |
GenDot / IPartOpt / tpu / PreRev |
0.0000028739749999999994 s |
0.0000029337 s |
0.98 |
GenDot / IPartOpt / tpu / PostRev |
0.000002386025 s |
0.000002344475 s |
1.02 |
GenDot / IPartOpt / tpu / BothRev |
0.000002869325 s |
0.00000292585 s |
0.98 |
GenDot / DefOpt / tpu / PreRev |
0.000002918825 s |
0.000002879575 s |
1.01 |
GenDot / DefOpt / tpu / PostRev |
0.000002865625 s |
0.00000294605 s |
0.97 |
GenDot / DefOpt / tpu / BothRev |
0.00000293475 s |
0.00000288235 s |
1.02 |
GenDot / IDefOpt / tpu / PreRev |
0.0000028799 s |
0.000002949175 s |
0.98 |
GenDot / IDefOpt / tpu / PostRev |
0.0000029231 s |
0.0000028782 s |
1.02 |
GenDot / IDefOpt / tpu / BothRev |
0.0000028720250000000003 s |
0.0000029348 s |
0.98 |
GenDot / JaXPipe / cpu / Primal |
0.000014775 s |
0.000007377620004263008 s |
2.00 |
GenDot / Jax / cpu / Primal |
0.000014674 s |
0.000007222200038086157 s |
2.03 |
GenDot / HLOOpt / cpu / Primal |
0.000013839 s |
0.000007422399994538864 s |
1.86 |
GenDot / PartOpt / cpu / Primal |
0.000014954 s |
0.000006913960078236414 s |
2.16 |
GenDot / IPartOpt / cpu / Primal |
0.000014344 s |
0.0000068490199737425424 s |
2.09 |
GenDot / DefOpt / cpu / Primal |
0.000014273 s |
0.000007118580106180161 s |
2.01 |
GenDot / IDefOpt / cpu / Primal |
0.000013835 s |
0.000007142399899748853 s |
1.94 |
GenDot / JaXPipe / cpu / Forward |
0.000019345 s |
0.000010665599984349682 s |
1.81 |
GenDot / Jax / cpu / Forward |
0.000019775 s |
0.000010504599977139153 s |
1.88 |
GenDot / HLOOpt / cpu / Forward |
0.000018913 s |
0.000011046119925595122 s |
1.71 |
GenDot / PartOpt / cpu / Forward |
0.000018584 s |
0.00001006010003038682 s |
1.85 |
GenDot / IPartOpt / cpu / Forward |
0.000017759 s |
0.00001120744003856089 s |
1.58 |
GenDot / DefOpt / cpu / Forward |
0.000018647 s |
0.000010274079904775135 s |
1.81 |
GenDot / IDefOpt / cpu / Forward |
0.00001884 s |
0.000010486480096005835 s |
1.80 |
GenDot / JaXPipe / cpu / PreRev |
0.000019636 s |
0.000010926600007223896 s |
1.80 |
GenDot / JaXPipe / cpu / PostRev |
0.000020278 s |
0.00001013830002193572 s |
2.00 |
GenDot / JaXPipe / cpu / BothRev |
0.000018807 s |
0.000010925779952231096 s |
1.72 |
GenDot / Jax / cpu / BothRev |
0.000020066 s |
0.000009941319967765594 s |
2.02 |
GenDot / HLOOpt / cpu / PreRev |
0.00001909 s |
0.000011691660110955128 s |
1.63 |
GenDot / HLOOpt / cpu / PostRev |
0.000019041 s |
0.000012876239925390107 s |
1.48 |
GenDot / HLOOpt / cpu / BothRev |
0.000018891000000000003 s |
0.00001098990001992206 s |
1.72 |
GenDot / PartOpt / cpu / PreRev |
0.000019081 s |
0.000010831399958988184 s |
1.76 |
GenDot / PartOpt / cpu / PostRev |
0.000019589 s |
0.000010444540021126158 s |
1.88 |
GenDot / PartOpt / cpu / BothRev |
0.000019411 s |
0.000011348319931130391 s |
1.71 |
GenDot / IPartOpt / cpu / PreRev |
0.000018663 s |
0.000010319640023226383 s |
1.81 |
GenDot / IPartOpt / cpu / PostRev |
0.000019823 s |
0.000010269659978803248 s |
1.93 |
GenDot / IPartOpt / cpu / BothRev |
0.000019467 s |
0.000011077979925175896 s |
1.76 |
GenDot / DefOpt / cpu / PreRev |
0.000018839 s |
0.000010820759998750872 s |
1.74 |
GenDot / DefOpt / cpu / PostRev |
0.000018929 s |
0.000010590620040602515 s |
1.79 |
GenDot / DefOpt / cpu / BothRev |
0.00001923 s |
0.000010781660021166318 s |
1.78 |
GenDot / IDefOpt / cpu / PreRev |
0.000018747 s |
0.00001114787997721578 s |
1.68 |
GenDot / IDefOpt / cpu / PostRev |
0.000018946 s |
0.000011181120007677236 s |
1.69 |
GenDot / IDefOpt / cpu / BothRev |
0.000018762 s |
0.000010971699994115623 s |
1.71 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000010902399981205236 s |
0.00001053083999067894 s |
1.04 |
hlo_ffi / Jax / cpu / Primal |
0.000010238440008834004 s |
0.000010049380052805646 s |
1.02 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000010035020022769458 s |
0.000010057840063382173 s |
1.00 |
hlo_ffi / PartOpt / cpu / Primal |
0.000009647539955039974 s |
0.000009655160010879626 s |
1.00 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000010785379954540986 s |
0.00001009035997412866 s |
1.07 |
hlo_ffi / DefOpt / cpu / Primal |
0.000009919499989337057 s |
0.000009545619977870956 s |
1.04 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000009905980050461947 s |
0.000010037020037998446 s |
0.99 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000014460799984590267 s |
0.00001457850004953798 s |
0.99 |
hlo_ffi / Jax / cpu / Forward |
0.000014450360022237871 s |
0.000014279340030043386 s |
1.01 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000014300620005087696 s |
0.000014677200015285052 s |
0.97 |
hlo_ffi / PartOpt / cpu / Forward |
0.000014522079973176004 s |
0.000014379120075318496 s |
1.01 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000014513780024572045 s |
0.000014442919964494649 s |
1.00 |
hlo_ffi / DefOpt / cpu / Forward |
0.000014656379998996272 s |
0.00001432873994417605 s |
1.02 |
hlo_ffi / IDefOpt / cpu / Forward |
0.00001460917998883815 s |
0.000013966859987704084 s |
1.05 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000014901479971740628 s |
0.000014706940037285676 s |
1.01 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000014721980005560908 s |
0.000014398339972103714 s |
1.02 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000014444559992625727 s |
0.000014566539884981469 s |
0.99 |
hlo_ffi / Jax / cpu / BothRev |
0.000014480099998763764 s |
0.000014863860051264056 s |
0.97 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000015056759993967716 s |
0.00001461602003473672 s |
1.03 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000016277039976557716 s |
0.000016265079939330462 s |
1.00 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000014207279982656472 s |
0.000014199220004229574 s |
1.00 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000014446560016949662 s |
0.000014724999928148464 s |
0.98 |
hlo_ffi / PartOpt / cpu / PostRev |
0.00001428392001798784 s |
0.00001431689992386964 s |
1.00 |
hlo_ffi / PartOpt / cpu / BothRev |
0.00001459466001506371 s |
0.000013945200025773374 s |
1.05 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000014747059994988376 s |
0.000014579059989046071 s |
1.01 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000013768640055786818 s |
0.000014473160026682308 s |
0.95 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000014301519995569831 s |
0.000013991260002512718 s |
1.02 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000014653699990958555 s |
0.00001468752001528628 s |
1.00 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000013866400013284877 s |
0.000014148140035104009 s |
0.98 |
hlo_ffi / DefOpt / cpu / BothRev |
0.00001593566000337887 s |
0.00001415964003172121 s |
1.13 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000014704219966006347 s |
0.000014535159989463864 s |
1.01 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000014206679989001714 s |
0.000014654960068583024 s |
0.97 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000013864699994883268 s |
0.000014252880046115024 s |
0.97 |
hlo_ffi / JaXPipe / cuda / Primal |
0.000001983 s |
0.000002368 s |
0.84 |
hlo_ffi / Jax / cuda / Primal |
0.000001983 s |
0.000002368 s |
0.84 |
hlo_ffi / HLOOpt / cuda / Primal |
0.000001984 s |
0.0000023670000000000004 s |
0.84 |
hlo_ffi / PartOpt / cuda / Primal |
0.000001984 s |
0.000002368 s |
0.84 |
hlo_ffi / IPartOpt / cuda / Primal |
0.000001984 s |
0.0000023670000000000004 s |
0.84 |
hlo_ffi / DefOpt / cuda / Primal |
0.000001984 s |
0.0000023670000000000004 s |
0.84 |
hlo_ffi / IDefOpt / cuda / Primal |
0.000001983 s |
0.0000023670000000000004 s |
0.84 |
hlo_ffi / JaXPipe / cuda / Forward |
0.00000208 s |
0.000002463 s |
0.84 |
hlo_ffi / Jax / cuda / Forward |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / HLOOpt / cuda / Forward |
0.000002048 s |
0.000002464 s |
0.83 |
hlo_ffi / PartOpt / cuda / Forward |
0.00000208 s |
0.000002463 s |
0.84 |
hlo_ffi / IPartOpt / cuda / Forward |
0.00000208 s |
0.000002463 s |
0.84 |
hlo_ffi / DefOpt / cuda / Forward |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / IDefOpt / cuda / Forward |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / JaXPipe / cuda / PreRev |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / JaXPipe / cuda / PostRev |
0.000002048 s |
0.000002432 s |
0.84 |
hlo_ffi / JaXPipe / cuda / BothRev |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / Jax / cuda / BothRev |
0.000002047 s |
0.000002432 s |
0.84 |
hlo_ffi / HLOOpt / cuda / PreRev |
0.000002047 s |
0.000002463 s |
0.83 |
hlo_ffi / HLOOpt / cuda / PostRev |
0.000002047 s |
0.000002463 s |
0.83 |
hlo_ffi / HLOOpt / cuda / BothRev |
0.000002047 s |
0.000002432 s |
0.84 |
hlo_ffi / PartOpt / cuda / PreRev |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / PartOpt / cuda / PostRev |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / PartOpt / cuda / BothRev |
0.000002047 s |
0.000002432 s |
0.84 |
hlo_ffi / IPartOpt / cuda / PreRev |
0.000002047 s |
0.000002463 s |
0.83 |
hlo_ffi / IPartOpt / cuda / PostRev |
0.000002047 s |
0.000002463 s |
0.83 |
hlo_ffi / IPartOpt / cuda / BothRev |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / DefOpt / cuda / PreRev |
0.000002048 s |
0.000002463 s |
0.83 |
hlo_ffi / DefOpt / cuda / PostRev |
0.000002047 s |
0.000002433 s |
0.84 |
hlo_ffi / DefOpt / cuda / BothRev |
0.000002047 s |
0.000002432 s |
0.84 |
hlo_ffi / IDefOpt / cuda / PreRev |
0.000002047 s |
0.000002432 s |
0.84 |
hlo_ffi / IDefOpt / cuda / PostRev |
0.000002047 s |
0.000002432 s |
0.84 |
hlo_ffi / IDefOpt / cuda / BothRev |
0.000002048 s |
0.000002433 s |
0.84 |
hlo_ffi / JaXPipe / tpu / Primal |
9.2775e-7 s |
9.3165e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Primal |
9.50525e-7 s |
9.53075e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Primal |
9.049e-7 s |
9.0595e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Primal |
9.552e-7 s |
9.50075e-7 s |
1.01 |
hlo_ffi / IPartOpt / tpu / Primal |
9.1335e-7 s |
9.0665e-7 s |
1.01 |
hlo_ffi / DefOpt / tpu / Primal |
9.54375e-7 s |
9.52375e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Primal |
9.0795e-7 s |
9.084e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / Forward |
9.49425e-7 s |
9.49675e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Forward |
9.819e-7 s |
9.81725e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Forward |
9.747e-7 s |
9.74525e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Forward |
9.34475e-7 s |
9.344e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / Forward |
9.74775e-7 s |
9.74575e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Forward |
9.34225e-7 s |
9.345e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Forward |
9.746e-7 s |
9.736749999999998e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PreRev |
9.3805e-7 s |
9.38e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PostRev |
9.66325e-7 s |
9.656e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / BothRev |
9.627e-7 s |
9.6225e-7 s |
1.00 |
hlo_ffi / Jax / tpu / BothRev |
9.6505e-7 s |
9.6555e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PreRev |
9.624e-7 s |
9.6295e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PostRev |
9.6555e-7 s |
9.65025e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / BothRev |
9.62025e-7 s |
9.62275e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PreRev |
9.65375e-7 s |
9.65375e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PostRev |
9.6195e-7 s |
9.615750000000002e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / BothRev |
9.650749999999998e-7 s |
9.6505e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PreRev |
9.6265e-7 s |
9.62725e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PostRev |
9.656e-7 s |
9.64875e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / BothRev |
9.61975e-7 s |
9.62275e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PreRev |
9.656e-7 s |
9.646e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PostRev |
9.6245e-7 s |
9.62475e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / BothRev |
9.659e-7 s |
9.64825e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PreRev |
9.62175e-7 s |
9.625749999999998e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PostRev |
9.65625e-7 s |
9.65e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / BothRev |
9.62675e-7 s |
9.62625e-7 s |
1.00 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000017426 s |
0.00001053083999067894 s |
1.65 |
hlo_ffi / Jax / cpu / Primal |
0.000017531999999999997 s |
0.000010049380052805646 s |
1.74 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000017767 s |
0.000010057840063382173 s |
1.77 |
hlo_ffi / PartOpt / cpu / Primal |
0.000017712 s |
0.000009655160010879626 s |
1.83 |
hlo_ffi / IPartOpt / cpu / Primal |
0.00001747 s |
0.00001009035997412866 s |
1.73 |
hlo_ffi / DefOpt / cpu / Primal |
0.000017353 s |
0.000009545619977870956 s |
1.82 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000017629 s |
0.000010037020037998446 s |
1.76 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000024658 s |
0.00001457850004953798 s |
1.69 |
hlo_ffi / Jax / cpu / Forward |
0.000024178 s |
0.000014279340030043386 s |
1.69 |
hlo_ffi / HLOOpt / cpu / Forward |
0.00002403 s |
0.000014677200015285052 s |
1.64 |
hlo_ffi / PartOpt / cpu / Forward |
0.000040419 s |
0.000014379120075318496 s |
2.81 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000023785 s |
0.000014442919964494649 s |
1.65 |
hlo_ffi / DefOpt / cpu / Forward |
0.000024502 s |
0.00001432873994417605 s |
1.71 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000024163 s |
0.000013966859987704084 s |
1.73 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.000024075 s |
0.000014706940037285676 s |
1.64 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000023526 s |
0.000014398339972103714 s |
1.63 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.00002324 s |
0.000014566539884981469 s |
1.60 |
hlo_ffi / Jax / cpu / BothRev |
0.000023786 s |
0.000014863860051264056 s |
1.60 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000024092 s |
0.00001461602003473672 s |
1.65 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000023493 s |
0.000016265079939330462 s |
1.44 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000023757 s |
0.000014199220004229574 s |
1.67 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000023687 s |
0.000014724999928148464 s |
1.61 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000024285 s |
0.00001431689992386964 s |
1.70 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000023688 s |
0.000013945200025773374 s |
1.70 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000024289 s |
0.000014579059989046071 s |
1.67 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000024056 s |
0.000014473160026682308 s |
1.66 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000023665 s |
0.000013991260002512718 s |
1.69 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000024557 s |
0.00001468752001528628 s |
1.67 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000023547 s |
0.000014148140035104009 s |
1.66 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000024256 s |
0.00001415964003172121 s |
1.71 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000023354 s |
0.000014535159989463864 s |
1.61 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000023675000000000003 s |
0.000014654960068583024 s |
1.62 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.00002425 s |
0.000014252880046115024 s |
1.70 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.0008823681999274 s |
0.0009126427998126 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.000923146200148 s |
0.0008927976001359 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.0009403691999978 s |
0.0010200202001215 s |
0.92 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0009075947999917 s |
0.0009099765997234 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0008975449999525 s |
0.0009122600000409 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.0009633425999709 s |
0.0009517657999822 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.0009623766000004 s |
0.0009415242000613 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.0022037834000002 s |
0.0022322189999613 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.0022613721999732 s |
0.0022640238001258 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.0022365268000612 s |
0.0022092698000051 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.0021996649999891 s |
0.0022217054000066 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.0021952595999209 s |
0.0021557281999776 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.0022407612000279 s |
0.002132012400034 s |
1.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.002167042199926 s |
0.0021401981999588 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.004876933199921 s |
0.0054930844002228 s |
0.89 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.0060059692000322 s |
0.0055573514002389 s |
1.08 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.0051868045999981 s |
0.0056951962000312 s |
0.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.0049735734000023 s |
0.005988877800155 s |
0.83 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.0052108600001702 s |
0.005326209800296 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.0046958593999079 s |
0.0057709669999894 s |
0.81 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.0056023883998932 s |
0.004952465000133 s |
1.13 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.0048550471999078 s |
0.0056679671999518 s |
0.86 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.0058202621999953 s |
0.0052149561999613 s |
1.12 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.0048991354000463 s |
0.005356651799957 s |
0.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.0053876819998549 s |
0.005281492999893 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.0049464742000054 s |
0.005752405199928 s |
0.86 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.0056230101999972 s |
0.0042657144002077 s |
1.32 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.0049969914001849 s |
0.0053249480000886 s |
0.94 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.0052007637999849 s |
0.0053774022000652 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.0050065729999005 s |
0.0053912424000372 s |
0.93 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.0058234097999957 s |
0.0052342957998916 s |
1.11 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.005181067400008 s |
0.0053787539998666 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.0052892740000061 s |
0.0056076900000334 s |
0.94 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal |
0.000282176 s |
0.000294782 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal |
0.000281793 s |
0.000295519 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal |
0.000288512 s |
0.000300735 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal |
0.000280641 s |
0.000294431 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal |
0.000281409 s |
0.000294719 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal |
0.000288609 s |
0.000302302 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal |
0.000289985 s |
0.000300958 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward |
0.0005568659999999 s |
0.000582397 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward |
0.0005400009999999 s |
0.000565949 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward |
0.000558626 s |
0.000582525 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward |
0.000557537 s |
0.0005827489999999 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward |
0.0005596179999999 s |
0.000581821 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward |
0.000560097 s |
0.000582045 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward |
0.000559522 s |
0.000582365 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev |
0.001032163 s |
0.001053018 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev |
0.0009914589999999 s |
0.001010011 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev |
0.001030499 s |
0.001049755 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev |
0.00099434 s |
0.001002806 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev |
0.0010142109999999 s |
0.001033755 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev |
0.001039203 s |
0.001060987 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev |
0.001014083 s |
0.001036155 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev |
0.001033156 s |
0.001047547 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev |
0.000979363 s |
0.000998715 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev |
0.001031076 s |
0.0010492729999999 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev |
0.001027523 s |
0.001049147 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev |
0.000977699 s |
0.000997211 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev |
0.0010284839999999 s |
0.001048443 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev |
0.0010254749999999 s |
0.001050459 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev |
0.0009653159999999 s |
0.000983675 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev |
0.001025252 s |
0.001050168 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev |
0.001022884 s |
0.00104982 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev |
0.0010230109999999 s |
0.001053819 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev |
0.00102442 s |
0.0010519309999999 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal |
0.00012401225 s |
0.000124404 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal |
0.00012661625 s |
0.000126576 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal |
0.0001524675 s |
0.00015267225 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal |
0.0001341925 s |
0.00013367925 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal |
0.000130773 s |
0.0001313635 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal |
0.0001476145 s |
0.0001480195 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal |
0.0001506225 s |
0.0001510995 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward |
0.00021220175 s |
0.00021246825 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward |
0.0002609484999999 s |
0.0002609435 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward |
0.00021213875 s |
0.000212679 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward |
0.0002185204999999 s |
0.0002185557499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward |
0.0002119635 s |
0.00021224075 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward |
0.00021825925 s |
0.00021877125 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward |
0.0002120779999999 s |
0.0002124265 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev |
0.00035422575 s |
0.000354156 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev |
0.0002569075 s |
0.0002565564999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev |
0.00035454625 s |
0.000353798 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev |
0.0002566215 s |
0.00025646925 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev |
0.00035452925 s |
0.0003540605 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev |
0.00029040325 s |
0.0002909392499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev |
0.00035429725 s |
0.0003538199999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev |
0.00035546825 s |
0.00035552825 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev |
0.0002714055 s |
0.0002709835 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev |
0.0003558247499999 s |
0.00035510475 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev |
0.00035455575 s |
0.00035378575 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev |
0.00027190975 s |
0.0002722284999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev |
0.00035499475 s |
0.00035379125 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev |
0.0003578285 s |
0.0003576025 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev |
0.00028352575 s |
0.00028285975 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev |
0.00035812 s |
0.000357185 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev |
0.00035748475 s |
0.00035625625 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev |
0.00030119375 s |
0.0003005229999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev |
0.00035719625 s |
0.00035597025 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.001912621 s |
0.0009126427998126 s |
2.10 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.001687994 s |
0.0008927976001359 s |
1.89 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.001804514 s |
0.0010200202001215 s |
1.77 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0018908899999999 s |
0.0009099765997234 s |
2.08 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.001773447 s |
0.0009122600000409 s |
1.94 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.001894947 s |
0.0009517657999822 s |
1.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.002043912 s |
0.0009415242000613 s |
2.17 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.005031897 s |
0.0022322189999613 s |
2.25 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.005118964 s |
0.0022640238001258 s |
2.26 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.005082814 s |
0.0022092698000051 s |
2.30 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.004896263 s |
0.0022217054000066 s |
2.20 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.004998863 s |
0.0021557281999776 s |
2.32 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.005085558 s |
0.002132012400034 s |
2.39 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.005346746 s |
0.0021401981999588 s |
2.50 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.008291835 s |
0.0054930844002228 s |
1.51 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.00866266 s |
0.0055573514002389 s |
1.56 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.008291479 s |
0.0056951962000312 s |
1.46 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.008137891 s |
0.005988877800155 s |
1.36 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.0088883449999999 s |
0.005326209800296 s |
1.67 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.008146671 s |
0.0057709669999894 s |
1.41 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.008310056 s |
0.004952465000133 s |
1.68 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.008059735 s |
0.0056679671999518 s |
1.42 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.0092984259999999 s |
0.0052149561999613 s |
1.78 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.0081951889999999 s |
0.005356651799957 s |
1.53 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.007725075 s |
0.005281492999893 s |
1.46 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.008489073 s |
0.005752405199928 s |
1.48 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.008236724 s |
0.0042657144002077 s |
1.93 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.0082963009999999 s |
0.0053249480000886 s |
1.56 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.007588538 s |
0.0053774022000652 s |
1.41 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.009058634 s |
0.0053912424000372 s |
1.68 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.0087375489999999 s |
0.0052342957998916 s |
1.67 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.0080325109999999 s |
0.0053787539998666 s |
1.49 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.008209426 s |
0.0056076900000334 s |
1.46 |
scatter_sum / JaXPipe / cpu / Primal |
0.000007981500011737808 s |
0.000008034660004341276 s |
0.99 |
scatter_sum / Jax / cpu / Primal |
0.000007304739992832765 s |
0.000007401199927699053 s |
0.99 |
scatter_sum / HLOOpt / cpu / Primal |
0.000007686379949518597 s |
0.00000782368006184697 s |
0.98 |
scatter_sum / PartOpt / cpu / Primal |
0.000007235019993458991 s |
0.000007279739966179477 s |
0.99 |
scatter_sum / IPartOpt / cpu / Primal |
0.000007466500010195887 s |
0.000008091340077953646 s |
0.92 |
scatter_sum / DefOpt / cpu / Primal |
0.000007274020035765716 s |
0.000007440540011884877 s |
0.98 |
scatter_sum / IDefOpt / cpu / Primal |
0.000007256520038936287 s |
0.000007552500064775813 s |
0.96 |
scatter_sum / JaXPipe / cpu / Forward |
0.000012101860011171085 s |
0.000011487500087241642 s |
1.05 |
scatter_sum / Jax / cpu / Forward |
0.000011561479996089474 s |
0.0000111186600952351 s |
1.04 |
scatter_sum / HLOOpt / cpu / Forward |
0.000012332079995758247 s |
0.000011329380049573956 s |
1.09 |
scatter_sum / PartOpt / cpu / Forward |
0.00001176797999505652 s |
0.00001174510010969243 s |
1.00 |
scatter_sum / IPartOpt / cpu / Forward |
0.000012224759957462084 s |
0.00001193121990581858 s |
1.02 |
scatter_sum / DefOpt / cpu / Forward |
0.000012311980044614755 s |
0.000011192940000910312 s |
1.10 |
scatter_sum / IDefOpt / cpu / Forward |
0.000012033379998683812 s |
0.00001139780006269575 s |
1.06 |
scatter_sum / JaXPipe / cpu / PreRev |
0.00001151444001152413 s |
0.000011128400001325644 s |
1.03 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000012075419972461532 s |
0.000011160660160385304 s |
1.08 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000012127059990234556 s |
0.000011471000107121654 s |
1.06 |
scatter_sum / Jax / cpu / BothRev |
0.000011938499983443765 s |
0.0000114425599531387 s |
1.04 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000012259219975021551 s |
0.000011476399977254916 s |
1.07 |
scatter_sum / HLOOpt / cpu / PostRev |
0.00001357001995529572 s |
0.000013494479953806148 s |
1.01 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000011389800001779803 s |
0.000011364520014467417 s |
1.00 |
scatter_sum / PartOpt / cpu / PreRev |
0.000011665560014080255 s |
0.000011737500026356429 s |
0.99 |
scatter_sum / PartOpt / cpu / PostRev |
0.000012264540000614945 s |
0.000010909779975918354 s |
1.12 |
scatter_sum / PartOpt / cpu / BothRev |
0.000012362940005914424 s |
0.00001189136008179048 s |
1.04 |
scatter_sum / IPartOpt / cpu / PreRev |
0.00001190515999951458 s |
0.000011497700052132132 s |
1.04 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000011542399997779284 s |
0.000011243499975535088 s |
1.03 |
scatter_sum / IPartOpt / cpu / BothRev |
0.00001188132004244835 s |
0.00001136295997639536 s |
1.05 |
scatter_sum / DefOpt / cpu / PreRev |
0.000011074539970650222 s |
0.00001163702010671841 s |
0.95 |
scatter_sum / DefOpt / cpu / PostRev |
0.000012162379980509288 s |
0.000010856139979296132 s |
1.12 |
scatter_sum / DefOpt / cpu / BothRev |
0.000011757820020648069 s |
0.0000110379801117233 s |
1.07 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000011800980000771232 s |
0.000010984399905282773 s |
1.07 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000011917840029127546 s |
0.000011226860078750178 s |
1.06 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000012122280022595078 s |
0.000011499479951453397 s |
1.05 |
scatter_sum / JaXPipe / cuda / Primal |
0.000009632 s |
0.00001056 s |
0.91 |
scatter_sum / Jax / cuda / Primal |
0.000010208 s |
0.000010656 s |
0.96 |
scatter_sum / HLOOpt / cuda / Primal |
0.000009472 s |
0.000010624 s |
0.89 |
scatter_sum / PartOpt / cuda / Primal |
0.000009824 s |
0.000010496 s |
0.94 |
scatter_sum / IPartOpt / cuda / Primal |
0.000009888 s |
0.000010816 s |
0.91 |
scatter_sum / DefOpt / cuda / Primal |
0.000009984 s |
0.000010816 s |
0.92 |
scatter_sum / IDefOpt / cuda / Primal |
0.0000096 s |
0.000010432 s |
0.92 |
scatter_sum / JaXPipe / cuda / Forward |
0.000017088 s |
0.000017247 s |
0.99 |
scatter_sum / Jax / cuda / Forward |
0.000016704 s |
0.000017216 s |
0.97 |
scatter_sum / HLOOpt / cuda / Forward |
0.000016864 s |
0.000017536 s |
0.96 |
scatter_sum / PartOpt / cuda / Forward |
0.000016383999999999998 s |
0.000017375999999999998 s |
0.94 |
scatter_sum / IPartOpt / cuda / Forward |
0.000016672 s |
0.000017696 s |
0.94 |
scatter_sum / DefOpt / cuda / Forward |
0.000017056 s |
0.000017536 s |
0.97 |
scatter_sum / IDefOpt / cuda / Forward |
0.00001696 s |
0.000017247999999999998 s |
0.98 |
scatter_sum / JaXPipe / cuda / PreRev |
0.000016864 s |
0.000017472 s |
0.97 |
scatter_sum / JaXPipe / cuda / PostRev |
0.000016768000000000003 s |
0.000017408 s |
0.96 |
scatter_sum / JaXPipe / cuda / BothRev |
0.0000168 s |
0.00001744 s |
0.96 |
scatter_sum / Jax / cuda / BothRev |
0.000016992 s |
0.000017696 s |
0.96 |
scatter_sum / HLOOpt / cuda / PreRev |
0.00001664 s |
0.000017888000000000002 s |
0.93 |
scatter_sum / HLOOpt / cuda / PostRev |
0.000017568000000000002 s |
0.00001712 s |
1.03 |
scatter_sum / HLOOpt / cuda / BothRev |
0.00001728 s |
0.00001728 s |
1.00 |
scatter_sum / PartOpt / cuda / PreRev |
0.000017568000000000002 s |
0.000017312 s |
1.01 |
scatter_sum / PartOpt / cuda / PostRev |
0.000016255999999999998 s |
0.000016767 s |
0.97 |
scatter_sum / PartOpt / cuda / BothRev |
0.000016768000000000003 s |
0.000017344 s |
0.97 |
scatter_sum / IPartOpt / cuda / PreRev |
0.000016833 s |
0.00001744 s |
0.97 |
scatter_sum / IPartOpt / cuda / PostRev |
0.0000168 s |
0.00001712 s |
0.98 |
scatter_sum / IPartOpt / cuda / BothRev |
0.000016896000000000002 s |
0.000017503999999999997 s |
0.97 |
scatter_sum / DefOpt / cuda / PreRev |
0.000017216 s |
0.000017791 s |
0.97 |
scatter_sum / DefOpt / cuda / PostRev |
0.000016512 s |
0.000016832 s |
0.98 |
scatter_sum / DefOpt / cuda / BothRev |
0.000016801 s |
0.000017375999999999998 s |
0.97 |
scatter_sum / IDefOpt / cuda / PreRev |
0.000017375999999999998 s |
0.000017824 s |
0.97 |
scatter_sum / IDefOpt / cuda / PostRev |
0.000016512 s |
0.000017729000000000003 s |
0.93 |
scatter_sum / IDefOpt / cuda / BothRev |
0.000016927999999999998 s |
0.000017312 s |
0.98 |
scatter_sum / JaXPipe / tpu / Primal |
0.0000013428499999999998 s |
0.00000137035 s |
0.98 |
scatter_sum / Jax / tpu / Primal |
0.0000014047 s |
0.000001343475 s |
1.05 |
scatter_sum / HLOOpt / tpu / Primal |
0.000001342975 s |
0.000001371175 s |
0.98 |
scatter_sum / PartOpt / tpu / Primal |
0.000001404825 s |
0.00000134325 s |
1.05 |
scatter_sum / IPartOpt / tpu / Primal |
0.000001342825 s |
0.0000013707749999999998 s |
0.98 |
scatter_sum / DefOpt / tpu / Primal |
0.0000014051 s |
0.000001344475 s |
1.05 |
scatter_sum / IDefOpt / tpu / Primal |
0.000001343125 s |
0.0000013708 s |
0.98 |
scatter_sum / JaXPipe / tpu / Forward |
0.00000271475 s |
0.000002754575 s |
0.99 |
scatter_sum / Jax / tpu / Forward |
0.0000027216000000000003 s |
0.00000275575 s |
0.99 |
scatter_sum / HLOOpt / tpu / Forward |
0.0000027074 s |
0.000002756875 s |
0.98 |
scatter_sum / PartOpt / tpu / Forward |
0.000002692325 s |
0.00000272645 s |
0.99 |
scatter_sum / IPartOpt / tpu / Forward |
0.0000027150000000000003 s |
0.0000027535 s |
0.99 |
scatter_sum / DefOpt / tpu / Forward |
0.00000268675 s |
0.000002719975 s |
0.99 |
scatter_sum / IDefOpt / tpu / Forward |
0.000002708325 s |
0.0000027513 s |
0.98 |
scatter_sum / JaXPipe / tpu / PreRev |
0.0000026872000000000003 s |
0.000002714775 s |
0.99 |
scatter_sum / JaXPipe / tpu / PostRev |
0.0000026924500000000004 s |
0.0000027531250000000004 s |
0.98 |
scatter_sum / JaXPipe / tpu / BothRev |
0.000002702675 s |
0.000002727975 s |
0.99 |
scatter_sum / Jax / tpu / BothRev |
0.000002750725 s |
0.0000028086750000000003 s |
0.98 |
scatter_sum / HLOOpt / tpu / PreRev |
0.0000027067750000000004 s |
0.00000274125 s |
0.99 |
scatter_sum / HLOOpt / tpu / PostRev |
0.0000027469 s |
0.0000027999 s |
0.98 |
scatter_sum / HLOOpt / tpu / BothRev |
0.0000027105 s |
0.000002731275 s |
0.99 |
scatter_sum / PartOpt / tpu / PreRev |
0.000002743025 s |
0.000002804625 s |
0.98 |
scatter_sum / PartOpt / tpu / PostRev |
0.00000269965 s |
0.0000027357500000000004 s |
0.99 |
scatter_sum / PartOpt / tpu / BothRev |
0.0000027432250000000003 s |
0.0000028022000000000005 s |
0.98 |
scatter_sum / IPartOpt / tpu / PreRev |
0.00000269875 s |
0.000002732275 s |
0.99 |
scatter_sum / IPartOpt / tpu / PostRev |
0.0000027413 s |
0.0000028057 s |
0.98 |
scatter_sum / IPartOpt / tpu / BothRev |
0.0000027086500000000003 s |
0.0000027315 s |
0.99 |
scatter_sum / DefOpt / tpu / PreRev |
0.000002748275 s |
0.0000028033249999999995 s |
0.98 |
scatter_sum / DefOpt / tpu / PostRev |
0.000002696075 s |
0.0000027280250000000004 s |
0.99 |
scatter_sum / DefOpt / tpu / BothRev |
0.0000027424 s |
0.00000280605 s |
0.98 |
scatter_sum / IDefOpt / tpu / PreRev |
0.000002698375 s |
0.00000273425 s |
0.99 |
scatter_sum / IDefOpt / tpu / PostRev |
0.00000275745 s |
0.0000028042 s |
0.98 |
scatter_sum / IDefOpt / tpu / BothRev |
0.000002698 s |
0.0000027361 s |
0.99 |
scatter_sum / JaXPipe / cpu / Primal |
0.000015575999999999998 s |
0.000008034660004341276 s |
1.94 |
scatter_sum / Jax / cpu / Primal |
0.000015297 s |
0.000007401199927699053 s |
2.07 |
scatter_sum / HLOOpt / cpu / Primal |
0.000015371 s |
0.00000782368006184697 s |
1.96 |
scatter_sum / PartOpt / cpu / Primal |
0.000015771000000000002 s |
0.000007279739966179477 s |
2.17 |
scatter_sum / IPartOpt / cpu / Primal |
0.000015523 s |
0.000008091340077953646 s |
1.92 |
scatter_sum / DefOpt / cpu / Primal |
0.000015516000000000002 s |
0.000007440540011884877 s |
2.09 |
scatter_sum / IDefOpt / cpu / Primal |
0.000015504 s |
0.000007552500064775813 s |
2.05 |
scatter_sum / JaXPipe / cpu / Forward |
0.000022568 s |
0.000011487500087241642 s |
1.96 |
scatter_sum / Jax / cpu / Forward |
0.000022169 s |
0.0000111186600952351 s |
1.99 |
scatter_sum / HLOOpt / cpu / Forward |
0.000023142 s |
0.000011329380049573956 s |
2.04 |
scatter_sum / PartOpt / cpu / Forward |
0.000022289 s |
0.00001174510010969243 s |
1.90 |
scatter_sum / IPartOpt / cpu / Forward |
0.000021755 s |
0.00001193121990581858 s |
1.82 |
scatter_sum / DefOpt / cpu / Forward |
0.000022587 s |
0.000011192940000910312 s |
2.02 |
scatter_sum / IDefOpt / cpu / Forward |
0.000022178 s |
0.00001139780006269575 s |
1.95 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000022941 s |
0.000011128400001325644 s |
2.06 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000022755 s |
0.000011160660160385304 s |
2.04 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000022192 s |
0.000011471000107121654 s |
1.93 |
scatter_sum / Jax / cpu / BothRev |
0.000022571 s |
0.0000114425599531387 s |
1.97 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000022253 s |
0.000011476399977254916 s |
1.94 |
scatter_sum / HLOOpt / cpu / PostRev |
0.000021993 s |
0.000013494479953806148 s |
1.63 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000022507 s |
0.000011364520014467417 s |
1.98 |
scatter_sum / PartOpt / cpu / PreRev |
0.000022724 s |
0.000011737500026356429 s |
1.94 |
scatter_sum / PartOpt / cpu / PostRev |
0.000021922 s |
0.000010909779975918354 s |
2.01 |
scatter_sum / PartOpt / cpu / BothRev |
0.000022849 s |
0.00001189136008179048 s |
1.92 |
scatter_sum / IPartOpt / cpu / PreRev |
0.000022752 s |
0.000011497700052132132 s |
1.98 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000023334 s |
0.000011243499975535088 s |
2.08 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000023075 s |
0.00001136295997639536 s |
2.03 |
scatter_sum / DefOpt / cpu / PreRev |
0.000022213 s |
0.00001163702010671841 s |
1.91 |
scatter_sum / DefOpt / cpu / PostRev |
0.000022474 s |
0.000010856139979296132 s |
2.07 |
scatter_sum / DefOpt / cpu / BothRev |
0.000022649 s |
0.0000110379801117233 s |
2.05 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000022425 s |
0.000010984399905282773 s |
2.04 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000022603 s |
0.000011226860078750178 s |
2.01 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000022199 s |
0.000011499479951453397 s |
1.93 |
slicing / JaXPipe / cpu / Primal |
0.000006309279988272465 s |
0.000006256359974941006 s |
1.01 |
slicing / Jax / cpu / Primal |
0.000005997560028845328 s |
0.000006476919879787601 s |
0.93 |
slicing / HLOOpt / cpu / Primal |
0.000006163119960547192 s |
0.000006179880001582205 s |
1.00 |
slicing / PartOpt / cpu / Primal |
0.000006597940027859295 s |
0.000006054379955457989 s |
1.09 |
| slicing / IPartOpt / cpu / Primal | 0.00000621353994574747 s | 0.000006397199995262781 s | 0.97 |
| slicing / DefOpt / cpu / Primal | 0.00000620097999671998 s | 0.000005949320056970464 s | 1.04 |
| slicing / IDefOpt / cpu / Primal | 0.00000602572002208035 s | 0.000006439940079872031 s | 0.94 |
| slicing / JaXPipe / cpu / Forward | 0.000009693099982541754 s | 0.000009397520007041748 s | 1.03 |
| slicing / Jax / cpu / Forward | 0.000009359220011901926 s | 0.000009175039958790875 s | 1.02 |
| slicing / HLOOpt / cpu / Forward | 0.000009788240049601882 s | 0.00000989017997198971 s | 0.99 |
| slicing / PartOpt / cpu / Forward | 0.000008824699971228256 s | 0.00000902590001714998 s | 0.98 |
| slicing / IPartOpt / cpu / Forward | 0.000009736800002428936 s | 0.000009671999978309033 s | 1.01 |
| slicing / DefOpt / cpu / Forward | 0.000008835419957904378 s | 0.00000922690007428173 s | 0.96 |
| slicing / IDefOpt / cpu / Forward | 0.000008935620016927714 s | 0.000009310740024375264 s | 0.96 |
| slicing / JaXPipe / cpu / PreRev | 0.000010273079979015166 s | 0.00001027858004817972 s | 1.00 |
| slicing / JaXPipe / cpu / PostRev | 0.000009867540047707734 s | 0.000009631540033296916 s | 1.02 |
| slicing / JaXPipe / cpu / BothRev | 0.00001056209996022517 s | 0.000010070120042655615 s | 1.05 |
| slicing / Jax / cpu / BothRev | 0.000009614640002837403 s | 0.00000994826003079652 s | 0.97 |
| slicing / HLOOpt / cpu / PreRev | 0.000010237060005238165 s | 0.000009710159956739516 s | 1.05 |
| slicing / HLOOpt / cpu / PostRev | 0.000012351439927442698 s | 0.000011341420031385496 s | 1.09 |
| slicing / HLOOpt / cpu / BothRev | 0.000009744579983816948 s | 0.00000971812001807848 s | 1.00 |
| slicing / PartOpt / cpu / PreRev | 0.000009561539982314571 s | 0.00001009372001135489 s | 0.95 |
| slicing / PartOpt / cpu / PostRev | 0.00001006799999231589 s | 0.000009648800096329067 s | 1.04 |
| slicing / PartOpt / cpu / BothRev | 0.000010081580021505943 s | 0.000010046099978353596 s | 1.00 |
| slicing / IPartOpt / cpu / PreRev | 0.00000933187996452034 s | 0.000009653820034145613 s | 0.97 |
| slicing / IPartOpt / cpu / PostRev | 0.000010096319983858849 s | 0.000010048899948742472 s | 1.00 |
| slicing / IPartOpt / cpu / BothRev | 0.00000975726002252486 s | 0.000010373479999543634 s | 0.94 |
| slicing / DefOpt / cpu / PreRev | 0.000009683100042821024 s | 0.000009390980030730134 s | 1.03 |
| slicing / DefOpt / cpu / PostRev | 0.000010493939989828504 s | 0.000009702339957584628 s | 1.08 |
| slicing / DefOpt / cpu / BothRev | 0.0000098249999791733 s | 0.00000979103999270592 s | 1.00 |
| slicing / IDefOpt / cpu / PreRev | 0.000009651419995861944 s | 0.000009572320104780374 s | 1.01 |
| slicing / IDefOpt / cpu / PostRev | 0.000009616200022719568 s | 0.000010150820035050856 s | 0.95 |
| slicing / IDefOpt / cpu / BothRev | 0.00000989856001069711 s | 0.000009666499990999 s | 1.02 |
| slicing / JaXPipe / cuda / Primal | 0.000001888 s | 0.000002303 s | 0.82 |
| slicing / Jax / cuda / Primal | 0.000001888 s | 0.000002304 s | 0.82 |
| slicing / HLOOpt / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / PartOpt / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / IPartOpt / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / DefOpt / cuda / Primal | 0.000001887 s | 0.000002303 s | 0.82 |
| slicing / IDefOpt / cuda / Primal | 0.000001887 s | 0.000002304 s | 0.82 |
| slicing / JaXPipe / cuda / Forward | 0.000010113 s | 0.000010272 s | 0.98 |
| slicing / Jax / cuda / Forward | 0.000009888 s | 0.00001008 s | 0.98 |
| slicing / HLOOpt / cuda / Forward | 0.000009728 s | 0.000010304 s | 0.94 |
| slicing / PartOpt / cuda / Forward | 0.00000976 s | 0.000010592 s | 0.92 |
| slicing / IPartOpt / cuda / Forward | 0.000009696 s | 0.00000992 s | 0.98 |
| slicing / DefOpt / cuda / Forward | 0.000009824 s | 0.00001024 s | 0.96 |
| slicing / IDefOpt / cuda / Forward | 0.000009728 s | 0.00001088 s | 0.89 |
| slicing / JaXPipe / cuda / PreRev | 0.000009856 s | 0.0000104 s | 0.95 |
| slicing / JaXPipe / cuda / PostRev | 0.000009984 s | 0.000010336 s | 0.97 |
| slicing / JaXPipe / cuda / BothRev | 0.000010113 s | 0.0000104 s | 0.97 |
| slicing / Jax / cuda / BothRev | 0.000009888 s | 0.000010208 s | 0.97 |
| slicing / HLOOpt / cuda / PreRev | 0.000010113 s | 0.000010528 s | 0.96 |
| slicing / HLOOpt / cuda / PostRev | 0.000010111 s | 0.000010591 s | 0.95 |
| slicing / HLOOpt / cuda / BothRev | 0.000010272 s | 0.000010496 s | 0.98 |
| slicing / PartOpt / cuda / PreRev | 0.000009696 s | 0.000010368 s | 0.94 |
| slicing / PartOpt / cuda / PostRev | 0.000009696 s | 0.0000104 s | 0.93 |
| slicing / PartOpt / cuda / BothRev | 0.000010048 s | 0.000010815 s | 0.93 |
| slicing / IPartOpt / cuda / PreRev | 0.000010048 s | 0.000010912 s | 0.92 |
| slicing / IPartOpt / cuda / PostRev | 0.000010017 s | 0.00001072 s | 0.93 |
| slicing / IPartOpt / cuda / BothRev | 0.000010016 s | 0.000010368 s | 0.97 |
| slicing / DefOpt / cuda / PreRev | 0.00001008 s | 0.00001024 s | 0.98 |
| slicing / DefOpt / cuda / PostRev | 0.000009792 s | 0.000010432 s | 0.94 |
| slicing / DefOpt / cuda / BothRev | 0.000010016 s | 0.000010496 s | 0.95 |
| slicing / IDefOpt / cuda / PreRev | 0.000009984 s | 0.000010464 s | 0.95 |
| slicing / IDefOpt / cuda / PostRev | 0.000010367 s | 0.000010656 s | 0.97 |
| slicing / IDefOpt / cuda / BothRev | 0.000009951 s | 0.000010305 s | 0.97 |
| slicing / JaXPipe / tpu / Primal | 9.703e-7 s | 0.00000103045 s | 0.94 |
| slicing / Jax / tpu / Primal | 9.70825e-7 s | 9.65925e-7 s | 1.01 |
| slicing / HLOOpt / tpu / Primal | 9.698e-7 s | 0.000001024375 s | 0.95 |
| slicing / PartOpt / tpu / Primal | 9.682250000000002e-7 s | 9.59525e-7 s | 1.01 |
| slicing / IPartOpt / tpu / Primal | 9.713e-7 s | 0.0000010249250000000002 s | 0.95 |
| slicing / DefOpt / tpu / Primal | 9.66875e-7 s | 9.62125e-7 s | 1.00 |
| slicing / IDefOpt / tpu / Primal | 9.74825e-7 s | 0.000001022275 s | 0.95 |
| slicing / JaXPipe / tpu / Forward | 0.0000014127999999999998 s | 0.000001408625 s | 1.00 |
| slicing / Jax / tpu / Forward | 0.00000142015 s | 0.000001476525 s | 0.96 |
| slicing / HLOOpt / tpu / Forward | 0.000001519075 s | 0.0000015182499999999998 s | 1.00 |
| slicing / PartOpt / tpu / Forward | 0.00000143895 s | 0.0000014921 s | 0.96 |
| slicing / IPartOpt / tpu / Forward | 0.0000015165999999999998 s | 0.000001516925 s | 1.00 |
| slicing / DefOpt / tpu / Forward | 0.000001438125 s | 0.00000149725 s | 0.96 |
| slicing / IDefOpt / tpu / Forward | 0.0000015174499999999998 s | 0.0000015184500000000005 s | 1.00 |
| slicing / JaXPipe / tpu / PreRev | 0.0000023921 s | 0.0000025648 s | 0.93 |
| slicing / JaXPipe / tpu / PostRev | 0.0000025228250000000003 s | 0.000002508975 s | 1.01 |
| slicing / JaXPipe / tpu / BothRev | 0.000002401275 s | 0.0000025731 s | 0.93 |
| slicing / Jax / tpu / BothRev | 0.00000254135 s | 0.00000252785 s | 1.01 |
| slicing / HLOOpt / tpu / PreRev | 0.0000023974 s | 0.00000257755 s | 0.93 |
| slicing / HLOOpt / tpu / PostRev | 0.0000025454500000000003 s | 0.00000252875 s | 1.01 |
| slicing / HLOOpt / tpu / BothRev | 0.00000240535 s | 0.0000025746000000000003 s | 0.93 |
| slicing / PartOpt / tpu / PreRev | 0.00000254185 s | 0.000002530725 s | 1.00 |
| slicing / PartOpt / tpu / PostRev | 0.0000023941 s | 0.0000025674 s | 0.93 |
| slicing / PartOpt / tpu / BothRev | 0.000002547425 s | 0.000002532 s | 1.01 |
| slicing / IPartOpt / tpu / PreRev | 0.000002389925 s | 0.000002570325 s | 0.93 |
| slicing / IPartOpt / tpu / PostRev | 0.0000025467 s | 0.000002529525 s | 1.01 |
| slicing / IPartOpt / tpu / BothRev | 0.00000239855 s | 0.0000025746000000000003 s | 0.93 |
| slicing / DefOpt / tpu / PreRev | 0.00000253935 s | 0.000002531425 s | 1.00 |
| slicing / DefOpt / tpu / PostRev | 0.00000240175 s | 0.000002575075 s | 0.93 |
| slicing / DefOpt / tpu / BothRev | 0.000002541325 s | 0.000002524525 s | 1.01 |
| slicing / IDefOpt / tpu / PreRev | 0.000002409375 s | 0.000002573175 s | 0.94 |
| slicing / IDefOpt / tpu / PostRev | 0.000002540325 s | 0.000002527475 s | 1.01 |
| slicing / IDefOpt / tpu / BothRev | 0.000002399925 s | 0.000002570325 s | 0.93 |
| slicing / JaXPipe / cpu / Primal | 0.000012851 s | 0.000006256359974941006 s | 2.05 |
| slicing / Jax / cpu / Primal | 0.000012538 s | 0.000006476919879787601 s | 1.94 |
| slicing / HLOOpt / cpu / Primal | 0.000012453 s | 0.000006179880001582205 s | 2.02 |
| slicing / PartOpt / cpu / Primal | 0.000012503 s | 0.000006054379955457989 s | 2.07 |
| slicing / IPartOpt / cpu / Primal | 0.000012413 s | 0.000006397199995262781 s | 1.94 |
| slicing / DefOpt / cpu / Primal | 0.000012436 s | 0.000005949320056970464 s | 2.09 |
| slicing / IDefOpt / cpu / Primal | 0.000012487 s | 0.000006439940079872031 s | 1.94 |
| slicing / JaXPipe / cpu / Forward | 0.000016479 s | 0.000009397520007041748 s | 1.75 |
| slicing / Jax / cpu / Forward | 0.000016247999999999998 s | 0.000009175039958790875 s | 1.77 |
| slicing / HLOOpt / cpu / Forward | 0.000016349 s | 0.00000989017997198971 s | 1.65 |
| slicing / PartOpt / cpu / Forward | 0.00001614 s | 0.00000902590001714998 s | 1.79 |
| slicing / IPartOpt / cpu / Forward | 0.000016468 s | 0.000009671999978309033 s | 1.70 |
| slicing / DefOpt / cpu / Forward | 0.000016612 s | 0.00000922690007428173 s | 1.80 |
| slicing / IDefOpt / cpu / Forward | 0.000016213000000000002 s | 0.000009310740024375264 s | 1.74 |
| slicing / JaXPipe / cpu / PreRev | 0.000017507 s | 0.00001027858004817972 s | 1.70 |
| slicing / JaXPipe / cpu / PostRev | 0.000017371 s | 0.000009631540033296916 s | 1.80 |
| slicing / JaXPipe / cpu / BothRev | 0.000017164 s | 0.000010070120042655615 s | 1.70 |
| slicing / Jax / cpu / BothRev | 0.000017155 s | 0.00000994826003079652 s | 1.72 |
| slicing / HLOOpt / cpu / PreRev | 0.000016516 s | 0.000009710159956739516 s | 1.70 |
| slicing / HLOOpt / cpu / PostRev | 0.000016913000000000002 s | 0.000011341420031385496 s | 1.49 |
| slicing / HLOOpt / cpu / BothRev | 0.000017177999999999997 s | 0.00000971812001807848 s | 1.77 |
| slicing / PartOpt / cpu / PreRev | 0.000017468999999999998 s | 0.00001009372001135489 s | 1.73 |
| slicing / PartOpt / cpu / PostRev | 0.000016612 s | 0.000009648800096329067 s | 1.72 |
| slicing / PartOpt / cpu / BothRev | 0.000017283 s | 0.000010046099978353596 s | 1.72 |
| slicing / IPartOpt / cpu / PreRev | 0.000016802999999999998 s | 0.000009653820034145613 s | 1.74 |
| slicing / IPartOpt / cpu / PostRev | 0.000016725 s | 0.000010048899948742472 s | 1.66 |
| slicing / IPartOpt / cpu / BothRev | 0.000016751999999999998 s | 0.000010373479999543634 s | 1.61 |
| slicing / DefOpt / cpu / PreRev | 0.000016849 s | 0.000009390980030730134 s | 1.79 |
| slicing / DefOpt / cpu / PostRev | 0.000017221 s | 0.000009702339957584628 s | 1.77 |
| slicing / DefOpt / cpu / BothRev | 0.000016818 s | 0.00000979103999270592 s | 1.72 |
| slicing / IDefOpt / cpu / PreRev | 0.000017052000000000002 s | 0.000009572320104780374 s | 1.78 |
| slicing / IDefOpt / cpu / PostRev | 0.000017213 s | 0.000010150820035050856 s | 1.70 |
| slicing / IDefOpt / cpu / BothRev | 0.000017135 s | 0.000009666499990999 s | 1.77 |
| sum / JaXPipe / cpu / Primal | 0.000007505560024583246 s | 0.000007971080012794119 s | 0.94 |
| sum / Jax / cpu / Primal | 0.000007456480025211931 s | 0.000007625819998793304 s | 0.98 |
| sum / HLOOpt / cpu / Primal | 0.000007590360019094077 s | 0.000008017239943001186 s | 0.95 |
| sum / PartOpt / cpu / Primal | 0.000007594680009788135 s | 0.000007410100042761769 s | 1.02 |
| sum / IPartOpt / cpu / Primal | 0.000008110639973892831 s | 0.000008181239918485517 s | 0.99 |
| sum / DefOpt / cpu / Primal | 0.000007500300016545225 s | 0.000007373279931925935 s | 1.02 |
| sum / IDefOpt / cpu / Primal | 0.000007867040021665162 s | 0.000007935899957374203 s | 0.99 |
| sum / JaXPipe / cpu / Forward | 0.000011876720018335618 s | 0.000011452859980636275 s | 1.04 |
| sum / Jax / cpu / Forward | 0.0000110155799666245 s | 0.0000106855801641359 s | 1.03 |
| sum / HLOOpt / cpu / Forward | 0.000011646959983409031 s | 0.000011622879992501113 s | 1.00 |
| sum / PartOpt / cpu / Forward | 0.000011131020009997884 s | 0.000011153219929838087 s | 1.00 |
| sum / IPartOpt / cpu / Forward | 0.000011223839965168737 s | 0.00001095907997296308 s | 1.02 |
| sum / DefOpt / cpu / Forward | 0.000010977800002365257 s | 0.000011301559970888776 s | 0.97 |
| sum / IDefOpt / cpu / Forward | 0.000011461520007287616 s | 0.000011096039925178047 s | 1.03 |
| sum / JaXPipe / cpu / PreRev | 0.000011182479975104796 s | 0.00001128660000176751 s | 0.99 |
| sum / JaXPipe / cpu / PostRev | 0.00001082474001123046 s | 0.000010986259912897368 s | 0.99 |
| sum / JaXPipe / cpu / BothRev | 0.00001075096000022313 s | 0.000011202440018678316 s | 0.96 |
| sum / Jax / cpu / BothRev | 0.000010770759990919032 s | 0.000010707339952205074 s | 1.01 |
| sum / HLOOpt / cpu / PreRev | 0.000010849079981198883 s | 0.000011084920060966397 s | 0.98 |
| sum / HLOOpt / cpu / PostRev | 0.00001267797999389586 s | 0.000013082000023132423 s | 0.97 |
| sum / HLOOpt / cpu / BothRev | 0.000010638519988788177 s | 0.000010698379956011197 s | 0.99 |
| sum / PartOpt / cpu / PreRev | 0.000011138760019093752 s | 0.000010895840041484915 s | 1.02 |
| sum / PartOpt / cpu / PostRev | 0.000010747119995357936 s | 0.000010930300031759544 s | 0.98 |
| sum / PartOpt / cpu / BothRev | 0.000010633939964463937 s | 0.000011242800028412602 s | 0.95 |
| sum / IPartOpt / cpu / PreRev | 0.000011079780033469432 s | 0.000010910799974226392 s | 1.02 |
| sum / IPartOpt / cpu / PostRev | 0.000011012079985448509 s | 0.00001021207994199358 s | 1.08 |
| sum / IPartOpt / cpu / BothRev | 0.000011232640008529416 s | 0.000010417019948363305 s | 1.08 |
| sum / DefOpt / cpu / PreRev | 0.00001078114001757058 s | 0.00001097947999369353 s | 0.98 |
| sum / DefOpt / cpu / PostRev | 0.000010924480011453853 s | 0.000010882719961955444 s | 1.00 |
| sum / DefOpt / cpu / BothRev | 0.000010494279986232868 s | 0.00001088356009859126 s | 0.96 |
| sum / IDefOpt / cpu / PreRev | 0.000011190279983566143 s | 0.000011024900013580918 s | 1.02 |
| sum / IDefOpt / cpu / PostRev | 0.000010794699974212563 s | 0.000010991760118486127 s | 0.98 |
| sum / IDefOpt / cpu / BothRev | 0.000010756859983303005 s | 0.000010506819999136496 s | 1.02 |
| sum / JaXPipe / cuda / Primal | 0.000002047 s | 0.000002464 s | 0.83 |
| sum / Jax / cuda / Primal | 0.000002048 s | 0.000002463 s | 0.83 |
| sum / HLOOpt / cuda / Primal | 0.000002047 s | 0.000002463 s | 0.83 |
| sum / PartOpt / cuda / Primal | 0.000002048 s | 0.000002463 s | 0.83 |
| sum / IPartOpt / cuda / Primal | 0.000002048 s | 0.000002463 s | 0.83 |
| sum / DefOpt / cuda / Primal | 0.000002048 s | 0.000002463 s | 0.83 |
| sum / IDefOpt / cuda / Primal | 0.000002047 s | 0.000002463 s | 0.83 |
| sum / JaXPipe / cuda / Forward | 0.000010208 s | 0.000010623 s | 0.96 |
| sum / Jax / cuda / Forward | 0.000010304 s | 0.000010464 s | 0.98 |
| sum / HLOOpt / cuda / Forward | 0.000010016 s | 0.000009952 s | 1.01 |
| sum / PartOpt / cuda / Forward | 0.000009984 s | 0.000010592 s | 0.94 |
| sum / IPartOpt / cuda / Forward | 0.000009984 s | 0.000010496 s | 0.95 |
| sum / DefOpt / cuda / Forward | 0.000009824 s | 0.000010464 s | 0.94 |
| sum / IDefOpt / cuda / Forward | 0.000009793 s | 0.000010496 s | 0.93 |
| sum / JaXPipe / cuda / PreRev | 0.000010111 s | 0.00001024 s | 0.99 |
| sum / JaXPipe / cuda / PostRev | 0.000009216 s | 0.000010592 s | 0.87 |
| sum / JaXPipe / cuda / BothRev | 0.000009824 s | 0.000010304 s | 0.95 |
| sum / Jax / cuda / BothRev | 0.000009888 s | 0.000010208 s | 0.97 |
| sum / HLOOpt / cuda / PreRev | 0.000009984 s | 0.000010432 s | 0.96 |
| sum / HLOOpt / cuda / PostRev | 0.000009376 s | 0.000010368 s | 0.90 |
| sum / HLOOpt / cuda / BothRev | 0.000009664 s | 0.000010111 s | 0.96 |
| sum / PartOpt / cuda / PreRev | 0.000010144 s | 0.0000104 s | 0.98 |
| sum / PartOpt / cuda / PostRev | 0.000009696 s | 0.000010272 s | 0.94 |
| sum / PartOpt / cuda / BothRev | 0.000009697 s | 0.00000944 s | 1.03 |
| sum / IPartOpt / cuda / PreRev | 0.000009792 s | 0.000010496 s | 0.93 |
| sum / IPartOpt / cuda / PostRev | 0.000009184 s | 0.000010272 s | 0.89 |
| sum / IPartOpt / cuda / BothRev | 0.000009984 s | 0.000010368 s | 0.96 |
| sum / DefOpt / cuda / PreRev | 0.000010144 s | 0.00001088 s | 0.93 |
| sum / DefOpt / cuda / PostRev | 0.000009792 s | 0.000010976 s | 0.89 |
| sum / DefOpt / cuda / BothRev | 0.00000976 s | 0.000010592 s | 0.92 |
| sum / IDefOpt / cuda / PreRev | 0.000009696 s | 0.000010944 s | 0.89 |
| sum / IDefOpt / cuda / PostRev | 0.00001008 s | 0.000010336 s | 0.98 |
| sum / IDefOpt / cuda / BothRev | 0.00000992 s | 0.000010656 s | 0.93 |
| sum / JaXPipe / tpu / Primal | 5.1055e-7 s | 5.03125e-7 s | 1.01 |
| sum / Jax / tpu / Primal | 5.4715e-7 s | 5.4745e-7 s | 1.00 |
| sum / HLOOpt / tpu / Primal | 5.1055e-7 s | 5.036e-7 s | 1.01 |
| sum / PartOpt / tpu / Primal | 5.4675e-7 s | 5.47275e-7 s | 1.00 |
| sum / IPartOpt / tpu / Primal | 5.1025e-7 s | 5.0345e-7 s | 1.01 |
| sum / DefOpt / tpu / Primal | 5.46975e-7 s | 5.47275e-7 s | 1.00 |
| sum / IDefOpt / tpu / Primal | 5.10025e-7 s | 5.0305e-7 s | 1.01 |
| sum / JaXPipe / tpu / Forward | 0.0000015586749999999995 s | 0.000001551925 s | 1.00 |
| sum / Jax / tpu / Forward | 0.00000150165 s | 0.00000149655 s | 1.00 |
| sum / HLOOpt / tpu / Forward | 0.0000015337 s | 0.000001529625 s | 1.00 |
| sum / PartOpt / tpu / Forward | 0.0000014954 s | 0.00000149105 s | 1.00 |
| sum / IPartOpt / tpu / Forward | 0.0000015343750000000002 s | 0.00000152995 s | 1.00 |
| sum / DefOpt / tpu / Forward | 0.00000149345 s | 0.0000014889749999999998 s | 1.00 |
| sum / IDefOpt / tpu / Forward | 0.0000015328 s | 0.0000015351 s | 1.00 |
| sum / JaXPipe / tpu / PreRev | 0.0000010012999999999998 s | 0.0000010405 s | 0.96 |
| sum / JaXPipe / tpu / PostRev | 0.00000103985 s | 0.0000010893 s | 0.95 |
| sum / JaXPipe / tpu / BothRev | 0.00000100555 s | 0.000001042775 s | 0.96 |
| sum / Jax / tpu / BothRev | 0.0000010401 s | 0.000001087625 s | 0.96 |
| sum / HLOOpt / tpu / PreRev | 0.00000101485 s | 0.00000104035 s | 0.98 |
| sum / HLOOpt / tpu / PostRev | 0.0000010402 s | 0.0000010899499999999998 s | 0.95 |
| sum / HLOOpt / tpu / BothRev | 0.0000010161 s | 0.00000103875 s | 0.98 |
| sum / PartOpt / tpu / PreRev | 0.0000010457750000000002 s | 0.0000010843 s | 0.96 |
| sum / PartOpt / tpu / PostRev | 0.00000100665 s | 0.000001041575 s | 0.97 |
| sum / PartOpt / tpu / BothRev | 0.000001051525 s | 0.000001088925 s | 0.97 |
| sum / IPartOpt / tpu / PreRev | 0.000001002625 s | 0.000001046425 s | 0.96 |
| sum / IPartOpt / tpu / PostRev | 0.0000010504 s | 0.0000010828999999999998 s | 0.97 |
| sum / IPartOpt / tpu / BothRev | 0.00000100675 s | 0.000001040425 s | 0.97 |
| sum / DefOpt / tpu / PreRev | 0.0000010382 s | 0.000001083325 s | 0.96 |
| sum / DefOpt / tpu / PostRev | 0.000001024725 s | 0.000001040375 s | 0.98 |
| sum / DefOpt / tpu / BothRev | 0.00000104945 s | 0.000001094025 s | 0.96 |
| sum / IDefOpt / tpu / PreRev | 0.000001022925 s | 0.0000010473499999999998 s | 0.98 |
| sum / IDefOpt / tpu / PostRev | 0.0000010409 s | 0.0000010966750000000002 s | 0.95 |
| sum / IDefOpt / tpu / BothRev | 0.000001 s | 0.0000010399750000000002 s | 0.96 |
| sum / JaXPipe / cpu / Primal | 0.000014456 s | 0.000007971080012794119 s | 1.81 |
| sum / Jax / cpu / Primal | 0.000014248 s | 0.000007625819998793304 s | 1.87 |
| sum / HLOOpt / cpu / Primal | 0.000014802 s | 0.000008017239943001186 s | 1.85 |
| sum / PartOpt / cpu / Primal | 0.000014864 s | 0.000007410100042761769 s | 2.01 |
| sum / IPartOpt / cpu / Primal | 0.000013988 s | 0.000008181239918485517 s | 1.71 |
| sum / DefOpt / cpu / Primal | 0.000014335 s | 0.000007373279931925935 s | 1.94 |
| sum / IDefOpt / cpu / Primal | 0.000014295 s | 0.000007935899957374203 s | 1.80 |
| sum / JaXPipe / cpu / Forward | 0.00001986 s | 0.000011452859980636275 s | 1.73 |
| sum / Jax / cpu / Forward | 0.000019868 s | 0.0000106855801641359 s | 1.86 |
| sum / HLOOpt / cpu / Forward | 0.000019453 s | 0.000011622879992501113 s | 1.67 |
| sum / PartOpt / cpu / Forward | 0.000019739 s | 0.000011153219929838087 s | 1.77 |
| sum / IPartOpt / cpu / Forward | 0.000019513 s | 0.00001095907997296308 s | 1.78 |
| sum / DefOpt / cpu / Forward | 0.000019671 s | 0.000011301559970888776 s | 1.74 |
| sum / IDefOpt / cpu / Forward | 0.00001968 s | 0.000011096039925178047 s | 1.77 |
| sum / JaXPipe / cpu / PreRev | 0.000018843 s | 0.00001128660000176751 s | 1.67 |
| sum / JaXPipe / cpu / PostRev | 0.000018568 s | 0.000010986259912897368 s | 1.69 |
| sum / JaXPipe / cpu / BothRev | 0.00001831 s | 0.000011202440018678316 s | 1.63 |
| sum / Jax / cpu / BothRev | 0.000018486 s | 0.000010707339952205074 s | 1.73 |
| sum / HLOOpt / cpu / PreRev | 0.000018669 s | 0.000011084920060966397 s | 1.68 |
| sum / HLOOpt / cpu / PostRev | 0.000018508 s | 0.000013082000023132423 s | 1.41 |
| sum / HLOOpt / cpu / BothRev | 0.000018357 s | 0.000010698379956011197 s | 1.72 |
| sum / PartOpt / cpu / PreRev | 0.000018668 s | 0.000010895840041484915 s | 1.71 |
| sum / PartOpt / cpu / PostRev | 0.000018394 s | 0.000010930300031759544 s | 1.68 |
| sum / PartOpt / cpu / BothRev | 0.000018463 s | 0.000011242800028412602 s | 1.64 |
| sum / IPartOpt / cpu / PreRev | 0.000018097 s | 0.000010910799974226392 s | 1.66 |
| sum / IPartOpt / cpu / PostRev | 0.000018488 s | 0.00001021207994199358 s | 1.81 |
| sum / IPartOpt / cpu / BothRev | 0.000018504 s | 0.000010417019948363305 s | 1.78 |
| sum / DefOpt / cpu / PreRev | 0.00001869 s | 0.00001097947999369353 s | 1.70 |
| sum / DefOpt / cpu / PostRev | 0.0000185 s | 0.000010882719961955444 s | 1.70 |
| sum / DefOpt / cpu / BothRev | 0.000018454000000000003 s | 0.00001088356009859126 s | 1.70 |
| sum / IDefOpt / cpu / PreRev | 0.00001847 s | 0.000011024900013580918 s | 1.68 |
| sum / IDefOpt / cpu / PostRev | 0.000018327 s | 0.000010991760118486127 s | 1.67 |
| sum / IDefOpt / cpu / BothRev | 0.000018359 s | 0.000010506819999136496 s | 1.75 |
| value_and_grad / JaXPipe / cpu / Primal | 0.000014504560012937872 s | 0.000013755879972450204 s | 1.05 |
| value_and_grad / Jax / cpu / Primal | 0.000013792740028293335 s | 0.000013791200035484508 s | 1.00 |
| value_and_grad / HLOOpt / cpu / Primal | 0.000013658100006068709 s | 0.000013499440010491526 s | 1.01 |
| value_and_grad / PartOpt / cpu / Primal | 0.000013133639959050924 s | 0.00001282425997487735 s | 1.02 |
| value_and_grad / IPartOpt / cpu / Primal | 0.000013559520011767743 s | 0.0000132059399766149 s | 1.03 |
| value_and_grad / DefOpt / cpu / Primal | 0.000013488160011547734 s | 0.00001333509995674831 s | 1.01 |
| value_and_grad / IDefOpt / cpu / Primal | 0.000012985880011910922 s | 0.000013777679923805408 s | 0.94 |
| value_and_grad / JaXPipe / cuda / Primal | 0.000033056 s | 0.000033727 s | 0.98 |
| value_and_grad / Jax / cuda / Primal | 0.000033408 s | 0.000033375000000000005 s | 1.00 |
| value_and_grad / HLOOpt / cuda / Primal | 0.00003248 s | 0.000033632 s | 0.97 |
| value_and_grad / PartOpt / cuda / Primal | 0.000032417 s | 0.000033695 s | 0.96 |
| value_and_grad / IPartOpt / cuda / Primal | 0.000033119999999999995 s | 0.00003408 s | 0.97 |
| value_and_grad / DefOpt / cuda / Primal | 0.000032992 s | 0.000033759999999999995 s | 0.98 |
| value_and_grad / IDefOpt / cuda / Primal | 0.000033408 s | 0.000034208 s | 0.98 |
| value_and_grad / JaXPipe / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / Jax / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / HLOOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / PartOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / IPartOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / DefOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / IDefOpt / tpu / Primal | 0 s | 0 s | 1 |
| value_and_grad / JaXPipe / cpu / Primal | 0.000022787 s | 0.000013755879972450204 s | 1.66 |
| value_and_grad / Jax / cpu / Primal | 0.000021908 s | 0.000013791200035484508 s | 1.59 |
| value_and_grad / HLOOpt / cpu / Primal | 0.000022252 s | 0.000013499440010491526 s | 1.65 |
| value_and_grad / PartOpt / cpu / Primal | 0.000022418 s | 0.00001282425997487735 s | 1.75 |
| value_and_grad / IPartOpt / cpu / Primal | 0.000022653 s | 0.0000132059399766149 s | 1.72 |
| value_and_grad / DefOpt / cpu / Primal | 0.000022396 s | 0.00001333509995674831 s | 1.68 |
| value_and_grad / IDefOpt / cpu / Primal | 0.000022485 s | 0.000013777679923805408 s | 1.63 |
| jaxmd20 / JaXPipe / cuda / Primal | 0.001444389 s | 0.001465816 s | 0.99 |
| jaxmd20 / Jax / cuda / Primal | 0.001514213 s | 0.001531639 s | 0.99 |
| jaxmd20 / HLOOpt / cuda / Primal | 0.001338886 s | 0.00137385 s | 0.97 |
| jaxmd20 / PartOpt / cuda / Primal | 0.001412485 s | 0.0013668089999999 s | 1.03 |
| jaxmd20 / IPartOpt / cuda / Primal | 0.001305348 s | 0.00134812 s | 0.97 |
| jaxmd20 / DefOpt / cuda / Primal | 0.00092202 s | 0.000938491 s | 0.98 |
| jaxmd20 / IDefOpt / cuda / Primal | 0.000952835 s | 0.000962875 s | 0.99 |
| jaxmd20 / JaXPipe / cuda / Forward | 0.001562661 s | 0.001631768 s | 0.96 |
| jaxmd20 / Jax / cuda / Forward | 0.001780325 s | 0.001852822 s | 0.96 |
| jaxmd20 / HLOOpt / cuda / Forward | 0.001636294 s | 0.001709686 s | 0.96 |
| jaxmd20 / PartOpt / cuda / Forward | 0.0016340539999999 s | 0.001719518 s | 0.95 |
| jaxmd20 / IPartOpt / cuda / Forward | 0.001651591 s | 0.0017157329999999 s | 0.96 |
| jaxmd20 / DefOpt / cuda / Forward | 0.001669126 s | 0.001707192 s | 0.98 |
| jaxmd20 / IDefOpt / cuda / Forward | 0.0016205879999999 s | 0.0017267369999999 s | 0.94 |
| jaxmd20 / JaXPipe / cuda / PreRev | 0.002743687 s | 0.002786098 s | 0.98 |
| jaxmd20 / JaXPipe / cuda / PostRev | 0.005380689 s | 0.005518528 s | 0.98 |
| jaxmd20 / JaXPipe / cuda / BothRev | 0.002795712 s | 0.002824722 s | 0.99 |
| jaxmd20 / Jax / cuda / BothRev | 0.005387663 s | 0.005536183 s | 0.97 |
| jaxmd20 / HLOOpt / cuda / PreRev | 0.0027522329999999 s | 0.002873778 s | 0.96 |
| jaxmd20 / HLOOpt / cuda / PostRev | 0.005395277 s | 0.005524066 s | 0.98 |
| jaxmd20 / HLOOpt / cuda / BothRev | 0.002789256 s | 0.002818 s | 0.99 |
| jaxmd20 / PartOpt / cuda / PreRev | 0.002826826 s | 0.002889521 s | 0.98 |
| jaxmd20 / PartOpt / cuda / PostRev | 0.005511861 s | 0.0056870079999999 s | 0.97 |
| jaxmd20 / PartOpt / cuda / BothRev | 0.002755752 s | 0.002860852 s | 0.96 |
| jaxmd20 / IPartOpt / cuda / PreRev | 0.00281009 s | 0.002924271 s | 0.96 |
| jaxmd20 / IPartOpt / cuda / PostRev | 0.0054388 s | 0.005666755 s | 0.96 |
| jaxmd20 / IPartOpt / cuda / BothRev | 0.0027673359999999 s | 0.002837265 s | 0.98 |
| jaxmd20 / DefOpt / cuda / PreRev | 0.002850314 s | 0.002912848 s | 0.98 |
| jaxmd20 / DefOpt / cuda / PostRev | 0.002768168 s | 0.002859119 s | 0.97 |
| jaxmd20 / DefOpt / cuda / BothRev | 0.0027654809999999 s | 0.002833778 s | 0.98 |
| jaxmd20 / IDefOpt / cuda / PreRev | 0.002826665 s | 0.0028988649999999 s | 0.98 |
| jaxmd20 / IDefOpt / cuda / PostRev | 0.002329127 s | 0.002351861 s | 0.99 |
| jaxmd20 / IDefOpt / cuda / BothRev | 0.0027525529999999 s | 0.002841298 s | 0.97 |
| jaxmd20 / JaXPipe / tpu / Primal | 0.009278036875 s | 0.009279790625 s | 1.00 |
| jaxmd20 / Jax / tpu / Primal | 0.009275595 s | 0.009277066875 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / Primal | 0.00915694625 s | 0.009156288125 s | 1.00 |
| jaxmd20 / PartOpt / tpu / Primal | 0.00919631 s | 0.0091969675 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / Primal | 0.00919851875 s | 0.009199265 s | 1.00 |
| jaxmd20 / DefOpt / tpu / Primal | 0.008796035625 s | 0.00879701125 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / Primal | 0.008693321875 s | 0.008693685625 s | 1.00 |
| jaxmd20 / JaXPipe / tpu / Forward | 0.017410738125 s | 0.01741432 s | 1.00 |
| jaxmd20 / Jax / tpu / Forward | 0.01873232875 s | 0.018728843125 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / Forward | 0.0174027725 s | 0.017404439375 s | 1.00 |
| jaxmd20 / PartOpt / tpu / Forward | 0.01740809 s | 0.01740671 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / Forward | 0.017413765625 s | 0.0174164325 s | 1.00 |
| jaxmd20 / DefOpt / tpu / Forward | 0.0174099218749999 s | 0.017410795625 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / Forward | 0.0174090787499999 s | 0.017411913125 s | 1.00 |
| jaxmd20 / JaXPipe / tpu / PreRev | 0.02544478625 s | 0.025446514375 s | 1.00 |
| jaxmd20 / JaXPipe / tpu / PostRev | 0.02154976375 s | 0.021859200625 s | 0.99 |
| jaxmd20 / JaXPipe / tpu / BothRev | 0.025442028125 s | 0.02544817125 s | 1.00 |
| jaxmd20 / Jax / tpu / BothRev | 0.021858575625 s | 0.0218556706249999 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / PreRev | 0.0255665743749999 s | 0.025562918125 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / PostRev | 0.02071068125 s | 0.0207051775 s | 1.00 |
| jaxmd20 / HLOOpt / tpu / BothRev | 0.02566761 s | 0.02566237625 s | 1.00 |
| jaxmd20 / PartOpt / tpu / PreRev | 0.025473980625 s | 0.025475254375 s | 1.00 |
| jaxmd20 / PartOpt / tpu / PostRev | 0.021504533125 s | 0.02150513125 s | 1.00 |
| jaxmd20 / PartOpt / tpu / BothRev | 0.02556725875 s | 0.025568385 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / PreRev | 0.025451670625 s | 0.025453468125 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / PostRev | 0.021515989375 s | 0.02151529 s | 1.00 |
| jaxmd20 / IPartOpt / tpu / BothRev | 0.02554441375 s | 0.0255439025 s | 1.00 |
| jaxmd20 / DefOpt / tpu / PreRev | 0.025471826875 s | 0.02547305875 s | 1.00 |
| jaxmd20 / DefOpt / tpu / PostRev | 0.01881327625 s | 0.018810044375 s | 1.00 |
| jaxmd20 / DefOpt / tpu / BothRev | 0.0255602325 s | 0.025561076875 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / PreRev | 0.025455016875 s | 0.025455626875 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / PostRev | 0.0183256643749999 s | 0.0183291187499999 s | 1.00 |
| jaxmd20 / IDefOpt / tpu / BothRev | 0.02554731875 s | 0.025542485625 s | 1.00 |
| jaxmd40 / JaXPipe / cpu / Primal | 0.066463017 s | 0.089487879 s | 0.74 |
| jaxmd40 / Jax / cpu / Primal | 0.066942678 s | 0.0878360639999999 s | 0.76 |
| jaxmd40 / HLOOpt / cpu / Primal | 0.079671177 s | 0.112789086 s | 0.71 |
| jaxmd40 / PartOpt / cpu / Primal | 0.055826796 s | 0.080333378 s | 0.69 |
| jaxmd40 / IPartOpt / cpu / Primal | 0.065624944 s | 0.084480979 s | 0.78 |
| jaxmd40 / DefOpt / cpu / Primal | 0.084477014 s | 0.114900792 s | 0.74 |
| jaxmd40 / IDefOpt / cpu / Primal | 0.076211854 s | 0.111910519 s | 0.68 |
| jaxmd40 / JaXPipe / cpu / Forward | 0.152159538 s | 0.201633292 s | 0.75 |
| jaxmd40 / Jax / cpu / Forward | 0.077673815 s | 0.108089939 s | 0.72 |
| jaxmd40 / HLOOpt / cpu / Forward | 0.149718927 s | 0.206739723 s | 0.72 |
| jaxmd40 / PartOpt / cpu / Forward | 0.149829033 s | 0.200147787 s | 0.75 |
| jaxmd40 / IPartOpt / cpu / Forward | 0.15207049 s | 0.201814104 s | 0.75 |
| jaxmd40 / DefOpt / cpu / Forward | 0.156317131 s | 0.20460292 s | 0.76 |
| jaxmd40 / IDefOpt / cpu / Forward | 0.150011141 s | 0.197395045 s | 0.76 |
| jaxmd40 / JaXPipe / cpu / PreRev | 0.2235867409999999 s | 0.273711546 s | 0.82 |
| jaxmd40 / JaXPipe / cpu / PostRev | 0.129416963 s | 0.17284138 s | 0.75 |
| jaxmd40 / JaXPipe / cpu / BothRev | 0.210752327 s | 0.271164212 s | 0.78 |
| jaxmd40 / Jax / cpu / BothRev | 0.1292295269999999 s | 0.168613278 s | 0.77 |
| jaxmd40 / HLOOpt / cpu / PreRev | 0.213229091 s | 0.264947864 s | 0.80 |
| jaxmd40 / HLOOpt / cpu / PostRev | 0.173448369 s | 0.227558183 s | 0.76 |
| jaxmd40 / HLOOpt / cpu / BothRev | 0.243698861 s | 0.296226275 s | 0.82 |
| jaxmd40 / PartOpt / cpu / PreRev | 0.231183327 s | 0.268889165 s | 0.86 |
| jaxmd40 / PartOpt / cpu / PostRev | 0.130596623 s | 0.1784504489999999 s | 0.73 |
| jaxmd40 / PartOpt / cpu / BothRev | 0.244417935 s | 0.314284117 s | 0.78 |
| jaxmd40 / IPartOpt / cpu / PreRev | 0.222225399 s | 0.258440415 s | 0.86 |
| jaxmd40 / IPartOpt / cpu / PostRev | 0.130521647 s | 0.157997828 s | 0.83 |
| jaxmd40 / IPartOpt / cpu / BothRev | 0.23208774 s | 0.301796951 s | 0.77 |
| jaxmd40 / DefOpt / cpu / PreRev | 0.218941858 s | 0.266499049 s | 0.82 |
| jaxmd40 / DefOpt / cpu / PostRev | 0.1680480389999999 s | 0.223556878 s | 0.75 |
| jaxmd40 / DefOpt / cpu / BothRev | 0.256210166 s | 0.277931194 s | 0.92 |
| jaxmd40 / IDefOpt / cpu / PreRev | 0.232196358 s | 0.259998777 s | 0.89 |
| jaxmd40 / IDefOpt / cpu / PostRev | 0.167476895 s | 0.222898555 s | 0.75 |
| jaxmd40 / IDefOpt / cpu / BothRev | 0.231823435 s | 0.301388099 s | 0.77 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal | 1.701906729 s | 1.701157012 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal | 1.704738632 s | 1.7031430600000002 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal | 1.715349354 s | 1.714958551 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal | 1.697039665 s | 1.69272958 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal | 1.6949681230000002 s | 1.6922887880000002 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal | 1.665589964 s | 1.664518708 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal | 1.91293939 s | 1.913615274 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal | 3.038888586875 s | 3.038750500625 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal | 3.0394803525 s | 3.03936722125 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal | 3.121676485625 s | 3.121648468125 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal | 3.0602713125 s | 3.060112531875 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal | 3.0606538918750004 s | 3.06032794375 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal | 2.10243289875 s | 2.10245704 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal | 2.948220215 s | 2.9484067300000003 s | 1.00 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal | 5.901824107 s | 7.522609622 s | 0.78 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal | 5.802870412 s | 7.451253786 s | 0.78 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal | 5.900959729999999 s | 7.300774703 s | 0.81 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal | 6.034269198 s | 7.5237576 s | 0.80 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal | 5.923156271 s | 7.461139831 s | 0.79 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal | 2.30789797 s | 3.231166343 s | 0.71 |
| neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal | 6.365659994 s | 7.826565502 s | 0.81 |
This comment was automatically generated by workflow using github-action-benchmark.
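
For readers checking the table by hand: the Ratio column appears to be the Current time divided by the Previous time, rounded to two decimals, so values below 1.00 mean the current commit ran faster. The sketch below reproduces three rows from the table under that assumption. The helper name `ratio` is hypothetical and is not part of github-action-benchmark, and the `0 s` / `0 s` rows (shown as `1` in the table) are deliberately not handled.

```python
def ratio(current_s: float, previous_s: float) -> float:
    """Current / Previous, rounded to two decimals (assumed definition of the Ratio column)."""
    if previous_s == 0:
        # The table prints 1 for 0 s / 0 s rows; how the tool derives that is not shown,
        # so this sketch refuses to guess.
        raise ValueError("previous time is zero; ratio is undefined in this sketch")
    return round(current_s / previous_s, 2)


# Values copied from the table above.
slicing_ipartopt_cpu = ratio(0.00000621353994574747, 0.000006397199995262781)
slicing_jaxpipe_cpu_second_run = ratio(0.000012851, 0.000006256359974941006)
jaxmd40_jaxpipe_cpu = ratio(0.066463017, 0.089487879)

print(slicing_ipartopt_cpu, slicing_jaxpipe_cpu_second_run, jaxmd40_jaxpipe_cpu)
```

Most of the microsecond-scale rows differ by only a few percent, so small deviations from 1.00 there may simply reflect run-to-run noise rather than a real change.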