forked from ruby/ruby
[pull] master from ruby:master #706
Merged
We want to use [linear scan register allocation](https://bernsteinbear.com/blog/linear-scan/), but a prerequisite is having a CFG available. Previously, LIR only had a linear block of instructions; this PR introduces a CFG to the LIR backend. I've done my best to ensure that the "hot path" machine code we generate is the same (while testing, I noticed that side exit machine code was being dumped in a different order). This PR doesn't make any changes to the existing register allocator; it simply introduces a CFG to LIR. Every basic block in the LIR CFG starts with a label (the first instruction is a label), and its last 0, 1, or 2 instructions are jump instructions. No other jump instructions should appear mid-block.
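The block-shape rule described above (label first, at most two trailing jumps, no jumps mid-block) can be sketched as a small checker. This is only an illustration in Python, not ZJIT's Rust code: the `Insn` class, the opcode names in `JUMP_OPS`, and `check_block` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical opcode names, for illustration only; not ZJIT's actual LIR opcodes.
JUMP_OPS = {"jmp", "jz", "jnz"}


@dataclass
class Insn:
    op: str
    target: str = ""  # label name for "label" and jump instructions


def check_block(insns):
    """Return True if the instruction list satisfies the block-shape rule."""
    # Rule 1: the first instruction must be a label.
    if not insns or insns[0].op != "label":
        return False
    body = insns[1:]

    # Rule 2: count the run of jump instructions at the end of the block.
    n_jumps = 0
    while n_jumps < len(body) and body[len(body) - 1 - n_jumps].op in JUMP_OPS:
        n_jumps += 1
    if n_jumps > 2:
        return False

    # Rule 3: nothing before the trailing jumps may be a jump.
    middle = body[: len(body) - n_jumps]
    return all(insn.op not in JUMP_OPS for insn in middle)


def jump_targets(insns):
    """Collect the targets of every jump in the block, in order."""
    return [insn.target for insn in insns if insn.op in JUMP_OPS]
```

For example, a block of `label bb0; add; jz bb2; jmp bb1` passes the check and has jump targets `bb2` and `bb1`, while a block with a `jmp` followed by an `add` is rejected.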
This PR is a follow-up to #15816. There, I introduced the `GuardSuperMethodEntry` HIR instruction, which needs the LEP. The LEP was also used by `GetBlockHandler`. Consequently, the codegen for `invokesuper` ended up loading the LEP twice. By introducing a new HIR instruction, `GetLEP`, we can load the LEP once and use it in both `GetBlockHandler` and `GuardSuperMethodEntry`. I also updated `IsBlockGiven`, which conditionally loaded the LEP. To ensure we only use `GetLEP` in the cases where we need it, I lifted most of the `IsBlockGiven` handler to HIR. As an added benefit, this addresses a TODO that @tekknolagi had written: when `block_given?` is called outside of a method, we can rewrite it to a constant `false`. We could also use `GetLEP` in the handling of `Defined`, but that looked a bit more involved and I wanted to keep this PR focused, so I'm suggesting we handle that as future work.
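A rough sketch of the idea, in Python rather than ZJIT's Rust: one `GetLEP` value feeds both consumers, and `block_given?` outside a method folds to a constant. The instruction names come from the description above, but the `Builder` class, the `lower_*` helpers, and the operand lists are simplified assumptions and do not reflect the real HIR signatures.

```python
class Builder:
    """Toy instruction list; emit() returns the index of the new value."""

    def __init__(self):
        self.insns = []

    def emit(self, op, *args):
        self.insns.append((op, args))
        return len(self.insns) - 1


def lower_invokesuper(b):
    # Load the LEP once and pass the same value to both users.
    lep = b.emit("GetLEP")
    handler = b.emit("GetBlockHandler", lep)
    b.emit("GuardSuperMethodEntry", lep)  # real operands are assumed away here
    return handler


def lower_block_given(b, in_method):
    # Outside of a method there is no block to check, so no LEP load is needed.
    if not in_method:
        return b.emit("Const", False)
    lep = b.emit("GetLEP")
    return b.emit("IsBlockGiven", lep)
```

In this sketch, `lower_invokesuper` emits exactly one `GetLEP`, and `lower_block_given(b, in_method=False)` emits no `GetLEP` at all.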
…tructions (#15915)

Do a sort of "partial static single information (SSI)" form that learns types of operands from branch instructions. A `branchif`, for example, tells us that in the truthy path, we know the operand is not nil and not false. Similarly, in the falsy path, we know the operand is either nil or false. Add a `RefineType` instruction to attach this information.

This PR does this in SSA construction because it's pretty straightforward, but we can also do a more aggressive version of this that can learn information about e.g. int ranges from other checks later in the optimization pipeline.
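The refinement rule for a `branchif` can be illustrated with plain sets. This is a minimal sketch that models a type as a set of possible class names; it is not ZJIT's type lattice, and `refine_on_branchif` is a hypothetical helper.

```python
# The two values Ruby treats as falsy.
FALSY = frozenset({"nil", "false"})


def refine_on_branchif(operand_type):
    """Split a set of possible types into (truthy_path, falsy_path) views."""
    t = frozenset(operand_type)
    truthy = t - FALSY  # on the taken branch: neither nil nor false
    falsy = t & FALSY   # on the fall-through: only nil or false remain
    return truthy, falsy


# An operand that may be nil, an Integer, or a String:
truthy, falsy = refine_on_branchif({"nil", "Integer", "String"})
```

Here `truthy` contains only `Integer` and `String`, and `falsy` contains only `nil`. A `RefineType`-style instruction would attach each narrowed set to the operand in the corresponding successor block.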
Closes: Shopify#863 Compile `getblockparam` insn to `GetBlockParam` HIR so that we can handle it in ZJIT. ## Benchmark ### lobsters <details> <summary>before patch</summary> ``` Average of last 10, non-warmup iters: 778ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (58.4% of total 16,091,748): Hash#fetch: 3,237,974 (20.1%) Regexp#match?: 708,838 ( 4.4%) Hash#key?: 702,565 ( 4.4%) String#sub!: 489,843 ( 3.0%) Set#include?: 402,395 ( 2.5%) String#<<: 396,364 ( 2.5%) String#start_with?: 379,338 ( 2.4%) Hash#delete: 331,679 ( 2.1%) String.new: 308,268 ( 1.9%) Integer#===: 279,074 ( 1.7%) Symbol#end_with?: 255,538 ( 1.6%) Kernel#is_a?: 250,000 ( 1.6%) Process.clock_gettime: 221,598 ( 1.4%) Integer#>: 219,718 ( 1.4%) String#match?: 218,057 ( 1.4%) String#downcase: 213,127 ( 1.3%) Integer#<=: 202,617 ( 1.3%) Time#to_i: 195,248 ( 1.2%) Time#subsec: 192,277 ( 1.2%) Time#utc?: 188,500 ( 1.2%) Top-20 calls to C functions from JIT code (83.4% of total 126,501,142): rb_vm_opt_send_without_block: 35,338,443 (27.9%) rb_vm_send: 10,126,272 ( 8.0%) rb_hash_aref: 9,221,146 ( 7.3%) rb_vm_env_write: 8,615,394 ( 6.8%) rb_zjit_writebarrier_check_immediate: 7,666,061 ( 6.1%) rb_vm_getinstancevariable: 5,902,473 ( 4.7%) rb_ivar_get_at_no_ractor_check: 4,775,750 ( 3.8%) rb_obj_is_kind_of: 3,718,303 ( 2.9%) rb_vm_invokesuper: 2,705,394 ( 2.1%) rb_hash_aset: 2,422,892 ( 1.9%) rb_vm_setinstancevariable: 2,385,262 ( 1.9%) rb_vm_opt_getconstant_path: 2,321,875 ( 1.8%) Hash#fetch: 1,819,675 ( 1.4%) fetch: 1,418,299 ( 1.1%) rb_vm_invokeblock: 1,387,466 ( 1.1%) rb_str_buf_append: 1,378,634 ( 1.1%) rb_ec_ary_new_from_values: 1,338,599 ( 1.1%) rb_class_allocate_instance: 1,300,827 ( 1.0%) rb_hash_new_with_size: 906,352 ( 0.7%) rb_vm_sendforward: 799,626 ( 0.6%) Top-2 not optimized method types for send (100.0% of total 5,166,211): iseq: 5,163,389 (99.9%) null: 2,822 ( 0.1%) Top-3 not optimized method types for send_without_block (100.0% of total 526,119): 
optimized_send: 479,643 (91.2%) null: 42,176 ( 8.0%) optimized_block_call: 4,300 ( 0.8%) Top-3 not optimized method types for super (100.0% of total 2,365,999): cfunc: 2,251,438 (95.2%) alias: 111,257 ( 4.7%) attrset: 3,304 ( 0.1%) Top-3 instructions with uncategorized fallback reason (100.0% of total 2,214,821): invokeblock: 1,387,466 (62.6%) sendforward: 799,626 (36.1%) opt_send_without_block: 27,729 ( 1.3%) Top-20 send fallback reasons (100.0% of total 50,357,201): send_without_block_polymorphic: 18,307,466 (36.4%) singleton_class_seen: 9,310,336 (18.5%) send_not_optimized_method_type: 5,166,211 (10.3%) send_without_block_no_profiles: 4,756,165 ( 9.4%) one_or_more_complex_arg_pass: 2,906,412 ( 5.8%) send_no_profiles: 2,864,323 ( 5.7%) super_not_optimized_method_type: 2,365,999 ( 4.7%) uncategorized: 2,214,821 ( 4.4%) send_without_block_megamorphic: 581,552 ( 1.2%) send_without_block_not_optimized_method_type_optimized: 483,943 ( 1.0%) send_without_block_not_optimized_need_permission: 390,364 ( 0.8%) send_polymorphic: 329,064 ( 0.7%) too_many_args_for_lir: 173,570 ( 0.3%) super_target_complex_args_pass: 131,841 ( 0.3%) super_complex_args_pass: 111,056 ( 0.2%) super_polymorphic: 86,986 ( 0.2%) argc_param_mismatch: 48,546 ( 0.1%) send_without_block_not_optimized_method_type: 42,176 ( 0.1%) send_without_block_direct_keyword_mismatch: 37,484 ( 0.1%) obj_to_string_not_string: 34,865 ( 0.1%) Top-4 setivar fallback reasons (100.0% of total 2,385,262): not_monomorphic: 2,162,525 (90.7%) not_t_object: 125,178 ( 5.2%) too_complex: 97,538 ( 4.1%) new_shape_needs_extension: 21 ( 0.0%) Top-2 getivar fallback reasons (100.0% of total 6,027,586): not_monomorphic: 5,776,418 (95.8%) too_complex: 251,168 ( 4.2%) Top-3 definedivar fallback reasons (100.0% of total 406,027): not_monomorphic: 397,876 (98.0%) too_complex: 5,122 ( 1.3%) not_t_object: 3,029 ( 0.7%) Top-6 invokeblock handler (100.0% of total 1,387,466): monomorphic_iseq: 700,051 (50.5%) polymorphic: 513,455 (37.0%) 
monomorphic_other: 106,268 ( 7.7%) monomorphic_ifunc: 55,505 ( 4.0%) megamorphic: 6,762 ( 0.5%) no_profiles: 5,425 ( 0.4%) Top-9 popular complex argument-parameter features not optimized (100.0% of total 3,353,961): param_kw_opt: 1,408,663 (42.0%) param_forwardable: 697,209 (20.8%) param_block: 632,488 (18.9%) param_rest: 346,363 (10.3%) param_kwrest: 139,856 ( 4.2%) caller_kw_splat: 79,861 ( 2.4%) caller_splat: 43,585 ( 1.3%) caller_blockarg: 5,826 ( 0.2%) caller_kwarg: 110 ( 0.0%) Top-1 compile error reasons (100.0% of total 188,362): exception_handler: 188,362 (100.0%) Top-7 unhandled YARV insns (100.0% of total 184,408): getblockparam: 95,129 (51.6%) invokesuperforward: 81,668 (44.3%) getconstant: 3,318 ( 1.8%) setblockparam: 2,837 ( 1.5%) checkmatch: 929 ( 0.5%) expandarray: 360 ( 0.2%) once: 167 ( 0.1%) Top-3 unhandled HIR insns (100.0% of total 237,876): throw: 199,380 (83.8%) invokebuiltin: 35,775 (15.0%) array_max: 2,721 ( 1.1%) Top-20 side exit reasons (100.0% of total 15,592,861): guard_type_failure: 6,993,070 (44.8%) guard_shape_failure: 6,862,785 (44.0%) block_param_proxy_not_iseq_or_ifunc: 1,006,781 ( 6.5%) unhandled_hir_insn: 237,876 ( 1.5%) compile_error: 188,362 ( 1.2%) unhandled_yarv_insn: 184,408 ( 1.2%) block_param_proxy_modified: 29,130 ( 0.2%) patchpoint_stable_constant_names: 22,145 ( 0.1%) unhandled_newarray_send_pack: 14,481 ( 0.1%) unhandled_block_arg: 13,788 ( 0.1%) fixnum_mult_overflow: 10,866 ( 0.1%) fixnum_lshift_overflow: 10,085 ( 0.1%) patchpoint_no_ep_escape: 7,815 ( 0.1%) expandarray_failure: 4,533 ( 0.0%) guard_super_method_entry: 4,475 ( 0.0%) patchpoint_method_redefined: 1,212 ( 0.0%) patchpoint_no_singleton_class: 423 ( 0.0%) obj_to_string_fallback: 330 ( 0.0%) guard_less_failure: 163 ( 0.0%) interrupt: 114 ( 0.0%) send_count: 152,442,683 dynamic_send_count: 50,357,201 (33.0%) optimized_send_count: 102,085,482 (67.0%) dynamic_setivar_count: 2,385,262 ( 1.6%) dynamic_getivar_count: 6,027,586 ( 4.0%) dynamic_definedivar_count: 
406,027 ( 0.3%) iseq_optimized_send_count: 39,671,621 (26.0%) inline_cfunc_optimized_send_count: 42,053,762 (27.6%) inline_iseq_optimized_send_count: 3,462,562 ( 2.3%) non_variadic_cfunc_optimized_send_count: 9,195,248 ( 6.0%) variadic_cfunc_optimized_send_count: 7,702,289 ( 5.1%) compiled_iseq_count: 5,552 failed_iseq_count: 0 compile_time: 1,926ms profile_time: 20ms gc_time: 27ms invalidation_time: 531ms vm_write_pc_count: 132,750,117 vm_write_sp_count: 132,750,117 vm_write_locals_count: 128,780,465 vm_write_stack_count: 128,780,465 vm_write_to_parent_iseq_local_count: 694,799 vm_read_from_parent_iseq_local_count: 14,812,747 guard_type_count: 159,813,452 guard_type_exit_ratio: 4.4% guard_shape_count: 0 code_region_bytes: 29,425,664 zjit_alloc_bytes: 44,592,776 total_mem_bytes: 74,018,440 side_exit_count: 15,592,861 total_insn_count: 938,453,078 vm_insn_count: 167,693,539 zjit_insn_count: 770,759,539 ratio_in_zjit: 82.1% ``` </details> <details> <summary>after patch</summary> ``` Average of last 10, non-warmup iters: 725ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (58.2% of total 16,004,664): Hash#fetch: 3,185,115 (19.9%) Regexp#match?: 708,806 ( 4.4%) Hash#key?: 702,551 ( 4.4%) String#sub!: 489,841 ( 3.1%) Set#include?: 396,625 ( 2.5%) String#<<: 396,279 ( 2.5%) String#start_with?: 379,337 ( 2.4%) Hash#delete: 331,667 ( 2.1%) String.new: 307,248 ( 1.9%) Integer#===: 279,054 ( 1.7%) Symbol#end_with?: 255,538 ( 1.6%) Kernel#is_a?: 246,961 ( 1.5%) Process.clock_gettime: 221,588 ( 1.4%) Integer#>: 219,718 ( 1.4%) String#match?: 218,059 ( 1.4%) String#downcase: 213,109 ( 1.3%) Integer#<=: 202,617 ( 1.3%) Time#to_i: 192,211 ( 1.2%) Time#subsec: 189,240 ( 1.2%) String#to_sym: 185,947 ( 1.2%) Top-20 calls to C functions from JIT code (83.4% of total 126,772,007): rb_vm_opt_send_without_block: 35,829,863 (28.3%) rb_vm_send: 10,108,894 ( 8.0%) rb_hash_aref: 9,009,231 ( 7.1%) rb_vm_env_write: 8,571,665 ( 6.8%) 
rb_zjit_writebarrier_check_immediate: 7,702,599 ( 6.1%) rb_vm_getinstancevariable: 5,930,325 ( 4.7%) rb_ivar_get_at_no_ractor_check: 4,764,439 ( 3.8%) rb_obj_is_kind_of: 3,722,865 ( 2.9%) rb_vm_invokesuper: 2,687,484 ( 2.1%) rb_hash_aset: 2,421,186 ( 1.9%) rb_vm_setinstancevariable: 2,355,461 ( 1.9%) rb_vm_opt_getconstant_path: 2,295,528 ( 1.8%) Hash#fetch: 1,779,524 ( 1.4%) fetch: 1,405,591 ( 1.1%) rb_vm_invokeblock: 1,385,989 ( 1.1%) rb_str_buf_append: 1,369,177 ( 1.1%) rb_ec_ary_new_from_values: 1,337,865 ( 1.1%) rb_class_allocate_instance: 1,295,755 ( 1.0%) rb_hash_new_with_size: 902,684 ( 0.7%) rb_vm_sendforward: 798,572 ( 0.6%) Top-2 not optimized method types for send (100.0% of total 4,902,716): iseq: 4,899,894 (99.9%) null: 2,822 ( 0.1%) Top-3 not optimized method types for send_without_block (100.0% of total 526,064): optimized_send: 479,589 (91.2%) null: 42,176 ( 8.0%) optimized_block_call: 4,299 ( 0.8%) Top-3 not optimized method types for super (100.0% of total 2,350,245): cfunc: 2,239,567 (95.3%) alias: 107,374 ( 4.6%) attrset: 3,304 ( 0.1%) Top-3 instructions with uncategorized fallback reason (100.0% of total 2,216,683): invokeblock: 1,385,989 (62.5%) sendforward: 798,572 (36.0%) opt_send_without_block: 32,122 ( 1.4%) Top-20 send fallback reasons (99.9% of total 50,810,802): send_without_block_polymorphic: 18,668,686 (36.7%) singleton_class_seen: 9,323,039 (18.3%) send_not_optimized_method_type: 4,902,716 ( 9.6%) send_without_block_no_profiles: 4,824,297 ( 9.5%) send_no_profiles: 2,853,944 ( 5.6%) one_or_more_complex_arg_pass: 2,829,717 ( 5.6%) super_not_optimized_method_type: 2,350,245 ( 4.6%) uncategorized: 2,216,683 ( 4.4%) send_without_block_megamorphic: 723,037 ( 1.4%) send_polymorphic: 544,026 ( 1.1%) send_without_block_not_optimized_method_type_optimized: 483,888 ( 1.0%) send_without_block_not_optimized_need_permission: 390,364 ( 0.8%) too_many_args_for_lir: 172,809 ( 0.3%) super_target_complex_args_pass: 128,824 ( 0.3%) 
super_complex_args_pass: 111,053 ( 0.2%) super_polymorphic: 87,851 ( 0.2%) argc_param_mismatch: 50,382 ( 0.1%) send_without_block_not_optimized_method_type: 42,176 ( 0.1%) obj_to_string_not_string: 34,861 ( 0.1%) send_without_block_direct_keyword_mismatch: 32,436 ( 0.1%) Top-4 setivar fallback reasons (100.0% of total 2,355,461): not_monomorphic: 2,132,746 (90.5%) not_t_object: 125,163 ( 5.3%) too_complex: 97,531 ( 4.1%) new_shape_needs_extension: 21 ( 0.0%) Top-2 getivar fallback reasons (100.0% of total 6,055,438): not_monomorphic: 5,806,179 (95.9%) too_complex: 249,259 ( 4.1%) Top-3 definedivar fallback reasons (100.0% of total 405,302): not_monomorphic: 397,150 (98.0%) too_complex: 5,122 ( 1.3%) not_t_object: 3,030 ( 0.7%) Top-6 invokeblock handler (100.0% of total 1,385,989): monomorphic_iseq: 688,167 (49.7%) polymorphic: 523,864 (37.8%) monomorphic_other: 106,268 ( 7.7%) monomorphic_ifunc: 55,505 ( 4.0%) megamorphic: 6,761 ( 0.5%) no_profiles: 5,424 ( 0.4%) Top-9 popular complex argument-parameter features not optimized (100.0% of total 3,234,958): param_kw_opt: 1,381,881 (42.7%) param_forwardable: 685,939 (21.2%) param_block: 640,948 (19.8%) param_rest: 327,046 (10.1%) param_kwrest: 120,209 ( 3.7%) caller_kw_splat: 38,970 ( 1.2%) caller_splat: 34,029 ( 1.1%) caller_blockarg: 5,826 ( 0.2%) caller_kwarg: 110 ( 0.0%) Top-1 compile error reasons (100.0% of total 187,347): exception_handler: 187,347 (100.0%) Top-6 unhandled YARV insns (100.0% of total 89,278): invokesuperforward: 81,667 (91.5%) getconstant: 3,318 ( 3.7%) setblockparam: 2,837 ( 3.2%) checkmatch: 929 ( 1.0%) expandarray: 360 ( 0.4%) once: 167 ( 0.2%) Top-3 unhandled HIR insns (100.0% of total 236,977): throw: 198,481 (83.8%) invokebuiltin: 35,775 (15.1%) array_max: 2,721 ( 1.1%) Top-20 side exit reasons (100.0% of total 15,458,443): guard_type_failure: 6,918,397 (44.8%) guard_shape_failure: 6,859,686 (44.4%) block_param_proxy_not_iseq_or_ifunc: 1,008,346 ( 6.5%) unhandled_hir_insn: 236,977 ( 1.5%) 
compile_error: 187,347 ( 1.2%) unhandled_yarv_insn: 89,278 ( 0.6%) fixnum_mult_overflow: 50,739 ( 0.3%) block_param_proxy_modified: 28,119 ( 0.2%) patchpoint_stable_constant_names: 22,145 ( 0.1%) unhandled_newarray_send_pack: 14,481 ( 0.1%) unhandled_block_arg: 13,787 ( 0.1%) fixnum_lshift_overflow: 10,085 ( 0.1%) patchpoint_no_ep_escape: 7,815 ( 0.1%) expandarray_failure: 4,533 ( 0.0%) guard_super_method_entry: 4,475 ( 0.0%) patchpoint_method_redefined: 1,212 ( 0.0%) patchpoint_no_singleton_class: 423 ( 0.0%) obj_to_string_fallback: 330 ( 0.0%) guard_less_failure: 163 ( 0.0%) interrupt: 86 ( 0.0%) send_count: 151,889,096 dynamic_send_count: 50,810,802 (33.5%) optimized_send_count: 101,078,294 (66.5%) dynamic_setivar_count: 2,355,461 ( 1.6%) dynamic_getivar_count: 6,055,438 ( 4.0%) dynamic_definedivar_count: 405,302 ( 0.3%) iseq_optimized_send_count: 39,470,508 (26.0%) inline_cfunc_optimized_send_count: 41,381,565 (27.2%) inline_iseq_optimized_send_count: 3,370,961 ( 2.2%) non_variadic_cfunc_optimized_send_count: 9,210,651 ( 6.1%) variadic_cfunc_optimized_send_count: 7,644,609 ( 5.0%) compiled_iseq_count: 5,552 failed_iseq_count: 0 compile_time: 1,809ms profile_time: 15ms gc_time: 21ms invalidation_time: 526ms vm_write_pc_count: 132,774,559 vm_write_sp_count: 132,774,559 vm_write_locals_count: 128,748,998 vm_write_stack_count: 128,748,998 vm_write_to_parent_iseq_local_count: 693,262 vm_read_from_parent_iseq_local_count: 14,737,431 guard_type_count: 158,811,089 guard_type_exit_ratio: 4.4% guard_shape_count: 0 code_region_bytes: 29,458,432 zjit_alloc_bytes: 44,650,569 total_mem_bytes: 74,109,001 side_exit_count: 15,458,443 total_insn_count: 934,491,306 vm_insn_count: 166,025,364 zjit_insn_count: 768,465,942 ratio_in_zjit: 82.2% ``` </details> ### rails-bench <details> <summary>before patch</summary> ``` Average of last 10, non-warmup iters: 1254ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (52.8% of total 39,182,033): Hash#key?: 
3,141,634 ( 8.0%) Regexp#match?: 2,420,227 ( 6.2%) Hash#fetch: 2,245,557 ( 5.7%) Array#any?: 1,157,418 ( 3.0%) Hash#delete: 1,114,346 ( 2.8%) Integer#===: 1,098,163 ( 2.8%) String.new: 1,004,713 ( 2.6%) MatchData#[]: 831,442 ( 2.1%) String#b: 797,913 ( 2.0%) String#to_sym: 680,943 ( 1.7%) Kernel#dup: 680,022 ( 1.7%) Array#all?: 650,132 ( 1.7%) Fiber.current: 649,003 ( 1.7%) Array#join: 641,038 ( 1.6%) Array#include?: 613,837 ( 1.6%) Kernel#Array: 610,311 ( 1.6%) String#<<: 606,240 ( 1.5%) Symbol#end_with?: 598,807 ( 1.5%) String#force_encoding: 593,535 ( 1.5%) Kernel#respond_to?: 550,441 ( 1.4%) Top-20 calls to C functions from JIT code (75.2% of total 260,204,372): rb_vm_opt_send_without_block: 52,620,850 (20.2%) rb_hash_aref: 22,920,184 ( 8.8%) rb_vm_env_write: 19,484,445 ( 7.5%) rb_vm_send: 16,570,926 ( 6.4%) rb_zjit_writebarrier_check_immediate: 13,628,686 ( 5.2%) rb_vm_getinstancevariable: 12,378,112 ( 4.8%) rb_ivar_get_at_no_ractor_check: 12,208,856 ( 4.7%) rb_vm_invokesuper: 8,086,664 ( 3.1%) rb_hash_aset: 5,043,532 ( 1.9%) rb_obj_is_kind_of: 4,431,294 ( 1.7%) rb_vm_invokeblock: 4,036,483 ( 1.6%) Hash#key?: 3,141,634 ( 1.2%) rb_vm_opt_getconstant_path: 3,051,909 ( 1.2%) rb_class_allocate_instance: 2,878,743 ( 1.1%) rb_hash_new_with_size: 2,873,398 ( 1.1%) rb_ec_ary_new_from_values: 2,584,790 ( 1.0%) rb_str_concat_literals: 2,450,752 ( 0.9%) Regexp#match?: 2,420,227 ( 0.9%) rb_obj_alloc: 2,419,180 ( 0.9%) rb_vm_setinstancevariable: 2,357,067 ( 0.9%) Top-2 not optimized method types for send (100.0% of total 8,550,761): iseq: 8,518,290 (99.6%) optimized: 32,471 ( 0.4%) Top-2 not optimized method types for send_without_block (100.0% of total 790,792): optimized_send: 608,036 (76.9%) null: 182,756 (23.1%) Top-2 not optimized method types for super (100.0% of total 6,689,860): cfunc: 6,640,181 (99.3%) attrset: 49,679 ( 0.7%) Top-3 instructions with uncategorized fallback reason (100.0% of total 5,911,882): invokeblock: 4,036,483 (68.3%) sendforward: 1,871,601 
(31.7%) opt_send_without_block: 3,798 ( 0.1%) Top-20 send fallback reasons (100.0% of total 83,186,524): send_without_block_polymorphic: 33,814,235 (40.6%) send_not_optimized_method_type: 8,550,761 (10.3%) send_without_block_no_profiles: 8,405,471 (10.1%) super_not_optimized_method_type: 6,689,860 ( 8.0%) uncategorized: 5,911,882 ( 7.1%) one_or_more_complex_arg_pass: 5,502,146 ( 6.6%) send_no_profiles: 4,700,820 ( 5.7%) send_polymorphic: 3,318,564 ( 4.0%) send_without_block_not_optimized_need_permission: 1,274,177 ( 1.5%) singleton_class_seen: 1,101,973 ( 1.3%) too_many_args_for_lir: 905,412 ( 1.1%) super_complex_args_pass: 829,842 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 608,036 ( 0.7%) send_without_block_megamorphic: 565,874 ( 0.7%) super_target_complex_args_pass: 414,600 ( 0.5%) send_without_block_not_optimized_method_type: 182,756 ( 0.2%) obj_to_string_not_string: 158,141 ( 0.2%) super_call_with_block: 100,004 ( 0.1%) send_without_block_direct_keyword_mismatch: 99,588 ( 0.1%) super_polymorphic: 52,358 ( 0.1%) Top-2 setivar fallback reasons (100.0% of total 2,357,067): not_monomorphic: 2,255,283 (95.7%) not_t_object: 101,784 ( 4.3%) Top-1 getivar fallback reasons (100.0% of total 12,378,137): not_monomorphic: 12,378,137 (100.0%) Top-2 definedivar fallback reasons (100.0% of total 350,548): not_monomorphic: 350,461 (100.0%) not_t_object: 87 ( 0.0%) Top-6 invokeblock handler (100.0% of total 4,036,483): monomorphic_iseq: 2,189,057 (54.2%) polymorphic: 1,207,002 (29.9%) monomorphic_other: 334,248 ( 8.3%) monomorphic_ifunc: 221,225 ( 5.5%) megamorphic: 84,439 ( 2.1%) no_profiles: 512 ( 0.0%) Top-9 popular complex argument-parameter features not optimized (100.0% of total 7,096,505): param_kw_opt: 1,834,705 (25.9%) param_forwardable: 1,824,953 (25.7%) param_block: 1,792,214 (25.3%) param_rest: 861,894 (12.1%) caller_kw_splat: 297,937 ( 4.2%) caller_splat: 283,669 ( 4.0%) param_kwrest: 200,208 ( 2.8%) caller_blockarg: 752 ( 0.0%) caller_kwarg: 
173 ( 0.0%) Top-1 compile error reasons (100.0% of total 391,562): exception_handler: 391,562 (100.0%) Top-7 unhandled YARV insns (100.0% of total 1,899,393): getblockparam: 898,862 (47.3%) invokesuperforward: 498,993 (26.3%) getconstant: 400,945 (21.1%) expandarray: 49,985 ( 2.6%) setblockparam: 49,972 ( 2.6%) checkmatch: 480 ( 0.0%) once: 156 ( 0.0%) Top-2 unhandled HIR insns (100.0% of total 268,151): throw: 232,560 (86.7%) invokebuiltin: 35,591 (13.3%) Top-19 side exit reasons (100.0% of total 9,609,677): guard_shape_failure: 2,498,160 (26.0%) block_param_proxy_not_iseq_or_ifunc: 1,988,408 (20.7%) unhandled_yarv_insn: 1,899,393 (19.8%) guard_type_failure: 1,722,167 (17.9%) compile_error: 391,562 ( 4.1%) unhandled_newarray_send_pack: 298,017 ( 3.1%) unhandled_hir_insn: 268,151 ( 2.8%) patchpoint_method_redefined: 200,632 ( 2.1%) unhandled_block_arg: 151,295 ( 1.6%) block_param_proxy_modified: 124,245 ( 1.3%) guard_less_failure: 50,126 ( 0.5%) fixnum_lshift_overflow: 9,985 ( 0.1%) patchpoint_stable_constant_names: 6,366 ( 0.1%) fixnum_mult_overflow: 570 ( 0.0%) obj_to_string_fallback: 429 ( 0.0%) patchpoint_no_ep_escape: 109 ( 0.0%) interrupt: 48 ( 0.0%) guard_super_method_entry: 8 ( 0.0%) guard_greater_eq_failure: 6 ( 0.0%) send_count: 328,547,991 dynamic_send_count: 83,186,524 (25.3%) optimized_send_count: 245,361,467 (74.7%) dynamic_setivar_count: 2,357,067 ( 0.7%) dynamic_getivar_count: 12,378,137 ( 3.8%) dynamic_definedivar_count: 350,548 ( 0.1%) iseq_optimized_send_count: 93,424,465 (28.4%) inline_cfunc_optimized_send_count: 98,338,280 (29.9%) inline_iseq_optimized_send_count: 9,338,763 ( 2.8%) non_variadic_cfunc_optimized_send_count: 26,452,910 ( 8.1%) variadic_cfunc_optimized_send_count: 17,807,049 ( 5.4%) compiled_iseq_count: 2,887 failed_iseq_count: 0 compile_time: 877ms profile_time: 32ms gc_time: 11ms invalidation_time: 15ms vm_write_pc_count: 284,341,923 vm_write_sp_count: 284,341,923 vm_write_locals_count: 272,137,494 vm_write_stack_count: 
272,137,494 vm_write_to_parent_iseq_local_count: 1,079,867 vm_read_from_parent_iseq_local_count: 30,816,135 guard_type_count: 313,667,907 guard_type_exit_ratio: 0.5% guard_shape_count: 0 code_region_bytes: 14,417,920 zjit_alloc_bytes: 19,075,183 total_mem_bytes: 33,493,103 side_exit_count: 9,609,677 total_insn_count: 1,706,360,231 vm_insn_count: 124,793,155 zjit_insn_count: 1,581,567,076 ratio_in_zjit: 92.7% ``` </details> <details> <summary>after patch</summary> ``` Average of last 10, non-warmup iters: 1136ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (52.8% of total 39,182,033): Hash#key?: 3,141,634 ( 8.0%) Regexp#match?: 2,420,227 ( 6.2%) Hash#fetch: 2,245,557 ( 5.7%) Array#any?: 1,157,418 ( 3.0%) Hash#delete: 1,114,346 ( 2.8%) Integer#===: 1,098,163 ( 2.8%) String.new: 1,004,713 ( 2.6%) MatchData#[]: 831,442 ( 2.1%) String#b: 797,913 ( 2.0%) String#to_sym: 680,943 ( 1.7%) Kernel#dup: 680,022 ( 1.7%) Array#all?: 650,132 ( 1.7%) Fiber.current: 649,003 ( 1.7%) Array#join: 641,038 ( 1.6%) Array#include?: 613,837 ( 1.6%) Kernel#Array: 610,311 ( 1.6%) String#<<: 606,240 ( 1.5%) Symbol#end_with?: 598,807 ( 1.5%) String#force_encoding: 593,535 ( 1.5%) Kernel#respond_to?: 550,441 ( 1.4%) Top-20 calls to C functions from JIT code (74.8% of total 261,805,313): rb_vm_opt_send_without_block: 52,621,173 (20.1%) rb_hash_aref: 22,920,184 ( 8.8%) rb_vm_env_write: 19,484,925 ( 7.4%) rb_vm_send: 16,571,020 ( 6.3%) rb_zjit_writebarrier_check_immediate: 13,780,332 ( 5.3%) rb_vm_getinstancevariable: 12,378,114 ( 4.7%) rb_ivar_get_at_no_ractor_check: 12,208,856 ( 4.7%) rb_vm_invokesuper: 8,086,666 ( 3.1%) rb_hash_aset: 5,043,537 ( 1.9%) rb_obj_is_kind_of: 4,431,299 ( 1.7%) rb_vm_invokeblock: 4,036,481 ( 1.5%) Hash#key?: 3,141,634 ( 1.2%) rb_vm_opt_getconstant_path: 3,051,909 ( 1.2%) rb_class_allocate_instance: 2,878,746 ( 1.1%) rb_hash_new_with_size: 2,873,398 ( 1.1%) rb_ec_ary_new_from_values: 2,585,224 ( 1.0%) rb_str_concat_literals: 2,450,752 ( 
0.9%) Regexp#match?: 2,420,227 ( 0.9%) rb_obj_alloc: 2,419,182 ( 0.9%) rb_vm_setinstancevariable: 2,357,067 ( 0.9%) Top-2 not optimized method types for send (100.0% of total 8,550,761): iseq: 8,518,290 (99.6%) optimized: 32,471 ( 0.4%) Top-2 not optimized method types for send_without_block (100.0% of total 790,792): optimized_send: 608,036 (76.9%) null: 182,756 (23.1%) Top-2 not optimized method types for super (100.0% of total 6,689,860): cfunc: 6,640,181 (99.3%) attrset: 49,679 ( 0.7%) Top-3 instructions with uncategorized fallback reason (100.0% of total 5,911,883): invokeblock: 4,036,481 (68.3%) sendforward: 1,871,601 (31.7%) opt_send_without_block: 3,801 ( 0.1%) Top-20 send fallback reasons (100.0% of total 83,186,941): send_without_block_polymorphic: 33,814,528 (40.6%) send_not_optimized_method_type: 8,550,761 (10.3%) send_without_block_no_profiles: 8,405,497 (10.1%) super_not_optimized_method_type: 6,689,860 ( 8.0%) uncategorized: 5,911,883 ( 7.1%) one_or_more_complex_arg_pass: 5,502,147 ( 6.6%) send_no_profiles: 4,700,820 ( 5.7%) send_polymorphic: 3,318,658 ( 4.0%) send_without_block_not_optimized_need_permission: 1,274,177 ( 1.5%) singleton_class_seen: 1,101,973 ( 1.3%) too_many_args_for_lir: 905,412 ( 1.1%) super_complex_args_pass: 829,842 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 608,036 ( 0.7%) send_without_block_megamorphic: 565,874 ( 0.7%) super_target_complex_args_pass: 414,600 ( 0.5%) send_without_block_not_optimized_method_type: 182,756 ( 0.2%) obj_to_string_not_string: 158,141 ( 0.2%) super_call_with_block: 100,004 ( 0.1%) send_without_block_direct_keyword_mismatch: 99,588 ( 0.1%) super_polymorphic: 52,360 ( 0.1%) Top-2 setivar fallback reasons (100.0% of total 2,357,067): not_monomorphic: 2,255,283 (95.7%) not_t_object: 101,784 ( 4.3%) Top-1 getivar fallback reasons (100.0% of total 12,378,139): not_monomorphic: 12,378,139 (100.0%) Top-2 definedivar fallback reasons (100.0% of total 350,548): not_monomorphic: 350,461 
(100.0%) not_t_object: 87 ( 0.0%) Top-6 invokeblock handler (100.0% of total 4,036,481): monomorphic_iseq: 2,189,057 (54.2%) polymorphic: 1,207,002 (29.9%) monomorphic_other: 334,248 ( 8.3%) monomorphic_ifunc: 221,223 ( 5.5%) megamorphic: 84,439 ( 2.1%) no_profiles: 512 ( 0.0%) Top-9 popular complex argument-parameter features not optimized (100.0% of total 7,096,506): param_kw_opt: 1,834,706 (25.9%) param_forwardable: 1,824,953 (25.7%) param_block: 1,792,214 (25.3%) param_rest: 861,894 (12.1%) caller_kw_splat: 297,937 ( 4.2%) caller_splat: 283,669 ( 4.0%) param_kwrest: 200,208 ( 2.8%) caller_blockarg: 752 ( 0.0%) caller_kwarg: 173 ( 0.0%) Top-1 compile error reasons (100.0% of total 391,562): exception_handler: 391,562 (100.0%) Top-6 unhandled YARV insns (100.0% of total 1,000,531): invokesuperforward: 498,993 (49.9%) getconstant: 400,945 (40.1%) expandarray: 49,985 ( 5.0%) setblockparam: 49,972 ( 5.0%) checkmatch: 480 ( 0.0%) once: 156 ( 0.0%) Top-2 unhandled HIR insns (100.0% of total 268,154): throw: 232,560 (86.7%) invokebuiltin: 35,594 (13.3%) Top-19 side exit reasons (100.0% of total 8,710,811): guard_shape_failure: 2,498,161 (28.7%) block_param_proxy_not_iseq_or_ifunc: 1,988,408 (22.8%) guard_type_failure: 1,722,168 (19.8%) unhandled_yarv_insn: 1,000,531 (11.5%) compile_error: 391,562 ( 4.5%) unhandled_newarray_send_pack: 298,017 ( 3.4%) unhandled_hir_insn: 268,154 ( 3.1%) patchpoint_method_redefined: 200,632 ( 2.3%) unhandled_block_arg: 151,295 ( 1.7%) block_param_proxy_modified: 124,245 ( 1.4%) guard_less_failure: 50,126 ( 0.6%) fixnum_lshift_overflow: 9,985 ( 0.1%) patchpoint_stable_constant_names: 6,366 ( 0.1%) fixnum_mult_overflow: 570 ( 0.0%) obj_to_string_fallback: 429 ( 0.0%) patchpoint_no_ep_escape: 109 ( 0.0%) interrupt: 39 ( 0.0%) guard_super_method_entry: 8 ( 0.0%) guard_greater_eq_failure: 6 ( 0.0%) send_count: 328,747,903 dynamic_send_count: 83,186,941 (25.3%) optimized_send_count: 245,560,962 (74.7%) dynamic_setivar_count: 2,357,067 ( 0.7%) 
dynamic_getivar_count: 12,378,139 ( 3.8%) dynamic_definedivar_count: 350,548 ( 0.1%) iseq_optimized_send_count: 93,623,831 (28.5%) inline_cfunc_optimized_send_count: 98,338,311 (29.9%) inline_iseq_optimized_send_count: 9,338,766 ( 2.8%) non_variadic_cfunc_optimized_send_count: 26,453,005 ( 8.0%) variadic_cfunc_optimized_send_count: 17,807,049 ( 5.4%) compiled_iseq_count: 2,888 failed_iseq_count: 0 compile_time: 858ms profile_time: 29ms gc_time: 59ms invalidation_time: 15ms vm_write_pc_count: 285,990,091 vm_write_sp_count: 285,990,091 vm_write_locals_count: 272,886,376 vm_write_stack_count: 272,886,376 vm_write_to_parent_iseq_local_count: 1,079,877 vm_read_from_parent_iseq_local_count: 30,816,135 guard_type_count: 314,169,071 guard_type_exit_ratio: 0.5% guard_shape_count: 0 code_region_bytes: 14,401,536 zjit_alloc_bytes: 19,128,598 total_mem_bytes: 33,530,134 side_exit_count: 8,710,811 total_insn_count: 1,705,461,649 vm_insn_count: 121,244,824 zjit_insn_count: 1,584,216,825 ratio_in_zjit: 92.9% ``` </details>
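To make the before/after numbers easier to compare, the arithmetic below computes the relative change in average iteration time and in `unhandled_yarv_insn` side exits, using only figures quoted in the statistics above. The helper `pct_drop` is introduced here for illustration. These are single reported averages, so run-to-run noise is not accounted for.

```python
def pct_drop(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100.0


# Average of last 10 non-warmup iterations, in milliseconds.
lobsters_time = pct_drop(778, 725)
rails_time = pct_drop(1254, 1136)

# unhandled_yarv_insn side exits, before and after the patch.
lobsters_exits = pct_drop(184_408, 89_278)
rails_exits = pct_drop(1_899_393, 1_000_531)

# On rails-bench, the drop matches the old getblockparam count exactly.
rails_exit_delta = 1_899_393 - 1_000_531

print(f"lobsters:    {lobsters_time:.1f}% faster, {lobsters_exits:.1f}% fewer unhandled-insn exits")
print(f"rails-bench: {rails_time:.1f}% faster, {rails_exits:.1f}% fewer unhandled-insn exits")
```

On rails-bench, the reduction in `unhandled_yarv_insn` exits equals the 898,862 `getblockparam` exits reported before the patch, which is consistent with that instruction no longer causing side exits.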
…5928)

This is a follow-up to #15816. Since I was only optimizing `invokesuper` for monomorphic cases, I could track that with a boolean value (actually, an `Option` in this case). However, `TypeDistribution` is a better way to track this information and will put us on better footing if we end up handling polymorphic cases.
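To illustrate why a distribution is more flexible than a single optional value, here is a small counter-based profile. It is a hypothetical sketch, not ZJIT's actual `TypeDistribution`: the class name, the bucket limit, and the method names are assumptions, and the category labels are borrowed from the statistics output earlier in this PR.

```python
from collections import Counter


class TypeDistributionSketch:
    """Track which types were observed at a call site and how often."""

    def __init__(self, max_buckets=4):
        self.counts = Counter()
        self.max_buckets = max_buckets  # assumed limit, chosen arbitrarily
        self.overflow = 0

    def observe(self, type_name):
        # Count known types; count new ones only while buckets remain.
        if type_name in self.counts or len(self.counts) < self.max_buckets:
            self.counts[type_name] += 1
        else:
            self.overflow += 1

    def classify(self):
        if not self.counts:
            return "no_profiles"
        if self.overflow:
            return "megamorphic"
        return "monomorphic" if len(self.counts) == 1 else "polymorphic"


profile = TypeDistributionSketch()
for seen in ["Foo", "Foo", "Foo"]:
    profile.observe(seen)
```

A single optional value can answer only "was exactly one type seen?". The distribution answers that question too (`classify()` returns `"monomorphic"` here), and it keeps enough information to support polymorphic handling later.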
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )