Skip to content

Conversation

@iluuu1994
Copy link
Member

Instead of relying on the optimizer to retain FREE/FE_FREE opcodes, extend live-ranges to the end of the try block.

Just some experimentation to solve issues that arise with match blocks. This attempts to solve the issue described in #5448, which arises with the following scenario:

foo() + throw new Exception();
$_main:
     ; (lines=8, args=0, vars=0, tmps=4)
     ; (before optimizer)
     ; /home/ilutov/Developer/php-src/test.php:1-3
     ; return  [] RANGE[0..0]
0000 INIT_FCALL_BY_NAME 0 string("foo")
0001 V0 = DO_FCALL_BY_NAME
0002 V1 = NEW 0 string("Exception")
0003 DO_FCALL
0004 THROW V1
0005 T3 = ADD V0 bool(true)
0006 FREE T3
0007 RETURN int(1)
LIVE RANGES:
     0: 0002 - 0005 (tmp/var)
     1: 0003 - 0004 (new)

Because THROW will jump out of the current context, it needs to free all the temporary VARs (and TMPVARs) that would normally be consumed by the skipped opcodes. We calculate live-ranges for this purpose. A live-range defines the "time" (sequence of opcodes) in which the VAR holds a valid value which has to be freed if exited prematurely. Live-ranges are calculated by scanning the opcodes backwards to find the first (i.e. lowest) usage, and the first (i.e. lowest) definition of the VAR. In this case, V0 (the result of the foo() function call) lives from opcode 0002 to 0005 (exclusive). The THROW opcode at position 0004 will look at which values are live at its position, and won't be used after (i.e. don't outlive) the try block, if available. This will free V0 in this case.

Normally, DCE would eliminate everything after THROW since it is unreachable, removing ADD V0 bool(true) which is the only use of V0. This would break calculation of the live-range of this V0 with the current implementation. As such, the VAR would no longer be freed. This issue only arose with throw expressions because in statement context no VARs are live (with some exceptions live switch conditions, loop iterators, etc). The current solution (#5450) is to treat throw expressions as non-terminators when building the CFG. This will persist the unreachable opcodes that follow.

With match blocks, we will once again experience this issue, even for throw statements.

foo() + match (1) {
    1 {
        throw new Exception();
    }
}

With DCE, this is essentially equivalent to foo() + throw new Exception(). Our current solution does not work for this case however, because the throw statement is treated as a terminator. We might also disable splitting CFG nodes for all throw statements in match blocks, that might actually be the easiest solution.

I've tried a different approach in this PR. Instead of relying on the retention of the dead consuming opcodes, the live-range of unused VARs may be extended until the end of the current try block, if available. The idea is that the consuming opcode will only be eliminated if the current context is guaranteed to exit. A exception will always at least skip over the opcodes in the current try block. We also rely on the fact that a VAR can't begin outside of a try block and end inside of it, and vice versa. The upside is that we may get rid of some other hack that retain dead opcodes exclusively for the sake of live-range calculation.

There are a few problems with this approach too.

  1. It is incompatible with zend_optimize_temporary_variables(). zend_optimize_temporary_variables() will reuse the VAR of a definition with no use. This will eliminate the live-range for the first definition. This algorithm would need to be adjusted.
  2. This approach is incompatible with two (or more) consuming uses of a VAR in different branches.
      V1 = ...
      JMPZ true false
    true:
      echo V1
    false:
      // throw
      echo V1
    
    The THROW must not eliminate the use of V1 in the lower branch, because that will shorten its live-range to the first usage. I'm unaware of such a sequence of opcodes being emitted in PHP, but it's easy to imagine. To my understanding, all other cases can eliminate the used opcodes.
  3. The COALESCE opcode only conditionally initializes the result, in case the null branch is skipped. If the null branch contains a throw, we will optimize away the second definition of the VAR, elongating the live-range. However, in the null branch the VAR isn't actually live, leading to a use-of-uninitialized-value. We could initialize result in COALESCE to IS_UNDEF, but that will lead to a slight performance regression. I'm not yet sure what the best way is to solve this.

A similar problem exists for return, continue, break, goto, etc. in match blocks. These will need to emit FREE opcodes for VARs that are live. This is similar to #5448. That's a different issue to tackle.

Instead of relying on the optimizer to retain FREE/FE_FREE opcodes, extend
live-ranges to the end of the try block.
@iluuu1994
Copy link
Member Author

iluuu1994 commented Apr 1, 2025

When I talked to Bob, he wasn't particularly fond of this approach. I have no immediate plans to pursue this.

@iluuu1994 iluuu1994 closed this Apr 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant