-
Notifications
You must be signed in to change notification settings - Fork 20
Description
Background
The first requirement for an almost-matching-function to be promoted to an effective match is that the lists of instructions in orig and recomp have the same length. Call these lists A for orig and B for recomp. We also have a list of "opcodes" provided by difflib.SequenceMatcher that describe how to turn A into B.
With this information, relocate_instructions() from fixes.py looks for situations where an instruction deleted from A and inserted to some position in B does not alter the behavior of the function. This function takes a naive approach that is not without problems: #322
How it works now
- Read the list of inserted lines in
B. Discard special lines that are not instructions such as entries to jump tables or data tables (both used inswitchstatements). - Read deleted indices in
Aand find a lineXthat matches the one fromBexactly. Aset()is used here so ordering is not guaranteed. - Get the set of registers
Rthat are modified or used byX. - Find the deletion index and the insertion index in
A. To allow for swaps where the deleted line is before or after the inserted line, put these indices in orderi, j. (Recent bugfix: Fix: Did not check all intermediate instructions when relocating #323) - Check whether any instructions in
A[i : j]modify the registers inR. If not, this is an allowed swap. - Remove the deleted line from consideration for future swaps. Note that the starting list
Ais not modified, just the list of deletions.
Limitations of the current design
- Considers blocks of deleted or inserted lines as single lines without regard for context.
- Matches instruction lines exactly without considering stack pointer adjustment.
- When considering a potential swap, we read from a
setof deleted lines in no specific order. The line we get may from a completely irrelevant index. - Does not consider alterations to the
ebpregister forpushorpopinstructions. - Does not consider FPU registers when checking intermediate lines.
- Does not consider local variables or memory locations when checking intermediate lines.
- Does not consider
replacediff opcodes where a block of instructions inAis replaced with one fromB(not necessarily equal size) in the same position.
What we can do about it
I think the first step is to come up with a set of counterexamples that demonstrate the acknowledged limitations. We can add these as xfail tests before attempting to refactor this code.