Skip to content

Explicit CFG for ARM predicated instructions #51

@brk

Description

@brk

Should predicated instructions be explicitly represented with a basic block in the control flow graph?

This source for Euclid's GCD:

int target(int a, int b) {
  while (a != b) {
    if (a > b) {
      a = a - b;
    } else {
      b = b - a;
    }
  }
  return a;
}

becomes this -O2 assembly:

000104f0 <target>:
   104f0:	    e1500001 	cmp	r0, r1
   104f4:	    012fff1e 	bxeq	lr
   104f8:	/-> e0512000 	subs	r2, r1, r0
   104fc:	|   b1a02001 	movlt	r2, r1
   10500:	|   b0400001 	sublt	r0, r0, r1
   10504:	|   e1500002 	cmp	r0, r2
   10508:	|   e1a01002 	mov	r1, r2
   1050c:	\-- 1afffff9 	bne	104f8 <target+0x8>
   10510:	    e12fff1e 	bx	lr

which gets a base CFG with only three nodes: 104f0, 104f8 (which is a self-loop), and 10510. I would naively have expected the bxeq and the mov-sub pair to each get a basic block.

The lifted source I'd like to aim for, eventually, would be (modulo variable names):

int c;
if (a == b) return a;
do {
  c = b - a;
  if (c < 0) {
    c = b;
    a = a - b;
  }
  b = c;
} while (a == c);
return a;

There are, of course, several open questions about how we might represent the provenance for such phantom CFG nodes, and also how to reliably drive patching from edits to the lifted source corresponding to predicated instructions.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions