Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
Expand Up @@ -79,3 +79,4 @@
*.tgz binary
*.war binary
*.zip binary

6 changes: 6 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/pcodeparse.y
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ extern int pcodeerror(const char *str );
UserOpSymbol *useropsym;
LabelSymbol *labelsym;
StartSymbol *startsym;
OffsetSymbol *offsetsym;
EndSymbol *endsym;
Next2Symbol *next2sym;
OperandSymbol *operandsym;
Expand Down Expand Up @@ -78,6 +79,7 @@ extern int pcodeerror(const char *str );
%token <varsym> VARSYM
%token <operandsym> OPERANDSYM
%token <startsym> STARTSYM
%token <offsetsym> OFFSETSYM
%token <endsym> ENDSYM
%token <next2sym> NEXT2SYM
%token <labelsym> LABELSYM
Expand Down Expand Up @@ -225,6 +227,7 @@ label: '<' LABELSYM '>' { $$ = $2; }
specificsymbol: VARSYM { $$ = $1; }
| OPERANDSYM { $$ = $1; }
| STARTSYM { $$ = $1; }
| OFFSETSYM { $$ = $1; }
| ENDSYM { $$ = $1; }
| NEXT2SYM { $$ = $1; }
;
Expand Down Expand Up @@ -752,6 +755,9 @@ int4 PcodeSnippet::lex(void)
case SleighSymbol::start_symbol:
yylval.startsym = (StartSymbol *)sym;
return STARTSYM;
case SleighSymbol::offset_symbol:
yylval.offsetsym = (OffsetSymbol *)sym;
return OFFSETSYM;
case SleighSymbol::end_symbol:
yylval.endsym = (EndSymbol *)sym;
return ENDSYM;
Expand Down
8 changes: 8 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/semantics.cc
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,8 @@ uintb ConstTpl::fix(const ParserWalker &walker) const
switch(type) {
case j_start:
return walker.getAddr().getOffset(); // Fill in starting address placeholder with real address
case j_offset:
return walker.getAddr().getOffset(); // Fill in starting address placeholder with real address
Comment on lines +124 to +125

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The comment for case j_offset appears to be a copy-paste from case j_start. This is misleading as it doesn't mention that this case handles the operand_offset symbol. Please update the comment for clarity.

  case j_offset:
    return walker.getAddr().getOffset(); // Fill in operand offset placeholder with instruction address

case j_next:
Comment on lines +124 to 126
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

operand_offset currently resolves to the instruction start.

ConstTpl::fix() returns walker.getAddr().getOffset() for j_offset, which is identical to j_start. As a result the new operand_offset symbol never yields the operand’s byte offset; it just repeats the instruction start address. That breaks the feature everywhere this constant is consumed. Use the parser walker’s operand-offset accessor (same one used by OffsetInstructionValue) instead:

-  case j_offset:
-    return walker.getAddr().getOffset(); // Fill in starting address placeholder with real address
+  case j_offset:
+    return walker.getOffset(-1); // Fill in operand-offset placeholder with actual operand position
🤖 Prompt for AI Agents
In Ghidra/Features/Decompiler/src/decompile/cpp/semantics.cc around lines
124-126, ConstTpl::fix() currently returns walker.getAddr().getOffset() for
j_offset which yields the instruction start (same as j_start) instead of the
operand byte offset; replace that return to call the parser walker's
operand-offset accessor (the same accessor used by OffsetInstructionValue) so
operand_offset resolves to the operand’s actual byte offset rather than the
instruction start.

return walker.getNaddr().getOffset(); // Fill in next address placeholder with real address
case j_next2:
Expand Down Expand Up @@ -350,6 +352,9 @@ void ConstTpl::saveXml(ostream &s) const
case j_start:
s << "start\"/>";
break;
case j_offset:
s << "operand_offset\"/>";
break;
case j_next:
s << "next\"/>";
break;
Expand Down Expand Up @@ -408,6 +413,9 @@ void ConstTpl::restoreXml(const Element *el,const AddrSpaceManager *manage)
else if (typestring=="start") {
type = j_start;
}
else if (typestring=="operand_offset") {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Accept a legacy alias when restoring XML by recognizing both "operand_offset" and the older "offset" string to improve backward compatibility. [best practice]

Suggested change
else if (typestring=="operand_offset") {
else if (typestring=="operand_offset" || typestring=="offset") {
Why Change? ⭐

The change simply extends the existing string comparison to accept a legacy alias ("offset") in addition to the new "operand_offset" token.

  • Syntax: The expression (typestring == "operand_offset" || typestring == "offset") is valid C++ for std::string comparisons.
  • Context: restoreXml already uses typestring (const string&) to select the ConstTpl::type, and saveXml emits "operand_offset". Accepting "offset" only affects parsing and maps it to the same enum value j_offset.
  • Safety: No new dependencies or calls are introduced. There is no modification of surrounding logic or state, and no ambiguity introduced because both tokens now explicitly map to the same enum case.
  • Assumptions: typestring is std::string as shown in the surrounding code; there is no other branch that relies on "offset" meaning something different in this same restoreXml switch. Given the PR diff, this is a backward-compatibility improvement and does not change runtime behavior for existing "operand_offset" XML.
    Therefore the improved code is syntactically correct, executable, and safe to merge.

type = j_offset;
}
else if (typestring=="next") {
type = j_next;
}
Expand Down
2 changes: 1 addition & 1 deletion Ghidra/Features/Decompiler/src/decompile/cpp/semantics.hh
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ class ConstTpl {
public:
enum const_type { real=0, handle=1, j_start=2, j_next=3, j_next2=4, j_curspace=5,
j_curspace_size=6, spaceid=7, j_relative=8,
j_flowref=9, j_flowref_size=10, j_flowdest=11, j_flowdest_size=12 };
j_flowref=9, j_flowref_size=10, j_flowdest=11, j_flowdest_size=12, j_offset=13 };
enum v_field { v_space=0, v_offset=1, v_size=2, v_offset_plus=3 };
private:
const_type type;
Expand Down
2 changes: 2 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/slgh_compile.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1817,6 +1817,8 @@ void SleighCompile::predefinedSymbols(void)
symtab.addSymbol(spacesym);
StartSymbol *startsym = new StartSymbol("inst_start",getConstantSpace());
symtab.addSymbol(startsym);
OffsetSymbol *offsetsym = new OffsetSymbol("operand_offset",getConstantSpace());
symtab.addSymbol(offsetsym);
EndSymbol *endsym = new EndSymbol("inst_next",getConstantSpace());
symtab.addSymbol(endsym);
Next2Symbol *next2sym = new Next2Symbol("inst_next2",getConstantSpace());
Expand Down
4 changes: 4 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/slghparse.y
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,7 @@ extern int sleigherror(const char *str );
LabelSymbol *labelsym;
SubtableSymbol *subtablesym;
StartSymbol *startsym;
OffsetSymbol *offsetsym;
EndSymbol *endsym;
Next2Symbol *next2sym;
OperandSymbol *operandsym;
Expand Down Expand Up @@ -123,6 +124,7 @@ extern int sleigherror(const char *str );
%token <varlistsym> VARLISTSYM
%token <operandsym> OPERANDSYM
%token <startsym> STARTSYM
%token <offsetsym> OFFSETSYM
%token <endsym> ENDSYM
%token <next2sym> NEXT2SYM
%token <macrosym> MACROSYM
Expand Down Expand Up @@ -504,6 +506,7 @@ specificsymbol: VARSYM { $$ = $1; }
| SPECSYM { $$ = $1; }
| OPERANDSYM { $$ = $1; }
| STARTSYM { $$ = $1; }
| OFFSETSYM { $$ = $1; }
| ENDSYM { $$ = $1; }
| NEXT2SYM { $$ = $1; }
;
Expand Down Expand Up @@ -579,6 +582,7 @@ anysymbol: SPACESYM { $$ = $1; }
| VARLISTSYM { $$ = $1; }
| OPERANDSYM { $$ = $1; }
| STARTSYM { $$ = $1; }
| OFFSETSYM { $$ = $1; }
| ENDSYM { $$ = $1; }
| NEXT2SYM { $$ = $1; }
| BITSYM { $$ = $1; }
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -478,6 +478,8 @@ PatternExpression *PatternExpression::restoreExpression(const Element *el,Transl
res = new OperandValue();
else if (nm == "start_exp")
res = new StartInstructionValue();
else if (nm == "offset_exp")
res = new OperandOffsetValue();
else if (nm == "end_exp")
res = new EndInstructionValue();
else if (nm == "plus_exp")
Expand Down
14 changes: 14 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/slghpatexpress.hh
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,20 @@ public:
virtual void saveXml(ostream &s) const { s << "<start_exp/>"; }
virtual void restoreXml(const Element *el,Translate *trans) {}
};

class OperandOffsetValue : public PatternValue {
public:
OperandOffsetValue(void) {}
virtual intb getValue(ParserWalker &walker) const {
return (intb)walker.getOffset(-1);
}
virtual TokenPattern genMinPattern(const vector<TokenPattern> &ops) const { return TokenPattern(); }
virtual TokenPattern genPattern(intb val) const { return TokenPattern(); }
virtual intb minValue(void) const { return (intb)0; }
virtual intb maxValue(void) const { return (intb)0; }
virtual void saveXml(ostream &s) const { s << "<offset_exp/>"; }
virtual void restoreXml(const Element *el,Translate *trans) {}
};

class EndInstructionValue : public PatternValue {
public:
Expand Down
3 changes: 3 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/slghscan.l
Original file line number Diff line number Diff line change
Expand Up @@ -431,6 +431,9 @@ int4 find_symbol(void) {
case SleighSymbol::start_symbol:
sleighlval.startsym = (StartSymbol *)sym;
return STARTSYM;
case SleighSymbol::offset_symbol:
sleighlval.offsetsym = (OffsetSymbol *)sym;
return OFFSETSYM;
case SleighSymbol::end_symbol:
sleighlval.endsym = (EndSymbol *)sym;
return ENDSYM;
Expand Down
66 changes: 66 additions & 0 deletions Ghidra/Features/Decompiler/src/decompile/cpp/slghsymbol.cc
Original file line number Diff line number Diff line change
Expand Up @@ -254,6 +254,8 @@ void SymbolTable::restoreSymbolHeader(const Element *el)
sym = new OperandSymbol();
else if (el->getName() == "start_sym_head")
sym = new StartSymbol();
else if (el->getName() == "offset_sym_head")
sym = new OffsetSymbol();
else if (el->getName() == "end_sym_head")
sym = new EndSymbol();
else if (el->getName() == "next2_sym_head")
Expand Down Expand Up @@ -1196,6 +1198,70 @@ void StartSymbol::restoreXml(const Element *el,SleighBase *trans)
patexp->layClaim();
}

OffsetSymbol::OffsetSymbol(const string &nm,AddrSpace *cspc) : SpecificSymbol(nm)

{
const_space = cspc;
patexp = new OperandOffsetValue();
patexp->layClaim();
}

OffsetSymbol::~OffsetSymbol(void)

{
if (patexp != (PatternExpression *)0)
PatternExpression::release(patexp);
}

VarnodeTpl *OffsetSymbol::getVarnode(void) const

{ // Returns current operand offset as a constant
ConstTpl spc(const_space);
ConstTpl off(ConstTpl::j_offset);
ConstTpl sz_zero;
return new VarnodeTpl(spc,off,sz_zero);
}

void OffsetSymbol::getFixedHandle(FixedHandle &hand,ParserWalker &walker) const

{
hand.space = walker.getCurSpace();
hand.offset_space = (AddrSpace *)0;
hand.offset_offset = walker.getAddr().getOffset(); // Get starting address of instruction

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This comment is a copy from StartSymbol and is a bit generic. To improve clarity, it would be better to make it specific to operand_offset.

  hand.offset_offset = walker.getAddr().getOffset(); // Get instruction offset for operand_offset symbol

hand.size = hand.space->getAddrSize();
}

void OffsetSymbol::print(ostream &s,ParserWalker &walker) const

{
intb val = (intb) walker.getAddr().getOffset();
s << "0x" << std::hex << val << std::dec;
}
Comment on lines +1226 to +1239
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Operand offset needs operand-relative displacement

OffsetSymbol::getFixedHandle() and print() still return the instruction start (walker.getAddr().getOffset()), so operand_offset collapses to start and never exposes the intended operand displacement. Use the operand-offset value (e.g., via patexp->getValue(walker) or by adding the operand’s relative offset to the instruction base) so the symbol actually reports the current operand’s offset bytes. Right now every use of operand_offset will be wrong.

🤖 Prompt for AI Agents
In Ghidra/Features/Decompiler/src/decompile/cpp/slghsymbol.cc around lines
1226-1239, both OffsetSymbol::getFixedHandle() and OffsetSymbol::print()
currently use walker.getAddr().getOffset() (instruction start) which causes
operand_offset to equal the instruction base; replace that with the
operand-relative displacement so the symbol reflects the operand offset: in
getFixedHandle() compute the offset by adding the operand's relative offset to
the instruction base (e.g. base = walker.getAddr().getOffset(); rel =
patexp->getValue(walker) or whatever mechanism provides the operand-relative
value; hand.offset_offset = base + rel) and in print() use the same computed
operand offset (not walker.getAddr().getOffset()) when formatting the hex value.


void OffsetSymbol::saveXml(ostream &s) const

{
s << "<offset_sym";
SleighSymbol::saveXmlHeader(s);
s << "/>\n";
}

void OffsetSymbol::saveXmlHeader(ostream &s) const

{
s << "<offset_sym_head";
SleighSymbol::saveXmlHeader(s);
s << "/>\n";
}

void OffsetSymbol::restoreXml(const Element *el,SleighBase *trans)

{
const_space = trans->getConstantSpace();
patexp = new OperandOffsetValue();
patexp->layClaim();
}

EndSymbol::EndSymbol(const string &nm,AddrSpace *cspc) : SpecificSymbol(nm)

{
Expand Down
19 changes: 18 additions & 1 deletion Ghidra/Features/Decompiler/src/decompile/cpp/slghsymbol.hh
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ class SleighSymbol {
public:
enum symbol_type { space_symbol, token_symbol, userop_symbol, value_symbol, valuemap_symbol,
name_symbol, varnode_symbol, varnodelist_symbol, operand_symbol,
start_symbol, end_symbol, next2_symbol, subtable_symbol, macro_symbol, section_symbol,
start_symbol, offset_symbol, end_symbol, next2_symbol, subtable_symbol, macro_symbol, section_symbol,
bitrange_symbol, context_symbol, epsilon_symbol, label_symbol,
dummy_symbol };
private:
Expand Down Expand Up @@ -376,6 +376,23 @@ public:
virtual void restoreXml(const Element *el,SleighBase *trans);
};

class OffsetSymbol : public SpecificSymbol {
AddrSpace *const_space;
PatternExpression *patexp;
public:
OffsetSymbol(void) { patexp = (PatternExpression *)0; } // For use with restoreXml
OffsetSymbol(const string &nm,AddrSpace *cspc);
virtual ~OffsetSymbol(void);
virtual VarnodeTpl *getVarnode(void) const;
virtual PatternExpression *getPatternExpression(void) const { return patexp; }
virtual void getFixedHandle(FixedHandle &hand,ParserWalker &walker) const;
virtual void print(ostream &s,ParserWalker &walker) const;
virtual symbol_type getType(void) const { return offset_symbol; }
virtual void saveXml(ostream &s) const;
virtual void saveXmlHeader(ostream &s) const;
virtual void restoreXml(const Element *el,SleighBase *trans);
};

class EndSymbol : public SpecificSymbol {
AddrSpace *const_space;
PatternExpression *patexp;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -342,6 +342,7 @@ specific_symbol[String purpose] returns [SpecificSymbol symbol]
if (sym == null) {
unknownSymbolError($s.getText(), find($s), "start, end, next2, operand, epsilon, or varnode", purpose);
} else if(sym.getType() != symbol_type.start_symbol
&& sym.getType() != symbol_type.offset_symbol
&& sym.getType() != symbol_type.end_symbol
&& sym.getType() != symbol_type.next2_symbol
&& sym.getType() != symbol_type.operand_symbol
Expand Down Expand Up @@ -839,6 +840,7 @@ pattern_symbol[String purpose] returns [PatternExpression expr]
}
$expr = os.getPatternExpression();
} else if(sym.getType() == symbol_type.start_symbol
|| sym.getType() == symbol_type.offset_symbol
|| sym.getType() == symbol_type.end_symbol
|| sym.getType() == symbol_type.next2_symbol
|| sym.getType() == symbol_type.epsilon_symbol
Expand Down Expand Up @@ -872,6 +874,7 @@ pattern_symbol2[String purpose] returns [PatternExpression expr]
if (sym == null) {
unknownSymbolError($s.getText(), find($s), "start, end, next2, operand, epsilon, or varnode", purpose);
} else if(sym.getType() == symbol_type.start_symbol
|| sym.getType() == symbol_type.offset_symbol
|| sym.getType() == symbol_type.end_symbol
|| sym.getType() == symbol_type.next2_symbol
|| sym.getType() == symbol_type.operand_symbol
Expand Down Expand Up @@ -943,6 +946,7 @@ cstatement[VectorSTL<ContextChange> r]
|| sym.getType() == symbol_type.name_symbol
|| sym.getType() == symbol_type.varnodelist_symbol
|| sym.getType() == symbol_type.start_symbol
|| sym.getType() == symbol_type.offset_symbol
|| sym.getType() == symbol_type.end_symbol
|| sym.getType() == symbol_type.next2_symbol
|| sym.getType() == symbol_type.operand_symbol
Expand Down Expand Up @@ -1170,6 +1174,7 @@ assignment returns [VectorSTL<OpTpl> value]
if (sym == null) {
$value = pcode.newOutput(find(id), false, e, $id.getText());
} else if(sym.getType() != symbol_type.start_symbol
&& sym.getType() != symbol_type.offset_symbol
&& sym.getType() != symbol_type.end_symbol
&& sym.getType() != symbol_type.next2_symbol
&& sym.getType() != symbol_type.operand_symbol
Expand Down Expand Up @@ -1486,6 +1491,7 @@ expr_apply returns [Object value]
pcode.reportError(find($t), "macro invocation not allowed as expression");
}
} else if(sym.getType() == symbol_type.start_symbol
|| sym.getType() == symbol_type.offset_symbol
|| sym.getType() == symbol_type.end_symbol
|| sym.getType() == symbol_type.next2_symbol
|| sym.getType() == symbol_type.operand_symbol
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -570,6 +570,9 @@ else if (sym instanceof VarnodeSymbol) {
else if (sym instanceof StartSymbol) {
// Ignore. We handle inst_start in semantic processing
}
else if (sym instanceof OffsetSymbol) {
// Ignore. We handle inst_start in semantic processing
}
Comment on lines +573 to +575

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The comment here seems to be a copy-paste error from the StartSymbol case. It should refer to operand_offset instead of inst_start to avoid confusion.

Suggested change
else if (sym instanceof OffsetSymbol) {
// Ignore. We handle inst_start in semantic processing
}
else if (sym instanceof OffsetSymbol) {
// Ignore. We handle operand_offset in semantic processing
}

else if (sym instanceof EndSymbol) {
// Ignore. We handle inst_next in semantic processing
}
Expand Down
Loading