-
Notifications
You must be signed in to change notification settings - Fork 0
Clone introduce operand offset #17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Klaus Kämpf <[email protected]>
Signed-off-by: Klaus Kämpf <[email protected]>
|
CodeAnt AI is reviewing your PR. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
Summary of ChangesHello @visz11, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces the 'operand_offset' symbol into Ghidra's Sleigh language, providing a new mechanism for Sleigh specifications to access the offset of the current instruction. This enhancement allows for more precise and flexible definition of instruction semantics and pattern matching, particularly useful in scenarios where the instruction's relative position is a critical factor. The changes involve comprehensive updates to the Sleigh parser, compiler, semantic representation, and symbol management across both the C++ and Java codebases. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
WalkthroughAdds a new OffsetSymbol (“operand_offset”) across Sleigh/decompiler: grammar tokens, lexers, parsers, pattern-expression nodes, symbol classes, XML (save/restore), predefined symbol registration, and p-code/assembler integration. Extends ConstTpl with j_offset and corresponding XML tags/handling. Minor whitespace tweak in .gitattributes. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Dev as Sleigh Author
participant SC as SleighCompile
participant ST as SymbolTable
participant XML as XML (save/restore)
note over SC,ST: Build time
Dev->>SC: Compile SLEIGH
SC->>ST: Register predefined "operand_offset" (OffsetSymbol)
ST-->>Dev: Symbol available in language
participant P as Parser/Lexer
participant PE as PatternExpression
participant RT as Runtime Eval
note over P,RT: Parse + Evaluate
Dev->>P: Use [operand_offset] in patterns
P->>PE: Restore "offset_exp"
PE-->>P: Offset(Operand) PatternValue
P->>RT: Build instruction semantics
RT->>RT: walker.getOffset(-1)
RT-->>Dev: Concrete offset value
sequenceDiagram
autonumber
participant Cpp as Decompiler C++ side
participant Java as Java pcodeCPort/Sleigh
participant IO as XML
note over Cpp,Java: Cross-language ConstTpl
Cpp->>IO: saveXml type="operand_offset" (j_offset)
Java->>IO: saveXml type="offset" (j_offset)
IO-->>Cpp: restoreXml -> j_offset
IO-->>Java: restoreXml -> j_offset
note over Cpp,Java: Both sides understand operand offset constants
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new Sleigh symbol, operand_offset, which appears to be an alias for inst_start. The changes are extensive, touching many files across both the C++ and Java codebases, including parsers, semantic analysis, and serialization logic. While the overall approach of cloning existing inst_start logic seems sound, there are several critical and high-severity issues related to copy-paste errors, particularly in the Java serialization code, which could lead to incorrect behavior. I've also pointed out several medium-severity issues regarding misleading comments and Javadoc that should be addressed to improve code clarity and maintainability.
| public void saveXml(PrintStream s) { | ||
| s.append("<start_sym"); | ||
| saveSleighSymbolXmlHeader(s); | ||
| s.println("/>"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be a copy-paste error from StartSymbol.java. The XML tag for saveXml should be <offset_sym... instead of <start_sym.... This is a serialization bug and will cause incorrect output.
| public void saveXml(PrintStream s) { | |
| s.append("<start_sym"); | |
| saveSleighSymbolXmlHeader(s); | |
| s.println("/>"); | |
| } | |
| public void saveXml(PrintStream s) { | |
| s.append("<offset_sym"); | |
| saveSleighSymbolXmlHeader(s); | |
| s.println("/>"); | |
| } |
| public void saveXmlHeader(PrintStream s) { | ||
| s.append("<start_sym_head"); | ||
| saveSleighSymbolXmlHeader(s); | ||
| s.append("/>\n"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to saveXml, this saveXmlHeader method has a copy-paste error from StartSymbol.java. The XML tag should be <offset_sym_head... instead of <start_sym_head.... This will cause incorrect serialization.
| public void saveXmlHeader(PrintStream s) { | |
| s.append("<start_sym_head"); | |
| saveSleighSymbolXmlHeader(s); | |
| s.append("/>\n"); | |
| } | |
| public void saveXmlHeader(PrintStream s) { | |
| s.append("<offset_sym_head"); | |
| saveSleighSymbolXmlHeader(s); | |
| s.append("/>\n"); | |
| } |
| case j_offset: | ||
| s.append("offset\"/>"); | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an inconsistency in the XML serialization between the Java and C++ implementations for j_offset. The Java code uses "offset", while the C++ code (semantics.cc) uses "operand_offset". This can cause compatibility issues. It's best to use the more descriptive "operand_offset" in both places.
| case j_offset: | |
| s.append("offset\"/>"); | |
| break; | |
| case j_offset: | |
| s.append("operand_offset\"/>"); | |
| break; |
| else if (typestring.equals("offset")) { | ||
| type = const_type.j_offset; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To match the C++ implementation and ensure XML compatibility, restoreXml should expect "operand_offset" instead of "offset". This is consistent with the suggested change in saveXml.
| else if (typestring.equals("offset")) { | |
| type = const_type.j_offset; | |
| } | |
| else if (typestring.equals("operand_offset")) { | |
| type = const_type.j_offset; | |
| } |
| case j_offset: | ||
| return walker.getAddr().getOffset(); // Fill in starting address placeholder with real address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment for case j_offset appears to be a copy-paste from case j_start. This is misleading as it doesn't mention that this case handles the operand_offset symbol. Please update the comment for clarity.
case j_offset:
return walker.getAddr().getOffset(); // Fill in operand offset placeholder with instruction address| { | ||
| hand.space = walker.getCurSpace(); | ||
| hand.offset_space = (AddrSpace *)0; | ||
| hand.offset_offset = walker.getAddr().getOffset(); // Get starting address of instruction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| else if (sym instanceof OffsetSymbol) { | ||
| // Ignore. We handle inst_start in semantic processing | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment here seems to be a copy-paste error from the StartSymbol case. It should refer to operand_offset instead of inst_start to avoid confusion.
| else if (sym instanceof OffsetSymbol) { | |
| // Ignore. We handle inst_start in semantic processing | |
| } | |
| else if (sym instanceof OffsetSymbol) { | |
| // Ignore. We handle operand_offset in semantic processing | |
| } |
| /** | ||
| * | ||
| * | ||
| * The offset value of the current instructions address | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /** | ||
| * | ||
| * | ||
| * TripleSymbol with semantic value equal to offset of instruction's | ||
| * current address | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Javadoc for this class could be improved. It has an extra blank line and refers to TripleSymbol, which is a superclass but SpecificSymbol would be more precise. A more descriptive comment would be beneficial.
/**
* Symbol representing the offset of the current instruction's address,
* referred to as 'operand_offset' in Sleigh.
*/| else if (typestring=="start") { | ||
| type = j_start; | ||
| } | ||
| else if (typestring=="operand_offset") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Accept a legacy alias when restoring XML by recognizing both "operand_offset" and the older "offset" string to improve backward compatibility. [best practice]
| else if (typestring=="operand_offset") { | |
| else if (typestring=="operand_offset" || typestring=="offset") { |
Why Change? ⭐
The change simply extends the existing string comparison to accept a legacy alias ("offset") in addition to the new "operand_offset" token.
- Syntax: The expression (typestring == "operand_offset" || typestring == "offset") is valid C++ for std::string comparisons.
- Context: restoreXml already uses typestring (const string&) to select the ConstTpl::type, and saveXml emits "operand_offset". Accepting "offset" only affects parsing and maps it to the same enum value j_offset.
- Safety: No new dependencies or calls are introduced. There is no modification of surrounding logic or state, and no ambiguity introduced because both tokens now explicitly map to the same enum case.
- Assumptions: typestring is std::string as shown in the surrounding code; there is no other branch that relies on "offset" meaning something different in this same restoreXml switch. Given the PR diff, this is a backward-compatibility improvement and does not change runtime behavior for existing "operand_offset" XML.
Therefore the improved code is syntactically correct, executable, and safe to merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (1)
Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/assembler/sleigh/SleighAssemblerBuilder.java (1)
573-575: Update comment to reflect offset handling.The logic is correct to ignore
OffsetSymbolduring grammar construction, mirroring the treatment ofStartSymbol. However, the comment should be updated for clarity.Apply this diff to update the comment:
else if (sym instanceof OffsetSymbol) { - // Ignore. We handle inst_start in semantic processing + // Ignore. We handle operand_offset in semantic processing }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
.gitattributes(1 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/pcodeparse.y(4 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/semantics.cc(3 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/semantics.hh(1 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/slgh_compile.cc(1 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/slghparse.y(4 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/slghpatexpress.cc(1 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/slghpatexpress.hh(1 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/slghscan.l(1 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/slghsymbol.cc(2 hunks)Ghidra/Features/Decompiler/src/decompile/cpp/slghsymbol.hh(2 hunks)Ghidra/Framework/SoftwareModeling/src/main/antlr/ghidra/sleigh/grammar/SleighCompiler.g(6 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/assembler/sleigh/SleighAssemblerBuilder.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/processors/sleigh/expression/OffsetInstructionValue.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/processors/sleigh/expression/PatternExpression.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/processors/sleigh/symbol/OffsetSymbol.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/processors/sleigh/symbol/SymbolTable.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/semantics/ConstTpl.java(3 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slgh_compile/SleighCompile.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slgh_compile/Yylval.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghpatexpress/OffsetInstructionValue.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghpatexpress/PatternExpression.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghsymbol/OffsetSymbol.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghsymbol/SymbolTable.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghsymbol/symbol_type.java(1 hunks)Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/program/model/lang/PcodeParser.java(1 hunks)
🔇 Additional comments (23)
Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/processors/sleigh/expression/PatternExpression.java (1)
52-53: LGTM!The addition of
offset_exphandling follows the established pattern for instruction-value expressions. The placement betweenstart_expandend_expis logical and maintains consistency with similar expression types.Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/processors/sleigh/symbol/SymbolTable.java (1)
153-154: LGTM!The addition of
offset_sym_headhandling is consistent with the existing symbol restoration pattern. The placement betweenstart_sym_headandend_sym_headmaintains logical grouping with other instruction-related symbols.Ghidra/Features/Decompiler/src/decompile/cpp/semantics.hh (1)
37-37: LGTM!The addition of
j_offset=13to theconst_typeenum follows the sequential numbering convention and extends the constant template types to support offset-based operands.Ghidra/Features/Decompiler/src/decompile/cpp/slghpatexpress.cc (1)
481-482: LGTM!The addition of
offset_exphandling in the C++ implementation mirrors the Java version and follows the established pattern for restoring pattern expressions. The placement maintains consistency with the instruction-value expression hierarchy.Ghidra/Framework/SoftwareModeling/src/main/antlr/ghidra/sleigh/grammar/SleighCompiler.g (6)
345-345: LGTM!The addition of
offset_symbolto thespecific_symbolrule enables offset symbols to be recognized as valid specific symbols during parsing, consistent with other instruction-related symbols.
843-843: LGTM!The addition of
offset_symboltopattern_symbolallows offset symbols to be used in pattern expressions, maintaining consistency with other instruction-value symbols.
877-877: LGTM!The addition of
offset_symboltopattern_symbol2ensures offset symbols can be used in all pattern expression contexts, completing the integration with the pattern expression framework.
949-949: LGTM!The addition of
offset_symbolto theglobalsetcontext enables offset symbols to participate in context-setting operations, consistent with other specific symbol types.
1177-1177: LGTM!The addition of
offset_symbolto assignment statement handling allows offset symbols to be used as lvalues, maintaining parity with other instruction-related symbols.
1494-1494: LGTM!The addition of
offset_symbolto the function application handling enables offset symbols to be used in subpiece operations, completing the semantic integration.Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghpatexpress/OffsetInstructionValue.java (1)
28-67: LGTM!The
OffsetInstructionValueclass is well-structured and follows the established pattern for instruction-value classes. The implementation correctly:
- Extends
PatternValuewith proper constructor delegation- Retrieves the offset via
pos.getOffset(-1)- Returns appropriate min/max values (0) for a runtime-determined offset
- Implements XML serialization with the
offset_exptagGhidra/Features/Decompiler/src/decompile/cpp/slghpatexpress.hh (1)
157-169: LGTM!The
OperandOffsetValueclass is the C++ counterpart to the JavaOffsetInstructionValueand follows the established pattern. The implementation correctly:
- Derives from
PatternValuewith appropriate virtual method overrides- Retrieves offset via
walker.getOffset(-1), consistent with the Java implementation- Returns appropriate min/max values (0) for a runtime-determined offset
- Uses the
offset_expXML tag for serialization consistencyGhidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghpatexpress/PatternExpression.java (1)
87-89: LGTM! Consistent pattern for offset expression handling.The addition of the "offset_exp" branch follows the same pattern as other instruction value types (start_exp, end_exp, etc.), creating an OffsetInstructionValue and delegating restoration to the standard flow.
Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghsymbol/SymbolTable.java (1)
300-302: LGTM! Proper symbol header restoration.The "offset_sym_head" handling is correctly positioned between start_sym_head and end_sym_head, following the established pattern for symbol restoration.
Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/semantics/ConstTpl.java (2)
49-50: LGTM! Enum extension is properly placed.Adding j_offset at the end of the existing constants is safe and avoids breaking existing ordinal values.
301-303: LGTM! XML serialization/deserialization correctly implemented.The XML handling for j_offset is consistent with other constant types, using "offset" as the type string in both save and restore operations.
Also applies to: 358-360
Ghidra/Features/Decompiler/src/decompile/cpp/pcodeparse.y (1)
42-42: LGTM! Parser integration follows established patterns.The OFFSETSYM token is properly declared, added to the union, integrated into grammar rules, and mapped in the lexer—all consistent with how other symbol types (STARTSYM, ENDSYM, etc.) are handled.
Also applies to: 82-82, 230-230, 758-760
Ghidra/Features/Decompiler/src/decompile/cpp/slghsymbol.hh (2)
30-30: LGTM! Enum value logically positioned.Placing offset_symbol between start_symbol and end_symbol is consistent with the semantic relationship of these instruction-relative symbols.
379-394: LGTM! Class declaration follows established patterns.The OffsetSymbol class properly extends SpecificSymbol and declares all necessary methods and members consistent with similar classes (StartSymbol, EndSymbol).
Ghidra/Features/Decompiler/src/decompile/cpp/slghparse.y (1)
62-62: LGTM! Consistent parser integration.The OFFSETSYM integration mirrors the pcodeparse.y changes and correctly includes it in both specificsymbol and anysymbol grammar rules, enabling proper recognition across different parsing contexts.
Also applies to: 127-127, 509-509, 585-585
Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/app/plugin/processors/sleigh/symbol/OffsetSymbol.java (2)
44-46: LGTM! Core methods properly implemented.The pattern expression, fixed handle retrieval, and printing methods are correctly implemented, providing the expected offset-based semantics.
Also applies to: 52-57, 63-66, 69-71
76-80: LGTM! XML restoration uses correct element name.The restoreXml method correctly uses "offset_sym" which matches the expected element name in the restoration flow.
Note: This implementation appears to lack saveXml and saveXmlHeader methods. Verify whether these are inherited from the base class or if they need explicit implementation.
Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghsymbol/OffsetSymbol.java (1)
52-57: LGTM! Core implementation is correct.The constructor, VarnodeTpl generation, getFixedHandle, print, and restoreXml methods are properly implemented, providing the expected offset-based functionality consistent with other instruction-relative symbols.
Also applies to: 68-73, 75-80, 103-107
| case j_offset: | ||
| return walker.getAddr().getOffset(); // Fill in starting address placeholder with real address | ||
| case j_next: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
operand_offset currently resolves to the instruction start.
ConstTpl::fix() returns walker.getAddr().getOffset() for j_offset, which is identical to j_start. As a result the new operand_offset symbol never yields the operand’s byte offset; it just repeats the instruction start address. That breaks the feature everywhere this constant is consumed. Use the parser walker’s operand-offset accessor (same one used by OffsetInstructionValue) instead:
- case j_offset:
- return walker.getAddr().getOffset(); // Fill in starting address placeholder with real address
+ case j_offset:
+ return walker.getOffset(-1); // Fill in operand-offset placeholder with actual operand position🤖 Prompt for AI Agents
In Ghidra/Features/Decompiler/src/decompile/cpp/semantics.cc around lines
124-126, ConstTpl::fix() currently returns walker.getAddr().getOffset() for
j_offset which yields the instruction start (same as j_start) instead of the
operand byte offset; replace that return to call the parser walker's
operand-offset accessor (the same accessor used by OffsetInstructionValue) so
operand_offset resolves to the operand’s actual byte offset rather than the
instruction start.
|
|
||
| { | ||
| hand.space = walker.getCurSpace(); | ||
| hand.offset_space = (AddrSpace *)0; | ||
| hand.offset_offset = walker.getAddr().getOffset(); // Get starting address of instruction | ||
| hand.size = hand.space->getAddrSize(); | ||
| } | ||
|
|
||
| void OffsetSymbol::print(ostream &s,ParserWalker &walker) const | ||
|
|
||
| { | ||
| intb val = (intb) walker.getAddr().getOffset(); | ||
| s << "0x" << std::hex << val << std::dec; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Operand offset needs operand-relative displacement
OffsetSymbol::getFixedHandle() and print() still return the instruction start (walker.getAddr().getOffset()), so operand_offset collapses to start and never exposes the intended operand displacement. Use the operand-offset value (e.g., via patexp->getValue(walker) or by adding the operand’s relative offset to the instruction base) so the symbol actually reports the current operand’s offset bytes. Right now every use of operand_offset will be wrong.
🤖 Prompt for AI Agents
In Ghidra/Features/Decompiler/src/decompile/cpp/slghsymbol.cc around lines
1226-1239, both OffsetSymbol::getFixedHandle() and OffsetSymbol::print()
currently use walker.getAddr().getOffset() (instruction start) which causes
operand_offset to equal the instruction base; replace that with the
operand-relative displacement so the symbol reflects the operand offset: in
getFixedHandle() compute the offset by adding the operand's relative offset to
the instruction base (e.g. base = walker.getAddr().getOffset(); rel =
patexp->getValue(walker) or whatever mechanism provides the operand-relative
value; hand.offset_offset = base + rel) and in print() use the same computed
operand offset (not walker.getAddr().getOffset()) when formatting the hex value.
| public void saveXml(PrintStream s) { | ||
| s.append("<start_sym"); | ||
| saveSleighSymbolXmlHeader(s); | ||
| s.println("/>"); | ||
| } | ||
|
|
||
| @Override | ||
| public void saveXmlHeader(PrintStream s) { | ||
| s.append("<start_sym_head"); | ||
| saveSleighSymbolXmlHeader(s); | ||
| s.append("/>\n"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Wrong XML tags used for OffsetSymbol serialization.
The saveXml and saveXmlHeader methods emit "start_sym" and "start_sym_head" tags, but this is an OffsetSymbol, not a StartSymbol. This copy-paste error will cause serialization/deserialization failures:
- SymbolTable.java (line 300-302) expects "offset_sym_head" during restoration
- The Java implementation (sleigh.symbol OffsetSymbol.java, line 77) uses "offset_sym"
- getType() at line 49 correctly returns offset_symbol, confirming this should not use start_sym tags
Apply this diff to fix the XML tag names:
@Override
public void saveXml(PrintStream s) {
- s.append("<start_sym");
+ s.append("<offset_sym");
saveSleighSymbolXmlHeader(s);
s.println("/>");
}
@Override
public void saveXmlHeader(PrintStream s) {
- s.append("<start_sym_head");
+ s.append("<offset_sym_head");
saveSleighSymbolXmlHeader(s);
s.append("/>\n");
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public void saveXml(PrintStream s) { | |
| s.append("<start_sym"); | |
| saveSleighSymbolXmlHeader(s); | |
| s.println("/>"); | |
| } | |
| @Override | |
| public void saveXmlHeader(PrintStream s) { | |
| s.append("<start_sym_head"); | |
| saveSleighSymbolXmlHeader(s); | |
| s.append("/>\n"); | |
| } | |
| public void saveXml(PrintStream s) { | |
| s.append("<offset_sym"); | |
| saveSleighSymbolXmlHeader(s); | |
| s.println("/>"); | |
| } | |
| @Override | |
| public void saveXmlHeader(PrintStream s) { | |
| s.append("<offset_sym_head"); | |
| saveSleighSymbolXmlHeader(s); | |
| s.append("/>\n"); | |
| } |
🤖 Prompt for AI Agents
In
Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcodeCPort/slghsymbol/OffsetSymbol.java
around lines 89 to 100, the XML tag names are incorrect (they use "start_sym"
and "start_sym_head"); change them to "offset_sym" and "offset_sym_head"
respectively so serialization matches SymbolTable and the Java sleigh.symbol
implementation — keep the existing self-closing syntax and newline behavior
(saveXml should emit "<offset_sym .../>" and saveXmlHeader should emit
"<offset_sym_head .../>\n") to ensure correct deserialization.
|
/refacto-visz |
|
Refacto is reviewing this PR. Please wait for the review comments to be posted. |
Pull Request Feedback 🔍
|
|
CodeAnt AI finished reviewing your PR. |
Code Review: Symbol Integration Enhancement👍 Well Done
📁 Selected files for review (26)
🎯 Custom Instructions
📝 Additional Comments
|
CodeAnt-AI Description
Expose operand_offset (instruction address offset) to Sleigh language users
What Changed
operand_offsetthat evaluates to the current instruction's address offset and can be used in Sleigh patterns, expressions, and pcode specifications.operand_offsetin language files and grammar; it is persisted/restored in XML as offset_exp/offset_sym so specs using it parse and compile correctly.operand_offsetis rendered as a hex value (e.g., 0x...) representing the instruction start address.Impact
✅ Can reference instruction address offset in Sleigh patterns✅ Fewer Sleigh spec parse/compile errors when using operand_offset✅ Disassembly/decompiler can show instruction offset where specified💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.
Summary by CodeRabbit
New Features
Style