@arvi18 arvi18 commented Apr 26, 2025

At the moment, when collapsing blocks, ruleBlockIfNoExit eagerly selects the first noexit branch it finds. This merge request introduces some basic heuristics for the case where both branches are noexit, to improve the readability of the decompiled code.

heuristic 1 : smallest block depth first

This first heuristic compares the depth of the generated code blocks and selects the branch with the smallest depth.
This tries to flatten the code by preferring small sequential "if return" statements to nested if blocks.

before / after (images: h1-before, h1-after)

heuristic 2 : return last

If only one of the branches will end with a return in the final code, place it last.
This makes the code closer to what a human would write.

before / after (images: h2-before, h2-after)

heuristic 3 : only return

If both branches end with a return in the final code and one contains only the return opcode, select it.
This also tries to flatten the code, as a block containing only a return is most likely an early exit.

before / after (images: h3-before, h3-after)

These heuristics are tried in order (1 to 3), stopping at the first that matches. If none matches, we fall back to the previous behavior (selecting the first branch), so that only the intended changes are introduced.
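
The ordering of the three heuristics can be sketched as follows. This is a simplified model rather than the patch itself: the `Clause` struct and its fields (`depth`, `endsWithReturn`, `onlyReturn`) are hypothetical stand-ins for facts the real code derives from FlowBlock and PcodeOp.

```cpp
#include <cassert>

// Hypothetical summary of a noexit clause. The real patch queries these
// facts from the block structure instead of storing them in a struct.
struct Clause {
  int depth;            // nesting depth of the generated code block
  bool endsWithReturn;  // clause ends with a standard return
  bool onlyReturn;      // clause consists of the return op alone
};

// Returns 0 or 1: the index of the clause to use as the body of the "if".
int selectBestNoExit(const Clause &c0, const Clause &c1) {
  // Heuristic 1: smallest block depth first.
  if (c0.depth < c1.depth) return 0;
  if (c1.depth < c0.depth) return 1;

  // Heuristic 2: if exactly one clause ends with a return, pick the other
  // one, so that the returning clause is emitted last.
  if (c0.endsWithReturn && !c1.endsWithReturn) return 1;
  if (c1.endsWithReturn && !c0.endsWithReturn) return 0;

  // Heuristic 3: both clauses return; prefer the one that is only a return.
  if (c0.endsWithReturn && c1.endsWithReturn) {
    if (c0.onlyReturn) return 0;
    if (c1.onlyReturn) return 1;
  }

  // No heuristic matched: keep the previous behavior (first branch).
  return 0;
}
```

With equal depths, a clause that ends with a return loses to one that does not, so the return lands after the `if` body in the output.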

Here is the full decompiled code of libc before and after this merge request to compare the diffs on a large binary.

Summary by CodeRabbit

  • New Features

    • Added the ability to query the nesting depth of control-flow blocks in the decompiler, providing insight into structural hierarchy.
    • Introduced a method to retrieve the number of operations in a basic block.
    • Added a way to identify standard return operations during decompilation.
  • Improvements

    • Enhanced the selection logic for identifying the best no-exit clause in conditional blocks, improving control-flow analysis.

coderabbitai bot commented Apr 26, 2025

Walkthrough

The changes introduce new methods to compute and query the nesting depth of control flow blocks in the decompiler's internal representation. Several block classes now override a virtual getBlockDepth() method, providing a uniform interface for structural introspection. A method to determine the number of operations in a basic block is also added. Additionally, logic for selecting the best "no-exit" clause in conditional blocks is improved through a new heuristic-based helper method. A utility to identify standard return operations is introduced in the Pcode operation class. All changes are additive and do not alter existing control flow.
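
A rough model of the depth rules summarized in this review: plain blocks report 0, graph blocks add 1 to the deepest child, list and condition blocks pass the child depth through, and switch blocks add 2. The class names below are simplified stand-ins, not the Ghidra classes, and starting the inner maximum at 0 is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for FlowBlock: a plain block contributes no nesting.
struct Block {
  virtual ~Block() = default;
  virtual int getBlockDepth() const { return 0; }
};

// Stand-in for BlockGraph: one level deeper than its deepest child.
struct GraphBlock : Block {
  std::vector<const Block *> children;  // non-owning, for illustration
  int getInnerBlockDepth() const {
    int maxDepth = 0;
    for (const Block *c : children)
      maxDepth = std::max(maxDepth, c->getBlockDepth());
    return maxDepth;
  }
  int getBlockDepth() const override { return getInnerBlockDepth() + 1; }
};

// Stand-in for BlockList / BlockCondition: joins blocks, adds no depth.
struct ListBlock : GraphBlock {
  int getBlockDepth() const override { return getInnerBlockDepth(); }
};

// Stand-in for BlockSwitch: +1 for the switch, +1 for the case/default.
struct SwitchBlock : GraphBlock {
  int getBlockDepth() const override { return getInnerBlockDepth() + 2; }
};
```

Under these rules a list wrapping a depth-1 graph still reports 1, while a switch over plain blocks reports 2.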

Changes

File(s) Change Summary
.../decompile/cpp/block.cc, .../decompile/cpp/block.hh Added/overrode methods to compute block depth for various block types; added method to get basic block op count.
.../decompile/cpp/blockaction.cc, .../decompile/cpp/blockaction.hh Improved "if no-exit" rule with new heuristic selection logic; added helper method for selecting best no-exit clause.
.../decompile/cpp/op.hh Added method to determine if a Pcode operation is a standard return.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Decompiler
    participant FlowBlock
    participant BlockGraph
    participant BlockList/BlockCondition/BlockSwitch/etc.

    User->>Decompiler: Request block nesting depth
    Decompiler->>FlowBlock: getBlockDepth()
    FlowBlock-->>Decompiler: Return 0 or overridden value
    Decompiler->>BlockGraph: getBlockDepth()
    BlockGraph->>BlockGraph: getInnerBlockDepth()
    BlockGraph->>Contained Blocks: getBlockDepth() (recursive)
    Contained Blocks-->>BlockGraph: Return depth
    BlockGraph-->>Decompiler: Return max depth + 1
    Decompiler-->>User: Return overall block depth
sequenceDiagram
    participant CollapseStructure
    participant FlowBlock
    participant selectBestNoExit

    CollapseStructure->>FlowBlock: Examine outgoing edges for no-exit clauses
    FlowBlock-->>CollapseStructure: Return candidate clauses
    CollapseStructure->>selectBestNoExit: Compare two clauses
    selectBestNoExit-->>CollapseStructure: Return best clause index
    CollapseStructure->>FlowBlock: Apply selection logic and update condition

@gemini-code-assist gemini-code-assist bot left a comment

Hello @arvi18, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request addresses an issue in the decompiler where the ruleBlockIfNoExit function would eagerly select the first no-exit branch when collapsing blocks. The changes introduce heuristics to improve the readability of the decompiled code when both branches are no-exit. These heuristics prioritize smaller block depths, placing returns last, and selecting branches containing only a return opcode. The heuristics are applied in order, falling back to the original behavior if none match. The changes involve modifications to block.cc, block.hh, blockaction.cc, blockaction.hh, and op.hh to implement these heuristics and calculate block depths.

Highlights

  • Heuristics for No-Exit Branch Selection: Introduces three heuristics to improve the selection of no-exit branches in ruleBlockIfNoExit: smallest block depth first, return last, and only return.
  • Block Depth Calculation: Adds methods to calculate the depth of code blocks (getBlockDepth) in various block types (BlockGraph, BlockList, BlockCondition, BlockSwitch) to support the first heuristic.
  • Return Statement Detection: Adds isStandardReturn method to PcodeOp to identify return operations for the second and third heuristics.

Changelog

  • Ghidra/Features/Decompiler/src/decompile/cpp/block.cc
    • Added getInnerBlockDepth method to BlockGraph to calculate the maximum depth of child blocks (lines 1326-1337).
    • Added getOpSize method to BlockBasic to return the number of PcodeOps contained in the block (lines 2522-2526).
    • Added getBlockDepth method to BlockList to return the inner block depth (lines 2770-2775).
    • Added getBlockDepth method to BlockCondition to return the inner block depth (lines 2861-2866).
    • Added getBlockDepth method to BlockSwitch to calculate block depth based on case block depths (lines 3222-3232).
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.hh
    • Added virtual getBlockDepth method to FlowBlock (line 184).
    • Added getInnerBlockDepth and getBlockDepth methods to BlockGraph (lines 303-304).
    • Added getOpSize method to BlockBasic (line 407).
    • Added getBlockDepth method to BlockList (line 508).
    • Added getBlockDepth method to BlockCondition (line 538).
    • Added getBlockDepth method to BlockSwitch (line 684).
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc
    • Modified ruleBlockIfNoExit to incorporate heuristics for selecting the best no-exit branch (lines 1484-1488).
    • Added selectBestNoExit method to compare two no-exit branches based on block depth, return statements, and block content (lines 1504-1534).
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.hh
    • Added selectBestNoExit declaration to CollapseStructure class (line 220).
  • Ghidra/Features/Decompiler/src/decompile/cpp/op.hh
    • Added isStandardReturn method to PcodeOp to check if the op is a standard return (line 180).
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces heuristics to improve the readability of decompiled code when both branches of an if statement are noexit. The heuristics prioritize smaller block depth, placing returns last, and selecting branches that contain only a return opcode. The changes seem well-structured, and the before/after examples help illustrate their intent and impact.

Summary of Findings

  • Missing null check: In BlockBasic::getOpSize, there is no check to ensure that op is not empty before calling op.size(). This could lead to a crash if the block has no operations. While this might be an uncommon scenario, it's good to add a check for robustness.
  • Potential integer overflow: In BlockSwitch::getBlockDepth, the calculation maxDepth + 2 could potentially lead to an integer overflow if maxDepth is sufficiently large. While the likelihood of this occurring in practice might be low, it's worth considering adding a check to prevent this.

Merge Readiness

The pull request introduces valuable heuristics to improve decompiled code readability. While the changes are generally well-structured, there are a couple of potential issues identified that should be addressed before merging. I recommend addressing the critical and high severity issues before merging. I am unable to directly approve this pull request, and recommend that others review and approve this code before merging.

Comment on lines +2522 to +2526
int4 BlockBasic::getOpSize(void)

{
return op.size();
}

critical

There is no check to ensure that op is not empty before calling op.size(). This could lead to a crash if the block has no operations. While this might be an uncommon scenario, it's good to add a check for robustness.

Suggested change
 int4 BlockBasic::getOpSize(void)
 {
-  return op.size();
+  if (op.empty()) {
+    return 0; // Or some other appropriate value
+  }
+  return op.size();
 }
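
For reference, `size()` on an empty standard container is well-defined and returns 0, so the guard would not change the result if `op` is a standard container. The sketch below assumes `op` is a `std::list` of `PcodeOp` pointers, which is an assumption since the declaration is not shown in this thread. `PcodeOp` here is an empty placeholder type, and the free function only mirrors the shape of the member.

```cpp
#include <cassert>
#include <list>

struct PcodeOp {};  // placeholder type, for illustration only

// Mirrors the shape of BlockBasic::getOpSize() on a stand-in container.
// std::list::size() needs no emptiness check before it is called.
int getOpSize(const std::list<PcodeOp *> &op) {
  return static_cast<int>(op.size());
}
```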

maxDepth=depth;
}
}
return maxDepth+2; // +1 for switch block and +1 for case/default block

high

The calculation maxDepth + 2 could potentially lead to an integer overflow if maxDepth is sufficiently large. While the likelihood of this occurring in practice might be low, it's worth considering adding a check to prevent this.

Suggested change
-return maxDepth+2; // +1 for switch block and +1 for case/default block
+{
+  if (maxDepth > INT_MAX - 2) { // INT_MAX is defined in <limits.h>
+    return INT_MAX; // Or handle the overflow in some other way
+  }
+  return maxDepth + 2; // +1 for switch block and +1 for case/default block
+}
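
If such a guard were wanted, a small saturating helper is one way to express it. This is only a sketch: `saturatingAdd` is a hypothetical name, and `std::int32_t` stands in for the decompiler's `int4`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Adds a non-negative increment to a depth, clamping at the maximum value
// instead of overflowing. std::int32_t stands in for int4 here.
std::int32_t saturatingAdd(std::int32_t depth, std::int32_t inc) {
  assert(inc >= 0);
  const std::int32_t maxVal = std::numeric_limits<std::int32_t>::max();
  if (depth > maxVal - inc) return maxVal;  // the sum would overflow
  return depth + inc;
}
```

A switch depth would then read `saturatingAdd(maxDepth, 2)`.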

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc (1)

1464-1498: Guard logic in ruleBlockIfNoExit is harder to reason about

The current two-pass approach (bestIndex initialised to -1, then potentially overwritten in the second match) works but:

  1. Calls selectBestNoExit() even when only one branch qualifies, doing an unnecessary depth walk.
  2. Relies on side-effects of the loop index to detect “second match”, which hampers readability.

A simpler, single-pass selection avoids both issues:

-  for(i=0;i<2;++i) {
-    clauseblock = bl->getOut(i);
-    ...
-    if (bestIndex==-1){
-      bestIndex=i;
-    }else{ // both match
-      bestIndex = selectBestNoExit(bl->getOut(0),bl->getOut(1));
-    }
-  }
+  FlowBlock* candidates[2] = {nullptr,nullptr};
+  int candCount = 0;
+  for(i=0;i<2;++i) {
+    clauseblock = bl->getOut(i);
+    if (!/* all the same qualification tests */) continue;
+    candidates[candCount++] = clauseblock;
+  }
+
+  if (candCount == 0) return false;
+  bestIndex = (candCount == 1)
+                ? (candidates[0] == bl->getOut(0) ? 0 : 1)
+                : selectBestNoExit(bl->getOut(0), bl->getOut(1));

This removes hidden coupling on i and prevents redundant work.
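
A self-contained variant of the same idea, storing branch indices instead of block pointers. The `qualifies` and `selectBest` callbacks are hypothetical placeholders for the qualification tests and for `selectBestNoExit`.

```cpp
#include <cassert>
#include <functional>

// Returns the index (0 or 1) of the branch to use, or -1 if neither
// branch qualifies. Both callbacks are hypothetical placeholders.
int pickNoExitBranch(const std::function<bool(int)> &qualifies,
                     const std::function<int()> &selectBest) {
  int candidates[2];
  int candCount = 0;
  for (int i = 0; i < 2; ++i) {
    if (qualifies(i)) candidates[candCount++] = i;
  }
  if (candCount == 0) return -1;
  if (candCount == 1) return candidates[0];
  return selectBest();  // evaluated only when both branches qualify
}
```

Keeping the index avoids mapping a pointer back to its edge position, and the comparison runs only in the two-candidate case.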

Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.hh (1)

220-221: Minor: mark selectBestNoExit as static

The helper does not rely on this state; making it static (or even moving it to an anonymous namespace inside the .cc) clarifies that it is purely functional and avoids the accidental capture of member data in the future.

-  int4 selectBestNoExit(FlowBlock *clause0,FlowBlock *clause1);
+  static int4 selectBestNoExit(FlowBlock *clause0, FlowBlock *clause1);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2abfd4e and 5f927f3.

📒 Files selected for processing (5)
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.cc (5 hunks)
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.hh (6 hunks)
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc (2 hunks)
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.hh (1 hunks)
  • Ghidra/Features/Decompiler/src/decompile/cpp/op.hh (1 hunks)
🔇 Additional comments (6)
Ghidra/Features/Decompiler/src/decompile/cpp/block.cc (5)

1326-1338: Implemented getInnerBlockDepth method to support block depth querying

This new method calculates the maximum block depth across all sub-blocks in a BlockGraph, which is a key part of the improved heuristics for selecting between no-exit branches.


2522-2526: Added getOpSize method to count operations in a basic block

Simple but useful accessor that returns the number of operations in a basic block, which will help in determining if a branch contains "only return" for the new heuristics.


2770-2775: Added getBlockDepth implementation for BlockList

This implementation specifies that list blocks join blocks together but don't increase depth themselves, maintaining a flat hierarchy when appropriate.


2861-2866: Added getBlockDepth implementation for BlockCondition

Similar to BlockList, this implementation specifies that condition blocks also join blocks together without increasing depth.


3222-3232: Added getBlockDepth implementation for BlockSwitch

This implementation correctly accounts for the nesting hierarchy of switch constructs by adding 2 to the maximum depth found in case blocks - one for the switch block itself and one for the case/default blocks.

Ghidra/Features/Decompiler/src/decompile/cpp/op.hh (1)

179-180: Well-implemented helper method for identifying standard returns.

The new isStandardReturn() method clearly distinguishes standard return operations from other return-like operations with special halt conditions. This addition supports the "Return last" heuristic mentioned in the PR objectives, allowing the decompiler to identify when a branch ends with a standard return operation.

Comment on lines +303 to +304
virtual int4 getInnerBlockDepth(); ///< Return max depth of child blocks
virtual int4 getBlockDepth() {return getInnerBlockDepth()+1;}
🛠️ Refactor suggestion

Propagate const/override to all implementations

The newly added overrides in BlockGraph, BlockList, BlockCondition, and BlockSwitch follow the non-const prototype.
After adjusting the base declaration, remember to update these:

-virtual int4 getInnerBlockDepth();
-int4 getBlockDepth() { return getInnerBlockDepth() + 1; }
+int4 getInnerBlockDepth() const;
+int4 getBlockDepth() const override { return getInnerBlockDepth() + 1; }

and similarly for the other three classes.

Failing to add override can silently create a new unrelated virtual instead of overriding, should the signature drift in the future.

Also applies to: 508-509, 538-539, 684-685
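
The signature-drift problem can be shown with a small standalone sketch. The class names are illustrative and unrelated to the Ghidra hierarchy.

```cpp
#include <cassert>

struct Base {
  virtual ~Base() = default;
  virtual int getBlockDepth() const { return 0; }
};

// The signature differs (missing const), so this declares a new, unrelated
// virtual that hides the base one. Writing `override` here would turn the
// mistake into a compile error.
struct Drifted : Base {
  virtual int getBlockDepth() { return 1; }
};

// Matching signature plus `override`: the compiler checks the override.
struct Correct : Base {
  int getBlockDepth() const override { return 1; }
};

// Calls through a const reference, as a caller holding a base type would.
int depthViaBase(const Base &b) { return b.getBlockDepth(); }
```

Called through `const Base &`, `Drifted` still reports the base value of 0, while `Correct` reports 1.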

Comment on lines +184 to 185
virtual int4 getBlockDepth(void) {return 0;} ///< Return the depth in code block of \b this
void saveXmlEdges(ostream &s) const; ///< Save edge information to an XML stream
🛠️ Refactor suggestion

Declare getBlockDepth() as const and mark overrides explicitly

getBlockDepth() is a query that does not mutate state, yet it’s non-const in the base class and all overrides.
Keeping it non-const prevents calling it on const FlowBlock objects, forces const_casts elsewhere, and diverges from the surrounding accessors (e.g. getType()).

-  virtual int4 getBlockDepth(void) {return 0;}        ///< Return the depth in code block of \b this
+  virtual int4 getBlockDepth(void) const { return 0; } ///< Return the depth in code block of \b this

Then, in every derived class add the const qualifier and the override specifier, e.g.

-  virtual int4 getBlockDepth() { return getInnerBlockDepth() + 1; }
+  int4 getBlockDepth() const override { return getInnerBlockDepth() + 1; }

Besides catching accidental signature mismatches at compile-time, this makes the intent explicit and enables further optimisation by the compiler.

Suggested change
-virtual int4 getBlockDepth(void) {return 0;} ///< Return the depth in code block of \b this
+virtual int4 getBlockDepth(void) const { return 0; } ///< Return the depth in code block of \b this
 void saveXmlEdges(ostream &s) const; ///< Save edge information to an XML stream

Comment on lines +407 to 408
int4 getOpSize(void); ///< Number of PcodeOps contained in \b this block
static bool noInterveningStatement(PcodeOp *first,int4 path,PcodeOp *last);
🛠️ Refactor suggestion

BlockBasic::getOpSize() should be const

The helper only inspects the op list; making it const allows calling it on const-qualified FlowBlock pointers (as done in selectBestNoExit) without a const_cast.

-int4 getOpSize(void);
+int4 getOpSize(void) const;

Remember to update the definition in block.cc.

Suggested change
-int4 getOpSize(void); ///< Number of PcodeOps contained in \b this block
+int4 getOpSize(void) const; ///< Number of PcodeOps contained in \b this block
 static bool noInterveningStatement(PcodeOp *first,int4 path,PcodeOp *last);

Comment on lines +1504 to 1534
int4 CollapseStructure::selectBestNoExit(FlowBlock *clause0,FlowBlock *clause1)

{
  // select lowest block depth
  int4 depth0 = clause0->getBlockDepth();
  int4 depth1 = clause1->getBlockDepth();
  if (depth0<depth1)return 0;
  if (depth1<depth0)return 1;

  // same depth, prefer non return
  bool isRet0 = clause0->lastOp()!=(PcodeOp *)0 && clause0->lastOp()->isStandardReturn();
  bool isRet1 = clause1->lastOp()!=(PcodeOp *)0 && clause1->lastOp()->isStandardReturn();
  if(isRet0 && !isRet1) return 1;
  if(isRet1 && !isRet0) return 0;

  // prefer block containing only return op
  if(isRet0){ // both are return
    FlowBlock* fb;
    if(clause0->getType()==FlowBlock::t_copy){
      fb = ((BlockCopy*)clause0)->subBlock(0);
      if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
    }
    if(clause1->getType()==FlowBlock::t_copy){
      fb = ((BlockCopy*)clause1)->subBlock(0);
      if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 1;
    }
  }

  // fall back to previous behavior
  return 0;
}
🛠️ Refactor suggestion

selectBestNoExit – unchecked casts and missing headers

  1. static_cast<BlockCopy*>(clauseX) is used without verifying that clauseX really is a t_copy. If an unforeseen FlowBlock::block_type slips through, the cast is UB. Add a guard:
if (clause0->getType() == FlowBlock::t_copy) {
  auto *cpy = static_cast<BlockCopy*>(clause0);
  ...
}
  2. BlockCopy / BlockBasic are referenced but block.hh is not included in blockaction.cc. Today this works only because a chain of includes drags the header in indirectly; add an explicit include to avoid accidental breakage:
#include "block.hh"
  3. Consider early-returning when the depths differ to avoid computing return properties needlessly.

  4. A nit: int4 depth0 = ... can under-flow if getBlockDepth() ever returns -1 to signal “unknown”. Guard with assert(depth >= 0) or treat -1 as “worst” to prevent negative comparison surprises.
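
The "treat -1 as worst" option from the last point could look like the sketch below. The -1 sentinel is hypothetical, since nothing in the patch shown here returns it, and both function names are made up for illustration.

```cpp
#include <cassert>
#include <limits>

// Maps a depth that may use -1 for "unknown" onto a value that sorts last.
// The -1 sentinel is an assumption made for this example.
int normalizeDepth(int depth) {
  return depth < 0 ? std::numeric_limits<int>::max() : depth;
}

// Returns 0 or 1 for the shallower clause, or -1 when the depths tie and
// the next heuristic should decide.
int shallowerClause(int depth0, int depth1) {
  const int d0 = normalizeDepth(depth0);
  const int d1 = normalizeDepth(depth1);
  if (d0 < d1) return 0;
  if (d1 < d0) return 1;
  return -1;
}
```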

Collaborator

visz11 commented Jul 22, 2025

/refacto-test

refacto-test bot commented Jul 22, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

// if (clauseblock->isInteriorGotoTarget()) {
// bl->setGotoBranch(i);
// return true;
// }
Security Issue: Potential Null Pointer Dereference detected at blockaction.cc:1483-1484.

if (bestIndex==-1){
	bestIndex=i;
}

The code assumes i is a valid index at this point, but there's no explicit check that i has been initialized or is within valid range before using it. While the surrounding code context suggests i should be valid here (it's used in a loop where i is initialized), adding an explicit bounds check would prevent potential null pointer dereferences if the code flow changes in the future.

Fix Recommendation:

if (bestIndex==-1 && i >= 0 && i < bl->sizeOut()){
	bestIndex=i;
}

Collaborator

visz11 commented Sep 11, 2025

/refacto-test-arvi

refacto-visz bot commented Sep 11, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

refacto-visz bot commented Sep 11, 2025

Code Review: Branch Selection Logic in Decompiler

👍 Well Done
Enhanced Branch Selection

Improved logic for selecting optimal NoExit branch increases decompilation reliability.

Structured Depth Analysis

Block depth tracking enables more predictable control flow handling.

Comprehensive Selection Logic

New algorithm thoroughly evaluates branch quality with multiple selection criteria.

📌 Files Processed
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.cc
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.hh
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.hh
  • Ghidra/Features/Decompiler/src/decompile/cpp/op.hh
📝 Additional Comments
Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc (5)
Missing Error Logging

No diagnostic logging when branch selection fails. Adding logging would improve debugging of unexpected NoExit branch conditions.

Standards:

  • ISO-IEC-25010-Reliability-Recoverability
  • SRE-Observability
Redundant Depth Check

Block depth calculation may be expensive as it traverses the block hierarchy. Consider caching depth values if selectBestNoExit() is called frequently in performance-critical paths. This would avoid redundant depth calculations.

Standards:

  • ISO-IEC-25010-Performance-Resource-Utilization
  • Algorithm-Opt-Memoization
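
A minimal sketch of what such caching might look like, using a hypothetical `CachedBlock` class rather than the decompiler's types. The `computeCount` member exists only to make the effect observable in the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical node that remembers its depth after the first computation.
class CachedBlock {
  std::vector<const CachedBlock *> children_;
  mutable int cachedDepth_ = -1;  // -1 means "not computed yet"
public:
  mutable int computeCount = 0;   // instrumentation for this example only

  void addChild(const CachedBlock *c) {
    children_.push_back(c);
    cachedDepth_ = -1;  // invalidates this node only, not its ancestors
  }

  int getBlockDepth() const {
    if (cachedDepth_ >= 0) return cachedDepth_;
    ++computeCount;
    int maxDepth = 0;
    for (const CachedBlock *c : children_)
      maxDepth = std::max(maxDepth, c->getBlockDepth());
    cachedDepth_ = children_.empty() ? 0 : maxDepth + 1;
    return cachedDepth_;
  }
};
```

The sketch invalidates only the node that gained a child. A real cache would also have to invalidate ancestors whenever structuring rewrites the block graph, which may outweigh the saving.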
Conditional Optimization

Multiple calls to lastOp() could be optimized by storing the result in a variable. Consider caching lastOp() results to avoid redundant method calls, reducing function call overhead in this performance-sensitive decompiler path.

Standards:

  • ISO-IEC-25010-Performance-Time-Behaviour
  • Algorithm-Opt-Variable-Reuse
Inconsistent Return Style

Inconsistent spacing in return statements (some have space after 'if' and before 'return', others don't). Maintaining consistent style improves readability and reduces cognitive load during maintenance.

Standards:

  • Clean-Code-Formatting
  • Clean-Code-Style-Consistency
Reusable Function Opportunity

Duplicate code pattern checking block type and size for both clauses. Extracting a helper function would reduce duplication and improve maintainability by centralizing this logic.

Standards:

  • Clean-Code-DRY
  • Design-Pattern-Extract-Method
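
One possible shape for the extracted helper, shown against minimal stand-in types so that it compiles on its own. `isSingleOpClause` is a hypothetical name, and the stand-ins model only the members the helper touches.

```cpp
#include <cassert>

// Minimal stand-ins for the decompiler types used by the check.
struct FlowBlock {
  enum block_type { t_basic, t_copy, t_other };
  virtual ~FlowBlock() = default;
  virtual block_type getType() const = 0;
};

struct BlockBasic : FlowBlock {
  int opCount = 0;
  block_type getType() const override { return t_basic; }
  int getOpSize() const { return opCount; }
};

struct BlockCopy : FlowBlock {
  FlowBlock *sub = nullptr;
  block_type getType() const override { return t_copy; }
  FlowBlock *subBlock(int) const { return sub; }
};

// Hypothetical helper: true if the clause is a copy of a basic block that
// holds exactly one op. Null and type checks come before each cast.
bool isSingleOpClause(const FlowBlock *clause) {
  if (clause == nullptr || clause->getType() != FlowBlock::t_copy) return false;
  const FlowBlock *fb = static_cast<const BlockCopy *>(clause)->subBlock(0);
  if (fb == nullptr || fb->getType() != FlowBlock::t_basic) return false;
  return static_cast<const BlockBasic *>(fb)->getOpSize() == 1;
}
```

The two duplicated branches would then reduce to `if (isSingleOpClause(clause0)) return 0;` followed by the same test on `clause1`.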

Comment on lines +1522 to +1524
if(clause0->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause0)->subBlock(0);
if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
Potential Null Dereference

The subBlock(0) call might return null, which isn't checked before dereferencing. If null, fb->getType() will cause a null pointer dereference crash.

Suggested change
 if(clause0->getType()==FlowBlock::t_copy){
 fb = ((BlockCopy*)clause0)->subBlock(0);
-if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
+if(fb != nullptr && fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
+}
Standards
  • ISO-IEC-25010-Reliability-Fault-Tolerance
  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • DbC-Preconditions

Comment on lines +1526 to +1528
if(clause1->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause1)->subBlock(0);
if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 1;
Duplicate Null Dereference

Similar to previous issue, subBlock(0) call might return null with no null check before dereferencing. Potential crash when accessing fb->getType().

Suggested change
- if(clause1->getType()==FlowBlock::t_copy){
-   fb = ((BlockCopy*)clause1)->subBlock(0);
-   if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 1;
+ if(clause1->getType()==FlowBlock::t_copy){
+   fb = ((BlockCopy*)clause1)->subBlock(0);
+   if(fb != nullptr && fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 1;
+ }
Standards
  • ISO-IEC-25010-Reliability-Fault-Tolerance
  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • DbC-Preconditions

Comment on lines +1514 to +1517
bool isRet0 = clause0->lastOp()!=(PcodeOp *)0 && clause0->lastOp()->isStandardReturn();
bool isRet1 = clause1->lastOp()!=(PcodeOp *)0 && clause1->lastOp()->isStandardReturn();
if(isRet0 && !isRet1) return 1;
if(isRet1 && !isRet0) return 0;

Return Block Check

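The logic prefers non-return blocks, and the return values are easy to misread as reversed. When clause0 is a return block and clause1 isn't, it returns 1 (selecting clause1), which is consistent with the 'prefer non return' comment. Inline comments on each return would make the intent explicit.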

Suggested change
- bool isRet0 = clause0->lastOp()!=(PcodeOp *)0 && clause0->lastOp()->isStandardReturn();
- bool isRet1 = clause1->lastOp()!=(PcodeOp *)0 && clause1->lastOp()->isStandardReturn();
- if(isRet0 && !isRet1) return 1;
- if(isRet1 && !isRet0) return 0;
+ bool isRet0 = clause0->lastOp()!=(PcodeOp *)0 && clause0->lastOp()->isStandardReturn();
+ bool isRet1 = clause1->lastOp()!=(PcodeOp *)0 && clause1->lastOp()->isStandardReturn();
+ if(isRet0 && !isRet1) return 1; // clause1 is non-return, prefer it
+ if(isRet1 && !isRet0) return 0; // clause0 is non-return, prefer it
Standards
  • Logic-Verification-Boundary-Conditions
  • Algorithm-Correctness-Conditional-Logic
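
The return-last check can be sketched in isolation as follows. `Op`, `Clause`, `endsInReturn`, and `preferNonReturn` are hypothetical names used only for this illustration; the sketch assumes the last op of a clause may be null, as the original null comparison suggests.

```cpp
#include <cassert>

// Hypothetical stand-ins: only what the return-last check needs.
struct Op { bool standardReturn; };
struct Clause { const Op *last; };  // last op of the clause, may be null

// True if the clause ends in a standard return. The last op is read once
// and kept in a local before it is dereferenced.
bool endsInReturn(const Clause *clause) {
  assert(clause != nullptr);
  const Op *op = clause->last;
  return op != nullptr && op->standardReturn;
}

// Index of the clause to select, or -1 if the heuristic does not apply
// (both clauses end in a return, or neither does).
int preferNonReturn(const Clause *clause0, const Clause *clause1) {
  bool isRet0 = endsInReturn(clause0);
  bool isRet1 = endsInReturn(clause1);
  if (isRet0 && !isRet1) return 1;  // clause1 is non-return, prefer it
  if (isRet1 && !isRet0) return 0;  // clause0 is non-return, prefer it
  return -1;
}
```

Selecting the non-return clause as the no-exit body is what leaves the returning clause last in the emitted code.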

Comment on lines +1520 to 1529
if(isRet0){ // both are return
FlowBlock* fb;
if(clause0->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause0)->subBlock(0);
if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
}
if(clause1->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause1)->subBlock(0);
if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 1;
}

Complex Conditional Logic

Nested conditionals with repeated pattern and early returns create complex control flow. Extracting helper methods for block evaluation would improve readability and maintainability by reducing complexity.

Standards
  • Clean-Code-Functions
  • Design-Pattern-Extract-Method

@visz11
Collaborator

visz11 commented Sep 15, 2025

/refacto-test-arvi

@refacto-visz

refacto-visz bot commented Sep 15, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

@refacto-visz

refacto-visz bot commented Sep 15, 2025

Code Review: Branch Selection Algorithm

👍 Well Done
Improved Branch Selection Logic

Enhanced decision process for selecting optimal branches improves control flow structure.

Block Depth Tracking

Added getBlockDepth methods provide consistent depth calculation across block types.

📌 Files Processed
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.cc
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.hh
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.hh
  • Ghidra/Features/Decompiler/src/decompile/cpp/op.hh
📝 Additional Comments
Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc (8)
Null Pointer Risk

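lastOp() is called twice per clause: once for the null check and once for the dereference. If the two calls could ever return different results, the second call could dereference a null pointer. Caching the result in a local variable removes that possibility.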

Standards:

  • CWE-476
  • CWE-690
Inconsistent Return Logic

The function returns 0 if clause0 is a simple return but doesn't check clause1 first. If both are simple returns, it will always select clause0, creating inconsistent selection logic.

Standards:

  • Algorithm-Correctness-Decision-Logic
  • Logic-Verification-Consistency
  • Business-Rule-Validation
Unchecked Cast Risk

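Unchecked C-style cast from FlowBlock to BlockCopy, guarded only by the getType() check. If the type tag and the actual object type ever disagree, the cast yields undefined behavior and could cause memory corruption or a crash.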

Standards:

  • CWE-704
  • CWE-843
Default Return Value

Function returns 0 as fallback without clear documentation why first branch is preferred by default. Could lead to inconsistent decompilation results in edge cases.

Standards:

  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • ISO-IEC-25010-Reliability-Maturity
Redundant Condition Check

Comment indicates both are return but only isRet0 is checked. Consider validating isRet1 is also true to ensure comment accuracy and prevent logical errors.

Standards:

  • ISO-IEC-25010-Functional-Correctness-Appropriateness
Comparison Chain Efficiency

The selection logic performs multiple comparisons with redundant null checks. Consider optimizing by caching lastOp() results and combining related checks to reduce branching and potential null pointer dereferences. This would improve both performance and code robustness in the hot path.

Standards:

  • ISO-IEC-25010-Performance-Resource-Utilization
  • Algorithm-Opt-Branch-Reduction
Magic Number Usage

Direct return of magic numbers 0 and 1 reduces code readability. Consider using named constants to clarify the meaning of these branch index values throughout the selection logic.

Standards:

  • Clean-Code-Meaningful-Names
  • Clean-Code-Magic-Numbers
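
As an illustration of the named-constant suggestion, a minimal sketch follows. The enum `ClauseIndex` and its members are hypothetical names introduced here; the PR itself returns the plain integers 0 and 1.

```cpp
#include <cassert>

// Hypothetical named indices for the two clauses of the conditional.
enum ClauseIndex : int {
  CLAUSE_FIRST = 0,   // clause0, also the pre-existing default
  CLAUSE_SECOND = 1   // clause1
};

// Return-last check written with the named indices. `fallback` is what the
// caller wants when the check does not discriminate between the clauses.
ClauseIndex preferNonReturn(bool isRet0, bool isRet1, ClauseIndex fallback) {
  assert(fallback == CLAUSE_FIRST || fallback == CLAUSE_SECOND);
  if (isRet0 && !isRet1) return CLAUSE_SECOND;
  if (isRet1 && !isRet0) return CLAUSE_FIRST;
  return fallback;
}
```

Because the enum has a fixed underlying type of `int`, its values can still be used wherever an integer branch index is expected.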
Early Return Opportunity

The selectBestNoExit method implements a series of checks without early returns for some conditions. Using consistent early returns would improve readability and reduce cognitive complexity.

Standards:

  • Clean-Code-Function-Design
  • Clean-Code-Control-Flow
Ghidra/Features/Decompiler/src/decompile/cpp/block.cc (1)
Method Naming Clarity

Method name getOpSize() doesn't clearly communicate it returns the count of operations. A name like getOperationCount() would better express the method's purpose and return value type.

Standards:

  • Clean-Code-Meaningful-Names
  • Clean-Code-Intention-Revealing

Comment on lines +1507 to +1510
// select lowest block depth
int4 depth0 = clause0->getBlockDepth();
int4 depth1 = clause1->getBlockDepth();
if (depth0<depth1)return 0;

Missing Return Value

The function lacks a return value when depth0 equals depth1 and both blocks are not returns. If execution reaches line 1532 without hitting any return statement, undefined behavior occurs.

Suggested change
- // select lowest block depth
- int4 depth0 = clause0->getBlockDepth();
- int4 depth1 = clause1->getBlockDepth();
- if (depth0<depth1)return 0;
+ int4 depth0 = clause0->getBlockDepth();
+ int4 depth1 = clause1->getBlockDepth();
+ if (depth0<depth1) return 0;
+ if (depth1<depth0) return 1;
+ // If depths are equal, continue to next checks
Standards
  • Algorithm-Correctness-Control-Flow
  • Logic-Verification-Completeness
  • Mathematical-Accuracy-Return-Values
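
The symmetric depth comparison from the suggestion can be sketched as a standalone function. `selectByDepth` and its -1 "no decision" result are assumptions made for this illustration; in the PR the comparison is inline and a tie simply continues to the next check.

```cpp
#include <cassert>

// Returns 0 or 1 for the shallower clause, or -1 when the depths are equal
// so that the caller can move on to the next heuristic.
int selectByDepth(int depth0, int depth1) {
  assert(depth0 >= 0 && depth1 >= 0);
  if (depth0 < depth1) return 0;
  if (depth1 < depth0) return 1;
  return -1;
}
```

Every path through the function returns a value, including the tie case.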

@visz11
Collaborator

visz11 commented Sep 22, 2025

/refacto-test

@refacto-test

refacto-test bot commented Sep 22, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

@refacto-test

refacto-test bot commented Sep 22, 2025

Code Review: Branch Selection Logic in Decompiler

👍 Well Done
Improved Branch Selection

Enhanced control flow logic with systematic branch selection criteria improves decompiler output quality.

Block Depth Tracking

Added proper block depth tracking enables more accurate control flow representation.

📌 Files Processed
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.cc
  • Ghidra/Features/Decompiler/src/decompile/cpp/block.hh
  • Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.hh
  • Ghidra/Features/Decompiler/src/decompile/cpp/op.hh
📝 Additional Comments
Ghidra/Features/Decompiler/src/decompile/cpp/blockaction.cc (7)
Missing Return Value

The isRet0 block doesn't have a default return value if none of the conditions are met. This could lead to falling through to the default return 0 at line 1533, potentially selecting a suboptimal branch.

Standards:

  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • ISO-IEC-25010-Reliability-Maturity
Redundant Type Checking

Nearly identical type checking and condition evaluation is performed twice for both clause0 and clause1. This pattern creates code duplication and potentially redundant type checks. Consider refactoring into a helper function to improve performance and maintainability.

Standards:

  • ISO-IEC-25010-Performance-Efficiency-Resource-Utilization
  • Optimization-Pattern-Code-Reuse
  • Algorithmic-Complexity-Duplicate-Evaluation
Missing Error Handling

The code performs type casting without validating if subBlock(0) exists or returns a valid pointer. If subBlock(0) returns null or an invalid pointer, dereferencing it with getType() could cause undefined behavior or crashes.

Standards:

  • Algorithm-Correctness-Error-Handling
  • Logic-Verification-Null-Check
  • Business-Rule-Input-Validation
Redundant Variable Declaration

Variable 'fb' is declared but only used within conditional blocks. This creates potential logical confusion as the variable appears to have broader scope than needed and could be reused across different conditional blocks with different meanings.

Standards:

  • Algorithm-Correctness-Variable-Scope
  • Logic-Verification-Data-Flow
Consistent Return Style

Return statement lacks a space before the 'return' keyword, as in `if (depth0<depth1)return 0;`. Adding a space between ')' and 'return' would improve consistency with other code patterns and enhance readability.

Standards:

  • Clean-Code-Formatting
  • Maintainability-Quality-Consistency
Inconsistent Spacing Style

Missing space after 'if' keyword differs from other conditional statements in the file. Consistent spacing improves readability and maintainability by establishing visual patterns.

Standards:

  • Clean-Code-Formatting
  • Maintainability-Quality-Consistency
Method Documentation

While the method has a good descriptive comment, adding inline comments for each selection criterion would clarify the priority logic. This would help future maintainers understand the decision hierarchy.

Standards:

  • Clean-Code-Comments
  • Maintainability-Quality-Documentation
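
To make the priority order concrete, here is a minimal sketch of the three heuristics with one inline comment per criterion. `ClauseInfo` and `selectBestNoExitSketch` are hypothetical; the sketch assumes the three fields have already been computed (in the PR they come from `getBlockDepth()`, `lastOp()->isStandardReturn()`, and the `t_copy`/`getOpSize()` check).

```cpp
#include <cassert>

// Hypothetical flattened view of one clause of the conditional.
struct ClauseInfo {
  int depth;          // nesting depth of the clause
  bool endsInReturn;  // last op is a standard return
  bool returnOnly;    // clause contains nothing but the return op
};

// Returns the index (0 or 1) of the clause to select as the no-exit body.
int selectBestNoExitSketch(const ClauseInfo &c0, const ClauseInfo &c1) {
  assert(c0.depth >= 0 && c1.depth >= 0);
  // 1. smallest block depth first
  if (c0.depth < c1.depth) return 0;
  if (c1.depth < c0.depth) return 1;
  // 2. return last: select the clause that does not end in a return
  if (c0.endsInReturn && !c1.endsInReturn) return 1;
  if (c1.endsInReturn && !c0.endsInReturn) return 0;
  // 3. only return: both end in a return, prefer the one that is just a return
  if (c0.endsInReturn && c1.endsInReturn) {
    if (c0.returnOnly) return 0;
    if (c1.returnOnly) return 1;
  }
  // fallback: previous behavior, select the first clause
  return 0;
}
```

The checks run in order and stop at the first one that discriminates between the clauses, which matches the ordering described in the PR description.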

Comment on lines +1522 to +1523
if(clause0->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause0)->subBlock(0);

Null Pointer Risk

Dereferencing fb without null check before using it in the condition at line 1524. If subBlock(0) returns null, this will cause a null pointer dereference crash, reducing decompiler reliability.

    if(clause0->getType()==FlowBlock::t_copy){
      fb = ((BlockCopy*)clause0)->subBlock(0);
      if(fb != nullptr && fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
Committable Suggestion
Suggested change
- if(clause0->getType()==FlowBlock::t_copy){
-   fb = ((BlockCopy*)clause0)->subBlock(0);
+ if(clause0->getType()==FlowBlock::t_copy){
+   fb = ((BlockCopy*)clause0)->subBlock(0);
+   if(fb != nullptr && fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
Standards
  • ISO-IEC-25010-Reliability-Fault-Tolerance
  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • SRE-Error-Handling

Comment on lines +1526 to +1527
if(clause1->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause1)->subBlock(0);

Similar Null Risk

Same null pointer risk as RFC-01. The fb pointer is used at line 1528 without verifying that subBlock(0) returned a valid pointer, potentially causing decompiler crashes during analysis.

    if(clause1->getType()==FlowBlock::t_copy){
      fb = ((BlockCopy*)clause1)->subBlock(0);
      if(fb != nullptr && fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 1;
Committable Suggestion
Suggested change
- if(clause1->getType()==FlowBlock::t_copy){
-   fb = ((BlockCopy*)clause1)->subBlock(0);
+ if(clause1->getType()==FlowBlock::t_copy){
+   fb = ((BlockCopy*)clause1)->subBlock(0);
+   if(fb != nullptr && fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 1;
Standards
  • ISO-IEC-25010-Reliability-Fault-Tolerance
  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • SRE-Error-Handling

Comment on lines +1520 to +1521
if(isRet0){ // both are return
FlowBlock* fb;

Uninitialized Variable Usage

Variable fb is declared but not initialized before potential use in conditional statements. If the first condition at line 1522 fails, fb remains uninitialized but might be used later, causing undefined behavior.

Standards
  • ISO-IEC-25010-Reliability-Maturity
  • ISO-IEC-25010-Functional-Correctness-Appropriateness

Comment on lines +1514 to +1517
bool isRet0 = clause0->lastOp()!=(PcodeOp *)0 && clause0->lastOp()->isStandardReturn();
bool isRet1 = clause1->lastOp()!=(PcodeOp *)0 && clause1->lastOp()->isStandardReturn();
if(isRet0 && !isRet1) return 1;
if(isRet1 && !isRet0) return 0;

Inconsistent Return Checking

The logic prefers non-return blocks over return blocks, but this contradicts the later logic that specifically looks for return blocks with single operations. This creates inconsistent branch selection behavior where first non-returns are preferred, but then single-operation returns are preferred.

Standards
  • Algorithm-Correctness-Logic-Consistency
  • Logic-Verification-Control-Flow
  • Business-Rule-State-Consistency

Comment on lines +1516 to +1523
if(isRet0 && !isRet1) return 1;
if(isRet1 && !isRet0) return 0;

// prefer block containing only return op
if(isRet0){ // both are return
FlowBlock* fb;
if(clause0->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause0)->subBlock(0);

Duplicated Code Pattern

Nearly identical code blocks with only parameter and return value differences. Extracting a helper method that takes clause and return value would reduce duplication and improve maintainability.

Standards
  • Clean-Code-DRY
  • Refactoring-Extract-Method

Comment on lines +1508 to +1509
int4 depth0 = clause0->getBlockDepth();
int4 depth1 = clause1->getBlockDepth();

Inefficient Block Depth

Block depth calculations are performed unconditionally before being used in comparisons. This creates unnecessary computation overhead when the first comparison could short-circuit evaluation. Performance degrades with complex nested blocks requiring recursive depth calculations.

Standards
  • ISO-IEC-25010-Performance-Efficiency-Time-Behavior
  • Optimization-Pattern-Lazy-Evaluation
  • Algorithmic-Complexity-Conditional-Optimization

Comment on lines +1510 to +1511
if (depth0<depth1)return 0;
if (depth1<depth0)return 1;

Early Return Opportunity

These consecutive conditionals check mutually exclusive conditions. Using an if-else structure would better communicate the relationship between conditions and improve readability.

Standards
  • Clean-Code-Conditionals
  • Maintainability-Quality-Readability

Comment on lines +1514 to +1527
bool isRet0 = clause0->lastOp()!=(PcodeOp *)0 && clause0->lastOp()->isStandardReturn();
bool isRet1 = clause1->lastOp()!=(PcodeOp *)0 && clause1->lastOp()->isStandardReturn();
if(isRet0 && !isRet1) return 1;
if(isRet1 && !isRet0) return 0;

// prefer block containing only return op
if(isRet0){ // both are return
FlowBlock* fb;
if(clause0->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause0)->subBlock(0);
if(fb->getType()==FlowBlock::t_basic && ((BlockBasic*)fb)->getOpSize()==1) return 0;
}
if(clause1->getType()==FlowBlock::t_copy){
fb = ((BlockCopy*)clause1)->subBlock(0);

Nested Null Check

Deep nesting with multiple type checks and early returns creates complex control flow. Extracting helper methods for checking block properties would improve readability and maintainability.

Standards
  • Clean-Code-Functions
  • Refactoring-Extract-Method
