Skip to content

decomp: code review findings about reliability issues in AST and deserialization logic#154

Open
kumarak wants to merge 2 commits intomainfrom
fix_review
Open

decomp: code review findings about reliability issues in AST and deserialization logic#154
kumarak wants to merge 2 commits intomainfrom
fix_review

Conversation

@kumarak
Copy link
Collaborator

@kumarak kumarak commented Mar 13, 2026

The review identified reliability issues, including the use of assert statements that could lead to crashes on malformed input, missing validation in JSON deserialization paths, a lack of cycle detection in temporary resolution, and weak bounds/type checking in AST construction paths. The changes replace assertions with explicit error logging and recovery, introduce cycle detection for temporaries, strengthen input validation, and improve internal state handling to prevent crashes and undefined behavior. It also removes dead code related to the pcode-translate tool.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the decompiler pipeline against malformed inputs by replacing crash-prone assertions with logged error paths, adding cycle detection during temporary resolution, strengthening JSON/type validation, and removing the dead pcode-translate tool and its tests.

Changes:

  • Removed the pcode-translate tool, its CMake wiring, and its lit tests.
  • Strengthened parsing/validation in patchir-decomp and Ghidra JSON deserialization to avoid undefined behavior on malformed JSON/type entries.
  • Improved AST construction reliability (cycle detection for temporaries, additional null/overflow checks, safer error recovery).

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tools/pcode-translate/main.cpp Deleted unused translation driver binary entrypoint.
tools/pcode-translate/CMakeLists.txt Removed build/install rules for the deleted tool.
tools/patchir-decomp/main.cpp Added safer language-id parsing and top-level JSON object validation before deserialization.
tools/CMakeLists.txt Stopped building the removed pcode-translate subdirectory.
test/pcode-translate/help.json Deleted lit test for removed tool.
test/pcode-translate/function.json Deleted lit test for removed tool.
test/lit.cfg.py Removed %pcode-translate tool substitution.
test/CMakeLists.txt Removed pcode-translation-tests suite registration.
lib/patchestry/Passes/Compiler.cpp Replaced stoi with checked integer parsing for language-id bit size.
lib/patchestry/Ghidra/PcodeTranslation.cpp Deleted Pcode translation registration implementation.
lib/patchestry/Ghidra/JsonDeserialize.cpp Added JSON shape checks and removed an assertion by logging/recovering instead.
lib/patchestry/Ghidra/CMakeLists.txt Removed PcodeTranslation.cpp from the ghidra static library.
lib/patchestry/Dialect/Pcode/Deserialize.cpp Deleted Pcode JSON deserializer implementation.
lib/patchestry/Dialect/Pcode/CMakeLists.txt Removed Deserialize.cpp from the Pcode dialect library build.
lib/patchestry/AST/OperationStmt.cpp Replaced several asserts with logged error paths; added extra validation in a few op builders.
lib/patchestry/AST/OperationBuilder.cpp Added cycle detection for forward temporary resolution and replaced an assert with logging.
lib/patchestry/AST/FunctionBuilder.cpp Improved parsing robustness and protected pending_materialized during re-entrancy.
include/patchestry/Ghidra/PcodeTranslation.hpp Deleted unused header for removed translation registration API.
include/patchestry/Dialect/Pcode/Deserialize.hpp Deleted unused header for removed Pcode deserialization API.
include/patchestry/AST/OperationBuilder.hpp Added resolving_temporaries set for cycle detection.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines 309 to +313
auto result = sema().ImpCastExprToType(expr, to_type, kind);
assert(!result.isInvalid() && "Failed to make implicit cast expr");
if (result.isInvalid()) {
LOG(ERROR) << "Failed to make implicit cast expr\n";
return nullptr;
}
Comment on lines +160 to 172
clang::Stmt *result = nullptr;
if (auto maybe_operation = operationFromKey(function, op_key)) {
auto [stmt, _] = function_builder().create_operation(ctx, *maybe_operation);
if (stmt) {
function_builder().operation_stmts.emplace(*vnode.operation, stmt);
function_builder().operation_stmts.emplace(op_key, stmt);
// Recurse: will hit Case 1 (if already local) or Case 2.
return create_temporary(ctx, function, vnode);
result = create_temporary(ctx, function, vnode);
} else {
result = stmt;
}
return stmt;
} else {
LOG(ERROR) << "Failed to get operation for key: " << op_key << "\n";
}
Comment on lines +167 to +170
int bit_size = 0;
if (llvm::StringRef(lang_vec[2]).getAsInteger(10, bit_size)) {
LOG(ERROR) << "Invalid bit size in language id: " << lang_vec[2] << "\n";
return "";
Comment on lines +240 to +244
const auto *json_obj = json->getAsObject();
if (!json_obj) {
LOG(ERROR) << "Input JSON is not an object\n";
return EXIT_FAILURE;
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants