decomp: code review findings about reliability issues in AST and deserialization logic#154
Open
decomp: code review findings about reliability issues in AST and deserialization logic#154
Conversation
…ecks for interger overflow
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens the decompiler pipeline against malformed inputs by replacing crash-prone assertions with logged error paths, adding cycle detection during temporary resolution, strengthening JSON/type validation, and removing the dead pcode-translate tool and its tests.
Changes:
- Removed the
pcode-translatetool, its CMake wiring, and its lit tests. - Strengthened parsing/validation in
patchir-decompand Ghidra JSON deserialization to avoid undefined behavior on malformed JSON/type entries. - Improved AST construction reliability (cycle detection for temporaries, additional null/overflow checks, safer error recovery).
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/pcode-translate/main.cpp | Deleted unused translation driver binary entrypoint. |
| tools/pcode-translate/CMakeLists.txt | Removed build/install rules for the deleted tool. |
| tools/patchir-decomp/main.cpp | Added safer language-id parsing and top-level JSON object validation before deserialization. |
| tools/CMakeLists.txt | Stopped building the removed pcode-translate subdirectory. |
| test/pcode-translate/help.json | Deleted lit test for removed tool. |
| test/pcode-translate/function.json | Deleted lit test for removed tool. |
| test/lit.cfg.py | Removed %pcode-translate tool substitution. |
| test/CMakeLists.txt | Removed pcode-translation-tests suite registration. |
| lib/patchestry/Passes/Compiler.cpp | Replaced stoi with checked integer parsing for language-id bit size. |
| lib/patchestry/Ghidra/PcodeTranslation.cpp | Deleted Pcode translation registration implementation. |
| lib/patchestry/Ghidra/JsonDeserialize.cpp | Added JSON shape checks and removed an assertion by logging/recovering instead. |
| lib/patchestry/Ghidra/CMakeLists.txt | Removed PcodeTranslation.cpp from the ghidra static library. |
| lib/patchestry/Dialect/Pcode/Deserialize.cpp | Deleted Pcode JSON deserializer implementation. |
| lib/patchestry/Dialect/Pcode/CMakeLists.txt | Removed Deserialize.cpp from the Pcode dialect library build. |
| lib/patchestry/AST/OperationStmt.cpp | Replaced several asserts with logged error paths; added extra validation in a few op builders. |
| lib/patchestry/AST/OperationBuilder.cpp | Added cycle detection for forward temporary resolution and replaced an assert with logging. |
| lib/patchestry/AST/FunctionBuilder.cpp | Improved parsing robustness and protected pending_materialized during re-entrancy. |
| include/patchestry/Ghidra/PcodeTranslation.hpp | Deleted unused header for removed translation registration API. |
| include/patchestry/Dialect/Pcode/Deserialize.hpp | Deleted unused header for removed Pcode deserialization API. |
| include/patchestry/AST/OperationBuilder.hpp | Added resolving_temporaries set for cycle detection. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Comment on lines
309
to
+313
| auto result = sema().ImpCastExprToType(expr, to_type, kind); | ||
| assert(!result.isInvalid() && "Failed to make implicit cast expr"); | ||
| if (result.isInvalid()) { | ||
| LOG(ERROR) << "Failed to make implicit cast expr\n"; | ||
| return nullptr; | ||
| } |
Comment on lines
+160
to
172
| clang::Stmt *result = nullptr; | ||
| if (auto maybe_operation = operationFromKey(function, op_key)) { | ||
| auto [stmt, _] = function_builder().create_operation(ctx, *maybe_operation); | ||
| if (stmt) { | ||
| function_builder().operation_stmts.emplace(*vnode.operation, stmt); | ||
| function_builder().operation_stmts.emplace(op_key, stmt); | ||
| // Recurse: will hit Case 1 (if already local) or Case 2. | ||
| return create_temporary(ctx, function, vnode); | ||
| result = create_temporary(ctx, function, vnode); | ||
| } else { | ||
| result = stmt; | ||
| } | ||
| return stmt; | ||
| } else { | ||
| LOG(ERROR) << "Failed to get operation for key: " << op_key << "\n"; | ||
| } |
Comment on lines
+167
to
+170
| int bit_size = 0; | ||
| if (llvm::StringRef(lang_vec[2]).getAsInteger(10, bit_size)) { | ||
| LOG(ERROR) << "Invalid bit size in language id: " << lang_vec[2] << "\n"; | ||
| return ""; |
Comment on lines
+240
to
+244
| const auto *json_obj = json->getAsObject(); | ||
| if (!json_obj) { | ||
| LOG(ERROR) << "Input JSON is not an object\n"; | ||
| return EXIT_FAILURE; | ||
| } |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The review identified reliability issues, including the use of assert statements that could lead to crashes on malformed input, missing validation in JSON deserialization paths, a lack of cycle detection in temporary resolution, and weak bounds/type checking in AST construction paths. The changes replace assertions with explicit error logging and recovery, introduce cycle detection for temporaries, strengthen input validation, and improve internal state handling to prevent crashes and undefined behavior. It also removes dead code related to the pcode-translate tool.