Add: get_callers & get_callees tools #58

CX330Blake · 2025-12-24T03:26:14Z

- `get_callers`: Takes one or more function names or addresses and returns JSON `{"results":[{identifier,function,callers,caller_sites}], "errors":[...]}`, where each entry includes normalized function metadata, every caller, and HLIL/IL snippets for each call site.
- `get_callees`: Accepts the same identifier inputs and returns `{"results":[{identifier,function,callees,call_sites}], "errors":[...]}`, listing every outgoing callee plus per-site IL context, falling back to raw addresses when no symbol exists.

Copilot

Pull request overview

This PR adds two new tools for analyzing function call relationships in Binary Ninja: get_callers retrieves functions that call a given function along with call site details, and get_callees retrieves functions called by a given function. Both tools accept multiple identifiers (function names or addresses) and return JSON responses with normalized function metadata, caller/callee lists, and HLIL/IL snippets for each call site.

Key changes:

- Added identifier extraction and normalization utilities to handle comma/semicolon-separated lists across HTTP, core operations, and MCP bridge layers
- Implemented call graph traversal logic with fallback mechanisms for raw addresses when symbols are unavailable
- Extended HTTP API with `/getCallers` and `/getCallees` endpoints that accept flexible query parameter names
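
As a rough illustration of the flexible query parsing described above, the following sketch accepts both parameter names mentioned in the endpoint help text (`identifier` and `identifiers`). The function name `extract_identifiers` and its exact behavior are assumptions made for illustration; the PR's `_extract_identifiers` method is not shown here and may differ.

```python
from urllib.parse import parse_qs


def extract_identifiers(query: str) -> list[str]:
    """Hypothetical stand-in for `_extract_identifiers`; not the PR's code."""
    params = parse_qs(query)
    tokens: list[str] = []
    # Only these two parameter names are confirmed by the endpoint help text.
    for key in ("identifier", "identifiers"):
        for value in params.get(key, []):
            # Treat semicolons like commas, then drop empty tokens.
            for tok in value.replace(";", ",").split(","):
                tok = tok.strip()
                if tok:
                    tokens.append(tok)
    return tokens


print(extract_identifiers("identifiers=sub_401000,main"))
```

Collecting tokens into one flat list lets the same handler code serve single and multiple identifiers.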

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.


| File | Description |
| --- | --- |
| plugin/server/http_server.py | Added `_extract_identifiers` method for flexible query parameter parsing and two new GET endpoint handlers for `/getCallers` and `/getCallees` |
| plugin/core/binary_operations.py | Implemented core call graph analysis logic with helper methods for identifier normalization, function reference formatting, related function collection, and call site summarization |
| plugin/api/endpoints.py | Added thin proxy methods `get_callers` and `get_callees` that delegate to BinaryOperations |
| bridge/binja_mcp_bridge.py | Added `_normalize_identifier_input` helper and two new MCP tools `get_callers` and `get_callees` with comma-separated identifier support |
| README.md | Updated documentation tables to include the new MCP functions and HTTP endpoints with usage descriptions |


Copilot · 2025-12-24T03:31:35Z

bridge/binja_mcp_bridge.py

+        raw = value.replace(";", ",").split(",")
+        tokens.extend([tok.strip() for tok in raw if tok.strip()])
+    elif isinstance(value, (list, tuple, set)):
+        for item in value:
+            if item is None:
+                continue
+            tokens.extend(_normalize_identifier_input(str(item)))


The recursive call `_normalize_identifier_input(str(item))` on line 817 returns a list, which `tokens.extend` handles correctly. However, because each list item is converted to a string and re-parsed, an item that itself contains a comma or semicolon is split again: an input such as ["a,b", "c"] yields three tokens instead of two, which may be unexpected. The function should either not recurse for list items or handle them differently.

Suggested change

-        raw = value.replace(";", ",").split(",")
-        tokens.extend([tok.strip() for tok in raw if tok.strip()])
-    elif isinstance(value, (list, tuple, set)):
-        for item in value:
-            if item is None:
-                continue
-            tokens.extend(_normalize_identifier_input(str(item)))
+        # Top-level string: split on comma/semicolon and normalize whitespace.
+        raw = value.replace(";", ",").split(",")
+        tokens.extend([tok.strip() for tok in raw if tok.strip()])
+    elif isinstance(value, (list, tuple, set)):
+        # Iterable input: treat each element as an atomic identifier unless it is
+        # itself a nested iterable of identifiers.
+        for item in value:
+            if item is None:
+                continue
+            if isinstance(item, (list, tuple, set)):
+                tokens.extend(_normalize_identifier_input(item))
+            else:
+                tok = str(item).strip()
+                if tok:
+                    tokens.append(tok)
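
To make the behavioral difference concrete, here is a self-contained sketch comparing the two variants. The names `normalize_recursive` and `normalize_atomic` are hypothetical, and the leading `if isinstance(value, str):` branch is an assumption, since the hunk above only shows the lines that follow it.

```python
from typing import Any


def normalize_recursive(value: Any) -> list[str]:
    """Behavior as in the PR hunk: list items are re-parsed as strings."""
    tokens: list[str] = []
    if isinstance(value, str):  # assumed branch, not shown in the hunk
        raw = value.replace(";", ",").split(",")
        tokens.extend(tok.strip() for tok in raw if tok.strip())
    elif isinstance(value, (list, tuple, set)):
        for item in value:
            if item is None:
                continue
            tokens.extend(normalize_recursive(str(item)))
    return tokens


def normalize_atomic(value: Any) -> list[str]:
    """Behavior of the suggested change: list items are kept whole."""
    tokens: list[str] = []
    if isinstance(value, str):  # assumed branch, not shown in the hunk
        raw = value.replace(";", ",").split(",")
        tokens.extend(tok.strip() for tok in raw if tok.strip())
    elif isinstance(value, (list, tuple, set)):
        for item in value:
            if item is None:
                continue
            if isinstance(item, (list, tuple, set)):
                tokens.extend(normalize_atomic(item))
            else:
                tok = str(item).strip()
                if tok:
                    tokens.append(tok)
    return tokens


print(normalize_recursive(["a,b", "c"]))  # list items are split on the comma
print(normalize_atomic(["a,b", "c"]))     # list items stay intact
```

Which behavior is preferable depends on whether callers ever pass list elements that legitimately contain commas; for top-level strings the two variants agree.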

Copilot · 2025-12-24T03:31:35Z

plugin/core/binary_operations.py

+                func = None
+                errors.append(f"{ident}: {exc}")
+            if not func:
+                errors.append(f"Function not found: {ident}")
+                continue


When an exception is caught during function lookup, the error is added to the errors list, but then the code continues to check if func is None and adds another error "Function not found: {ident}". This results in duplicate error messages being added for the same identifier when an exception occurs.
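
One possible fix is to `continue` straight from the `except` branch so that each identifier contributes at most one error. The sketch below is a minimal illustration only: `lookup` and `resolve_all` are hypothetical stand-ins, with a plain dict replacing the Binary Ninja view.

```python
def lookup(ident: str, table: dict[str, str]):
    """Hypothetical lookup: raises on malformed hex addresses, returns None if absent."""
    if ident.startswith("0x"):
        int(ident, 16)  # raises ValueError for malformed addresses
    return table.get(ident)


def resolve_all(identifiers: list[str], table: dict[str, str]):
    results: list[str] = []
    errors: list[str] = []
    for ident in identifiers:
        try:
            func = lookup(ident, table)
        except Exception as exc:
            errors.append(f"{ident}: {exc}")
            continue  # skip the "not found" check, so only one error is recorded
        if not func:
            errors.append(f"Function not found: {ident}")
            continue
        results.append(func)
    return results, errors


results, errors = resolve_all(["main", "missing", "0xZZ"], {"main": "main@0x401000"})
print(results)
print(errors)
```

With this structure, a lookup that raises reports the exception text, and a lookup that returns nothing reports "Function not found", but never both.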

Copilot · 2025-12-24T03:31:36Z

plugin/core/binary_operations.py

+                func = None
+                errors.append(f"{ident}: {exc}")
+            if not func:
+                errors.append(f"Function not found: {ident}")
+                continue


When an exception is caught during function lookup, the error is added to the errors list, but then the code continues to check if func is None and adds another error "Function not found: {ident}". This results in duplicate error messages being added for the same identifier when an exception occurs.

Copilot · 2025-12-24T03:31:36Z

plugin/core/binary_operations.py

+    def get_callers(self, identifiers: Any) -> dict[str, Any]:
+        """Collect caller information for the given function identifiers."""
+        if not self._current_view:
+            raise RuntimeError("No binary loaded")
+
+        items = self._normalize_identifier_list(identifiers)
+        if not items:
+            raise ValueError("No function identifiers provided")
+
+        results: list[dict[str, Any]] = []
+        errors: list[str] = []
+        for ident in items:
+            try:
+                func = self.get_function_by_name_or_address(ident)
+            except Exception as exc:
+                func = None
+                errors.append(f"{ident}: {exc}")
+            if not func:
+                errors.append(f"Function not found: {ident}")
+                continue
+            entry = {
+                "identifier": str(ident),
+                "function": self._format_function_reference(func),
+                "callers": self._collect_related_functions(func, "callers"),
+                "caller_sites": self._summarize_call_sites(func, "callers"),
+            }
+            results.append(entry)
+
+        return {"results": results, "errors": errors}
+
+    def get_callees(self, identifiers: Any) -> dict[str, Any]:
+        """Collect callee information for the given function identifiers."""
+        if not self._current_view:
+            raise RuntimeError("No binary loaded")
+
+        items = self._normalize_identifier_list(identifiers)
+        if not items:
+            raise ValueError("No function identifiers provided")
+
+        results: list[dict[str, Any]] = []
+        errors: list[str] = []
+        for ident in items:
+            try:
+                func = self.get_function_by_name_or_address(ident)
+            except Exception as exc:
+                func = None
+                errors.append(f"{ident}: {exc}")
+            if not func:
+                errors.append(f"Function not found: {ident}")
+                continue
+            entry = {
+                "identifier": str(ident),
+                "function": self._format_function_reference(func),
+                "callees": self._collect_related_functions(func, "callees"),
+                "call_sites": self._summarize_call_sites(func, "callees"),
+            }
+            results.append(entry)
+
+        return {"results": results, "errors": errors}


The logic in get_callers and get_callees is nearly identical with only minor differences in the relation attribute names ("callers" vs "callees", "caller_sites" vs "call_sites"). This duplication makes maintenance harder and increases the risk of inconsistent behavior. Consider extracting this into a common helper method that takes the relation type as a parameter.
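
A shared helper along these lines could look roughly as follows. This is a sketch under stated assumptions: `CallGraphOps` and `_collect_call_relations` are hypothetical names, a plain dict stands in for the Binary Ninja view, and call-site summaries are stubbed as empty lists.

```python
from typing import Any


class CallGraphOps:
    """Hypothetical stand-in for BinaryOperations; a dict replaces Binary Ninja."""

    def __init__(self, graph: dict[str, dict[str, list[str]]]) -> None:
        # Maps a function name to {"callers": [...], "callees": [...]}.
        self._graph = graph

    def _collect_call_relations(
        self, identifiers: list[str], relation: str, sites_key: str
    ) -> dict[str, Any]:
        """Shared loop; only the relation name and the sites key vary."""
        results: list[dict[str, Any]] = []
        errors: list[str] = []
        for ident in identifiers:
            node = self._graph.get(ident)
            if node is None:
                errors.append(f"Function not found: {ident}")
                continue
            results.append(
                {
                    "identifier": str(ident),
                    "function": ident,
                    relation: list(node[relation]),
                    sites_key: [],  # call-site summaries are omitted in this sketch
                }
            )
        return {"results": results, "errors": errors}

    def get_callers(self, identifiers: list[str]) -> dict[str, Any]:
        return self._collect_call_relations(identifiers, "callers", "caller_sites")

    def get_callees(self, identifiers: list[str]) -> dict[str, Any]:
        return self._collect_call_relations(identifiers, "callees", "call_sites")


ops = CallGraphOps(
    {
        "main": {"callers": [], "callees": ["helper"]},
        "helper": {"callers": ["main"], "callees": []},
    }
)
print(ops.get_callers(["helper", "nope"]))
```

Keeping the two public methods as thin wrappers preserves the existing API while the error handling lives in one place.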

Copilot · 2025-12-24T03:31:36Z

plugin/server/http_server.py

+            elif path == "/getCallers":
+                identifiers = self._extract_identifiers(params)
+                if not identifiers:
+                    self._send_json_response(
+                        {
+                            "error": "Missing identifier parameter",
+                            "help": "Provide ?identifier=<name|address> or comma-separated ?identifiers=a,b",
+                        },
+                        400,
+                    )
+                    return
+                try:
+                    payload = self.binary_ops.get_callers(identifiers)
+                except Exception as e:
+                    bn.log_error(f"Error handling getCallers: {e}")
+                    self._send_json_response({"error": str(e)}, 500)
+                else:
+                    self._send_json_response(payload)
+
+            elif path == "/getCallees":
+                identifiers = self._extract_identifiers(params)
+                if not identifiers:
+                    self._send_json_response(
+                        {
+                            "error": "Missing identifier parameter",
+                            "help": "Provide ?identifier=<name|address> or comma-separated ?identifiers=a,b",
+                        },
+                        400,
+                    )
+                    return
+                try:
+                    payload = self.binary_ops.get_callees(identifiers)
+                except Exception as e:
+                    bn.log_error(f"Error handling getCallees: {e}")
+                    self._send_json_response({"error": str(e)}, 500)
+                else:
+                    self._send_json_response(payload)


The /getCallers and /getCallees endpoint handlers are nearly identical except for the method name they call. This creates maintenance overhead and increases the risk of inconsistent error handling or updates. Consider extracting the common logic into a helper method that takes the endpoint method name as a parameter.
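
The shared handler logic might be factored out roughly like this. The function `handle_relation_request` and its injected `send_json` and `log_error` callables are assumptions made so the sketch runs without Binary Ninja or the real HTTP handler; passing the status 200 explicitly is also an assumption, since the original relies on the default of `_send_json_response`.

```python
from typing import Any, Callable

MISSING_IDENTIFIER = {
    "error": "Missing identifier parameter",
    "help": "Provide ?identifier=<name|address> or comma-separated ?identifiers=a,b",
}


def handle_relation_request(
    name: str,
    identifiers: list[str],
    operation: Callable[[list[str]], dict[str, Any]],
    send_json: Callable[[dict[str, Any], int], None],
    log_error: Callable[[str], None] = print,
) -> None:
    """Validate input, run the operation, and send exactly one JSON response."""
    if not identifiers:
        send_json(MISSING_IDENTIFIER, 400)
        return
    try:
        payload = operation(identifiers)
    except Exception as e:
        log_error(f"Error handling {name}: {e}")
        send_json({"error": str(e)}, 500)
    else:
        send_json(payload, 200)


sent: list[tuple[int, dict[str, Any]]] = []
handle_relation_request(
    "getCallers",
    ["main"],
    lambda ids: {"results": ids, "errors": []},
    lambda body, status: sent.append((status, body)),
)
print(sent)
```

Each endpoint branch would then reduce to a single call that passes its own name and bound method.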

Copilot · 2025-12-24T03:31:36Z

bridge/binja_mcp_bridge.py

+@mcp.tool()
+def get_callers(identifiers: str) -> str:
+    """
+    List callers and caller sites for one or more function identifiers (name or address).
+    Provide comma-separated identifiers like "sub_401000,main".
+    """
+    items = _normalize_identifier_input(identifiers)
+    if not items:
+        return "Error: provide at least one identifier"
+    data = get_json("getCallers", {"identifiers": ",".join(items)}, timeout=None)
+    if not data:
+        return "Error: no response"
+    import json as _json
+
+    return _json.dumps(data, indent=2, ensure_ascii=False)
+
+
+@mcp.tool()
+def get_callees(identifiers: str) -> str:
+    """
+    List callees and call sites for one or more function identifiers (name or address).
+    Provide comma-separated identifiers like "sub_401000,main".
+    """
+    items = _normalize_identifier_input(identifiers)
+    if not items:
+        return "Error: provide at least one identifier"
+    data = get_json("getCallees", {"identifiers": ",".join(items)}, timeout=None)
+    if not data:
+        return "Error: no response"
+    import json as _json
+
+    return _json.dumps(data, indent=2, ensure_ascii=False)


The get_callers and get_callees functions are nearly identical, differing only in the endpoint name and docstring. This duplication increases maintenance burden. Consider extracting a common helper function that accepts the endpoint name as a parameter.
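
A common helper for the two MCP tools could be sketched as below. The name `fetch_relation` is hypothetical, the `get_json` callable is injected so the example runs standalone, and the inline splitting stands in for `_normalize_identifier_input`; the real bridge functions may differ.

```python
import json
from typing import Any, Callable, Optional


def fetch_relation(
    endpoint: str,
    identifiers: str,
    get_json: Callable[..., Optional[dict[str, Any]]],
) -> str:
    """Shared body for both tools; only the endpoint name varies."""
    # Inline stand-in for _normalize_identifier_input on a plain string.
    items = [t.strip() for t in identifiers.replace(";", ",").split(",") if t.strip()]
    if not items:
        return "Error: provide at least one identifier"
    data = get_json(endpoint, {"identifiers": ",".join(items)}, timeout=None)
    if not data:
        return "Error: no response"
    return json.dumps(data, indent=2, ensure_ascii=False)


def fake_get_json(endpoint: str, params: dict[str, str], timeout=None):
    """Test double that echoes the request instead of calling the HTTP server."""
    return {"endpoint": endpoint, "identifiers": params["identifiers"]}


print(fetch_relation("getCallers", "sub_401000, main", fake_get_json))
```

Importing `json` once at module level also removes the repeated function-local `import json as _json`.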

Copilot · 2025-12-24T03:31:36Z

plugin/core/binary_operations.py

+        except Exception:
+            pass


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change

-        except Exception:
-            pass
+        except Exception as exc:
+            # Swallow errors when iterating related functions, but log for diagnostics.
+            bn.log_debug(
+                f"BinaryOperations._collect_related_functions: failed while processing "
+                f"{relation_attr} for function {getattr(func, 'name', '<unknown>')}: {exc}"
+            )

Copilot AI review requested due to automatic review settings December 24, 2025 03:26

Copilot started reviewing on behalf of CX330Blake December 24, 2025 03:26
