@CX330Blake
Collaborator

  • get_callers: Takes one or more function names/addresses, returns JSON {"results":[{identifier,function,callers,caller_sites}], "errors":[...]} where each entry includes normalized function metadata, every caller, and HLIL/IL snippets for each call site.
  • get_callees: Accepts the same identifier inputs, returns {"results":[{identifier,function,callees,call_sites}], "errors":[...]} listing every outgoing callee plus per-site IL context, falling back to raw addresses when no symbol exists.

Copilot AI review requested due to automatic review settings December 24, 2025 03:26
Contributor

Copilot AI left a comment

Pull request overview

This PR adds two new tools for analyzing function call relationships in Binary Ninja: get_callers retrieves functions that call a given function along with call site details, and get_callees retrieves functions called by a given function. Both tools accept multiple identifiers (function names or addresses) and return JSON responses with normalized function metadata, caller/callee lists, and HLIL/IL snippets for each call site.

Key changes:

  • Added identifier extraction and normalization utilities to handle comma/semicolon-separated lists across HTTP, core operations, and MCP bridge layers
  • Implemented call graph traversal logic with fallback mechanisms for raw addresses when symbols are unavailable
  • Extended HTTP API with /getCallers and /getCallees endpoints that accept flexible query parameter names
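For illustration, a request URL for these endpoints could be built as in the sketch below. The `identifiers` parameter name comes from the handler's own help text in this PR; the base URL and the helper name `build_call_graph_url` are hypothetical, and the sketch assumes the server URL-decodes query values before splitting them on commas.

```python
from urllib.parse import urlencode


def build_call_graph_url(base_url: str, endpoint: str, identifiers: list[str]) -> str:
    """Build a GET URL that passes identifiers as one comma-separated parameter."""
    # urlencode percent-encodes the comma separator as %2C.
    query = urlencode({"identifiers": ",".join(identifiers)})
    return f"{base_url.rstrip('/')}/{endpoint}?{query}"


# The host and port are placeholders, not values taken from this PR.
print(build_call_graph_url("http://localhost:9009", "getCallers", ["sub_401000", "main"]))
# prints: http://localhost:9009/getCallers?identifiers=sub_401000%2Cmain
```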

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| plugin/server/http_server.py | Added _extract_identifiers method for flexible query parameter parsing and two new GET endpoint handlers for /getCallers and /getCallees |
| plugin/core/binary_operations.py | Implemented core call graph analysis logic with helper methods for identifier normalization, function reference formatting, related function collection, and call site summarization |
| plugin/api/endpoints.py | Added thin proxy methods get_callers and get_callees that delegate to BinaryOperations |
| bridge/binja_mcp_bridge.py | Added _normalize_identifier_input helper and two new MCP tools get_callers and get_callees with comma-separated identifier support |
| README.md | Updated documentation tables to include the new MCP functions and HTTP endpoints with usage descriptions |

Comment on lines +811 to +817
        raw = value.replace(";", ",").split(",")
        tokens.extend([tok.strip() for tok in raw if tok.strip()])
    elif isinstance(value, (list, tuple, set)):
        for item in value:
            if item is None:
                continue
            tokens.extend(_normalize_identifier_input(str(item)))
Copilot AI Dec 24, 2025

The recursive call _normalize_identifier_input(str(item)) on line 817 re-parses every list element as a comma/semicolon-separated string. Splitting is correct for a top-level string, but for list input such as ["a,b", "c"] it splits the element "a,b" into two tokens instead of treating it as one identifier, and a nested iterable is stringified (for example "['a', 'b']") before being split. The function should treat list elements as atomic identifiers and recurse only into nested iterables.

Suggested change
-        raw = value.replace(";", ",").split(",")
-        tokens.extend([tok.strip() for tok in raw if tok.strip()])
-    elif isinstance(value, (list, tuple, set)):
-        for item in value:
-            if item is None:
-                continue
-            tokens.extend(_normalize_identifier_input(str(item)))
+        # Top-level string: split on comma/semicolon and normalize whitespace.
+        raw = value.replace(";", ",").split(",")
+        tokens.extend([tok.strip() for tok in raw if tok.strip()])
+    elif isinstance(value, (list, tuple, set)):
+        # Iterable input: treat each element as an atomic identifier unless it is
+        # itself a nested iterable of identifiers.
+        for item in value:
+            if item is None:
+                continue
+            if isinstance(item, (list, tuple, set)):
+                tokens.extend(_normalize_identifier_input(item))
+            else:
+                tok = str(item).strip()
+                if tok:
+                    tokens.append(tok)
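
To make the intended behavior concrete, here is a minimal standalone sketch of the normalization with the suggested change applied. Only the string and list branches appear in this review; the function name without the leading underscore, the `None` guard, and the fallback branch for other types (such as an integer address) are assumptions added so the example runs on its own.

```python
from typing import Any


def normalize_identifier_input(value: Any) -> list[str]:
    """Flatten identifier input into a list of trimmed, non-empty tokens."""
    tokens: list[str] = []
    if value is None:
        return tokens
    if isinstance(value, str):
        # Top-level string: split on commas and semicolons.
        raw = value.replace(";", ",").split(",")
        tokens.extend(tok.strip() for tok in raw if tok.strip())
    elif isinstance(value, (list, tuple, set)):
        for item in value:
            if item is None:
                continue
            if isinstance(item, (list, tuple, set)):
                # Nested iterable: recurse on the iterable itself, not its string form.
                tokens.extend(normalize_identifier_input(item))
            else:
                # Atomic element: kept whole, even if it contains a comma.
                tok = str(item).strip()
                if tok:
                    tokens.append(tok)
    else:
        # Assumed fallback for scalars such as integer addresses.
        tok = str(value).strip()
        if tok:
            tokens.append(tok)
    return tokens


print(normalize_identifier_input("sub_401000; main"))  # ['sub_401000', 'main']
print(normalize_identifier_input(["a,b", "c"]))        # ['a,b', 'c']
```

With this shape, only a top-level string is ever split, so callers that already pass a list keep full control over token boundaries.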

Copilot uses AI. Check for mistakes.
Comment on lines +506 to +510
        func = None
        errors.append(f"{ident}: {exc}")
    if not func:
        errors.append(f"Function not found: {ident}")
        continue
Copilot AI Dec 24, 2025

When an exception is caught during function lookup, the error is added to the errors list, but then the code continues to check if func is None and adds another error "Function not found: {ident}". This results in duplicate error messages being added for the same identifier when an exception occurs.

Comment on lines +536 to +540
        func = None
        errors.append(f"{ident}: {exc}")
    if not func:
        errors.append(f"Function not found: {ident}")
        continue
Copilot AI Dec 24, 2025

When an exception is caught during function lookup, the error is added to the errors list, but then the code continues to check if func is None and adds another error "Function not found: {ident}". This results in duplicate error messages being added for the same identifier when an exception occurs.

Comment on lines +491 to +549
    def get_callers(self, identifiers: Any) -> dict[str, Any]:
        """Collect caller information for the given function identifiers."""
        if not self._current_view:
            raise RuntimeError("No binary loaded")

        items = self._normalize_identifier_list(identifiers)
        if not items:
            raise ValueError("No function identifiers provided")

        results: list[dict[str, Any]] = []
        errors: list[str] = []
        for ident in items:
            try:
                func = self.get_function_by_name_or_address(ident)
            except Exception as exc:
                func = None
                errors.append(f"{ident}: {exc}")
            if not func:
                errors.append(f"Function not found: {ident}")
                continue
            entry = {
                "identifier": str(ident),
                "function": self._format_function_reference(func),
                "callers": self._collect_related_functions(func, "callers"),
                "caller_sites": self._summarize_call_sites(func, "callers"),
            }
            results.append(entry)

        return {"results": results, "errors": errors}

    def get_callees(self, identifiers: Any) -> dict[str, Any]:
        """Collect callee information for the given function identifiers."""
        if not self._current_view:
            raise RuntimeError("No binary loaded")

        items = self._normalize_identifier_list(identifiers)
        if not items:
            raise ValueError("No function identifiers provided")

        results: list[dict[str, Any]] = []
        errors: list[str] = []
        for ident in items:
            try:
                func = self.get_function_by_name_or_address(ident)
            except Exception as exc:
                func = None
                errors.append(f"{ident}: {exc}")
            if not func:
                errors.append(f"Function not found: {ident}")
                continue
            entry = {
                "identifier": str(ident),
                "function": self._format_function_reference(func),
                "callees": self._collect_related_functions(func, "callees"),
                "call_sites": self._summarize_call_sites(func, "callees"),
            }
            results.append(entry)

        return {"results": results, "errors": errors}
Copilot AI Dec 24, 2025

The logic in get_callers and get_callees is nearly identical with only minor differences in the relation attribute names ("callers" vs "callees", "caller_sites" vs "call_sites"). This duplication makes maintenance harder and increases the risk of inconsistent behavior. Consider extracting this into a common helper method that takes the relation type as a parameter.

Comment on lines +770 to +806
        elif path == "/getCallers":
            identifiers = self._extract_identifiers(params)
            if not identifiers:
                self._send_json_response(
                    {
                        "error": "Missing identifier parameter",
                        "help": "Provide ?identifier=<name|address> or comma-separated ?identifiers=a,b",
                    },
                    400,
                )
                return
            try:
                payload = self.binary_ops.get_callers(identifiers)
            except Exception as e:
                bn.log_error(f"Error handling getCallers: {e}")
                self._send_json_response({"error": str(e)}, 500)
            else:
                self._send_json_response(payload)

        elif path == "/getCallees":
            identifiers = self._extract_identifiers(params)
            if not identifiers:
                self._send_json_response(
                    {
                        "error": "Missing identifier parameter",
                        "help": "Provide ?identifier=<name|address> or comma-separated ?identifiers=a,b",
                    },
                    400,
                )
                return
            try:
                payload = self.binary_ops.get_callees(identifiers)
            except Exception as e:
                bn.log_error(f"Error handling getCallees: {e}")
                self._send_json_response({"error": str(e)}, 500)
            else:
                self._send_json_response(payload)
Copilot AI Dec 24, 2025

The /getCallers and /getCallees endpoint handlers are nearly identical except for the method name they call. This creates maintenance overhead and increases the risk of inconsistent error handling or updates. Consider extracting the common logic into a helper method that takes the endpoint method name as a parameter.
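
As a rough sketch of that extraction, the shared logic can be reduced to a function that returns a (body, status) pair, leaving the handler to send it. The function name, the injected `log_error` callable, and the explicit 200 status are assumptions; the original code relies on the default status of `_send_json_response`, whose value is not shown in this diff.

```python
from typing import Any, Callable


def handle_identifier_endpoint(
    identifiers: list[str],
    operation: Callable[[list[str]], dict[str, Any]],
    label: str,
    log_error: Callable[[str], None] = print,
) -> tuple[dict[str, Any], int]:
    """Run a call-graph operation and map the outcome to (JSON body, HTTP status)."""
    if not identifiers:
        return (
            {
                "error": "Missing identifier parameter",
                "help": "Provide ?identifier=<name|address> or comma-separated ?identifiers=a,b",
            },
            400,
        )
    try:
        payload = operation(identifiers)
    except Exception as e:
        log_error(f"Error handling {label}: {e}")
        return {"error": str(e)}, 500
    # Assumes 200 is the default status used by the real response helper.
    return payload, 200


body, status = handle_identifier_endpoint([], lambda ids: {"results": []}, "getCallers")
print(status)  # 400
```

Inside the request handler, each branch would then shrink to a single call such as `self._send_json_response(*handle_identifier_endpoint(identifiers, self.binary_ops.get_callers, "getCallers", bn.log_error))`.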

Comment on lines +821 to +852
@mcp.tool()
def get_callers(identifiers: str) -> str:
    """
    List callers and caller sites for one or more function identifiers (name or address).
    Provide comma-separated identifiers like "sub_401000,main".
    """
    items = _normalize_identifier_input(identifiers)
    if not items:
        return "Error: provide at least one identifier"
    data = get_json("getCallers", {"identifiers": ",".join(items)}, timeout=None)
    if not data:
        return "Error: no response"
    import json as _json

    return _json.dumps(data, indent=2, ensure_ascii=False)


@mcp.tool()
def get_callees(identifiers: str) -> str:
    """
    List callees and call sites for one or more function identifiers (name or address).
    Provide comma-separated identifiers like "sub_401000,main".
    """
    items = _normalize_identifier_input(identifiers)
    if not items:
        return "Error: provide at least one identifier"
    data = get_json("getCallees", {"identifiers": ",".join(items)}, timeout=None)
    if not data:
        return "Error: no response"
    import json as _json

    return _json.dumps(data, indent=2, ensure_ascii=False)
Copilot AI Dec 24, 2025

The get_callers and get_callees functions are nearly identical, differing only in the endpoint name and docstring. This duplication increases maintenance burden. Consider extracting a common helper function that accepts the endpoint name as a parameter.

Comment on lines +414 to +415
except Exception:
    pass
Copilot AI Dec 24, 2025

The 'except' clause does nothing but pass, and there is no comment explaining why the exception is ignored.

Suggested change
-except Exception:
-    pass
+except Exception as exc:
+    # Swallow errors when iterating related functions, but log for diagnostics.
+    bn.log_debug(
+        f"BinaryOperations._collect_related_functions: failed while processing "
+        f"{relation_attr} for function {getattr(func, 'name', '<unknown>')}: {exc}"
+    )
