Highlight error message at the end of log by jordalgo · Pull Request #71 · libbpf/bpfvv

jordalgo · 2025-08-30T01:08:45Z

theihor · 2025-09-02T19:13:26Z

@jordalgo Thanks for working on this.

I don't like that we are updating a ParsedLine type depending on the line position, doesn't look reliable.

I am thinking a better approach is to directly pattern-match known messages, similarly to how we determine helper function calls, with bpf-helpers.json. But that requires some C parsing.

I made a script (with LLM help) parsing current verifier.c with tree-sitter and printing an argument of verbose() call.

#!/usr/bin/env python3
"""
Python script to find all verbose() function calls in a C file using tree-sitter.
"""

import sys
import os
try:
    import tree_sitter_c
    from tree_sitter import Language, Parser
except ImportError:
    print("Error: tree-sitter library not installed. Install with: pip install tree-sitter tree-sitter-c")
    sys.exit(1)

def setup_c_parser():
    """Set up tree-sitter parser for C language."""
    try:
        # Get the C language
        c_language = tree_sitter_c.language()

        # Create a Language object from the PyCapsule
        lang_obj = Language(c_language)

        # Create parser
        parser = Parser()
        parser.language = lang_obj
        return parser
    except Exception as e:
        print(f"Error setting up C parser: {e}")
        return None

def find_verbose_calls(node, source_lines, file_path=""):
    """
    Recursively traverse the AST to find all verbose() function calls.

    Args:
        node: Current AST node
        source_lines: List of source code lines for context
        file_path: Path to the source file for error reporting

    Returns:
        List of dictionaries containing call information
    """
    verbose_calls = []

    def traverse(node):
        # Check if this is a function call
        if node.type == 'call_expression':
            # Get the function being called
            function_node = node.child_by_field_name('function')
            if function_node and function_node.type == 'identifier':
                function_name = function_node.text.decode('utf-8')
                if function_name == 'verbose':
                    arg_list = node.child_by_field_name('arguments')
                    literal = arg_list.children[3]
                    print(literal.text.decode('utf-8'))

                    # Found a verbose() call
                    start_point = node.start_point
                    end_point = node.end_point

                    # Get the full call text
                    call_text = node.text.decode('utf-8')

                    # Store the call information
                    verbose_calls.append({
                        'line': start_point.row + 1,  # 1-based line numbering
                        'column': start_point.column + 1,  # 1-based column numbering
                        'text': call_text,
                        'end_line': end_point.row + 1,
                        'end_column': end_point.column + 1
                    })

        # Recursively traverse child nodes
        for child in node.children:
            traverse(child)

    traverse(node)
    return verbose_calls

def main():
    """Main function to parse file and find verbose() calls."""
    file_path = '/home/isolodrai/kernels/bpf-next/kernel/bpf/verifier.c'

    if len(sys.argv) > 1:
        file_path = sys.argv[1]

    # Check if file exists
    if not os.path.exists(file_path):
        print(f"Error: File '{file_path}' not found.")
        return 1

    # Set up parser
    parser = setup_c_parser()
    if parser is None:
        return 1

    try:
        # Read the file
        with open(file_path, 'r', encoding='utf-8') as f:
            source_code = f.read()

        # Parse the file
        print(f"Parsing {file_path}...")
        tree = parser.parse(bytes(source_code, 'utf-8'))

        # Find verbose() calls
        source_lines = source_code.split('\n')
        verbose_calls = find_verbose_calls(tree.root_node, source_lines, file_path)

        # Print results to stdout
        print(f"Found {len(verbose_calls)} verbose() calls:\n")

        return 0

    except FileNotFoundError:
        print(f"Error: Could not read file '{file_path}'")
        return 1
    except Exception as e:
        print(f"Error processing file: {e}")
        return 1

It produced this list of strings: https://gist.github.com/theihor/b96de6993990ea70bf5863b4b3b3c015

It needs to be cleaned up a bit, but then we can use it to determine a verifier error message at any position and highlight it. And then the head/tail logic with respect to instruction wouldn't be necessary.

We could polish the script to produce a json, and put it in the repo. What do you think?

jordalgo · 2025-09-02T19:46:28Z

@theihor Have we ever seen cases where the error message doesn't directly follow the last instruction in the verifier error log? If we haven't, then building this list of possible strings feels a little over-engineered and also possibly less reliable as the verifier can change and then our list will be out of date. We can also just land this dumb solution first and see if we encounter logs where it breaks.

theihor · 2025-09-02T20:15:15Z

Have we ever seen cases where the error message doesn't directly follow the last instruction in the verifier error log?

I haven't, and usually the error message is printed right before return -EACCES; or something like that.

The issue is that bpfvv can receive any fragment of a log, or multiple logs (say from veristat). Of course we can't handle all possible cases of incomplete or modified input, but since the list of the verifier errors is known (from the kernel source code), why not use it to highlight important messages.

If we haven't, then building this list of possible strings feels a little over-engineered and also possibly less reliable as the verifier can change and then our list will be out of date. We can also just land this dumb solution first and see if we encounter logs where it breaks.

We have the same problem with the helpers list, and more general problem of parsing the log from different kernel versions, so storing a list of known error messages doesn't introduce "verifier might change" problem or make it worse. Also this kind of list will be necessary anyways for "common errors" hints based on the error, that we were thinking to implement.

I understand this is a bit more work, so maybe let's park it for later, I'll open an issue.

As for now, if we are going head/tail route, I would put the logic of highlighting in the UI, and wouldn't touch the analyzer or parser. In my mind, parser determines line kind by it's content, and analyzer builds up the array of states. Patching a type of a ParsedLine after the fact based on position doesn't look right to me (although I admit it's similar to global func message handling).

How about this: remember the index of first and last ParsedLineType.INSTRUCTION in the VerifierLogState and then look at the indices when rendering log line for highlighting. No need to introduce a new message type. Thoughts?

jordalgo · 2025-09-02T20:45:21Z

so storing a list of known error messages doesn't introduce "verifier might change" problem or make it worse

Ah ok. If that's the case, then what you suggested sounds fine. Truth be told, I'd rather our identification of error messages was part of the parser/analyzer (and ParsedLine type) instead of in the UI. But I'm fine to do what you suggested as a stop-gap while we work on this script integration.

We could polish the script to produce a json, and put it in the repo. What do you think?

I think the script should be written in javascript/node. Even though python is pretty ubiquitous these days, I think it's helpful if the repo can stick to one language if possible. I'm sure we can just ask the LLM to re-write it in Node 😄

jordalgo · 2025-09-02T21:03:41Z

@theihor PR updated with your suggestion.

theihor · 2025-09-02T21:18:51Z

src/analyzer.ts


+  for (let i = lines.length - 1; i >= 0; --i) {
+    if (lines[i].type == ParsedLineType.INSTRUCTION) {
+      lastInsIdx = lines[i].idx;


We could track this in the previous loop:

if (parsedLine === ParsedLineType.INSTRUCTION) lastInsIdx = idx;

But given we are starting from the end it's a nit.

The current way is probably more performant if you have a ton of log lines (no check and mutation) but it's less code - so....idk I'll probably just leave it considering it's temporary anyway

theihor · 2025-09-02T21:19:30Z

@theihor PR updated with your suggestion.

This is much simpler, thanks!

jordalgo force-pushed the error_message branch from 5a05904 to 6440617 Compare September 1, 2025 16:35

theihor mentioned this pull request Sep 2, 2025

Verifier error message patterns #75

Open

Highlight error message at the end of log

6759585

jordalgo force-pushed the error_message branch from 6440617 to 6759585 Compare September 2, 2025 21:02

theihor approved these changes Sep 2, 2025

View reviewed changes

theihor merged commit 2f70be7 into libbpf:master Sep 2, 2025
1 check passed

jordalgo added a commit to jordalgo/bpfvv that referenced this pull request Sep 6, 2025

Highlight error message at the end of log (libbpf#71)

958c686

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Highlight error message at the end of log#71

Highlight error message at the end of log#71
theihor merged 1 commit intolibbpf:masterfrom
jordalgo:error_message

jordalgo commented Aug 30, 2025

Uh oh!

theihor commented Sep 2, 2025

Uh oh!

jordalgo commented Sep 2, 2025 •

edited

Loading

Uh oh!

theihor commented Sep 2, 2025

Uh oh!

jordalgo commented Sep 2, 2025

Uh oh!

jordalgo commented Sep 2, 2025

Uh oh!

theihor Sep 2, 2025

Uh oh!

jordalgo Sep 2, 2025

Uh oh!

theihor commented Sep 2, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jordalgo commented Aug 30, 2025

Uh oh!

theihor commented Sep 2, 2025

Uh oh!

jordalgo commented Sep 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

theihor commented Sep 2, 2025

Uh oh!

jordalgo commented Sep 2, 2025

Uh oh!

jordalgo commented Sep 2, 2025

Uh oh!

theihor Sep 2, 2025

Choose a reason for hiding this comment

Uh oh!

jordalgo Sep 2, 2025

Choose a reason for hiding this comment

Uh oh!

theihor commented Sep 2, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jordalgo commented Sep 2, 2025 •

edited

Loading