Clarification on AST Syntax Match Score in CodeBLEU Example 1 #192

@MeghaSudheendran

I am a PhD student using CodeBLEU in my research on code generation evaluation. I noticed a discrepancy between the numbers reported in the paper and the results obtained using the official GitHub implementation.
In Example 1 of the paper, it is stated:

“The number of all sub-trees of the reference AST generated by tree-sitter is 21 and the hit number for the candidate is 13, so the syntactic AST match score is 13/21 ∗100 = 61.90(%)”
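A minimal sketch of the ratio described in that quote, as I understand it: collect every subtree of the reference, count how many also occur among the candidate's subtrees, and divide. The nested-tuple trees and the helper names `all_subtrees` and `syntax_match` are hypothetical and for illustration only. This is not the tree-sitter-based code in the repository, and the toy trees are far smaller than the real ASTs.

```python
# Toy illustration only: a tree is a nested tuple (label, child, child, ...).
# These are NOT real tree-sitter nodes, and the labels are made up.

def all_subtrees(tree):
    """Return every subtree of `tree`, including the tree itself."""
    subtrees = [tree]
    for child in tree[1:]:
        if isinstance(child, tuple):  # plain strings are leaf text, not subtrees
            subtrees.extend(all_subtrees(child))
    return subtrees


def syntax_match(reference, candidate):
    """Return (hits, total): reference subtrees that also occur in the candidate."""
    candidate_subtrees = all_subtrees(candidate)
    reference_subtrees = all_subtrees(reference)
    hits = sum(1 for sub in reference_subtrees if sub in candidate_subtrees)
    return hits, len(reference_subtrees)


# Two tiny trees that differ only in the cast type, loosely mirroring
# the (int) vs (float) difference in Example 1.
reference = ("return", ("cast", ("type", "int"), ("cond", ("id", "d"))))
candidate = ("return", ("cast", ("type", "float"), ("cond", ("id", "d"))))

hits, total = syntax_match(reference, candidate)
print(f"{hits}/{total} = {hits / total:.4f}")
```

In this toy case a single differing leaf also invalidates every enclosing subtree, so small changes in how the parser builds the tree can shift both the hit count and the total.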

However, when I run the official CodeBLEU implementation on the same candidate and reference code, I obtain:

match_count = 11

total_count = 19

AST Syntax Match Score ≈ 0.5789

candidate_code = """
public static int Sign(double d){
    return (float)((d==0)?0:(c<0.0)?-1:1);

"""

reference_code = """
public static int Sign(double d){
    return (int)((d==0)?0:(d<0)?-1:1);
}
"""

This gives a lower AST match score (11/19 ≈ 57.89%) than the 61.90% (13/21) reported in the paper.
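To make the gap concrete, here is a quick arithmetic check of the two ratios. It uses only the four counts given above; the variable names are my own.

```python
# Counts quoted from the paper (Example 1) and from my run of the implementation.
paper_hits, paper_total = 13, 21
impl_hits, impl_total = 11, 19

paper_score = paper_hits / paper_total
impl_score = impl_hits / impl_total

print(f"paper:          {paper_score:.4f}")               # 0.6190
print(f"implementation: {impl_score:.4f}")                # 0.5789
print(f"gap:            {paper_score - impl_score:.4f}")  # 0.0401
```

Both the hit count and the total are lower by exactly 2 in my run, which may point to a difference in how the subtrees are enumerated rather than in how they are matched.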
Could you please clarify:

  1. Was Example 1 in the paper tested with the official implementation, or was it a simplified toy example for illustration?

  2. Is the official GitHub implementation considered the authoritative version, even if some example numbers differ from the paper?

Understanding this will help ensure that I interpret CodeBLEU results correctly in my research.
Thank you very much for your guidance!
