Hello,
Thank you for the detailed evaluation presented in the paper. I am particularly interested in the performance analysis of ChatGPT and instruction-tuned LLaMA-2-7B on SQL and SPARQL generation, as reported in Table 12.
The paper states that SPARQL was evaluated on 4,779 samples from LC-QuAD and KQA-Pro, and that SQL was evaluated on 15,900 samples from WikiSQL. However, I couldn't find the specific details or availability of the test sets used for these evaluations.
Could you please provide more information about the test sets, or share the test sets themselves, if possible? This would greatly aid in reproducing the results and further understanding the performance metrics presented.
Thank you for your assistance.