- Test `evals/codebleu/visualize.py` functions
- Test with different result formats
- Test edge cases (empty results, missing fields)
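A minimal sketch of how these cases might be laid out as pytest-style tests. The actual functions in `evals/codebleu/visualize.py` are not shown here, so `extract_scores`, the `"results"` wrapper key, and the `"codebleu"` field name are hypothetical stand-ins used only to illustrate the shape of the tests; swap in the real function and field names.

```python
from typing import Any, List


# Hypothetical stand-in: the real functions in evals/codebleu/visualize.py
# are not shown, so this helper exists only to make the tests runnable.
def extract_scores(results: Any, field: str = "codebleu") -> List[float]:
    """Collect numeric scores from a list of records or a {"results": [...]} dict."""
    if isinstance(results, dict):
        records = results.get("results", [])
    else:
        records = results or []
    scores = []
    for record in records:
        # Skip records that are malformed or lack the score field.
        value = record.get(field) if isinstance(record, dict) else None
        if isinstance(value, (int, float)):
            scores.append(float(value))
    return scores


def test_list_format():
    # Results given as a bare list of records.
    assert extract_scores([{"codebleu": 0.5}, {"codebleu": 0.25}]) == [0.5, 0.25]


def test_dict_format():
    # Results wrapped in a dict under an assumed "results" key.
    assert extract_scores({"results": [{"codebleu": 1}]}) == [1.0]


def test_empty_results():
    assert extract_scores([]) == []
    assert extract_scores({}) == []


def test_missing_fields():
    # Records without the score field are skipped rather than raising.
    assert extract_scores([{"name": "a"}, {"codebleu": 0.75}]) == [0.75]
```

Each test maps to one item in the list: one per result format, one for empty input, and one for records with missing fields. Whether missing fields should be skipped or should raise is a design choice for the real module; the sketch assumes skipping.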