What's the difference between the 'evaluate' and 'evaluate_raw' methods of the 'VQADataset' class in 'vqa_data.py'?