For generated solutions that contain non-Python code, how is the model-generated solution CoT cleaned during evaluation?

I can't locate the core logic that cleans non-Python code out of the model's generated solution anywhere in the code under the evaluation directory.
Currently, I can only see how the dataset ground truth is parsed, how the input prompt is separated from the solution, and how the cleaned prediction (final answer) is compared with the parsed ground truth.
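
To make concrete what I mean by "cleaning", here is a minimal sketch of the kind of step I expected to find. Everything in it is an assumption on my side, not code from the repository: the helper names `strip_code_blocks` and `extract_final_answer` are hypothetical, and I am guessing that code appears in Markdown-style fences and that the final answer is wrapped in `\boxed{...}`.

```python
import re
from typing import Optional

# Hypothetical pattern: a fenced block with any language tag (cpp, js, ...).
FENCE_RE = re.compile(r"```[^\n]*\n.*?```", re.DOTALL)

# Assumed answer format: a non-nested \boxed{...} expression.
BOXED_RE = re.compile(r"\\boxed\{([^{}]*)\}")


def strip_code_blocks(solution: str) -> str:
    """Remove every fenced code block from a generated solution."""
    return FENCE_RE.sub("", solution)


def extract_final_answer(solution: str) -> Optional[str]:
    """Return the last boxed answer found outside code blocks, or None."""
    cleaned = strip_code_blocks(solution)
    matches = BOXED_RE.findall(cleaned)
    return matches[-1].strip() if matches else None


sample = "Reasoning...\n```cpp\nint x = 1;\n```\nSo the answer is \\boxed{42}."
print(extract_final_answer(sample))
```

If the evaluation code does something equivalent to this (or deliberately skips it), a pointer to the relevant file or function would be enough.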