The example you provided uses the same dataset for both training and testing. To evaluate the performance of a machine learning model, we need to split the dataset into separate training and testing sets. Can you show us how to evaluate the trained model on an unseen dataset? Thanks.
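
To make the request concrete, here is a minimal sketch of the kind of hold-out evaluation I mean. Everything in it is an assumption for illustration: the toy data is made up, the helper names (`train_test_split`, `predict_1nn`, `accuracy`) are hypothetical, and the 1-nearest-neighbour rule is only a stand-in for whatever model the example actually trains.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, x):
    """Stand-in model: return the label of the training point closest to x."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]


def accuracy(train, rows):
    """Fraction of rows whose label the stand-in model predicts correctly."""
    correct = sum(predict_1nn(train, x) == y for x, y in rows)
    return correct / len(rows)


# Toy data (made up for illustration): one feature, label is 1 when x >= 50.
data = [(x, int(x >= 50)) for x in range(100)]

# Hold out 25% of the rows; the model only ever sees the training part.
train, test = train_test_split(data)

train_acc = accuracy(train, train)  # optimistic: scored on data already seen
test_acc = accuracy(train, test)    # scored on rows held out from training

print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
```

The number I am asking about is the second one: the score on the held-out rows, not the score on the rows the model was fitted to.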