I tried to train the model on the 50salads and breakfast datasets, but the loss is NaN. Why does this happen? I didn't make any changes to the code.