When I run Stage 3 (PPO fine-tuning) with the Qwen2 0.5B model, I get the following error:
Assertion srcIndex < srcSelectDimSize failed,
which looks like it might be related to the sequence length of the input dataset?
I have already set Num_Padding_at_Beginning=0, since this value is model-related.
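
For anyone trying to narrow this down: this assertion is usually raised when an index lookup (for example an embedding table) receives an index outside the table, so it may be worth checking both the token ids and the sequence lengths before the batch reaches the model. Below is a minimal sketch in plain Python. The helper names, `num_embeddings`, and `max_positions` are hypothetical placeholders for illustration, not part of the training scripts; the real values would have to come from the model's own config, and I have not confirmed that this is the cause here.

```python
from typing import List, Sequence, Tuple


def find_out_of_range_ids(
    batch: Sequence[Sequence[int]], num_embeddings: int
) -> List[Tuple[int, int, int]]:
    """Return (row, column, token_id) for every id outside [0, num_embeddings)."""
    bad = []
    for row, seq in enumerate(batch):
        for col, tok in enumerate(seq):
            if tok < 0 or tok >= num_embeddings:
                bad.append((row, col, tok))
    return bad


def find_too_long_rows(
    batch: Sequence[Sequence[int]], max_positions: int
) -> List[int]:
    """Return the indices of rows longer than max_positions tokens."""
    return [row for row, seq in enumerate(batch) if len(seq) > max_positions]


if __name__ == "__main__":
    # Toy values only (assumption): a 10-entry embedding table, 4 positions.
    toy_batch = [[1, 2, 9], [3, 10, 4], [0, 1, 2, 3, 4]]
    print(find_out_of_range_ids(toy_batch, num_embeddings=10))  # [(1, 1, 10)]
    print(find_too_long_rows(toy_batch, max_positions=4))       # [2]
```

If the first check reports anything, the tokenizer and the model's embedding table probably disagree on vocabulary size (for instance after adding a pad or special token). If only the second check reports rows, the sequence length is the more likely culprit.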
