Some of our E2E tests are flaky and fail from time to time: https://github.com/kubeflow/trainer/actions/workflows/test-e2e.yaml
We should explore how to improve the Notebook example to resolve this.
Example: https://github.com/kubeflow/trainer/actions/runs/23232456393
/area testing
/good-first-issue