-
Notifications
You must be signed in to change notification settings - Fork 73
Add upgrade tests for Trainer v2 #615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughWalkthroughMoves GetSleepImage to a common package, adds exported namespace helpers and namespace log storage, and introduces multiple trainer upgrade integration tests (Kueue, sleep TrainJob, TrainingRuntime) with supporting helper functions. Changes
Sequence Diagram(s)(Skipped — changes are test additions and helper relocations without a new multi-component runtime control flow requiring visualization.) Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/trainer/trainer_upgrade_sleep_test.go`:
- Around line 60-77: The test TestVerifySleepTrainJob currently only checks pod
count and restart count (via GetPods and sleepPodRestartCount) which can miss
non-Running phases; update the assertion to also verify the pod's phase is
Running (use the same GetPods call and add a
ContainElement(WithTransform(sleepPodPhase, Equal("Running"))) or equivalent
matcher referencing sleepPodPhase) so the test fails if the pod completed or
entered a different phase while still having zero restarts.
🧹 Nitpick comments (4)
tests/trainer/trainer_kueue_upgrade_training_test.go (3)
19-35: Consider reformatting imports per openshift-goimports convention.Based on learnings, imports should have blank lines between different third-party groups (e.g., separating
github.com,k8s.io, andsigs.k8s.iogroups).♻️ Suggested import organization
import ( "testing" trainerv1alpha1 "github.com/kubeflow/trainer/v2/pkg/apis/trainer/v1alpha1" . "github.com/onsi/gomega" + corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/api/resource" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + kueuev1beta1 "sigs.k8s.io/kueue/apis/kueue/v1beta1" kueueacv1beta1 "sigs.k8s.io/kueue/client-go/applyconfiguration/kueue/v1beta1" . "github.com/opendatahub-io/distributed-workloads/tests/common" . "github.com/opendatahub-io/distributed-workloads/tests/common/support" trainerutils "github.com/opendatahub-io/distributed-workloads/tests/trainer/utils" )
110-114: Minor: FieldManager value inconsistency.The FieldManager is
"application/apply-patch"here but"setup-TrainJob"inTestSetupUpgradeTrainJob. Consider using a consistent value or the commonApplyOptionsfromtests/common/support.gofor consistency.
127-137: Defensive delete logic is appropriate, but check scope is broader than necessary.The delete-before-create pattern handles test reruns gracefully. However, line 134 waits for all TrainJobs in the namespace to be empty (
TrainJobs(test, namespace).Should(BeEmpty())), even though only one specific TrainJob was deleted. This works but could be more precise.Consider using a targeted check:
- test.Eventually(TrainJobs(test, namespace), TestTimeoutShort).Should(BeEmpty()) + test.Eventually(func(g Gomega) { + _, err := test.Client().Trainer().TrainerV1alpha1().TrainJobs(namespace).Get(test.Ctx(), upgradeTrainJobName, metav1.GetOptions{}) + g.Expect(errors.IsNotFound(err)).To(BeTrue()) + }, TestTimeoutShort).Should(Succeed())tests/common/environment.go (1)
159-161: Guard against empty SLEEP_IMAGE values.If SLEEP_IMAGE is set but empty/whitespace, the current logic returns an empty image and the pod will fail to start. Consider trimming and falling back to the default.
♻️ Proposed update
func GetSleepImage() string { - return lookupEnvOrDefault(sleepImageEnvVar, "gcr.io/k8s-staging-perf-tests/sleep@sha256:8d91ddf9f145b66475efda1a1b52269be542292891b5de2a7fad944052bab6ea") + v := strings.TrimSpace(os.Getenv(sleepImageEnvVar)) + if v == "" { + return "gcr.io/k8s-staging-perf-tests/sleep@sha256:8d91ddf9f145b66475efda1a1b52269be542292891b5de2a7fad944052bab6ea" + } + return v }
| func TestVerifySleepTrainJob(t *testing.T) { | ||
| Tags(t, PostUpgrade) | ||
| test := With(t) | ||
| namespace := GetNamespaceWithName(test, sleepNamespaceName) | ||
|
|
||
| // Cleanup namespace in the end | ||
| defer DeleteTestNamespace(test, namespace) | ||
|
|
||
| // Pod should be still running without restart | ||
| test.Expect(GetPods(test, sleepNamespaceName, metav1.ListOptions{})). | ||
| Should( | ||
| And( | ||
| HaveLen(1), | ||
| ContainElement(WithTransform(sleepPodRestartCount, BeNumerically("==", 0))), | ||
| ), | ||
| ) | ||
| test.T().Logf("TrainJob %s/%s is still running after upgrade with no pod restarts", sleepNamespaceName, sleepTrainJobName) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verify phase should assert pod is still running.
Right now the check only validates restart count and length. If the pod completes or moves to another phase with zero restarts, the test would still pass.
🐛 Suggested assertion addition
test.Expect(GetPods(test, sleepNamespaceName, metav1.ListOptions{})).
Should(
And(
HaveLen(1),
+ ContainElement(WithTransform(podPhase, Equal(corev1.PodRunning))),
ContainElement(WithTransform(sleepPodRestartCount, BeNumerically("==", 0))),
),
)🤖 Prompt for AI Agents
In `@tests/trainer/trainer_upgrade_sleep_test.go` around lines 60 - 77, The test
TestVerifySleepTrainJob currently only checks pod count and restart count (via
GetPods and sleepPodRestartCount) which can miss non-Running phases; update the
assertion to also verify the pod's phase is Running (use the same GetPods call
and add a ContainElement(WithTransform(sleepPodPhase, Equal("Running"))) or
equivalent matcher referencing sleepPodPhase) so the test fails if the pod
completed or entered a different phase while still having zero restarts.
86f0fce to
49efd36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/trainer/trainer_trainingruntime_upgrade_test.go`:
- Around line 60-83: The test currently only verifies existence; update
TestVerifyTrainingRuntime to also assert the custom TrainingRuntime's PodSpec
preserved fields by locating the found runtime from runtimes.Items, then
extracting its container spec (e.g., runtime.Spec.Template.Spec.Containers[0])
and asserting container.Image, container.Command and container.Args equal the
expected values used when creating the custom runtime (use the same expected
literals/variables as the original test setup). Use
test.Expect/...To(Equal(...)) on container.Image, container.Command and
container.Args to fail the test if they differ.
♻️ Duplicate comments (1)
tests/trainer/trainer_upgrade_sleep_test.go (1)
60-76: Also assert the pod is still Running post-upgrade.
Right now only restart count is checked, so a Completed/Failed pod with zero restarts would pass.🐛 Suggested assertion addition
test.Expect(GetPods(test, sleepNamespaceName, metav1.ListOptions{})). Should( And( HaveLen(1), + ContainElement(WithTransform(podPhase, Equal(corev1.PodRunning))), ContainElement(WithTransform(sleepPodRestartCount, BeNumerically("==", 0))), ), )
🧹 Nitpick comments (1)
tests/common/support/namespace.go (1)
89-104: Apply options when reusing an existing namespace.
If the namespace already exists, provided options (labels/annotations) are ignored, which can leave required settings (e.g., Kueue managed) missing on reruns. Consider applying the options and updating before returning.♻️ Possible update
namespace, err := t.Client().Core().CoreV1().Namespaces().Get(t.Ctx(), name, metav1.GetOptions{}) if err == nil { - return namespace + for _, option := range options { + t.Expect(option.ApplyTo(namespace)).To(gomega.Succeed()) + } + if len(options) > 0 { + namespace, err = t.Client().Core().CoreV1().Namespaces().Update(t.Ctx(), namespace, metav1.UpdateOptions{}) + t.Expect(err).NotTo(gomega.HaveOccurred()) + } + return namespace } else if errors.IsNotFound(err) {
| // TestVerifyTrainingRuntime verifies the custom TrainingRuntime survives upgrade | ||
| // with its spec preserved. | ||
| func TestVerifyTrainingRuntime(t *testing.T) { | ||
| Tags(t, PostUpgrade) | ||
| test := With(t) | ||
|
|
||
| namespace := GetNamespaceWithName(test, runtimeNamespaceName) | ||
| defer DeleteTestNamespace(test, namespace) | ||
|
|
||
| // Verify TrainingRuntime still exists after upgrade by listing all runtimes | ||
| runtimes, err := test.Client().Trainer().TrainerV1alpha1().TrainingRuntimes(runtimeNamespaceName).List( | ||
| test.Ctx(), metav1.ListOptions{}) | ||
| test.Expect(err).NotTo(HaveOccurred(), "Failed to list TrainingRuntimes") | ||
|
|
||
| // Find Custom TrainingRuntime in the list | ||
| var found bool | ||
| for _, runtime := range runtimes.Items { | ||
| if runtime.Name == customRuntimeName { | ||
| found = true | ||
| break | ||
| } | ||
| } | ||
| test.Expect(found).To(BeTrue(), "Custom TrainingRuntime %s should exist after upgrade", customRuntimeName) | ||
| test.T().Logf("TrainingRuntime %s/%s is preserved after upgrade", runtimeNamespaceName, customRuntimeName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, check the file exists and get its size
wc -l tests/trainer/trainer_trainingruntime_upgrade_test.goRepository: opendatahub-io/distributed-workloads
Length of output: 137
🏁 Script executed:
# Read the entire file to understand context, setup, and whether spec checks exist
cat -n tests/trainer/trainer_trainingruntime_upgrade_test.goRepository: opendatahub-io/distributed-workloads
Length of output: 6001
🏁 Script executed:
# Search for customRuntimeName definition and setup
rg "customRuntimeName" tests/trainer/trainer_trainingruntime_upgrade_test.goRepository: opendatahub-io/distributed-workloads
Length of output: 803
🏁 Script executed:
# Check the trainerv1alpha1 package structure to verify field paths in suggested diff
fd "trainingruntime" --type f | head -20Repository: opendatahub-io/distributed-workloads
Length of output: 133
Verify spec preservation, not just existence.
The test claims to verify spec preservation (function name and comment) but only checks that the runtime exists. Add assertions to verify the container image, command, and args survive the upgrade.
Suggested changes
- // Find Custom TrainingRuntime in the list
- var found bool
- for _, runtime := range runtimes.Items {
- if runtime.Name == customRuntimeName {
- found = true
- break
- }
- }
- test.Expect(found).To(BeTrue(), "Custom TrainingRuntime %s should exist after upgrade", customRuntimeName)
+ // Find Custom TrainingRuntime in the list
+ var foundRuntime *trainerv1alpha1.TrainingRuntime
+ for i := range runtimes.Items {
+ if runtimes.Items[i].Name == customRuntimeName {
+ foundRuntime = &runtimes.Items[i]
+ break
+ }
+ }
+ test.Expect(foundRuntime).NotTo(BeNil(), "Custom TrainingRuntime %s should exist after upgrade", customRuntimeName)
+ if foundRuntime != nil {
+ container := foundRuntime.Spec.Template.Spec.ReplicatedJobs[0].Template.Spec.Template.Spec.Containers[0]
+ test.Expect(container.Image).To(Equal(GetSleepImage()))
+ test.Expect(container.Command).To(Equal([]string{"sleep"}))
+ test.Expect(container.Args).To(Equal([]string{"24h"}))
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // TestVerifyTrainingRuntime verifies the custom TrainingRuntime survives upgrade | |
| // with its spec preserved. | |
| func TestVerifyTrainingRuntime(t *testing.T) { | |
| Tags(t, PostUpgrade) | |
| test := With(t) | |
| namespace := GetNamespaceWithName(test, runtimeNamespaceName) | |
| defer DeleteTestNamespace(test, namespace) | |
| // Verify TrainingRuntime still exists after upgrade by listing all runtimes | |
| runtimes, err := test.Client().Trainer().TrainerV1alpha1().TrainingRuntimes(runtimeNamespaceName).List( | |
| test.Ctx(), metav1.ListOptions{}) | |
| test.Expect(err).NotTo(HaveOccurred(), "Failed to list TrainingRuntimes") | |
| // Find Custom TrainingRuntime in the list | |
| var found bool | |
| for _, runtime := range runtimes.Items { | |
| if runtime.Name == customRuntimeName { | |
| found = true | |
| break | |
| } | |
| } | |
| test.Expect(found).To(BeTrue(), "Custom TrainingRuntime %s should exist after upgrade", customRuntimeName) | |
| test.T().Logf("TrainingRuntime %s/%s is preserved after upgrade", runtimeNamespaceName, customRuntimeName) | |
| // TestVerifyTrainingRuntime verifies the custom TrainingRuntime survives upgrade | |
| // with its spec preserved. | |
| func TestVerifyTrainingRuntime(t *testing.T) { | |
| Tags(t, PostUpgrade) | |
| test := With(t) | |
| namespace := GetNamespaceWithName(test, runtimeNamespaceName) | |
| defer DeleteTestNamespace(test, namespace) | |
| // Verify TrainingRuntime still exists after upgrade by listing all runtimes | |
| runtimes, err := test.Client().Trainer().TrainerV1alpha1().TrainingRuntimes(runtimeNamespaceName).List( | |
| test.Ctx(), metav1.ListOptions{}) | |
| test.Expect(err).NotTo(HaveOccurred(), "Failed to list TrainingRuntimes") | |
| // Find Custom TrainingRuntime in the list | |
| var foundRuntime *trainerv1alpha1.TrainingRuntime | |
| for i := range runtimes.Items { | |
| if runtimes.Items[i].Name == customRuntimeName { | |
| foundRuntime = &runtimes.Items[i] | |
| break | |
| } | |
| } | |
| test.Expect(foundRuntime).NotTo(BeNil(), "Custom TrainingRuntime %s should exist after upgrade", customRuntimeName) | |
| if foundRuntime != nil { | |
| container := foundRuntime.Spec.Template.Spec.ReplicatedJobs[0].Template.Spec.Template.Spec.Containers[0] | |
| test.Expect(container.Image).To(Equal(GetSleepImage())) | |
| test.Expect(container.Command).To(Equal([]string{"sleep"})) | |
| test.Expect(container.Args).To(Equal([]string{"24h"})) | |
| } |
🤖 Prompt for AI Agents
In `@tests/trainer/trainer_trainingruntime_upgrade_test.go` around lines 60 - 83,
The test currently only verifies existence; update TestVerifyTrainingRuntime to
also assert the custom TrainingRuntime's PodSpec preserved fields by locating
the found runtime from runtimes.Items, then extracting its container spec (e.g.,
runtime.Spec.Template.Spec.Containers[0]) and asserting container.Image,
container.Command and container.Args equal the expected values used when
creating the custom runtime (use the same expected literals/variables as the
original test setup). Use test.Expect/...To(Equal(...)) on container.Image,
container.Command and container.Args to fail the test if they differ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/trainer/trainer_upgrade_sleep_test.go`:
- Around line 114-116: The helper sleepPodRestartCount reads
pod.Status.ContainerStatuses[0] without checking the slice length, which can
panic if ContainerStatuses is empty; update sleepPodRestartCount to defensively
check len(pod.Status.ContainerStatuses) > 0 and return a safe default (e.g., 0)
when empty, otherwise return int(pod.Status.ContainerStatuses[0].RestartCount),
so callers of sleepPodRestartCount won't encounter an out-of-range panic.
♻️ Duplicate comments (1)
tests/trainer/trainer_upgrade_sleep_test.go (1)
60-77: Verify phase should assert pod is still running.The check validates restart count and pod count but does not confirm the pod is still in
Runningphase. A pod that completed or crashed with zero restarts would pass this assertion.🐛 Suggested fix
test.Expect(GetPods(test, sleepNamespaceName, metav1.ListOptions{})). Should( And( HaveLen(1), + ContainElement(WithTransform(podPhase, Equal(corev1.PodRunning))), ContainElement(WithTransform(sleepPodRestartCount, BeNumerically("==", 0))), ), )
| func sleepPodRestartCount(pod corev1.Pod) int { | ||
| return int(pod.Status.ContainerStatuses[0].RestartCount) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add bounds check to prevent potential panic.
Accessing ContainerStatuses[0] without verifying the slice is non-empty could panic if the pod hasn't initialized container status yet.
🐛 Suggested defensive fix
func sleepPodRestartCount(pod corev1.Pod) int {
+ if len(pod.Status.ContainerStatuses) == 0 {
+ return 0
+ }
return int(pod.Status.ContainerStatuses[0].RestartCount)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func sleepPodRestartCount(pod corev1.Pod) int { | |
| return int(pod.Status.ContainerStatuses[0].RestartCount) | |
| } | |
| func sleepPodRestartCount(pod corev1.Pod) int { | |
| if len(pod.Status.ContainerStatuses) == 0 { | |
| return 0 | |
| } | |
| return int(pod.Status.ContainerStatuses[0].RestartCount) | |
| } |
🤖 Prompt for AI Agents
In `@tests/trainer/trainer_upgrade_sleep_test.go` around lines 114 - 116, The
helper sleepPodRestartCount reads pod.Status.ContainerStatuses[0] without
checking the slice length, which can panic if ContainerStatuses is empty; update
sleepPodRestartCount to defensively check len(pod.Status.ContainerStatuses) > 0
and return a safe default (e.g., 0) when empty, otherwise return
int(pod.Status.ContainerStatuses[0].RestartCount), so callers of
sleepPodRestartCount won't encounter an out-of-range panic.
tests/common/support/namespace.go
Outdated
| // CreateOrGetUpgradeTestNamespace creates a namespace with the given name if it doesn't exist, | ||
| // or returns the existing namespace if it does. This is useful for upgrade tests where | ||
| // the namespace needs to persist across pre-upgrade and post-upgrade test phases. | ||
| func CreateOrGetUpgradeTestNamespace(t Test, name string, options ...Option[*corev1.Namespace]) *corev1.Namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick -
| func CreateOrGetUpgradeTestNamespace(t Test, name string, options ...Option[*corev1.Namespace]) *corev1.Namespace { | |
| func CreateOrGetTestNamespace(t Test, name string, options ...Option[*corev1.Namespace]) *corev1.Namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CreateOrGetTestNamespace is already exist - https://github.com/opendatahub-io/distributed-workloads/blob/main/tests/common/support/test.go#L170 and this would cause the naming conflict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In such case something like CreateOrGetTestNamespaceWithName?
Current name hints that the function is usable for upgrade only, though in reality it can be used anywhere.
| break | ||
| } | ||
| } | ||
| test.Expect(found).To(BeTrue(), "Custom TrainingRuntime %s should exist after upgrade", customRuntimeName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick - It is better to assert on whole list of runtimes, that way you get more context (list of all existing runtimes) when assertion fails.
| customRuntimeName = "custom-sleep-runtime" | ||
| ) | ||
|
|
||
| func TestSetupTrainingRuntime(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you cover additional scenario - all TrainingRuntimes which exist in PreUpgrade has to be kept after upgrade.
To persist runtime names you can use for example ConfigMap (or anything else).
| upgradeTrainJobName = "trainjob-upgrade" | ||
| ) | ||
|
|
||
| func TestSetupUpgradeTrainJob(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking about another test case - same like this one, with the difference of using specific TrainingRuntime (i.e. torch_distributed_cuda128_torch29_py312.yaml, runtime name would have to be fetched somehow so we don't have to hardcode it here).
The reason for this additional test is to verify that runtime which is not updated still works properly. This test would have a main meaning for upgrades between stable versions, where default training runtime is updated, but specific training runtime isn't. It is good to verify that updated Trainer v2 operator still works properly with old training runtime.
WDYT?
49efd36 to
99c1724
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/trainer/trainer_trainingruntime_upgrade_test.go`:
- Around line 66-68: The current test calls DeleteTestNamespace immediately
after creating namespace via GetNamespaceWithName which can remove the custom
runtime before TestVerifyAllTrainingRuntimes and other post-upgrade checks run;
remove the early DeleteTestNamespace calls (references: GetNamespaceWithName,
DeleteTestNamespace) from within the per-test setup and instead perform
namespace deletion in a final cleanup step that runs after all post-upgrade
validations (e.g., a dedicated TestSuiteTeardown or a final test that deletes
the namespace), ensuring TestVerifyAllTrainingRuntimes and related checks run
against the created namespace first.
| namespace := GetNamespaceWithName(test, runtimeNamespaceName) | ||
| defer DeleteTestNamespace(test, namespace) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid deleting the namespace before all post-upgrade validations finish.
DeleteTestNamespace here can remove the custom runtime before TestVerifyAllTrainingRuntimes runs, which makes the suite order-dependent and flaky.
💡 Suggested change (remove early cleanup; move to a dedicated final cleanup step)
- namespace := GetNamespaceWithName(test, runtimeNamespaceName)
- defer DeleteTestNamespace(test, namespace)
+ namespace := GetNamespaceWithName(test, runtimeNamespaceName)Please relocate namespace cleanup to a stage that’s guaranteed to run after all post-upgrade checks (e.g., a dedicated cleanup test or suite-level teardown).
Also applies to: 174-205
🤖 Prompt for AI Agents
In `@tests/trainer/trainer_trainingruntime_upgrade_test.go` around lines 66 - 68,
The current test calls DeleteTestNamespace immediately after creating namespace
via GetNamespaceWithName which can remove the custom runtime before
TestVerifyAllTrainingRuntimes and other post-upgrade checks run; remove the
early DeleteTestNamespace calls (references: GetNamespaceWithName,
DeleteTestNamespace) from within the per-test setup and instead perform
namespace deletion in a final cleanup step that runs after all post-upgrade
validations (e.g., a dedicated TestSuiteTeardown or a final test that deletes
the namespace), ensuring TestVerifyAllTrainingRuntimes and related checks run
against the created namespace first.
Resolves RHOAIENG-41638
Description
Add upgrade tests for Trainer v2
How Has This Been Tested?
Merge criteria:
Summary by CodeRabbit
Tests
Bug Fixes / Improvements
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.