diff --git a/proposals/bring_your_own_argo_workflows/byoaw_context.md b/proposals/bring_your_own_argo_workflows/byoaw_context.md
new file mode 100644
index 00000000000..7dd1ea90ebb
--- /dev/null
+++ b/proposals/bring_your_own_argo_workflows/byoaw_context.md
@@ -0,0 +1,173 @@
+## Feature Overview
+Data Science Pipelines currently deploys a standalone Argo Workflow Controller and its respective resources, including CRDs and other cluster-scoped manifests. This can cause conflicts on clusters that already have a separate Argo Workflows installation, so the intent of this feature is to handle and reconcile these situations appropriately.
+This feature will implement a global configuration option to disable WorkflowControllers from being deployed alongside DataSciencePipelineApplications, using user-provided Argo Workflows installations instead. Consequently, this feature will also include documentation of the supported version combinations between these “Bring your own” Argo installations and current versions of ODH/Data Science Pipelines, as well as improvements to our testing strategy as we validate that compatibility.
+
+## Why do we need this feature
+Potential users, who have their own Argo Workflows installation already running on their clusters, have noted that the current architecture of Data Science Pipelines would conflict with their environment, as DSPAs currently provision their own Argo Workflow Controller. This would create contention between the user-provided and DSP-provisioned AWF instances, which prevents these users from adopting DSP. Adding the ability to disable DSP-provided WorkflowControllers and instead use a “Bring-your-own” instance removes this blocker.
+
+## Feature Requirements
+### High level requirements
+* As a Cluster Administrator I want to be able to install ODH DSP in a cluster that has an existing Argo Workflows installation.
+* As a Cluster Administrator I want to be able to globally enable and disable deploying Argo WorkflowControllers in a Data Science Project with a Data Science Pipelines Application installed.
+* As a Cluster Administrator I want to be able to add or remove all Argo WorkflowControllers from managed Data Science Pipelines Applications by updating a platform-level configuration.
+* As a Cluster Administrator I want to be able to upgrade my ODH cluster and the DSP component in a cluster that has an Argo Workflows installation.
+* As a Cluster Administrator I want to manage the lifecycle of my ODH and Argo Workflows installations independently.
+* As a Cluster Administrator, I want to easily understand which versions of Argo are compatible with which versions of DSP
+
+### Non-functional requirements
+* Pre-existing Argo CRDs and CRs should not be removed when installing DSP
+  * Removing the CRDs on DSP install would constitute a destructive installation side effect, which needs to be avoided (it breaks existing workflows)
+  * If a diff exists between pre-existing and shipped Argo CRDs, they need to be updated in place, assuming compatibility is supported
+  * Includes Workflows, WorkflowTemplates, CronWorkflows, etc.
+* The supported Argo Workflows version, and the latest version of the previous (n-1) minor release, need to be tracked and tested for compatibility as new upstream releases appear
+  * Example: ensure an ArgoWF v3.4.18 release is still compatible while DSP is using v3.4.17
+* Maintain a compatibility matrix of ArgoWF backend to DSP releases
+* Add a configuration mechanism to globally enable/disable deploying managed Argo WCs in DSPAs
+* Add a mechanism to DSPO to remove a subcomponent (such as the Argo WC), rather than just removing management responsibility for it
+* Provide a migration plan for when DSP needs to upgrade to a new ArgoWF version while using an external ArgoWF
+* Ensure that workflow runs on DSP using an external ArgoWF are only visible to users with access to the containing Project
+* Update, improve and document a testing strategy for coverage of supported versions of Argo Workflows for a given ODH version
+* Update, improve and document a testing strategy for coverage of the latest version of the previous minor release of Argo Workflows for a given ODH version
+* Work with the upstream community to add support for, and documentation of, multiple versions of Argo Workflows dependencies
+* Document the support policy and the supported versions.
+* Update the ODH and DSP operators to prevent creation of DSPAs with DSP-managed Workflow Controllers in cases where a pre-existing Argo Workflows installation is detected (P1: depends on feasibility of this detection mechanism)
+
+### Supported Version Compatibility
+The Kubeflow Pipelines backend has codebase dependencies on Argo Workflows libraries, which in turn interact with the deployed Argo Workflows pipeline engine via k8s interfaces (CRs, etc). The Data Science Pipelines Application can therefore be deployed with components whose AWF dependencies are independent of the deployed Argo Workflows backend. The consequence of this is that it is possible for the API Server to be out-of-sync or not fully compatible with the deployed Workflow Controller, especially one that is deployed by a user outside of a Data Science Pipelines Application stack. Therefore, a compatibility matrix will need to be created, documented, tested, and maintained.
+
+Current messaging states that there is no written guarantee that future releases of Argo Workflows are compatible with previous versions, even Z-streams. However, community maintainers have stated they are working with this in mind and with the intention of introducing a written mandate that z-stream releases will not introduce breaking changes. Additionally, Argo documentation states patch versions will only contain bug fixes and minor features, which should exclude breaking changes. Such a mandate would help broaden our support matrix and simplify our testing strategy, so we should work upstream to cultivate and introduce it as quickly as possible.
+
+With that said, there is also no guarantee that minor releases of Argo Workflows will not introduce breaking changes.
In fact, we have seen multiple occasions where this has happened (the 3.3 to 3.4 upgrade, for instance, required a very non-trivial PR that blocked upstream dependency upgrades for over a year). In contrast, the 3.4 to 3.5 upgrade was straightforward, with no introduced breaking changes. This suggests that minor AWF upgrades will always carry inherent risk and therefore should not be included in the support matrix, at least not without extensive testing.
+
+Given these conditions, an example compatibility matrix would look like the following table:
+
+| **ODH Version** | **Supported ArgoWF Version, Current State** | **Supported Range of ArgoWF Versions, upstream z-stream stability mandate accepted** |
+|-----------------|---------------------------------------------|---------------------------------------------------------------------------------------|
+| 3.4.1           | 3.4.16                                      | 3.4.16                                                                                 |
+| 3.5.0           | 3.5.14, 3.5.10 - 3.5.13, …                  | 3.5.x                                                                                  |
+| 3.6.0           | 3.5.14                                      | 3.5.x - 3.5.y                                                                          |
+
+### Out of scope
+* Isolating a DSP ArgoWF WC from a vanilla cluster-scoped ArgoWF installation
+* Using partial ArgoWF installs in combination with the DSP-shipped Workflow Controller
+
+### Upgrades/Migration
+Because the user is providing their own Workflow Controller in this feature, documentation will need to be written on the upgrade procedure so that self-provided AWF installations remain in sync with the version supported by ODH during upgrades of the platform operator and/or DSPO. This should be simple - typically an AWF upgrade just involves re-applying manifests from a set of folders. Regardless, documentation should point to these upstream procedures to simplify the upgrade process.
+
+A migration plan should also be drafted (for switching the backing pipeline engine between user-provided and DSPO-managed). That is - if a DSPA has a WC but the user wishes to remove it and leverage their own ArgoWF, how are runs, versions, etc. persisted between the two Argo Workflows instances? As it stands now, because DSP stores metadata and artifacts in MLMD and S3, respectively, these should be hot-swappable, and run history/artifact lineage should be maintained. The documentation produced should mention these conditions.
+
+Documentation should also mention that users with self-managed Argo Workflows will be responsible for upgrading their ODH installations appropriately to stay in-support with Argo Workflows. That is - if a user has brought their own AWF installation and it goes out-of-support/EOL, the user will be responsible for upgrading ODH to a version that has DSP built on an AWF backend that is still in-support. This can be done by cross-referencing the support matrix proposed above. ODH will not be responsible for rectifying conditions where an out-of-support Argo Workflows version is installed alongside a supported version of ODH, nor will ODH block an upgrade if this condition is encountered. Consequently, this also means that the shipped/included Argo WorkflowControllers of the latest ODH release will support an Argo Workflows version that is still maintained and supported by the upstream Argo community.
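+
+As an illustration of why the engines should be hot-swappable, the following is a minimal sketch of a DSPA whose run metadata and artifacts live outside the Workflow Controller. The hostnames, bucket, and version values are hypothetical, and the external database/object storage fields are abbreviated - treat the exact paths as illustrative and see the DSPA CRD for the full schema:
+
+```yaml
+---
+apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
+kind: DataSciencePipelinesApplication
+metadata:
+  name: dspa
+  namespace: dspa
+spec:
+  dspVersion: v2
+  workflowController:
+    deploy: false            # execution handed to the user-provided ArgoWF
+  database:
+    externalDB:              # run/version metadata survives a WC swap
+      host: mariadb.example.com   # hypothetical host
+      ...
+  objectStorage:
+    externalStorage:         # artifacts survive a WC swap
+      host: s3.example.com        # hypothetical host
+      bucket: dsp-artifacts       # hypothetical bucket
+      ...
+```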
+
+### Multiple Workflow Controller Conflicts
+We will need to account for possible situations where a cluster-scoped Workflow Controller has been deployed on a cluster, and then a DSPA is created without disabling the namespace-scoped Workflow Controller in the DSPA spec.
+
+Open questions to answer via SPIKE:
+* Should we attempt to detect this condition?
+* Should this just be handled in documentation as an unsupported configuration?
+
+Conversely, if a WorkflowController already exists in a deployed DSPA and a user then deploys their own cluster-scoped Argo Workflow Controller, do we handle this the same way? Should the DSPO detect an incompatible environment and attempt to reconcile by removing WCs? What are the consequences of this?
+
+These detection features would be “nice to haves”, i.e. P1, but are not necessary for the MVP of the feature.
+
+### Uninstall
+Uninstallation of the component should remain consistent with the current procedure - deleting a DSPA should delete the included Workflow Controller, but should have no bearing on a user-provided WC. Users that have disabled WC deployment via the global toggle (the main mechanism for BYO Argo) likewise remain unaffected - a DSPA that never deployed a WC is removed via the same standard removal procedure.
+
+### ODH/DSPO Implementation
+DSPO already supports deployment of a Data Science Pipelines Application stack without a Workflow Controller, so no non-trivial code changes should be necessary. This can be done by specifying spec.workflowController.deploy as false in the DSPA:
+
+```yaml
+---
+apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
+kind: DataSciencePipelinesApplication
+metadata:
+  name: dspa
+  namespace: dspa
+spec:
+  dspVersion: v2
+  workflowController:
+    deploy: false
+  ...
+```
+
+With that said, for ODH installations with a large number of DSPAs it would be unsustainable to require editing every DSPA individually. A global toggle mechanism must be implemented instead - one which would remove the Workflow Controller from ALL managed DSPAs. This would be set in the DataScienceCluster CR (see the example below) and would involve coordination with the Platform dev team for implementation. Given that, documentation will need to be added to the DSPA CRD to notify users that disabling individual WCs while providing their own Argo WorkflowController is an unsupported configuration, and that the field is for development purposes only.
+
+Example DataScienceCluster with WorkflowControllers globally disabled:
+```yaml
+---
+kind: DataScienceCluster
+...
+spec:
+  components:
+    datasciencepipelines:
+      managementState: Managed
+      argoWorkflowsControllers:
+        managementState: Removed
+```
+
+Another consequence of this is that the DSPO will need functionality to remove sub-components such as the WorkflowController (but not backing data, such as run details, metrics, etc.) from an already-deployed DSPA. Currently, setting deploy to false simply removes the DSPA's management responsibility for that Workflow Controller - it will still exist, assuming it was deployed at some point (deploy set to true). See the “Uninstall” section above for more details.
+
+Because the Argo RBAC and CRDs are installed at the platform level (i.e. when DSPO is created), these would be left in place even if the “global switch” is toggled to remove all DSPA-owned WCs. The DSP team would need to update the deployment/management mechanism, as updates made to these resources by a user to support bringing their own AWF would be overwritten by the platform operator.
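+
+To make the “remove vs. unmanage” distinction concrete, here is a minimal sketch of the resources involved. The controller Deployment name below is hypothetical, chosen only to illustrate a DSPA-owned resource:
+
+```yaml
+---
+# DSPA with management of the Workflow Controller switched off:
+apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
+kind: DataSciencePipelinesApplication
+metadata:
+  name: dspa
+  namespace: dspa
+spec:
+  workflowController:
+    deploy: false   # today: DSPO only stops reconciling the WC
+---
+# A previously deployed controller Deployment (hypothetical name) is
+# currently left running; under this feature, DSPO should delete it,
+# while run details, metrics, and other backing data remain untouched.
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: ds-pipeline-workflow-controller-dspa
+  namespace: dspa
+```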
+
+## Test Plan Requirements
+* Do not generate any code
+* Create a high level test plan with sections
+* Test plan should include maintaining and validating changes against the compatibility matrix. The intent here is to cover an “N” and “N-1” version of Argo Workflows for verification of compatibility.
+* Each section is a group of tests by type, with a summary describing what types of tests are being covered and why
+* Test Sections:
+  * Cluster config
+  * Negative functional tests
+  * Positive functional tests
+  * Security Tests
+  * Boundary tests
+  * Performance tests
+  * Compatibility matrix tests
+  * Miscellaneous Tests
+  * Final Regression/Full E2E Tests
+* Test Cases for `Cluster config` section:
+  * [Kubernetes Native Mode](https://github.com/kubeflow/pipelines/tree/master/proposals/11551-kubernetes-native-api)
+  * FIPS Mode
+  * Disconnected Cluster
+* Test Cases for `Negative functional tests` section:
+  * With conflicting Argo Workflow controller instances (DSP and external controllers coexisting and looking for the same type of events)
+  * With DSP and external workflow controller on different RBAC
+  * DSP with incompatible workflow schema
+* Test Cases for `Positive functional tests` section:
+  * With artifacts
+  * Without artifacts
+  * For Loop
+  * Parallel for
+  * Custom root kfp
+  * Custom python package indexes
+  * Custom base images
+  * With input
+  * Without input
+  * With output
+  * Without output
+  * With iteration count
+  * With retry
+  * With cert handling
+  * etc.
+  * Override Pod Spec patch - create separate test cases for the following:
+    * Node taint
+    * PVC
+    * Custom labels
+* Test Cases for `Security Tests` section:
+  * With different RBAC access: DSP at cluster-level and Argo Workflow controller at namespace-level access
+* Test Cases for `Miscellaneous Tests` section:
+  * Validate a successful run of a simple hello world pipeline with the DSP Argo Workflow Controller coexisting with an external Argo Workflow Controller
+* Test Cases for `Final Regression/Full E2E Tests` section (run these on a fully deployed RHOAI cluster with the latest of all products for that specific release):
+  * Run Iris Pipeline on a standard RHOAI Cluster with DB as storage
+  * Run Iris Pipeline on a FIPS enabled RHOAI Cluster
+  * Run Iris Pipeline on a disconnected RHOAI Cluster
+  * Run Iris Pipeline on a standard RHOAI Cluster with K8s Native API Storage
+* Test cases should be in a Markdown table format and include the following:
+  - test case summary
+  - test steps
+    + Test steps should be an HTML-format ordered list
+  - Expected results
+    + If there are multiple expectations, then they should be in an HTML-format ordered list
+* Iterate over the plan 5 times before generating a final output
+* Use this test plan documentation as an example test plan document: https://github.com/kubeflow/pipelines/blob/c1876c509aca1ffb68b467ac0213fa88088df7e1/proposals/11551-kubernetes-native-api/TestPlan.md
+* Create a Markdown file as the output test plan
+
+### Example Test Plan
+https://github.com/kubeflow/pipelines/blob/c1876c509aca1ffb68b467ac0213fa88088df7e1/proposals/11551-kubernetes-native-api/TestPlan.md
diff --git a/proposals/bring_your_own_argo_workflows/byoaw_test_plan.md b/proposals/bring_your_own_argo_workflows/byoaw_test_plan.md
new file mode 100644
index 00000000000..d43ba9f2588
--- /dev/null
+++ b/proposals/bring_your_own_argo_workflows/byoaw_test_plan.md
@@ -0,0 +1,681 @@
+**Assisted-by**: Cursor
+# Test Plan: Bring Your Own Argo Workflows (BYOAW)
+
+## Table of Contents
+1. [Overview](#overview)
+2. [Test Scope](#test-scope)
+3. [Test Environment Requirements](#test-environment-requirements)
+4. [Test Categories](#test-categories)
+5. [Success Criteria](#success-criteria)
+6. [Risk Assessment](#risk-assessment)
+7. [Test Execution Phases](#test-implementationexecution-phases)
+
+## Overview
+
+This test plan validates the "Bring Your Own Argo Workflows" feature, which enables Data Science Pipelines to work with existing Argo Workflows installations instead of deploying dedicated WorkflowControllers. The feature includes a global configuration mechanism to disable DSP-managed WorkflowControllers and ensures compatibility with user-provided Argo Workflows.
+
+The plan covers comprehensive testing scenarios including:
+- **Co-existence validation** of DSP and external Argo controllers competing for the same events
+- **Pre-existing Argo detection** and prevention mechanisms
+- **CRD update-in-place** functionality and conflict resolution
+- **RBAC compatibility** across different permission models (cluster vs namespace level)
+- **Workflow schema version compatibility** and API compatibility validation
+- **Z-stream (patch) version compatibility** testing
+- **Data preservation** for WorkflowTemplates, CronWorkflows, and pipeline data
+- **Independent lifecycle management** of ODH and external Argo Workflows installations
+- **Project-level access controls** ensuring workflow visibility boundaries
+- **Comprehensive migration scenarios** and upgrade path validation
+
+## Test Scope
+
+### In Scope
+- Global configuration toggle to disable/enable WorkflowControllers across all DSPAs
+- Compatibility validation with external Argo Workflows installations
+- Version compatibility matrix testing (N and N-1 versions)
+- Migration scenarios between DSP-managed and external Argo configurations
+- Conflict detection and resolution mechanisms
+- Co-existence testing of DSP and external WorkflowControllers competing for the same events
+- RBAC compatibility across different permission models (cluster vs namespace level)
+- Workflow schema version compatibility validation
+- DSPA lifecycle management with external Argo
+- Security and RBAC integration with external Argo
+- Performance impact assessment
+- Upgrade scenarios for ODH with external Argo
+- Hello world pipeline validation in co-existence scenarios
+
+### Out of Scope
+- Partial ArgoWF installs combined with the DSP-shipped Workflow Controller
+- Isolation between the DSP ArgoWF WC and a vanilla cluster-scoped ArgoWF installation
+
+## Test Environment Requirements
+
+### Prerequisites
+- OpenShift/Kubernetes clusters with ODH/DSP installed
+- Multiple test environments with different Argo Workflows versions
+- Access to modify DataScienceCluster and DSPA configurations
+- Sample pipelines covering various complexity levels
+- Test data for migration scenarios
+
+### Test Environments
+| Environment | Argo Version    | DSP Version | Purpose                       |
+|-------------|-----------------|-------------|-------------------------------|
+| Env-1       | Current (3.7.x) | Current     | N version compatibility       |
+| Env-2       | 3.6.x           | Current     | N-1 version compatibility     |
+| Env-3       | 3.4.x - 3.5.y   | Previous    | Upgrade scenarios             |
+
+## Test Categories
+
+## 1. Cluster Configuration Tests
+This section covers tests for different cluster configurations to ensure BYOAW functionality across various deployment scenarios.
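+
+Several of the cases below toggle the global WorkflowController setting. For reference, this is a minimal sketch of the DataScienceCluster configuration those steps exercise (mirroring the example in the context document; all other fields elided):
+
+```yaml
+kind: DataScienceCluster
+...
+spec:
+  components:
+    datasciencepipelines:
+      managementState: Managed
+      argoWorkflowsControllers:
+        managementState: Removed   # set back to Managed to re-enable WCs
+```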
+ +### 1.1 Global Configuration Toggle + +| Test Case ID | TC-CC-001 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify global toggle to disable WorkflowControllers works correctly | +| **Test Steps** |
  1. Install ODH with default configuration (WorkflowControllers enabled)<br>
  2. Create DSPA and verify WorkflowController deployment<br>
  3. Update DataScienceCluster to disable WorkflowControllers:<br>
  `spec.components.datasciencepipelines.argoWorkflowsControllers.managementState: Removed`<br>
  4. Verify existing WorkflowControllers are removed<br>
  5. Create new DSPA and verify no WorkflowController is deployed<br>
| +| **Expected Results** | - Global toggle successfully disables WorkflowController deployment
- Existing WorkflowControllers are cleanly removed
- New DSPAs respect global configuration
- No data loss during WorkflowController removal | + +| Test Case ID | TC-CC-002 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify re-enabling WorkflowControllers after global disable | +| **Test Steps** |
  1. Start with globally disabled WorkflowControllers
  2. Create DSPA without WorkflowController
  3. Re-enable WorkflowControllers globally
  4. Verify WorkflowController is deployed to existing DSPA
  5. Create new DSPA and verify WorkflowController deployment
| +| **Expected Results** | - Global re-enable successfully restores WorkflowController deployment
- Existing DSPAs receive WorkflowControllers
- New DSPAs deploy with WorkflowControllers
- Pipeline history and data preserved | + +### 1.2 Kubernetes Native Mode + +| Test Case ID | TC-CC-003 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify BYOAW compatibility with Kubernetes Native Mode - Create Pipeline Via CR | +| **Test Steps** |
  1. Configure cluster for Kubernetes Native Mode
  2. Install external Argo Workflows
  3. Disable DSP WorkflowControllers globally
  4. Create DSPA
  5. Create Pipeline via CR and create a pipeline run
| +| **Expected Results** | - Kubernetes Native Mode works with external Argo
- Pipeline execution uses Kubernetes-native constructs
- No conflicts between modes | + +| Test Case ID | TC-CC-003a | |-----------------------|--------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify BYOAW compatibility with Kubernetes Native Mode - Create Pipeline via API | +| **Test Steps** |<br>
  1. Configure cluster for Kubernetes Native Mode
  2. Install external Argo Workflows
  3. Disable DSP WorkflowControllers globally
  4. Create DSPA
  5. Create Pipeline via API/UI and create a pipeline run
| +| **Expected Results** | - Kubernetes Native Mode works with external Argo
- Pipeline executes successfully
| + +### 1.3 FIPS Mode Compatibility + +| Test Case ID | TC-CC-004 | +|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify BYOAW works in FIPS-enabled clusters | +| **Test Steps** |
  1. Configure FIPS-enabled cluster
  2. Install FIPS-compatible external Argo
  3. Configure DSPA with external Argo
  4. Execute pipeline suite
  5. Verify FIPS compliance maintained
| +| **Expected Results** | - External Argo respects FIPS requirements
- Pipeline execution maintains FIPS compliance
- No cryptographic violations | + +### 1.4 Disconnected Cluster Support + +| Test Case ID | TC-CC-005 | +|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify BYOAW functionality in disconnected environments | +| **Test Steps** |
  1. Configure disconnected cluster environment
  2. Install external Argo from local registry
  3. Configure DSPA for external Argo
  4. Execute pipelines using local artifacts
  5. Verify offline operation
| +| **Expected Results** | - External Argo operates in disconnected mode
- Pipeline execution works without external connectivity
- Local registries and artifacts accessible | + +## 2. Positive Functional Tests +This section covers all positive functional tests to make sure the feature works as expected and that there are no regressions + +### 2.1 Basic Pipeline Execution + +| Test Case ID | TC-PF-001 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify basic pipeline execution with external Argo | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Submit simple addition pipeline
  3. Monitor execution through DSP UI
  4. Verify completion and results
  5. Check logs and artifacts
| +| **Expected Results** | - Pipeline submits successfully
- Execution progresses normally
- Results accessible through DSP interface
- Logs and monitoring functional | + +### 2.2 Complex Pipeline Types +Runs of different types of pipeline specs execute successfully. These pipelines exercise all the different inputs and outputs of a launcher/driver + +| Test Case ID | TC-PF-002 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines with artifacts" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute pipeline - Pipelines with artifacts
  3. Verify each pipeline type executes correctly
  4. Validate artifacts, metadata, and custom configurations<br>
| +| **Expected Results** | - Pipeline executes successfully<br>
- Artifacts are produced to the right S3 location and are consumed correctly | + +| Test Case ID | TC-PF-003 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines without artifacts" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines without artifacts
  3. Verify each pipeline type executes correctly<br>
| +| **Expected Results** | - Pipeline runs successfully<br>
- No artifacts are produced to S3 | + +| Test Case ID | TC-PF-004 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify run of "For loop constructs" pipeline | +| **Test Steps** |
  1. Configure DSPA with external Argo
  2. Execute Pipeline - For loop constructs
  3. Verify each pipeline type executes correctly<br>
| +| **Expected Results** | - Pipeline runs successfully<br>
- DAGs inside the for loop are iterated over correctly | + +| Test Case ID | TC-PF-005 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Parallel for execution" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Parallel for execution
  3. Verify each pipeline type executes correctly<br>
| +| **Expected Results** | - Pipeline runs successfully<br>
- Parallel DAGs run in parallel and complete successfully | + +| Test Case ID | TC-PF-006 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Custom root KFP components" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Custom root KFP components
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- Artifacts are uploaded to the custom S3 bucket rather than the default, and downstream components consume from this custom location | + +| Test Case ID | TC-PF-007 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Custom python package indexes" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Custom python package indexes
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- When the driver and launcher download Python packages, they download from the custom index rather than PyPI | + +| Test Case ID | TC-PF-008 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines with input parameters" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines with input parameters
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- Components consume the right parameters (verify in the logs or via input resolution in the Argo Workflow status) | + +| Test Case ID | TC-PF-009 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Custom base images" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Custom base images
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- Components download the custom base images | + +| Test Case ID | TC-PF-010 | |-----------------------|------------------------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines with both input and output artifacts" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines with both input and output artifacts
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- Upstream and Downstream components can produce & consume artifacts | + +| Test Case ID | TC-PF-011 | +|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines without input parameters" pipeline | +| **Test Steps** |
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines without input parameters
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully | + +| Test Case ID | TC-PF-012 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines with NO input artifacts, but just output artifacts" pipeline | +| **Test Steps** |
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines with output artifacts
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- Output artifacts (like a model/trained data) are produced to S3 correctly | + +| Test Case ID | TC-PF-013 | +|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines without output artifacts" pipeline | +| **Test Steps** |
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines without output artifacts
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully | + +| Test Case ID | TC-PF-014 | +|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines with iteration count" pipeline | +| **Test Steps** |
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines with iteration count
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- DAGs are iterated over for the correct number of iterations | + +| Test Case ID | TC-PF-015 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines with retry mechanisms" pipeline | +| **Test Steps** |
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines with retry mechanisms
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- Components are retried the correct number of times in case of any failure | + +| Test Case ID | TC-PF-016 | +|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Pipelines with certificate handling" pipeline | +| **Test Steps** |
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Pipelines with certificate handling
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- Components get the right certificates installed | + +| Test Case ID | TC-PF-017 | |-----------------------|-----------------------------------------------------------------| +| **Test Case Summary** | Verify run of "Conditional branching pipelines" pipeline | +| **Test Steps** |<br>
  1. Configure DSPA with external Argo
  2. Execute Pipeline - Conditional branching pipelines
  3. Verify each pipeline type executes correctly
| +| **Expected Results** | - Pipeline runs successfully
- Nested DAGs run only if the expected condition is true | + +### 2.3 Pod Spec Override Testing +Tests to validate that overriding the Pod spec applies the correct Kubernetes properties when the pods are created + +| Test Case ID | TC-PF-018 | |-----------------------|-----------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify pipeline execution with Pod spec overrides containing "Node taints and tolerations" | +| **Test Steps** |<br>
  1. Configure pipelines with Pod spec patch: Node taints and tolerations<br>
  2. Execute pipelines with external Argo
| +| **Expected Results** | - Pod spec overrides applied successfully
- Pipelines schedule on the correct nodes, respecting the configured taints and tolerations | + +| Test Case ID | TC-PF-019 | |-----------------------|-------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify pipeline execution with Pod spec overrides containing "Custom labels and annotations" | +| **Test Steps** |<br>
  1. Configure pipelines with Pod spec patch: Custom labels and annotations<br>
  2. Execute pipelines with external Argo
| +| **Expected Results** | - Pod spec overrides applied successfully
- Custom labels and annotations present on the component pods | + +| Test Case ID | TC-PF-020 | |-----------------------|-----------------------------------------------------------------------------------| +| **Test Case Summary** | Verify pipeline execution with Pod spec overrides containing "Resource limits" | +| **Test Steps** |<br>
  1. Configure pipelines with Pod spec patch: Resource limits<br>
  2. Execute pipelines with external Argo
| +| **Expected Results** | - Pod spec overrides applied successfully
- Overridden component pods have the right resource limits assigned | + +| Test Case ID | TC-PF-021 | |-----------------------|-------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify pipeline execution with component using GPU "Set Acceleration type and limit" | +| **Test Steps** |<br>
  1. Configure pipelines with component requesting GPU
  2. Execute pipelines with external Argo
| +| **Expected Results** | - Pod spec overrides applied successfully
- Overridden component pod has the correct GPU allocated
| + +### 2.4 Multi-DSPA Environment + +| Test Case ID | TC-PF-022 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify multiple DSPAs sharing external Argo | +| **Test Steps** |
  1. Create DSPAs in different namespaces
  2. Configure all for external Argo
  3. Execute pipelines simultaneously
  4. Verify namespace isolation
  5. Check resource sharing and conflicts
| +| **Expected Results** | - Multiple DSPAs operate independently
- Proper namespace isolation maintained
- No pipeline interference or data leakage
- Resource sharing works correctly | + +## 3. Negative Functional Tests +This section covers error handling scenarios to make sure we are handling non-ideal cases within expectations + +### 3.1 Conflicting WorkflowController Detection + +| Test Case ID | TC-NF-001 | |-----------------------|-------------------------------------------------------------------------| +| **Test Case Summary** | Verify behavior with conflicting WorkflowController configurations | +| **Test Steps** |<br>
  1. Deploy DSPA with WorkflowController enabled
  2. Install external Argo on same cluster
  3. Attempt pipeline execution
  4. Document conflicts and behavior
  5. Test conflict resolution mechanisms
| +| **Expected Results** | - System behavior is predictable
- Appropriate warnings displayed
- No data corruption
- Clear guidance provided | + +### 3.1.1 Co-existing WorkflowController Event Conflicts + +| Test Case ID | TC-NF-001a | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Test DSP and External WorkflowControllers co-existing and competing for same events | +| **Test Steps** |
  1. Deploy DSPA with internal WorkflowController
  2. Install external Argo WorkflowController watching same namespaces
  3. Submit pipeline that creates Workflow CRs
  4. Monitor which controller processes the workflow
  5. Verify event handling and potential conflicts
  6. Test resource ownership and cleanup
| +| **Expected Results** | - Event conflicts properly identified
- Clear ownership of workflow resources
- No orphaned or stuck workflows
- Predictable controller behavior documented | + +### 3.2 Incompatible Argo Version + +| Test Case ID | TC-NF-002 | +|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify behavior with unsupported Argo versions | +| **Test Steps** |
  1. Install unsupported Argo version
  2. Configure DSPA for external Argo
  3. Attempt pipeline execution
  4. Document error messages
  5. Verify graceful degradation
| +| **Expected Results** | - Clear incompatibility errors
- Graceful failure without corruption
- Helpful guidance for resolution | + +### 3.3 Missing External Argo + +| Test Case ID | TC-NF-003 | +|-----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify behavior when external Argo unavailable | +| **Test Steps** |
  1. Configure DSPA for external Argo
  2. Stop/remove external Argo service
  3. Attempt pipeline submission
  4. Restore Argo and verify recovery
  5. Check data integrity
| +| **Expected Results** | - Clear error messages when Argo unavailable
- Graceful recovery when restored
- No permanent data loss | + +### 3.4 Invalid Pipeline Submissions + +| Test Case ID | TC-NF-004 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Test invalid pipeline handling with external Argo | +| **Test Steps** |
  1. Submit pipelines from `data/pipeline_files/invalid/`
  2. Verify appropriate error handling
  3. Check error message clarity
  4. Ensure no system instability
| +| **Expected Results** | - Invalid pipelines rejected appropriately
- Clear error messages provided
- System remains stable
- No resource leaks | + +### 3.5 Unsupported Configuration Detection + +| Test Case ID | TC-NF-005 | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify detection of unsupported individual DSPA WorkflowController disable | +| **Test Steps** |
  1. Set global WorkflowController management to Removed
  2. Attempt to create DSPA with individual `workflowController.deploy: false`
  3. Verify appropriate warning/error messages
  4. Test documentation guidance for users
  5. Ensure configuration is flagged as development-only
| +| **Expected Results** | - Unsupported configuration detected
- Clear warning messages displayed
- Documentation provides proper guidance
- Development-only usage clearly indicated | + +### 3.6 CRD Version Conflicts + +| Test Case ID | TC-NF-006 | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Test behavior with conflicting Argo CRD versions | +| **Test Steps** |
  1. Install DSP with specific Argo CRD version
  2. Install external Argo with different CRD version
  3. Attempt pipeline execution
  4. Verify conflict detection and resolution
  5. Test update-in-place mechanisms
| +| **Expected Results** | - CRD version conflicts detected
- Update-in-place works when compatible
- Clear error messages for incompatible versions
- No existing workflow corruption | + +### 3.7 Different RBAC Between DSP and External Argo + +| Test Case ID | TC-NF-007 | +|-----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Test DSP and external WorkflowController with different RBAC configurations | +| **Test Steps** |
  1. Configure DSP with cluster-level RBAC permissions
  2. Install external Argo with namespace-level RBAC restrictions
  3. Submit pipelines through DSP interface
  4. Verify RBAC conflicts and permission issues
  5. Test resource access and execution failures
  6. Document RBAC compatibility requirements
| +| **Expected Results** | - RBAC conflicts properly identified
- Clear error messages for permission issues
- Guidance provided for RBAC alignment
- No security violations or escalations | + +### 3.8 DSP with Incompatible Workflow Schema + +| Test Case ID | TC-NF-008 | +|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Test DSP behavior with incompatible workflow schema versions | +| **Test Steps** |
  1. Install external Argo with older workflow schema
  2. Configure DSP to use external Argo
  3. Submit pipelines with newer schema features
  4. Verify schema compatibility checking
  5. Test graceful degradation or error handling
  6. Document schema compatibility matrix
| +| **Expected Results** | - Schema incompatibilities detected
- Clear error messages about schema conflicts
- Graceful handling of unsupported features
- No workflow corruption or data loss | + +## 4. RBAC and Security Tests +Make sure that RBAC is enforced properly and users cannot misuse the cluster through a security hole + +### 4.1 Namespace-Level RBAC + +| Test Case ID | TC-RBAC-001 | |-----------------------|-------------------------------------------------------------------------| +| **Test Case Summary** | Verify RBAC with DSP cluster-level and Argo namespace-level access | +| **Test Steps** |<br>
  1. Configure DSP with cluster-level permissions
  2. Configure Argo with namespace-level restrictions
  3. Create users with different permission levels
  4. Test pipeline access and execution
  5. Verify permission boundaries
| +| **Expected Results** | - RBAC properly enforced at both levels
- Users limited to appropriate namespaces
- No unauthorized access to pipelines
- Permission escalation prevented | + +### 4.2 Service Account Integration + +| Test Case ID | TC-RBAC-002 | +|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify service account integration with external Argo | +| **Test Steps** |
  1. Configure custom service accounts
  2. Set specific RBAC permissions
  3. Execute pipelines with different service accounts
  4. Verify permission enforcement
  5. Test cross-namespace access controls
| +| **Expected Results** | - Service accounts properly integrated
- Permissions correctly enforced
- No unauthorized resource access
- Proper audit trail maintained | + +### 4.3 Workflow Visibility and Project Access Control + +| Test Case ID | TC-RBAC-003 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify workflows using external Argo are only visible to users with Project access | +| **Test Steps** |
  1. Create multiple Data Science Projects with different users
  2. Configure external Argo for all projects
  3. Execute pipelines from different projects
  4. Test workflow visibility across projects with different users
  5. Verify users can only see workflows from their accessible projects
  6. Test API access controls and UI filtering
  7. Verify external Argo workflows respect DSP project boundaries
| +| **Expected Results** | - Workflows only visible to users with project access
- Proper isolation between Data Science Projects
- API and UI enforce access controls correctly
- External Argo workflows respect DSP boundaries
- No cross-project workflow visibility | + +## 5. Boundary Tests +A type of performance test to confirm that our current limits on resources and artifacts are still handled properly + +### 5.1 Resource Limits + +| Test Case ID | TC-BT-001 | |-----------------------|-------------------------------------------| +| **Test Case Summary** | Verify behavior at resource boundaries | +| **Test Steps** |<br>
  1. Configure external Argo with resource limits
  2. Submit resource-intensive pipelines
  3. Monitor resource utilization
  4. Verify appropriate throttling
  5. Test recovery when resources available
| +| **Expected Results** | - Resource limits properly enforced
- Appropriate queuing/throttling behavior
- Clear resource constraint messages
- Graceful recovery when resources free | + +### 5.2 Large Artifact Handling + +| Test Case ID | TC-BT-002 | +|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify handling of large pipeline artifacts | +| **Test Steps** |
  1. Configure pipelines with large data artifacts
  2. Execute with external Argo
  3. Monitor storage and transfer performance
  4. Verify artifact integrity
  5. Test cleanup mechanisms
| +| **Expected Results** | - Large artifacts handled efficiently
- No data corruption or loss
- Acceptable transfer performance
- Proper cleanup after completion | + +### 5.3 High Concurrency + +| Test Case ID | TC-BT-003 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Test high concurrency scenarios | +| **Test Steps** |
  1. Submit multiple concurrent pipelines
  2. Monitor external Argo performance
  3. Verify all pipelines complete
  4. Check for resource contention
  5. Validate result consistency
| +| **Expected Results** | - High concurrency handled appropriately
- No pipeline failures due to contention
- Consistent execution results
- Stable system performance | + +## 6. Performance Tests +Load testing - to make sure that the change in Argo Workflows backend has no impact on the performance of components that are under our control + +### 6.1 Execution Performance Comparison + +| Test Case ID | TC-PT-001 | |-----------------------|------------------------------------------------------------| +| **Test Case Summary** | Compare performance between internal and external Argo | +| **Test Steps** |<br>
  1. Execute identical pipeline suite with internal WC
  2. Execute same suite with external Argo
  3. Measure execution times and resource usage
  4. Compare throughput and latency
  5. Document performance characteristics
| +| **Expected Results** | - Performance with external Argo acceptable
- No significant degradation vs internal WC
- Resource utilization within bounds
- Scalability maintained | + +### 6.2 Startup and Initialization + +| Test Case ID | TC-PT-002 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Measure DSPA startup time with external Argo | +| **Test Steps** |
  1. Measure DSPA creation time with internal WC
  2. Measure DSPA creation time with external Argo
  3. Compare initialization times
  4. Monitor resource usage during startup
  5. Document timing differences
| +| **Expected Results** | - Startup time with external Argo reasonable
- Initialization completes successfully
- Resource usage during startup acceptable
- No significant delays | + +## 7. Compatibility Matrix Tests +Based on the compatability matrix as defined in #Test Environments + +### 7.1 Current Version (N) Compatibility + +| Test Case ID | TC-CM-001 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Validate compatibility with current supported Argo version | +| **Test Steps** |
  1. Install current supported Argo version (e.g., 3.4.16)
  2. Configure DSPA for external Argo
  3. Execute comprehensive pipeline test suite
  4. Verify all features work correctly
  5. Document any limitations
| +| **Expected Results** | - Full compatibility with current version
- All pipeline features operational
- No breaking changes or issues
- Performance within acceptable range | + +### 7.2 Previous Version (N-1) Compatibility + +| Test Case ID | TC-CM-002 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Validate compatibility with previous supported Argo version | +| **Test Steps** |
  1. Install previous supported Argo version (e.g., 3.4.15)
  2. Configure DSPA for external Argo
  3. Execute comprehensive pipeline test suite
  4. Document compatibility differences
  5. Verify core functionality maintained
| +| **Expected Results** | - Core functionality works with N-1 version
- Any limitations clearly documented
- No critical failures or data loss
- Upgrade path available | + +### 7.2.1 Z-Stream Version Compatibility + +| Test Case ID | TC-CM-002a | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Validate compatibility with z-stream (patch) versions of Argo | +| **Test Steps** |
  1. Test current DSP with multiple z-stream versions of the same minor Argo release
  2. Example: test a DSP build pinned to Argo v3.4.17 libraries against external Argo v3.4.16, v3.4.17, and v3.4.18
  3. Execute standard pipeline test suite for each z-stream version
  4. Document any breaking changes in patch versions
  5. Verify backward and forward compatibility within minor version
| +| **Expected Results** | - Z-stream versions maintain compatibility
- No breaking changes in patch releases
- Smooth operation across patch versions
- Clear documentation of any exceptions | + +### 7.3 Version Matrix Validation + +| Test Case ID | TC-CM-003 | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Systematically validate compatibility matrix | +| **Test Steps** |
  1. For each version in the compatibility matrix (see the sketch below):
    a. Deploy specific Argo version
    b. Configure DSPA
    c. Execute standard test suite
    d. Document results and issues
  2. Update compatibility matrix
  3. Identify unsupported combinations
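A hypothetical harness configuration for step 1; nothing like this exists in the DSP test suite yet, and all version numbers are placeholders:

```yaml
# Illustrative version-matrix definition a test harness could iterate
# over: deploy the Argo version, configure the DSPA, run the suite.
compatibilityMatrix:
  - dsp: "2.5"             # DSP release under test (placeholder)
    argo: "3.4.16"         # current supported external Argo (N)
    expected: supported
  - dsp: "2.5"
    argo: "3.4.15"         # previous supported version (N-1)
    expected: supported
  - dsp: "2.5"
    argo: "3.3.10"         # combination outside the matrix
    expected: unsupported  # should be flagged, not silently accepted
```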
| +| **Expected Results** | - Compatibility matrix accurately reflects reality
- All supported versions documented
- Unsupported combinations identified
- Clear guidance for version selection | + +### 7.4 DSP and External Argo Co-existence Validation + +| Test Case ID | TC-CM-004 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Validate successful hello world pipeline with DSP and External Argo co-existing | +| **Test Steps** |
  1. Deploy DSPA with internal WorkflowController
  2. Install external Argo WorkflowController on same cluster
  3. Submit simple hello world pipeline through DSP
  4. Verify pipeline executes successfully using DSP controller
  5. Verify external Argo remains unaffected (e.g., via a direct submission as sketched below)
  6. Test pipeline monitoring and status reporting
  7. Validate artifact handling and logs access
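For step 5, submitting a plain Workflow directly to the external controller (bypassing DSP entirely) is a quick independence check. This is a standard Argo Workflows hello-world; only the namespace is an assumption:

```yaml
# Submitted straight to the external controller's watched namespace to
# confirm it still schedules work while DSP runs its own pipelines.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
  namespace: external-argo   # assumption: namespace of the external install
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: busybox
        command: [echo]
        args: ["hello from the external Argo controller"]
```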
| +| **Expected Results** | - Hello world pipeline executes successfully
- DSP WorkflowController processes the pipeline
- External Argo WorkflowController unaffected
- No resource conflicts or interference
- Pipeline status and logs accessible
- Artifacts properly stored and retrievable | + +### 7.5 API Server and WorkflowController Compatibility + +| Test Case ID | TC-CM-005 | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify DSP API Server compatibility with different external WorkflowController versions | +| **Test Steps** |
  1. Deploy DSP API Server with specific Argo library dependencies
  2. Install external Argo WorkflowController with different version
  3. Test API Server to WorkflowController communication
  4. Verify Kubernetes API interactions (CRs, status updates; see the sketch below)
  5. Test pipeline submission, execution, and status reporting
  6. Monitor for API compatibility issues or version mismatches
  7. Document API compatibility matrix
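For steps 4 and 5, the key assertion is that a Workflow CR created by the DSP API Server reaches a terminal status written back by the external controller. A sketch of the shape to assert on; the run name and label are illustrative:

```yaml
# A DSP-submitted Workflow after processing by the external
# WorkflowController; status.phase is written by the controller,
# so a terminal value proves the CR/status round trip works.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: iris-run-abc123              # illustrative run name
  namespace: my-data-science-project
  labels:
    pipeline/runid: abc123           # illustrative DSP-applied label
status:
  phase: Succeeded
  progress: "1/1"
```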
| +| **Expected Results** | - API Server communicates successfully with external WC
- Kubernetes API interactions work correctly
- Pipeline lifecycle management functions properly
- Status updates and monitoring work correctly
  - API compatibility documented and validated | + +## 8. Uninstall and Data Preservation Tests +Verify that data is preserved when the DSPA or Argo WorkflowController is uninstalled, so that operation resumes cleanly on the next deployment; this includes the use cases for the different deployment strategies + +### 8.1 DSPA Uninstall with External Argo + +| Test Case ID | TC-UP-001 | +|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify DSPA uninstall behavior with external Argo | +| **Test Steps** |
  1. Configure DSPA with external Argo (no internal WC)
  2. Execute multiple pipelines and generate data
  3. Delete DSPA
  4. Verify external Argo WorkflowController remains intact
  5. Verify DSPA-specific resources are cleaned up
  6. Check that pipeline history is appropriately handled
| +| **Expected Results** | - DSPA removes cleanly
- External Argo WorkflowController unaffected
- No impact on other DSPAs using same external Argo
- Pipeline data handling follows standard procedures | + +### 8.2 DSPA Uninstall with Internal WorkflowController + +| Test Case ID | TC-UP-002 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify standard DSPA uninstall with internal WorkflowController | +| **Test Steps** |
  1. Configure DSPA with internal WorkflowController
  2. Execute pipelines and generate data
  3. Delete DSPA
  4. Verify WorkflowController is removed with DSPA
  5. Verify proper cleanup of all DSPA components
  6. Ensure no external Argo impact
| +| **Expected Results** | - DSPA and WorkflowController removed completely
- Standard cleanup procedures followed
- No resource leaks or orphaned components
- External Argo installations unaffected | + +### 8.3 Data Preservation During WorkflowController Transitions + +| Test Case ID | TC-UP-003 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify data preservation during WorkflowController management transitions | +| **Test Steps** |
  1. Create DSPA with internal WC and execute pipelines
  2. Disable WC globally (transition to external Argo; see the toggle sketch below)
  3. Verify run history, artifacts, and metadata preserved
  4. Re-enable WC globally (transition back to internal)
  5. Verify all historical data remains accessible
  6. Test new pipeline execution in both states
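Steps 2 and 4 exercise the proposed platform-level switch. A sketch of where such a toggle could live on the DataScienceCluster resource; the `argoWorkflowsControllers` field and its states are assumptions of this proposal, not a shipped API:

```yaml
# Hypothetical global toggle: flipping managementState removes or
# restores DSP-managed WorkflowControllers across all DSPAs.
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
    datasciencepipelines:
      managementState: Managed
      argoWorkflowsControllers:
        managementState: Removed   # step 2: disable managed WCs globally
        # managementState: Managed # step 4: re-enable managed WCs
```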
| +| **Expected Results** | - Pipeline run history preserved across transitions
- Artifacts remain accessible
- Metadata integrity maintained
- New pipelines work in both configurations | + +### 8.4 WorkflowTemplates and CronWorkflows Preservation + +| Test Case ID | TC-UP-004 | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify preservation of WorkflowTemplates and CronWorkflows during DSP install/uninstall | +| **Test Steps** |
  1. Install external Argo and create WorkflowTemplates and CronWorkflows (examples below)
  2. Install DSP with BYOAW configuration
  3. Verify existing WorkflowTemplates and CronWorkflows remain intact
  4. Create additional WorkflowTemplates through DSP interface
  5. Uninstall DSP components
  6. Verify all WorkflowTemplates and CronWorkflows still exist
  7. Test functionality of preserved resources with external Argo
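Minimal examples of the step 1 resources whose survival is under test; both use standard Argo kinds, with the namespace as an assumption:

```yaml
# Pre-existing resources created before DSP is installed; both must
# survive DSP install and uninstall untouched.
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: preexisting-template
  namespace: external-argo
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: busybox
        command: [echo]
        args: ["template survived"]
---
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: preexisting-cron
  namespace: external-argo
spec:
  schedule: "0 * * * *"   # hourly
  workflowSpec:
    workflowTemplateRef:
      name: preexisting-template
```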
| +| **Expected Results** | - Pre-existing WorkflowTemplates and CronWorkflows preserved
- DSP-created templates also preserved during uninstall
- All preserved resources remain functional
- No data corruption or resource deletion
- External Argo can use all preserved templates | + +## 9. Migration and Upgrade Tests +Covers migration from internal to external WC and vice versa. Also covers upgrade of ODH and Argo versions + +### 9.1 DSP-Managed to External Migration + +| Test Case ID | TC-MU-001 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify migration from DSP-managed to external Argo | +| **Test Steps** |
  1. Create DSPA with internal WorkflowController
  2. Execute pipelines and accumulate data
  3. Install external Argo
  4. Disable internal WCs globally
  5. Verify data preservation and new execution
| +| **Expected Results** | - Migration completes without data loss
- Historical data remains accessible
- New pipelines use external Argo
- Artifacts and metadata preserved | + +### 9.2 External to DSP-Managed Migration + +| Test Case ID | TC-MU-002 | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify migration from external to DSP-managed Argo | +| **Test Steps** |
  1. Configure DSPA with external Argo
  2. Execute pipelines and verify data
  3. Re-enable internal WCs globally
  4. Remove external Argo configuration
  5. Verify continued operation
| +| **Expected Results** | - Migration to internal WC successful
- Pipeline history preserved
- New pipelines use internal WC
- No service interruption | + +### 9.3 ODH Upgrade Scenarios + +| Test Case ID | TC-MU-003 | +|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify ODH upgrade preserves external Argo setup | +| **Test Steps** |
  1. Configure ODH with external Argo
  2. Execute baseline pipeline tests
  3. Upgrade ODH to newer version
  4. Verify external Argo configuration intact
  5. Re-execute pipeline tests
| +| **Expected Results** | - Upgrade preserves BYOAW configuration
- External Argo continues working
- No functionality regression
- Configuration settings maintained | + +### 9.4 Argo Version Upgrade with External Installation + +| Test Case ID | TC-MU-004 | +|-----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify external Argo version upgrade scenarios | +| **Test Steps** |
  1. Configure DSPA with external Argo version N-1
  2. Execute baseline pipeline tests
  3. Upgrade external Argo to version N
  4. Verify compatibility matrix adherence
  5. Test pipeline execution post-upgrade
  6. Document any required ODH updates
| +| **Expected Results** | - External Argo upgrade completes successfully
- Compatibility maintained within support matrix
- Clear guidance for required ODH updates
- Pipeline functionality preserved | + +### 9.5 Independent Lifecycle Management + +| Test Case ID | TC-MU-005 | +|-----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify independent lifecycle management of ODH and external Argo | +| **Test Steps** |
  1. Install and configure ODH with external Argo
  2. Perform independent upgrade of external Argo installation
  3. Verify ODH continues operating without issues
  4. Perform independent upgrade of ODH
  5. Verify external Argo continues operating without issues
  6. Test independent scaling of each component
  7. Verify independent maintenance and restart scenarios
| +| **Expected Results** | - Independent upgrades work without mutual interference
- Each component maintains functionality during the other's maintenance
- Scaling operations work independently
- No forced coupling of upgrade/maintenance schedules
  - Clear documentation of independence boundaries | + +## 10. Miscellaneous Tests +Tests that were not covered in the sections above and that do not fall under a specific category + +### 10.1 Platform-Level CRD and RBAC Management + +| Test Case ID | TC-MT-001 | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify platform-level Argo CRDs and RBAC remain intact with external Argo | +| **Test Steps** |
  1. Install DSPO, which creates platform-level Argo CRDs and RBAC
  2. Install external Argo with different CRD versions
  3. Toggle the global WorkflowController disable setting
  4. Verify platform CRDs are not removed
  5. Test that user modifications to CRDs are preserved
  6. Verify RBAC conflicts are handled appropriately (see the scoped-Role sketch below)
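One concrete check for step 6 and for the project-scoped visibility requirement: the broadest grant a project user holds on workflow resources should be namespace-scoped, along the lines of this standard Role (names are illustrative):

```yaml
# Namespace-scoped read access to Argo workflow resources; a project
# user holding only this Role cannot see workflows in other projects.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: project-workflow-viewer      # illustrative name
  namespace: my-data-science-project
rules:
  - apiGroups: ["argoproj.io"]
    resources: ["workflows", "workflowtemplates", "cronworkflows"]
    verbs: ["get", "list", "watch"]
```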
| +| **Expected Results** | - Platform-level CRDs remain intact
- User CRD modifications preserved
- RBAC conflicts resolved without breaking functionality
- Platform operator doesn't overwrite user changes | + +### 10.2 Sub-Component Removal Testing + +| Test Case ID | TC-MT-002 | +|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify sub-component removal functionality for WorkflowControllers | +| **Test Steps** |
  1. Deploy DSPA with WorkflowController enabled
  2. Execute pipelines and accumulate run data
  3. Disable WorkflowController globally
  4. Verify WorkflowController is removed but data preserved
  5. Verify backing data (run details, metrics) remains intact
  6. Test re-enabling WorkflowController preserves historical data
| +| **Expected Results** | - WorkflowController removed cleanly
- Run details and metrics preserved
- Historical pipeline data remains accessible
  - Re-enabling restores full functionality | + +### 10.3 Pre-existing Argo Detection and Prevention + +| Test Case ID | TC-MT-003 | +|-----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify detection and prevention of DSPA creation when pre-existing Argo exists | +| **Test Steps** |
  1. Install external Argo Workflows on cluster
  2. Install ODH DSP operator
  3. Attempt to create DSPA with default configuration (WC enabled)
  4. Verify the detection mechanism identifies the pre-existing Argo (a possible status surface is sketched below)
  5. Test prevention of DSPA creation or automatic WC disable
  6. Verify appropriate warning/guidance messages
  7. Test manual override if supported
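One way steps 4-6 could surface to the user is a status condition on the DSPA (or an admission-time rejection). The condition type, reason, and message below are purely illustrative assumptions:

```yaml
# Hypothetical DSPA status when a pre-existing Argo is detected while
# a DSP-managed WorkflowController is requested.
status:
  conditions:
    - type: WorkflowControllerConflict   # illustrative condition type
      status: "True"
      reason: PreexistingArgoDetected
      message: >-
        An existing Argo Workflows installation was detected on this
        cluster; create the DSPA with the managed WorkflowController
        disabled, or remove the external installation.
```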
| +| **Expected Results** | - Pre-existing Argo installation detected
- DSPA creation prevented or WC automatically disabled
- Clear guidance provided to user
- Manual override works when applicable
  - No conflicts or resource competition | + +### 10.4 CRD Update-in-Place Testing + +| Test Case ID | TC-MT-004 | +|-----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **Test Case Summary** | Verify CRD update-in-place when differences exist between pre-existing and shipped CRDs | +| **Test Steps** |
  1. Install external Argo with specific CRD version
  2. Create Workflows, WorkflowTemplates, and CronWorkflows
  3. Install DSP with different compatible CRD version
  4. Verify CRDs are updated in-place (see the sketch below)
  5. Verify existing CRs (Workflows, WorkflowTemplates, CronWorkflows) remain intact
  6. Test new CR creation with updated CRD schema
  7. Verify no data loss or corruption
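For step 4, standard Kubernetes object metadata distinguishes an in-place update from a delete-and-recreate: `metadata.uid` and `creationTimestamp` must match their pre-install values, while `metadata.generation` increments with the schema change. A sketch of the fields to compare (values are placeholders):

```yaml
# workflows.argoproj.io CRD after DSP install: same object identity,
# new generation. A changed uid would indicate a destructive recreate.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: workflows.argoproj.io
  uid: 0d6c9d61-0000-0000-0000-placeholder   # must equal pre-install value
  creationTimestamp: "2024-01-01T00:00:00Z"  # must equal pre-install value
  generation: 2                              # incremented by the in-place update
```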
| +| **Expected Results** | - CRDs updated in-place successfully
- Existing Workflows, WorkflowTemplates, CronWorkflows preserved
- New CRs work with updated schema
- No data loss or corruption
  - Compatibility maintained | + +## 11. Initiative Level Tests +These tests verify that integrating this feature with other product components does not introduce regressions. They should be the very last tests run, after verifying there is no regression when used with the latest RHOAI release of the other product components + +| Test Case ID | TC-IL-001 | +|-----------------------|----------------------------------------------------------------| +| **Test Case Summary** | Verify that Iris Pipeline Runs on a **standard** RHOAI cluster | +| **Test Steps** |
  1. Run an IRIS pipeline
| +| **Expected Results** | Verify that the pipeline run succeeds | + +| Test Case ID | TC-IL-002 | +|-----------------------|--------------------------------------------------------------------| +| **Test Case Summary** | Verify that Iris Pipeline Runs on a **FIPS Enabled** RHOAI cluster | +| **Test Steps** |
  1. Run an IRIS pipeline
| +| **Expected Results** | Verify that the pipeline run succeeds | + +| Test Case ID | TC-IL-003 | +|-----------------------|--------------------------------------------------------------------| +| **Test Case Summary** | Verify that Iris Pipeline Runs on a **Disconnected** RHOAI cluster | +| **Test Steps** |
  1. Run an IRIS pipeline
| +| **Expected Results** | Verify that the pipeline run succeeds | + +## Success Criteria + +### Must Have +- All positive functional tests pass without failures +- Compatibility matrix validation complete for N and N-1 versions +- Z-stream (patch) version compatibility validated +- Migration scenarios preserve data integrity +- Security and RBAC properly enforced +- Performance within acceptable bounds (no >20% degradation) +- Platform-level CRD and RBAC management works correctly +- Data preservation during WorkflowController transitions +- Sub-component removal functionality validated +- Pre-existing Argo detection and prevention working +- CRD update-in-place functionality validated +- WorkflowTemplates and CronWorkflows preservation confirmed +- API Server to WorkflowController compatibility verified +- Workflow visibility and project access controls enforced + +### Should Have +- Negative test scenarios handled gracefully +- Clear error messages for all failure modes +- Unsupported configuration detection functional +- CRD version conflict resolution working +- RBAC conflict detection and resolution +- Schema compatibility validation working +- Co-existence scenarios validated successfully +- Independent lifecycle management validated +- Documentation complete and accurate +- Uninstall scenarios preserve external Argo integrity + +### Could Have +- Performance optimizations for external Argo scenarios +- Enhanced monitoring and observability +- Additional version compatibility beyond N-1 +- Automated detection of conflicting configurations +- Advanced CRD update-in-place mechanisms + +## Risk Assessment + +### High Risk +- Data loss during migration scenarios +- Security vulnerabilities in multi-tenant setups +- Performance degradation with external Argo +- Incompatibility with future Argo versions + +### Medium Risk +- Complex configuration management +- Upgrade complications +- Resource contention in shared scenarios +- Error handling gaps + +### Low Risk +- Minor UI/UX inconsistencies +- Documentation completeness +- Non-critical performance variations +- Edge case handling + +## Test Deliverables + +1. **Test Execution Reports** - Detailed results for each test phase with comprehensive coverage +2. **Enhanced Compatibility Matrix** - Validated version combinations including Z-stream compatibility and API compatibility +3. **Performance Benchmarks** - Comparative analysis of internal vs external Argo across all scenarios +4. **Comprehensive Security Assessment** - RBAC and isolation validation including project access controls +5. **Migration Documentation** - Complete procedures for all migration scenarios and lifecycle management +6. **Data Preservation Guidelines** - Best practices for maintaining data integrity during all transitions +7. **Uninstall Procedures** - Validated procedures for clean removal preserving WorkflowTemplates and CronWorkflows +8. **CRD Management Guidelines** - Platform-level CRD update-in-place and conflict resolution procedures +9. **Pre-existing Argo Detection Guide** - Implementation and configuration of detection mechanisms +10. **Configuration Validation Guide** - Detection and resolution of all unsupported configurations +11. **RBAC Compatibility Matrix** - Comprehensive guidelines for DSP and external Argo RBAC alignment +12. **Schema Compatibility Guide** - Workflow schema version compatibility and API compatibility matrix +13. **Co-existence Best Practices** - Detailed recommendations for running DSP and external Argo together +14. **Z-Stream Testing Strategy** - Framework for ongoing patch version compatibility validation +15. **API Compatibility Documentation** - DSP API Server to external WorkflowController compatibility guidelines +16. **Independent Lifecycle Management Guide** - Best practices for managing ODH and Argo independently +17. **Known Issues Log** - Comprehensive documentation of limitations and workarounds +18. **Final Test Report** - Executive summary with recommendations, lessons learned, and future testing strategy + + +## Test Implementation/Execution Phases +### Phase 1 +List Test Cases to be executed/implemented as part of this phase + +### Phase 2 +List Test Cases to be executed/implemented as part of this phase + +### Phase 3 +Full End to End tests for that specific RHOAI release (with the `latest` of all products) as covered in #initiative_level_tests section