[TRACKING] bug: GPU E2E tests are queued on the A10.2 runner due to a capacity issue #3336

@jaiakash

Hi, due to a host capacity failure, the GPU E2E tests, which use the A10.2 runner, are stuck in a long queue. This is the tracking issue for that.

Context:
We recently migrated our GPU E2E tests from A10.1 to A10.2, but there is a limit of 20 GPUs on the CNCF side, and due to some issue the runner is currently hitting out-of-capacity failures.

Related Slack threads:
Kubeflow Slack - https://cloud-native.slack.com/archives/C0742LDFZ4K/p1773344737522609
#cncf-ci-infra - https://cloud-native.slack.com/archives/C08P4HUFQ6M/p1773430783406709

cc @andreyvelich @XploY04 @Goku2099

/priority p2
