@arun-thmn
Contributor

With the recent changes to pcl-tiergarten, v100 support is no longer available, so this PR updates the CI salloc partition.
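
As a rough illustration of what such a partition change amounts to (this is not the actual diff from this PR), the sketch below builds a `salloc` command line with the partition as a parameter. The function name `build_salloc_cmd`, the default partition name `a100`, and the single-node request are assumptions for illustration; only `--partition` and `--nodes` are standard `salloc` options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble the salloc command for a CI GPU job.
# The default partition name "a100" is an assumption based on this thread.
build_salloc_cmd() {
  local partition="${1:-a100}"
  # Only print the command; a CI script would execute it instead.
  echo "salloc --partition=${partition} --nodes=1"
}

build_salloc_cmd
build_salloc_cmd hopper
```

Keeping the partition in one parameter means a later move to another partition touches a single default rather than every call site.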

@arun-thmn
Contributor Author

Hi @rengolin @adam-smnk @rolfmorel @shahidact, should we rely on a100 alone, or can we move to the hopper partition available in pcl-tiergarten?

@arun-thmn arun-thmn requested a review from adam-smnk September 16, 2025 12:37
@arun-thmn arun-thmn marked this pull request as ready for review September 16, 2025 12:38
@adam-smnk
Contributor

The two a100 nodes should be fine.
We'll revise it in the future if we ever have problems with availability.

@adam-smnk
Contributor

Not exactly sure why it fails.
Maybe something clashes with the old cached LLVM build? You could try adding this change to the LLVM bump PR and see if it runs.

Also, now that CUDA 13 is available, we should be able to use the system's default gcc.
You could try tweaking setup_gpu_env.sh to stop sourcing the older version.
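
A minimal sketch of how such a tweak could look, assuming the script can see the CUDA version as a string. The contents of setup_gpu_env.sh are not shown in this thread, so the function name `needs_older_gcc` and the version strings are hypothetical; the threshold of 13 comes from the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical check: does this CUDA version still need an older gcc?
# Assumes versions below 13 require it, as the comment above suggests.
needs_older_gcc() {
  local cuda_version="$1"            # e.g. "12.4" or "13.0"
  local major="${cuda_version%%.*}"  # keep only the major version
  [ "$major" -lt 13 ]
}

if needs_older_gcc "13.0"; then
  echo "source older gcc environment"
else
  echo "use system gcc"
fi
```

Gating on the version rather than deleting the sourcing step outright would keep the script usable on machines that still have an older CUDA installed.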

@adam-smnk
Contributor

Hmm, I see we also change the gcc version in build_tpp.sh.
Then maybe try a fresh LLVM build with a bump first, before changing tooling versions.

@arun-thmn
Contributor Author

arun-thmn commented Sep 16, 2025

> Hmm, I see we also change the gcc version in build_tpp.sh. Then maybe try a fresh LLVM build with a bump first, before changing tooling versions.

I think these errors are happening because of the updates to pcl-tiergarten. This PR points to the old LLVM bump, and older PRs from before the tiergarten upgrade passed. In particular, this fails on pcl-spr* machines but passes on pcl-sprh*.

Will investigate.
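
When separating the two machine groups in a script, one detail to watch is that the glob `pcl-spr*` also matches `pcl-sprh*` names, so the more specific pattern has to be tested first. A small sketch, where `classify_host` and the example hostnames are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: bucket a hostname into the two groups from this thread.
# The pcl-sprh* arm must come first, since pcl-spr* would match it too.
classify_host() {
  case "$1" in
    pcl-sprh*) echo "sprh" ;;
    pcl-spr*)  echo "spr" ;;
    *)         echo "other" ;;
  esac
}

classify_host "pcl-sprh01"
classify_host "pcl-spr01"
```

With the arms in the opposite order, every `pcl-sprh*` host would be reported as `spr`, which would hide exactly the difference being investigated.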

@arun-thmn
Contributor Author

Merged these changes into PR #1091, so I'm closing this one.

@arun-thmn arun-thmn closed this Sep 17, 2025