
Collaborator

abussy commented Jul 8, 2025

SLURM_GPUS_PER_TASK seems to be deprecated on Alps. The current submission script fails because the variable is empty and is passed to torch as an empty string. SLURM_GPUS_ON_NODE returns the correct number of GPUs.
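For illustration only, here is a minimal sketch of how a launcher could guard against the empty value described above. It is not the change made in this PR: the function name `resolve_gpu_count` is hypothetical, and keeping SLURM_GPUS_PER_TASK as a fallback is an assumption rather than something the submission script does.

```python
import os


def resolve_gpu_count(env=None):
    """Return the number of GPUs to use on this node, read from Slurm variables.

    Hypothetical helper: the name and the fallback order are assumptions.
    """
    env = os.environ if env is None else env
    # Prefer SLURM_GPUS_ON_NODE (the variable this PR switches to) and fall
    # back to SLURM_GPUS_PER_TASK only if the former is missing or empty.
    for name in ("SLURM_GPUS_ON_NODE", "SLURM_GPUS_PER_TASK"):
        value = env.get(name, "").strip()
        if value.isdigit() and int(value) > 0:
            return int(value)
    # Fail early with a clear message instead of handing an empty string
    # to the training launcher.
    raise RuntimeError(
        "Neither SLURM_GPUS_ON_NODE nor SLURM_GPUS_PER_TASK holds a positive integer"
    )
```

Calling `resolve_gpu_count()` inside a job reads the real environment; passing a plain dict, as in `resolve_gpu_count({"SLURM_GPUS_ON_NODE": "4"})`, makes the behavior easy to check outside Slurm.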

@abussy abussy requested a review from lukasgd July 8, 2025 16:06


github-actions bot commented Jul 8, 2025

preview available: https://docs.tds.cscs.ch/191

Collaborator

msimberg commented Jul 9, 2025

You can ignore the spell checker for now. I'm fixing typos and adding a bunch of words to the whitelist in separate PRs (e.g. #192). Bear with me for a moment...

That said, if it points out a real typo, please do fix it 😉 (That is not the case for the changes introduced here; the checker runs on the whole changed file, though.)

Contributor

lukasgd commented Jul 9, 2025

Thanks, confirming the change is necessary. Additionally, we'll need to pin the nanotron version and move the tutorials to a more visible place, but that can be done in a follow-up PR.

@msimberg msimberg added this pull request to the merge queue Jul 9, 2025
Merged via the queue into eth-cscs:main with commit 9462399 Jul 9, 2025
2 of 3 checks passed
3 participants