@neumanbrett
Contributor

Deleted the "Shared vs. Exclusive" sections and distilled them into an example with a tip on how these options impact performance or queue times.

Added sections for each node type. The HTC and GPGPU sections feel a bit basic, so I'm open to feedback on whether to add detail or just remove them. The largemem and vis sections have a few tricks that are worth keeping.

@neumanbrett neumanbrett requested a review from vanderwb November 20, 2025 23:45
Collaborator

@vanderwb vanderwb left a comment

Sorry it took so long to get a review in here. I agree those sections are a bit basic, and I can't think of much to add at the moment, but I think we should keep them and expand on them as we think of useful guidance to add.

Thanks for fixing this up!

The L40 nodes are also capable of basic GPGPU tasks like AI inference and are less utilized than the nodes within the GPGPU queue. This could significantly reduce your wait time in the queue.

!!! info
Data and Visualization node type requests will be submitted to the `vis` queue and these GPUs are shared among multiple users on a node. The maximum selectable GPUs is one and exclusive access cannot be guaranteed on the Data and Visualization node types. If you need exclusive node access, use the ML & GPGPU node types.
Collaborator

I would word this as "the maximum selectable ngpus is one"

Instead of "GPUs", use the resource name `ngpus` to make clear that the limit applies to the number of GPUs requested in the select statement, which is a max of 1.
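
To make the `ngpus` point concrete, here is a minimal sketch of checking a PBS select string against that limit. The helper names (`ngpus_per_chunk`, `fits_vis_limit`) and the example select strings are hypothetical and not part of the docs or of PBS. The sketch also assumes the limit applies to the `ngpus` value written in each chunk, which the text above does not spell out.

```python
import re

# Limit quoted in the admonition under review: at most one GPU selectable.
MAX_VIS_NGPUS = 1


def ngpus_per_chunk(select: str) -> list[int]:
    """Return the ngpus value of each '+'-separated chunk (0 if not requested)."""
    counts = []
    for chunk in select.split("+"):
        # Match "ngpus=<int>" only as a whole resource name within the chunk.
        match = re.search(r"(?:^|:)ngpus=(\d+)", chunk)
        counts.append(int(match.group(1)) if match else 0)
    return counts


def fits_vis_limit(select: str) -> bool:
    """True if no chunk asks for more than MAX_VIS_NGPUS GPUs."""
    return all(n <= MAX_VIS_NGPUS for n in ngpus_per_chunk(select))


# Illustrative select strings; the ncpus values are arbitrary.
print(fits_vis_limit("1:ncpus=4:ngpus=1"))
print(fits_vis_limit("1:ncpus=4:ngpus=2"))
```

Under those assumptions, a request with `ngpus=1` passes the check and one with `ngpus=2` does not, which is the behavior the suggested "maximum selectable ngpus is one" wording describes.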
@neumanbrett neumanbrett merged commit d19be0f into NCAR:main Jan 7, 2026
5 checks passed