Description
Time it takes to complete the test suite, roughly:
- 7 minutes with `nvidia.com/gpu` allocation done by the device plugin
- 12 minutes with `nvidia.com/gpu` allocation done via `DRAExtendedResource`
The difference (roughly 5 minutes, or about 70% longer) is significant and generally reproducible.
To a certain extent, a difference is expected: the DRA flow involves more work, and more components to coordinate.
Maybe everything already flows as fast as it can with DRA. Still, I think it would be very interesting to run a proper distributed profiling exercise at some point, to see precisely how much time we spend at each step of the information flow. There is probably a way to make some hops snappier, with better event propagation or a few tweaks here and there.
Performance optimization is clearly less important than achieving correctness and doing simplification work. However, to support wide adoption of GPU allocation via DRA, we have to understand the performance implications for users who migrate from the device plugin world to DRA.
