-
Hello, I am a big fan of Backend.AI's fractional GPU technology! Since I first encountered it around October 2023, I have been thinking a lot about how it is implemented. For example, suppose each of three physical GPUs (pGPUs) has 0.1 of its capacity remaining, and a requested fGPU is equivalent to 0.3 of a pGPU. I heard that the container would then see what looks like a single GPU of size 0.3, even though three physical GPUs are actually allocated to it (see the rough sketch at the end of this post). As a result, users supposedly don't need to change their training code at all; they don't even need to modify their model code to work with multiple GPUs. Is this correct?

I have heard many rumors from external sources about how the fractional GPU technology is implemented, and I figured the only way to find out whether they are true is to ask directly. Since this is Backend.AI's proprietary technology, there may be parts that are difficult to answer... but I would be really grateful for even a small response!
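To make the rumor concrete, here is a rough Python sketch of what I imagined; every name in it is made up by me, and I have no idea whether anything like this resembles the real implementation:

```python
# Purely my own guess at the rumored "merge" behavior, NOT Backend.AI code.
# merge_fractions() gathers leftover fractions from several pGPUs until the
# requested fGPU size is covered, so the container would see "one" 0.3 GPU.
from fractions import Fraction

def merge_fractions(remaining, requested):
    """Return a hypothetical mapping of pGPU index -> share taken."""
    picked = {}
    need = requested
    for idx, free in enumerate(remaining):
        if need <= 0:
            break
        take = min(free, need)
        if take > 0:
            picked[idx] = take
            need -= take
    if need > 0:
        raise RuntimeError("not enough total capacity across pGPUs")
    return picked

# Three pGPUs with 0.1 free each, and a 0.3 fGPU request:
print(merge_fractions([Fraction(1, 10)] * 3, Fraction(3, 10)))
# -> {0: Fraction(1, 10), 1: Fraction(1, 10), 2: Fraction(1, 10)}
```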
-
As you said, Backend.AI uses fGPU to allow you to use multiple pGPU allocations without changing your code.
When fGPU is enabled, Backend.AI splits large GPUs into smaller GPUs, but does not "merge" smaller GPUs into a large one. That said, it does help with automatic configuration of multi-node, multi-GPU workloads: when it splits GPUs to satisfy the resource requirements, it provides GPU-config environment variables along with homogeneously sized fractions.
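To illustrate the split-only semantics, here is a minimal Python sketch. It is only an illustration under simplified assumptions (a greedy first-fit placement); the function and variable names are invented for this example, and it is not our actual scheduler code:

```python
# Illustrative sketch only: a split-only fractional allocator. The request
# must fit inside a single pGPU's remaining capacity; leftover fractions on
# different pGPUs are never stitched together into one fGPU.
from fractions import Fraction

def allocate_fgpu(remaining, requested):
    """First-fit placement of the whole request onto one pGPU."""
    for idx, free in enumerate(remaining):
        if free >= requested:
            remaining[idx] -= requested
            return idx
    raise RuntimeError("no single pGPU can hold the requested fraction")

# Splitting works: two 0.3 fGPUs carved out of one whole pGPU.
caps = [Fraction(1)]
print(allocate_fgpu(caps, Fraction(3, 10)))  # -> 0, leaving 0.7 free
print(allocate_fgpu(caps, Fraction(3, 10)))  # -> 0, leaving 0.4 free

# Merging does not: three pGPUs with 0.1 free each cannot serve 0.3.
caps = [Fraction(1, 10)] * 3
try:
    allocate_fgpu(caps, Fraction(3, 10))
except RuntimeError as exc:
    print(exc)
```

The GPU-config environment variables mentioned above would be things like CUDA_VISIBLE_DEVICES on NVIDIA devices, which is how frameworks discover the homogeneously sized fractions without any code changes.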