Bug: MIG removal not in order of placement makes dynamic allocation less dynamic #933

@Pavel-Okruhlica-SZN

Summary:
When MIG devices are removed, they are sometimes removed from arbitrary Placement positions rather than in left-to-right (or reverse) order. This can leave a GPU unable to create a MIG profile even though it has enough free resources, because those resources are not in the right "place".
For example:
# nvidia-smi mig -lgi -i 3

+---------------------------------------------------------+
| GPU instances:                                          |
| GPU   Name               Profile  Instance   Placement  |
|                            ID       ID       Start:Size |
|=========================================================|
|   3  MIG 1g.10gb           19        7          0:1     |
+---------------------------------------------------------+
|   3  MIG 1g.10gb           19        8          1:1     |
+---------------------------------------------------------+
|   3  MIG 1g.10gb           19       12          5:1     |
+---------------------------------------------------------+

# nvidia-smi mig -lgip -i 3

+-------------------------------------------------------------------------------+
| GPU instance profiles:                                                        |
| GPU   Name               ID    Instances   Memory     P2P    SM    DEC   ENC  |
|                                Free/Total   GiB              CE    JPEG  OFA  |
|===============================================================================|
|   3  MIG 1g.10gb         19     4/7        9.75       No     14     1     0   |
|                                                               1     1     0   |
+-------------------------------------------------------------------------------+
|   3  MIG 1g.10gb+me      20     1/1        9.75       No     14     1     0   |
|                                                               1     1     1   |
+-------------------------------------------------------------------------------+
|   3  MIG 1g.20gb         15     2/4        19.62      No     14     1     0   |
|                                                               1     1     0   |
+-------------------------------------------------------------------------------+
|   3  MIG 2g.20gb         14     1/3        19.62      No     30     2     0   |
|                                                               2     2     0   |
+-------------------------------------------------------------------------------+
|   3  MIG 3g.40gb          9     0/2        39.50      No     46     3     0   |
|                                                               3     3     0   |
+-------------------------------------------------------------------------------+
|   3  MIG 4g.40gb          5     0/1        39.50      No     62     4     0   |
|                                                               4     4     0   |
+-------------------------------------------------------------------------------+
|   3  MIG 7g.80gb          0     0/1        79.25      No     114    7     0   |
|                                                               8     7     1   |
+-------------------------------------------------------------------------------+

Having one of the 1g.10gb MIG devices at Placement Start = 5 leaves only 3 free compute positions between it and the previous MIG device, which makes the 4g.40gb profile unusable (it could have been usable if that device were at Placement Start = 2). Why the 3g.40gb profile is not usable either is unclear to me (maybe the memory slices cannot overlap?).
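
The fragmentation described above can be illustrated with a small sketch that computes the contiguous free runs left by a set of `Start:Size` placements. The function name `free_runs` and the default of 7 slices are assumptions made for illustration only. Contiguous free space is a necessary condition at best: the placements a profile actually accepts are defined by the hardware and may be restricted to specific start offsets, which `nvidia-smi mig -lgipp` lists.

```python
def free_runs(occupied, total_slices=7):
    """Return (start, size) for each contiguous run of free slices.

    occupied: iterable of (start, size) pairs, as shown in the
    Placement Start:Size column of `nvidia-smi mig -lgi`.
    total_slices: assumed slice count for this illustration.
    """
    used = set()
    for start, size in occupied:
        used.update(range(start, start + size))

    runs = []
    run_start = None
    # One extra iteration past the end closes a trailing free run.
    for i in range(total_slices + 1):
        free = i < total_slices and i not in used
        if free and run_start is None:
            run_start = i
        elif not free and run_start is not None:
            runs.append((run_start, i - run_start))
            run_start = None
    return runs


gpu3 = [(0, 1), (1, 1), (5, 1)]        # placements reported for GPU 3
compacted = [(0, 1), (1, 1), (2, 1)]   # same devices, hypothetically packed left

print(free_runs(gpu3))       # [(2, 3), (6, 1)]
print(free_runs(compacted))  # [(3, 4)]
```

With the reported placements the largest free run is 3 slices wide, while the hypothetical packed layout would leave a single run of 4.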

Steps to reproduce:
Create a Deployment with 28 pods, each requesting a 1g.10gb MIG profile:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: abstract-mig-claim
  namespace: nvidia-dra-driver-gpu
  labels:
    app: abstract-mig-claim
spec:
  replicas: 28
  selector:
    matchLabels:
      app: abstract-mig-claim
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: abstract-mig-claim
    spec:
      restartPolicy: Always
      containers:
        - name: abstract-mig-claiming-pod
          image: cuda:13.1.1-runtime-ubuntu24.04
          command: ["sleep", "6000"]
          resources:
            claims:
            - name: mig-device
              request: mig-10gb
      resourceClaims:
        - name: mig-device
          resourceClaimTemplateName: at-least-10gb-mig-template

Reduce spec.replicas to 14, then check the output of nvidia-smi mig -lgi:

+---------------------------------------------------------+
| GPU instances:                                          |
| GPU   Name               Profile  Instance   Placement  |
|                            ID       ID       Start:Size |
|=========================================================|
|   0  MIG 1g.10gb           19        7          4:1     |
+---------------------------------------------------------+
|   0  MIG 1g.10gb           19        8          5:1     |
+---------------------------------------------------------+
|   0  MIG 1g.10gb           19        9          6:1     |
+---------------------------------------------------------+
|   0  MIG 1g.10gb           19       11          0:1     |
+---------------------------------------------------------+
|   0  MIG 1g.10gb           19       12          1:1     |
+---------------------------------------------------------+
|   0  MIG 1g.10gb           19       13          2:1     |
+---------------------------------------------------------+
|   0  MIG 1g.10gb           19       14          3:1     |
+---------------------------------------------------------+
|   1  MIG 1g.10gb           19        8          5:1     |
+---------------------------------------------------------+
|   1  MIG 1g.10gb           19       14          3:1     |
+---------------------------------------------------------+
|   2  MIG 1g.10gb           19       11          0:1     |
+---------------------------------------------------------+
|   2  MIG 1g.10gb           19       12          1:1     |
+---------------------------------------------------------+
|   3  MIG 1g.10gb           19        7          0:1     |
+---------------------------------------------------------+
|   3  MIG 1g.10gb           19        8          1:1     |
+---------------------------------------------------------+
|   3  MIG 1g.10gb           19       12          5:1     |
+---------------------------------------------------------+

Multiple MIG devices remain at out-of-order Placement Start positions.
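
To spot such layouts without reading the table by eye, a rough sketch like the following could parse the `nvidia-smi mig -lgi` text and report, per GPU, the free slices that lie between occupied ones. The regular expression is written against the table layout shown in this issue and is an assumption about the output format. The helper name `gaps_by_gpu` is hypothetical.

```python
import re
from collections import defaultdict

# Matches data rows such as:
# |   3  MIG 1g.10gb           19       12          5:1     |
ROW = re.compile(
    r"^\|\s+(\d+)\s+MIG\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+):(\d+)\s+\|$"
)


def gaps_by_gpu(lgi_output):
    """Map GPU index -> sorted free slices lying between occupied slices."""
    used = defaultdict(set)
    for line in lgi_output.splitlines():
        m = ROW.match(line.strip())
        if m:
            gpu = int(m.group(1))
            start, size = int(m.group(5)), int(m.group(6))
            used[gpu].update(range(start, start + size))
    return {
        gpu: sorted(set(range(min(s), max(s) + 1)) - s)
        for gpu, s in used.items()
    }


sample = """\
|   1  MIG 1g.10gb           19        8          5:1     |
|   1  MIG 1g.10gb           19       14          3:1     |
|   2  MIG 1g.10gb           19       11          0:1     |
|   2  MIG 1g.10gb           19       12          1:1     |
|   3  MIG 1g.10gb           19        7          0:1     |
|   3  MIG 1g.10gb           19        8          1:1     |
|   3  MIG 1g.10gb           19       12          5:1     |
"""

print(gaps_by_gpu(sample))  # {1: [4], 2: [], 3: [2, 3, 4]}
```

Only slices between the lowest and highest occupied positions count as gaps here, so free space at either edge of a GPU (for example slices 0 to 2 on GPU 1) is not reported. On a live node, the text captured from `nvidia-smi mig -lgi` could be passed in place of `sample`.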
