Commit d1e11b3

Support additional GPU config values (#90)
* Support additional GPU config values
  See https://qdrant.tech/documentation/guides/running-with-gpu/
* Update docs
* Add validation
1 parent 44faa47 commit d1e11b3

5 files changed: +160 -5 lines changed


api/v1/qdrantcluster_types.go

Lines changed: 36 additions & 1 deletion
@@ -181,9 +181,44 @@ func (s QdrantClusterSpec) GetServicePerNode() bool {
 }

 type GPU struct {
-	// GPUType specifies the type of the GPU to use.
+	// GPUType specifies the type of the GPU to use. If set, GPU indexing is enabled.
 	// +kubebuilder:validation:Enum=nvidia;amd
 	GPUType GPUType `json:"gpuType"`
+	// ForceHalfPrecision for `f32` values while indexing.
+	// `f16` conversion will take place
+	// only inside GPU memory and won't affect storage type.
+	// +kubebuilder:default=false
+	ForceHalfPrecision bool `json:"forceHalfPrecision"`
+	// DeviceFilter for GPU devices by hardware name. Case-insensitive.
+	// List of substrings to match against the gpu device name.
+	// Example: [- "nvidia"]
+	// If not specified, all devices are accepted.
+	// +kubebuilder:validation:MinItems:=1
+	// +optional
+	DeviceFilter []string `json:"deviceFilter,omitempty"`
+	// Devices is a List of explicit GPU devices to use.
+	// If host has multiple GPUs, this option allows to select specific devices
+	// by their index in the list of found devices.
+	// If `deviceFilter` is set, indexes are applied after filtering.
+	// If not specified, all devices are accepted.
+	// +kubebuilder:validation:MinItems:=1
+	// +optional
+	Devices []string `json:"devices,omitempty"`
+	// ParallelIndexes is the number of parallel indexes to run on the GPU.
+	// +kubebuilder:default=1
+	// +kubebuilder:validation:Minimum:=1
+	ParallelIndexes int `json:"parallelIndexes"`
+	// GroupsCount is the amount of used vulkan "groups" of GPU.
+	// In other words, how many parallel points can be indexed by GPU.
+	// Optimal value might depend on the GPU model.
+	// Proportional, but doesn't necessary equal to the physical number of warps.
+	// Do not change this value unless you know what you are doing.
+	// +optional
+	// +kubebuilder:validation:Minimum:=1
+	GroupsCount int `json:"groupsCount,omitempty"`
+	// AllowIntegrated specifies whether to allow integrated GPUs to be used.
+	// +kubebuilder:default=false
+	AllowIntegrated bool `json:"allowIntegrated"`
 }

 func (g *GPU) GetGPUType() GPUType {
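
The JSON tags in this hunk determine which fields appear when a `GPU` value is serialized: the three fields tagged `omitempty` are dropped when left at their zero value, while the others are always emitted. The following is a minimal sketch of that behavior, not code from the commit. It uses a local copy of the struct, and it assumes `GPUType` is a string-based type (the diff does not show its definition). The `encode` helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GPU is a local copy of the struct from the diff, reproduced so the example
// compiles on its own. Assumption: GPUType is string-based, so a plain string
// stands in for it here.
type GPU struct {
	GPUType            string   `json:"gpuType"`
	ForceHalfPrecision bool     `json:"forceHalfPrecision"`
	DeviceFilter       []string `json:"deviceFilter,omitempty"`
	Devices            []string `json:"devices,omitempty"`
	ParallelIndexes    int      `json:"parallelIndexes"`
	GroupsCount        int      `json:"groupsCount,omitempty"`
	AllowIntegrated    bool     `json:"allowIntegrated"`
}

// encode is a hypothetical helper that returns the JSON form of a GPU value.
func encode(g GPU) string {
	b, err := json.Marshal(g)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Only gpuType and parallelIndexes are set; the omitempty fields
	// (deviceFilter, devices, groupsCount) do not appear in the output.
	fmt.Println(encode(GPU{GPUType: "nvidia", ParallelIndexes: 1}))

	// With the optional fields populated, they are serialized as well.
	fmt.Println(encode(GPU{
		GPUType:         "nvidia",
		DeviceFilter:    []string{"nvidia"},
		Devices:         []string{"0"},
		ParallelIndexes: 2,
		GroupsCount:     512,
	}))
}
```

Note that `parallelIndexes` has no `omitempty`, so a Go client that leaves it at zero sends `0`, which the `Minimum:=1` marker would reject. The `+kubebuilder:default=1` marker only applies when the field is absent from the submitted object.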

api/v1/zz_generated.deepcopy.go

Lines changed: 11 additions & 1 deletion
Some generated files are not rendered by default.

charts/qdrant-kubernetes-api/templates/region-crds/qdrant.io_qdrantclusters.yaml

Lines changed: 53 additions & 1 deletion
@@ -300,14 +300,66 @@ spec:
   description: GPU specifies GPU configuration for the cluster. If this
     field is not set, no GPU will be used.
   properties:
+    allowIntegrated:
+      default: false
+      description: AllowIntegrated specifies whether to allow integrated
+        GPUs to be used.
+      type: boolean
+    deviceFilter:
+      description: |-
+        DeviceFilter for GPU devices by hardware name. Case-insensitive.
+        List of substrings to match against the gpu device name.
+        Example: [- "nvidia"]
+        If not specified, all devices are accepted.
+      items:
+        type: string
+      minItems: 1
+      type: array
+    devices:
+      description: |-
+        Devices is a List of explicit GPU devices to use.
+        If host has multiple GPUs, this option allows to select specific devices
+        by their index in the list of found devices.
+        If `deviceFilter` is set, indexes are applied after filtering.
+        If not specified, all devices are accepted.
+      items:
+        type: string
+      minItems: 1
+      type: array
+    forceHalfPrecision:
+      default: false
+      description: |-
+        ForceHalfPrecision for `f32` values while indexing.
+        `f16` conversion will take place
+        only inside GPU memory and won't affect storage type.
+      type: boolean
     gpuType:
-      description: GPUType specifies the type of the GPU to use.
+      description: GPUType specifies the type of the GPU to use. If
+        set, GPU indexing is enabled.
       enum:
       - nvidia
       - amd
       type: string
+    groupsCount:
+      description: |-
+        GroupsCount is the amount of used vulkan "groups" of GPU.
+        In other words, how many parallel points can be indexed by GPU.
+        Optimal value might depend on the GPU model.
+        Proportional, but doesn't necessary equal to the physical number of warps.
+        Do not change this value unless you know what you are doing.
+      minimum: 1
+      type: integer
+    parallelIndexes:
+      default: 1
+      description: ParallelIndexes is the number of parallel indexes
+        to run on the GPU.
+      minimum: 1
+      type: integer
   required:
+  - allowIntegrated
+  - forceHalfPrecision
   - gpuType
+  - parallelIndexes
   type: object
 id:
   description: Id specifies the unique identifier of the cluster
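
The schema above is enforced by the API server when a `QdrantCluster` is submitted. For callers that want to catch the same problems earlier, the rules can be mirrored in Go. The sketch below is illustrative only: `violations` is a hypothetical helper rather than part of this commit, the struct is a local copy, and treating `groupsCount: 0` as "unset" is an assumption that follows from the `omitempty` tag.

```go
package main

import "fmt"

// GPU is a local copy of the relevant fields so the example is self-contained.
// Assumption: GPUType is string-based.
type GPU struct {
	GPUType         string
	DeviceFilter    []string
	Devices         []string
	ParallelIndexes int
	GroupsCount     int
}

// violations is a hypothetical helper that re-checks the CRD schema rules
// (enum, minItems, minimum) and returns one message per broken rule.
func violations(g GPU) []string {
	var out []string
	if g.GPUType != "nvidia" && g.GPUType != "amd" {
		out = append(out, fmt.Sprintf("gpuType must be nvidia or amd, got %q", g.GPUType))
	}
	// minItems: 1 applies only when the list is present at all.
	if g.DeviceFilter != nil && len(g.DeviceFilter) == 0 {
		out = append(out, "deviceFilter must contain at least one item when set")
	}
	if g.Devices != nil && len(g.Devices) == 0 {
		out = append(out, "devices must contain at least one item when set")
	}
	if g.ParallelIndexes < 1 {
		out = append(out, "parallelIndexes must be at least 1")
	}
	// Assumption: zero means "not set" because the field is tagged omitempty.
	if g.GroupsCount != 0 && g.GroupsCount < 1 {
		out = append(out, "groupsCount must be at least 1 when set")
	}
	return out
}

func main() {
	ok := GPU{GPUType: "amd", ParallelIndexes: 1}
	fmt.Println(len(violations(ok))) // prints 0

	bad := GPU{GPUType: "intel", DeviceFilter: []string{}, GroupsCount: -1}
	for _, v := range violations(bad) {
		fmt.Println(v)
	}
}
```

Collecting every violation rather than stopping at the first is a design choice here; it gives a caller the full list in one pass, similar to how the API server reports multiple schema errors together.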

crds/qdrant.io_qdrantclusters.yaml

Lines changed: 53 additions & 1 deletion
@@ -299,14 +299,66 @@ spec:
   description: GPU specifies GPU configuration for the cluster. If this
     field is not set, no GPU will be used.
   properties:
+    allowIntegrated:
+      default: false
+      description: AllowIntegrated specifies whether to allow integrated
+        GPUs to be used.
+      type: boolean
+    deviceFilter:
+      description: |-
+        DeviceFilter for GPU devices by hardware name. Case-insensitive.
+        List of substrings to match against the gpu device name.
+        Example: [- "nvidia"]
+        If not specified, all devices are accepted.
+      items:
+        type: string
+      minItems: 1
+      type: array
+    devices:
+      description: |-
+        Devices is a List of explicit GPU devices to use.
+        If host has multiple GPUs, this option allows to select specific devices
+        by their index in the list of found devices.
+        If `deviceFilter` is set, indexes are applied after filtering.
+        If not specified, all devices are accepted.
+      items:
+        type: string
+      minItems: 1
+      type: array
+    forceHalfPrecision:
+      default: false
+      description: |-
+        ForceHalfPrecision for `f32` values while indexing.
+        `f16` conversion will take place
+        only inside GPU memory and won't affect storage type.
+      type: boolean
     gpuType:
-      description: GPUType specifies the type of the GPU to use.
+      description: GPUType specifies the type of the GPU to use. If
+        set, GPU indexing is enabled.
       enum:
       - nvidia
       - amd
       type: string
+    groupsCount:
+      description: |-
+        GroupsCount is the amount of used vulkan "groups" of GPU.
+        In other words, how many parallel points can be indexed by GPU.
+        Optimal value might depend on the GPU model.
+        Proportional, but doesn't necessary equal to the physical number of warps.
+        Do not change this value unless you know what you are doing.
+      minimum: 1
+      type: integer
+    parallelIndexes:
+      default: 1
+      description: ParallelIndexes is the number of parallel indexes
+        to run on the GPU.
+      minimum: 1
+      type: integer
   required:
+  - allowIntegrated
+  - forceHalfPrecision
   - gpuType
+  - parallelIndexes
   type: object
 id:
   description: Id specifies the unique identifier of the cluster

docs/api.md

Lines changed: 7 additions & 1 deletion
@@ -105,7 +105,13 @@ _Appears in:_

 | Field | Description | Default | Validation |
 | --- | --- | --- | --- |
-| `gpuType` _[GPUType](#gputype)_ | GPUType specifies the type of the GPU to use. | | Enum: [nvidia amd] <br /> |
+| `gpuType` _[GPUType](#gputype)_ | GPUType specifies the type of the GPU to use. If set, GPU indexing is enabled. | | Enum: [nvidia amd] <br /> |
+| `forceHalfPrecision` _boolean_ | ForceHalfPrecision for `f32` values while indexing.<br />`f16` conversion will take place<br />only inside GPU memory and won't affect storage type. | false | |
+| `deviceFilter` _string array_ | DeviceFilter for GPU devices by hardware name. Case-insensitive.<br />List of substrings to match against the gpu device name.<br />Example: [- "nvidia"]<br />If not specified, all devices are accepted. | | MinItems: 1 <br /> |
+| `devices` _string array_ | Devices is a List of explicit GPU devices to use.<br />If host has multiple GPUs, this option allows to select specific devices<br />by their index in the list of found devices.<br />If `deviceFilter` is set, indexes are applied after filtering.<br />If not specified, all devices are accepted. | | MinItems: 1 <br /> |
+| `parallelIndexes` _integer_ | ParallelIndexes is the number of parallel indexes to run on the GPU. | 1 | Minimum: 1 <br /> |
+| `groupsCount` _integer_ | GroupsCount is the amount of used vulkan "groups" of GPU.<br />In other words, how many parallel points can be indexed by GPU.<br />Optimal value might depend on the GPU model.<br />Proportional, but doesn't necessary equal to the physical number of warps.<br />Do not change this value unless you know what you are doing. | | Minimum: 1 <br /> |
+| `allowIntegrated` _boolean_ | AllowIntegrated specifies whether to allow integrated GPUs to be used. | false | |


 #### GPUType
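
The `deviceFilter` and `devices` rows describe a two-step selection: names are first matched case-insensitively against the filter substrings, and the indexes in `devices` are then applied to the filtered list. The sketch below illustrates that ordering only; it is not Qdrant's implementation. The `selectDevices` function and the device names are hypothetical, and it assumes indexes are zero-based and that a device passes the filter if it matches any one substring.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// selectDevices is a hypothetical illustration of the documented order:
// filter by name first, then pick by index within the filtered list.
// Assumptions: indexes are zero-based; filter entries are OR-ed together.
func selectDevices(found, deviceFilter, devices []string) ([]string, error) {
	candidates := found
	if len(deviceFilter) > 0 {
		candidates = nil
		for _, name := range found {
			lower := strings.ToLower(name)
			for _, sub := range deviceFilter {
				if strings.Contains(lower, strings.ToLower(sub)) {
					candidates = append(candidates, name)
					break
				}
			}
		}
	}
	// No explicit indexes: every device that passed the filter is accepted.
	if len(devices) == 0 {
		return candidates, nil
	}
	var picked []string
	for _, d := range devices {
		i, err := strconv.Atoi(d)
		if err != nil || i < 0 || i >= len(candidates) {
			return nil, fmt.Errorf("invalid device index %q", d)
		}
		picked = append(picked, candidates[i])
	}
	return picked, nil
}

func main() {
	found := []string{"NVIDIA A100", "Intel UHD Graphics", "NVIDIA T4"}

	// The filter keeps the two NVIDIA devices; index "1" then refers to the
	// second entry of that filtered list, not of the original one.
	picked, err := selectDevices(found, []string{"nvidia"}, []string{"1"})
	fmt.Println(picked, err) // prints [NVIDIA T4] <nil>
}
```

The practical consequence is that changing `deviceFilter` can silently change which physical device a given index in `devices` refers to, so the two fields are best reviewed together.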
