Define gpu memory utilization #502
Conversation
Summary of Changes

Hello @ilopezluna, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the …

Highlights
Code Review
This pull request adds support for the --gpu-memory-utilization parameter for vLLM configuration. The implementation is solid, following existing patterns for adding new flags, including a custom pflag value type to handle optional float values. The changes are well-tested with comprehensive unit and behavior tests. I've identified a couple of minor areas for improvement regarding code duplication, which are detailed in the specific comments. Overall, this is a great addition.
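The review above refers to a custom pflag value type for optional floats. For readers unfamiliar with that pattern, here is a minimal, self-contained sketch of how such a `Float64PtrValue` might be shaped. The struct layout, constructor, and `main` usage are assumptions for illustration; only the type name and the use of `strconv.ParseFloat` in `Set` are confirmed by the review comments below.

```go
package main

import (
	"fmt"
	"strconv"

	"github.com/spf13/pflag"
)

// Float64PtrValue wraps a *float64 so a flag can stay nil ("not provided")
// until the user explicitly sets it on the command line.
// (Sketch only; the PR's actual implementation may differ.)
type Float64PtrValue struct {
	target **float64
}

func NewFloat64PtrValue(target **float64) *Float64PtrValue {
	return &Float64PtrValue{target: target}
}

// Set parses the flag's string value; a non-float input surfaces the
// strconv.ParseFloat error to the caller.
func (v *Float64PtrValue) Set(s string) error {
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return err
	}
	*v.target = &f
	return nil
}

func (v *Float64PtrValue) Type() string { return "float64" }

func (v *Float64PtrValue) String() string {
	if v.target == nil || *v.target == nil {
		return ""
	}
	return strconv.FormatFloat(**v.target, 'f', -1, 64)
}

func main() {
	var gpuMemUtil *float64
	fs := pflag.NewFlagSet("configure", pflag.ContinueOnError)
	fs.Var(NewFloat64PtrValue(&gpuMemUtil), "gpu-memory-utilization", "fraction of GPU memory vLLM may use (0.0-1.0)")

	if err := fs.Parse([]string{"--gpu-memory-utilization", "0.8"}); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if gpuMemUtil != nil {
		fmt.Println("gpu-memory-utilization:", *gpuMemUtil) // 0.8
	}
}
```

The pointer-based value lets the command distinguish "flag never set" (nil) from an explicit 0, which is what allows the CLI to omit the argument entirely when the user does not provide it.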
```go
func TestGPUMemoryUtilizationBehavior(t *testing.T) {
	// Helper to create float64 pointer
	float64Ptr := func(f float64) *float64 { return &f }
```
```go
if utilization < 0.0 || utilization > 1.0 {
	return nil, fmt.Errorf("gpu-memory-utilization must be between 0.0 and 1.0, got %f", utilization)
}
```
The validation logic for the gpu-memory-utilization range is also present in cmd/cli/commands/configure_flags.go on lines 190-192. This duplication could lead to maintenance challenges, for instance, if the valid range changes. To improve maintainability, consider abstracting this validation into a shared function or using shared constants for the boundaries and error message. While validation at different layers can be useful, centralizing the core logic will make the code easier to manage.
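As a rough illustration of that shared-helper suggestion (the function name and package placement are hypothetical, not taken from the PR):

```go
package config

import "fmt"

// validateGPUMemoryUtilization centralizes the range check so that the CLI
// flag handling and the vLLM argument builder share one source of truth.
func validateGPUMemoryUtilization(utilization float64) error {
	if utilization < 0.0 || utilization > 1.0 {
		return fmt.Errorf("gpu-memory-utilization must be between 0.0 and 1.0, got %f", utilization)
	}
	return nil
}
```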
Hey there - I've reviewed your changes - here's some feedback:
- The range validation for `GPUMemoryUtilization` is implemented in both `ConfigureFlags.BuildConfigureRequest` and `vllm_config.GetArgs` with slightly different error messages; consider extracting a shared helper (and standardizing the message) to avoid duplication and potential future drift.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The range validation for `GPUMemoryUtilization` is implemented in both `ConfigureFlags.BuildConfigureRequest` and `vllm_config.GetArgs` with slightly different error messages; consider extracting a shared helper (and standardizing the message) to avoid duplication and potential future drift.
## Individual Comments
### Comment 1
<location> `cmd/cli/commands/configure_test.go:131` </location>
<code_context>
}
}
+func TestConfigureCmdGPUMemoryUtilizationFlag(t *testing.T) {
+ // Create the configure command
+ cmd := newConfigureCmd()
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case for non-float input to the --gpu-memory-utilization flag to exercise the error path in Float64PtrValue.Set.
Since `Float64PtrValue.Set` returns an error when `strconv.ParseFloat` fails, please add a subtest that calls `cmd.Flags().Set("gpu-memory-utilization", "not-a-number")` and asserts that an error is returned. This will exercise the error path and verify the custom flag’s parsing behavior end-to-end.
</issue_to_address>
### Comment 2
<location> `cmd/cli/commands/configure_test.go:164` </location>
<code_context>
+ }
+}
+
+func TestGPUMemoryUtilizationBehavior(t *testing.T) {
+ // Helper to create float64 pointer
+ float64Ptr := func(f float64) *float64 { return &f }
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a case that verifies GPU memory utilization is set without clobbering existing VLLM fields in the configure request.
The table-driven cases cover nil, valid, edge, and invalid `GPUMemoryUtilization` values well. Please add one case where other VLLM options (e.g. `HFOverrides`) are already set on the request, to verify that `BuildConfigureRequest` merges `GPUMemoryUtilization` into an existing `req.VLLM` without dropping those fields. This helps catch future refactors that might reinitialize `req.VLLM` and lose configuration.
Suggested implementation:
```golang
tests := []struct {
name string
gpuMemValue *float64
expectError bool
expectGPUMemSet bool
expectedGPUMemUtil float64
}{
{
name: "default - not set (nil)",
```
```golang
{
name: "default - not set (nil)",
},
// NOTE: other existing cases (valid, edge, invalid) remain unchanged here...
{
name: "valid gpu-mem with existing VLLM options preserved",
// Use a valid, non-edge value to focus this case on merge behavior rather than validation.
gpuMemValue: float64Ptr(0.8),
expectError: false,
expectGPUMemSet: true,
expectedGPUMemUtil: 0.8,
},
```
To fully implement the behavior this new case is intended to check, update the body of `TestGPUMemoryUtilizationBehavior` as follows:
1. Inside the `for _, tt := range tests { t.Run(...) }` loop (which is not shown in the snippet), detect the `"valid gpu-mem with existing VLLM options preserved"` case and pre-populate the configure request with existing VLLM options **before** calling `BuildConfigureRequest`. For example:
```go
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
req := &apiv1.ConfigureRequest{}
if tt.name == "valid gpu-mem with existing VLLM options preserved" {
req.VLLM = &apiv1.VLLMOptions{
HFOverrides: map[string]string{
"some.hf.override": "preserve-me",
},
}
}
// existing logic that applies gpuMemValue and calls BuildConfigureRequest(...)
err := BuildConfigureRequest(req /* other args as needed */)
// existing assertions for tt.expectError, tt.expectGPUMemSet, tt.expectedGPUMemUtil...
// e.g., checking req.VLLM.GPUMemoryUtilization or similar
```
2. After asserting the GPU memory utilization behavior for this case, add explicit checks that the original `HFOverrides` entry is still present and unchanged. For example:
```go
if tt.name == "valid gpu-mem with existing VLLM options preserved" {
if req.VLLM == nil {
t.Fatalf("VLLM options should not be nil after BuildConfigureRequest")
}
if got := req.VLLM.HFOverrides["some.hf.override"]; got != "preserve-me" {
t.Errorf("expected existing VLLM HFOverrides to be preserved, got %q", got)
}
}
})
}
```
3. Make sure to use the actual types and field names from your codebase (e.g., `apiv1.ConfigureRequest`, `apiv1.VLLMOptions`, `HFOverrides`, `GPUMemoryUtilization`) and adjust the call to `BuildConfigureRequest` to match its real signature.
These adjustments will ensure the new table case verifies that `BuildConfigureRequest` merges `GPUMemoryUtilization` into an existing `req.VLLM` without dropping pre-existing configuration such as `HFOverrides`.
</issue_to_address>
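For reference, a minimal version of the non-float subtest suggested in Comment 1 might look like the following; it would sit next to the existing tests in `cmd/cli/commands/configure_test.go`. The test name is illustrative, and `newConfigureCmd` is the constructor shown in the code context above.

```go
func TestConfigureCmdGPUMemoryUtilizationFlagInvalid(t *testing.T) {
	cmd := newConfigureCmd()

	// Float64PtrValue.Set should surface the strconv.ParseFloat error
	// when the flag value is not a valid float.
	if err := cmd.Flags().Set("gpu-memory-utilization", "not-a-number"); err == nil {
		t.Fatal("expected an error when setting --gpu-memory-utilization to a non-float value")
	}
}
```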
Adds the `--gpu-memory-utilization` parameter for vLLM configuration. The implementation follows the existing patterns and ensures that no value is passed to vLLM if the parameter is not provided. This parameter is needed for WSL2.
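To make the "no value passed when not provided" behavior concrete, here is a sketch of the argument-building step; the function and parameter names are assumptions for illustration, not the PR's actual code:

```go
package config

import "fmt"

// appendGPUMemoryUtilization adds the vLLM flag only when the user provided
// a value; a nil pointer means "not configured" and produces no argument.
func appendGPUMemoryUtilization(args []string, utilization *float64) ([]string, error) {
	if utilization == nil {
		return args, nil
	}
	if *utilization < 0.0 || *utilization > 1.0 {
		return nil, fmt.Errorf("gpu-memory-utilization must be between 0.0 and 1.0, got %f", *utilization)
	}
	return append(args, fmt.Sprintf("--gpu-memory-utilization=%g", *utilization)), nil
}
```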