
[ARO-25489] Reduce the memory usage of ResourceSKUs querying#4713

Open
hawkowl wants to merge 5 commits into master from hawkowl/resourceskus-filter-mem

Conversation

@hawkowl
Collaborator

@hawkowl hawkowl commented Mar 26, 2026

Which issue this PR addresses:

Part of [ARO-25489]

What this PR does / why we need it:

I have an inkling that this list in larger regions is consuming a lot of memory. So, let's consume less :)

Test plan for issue:

Unit tests provided, E2E

Is there any documentation that needs to be updated for this PR?

N/A

How do you know this will function as expected in production?

E2E, hopefully

@hawkowl added the enhancement, next-up, ready-for-review, next-release, go, and skippy labels Mar 26, 2026
Copilot AI review requested due to automatic review settings March 26, 2026 04:07
Contributor

Copilot AI left a comment


Pull request overview

This PR reduces memory usage when querying Azure Compute Resource SKUs by switching from building full in-memory SKU lists to streaming/iterating SKUs and only retaining the subset needed by callers (or just names for admin listing).

Changes:

  • Changed ResourceSKUsClient.List to return an iterator (iter.Seq2) instead of []*ResourceSKU, enabling streaming pagination.
  • Added computeskus.SelectVMSkusInCurrentRegion and computeskus.ListUnrestrictedVMSkusInCurrentRegion to avoid loading all SKUs into memory.
  • Updated frontend/admin/cluster flows and tests to use the new iterator-based SKU querying.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| pkg/util/azureclient/azuresdk/armcompute/resourceskus_addons.go | Reworks SKU listing to stream results via iter.Seq2 instead of returning a full slice. |
| pkg/util/mocks/azureclient/azuresdk/armcompute/armcompute.go | Updates generated mock to match the iterator-based List signature. |
| pkg/util/computeskus/computeskus.go | Introduces streaming-based SKU selection/listing helpers and updates error handling accordingly. |
| pkg/util/computeskus/computeskus_test.go | Replaces old FilterVMSizes test with new unit tests for streaming-based helpers. |
| pkg/frontend/sku_validation.go | Uses SelectVMSkusInCurrentRegion to fetch only the SKUs referenced by the cluster doc. |
| pkg/frontend/sku_test.go | Updates mocks and expected errors for iterator-based SKU listing. |
| pkg/frontend/adminactions/azureactions.go | Changes admin VM SKU listing to return []string and adds VMGetSKUs for targeted queries. |
| pkg/util/mocks/adminactions/azureactions.go | Updates generated adminactions mocks for VMSizeList type change and new VMGetSKUs. |
| pkg/frontend/admin_openshiftcluster_vmsizelist.go | Removes local filtering logic; now sorts and returns the already-filtered SKU name list. |
| pkg/frontend/admin_openshiftcluster_vmsizelist_test.go | Updates expectations to reflect []string return type and stable sorting. |
| pkg/frontend/admin_openshiftcluster_vmresize_pre_validation.go | Switches pre-resize validation to query only the desired SKU via VMGetSKUs. |
| pkg/frontend/admin_openshiftcluster_vmresize_pre_validation_test.go | Updates mocks to use VMGetSKUs and map-based SKU lookup. |
| pkg/cluster/validate.go | Switches zone validation to only fetch SKUs needed for master/worker sizing. |
| pkg/cluster/validate_test.go | Updates SKU client mocks to return iterators rather than slices. |
| pkg/cluster/loadbalancerinternal.go | Changes load balancer zonal migration path to fetch only the needed SKU(s). |
| pkg/cluster/loadbalancerinternal_test.go | Updates SKU client mocks to return iterators rather than slices. |


@hawkowl
Collaborator Author

hawkowl commented Mar 26, 2026

/azp run ci

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).


filteredSkus, err := computeskus.SelectVMSkusInCurrentRegion(ctx, m.armResourceSKUs, location, []string{
string(m.doc.OpenShiftCluster.Properties.MasterProfile.VMSize),
string(m.doc.OpenShiftCluster.Properties.WorkerProfiles[0].VMSize),
Member


We'd want to validate all worker profiles here, correct? Not just the first entry in the array?

Collaborator Author


This is the bootstrap version, which only supports one entry. The actual SKU check below it also only looks at the first entry.

Copilot AI review requested due to automatic review settings March 27, 2026 00:02
@hawkowl hawkowl force-pushed the hawkowl/resourceskus-filter-mem branch from b9ac6f9 to 4e65a1a Compare March 27, 2026 00:02
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 11 comments.



Comment on lines +224 to +229
mrsc.EXPECT().List(gomock.Any(), "location eq northus2", false).Return(
maps.All(map[*armcompute.ResourceSKU]error{
{
Name: pointerutils.ToPtr("bigmachine_v1"),
ResourceType: pointerutils.ToPtr("virtualMachines"),
Locations: pointerutils.ToSlicePtr([]string{"northus2"}),

Copilot AI Mar 27, 2026


This map literal uses key type *armcompute.ResourceSKU but the keys are written as struct literals ({ ... }) without taking an address. This will not compile; the keys need to be &armcompute.ResourceSKU{...} (and the same applies to the other maps.All(map[*armcompute.ResourceSKU]error{ ... }) blocks in this file).

Copilot uses AI. Check for mistakes.
Comment on lines +316 to +322
mrsc.EXPECT().List(gomock.Any(), "location eq northus2", false).Return(
maps.All(map[*armcompute.ResourceSKU]error{
{
Name: pointerutils.ToPtr("bigmachine_v1"),
ResourceType: pointerutils.ToPtr("virtualMachines"),
Locations: pointerutils.ToSlicePtr([]string{"northus2"}),
LocationInfo: []*armcompute.ResourceSKULocationInfo{

Copilot AI Mar 27, 2026


Same issue here: the maps.All(map[*armcompute.ResourceSKU]error{ ... }) literal is keyed by *armcompute.ResourceSKU but the keys are not address-of expressions. This won’t compile until the keys are changed to &armcompute.ResourceSKU{...}.

Comment on lines +396 to +401
mrsc.EXPECT().List(gomock.Any(), "location eq northus2", false).Return(
maps.All(map[*armcompute.ResourceSKU]error{
{
Name: pointerutils.ToPtr("bigmachine_v1"),
ResourceType: pointerutils.ToPtr("virtualMachines"),
Locations: pointerutils.ToSlicePtr([]string{"northus2"}),

Copilot AI Mar 27, 2026


Same compile issue: keys in this map[*armcompute.ResourceSKU]error literal need to be pointers (&armcompute.ResourceSKU{...}), not struct literals.

Comment on lines +599 to +606
mrsc.EXPECT().List(gomock.Any(), "location eq northus2", false).Return(
maps.All(map[*armcompute.ResourceSKU]error{
{
Name: pointerutils.ToPtr("bigmachine_v1"),
ResourceType: pointerutils.ToPtr("virtualMachines"),
Locations: pointerutils.ToSlicePtr([]string{"northus2"}),
LocationInfo: []*armcompute.ResourceSKULocationInfo{
{

Copilot AI Mar 27, 2026


Same compile issue: keys in this map[*armcompute.ResourceSKU]error literal must be pointer values (use &armcompute.ResourceSKU{...}), otherwise the test won’t compile.

Comment on lines +676 to +683
mrsc.EXPECT().List(gomock.Any(), "location eq northus2", false).Return(
maps.All(map[*armcompute.ResourceSKU]error{
{
Name: pointerutils.ToPtr("bigmachine_v1"),
ResourceType: pointerutils.ToPtr("virtualMachines"),
Locations: pointerutils.ToSlicePtr([]string{"northus2"}),
LocationInfo: []*armcompute.ResourceSKULocationInfo{
{

Copilot AI Mar 27, 2026


Same compile issue: this map[*armcompute.ResourceSKU]error literal uses non-pointer struct literals as keys. Change the keys to &armcompute.ResourceSKU{...} so the test compiles.

Comment on lines +451 to +457
mrsc.EXPECT().List(gomock.Any(), "location eq northus2", false).Return(
maps.All(map[*armcompute.ResourceSKU]error{
{
Name: pointerutils.ToPtr("bigmachine_v1"),
ResourceType: pointerutils.ToPtr("virtualMachines"),
Locations: pointerutils.ToSlicePtr([]string{"northus2"}),
LocationInfo: []*armcompute.ResourceSKULocationInfo{

Copilot AI Mar 27, 2026


Same compile issue: this map[*armcompute.ResourceSKU]error uses struct literals as keys instead of &armcompute.ResourceSKU{...} pointers.

Comment on lines 58 to +63
  for _, restriction := range sku.Restrictions {
-     for _, restrictedLocation := range restriction.RestrictionInfo.Locations {
-         if *restrictedLocation == location {
-             return true
+     if restriction.RestrictionInfo != nil {
+         for _, restrictedLocation := range restriction.RestrictionInfo.Locations {
+             if restrictedLocation != nil && strings.EqualFold(*restrictedLocation, location) {
+                 return true
+             }

Copilot AI Mar 27, 2026


sku.Restrictions is a slice of pointers in the Azure SDK; individual entries can be nil. This code dereferences restriction.RestrictionInfo without checking restriction != nil, which can panic if a nil restriction is returned. Add a nil check for restriction before accessing its fields.
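
A sketch of what the fully guarded loop could look like, assuming local stand-in types that mirror the pointer-heavy shape of the SDK structs. `RestrictionInfo`, `Restriction`, `SKU`, and `isRestricted` here are hypothetical and defined only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the SDK types; every field is a pointer or a
// slice of pointers, as in the generated Azure SDK models.
type RestrictionInfo struct {
	Locations []*string
}

type Restriction struct {
	RestrictionInfo *RestrictionInfo
}

type SKU struct {
	Restrictions []*Restriction
}

// isRestricted reports whether any restriction on sku names location,
// skipping nil entries at every level instead of dereferencing them.
func isRestricted(sku *SKU, location string) bool {
	if sku == nil {
		return false
	}
	for _, restriction := range sku.Restrictions {
		if restriction == nil || restriction.RestrictionInfo == nil {
			continue
		}
		for _, loc := range restriction.RestrictionInfo.Locations {
			if loc != nil && strings.EqualFold(*loc, location) {
				return true
			}
		}
	}
	return false
}

func main() {
	loc := "NorthUS2"
	sku := &SKU{Restrictions: []*Restriction{
		nil,                    // nil restriction entry
		{RestrictionInfo: nil}, // restriction without info
		{RestrictionInfo: &RestrictionInfo{Locations: []*string{nil, &loc}}},
	}}
	fmt.Println(isRestricted(sku, "northus2"))
	fmt.Println(isRestricted(sku, "southus1"))
}
```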

Comment on lines +88 to +105
for sku, err := range skusIter {
if err != nil {
return nil, fmt.Errorf("%w: %w", ErrListVMResourceSKUs, err)
}

if len(sku.LocationInfo) == 0 { // happened in eastus2euap
// We only care about VMs and ones with locations/locationinfo
if *sku.ResourceType != "virtualMachines" || len(sku.Locations) == 0 || len(sku.LocationInfo) == 0 {
continue
}

// We copy only part of the object so we don't have to keep
// a lot of data in memory.
vmskus[*sku.Name] = &sdkcompute.ResourceSKU{
Name: sku.Name,
Restrictions: sku.Restrictions,
LocationInfo: sku.LocationInfo,
Capabilities: sku.Capabilities,
// Make sure it's actually in our location
if !slices.ContainsFunc(sku.Locations, func(s *string) bool { return s != nil && strings.EqualFold(*s, location) }) {
continue
}
}

return vmskus
}
if slices.Contains(skuNames, *sku.Name) {
vmskus[*sku.Name] = sku
}

Copilot AI Mar 27, 2026


Potential nil dereferences: sku can be nil (the iterator’s contract allows yielding nil), and the SDK fields ResourceType / Name are pointers. Dereferencing *sku.ResourceType and *sku.Name without nil checks can panic on unexpected API responses. Add guards for sku == nil, sku.ResourceType == nil, and sku.Name == nil before dereferencing.



3 participants