Contributor

@krbar krbar commented Dec 23, 2025

Description

  • Bumped managed cluster, agent pool, and maintenance configuration modules to API 2025-09-01, surfacing new advanced networking, provisioning, and monitoring inputs.
  • Refactored parameters to API-typed objects (linuxProfile, aadProfile, nodeProvisioningProfile, autoScalerProfile, azureMonitorProfile, serviceMeshProfile, etc.) and added new cluster options (fqdnSubdomain, aiToolchainOperatorProfile, bootstrapProfile, upgradeSettings, windowsProfile).
  • Expanded agent pool schema with capacity reservation/host groups, gateway/gpu profiles, network and DNS profiles, pod IP allocation mode, power/VM profiles, and richer OS SKU list.
  • Maintenance configurations now accept optional notAllowedTime and timeInWeek windows for finer scheduling.
  • Breaking: consolidated legacy per-field settings into securityProfile, upgradeSettings, and other typed profiles; sshPublicKey/adminUsername now provided via linuxProfile; SKU tier string normalized to Free.
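For consumers, the linuxProfile part of the breaking change might look roughly like the sketch below. It is a fragment, not a deployable template: the `<version>` placeholder, the deployment and cluster names, and the `azureuser` value are assumptions for illustration, and the module's other required inputs are left out. The nested shape (adminUsername plus ssh.publicKeys[].keyData) follows the managedClusters resource schema.

```bicep
// Illustrative fragment only. '<version>' is a placeholder for the new major
// version, and other required module inputs (for example the primary agent
// pool profiles) are intentionally omitted.
param sshPublicKey string

module managedCluster 'br/public:avm/res/container-service/managed-cluster:<version>' = {
  name: 'managedClusterDeployment' // hypothetical deployment name
  params: {
    name: 'example-aks' // hypothetical cluster name

    // Before: legacy top-level inputs, removed by this change.
    // adminUsername: 'azureuser'
    // sshPublicKey: sshPublicKey

    // After: a single API-typed object.
    linuxProfile: {
      adminUsername: 'azureuser'
      ssh: {
        publicKeys: [
          {
            keyData: sshPublicKey
          }
        ]
      }
    }
  }
}
```

The other consolidated inputs (securityProfile, upgradeSettings, and so on) would presumably follow the same pattern: move the former flat parameters into the object shape defined by the 2025-09-01 API.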

Resolves #1923
Resolves #2412
Resolves #5815
Resolves #6010
Resolves #6179
Resolves #6331
Resolves #6334
Resolves #4470

Pipeline Reference

Pipeline
avm.res.container-service.managed-cluster

Type of Change

  • Azure Verified Module updates:
    • Bugfix containing backwards-compatible bug fixes, and I have NOT bumped the MAJOR or MINOR version in version.json:
    • Feature update backwards compatible feature updates, and I have bumped the MINOR version in version.json.
    • Breaking changes and I have bumped the MAJOR version in version.json.
    • Update to documentation
  • Update to CI Environment or utilities (Non-module affecting changes)

Checklist

  • I'm sure there are no other open Pull Requests for the same update/change
  • I have run Set-AVMModule locally to generate the supporting module files.
  • My corresponding pipelines / checks run clean and green without any errors or warnings
  • I have updated the module's CHANGELOG.md file with an entry for the next version

- Updated resource definitions to use Microsoft.ContainerService/managedClusters@2025-09-01 API version.
- Adjusted parameters for pod identity, security settings, and monitoring configurations to align with the new API.
- Deprecated parameters that are no longer available in the updated API version.
- Added new parameters for enhanced cluster configurations, including AI toolchain operator settings and bootstrap profile.
- Updated the resource group API version in the WAF-aligned test module.
- Ensures compatibility with the latest Azure Resource Manager features.
- Module path: avm/res/container-service/managed-cluster/tests/e2e/waf-aligned
@microsoft-github-policy-service microsoft-github-policy-service bot added the labels Needs: Triage 🔍 (Maintainers need to triage still) and Type: AVM 🅰️ ✌️ Ⓜ️ (This is an AVM related issue) Dec 23, 2025
- Removed individual security parameters (enableSecureBoot, enableVTPM, sshAccess) from agent pool and managed cluster definitions.
- Introduced a consolidated securityProfile parameter for better management of security settings.
- Updated references in the resource definitions to utilize the new securityProfile structure.
@krbar krbar temporarily deployed to avm-validation January 7, 2026 22:29 — with GitHub Actions Inactive
@krbar krbar self-assigned this Jan 7, 2026
- Refactored parameters in the managed cluster Bicep file to use resourceInput types for better compatibility with the 2025-09-01 API version.
- Removed deprecated parameters and replaced them with updated structures, including changes to the aadProfile, autoScalerProfile, and securityProfile.
- Updated test files to reflect changes in parameter names and structures, ensuring alignment with the new Bicep definitions.
- Enhanced test cases for various configurations, including private clusters and workload auto-scaling features.
@krbar krbar temporarily deployed to avm-validation January 12, 2026 07:58 — with GitHub Actions Inactive
@krbar krbar marked this pull request as ready for review January 12, 2026 08:52
@krbar krbar requested review from a team as code owners January 12, 2026 08:52
@avm-organizer avm-organizer bot added the label Needs: Module Owner 📣 (This module needs an owner to develop or maintain it) Jan 12, 2026
@avm-organizer avm-organizer bot requested a review from JPEasier January 12, 2026 08:53
Contributor Author

krbar commented Jan 12, 2026

Open point: link the issues before merging.

param gpuProfile resourceInput<'Microsoft.ContainerService/managedClusters/agentPools@2025-09-01'>.properties.gpuProfile?

@description('Optional. This is of the form /subscriptions/{subscriptionId}/resourcegroups/{resourcegroupname}/providers/microsoft.compute/hostgroups/{hostgroupname}. For more information see [Azure Dedicated Hosts](https://learn.microsoft.com/azure/virtual-machines/dedicated-hosts).')
param hostGroupId string?
Collaborator

Suggested change
- param hostGroupId string?
+ param hostGroupResourceId string?

Contributor Author

Updated in 32df228

}
}
]
networkPlugin: 'azure'
Collaborator

I think the 'kubenet' test was kinda the max test (it tests the networkPlugin 'kubenet'), while the 'waf-aligned' test validates the 'azure' mode (and is naturally a lot smaller).

Collaborator

@AlexanderSehr AlexanderSehr Jan 12, 2026

Not saying we can't have another test - but there may have been a reason for moving away from the original 2 tests in CARML ref. Costs may come to mind, but if there's a good reason for having both modes validated in depth I won't object :)

Contributor Author

Because there are a lot of new properties, I didn’t want to use the test case for (legacy) ‘kubenet’, which contains only a fraction of the properties in ‘max’. I hope this is fine.

'stable'
])
@description('Optional. Auto-upgrade channel on the AKS cluster.')
param autoUpgradeProfileUpgradeChannel string = 'stable'
Collaborator

Hey @krbar,
I noticed you removed a large set of parameters - presumably for maintainability.
One question though - these parameters had, for the most part, default values that may or may not match the resource provider's defaults. If they don't match them, we may be removing carefully chosen defaults. Did you vet all these values?

Contributor Author

@AlexanderSehr Thank you for your review and for raising this point. I went through all the parameters that were removed and compared their default values with those provided by the resource provider. I didn’t find any evidence of intentional differences or custom defaults that would be lost as a result of this update. While I can’t guarantee I didn’t overlook something, in general, the defaults should now align with the provider’s settings.
The initial large number of parameters likely originated from before resourceInput was available.
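To illustrate the pattern being discussed, here is a minimal sketch of a flat parameter with a module-level default giving way to a resourceInput-typed object. The parameter name autoUpgradeProfile and the cluster name are assumptions for illustration and may not match the module's actual code; the resource is also stripped down to the one property of interest.

```bicep
// Before (from the diff context above): a flat parameter carrying a
// module-level default.
// param autoUpgradeProfileUpgradeChannel string = 'stable'

// After (sketch, hypothetical parameter name): the whole profile is typed
// from the API schema and has no module-level default.
@description('Optional. Auto-upgrade settings for the AKS cluster.')
param autoUpgradeProfile resourceInput<'Microsoft.ContainerService/managedClusters@2025-09-01'>.properties.autoUpgradeProfile?

resource managedCluster 'Microsoft.ContainerService/managedClusters@2025-09-01' = {
  name: 'example-aks' // hypothetical cluster name
  location: resourceGroup().location
  properties: {
    // When the caller passes nothing, this is null and the resource
    // provider's own default applies.
    autoUpgradeProfile: autoUpgradeProfile
  }
}
```

With this shape, a caller who previously relied on the module default of 'stable' would need to pass upgradeChannel explicitly if the provider's default differs, which is the case the question above is asking about.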

gatewayProfile: agentPool.?gatewayProfile
gpuInstanceProfile: agentPool.?gpuInstanceProfile
gpuProfile: agentPool.?gpuProfile
hostGroupId: agentPool.?hostGroupId
Collaborator

Must be updated as per the comment in the child module

Contributor Author

Updated in 32df228

krbar and others added 7 commits January 17, 2026 21:43
- Renamed 'hostGroupId' to 'hostGroupResourceId' in agent pool and managed cluster modules for consistency.
- Updated references in both main and agent pool Bicep files to reflect the new parameter name.
Co-authored-by: Alexander Sehr <ASehr@hotmail.de>