feat: Add kubeReserved calculation visibility for troubleshooting #8810
base: main
- Add detailed V(1) logging for instance type resource calculations
- Log effective pod count, kube-reserved, and allocatable values
- Add documentation section for debugging kubeReserved calculations
- Include examples for Custom AMI configuration

This change provides visibility into how Karpenter calculates allocatable resources, helping users troubleshoot capacity issues, especially with Custom AMI configurations.

Relates to aws#8497
DerekFrank
left a comment
This doesn't really address #8497, and I am not sure that the amount of logs is worth it.
pkg/providers/instancetype/types.go
Outdated
// Log kubeReserved calculation details for troubleshooting
allocatable := it.Allocatable()
log.FromContext(ctx).V(1).Info("calculated instance type resources",
Won't this log a line for every single instance on startup? I'm not sure that kind of spam is useful
Good catch! I've updated the code to only log for Custom AMI families where this troubleshooting info is most valuable. This avoids the log spam from hundreds of instance types at startup.
For the non-custom AMIs, we could instead put this information on the website. After we figure out the override behavior for #8497, I think we can log the result for custom AMIs only.
@DerekFrank Thank you for the valuable feedback! You're absolutely right about the log volume concern and the relationship with #8497. After reviewing the situation, I understand that:
I'll update this PR to:
Rationale:
Fixed in 27e46d3! Now only logs for Custom AMI families to avoid the spam:

    if amiFamilyType == v1.AMIFamilyCustom {
        log.FromContext(ctx).V(1).Info("calculated instance type resources for Custom AMI", ...)
    }

For EKS-optimized AMIs, I've added detailed documentation with formulas and examples instead.
@moko-poi, |
@jigisha620 Good point! I've updated the log level to V(2) in 4a49e45. This ensures the logs are only visible when users explicitly need detailed troubleshooting with |
Summary
This PR addresses issue #8497 by adding comprehensive visibility into how Karpenter calculates kubeReserved and allocatable resources for instance types. This helps users, especially those using Custom AMI families, understand and troubleshoot capacity-related issues.
Changes
Code Changes
NewInstanceType() to track resource calculations
effectivePods and kubeReservedResources as variables
Documentation Changes
Problem Solved
Issue #8497 - Problem 3: Lack of visibility
Users previously had no way to understand:
maxPods might not match the effective pod count
Impact
Testing
Example Log Output
    {
      "level": "info",
      "ts": "2025-12-23T19:30:52Z",
      "msg": "calculated instance type resources",
      "instance-type": "m5.large",
      "ami-family": "Custom",
      "max-pods-configured": 110,
      "effective-pods": 737,
      "uses-eni-limited-overhead": true,
      "capacity-memory": "8192Mi",
      "capacity-cpu": "2",
      "kube-reserved-memory": "8362Mi",
      "kube-reserved-cpu": "80m",
      "system-reserved-memory": "0",
      "system-reserved-cpu": "0",
      "allocatable-memory": "6.5Gi",
      "allocatable-cpu": "1920m"
    }
Related Issues
Checklist
Screenshots/Recordings
N/A - This is a logging and documentation enhancement
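As a rough cross-check of the Example Log Output, the sketch below recomputes the kube-reserved memory from a pod count. It assumes the formula used by the EKS-optimized AMI bootstrap (11 MiB per pod plus a 255 MiB base) and assumes Karpenter's calculation follows it; the function name `kubeReservedMemoryMiB` is hypothetical and is not taken from this PR.

```go
package main

import "fmt"

// kubeReservedMemoryMiB applies the assumed reservation formula:
// 11 MiB per pod plus a 255 MiB base. The name is hypothetical and
// the formula is an assumption, not code from this PR.
func kubeReservedMemoryMiB(pods int64) int64 {
	return 11*pods + 255
}

func main() {
	const maxPodsConfigured = 110 // "max-pods-configured" in the example log
	const effectivePods = 737     // "effective-pods" in the example log

	// Reservation derived from the effective pod count.
	fmt.Printf("from effective pods: %dMi\n", kubeReservedMemoryMiB(effectivePods))
	// Reservation that the configured maxPods would give instead.
	fmt.Printf("from configured maxPods: %dMi\n", kubeReservedMemoryMiB(maxPodsConfigured))
}
```

With these inputs the first line prints 8362Mi, which matches the kube-reserved-memory field in the example log, and the second prints 1465Mi. The gap between the two is the kind of difference the new log line is meant to make visible.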