-
Notifications
You must be signed in to change notification settings - Fork 5.1k
fix: Minimize logging when waiting for ip address #21753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Hi @henry3260. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Can one of the admins verify this patch? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We repeat the same code in 3 places. Adding logging 3 times to the 3 duplicates is not the right way to maintain this code.
What we need to do:
- Eliminate the duplicate code and move the retry loop to a helper (likely in the common package).
- Add the nicer log in the loop, replacing 2 logs (one in the loop, one in the code searching the Mac address) with single log.
pkg/drivers/common/common.go
Outdated
log.Debugf("Searching for %s in %s ...", mac, path) | ||
if klog.V(4).Enabled() { | ||
log.Debugf("Searching for %s in %s ...", mac, path) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this log, there is no point in logging this since we log the attempts in the loop calling this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this log, there is no point in logging this since we log the attempts in the loop calling this.
I think we could keep the original log and only surface it when the user increases the log verbosity level. That way, it won’t clutter the default output but still offers useful traceability when needed. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log is not needed, lets keep it simple.
pkg/drivers/krunkit/krunkit.go
Outdated
} | ||
// Implement a retry loop because IP address isn't added to dhcp leases file immediately | ||
attempts := 0 | ||
for i := 0; i < 60; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding another variable use:
for attempts := 0; attempts < 60; attempts++ {
But I think we can keep the loop as is and use i
.
pkg/drivers/krunkit/krunkit.go
Outdated
attempts = i | ||
if klog.V(4).Enabled() { | ||
log.Debugf("Attempt %d", i) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This log is completely unneeded now since we log the attempts in the next log.
pkg/drivers/krunkit/krunkit.go
Outdated
time.Sleep(2 * time.Second) | ||
} | ||
path := "/var/db/dhcpd_leases" | ||
log.Debugf("Searching for %s in %s (attempt %d/60)", d.MACAddress, path, attempts+1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot duplicate the path here. We can use the constant from common.LeasesPath.
So I would keep only this log in this loop:
log.Debugf("Searching for %s in %s (attempt %d/%d)", d.MACAddress, common.LeasesPath, i+1, maxAttempts)
multiplier := 1 | ||
if detect.NestedVM() { | ||
multiplier = 3 // will help with running in Free github action Macos VMs (takes 112+ retries on average) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a variable instead of repeating 60*multiplier
and use it in the lop and the log.
cost maxAttempts := 60 * multiplier
Same comment for vfkit.go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a variable instead of repeating
60*multiplier
and use it in the lop and the log.cost maxAttempts := 60 * multiplier
Would it be better to put maxAttempts in the const section instead of defining it at the top of the loop?
We can't do that because the multiplier is not a constant — its value isn't known at compile time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it must be in the function and unfortunately const will not work in Go for local value that cannot change after setting it.
} | ||
return nil | ||
} | ||
// Implement a retry loop because IP address isn't added to dhcp leases file immediately |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a constant for maxAttempts since we need it twice:
const maxAttempts = 60
This will also make it more consistent with other drivers, which will make it easier to extract this loop to a helper function, since all the drivers use the similar logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if I agree with this Change ! someitmes when the minikube log Stuck on Online it is very devistating to look at it, wondering what minikube is doing, I think we should keep this, since this will be use full to implemement Sub Steps or a Progress-bar style UI spinning later
but I really like not repeating
I0721 15:38:26.704823 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
thats a good improveement !
instead of removing all the updates, I suggrest limit number of times it trying, we could skip every 3 or 5 times.
so long one in 5,
so the log should say
I0721 15:38:26.704823 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 1
I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 5
I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 10
...
this was actually very usefful for me when I was trying to make minikube work on Github Action Macos, I would know exactly how many Minutes to add to the timeout or find the correct Max Attempts for github actions
This current code is not implementing what the issue suggests.
The issue is not suggesting to remove all logs when searching the ip address, but to have log per attempt instead of 2 logs. This keeps enough progress so minikube does not fill stuck until the search is complete.
This will be too complicated and we should not spend time on something which is wrong by design. The way to get the ip address is to get it from the guest, either by an agent running in the guest. This will work with any network (shared, bridged) and does not require polling. Until we have this we should do the minimal changes needed so this make less noise in the logs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@henry3260 please update the PR with the new Before/After
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better now! just need to remove the unwanted constant at the op of the file, and bake the multiplier into the value so we don't repeat the multiplier.
pkg/drivers/krunkit/krunkit.go
Outdated
serialFileName = "serial.log" | ||
logLevelInfo = "3" | ||
defaultSSHUser = "docker" | ||
maxAttempts = 60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This value makes sense only in Start(), and even there is should be right above the code using it. Putting it here confuses the reader that this driver uses maxAttempts for many operations while it is used only for one operation.
Let make it a variable at the start right above the loop. This way we can bake into the value the multiplier in the 2 version using a multiplier.
Updated the PR with the new Before/After! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: henry3260, medyagh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/ok-to-test |
@henry3260: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
kvm2 driver with docker runtime
Times for minikube start: 46.0s 42.7s 41.3s 41.8s 42.1s Times for minikube ingress: 18.9s 15.9s 16.8s 15.9s 16.9s docker driver with docker runtime
Times for minikube start: 20.6s 23.9s 22.6s 24.5s 24.6s Times for minikube ingress: 11.7s 11.7s 12.7s 10.7s 12.7s docker driver with containerd runtime
Times for minikube start: 20.7s 19.9s 20.5s 20.9s 19.5s Times for minikube ingress: 22.2s 20.2s 20.2s 20.2s 22.2s |
Here are the number of top 10 failed tests in each environments with lowest flake rate.
Besides the following environments also have failed tests: To see the flake rates of all tests by environment, click here. |
Fix: #21105
Before:
I0721 15:38:26.704501 10969 main.go:141] libmachine: Attempt 0
I0721 15:38:26.704823 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 1
I0721 15:38:28.705413 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:30.706205 10969 main.go:141] libmachine: Attempt 2
I0721 15:38:30.706254 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:32.706985 10969 main.go:141] libmachine: Attempt 3
I0721 15:38:32.707114 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:34.707951 10969 main.go:141] libmachine: Attempt 4
I0721 15:38:34.708015 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:36.708653 10969 main.go:141] libmachine: Attempt 5
I0721 15:38:36.708722 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:38.709263 10969 main.go:141] libmachine: Attempt 6
I0721 15:38:38.709309 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:40.709823 10969 main.go:141] libmachine: Attempt 7
I0721 15:38:40.709885 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:42.711666 10969 main.go:141] libmachine: Attempt 8
I0721 15:38:42.711705 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:44.713371 10969 main.go:141] libmachine: Attempt 9
I0721 15:38:44.713427 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:46.713856 10969 main.go:141] libmachine: Attempt 10
I0721 15:38:46.713897 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:48.714571 10969 main.go:141] libmachine: Attempt 11
I0721 15:38:48.714619 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:50.715270 10969 main.go:141] libmachine: Attempt 12
I0721 15:38:50.715313 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:52.716156 10969 main.go:141] libmachine: Attempt 13
I0721 15:38:52.716192 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:54.717573 10969 main.go:141] libmachine: Attempt 14
I0721 15:38:54.717630 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:56.719394 10969 main.go:141] libmachine: Attempt 15
I0721 15:38:56.719576 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:58.721309 10969 main.go:141] libmachine: Attempt 16
I0721 15:38:58.721354 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:00.722254 10969 main.go:141] libmachine: Attempt 17
I0721 15:39:00.722291 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:02.723277 10969 main.go:141] libmachine: Attempt 18
I0721 15:39:02.723346 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:04.725023 10969 main.go:141] libmachine: Attempt 19
I0721 15:39:04.725093 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:06.725607 10969 main.go:141] libmachine: Attempt 20
I0721 15:39:06.725653 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:08.727193 10969 main.go:141] libmachine: Attempt 21
I0721 15:39:08.727256 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
After: