Skip to content

Conversation

henry3260
Copy link

@henry3260 henry3260 commented Oct 15, 2025

Fix: #21105

Before:

I0721 15:38:26.704501 10969 main.go:141] libmachine: Attempt 0
I0721 15:38:26.704823 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 1
I0721 15:38:28.705413 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:30.706205 10969 main.go:141] libmachine: Attempt 2
I0721 15:38:30.706254 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:32.706985 10969 main.go:141] libmachine: Attempt 3
I0721 15:38:32.707114 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:34.707951 10969 main.go:141] libmachine: Attempt 4
I0721 15:38:34.708015 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:36.708653 10969 main.go:141] libmachine: Attempt 5
I0721 15:38:36.708722 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:38.709263 10969 main.go:141] libmachine: Attempt 6
I0721 15:38:38.709309 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:40.709823 10969 main.go:141] libmachine: Attempt 7
I0721 15:38:40.709885 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:42.711666 10969 main.go:141] libmachine: Attempt 8
I0721 15:38:42.711705 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:44.713371 10969 main.go:141] libmachine: Attempt 9
I0721 15:38:44.713427 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:46.713856 10969 main.go:141] libmachine: Attempt 10
I0721 15:38:46.713897 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:48.714571 10969 main.go:141] libmachine: Attempt 11
I0721 15:38:48.714619 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:50.715270 10969 main.go:141] libmachine: Attempt 12
I0721 15:38:50.715313 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:52.716156 10969 main.go:141] libmachine: Attempt 13
I0721 15:38:52.716192 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:54.717573 10969 main.go:141] libmachine: Attempt 14
I0721 15:38:54.717630 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:56.719394 10969 main.go:141] libmachine: Attempt 15
I0721 15:38:56.719576 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:58.721309 10969 main.go:141] libmachine: Attempt 16
I0721 15:38:58.721354 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:00.722254 10969 main.go:141] libmachine: Attempt 17
I0721 15:39:00.722291 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:02.723277 10969 main.go:141] libmachine: Attempt 18
I0721 15:39:02.723346 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:04.725023 10969 main.go:141] libmachine: Attempt 19
I0721 15:39:04.725093 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:06.725607 10969 main.go:141] libmachine: Attempt 20
I0721 15:39:06.725653 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:39:08.727193 10969 main.go:141] libmachine: Attempt 21
I0721 15:39:08.727256 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...

After:

image

Copy link

linux-foundation-easycla bot commented Oct 15, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: henry3260 / name: Henry Chen (37d7943)

@k8s-ci-robot k8s-ci-robot requested review from medyagh and nirs October 15, 2025 18:14
@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 15, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @henry3260. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Oct 15, 2025
@minikube-bot
Copy link
Collaborator

Can one of the admins verify this patch?

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Oct 15, 2025
Copy link
Contributor

@nirs nirs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We repeat the same code in 3 places. Adding logging 3 times to the 3 duplicates is not the right way to maintain this code.

What we need to do:

  1. Eliminate the duplicate code and move the retry loop to a helper (likely in the common package).
  2. Add the nicer log in the loop, replacing 2 logs (one in the loop, one in the code searching the Mac address) with single log.

log.Debugf("Searching for %s in %s ...", mac, path)
if klog.V(4).Enabled() {
log.Debugf("Searching for %s in %s ...", mac, path)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this log, there is no point in logging this since we log the attempts in the loop calling this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this log, there is no point in logging this since we log the attempts in the loop calling this.

I think we could keep the original log and only surface it when the user increases the log verbosity level. That way, it won’t clutter the default output but still offers useful traceability when needed. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log is not needed, lets keep it simple.

}
// Implement a retry loop because IP address isn't added to dhcp leases file immediately
attempts := 0
for i := 0; i < 60; i++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding another variable use:

for attempts := 0; attempts < 60; attempts++ {

But I think we can keep the loop as is and use i.

attempts = i
if klog.V(4).Enabled() {
log.Debugf("Attempt %d", i)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log is completely unneeded now since we log the attempts in the next log.

time.Sleep(2 * time.Second)
}
path := "/var/db/dhcpd_leases"
log.Debugf("Searching for %s in %s (attempt %d/60)", d.MACAddress, path, attempts+1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot duplicate the path here. We can use the constant from common.LeasesPath.

So I would keep only this log in this loop:

log.Debugf("Searching for %s in %s (attempt %d/%d)", d.MACAddress, common.LeasesPath, i+1, maxAttempts)

multiplier := 1
if detect.NestedVM() {
multiplier = 3 // will help with running in Free github action Macos VMs (takes 112+ retries on average)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a variable instead of repeating 60*multiplier and use it in the lop and the log.

cost maxAttempts := 60 * multiplier

Same comment for vfkit.go.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a variable instead of repeating 60*multiplier and use it in the lop and the log.

cost maxAttempts := 60 * multiplier

Would it be better to put maxAttempts in the const section instead of defining it at the top of the loop?
We can't do that because the multiplier is not a constant — its value isn't known at compile time.

Copy link
Contributor

@nirs nirs Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it must be in the function and unfortunately const will not work in Go for local value that cannot change after setting it.

}
return nil
}
// Implement a retry loop because IP address isn't added to dhcp leases file immediately
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a constant for maxAttempts since we need it twice:

const maxAttempts = 60

This will also make it more consistent with other drivers, which will make it easier to extract this loop to a helper function, since all the drivers use the similar logic.

Copy link
Member

@medyagh medyagh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I agree with this Change ! someitmes when the minikube log Stuck on Online it is very devistating to look at it, wondering what minikube is doing, I think we should keep this, since this will be use full to implemement Sub Steps or a Progress-bar style UI spinning later

but I really like not repeating
I0721 15:38:26.704823 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...

thats a good improveement !

instead of removing all the updates, I suggrest limit number of times it trying, we could skip every 3 or 5 times.
so long one in 5,

so the log should say

I0721 15:38:26.704823 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ...
I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 1
I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 5
I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 10
...

this was actually very usefful for me when I was trying to make minikube work on Github Action Macos, I would know exactly how many Minutes to add to the timeout or find the correct Max Attempts for github actions

@nirs
Copy link
Contributor

nirs commented Oct 16, 2025

I am not sure if I agree with this Change !

This current code is not implementing what the issue suggests.

someitmes when the minikube log Stuck on Online it is very devistating to look at it, wondering what minikube is doing, I think we should keep this, since this will be use full to implemement Sub Steps or a Progress-bar style UI spinning later

The issue is not suggesting to remove all logs when searching the ip address, but to have log per attempt instead of 2 logs. This keeps enough progress so minikube does not fill stuck until the search is complete.

instead of removing all the updates, I suggrest limit number of times it trying, we could skip every 3 or 5 times. so long one in 5,

so the log should say

I0721 15:38:26.704823 10969 main.go:141] libmachine: Searching for d6:3f:76:90:91:76 in /var/db/dhcpd_leases ... I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 1 I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 5 I0721 15:38:28.705345 10969 main.go:141] libmachine: Attempt 10 ...

this was actually very usefful for me when I was trying to make minikube work on Github Action Macos, I would know exactly how many Minutes to add to the timeout or find the correct Max Attempts for github actions

This will be too complicated and we should not spend time on something which is wrong by design. The way to get the ip address is to get it from the guest, either by an agent running in the guest. This will work with any network (shared, bridged) and does not require polling.

Until we have this we should do the minimal changes needed so this make less noise in the logs.

Copy link
Member

@medyagh medyagh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henry3260 please update the PR with the new Before/After

Copy link
Contributor

@nirs nirs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better now! just need to remove the unwanted constant at the op of the file, and bake the multiplier into the value so we don't repeat the multiplier.

serialFileName = "serial.log"
logLevelInfo = "3"
defaultSSHUser = "docker"
maxAttempts = 60
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value makes sense only in Start(), and even there is should be right above the code using it. Putting it here confuses the reader that this driver uses maxAttempts for many operations while it is used only for one operation.

Let make it a variable at the start right above the loop. This way we can bake into the value the multiplier in the 2 version using a multiplier.

@henry3260
Copy link
Author

@henry3260 please update the PR with the new Before/After

Updated the PR with the new Before/After!

@henry3260
Copy link
Author

@nirs @medyagh Thanks for the review and guidance!

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: henry3260, medyagh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 17, 2025
@medyagh
Copy link
Member

medyagh commented Oct 17, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 17, 2025
@medyagh medyagh self-requested a review October 17, 2025 18:34
@k8s-ci-robot
Copy link
Contributor

@henry3260: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-minikube-integration 37d7943 link true /test pull-minikube-integration

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@minikube-pr-bot
Copy link

kvm2 driver with docker runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 21753 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 42.8s    │ 44.5s                  │
│ enable ingress │ 16.9s    │ 17.0s                  │
└────────────────┴──────────┴────────────────────────┘

Times for minikube start: 46.0s 42.7s 41.3s 41.8s 42.1s
Times for minikube (PR 21753) start: 43.6s 45.1s 43.8s 44.8s 45.2s

Times for minikube ingress: 18.9s 15.9s 16.8s 15.9s 16.9s
Times for minikube (PR 21753) ingress: 16.4s 15.9s 16.4s 15.9s 20.4s

docker driver with docker runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 21753 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 23.2s    │ 23.8s                  │
│ enable ingress │ 11.9s    │ 12.3s                  │
└────────────────┴──────────┴────────────────────────┘

Times for minikube start: 20.6s 23.9s 22.6s 24.5s 24.6s
Times for minikube (PR 21753) start: 26.1s 22.4s 23.6s 23.5s 23.3s

Times for minikube ingress: 11.7s 11.7s 12.7s 10.7s 12.7s
Times for minikube (PR 21753) ingress: 10.7s 10.7s 12.7s 13.7s 13.7s

docker driver with containerd runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 21753 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 20.3s    │ 20.9s                  │
│ enable ingress │ 21.0s    │ 21.2s                  │
└────────────────┴──────────┴────────────────────────┘

Times for minikube start: 20.7s 19.9s 20.5s 20.9s 19.5s
Times for minikube (PR 21753) start: 19.9s 20.7s 20.5s 23.0s 20.6s

Times for minikube ingress: 22.2s 20.2s 20.2s 20.2s 22.2s
Times for minikube (PR 21753) ingress: 21.2s 20.2s 23.2s 21.2s 20.2s

@minikube-pr-bot
Copy link

Here are the number of top 10 failed tests in each environments with lowest flake rate.

Environment Test Name Flake Rate
KVM_Linux_crio (7 failed) TestFunctional/serial/KubectlGetPods(gopogh) 0.00% (chart)
KVM_Linux_crio (7 failed) TestFunctional/serial/MinikubeKubectlCmd(gopogh) 0.00% (chart)
KVM_Linux_crio (7 failed) TestFunctional/serial/MinikubeKubectlCmdDirectly(gopogh) 0.00% (chart)
KVM_Linux_crio (7 failed) TestFunctional/serial/SoftStart(gopogh) 6.52% (chart)
Docker_Linux_containerd_arm64 (1 failed) TestScheduledStopUnix(gopogh) 9.80% (chart)
KVM_Linux_containerd (1 failed) TestFunctional/parallel/ConfigCmd(gopogh) 0.00% (chart)

Besides the following environments also have failed tests:

  • Docker_Linux_crio: 37 failed (gopogh)

  • Docker_Linux_crio_arm64: 44 failed (gopogh)

To see the flake rates of all tests by environment, click here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cleanup: Minimize logging when waiting for ip address

6 participants