Description
Environmental Info:
K3s Version:
v1.34.2, v1.34.3, v1.35.0, v1.35.1
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Provisioning cluster this way:
k3d cluster create upstream --agents 1 --wait --image "rancher/k3s:$K3SERVER_VERSION"
Describe the bug:
The problem we’re encountering is that our e2e tests fail because at certain points we run kubectl delete on Fleet custom resources such as GitRepo or HelmOp, which have finalizers and are therefore not deleted immediately.
We use k3d with the --image flag to create the test cluster.
You can see the failure, for example, in this GitHub job:
https://github.com/rancher/fleet/actions/runs/21831206743/job/63109227320?pr=4602
In that job, kubectl delete is being called with -v=6.
I was able to reproduce this by running the attached script.
I can confirm that delete times out, but the GitRepo resource no longer exists in the cluster.
If, while the delete is still running (after several seconds have already passed), you run kubectl get on the same resource in another terminal, you can see that it no longer exists.
Naturally, our first thought was that perhaps a finalizer was still pending and that was why the resource was not being removed.
However, I can confirm that the resource no longer exists even before delete reports the timeout.
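The two-terminal check described above can be sketched as follows (a minimal sketch; the resource name, namespace, and timeout are illustrative assumptions, not taken from the actual test):

```shell
# Terminal 1: delete hangs waiting for deletion confirmation even though
# the object is already gone (name and namespace are illustrative).
kubectl delete gitrepo test-gitrepo -n fleet-local --timeout=60s -v=6

# Terminal 2, while the delete above is still hanging:
kubectl get gitrepo test-gitrepo -n fleet-local
# Reports NotFound although kubectl delete has not returned yet.
```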
This has been tested with the following versions:
v1.35.1-k3s1, v1.35.0-k3s3, v1.35.0-k3s1, v1.34.3-k3s1, v1.34.2-k3s1
In all of them it fails at some point (it doesn’t always fail immediately; sometimes it takes longer, sometimes less).
The same script has been running for more than 6 hours using version v1.34.1-k3s1 without failing even once.
In fact, we detected the issue when upgrading from v1.34.1-k3s1 to v1.35.0-k3s1.
I've tested the following versions and could not recreate the issue:
v1.34.1-k3s1, v1.33.6-k3s1, v1.33.5-k3s1
The first failing version is v1.34.2-k3s1.
Steps To Reproduce:
To reproduce the problem with the attached script, extract the tarball and execute the included bash script from the same directory:
tar xzvf test-delete.tar.gz
cd test-delete
./test-delete.sh
The script creates and deletes a GitRepo in a loop until the timeout occurs.
Expected behavior:
kubectl delete should not get stuck when deleting a resource (that is successfully deleted)
Actual behavior:
kubectl delete gets stuck when deleting a resource.
Additional context / logs:
I can also confirm that the issue does not happen when using etcd, so it looks like an issue related to kine.
I cannot recreate it when adding --k3s-arg "--cluster-init@server:0" to the k3d command.
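For reference, that workaround applied to the provisioning command above looks like this (a sketch; --cluster-init makes k3s use embedded etcd instead of kine, which is consistent with the etcd observation):

```shell
# Same cluster as in the reproduction, but with embedded etcd;
# "@server:0" scopes the k3s flag to the first (and only) server node.
k3d cluster create upstream --agents 1 --wait \
  --image "rancher/k3s:$K3SERVER_VERSION" \
  --k3s-arg "--cluster-init@server:0"
```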