Description
Hi,
We have a setup where we first try to restore a container using CRIU. If the CRIU restore fails for any reason, we launch another container normally as a fallback. Both containers belong to the same pod and therefore share the pod's network namespace.
One issue we observed is that when the first container fails to restore from its checkpoint due to TCP issues, the iptables rules below are leaked into the second (fallback) container through the shared network namespace, preventing it from functioning normally:
Chain INPUT (policy ACCEPT 3 packets, 180 bytes)
pkts bytes target prot opt in out source destination
1086 65160 CRIU 0 -- * * 0.0.0.0/0 0.0.0.0/0
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 3 packets, 120 bytes)
pkts bytes target prot opt in out source destination
293 17580 CRIU 0 -- * * 0.0.0.0/0 0.0.0.0/0
Chain CRIU (2 references)
pkts bytes target prot opt in out source destination
0 0 ACCEPT 0 -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0xc114
1379 82740 DROP 0 -- * * 0.0.0.0/0 0.0.0.0/0
I believe these iptables rules are installed by network_lock_internal during restore (line 2123 in 2cf8f13):

    ret = network_lock_internal(/* restore = */ true);
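For context, here is a minimal sketch of how a network lock matching the dump above could be installed. This is an illustration under my assumptions, not CRIU's actual code: I am assuming the ruleset is fed to iptables-restore, and the names lock_network_sketch and lock_rules are mine. The effect matches the dump: packets carrying the 0xc114 mark are accepted, everything else is dropped.

    #include <stdio.h>
    #include <stdlib.h>

    /* The 0xc114 mark from the CRIU chain in the dump above: marked packets
     * are accepted, everything else is dropped while the lock is in place. */
    #define SOCCR_MARK "0xc114"

    static const char *lock_rules =
            "*filter\n"
            ":CRIU - [0:0]\n"
            "-I INPUT -j CRIU\n"
            "-I OUTPUT -j CRIU\n"
            "-A CRIU -m mark --mark " SOCCR_MARK " -j ACCEPT\n"
            "-A CRIU -j DROP\n"
            "COMMIT\n";

    static int lock_network_sketch(void)
    {
            /* --noflush preserves whatever rules the namespace already has. */
            FILE *p = popen("iptables-restore --noflush", "w");

            if (!p)
                    return -1;
            fputs(lock_rules, p);
            return pclose(p) == 0 ? 0 : -1;
    }

    int main(void)
    {
            return lock_network_sketch() ? EXIT_FAILURE : EXIT_SUCCESS;
    }

Once these rules are in place, only a later network_unlock (or equivalent cleanup) removes the CRIU chain again, which is where the problem below comes in.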
However, when the restore fails, control flow jumps to out_kill without ever invoking network_unlock (line 2201 in 2cf8f13):

    network_unlock();

which appears to be invoked only when the checkpoint is restored successfully.
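To make the suggestion concrete, here is a simplified sketch of the control flow I have in mind. The function names mirror the snippets above, but do_restore_steps and kill_restored_tasks are hypothetical stand-ins, and the stubs exist only so the shape of the error path is visible; this is not CRIU's actual restore code.

    #include <stdbool.h>
    #include <stdio.h>

    /* Stubs standing in for the real functions; only the control flow matters. */
    static int network_lock_internal(bool restore) { (void)restore; return 0; }
    static void network_unlock(void) { puts("CRIU chain removed"); }
    static int do_restore_steps(void) { return -1; /* pretend the restore failed */ }
    static void kill_restored_tasks(void) { puts("restored tasks killed"); }

    static int restore_root_task_sketch(void)
    {
            if (network_lock_internal(/* restore = */ true))
                    return -1;

            if (do_restore_steps())
                    goto out_kill;

            network_unlock();       /* today the lock is lifted only on this success path */
            return 0;

    out_kill:
            kill_restored_tasks();
            network_unlock();       /* proposed: also lift the lock when the restore fails */
            return -1;
    }

    int main(void)
    {
            return restore_root_task_sketch() ? 1 : 0;
    }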
I'm wondering whether we should clean up these iptables rules even when the restore fails. Thanks for your attention!
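In the meantime, as a workaround on our side, something like the following could remove the leaked chain from the pod's shared network namespace before the fallback container starts. This is only a sketch: the iptables invocations mirror the dump above, it must run inside that namespace, and errors are ignored so it stays idempotent.

    #include <stdlib.h>

    int main(void)
    {
            /* Detach the leaked CRIU chain from INPUT/OUTPUT, then flush and delete it. */
            system("iptables -w -D INPUT -j CRIU 2>/dev/null");
            system("iptables -w -D OUTPUT -j CRIU 2>/dev/null");
            system("iptables -w -F CRIU 2>/dev/null");
            system("iptables -w -X CRIU 2>/dev/null");
            return 0;
    }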