Description
Hi,
We have a setup where we first try to restore a container using CRIU. If the CRIU restore fails for any reason, we launch another container normally as a fallback. Both containers belong to the same pod and therefore share the pod's network namespace.
One issue we observed is that when the first container fails to restore from its checkpoint due to TCP issues, the iptables rules below are leaked into the second (fallback) container through the shared network namespace, preventing it from functioning normally:
Chain INPUT (policy ACCEPT 3 packets, 180 bytes)
pkts bytes target prot opt in out source destination
1086 65160 CRIU 0 -- * * 0.0.0.0/0 0.0.0.0/0
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 3 packets, 120 bytes)
pkts bytes target prot opt in out source destination
293 17580 CRIU 0 -- * * 0.0.0.0/0 0.0.0.0/0
Chain CRIU (2 references)
pkts bytes target prot opt in out source destination
0 0 ACCEPT 0 -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0xc114
1379 82740 DROP 0 -- * * 0.0.0.0/0 0.0.0.0/0
I believe these iptables rules are installed by network_lock_internal during restore (line 2123 in 2cf8f13):

    ret = network_lock_internal(/* restore = */ true);
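For context, here is a minimal sketch of how a network lock matching the dump above could be installed. This is an illustration under my assumptions, not CRIU's actual code: I am assuming the ruleset is fed to iptables-restore, and the names lock_network_sketch and lock_rules are mine. The effect matches the dump: packets carrying the 0xc114 mark are accepted, everything else is dropped.

    #include <stdio.h>
    #include <stdlib.h>

    /* The 0xc114 mark from the CRIU chain in the dump above: marked packets
     * are accepted, everything else is dropped while the lock is in place. */
    #define SOCCR_MARK "0xc114"

    static const char *lock_rules =
            "*filter\n"
            ":CRIU - [0:0]\n"
            "-I INPUT -j CRIU\n"
            "-I OUTPUT -j CRIU\n"
            "-A CRIU -m mark --mark " SOCCR_MARK " -j ACCEPT\n"
            "-A CRIU -j DROP\n"
            "COMMIT\n";

    static int lock_network_sketch(void)
    {
            /* --noflush preserves whatever rules the namespace already has. */
            FILE *p = popen("iptables-restore --noflush", "w");

            if (!p)
                    return -1;
            fputs(lock_rules, p);
            return pclose(p) == 0 ? 0 : -1;
    }

    int main(void)
    {
            return lock_network_sketch() ? EXIT_FAILURE : EXIT_SUCCESS;
    }

Once these rules are in place, only a later network_unlock (or equivalent cleanup) removes the CRIU chain again, which is where the problem below comes in.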
However, when the restore fails, control flow jumps to out_kill without ever invoking network_unlock (line 2201 in 2cf8f13):

    network_unlock();

which appears to be invoked only when the checkpoint is restored successfully.
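To make the suggestion concrete, here is a simplified sketch of the control flow I have in mind. The function names mirror the snippets above, but do_restore_steps and kill_restored_tasks are hypothetical stand-ins, and the stubs exist only so the shape of the error path is visible; this is not CRIU's actual restore code.

    #include <stdbool.h>
    #include <stdio.h>

    /* Stubs standing in for the real functions; only the control flow matters. */
    static int network_lock_internal(bool restore) { (void)restore; return 0; }
    static void network_unlock(void) { puts("CRIU chain removed"); }
    static int do_restore_steps(void) { return -1; /* pretend the restore failed */ }
    static void kill_restored_tasks(void) { puts("restored tasks killed"); }

    static int restore_root_task_sketch(void)
    {
            if (network_lock_internal(/* restore = */ true))
                    return -1;

            if (do_restore_steps())
                    goto out_kill;

            network_unlock();       /* today the lock is lifted only on this success path */
            return 0;

    out_kill:
            kill_restored_tasks();
            network_unlock();       /* proposed: also lift the lock when the restore fails */
            return -1;
    }

    int main(void)
    {
            return restore_root_task_sketch() ? 1 : 0;
    }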
I'm wondering whether we should clean up these iptables rules even when the restore fails. Thanks for your attention!
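In the meantime, as a workaround on our side, something like the following could remove the leaked chain from the pod's shared network namespace before the fallback container starts. This is only a sketch: the iptables invocations mirror the dump above, it must run inside that namespace, and errors are ignored so it stays idempotent.

    #include <stdlib.h>

    int main(void)
    {
            /* Detach the leaked CRIU chain from INPUT/OUTPUT, then flush and delete it. */
            system("iptables -w -D INPUT -j CRIU 2>/dev/null");
            system("iptables -w -D OUTPUT -j CRIU 2>/dev/null");
            system("iptables -w -F CRIU 2>/dev/null");
            system("iptables -w -X CRIU 2>/dev/null");
            return 0;
    }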