
Container startup gets stuck in vnc_startup.sh when running on AWS ECS, on EC2 instances (not with Fargate) and with awsvpc network mode #59

@fj604

Description


function wait_for_network_devices() {
    while true; do
        interfaces=$(ip link show type veth | awk -F: '/^[0-9]+: / {print $2}' | awk '{print $1}' | sed 's/@.*//')
        if [ -z "$interfaces" ]; then
            sleep 1
            continue
        fi
        for interface in $interfaces; do
            # ignore eth* interfaces if egress gateway is enabled
            if [[ $interface == eth* && -z $KASM_SVC_EGRESS ]]; then
                return
            fi
            if [[ $interface == k-p-* ]]; then
                wait_for_egress_signal
                if [ -z "$KASM_PROFILE_LDR" ]; then
                    http_proxy="" https_proxy="" curl -k "https://${KASM_API_HOST}:${KASM_API_PORT}/api/set_kasm_session_status?token=${KASM_API_JWT}" -H 'Content-Type: application/json' -d '{"status": "running"}'
                fi
                return
            fi
        done
        sleep 1
    done
}

When running Kasm Workspaces in AWS ECS on EC2 (not Fargate) with the awsvpc network mode, the container gets stuck in the startup script and never starts listening on port 6901.

default:~$ ip link show type veth
3: ecs-eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 0a:58:a9:fe:ac:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
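Running just the interface-discovery pipeline from wait_for_network_devices against that environment shows what the script actually sees (a sketch reproducing only the parsing step, using the ip link output above):

default:~$ ip link show type veth | awk -F: '/^[0-9]+: / {print $2}' | awk '{print $1}' | sed 's/@.*//'
ecs-eth0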
  • The script expects interfaces to be of type veth and to be named eth*
  • In awsvpc mode the container is given its own elastic network interface (ENI) that appears as ecs-eth0.
  • Because ecs-eth0 starts with neither eth nor k-p-, neither branch of the test fires; the loop finishes one iteration, sleeps one second, and repeats indefinitely. Start-up therefore stalls even though the interface was correctly “seen”. One possible adjustment is sketched below.
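For illustration only, a minimal sketch of one possible adjustment, assuming the intent is to treat the ECS-managed ENI the same way as a regular eth* interface. The ecs-eth* prefix is taken from the output above; whether this is the condition the maintainers actually want is an open question.

# Hypothetical change inside wait_for_network_devices:
# also accept the ecs-eth* name that awsvpc network mode gives the ENI.
if [[ ( $interface == eth* || $interface == ecs-eth* ) && -z $KASM_SVC_EGRESS ]]; then
    return
fi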
