Command inside container hangs forever #253

Closed
@fooock

I'm able to run a new busybox container using this command:

sudo firecracker-ctr --address /run/firecracker-containerd/containerd.sock run \
--snapshotter firecracker-naive \
--runtime aws.firecracker \
--rm --env HELLO=world --tty docker.io/library/busybox:latest busybox-1

When I'm inside, I execute these commands:

/ # echo $HELLO
world
/ # ls
bin   dev   etc   home  proc  root  run   sys   tmp   usr   var
/ # ls /sys/block/

The second ls command (ls /sys/block/) hangs forever. The task can't be killed, paused, or destroyed. Logs from firecracker-containerd:

DEBU[2019-08-29T19:09:50.989475168+02:00] prepare snapshot                              key=busybox-1 parent="sha256:0d315111b4847e8cd50514ca19657d1e8d827f4e128d172ce8b2f76a04f3faea"
DEBU[2019-08-29T19:09:51.569003261+02:00] event published                               ns=default topic=/snapshot/prepare type=containerd.events.SnapshotPrepare
DEBU[2019-08-29T19:09:51.572614670+02:00] get snapshot mounts                           key=busybox-1
DEBU[2019-08-29T19:09:55.283637876+02:00] event published                               ns=default topic=/containers/create type=containerd.events.ContainerCreate
DEBU[2019-08-29T19:09:55.288290532+02:00] get snapshot mounts                           key=busybox-1
time="2019-08-29T19:09:55.329489962+02:00" level=debug msg=StartShim runtime=aws.firecracker task_id=busybox-1
time="2019-08-29T19:09:55.329619889+02:00" level=info msg="will start a single-task VM since no VMID has been provided" runtime=aws.firecracker task_id=busybox-1 vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
DEBU[2019-08-29T19:09:55.330150426+02:00] create VM request: VMID:"ccf256ab-2ad4-4380-acc9-4cd6acae2c34" ContainerCount:1 ExitAfterAllTasksDeleted:true  
DEBU[2019-08-29T19:09:55.330178269+02:00] using namespace: default                     
DEBU[2019-08-29T19:09:55.330532743+02:00] waiting on shim process                       vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.351619410+02:00] starting signal loop                          namespace=default path=/run/firecracker-containerd/default/ccf256ab-2ad4-4380-acc9-4cd6acae2c34 pid=13909
INFO[2019-08-29T19:09:55.351810128+02:00] creating new VM                               runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.386745469+02:00] Called startVMM(), setting up a VMM on /var/run/firecracker-containerd/default/ccf256ab-2ad4-4380-acc9-4cd6acae2c34/firecracker.sock  runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.399817508+02:00] refreshMachineConfiguration: [GET /machine-config][200] getMachineConfigurationOK  &{CPUTemplate:T2 HtEnabled:0xc00047819b MemSizeMib:0xc000478190 VcpuCount:0xc000478188}  runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.400159781+02:00] PutGuestBootSource: [PUT /boot-source][204] putGuestBootSourceNoContent   runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.400185833+02:00] Attaching drive /var/run/firecracker-containerd/default/ccf256ab-2ad4-4380-acc9-4cd6acae2c34/stub0, slot stub0, root false.  runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.400653874+02:00] Attached drive /var/run/firecracker-containerd/default/ccf256ab-2ad4-4380-acc9-4cd6acae2c34/stub0: [PUT /drives/{drive_id}][204] putGuestDriveByIdNoContent   runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.400676438+02:00] Attaching drive /var/lib/firecracker-containerd/runtime/default-rootfs.img, slot root_drive, root true.  runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.401031501+02:00] Attached drive /var/lib/firecracker-containerd/runtime/default-rootfs.img: [PUT /drives/{drive_id}][204] putGuestDriveByIdNoContent   runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.410509688+02:00] startInstance successful: [PUT /actions][204] createSyncActionNoContent   runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:55.410539751+02:00] calling agent                                 runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:56.211042010+02:00] successfully started the VM                   runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
DEBU[2019-08-29T19:09:56.211351012+02:00] event forwarded                               ns=default topic=/firecracker-vm/start type=VMStart
INFO[2019-08-29T19:09:56.212777900+02:00] PatchGuestDrive successful                    runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
INFO[2019-08-29T19:09:56.342161558+02:00] successfully created task                     ExecID= TaskID=busybox-1 pid_in_vm=742 runtime=aws.firecracker vmID=ccf256ab-2ad4-4380-acc9-4cd6acae2c34
DEBU[2019-08-29T19:09:56.342434728+02:00] event forwarded                               ns=default topic=/tasks/create type=containerd.events.TaskCreate
DEBU[2019-08-29T19:09:56.348199604+02:00] event forwarded                               ns=default topic=/tasks/start type=containerd.events.TaskStart

If I stop firecracker-containerd and start it again, these lines appear in the log:

DEBU[2019-08-29T19:43:39.599290608+02:00] loading tasks in namespace                    namespace=default
WARN[2019-08-29T19:43:39.599402490+02:00] cleaning up after shim disconnected           id=busybox-1 namespace=default
INFO[2019-08-29T19:43:39.599413056+02:00] cleaning up dead shim                        
WARN[2019-08-29T19:43:39.622662715+02:00] failed to clean up after shim disconnected    error="aws.firecracker: failed to connect: dial unix /run/firecracker-containerd/containerd.sock.ttrpc: connect: connection refused\n: exit status 1" id=busybox-1 namespace=default
DEBU[2019-08-29T19:43:39.622741381+02:00] event published                               ns=default topic=/tasks/exit type=containerd.events.TaskExit

Why does the ls command hang forever? What can I do to avoid this? Should I set up a container timeout?
