Cannot stop the container: stop timeout #3125

@sofat1989

Containerd: v1.2.4
Error Message:

time="2019-03-25T02:07:08.210779178-07:00" level=error msg="An error occurs during waiting for container "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" to be stopped" error="wait container "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" stop timeout"

The container is OOM-killed soon after it starts. The command crictl ps still shows the container as running, even though it has already exited.
I checked the containerd logs and found the following:

Mar 22 01:56:43  containerd[9598]: time="2019-03-22T01:56:43.650896850-07:00" level=info msg="StartContainer for "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3""
Mar 22 01:56:43 containerd[9598]: time="2019-03-22T01:56:43.652175173-07:00" level=info msg="shim containerd-shim started" address="/containerd-shim/k8s.io/50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3/shim.sock" debug=true pid=28058
Mar 22 01:56:44 containerd[9598]: time="2019-03-22T01:56:44.476147620-07:00" level=info msg="StartContainer for "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" returns successfully"
Mar 22 01:58:04 containerd[9598]: time="2019-03-22T01:58:04-07:00" level=error msg="post event" error="failed to publish event" namespace=k8s.io path="/run/containerd/io.containerd.runtime.v1.linux/k8s.io/50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" pid=28058
Mar 22 02:01:23 containerd[9598]: time="2019-03-22T02:01:23.970668962-07:00" level=info msg="Finish piping stderr of container "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3""
Mar 22 02:01:23  containerd[9598]: time="2019-03-22T02:01:23.970856941-07:00" level=info msg="Finish piping stdout of container "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3""
Mar 22 02:01:24 containerd[9598]: time="2019-03-22T02:01:24.013183916-07:00" level=info msg="TaskOOM event &TaskOOM{ContainerID:50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3,}"
Mar 22 02:07:16  containerd[9598]: time="2019-03-22T02:07:16.799368410-07:00" level=info msg="StopContainer for "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" with timeout 30 (s)"
Mar 22 02:07:16 containerd[9598]: time="2019-03-22T02:07:16.818108390-07:00" level=info msg="Stop container "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" with signal terminated"
Mar 22 02:07:46  containerd[9598]: time="2019-03-22T02:07:46.834222655-07:00" level=error msg="An error occurs during waiting for container "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" to be stopped" error="wait container "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" stop timeout"

This error stands out:

Mar 22 01:58:04  containerd[9598]: time="2019-03-22T01:58:04-07:00" level=error msg="post event" error="failed to publish event" namespace=k8s.io path="/run/containerd/io.containerd.runtime.v1.linux/k8s.io/50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" pid=28058

I think a TaskExit event should have been published at this point. We cannot find out why publishing the event failed. We suspect that the container cannot be stopped because containerd never received the TaskExit event.

After containerd is restarted, its log shows:

Mar 25 02:07:28 containerd[25806]: time="2019-03-25T02:07:28.877990557-07:00" level=info msg="TaskExit event &TaskExit{ContainerID:50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3,ID:50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3,Pid:28097,ExitStatus:137,ExitedAt:2019-03-25 09:07:26.071348787 +0000 UTC,}"
Mar 25 02:07:36  containerd[25806]: time="2019-03-25T02:07:36.494666976-07:00" level=info msg="Container to stop "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" must be in running or unknown state, current state "CONTAINER_EXITED""
Mar 25 02:07:36  containerd[25806]: time="2019-03-25T02:07:36.516380707-07:00" level=info msg="Container to stop "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" must be in running or unknown state, current state "CONTAINER_EXITED""
Mar 25 02:07:36  containerd[25806]: time="2019-03-25T02:07:36.537179076-07:00" level=info msg="RemoveContainer for "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3""
Mar 25 02:07:36 containerd[25806]: time="2019-03-25T02:07:36.563730617-07:00" level=info msg="RemoveContainer for "50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" returns successfully"
Mar 25 02:07:36  containerd[25806]: time="2019-03-25T02:07:36.766274471-07:00" level=debug msg="removed snapshot" key="k8s.io/4711/50226b57f4b79111737460f54cc3e142631d8803efa6ecef58d808f8b56ed0a3" snapshotter=overlayfs
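To look for the same pattern elsewhere, a small Go helper like the one below could scan containerd log lines for containers that logged a TaskOOM event but never a TaskExit event. This is only a sketch: the regular expression is based on the message format in the excerpts above, `missingExit` is a hypothetical name, and an OOM kill does not always mean that the container's main process exited, so a hit is a hint and not proof.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
)

// eventRe matches fragments such as
//   TaskOOM event &TaskOOM{ContainerID:<id>,
//   TaskExit event &TaskExit{ContainerID:<id>,
// as they appear in the log excerpts above.
var eventRe = regexp.MustCompile(`Task(OOM|Exit) event &Task(?:OOM|Exit)\{ContainerID:([0-9a-f]+),`)

// missingExit returns the sorted IDs of containers that logged a TaskOOM
// event but no TaskExit event in the given lines.
func missingExit(lines []string) []string {
	oom := map[string]bool{}
	exited := map[string]bool{}
	for _, line := range lines {
		m := eventRe.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		if m[1] == "OOM" {
			oom[m[2]] = true
		} else {
			exited[m[2]] = true
		}
	}
	var ids []string
	for id := range oom {
		if !exited[id] {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

func main() {
	// Read log lines from stdin; allow long lines of up to 1 MiB.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	for _, id := range missingExit(lines) {
		fmt.Println(id)
	}
}
```

Fed with the log from before the restart, this would print the container ID above, because the TaskExit event only shows up after containerd was restarted.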

Q1: Under what conditions does publishing an event fail?
Q2: Why can't the container be stopped?
