Low entropy causes VM agent to hang indefinitely during start #325

@sipsma

For reasons that are not yet understood, a Firecracker VM may not have sufficient entropy during boot. This can cause our VM agent to pause for a long time very early in process start (possibly while the Go runtime itself is still starting, before any of our code is actually executing). The result is a timeout when the runtime shim attempts to connect to the agent over vsock, with errors like:

failed to create VM: failed to dial the VM over vsock: context deadline exceeded

Because the pause happens so early in agent startup, the agent never writes anything to stdout/stderr, so the debug logs show nothing from it.

Running low on entropy during boot is an open issue with Firecracker. That thread contains some suggested fixes (which I have not yet tried), but it does not appear to have reached a firm conclusion yet. I just saw the rngd suggestion in the thread; I will give that a try and update this issue with the results.
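One way to check whether a guest is actually short on entropy is to read the kernel's estimate from procfs. This is a rough diagnostic sketch, assuming a Linux guest that exposes `/proc/sys/kernel/random/entropy_avail`; the helper name `parseEntropyAvail` is made up for this example, and how meaningful the reported number is varies across kernel versions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// entropyAvailPath is the Linux procfs file exposing the kernel's
// entropy estimate, in bits.
const entropyAvailPath = "/proc/sys/kernel/random/entropy_avail"

// parseEntropyAvail converts the raw file contents (a decimal number
// followed by a newline) into an integer number of bits.
func parseEntropyAvail(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

func main() {
	raw, err := os.ReadFile(entropyAvailPath)
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read entropy estimate:", err)
		os.Exit(1)
	}
	bits, err := parseEntropyAvail(string(raw))
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not parse entropy estimate:", err)
		os.Exit(1)
	}
	fmt.Printf("entropy_avail: %d bits\n", bits)
}
```

Running this inside the guest shortly after boot, with and without a candidate fix such as rngd, would give a before/after comparison.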

I don't know why this suddenly started happening to me. It's on the dev machine I normally use (an i3.metal with a very similar setup to the CI machines). It may just be entirely random when it starts occurring.

We need to follow that Firecracker thread for solutions to this issue and decide how we can address it for firecracker-containerd users.


Labels: kind/bug (Something isn't working)
