Description
It appears that, for unknown reasons, a Firecracker VM may not have sufficient entropy during boot, which can cause our VM agent to experience long pauses very early during process start (possibly while the Go runtime itself is still starting, before any of our code is actually executing). This results in timeouts when the runtime shim attempts to connect to the agent over vsock, producing errors like
failed to create VM: failed to dial the VM over vsock: context deadline exceeded
Because the pause happens so early while the agent is starting, it never writes any output to stdout/stderr, so the debug logs show nothing from the agent.
Running low on entropy during boot is an open issue with Firecracker. There are some suggested fixes there (which I have not yet tried), but the thread doesn't appear to have reached a strong conclusion at this time. I just saw the rngd suggestion in the thread; I will give that a try and update this issue with the results.
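One way to check whether a guest is actually short on entropy is to read the kernel's estimate from `/proc/sys/kernel/random/entropy_avail` inside the VM. The following is a minimal sketch, not part of the agent; the helper names `parseEntropyAvail` and `readEntropyAvail` are hypothetical. Newer kernels changed how this estimate is reported, so treat the value as a rough signal only.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// entropyAvailPath is the Linux procfs file exposing the kernel's entropy estimate.
const entropyAvailPath = "/proc/sys/kernel/random/entropy_avail"

// parseEntropyAvail converts the file contents (a decimal number plus a
// trailing newline) into an integer number of bits.
func parseEntropyAvail(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

// readEntropyAvail reads and parses the entropy estimate from path.
func readEntropyAvail(path string) (int, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return parseEntropyAvail(string(data))
}

func main() {
	bits, err := readEntropyAvail(entropyAvailPath)
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read entropy estimate:", err)
		return
	}
	fmt.Printf("entropy_avail: %d bits\n", bits)
}
```

Running this (or simply reading the same file from a shell) before and after starting rngd in the guest would show whether the daemon changes the estimate.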
I don't know why this suddenly started happening to me. It's on the dev machine I normally use (an i3.metal with a very similar setup to the CI machines). It may just be entirely random when it starts occurring.
We need to follow that Firecracker thread for solutions to this issue and decide how we can address it for firecracker-containerd users.