-
Notifications
You must be signed in to change notification settings - Fork 133
Refactor: pull in firecracker-containerd's internal vsock module as a package #380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Issue #87 was previously addressed by having our shim process exit when all tasks it was managing exited. However, it turns out that is not the expected behavior for a containerd shim. The expected behavior is that a shim continue running until all tasks have been *deleted* by the client (via the containerd Delete API call). When any task has been deleted (or create of a task fails) containerd will send the shim a Shutdown API call, at which time the shim is expected to exit if it *all* tasks it was managing have been deleted. This commit fixes brings our shim in line with that expected behavior. It, however, also retains the existing FCControl parameter "ExitAfterAllTasksGone" (though now renamed to "ExitAfterAllTasksDeleted") to give users the ability to specify whether the default containerd shim behavior should be ignored and instead allow the Shim+VM to keep running even after all Tasks have been deleted. The majority of the changes here actually end up just being a refactor of the TaskManager interface to safely handle Create/Delete task APIs alongside checking whether the shim has any tasks left being managed (which is uses to decide if it should shutdown). The refactorization also ensures that all IO is flushed before a Delete call returns (which is a better solution to handling I/O races than the temporary fix applied in 1e36219). Signed-off-by: Erik Sipsma <[email protected]>
Signed-off-by: Erik Sipsma <[email protected]>
Signed-off-by: Erik Sipsma <[email protected]>
This consists of: * Upgrading to the new Go SDK version. * Using the new vsock interface with unix-domain socket backend on the host. Signed-off-by: Erik Sipsma <[email protected]>
Since 0.20.0, Firecracker itself writes "OK <port>" as ACK. So we don't have to write our own "IMALIVE" message anymore. Signed-off-by: Kazuyoshi Kato <[email protected]>
Signed-off-by: Kazuyoshi Kato <[email protected]>
As like StopVM, CreateVM's timeout should be configurable to accomodate other root fs images and slower environments. Fixes #423. Signed-off-by: Kazuyoshi Kato <[email protected]>
Prior to this commit, io would be proxied from the VM to the host, but no timeout is provided to the proxy. So if vsock has an issue, the io would forever try to connect to the vsock leading to an extraneous amount of errors. This commit adds a simple default timeout, 5 second, which will eliminate the extraneous amount of logs Signed-off-by: xibz <[email protected]>
Signed-off-by: Jeremy Williams <[email protected]>
tryConnect() has multiple error cases which would be difficult to understand. This change adds some context on each of them. Signed-off-by: Kazuyoshi Kato <[email protected]>
firecracker-containerd and Firecracker no longer use `master` branch since 435ade4. The links and references must be updated. Signed-off-by: Kazuyoshi Kato <[email protected]>
vsock/dialer.go
Outdated
|
||
type vsockDialer struct { | ||
timeout VsockTimeout | ||
ctx context.Context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, it's an anti-pattern to store a context in a struct because it is possible to construct and use an object in different contexts (e.g. a smaller scoped context with a shorter timeout). I know it technically doesn't matter for this because only vsock.Dial is exposed, but it's good to keep convention for easier refactoring.
Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Dial() could take a context. It is different from other Dial functions do, but these Dial functions are implemented before context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this change and passed the context for dial via input parameter. Listener is another story because we implement the net.Listener interface. At the very least this change doesn't make the problem worse.
f6d71d8
to
480f651
Compare
Oh 480f651 doesn't have Signed-off line. |
Signed-off-by: Austin Vazquez <[email protected]>
Pushed commit with sign-off. |
Issue #, if available:
#379
Description of changes:
Pull in firecracker-containerd's internal vsock module and refactor into a reusable package.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.