Commit f248f45

Add support for CNI-configured VM network interfaces.
Signed-off-by: Erik Sipsma <[email protected]>
1 parent 2ca2ed7 commit f248f45

27 files changed (+1378, -241 lines)

Makefile

Lines changed: 28 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -131,6 +131,34 @@ firecracker-containerd-naive-integ-test-image: $(RUNC_BIN) $(FIRECRACKER_BIN) $(
131131

132132
.PHONY: all $(SUBDIRS) clean proto deps lint install image test-images firecracker-container-test-image firecracker-containerd-naive-integ-test-image test test-in-docker $(TEST_SUBDIRS) integ-test $(INTEG_TEST_SUBDIRS)
133133

134+
##########################
135+
# CNI Network
136+
##########################
137+
138+
CNI_BIN_ROOT?=/opt/cni/bin
139+
$(CNI_BIN_ROOT):
140+
mkdir -p $(CNI_BIN_ROOT)
141+
142+
PTP_BIN?=$(CNI_BIN_ROOT)/ptp
143+
$(PTP_BIN): $(CNI_BIN_ROOT)
144+
GOBIN=$(CNI_BIN_ROOT) GO111MODULE=off go get -u github.com/containernetworking/plugins/plugins/main/ptp
145+
146+
HOSTLOCAL_BIN?=$(CNI_BIN_ROOT)/host-local
147+
$(HOSTLOCAL_BIN): $(CNI_BIN_ROOT)
148+
GOBIN=$(CNI_BIN_ROOT) GO111MODULE=off go get -u github.com/containernetworking/plugins/plugins/ipam/host-local
149+
150+
TC_REDIRECT_TAP_BIN?=$(CNI_BIN_ROOT)/tc-redirect-tap
151+
$(TC_REDIRECT_TAP_BIN): $(CNI_BIN_ROOT)
152+
GOBIN=$(CNI_BIN_ROOT) go install github.com/firecracker-microvm/firecracker-go-sdk/cni/cmd/tc-redirect-tap
153+
154+
FCNET_CONFIG?=/etc/cni/conf.d/fcnet.conflist
155+
$(FCNET_CONFIG):
156+
mkdir -p $(dir $(FCNET_CONFIG))
157+
cp tools/demo/fcnet.conflist $(FCNET_CONFIG)
158+
159+
.PHONY: demo-network
160+
demo-network: $(PTP_BIN) $(HOSTLOCAL_BIN) $(TC_REDIRECT_TAP_BIN) $(FCNET_CONFIG)
161+
134162
##########################
135163
# Firecracker submodule
136164
##########################

docs/getting-started.md

Lines changed: 79 additions & 5 deletions
@@ -193,6 +193,10 @@ configuration file has the following fields:
193193
delivered.
194194
* `ht_enabled` (unused) - Reserved for future use.
195195
* `debug` (optional) - Enable debug-level logging from the runtime.
196+
* `default_network_interfaces` (optional) - a list of network interfaces to configure
197+
a VM with if no list of network interfaces is provided with a CreateVM call. Defaults
198+
to an empty list. The structure of the items in the list is the same as the Go API
199+
FirecrackerNetworkInterface defined [in protobuf here](../proto/types.proto).
196200

197201
<details>
198202
<summary>A reasonable example configuration</summary>
@@ -206,8 +210,7 @@ configuration file has the following fields:
206210
"cpu_template": "T2",
207211
"log_fifo": "fc-logs.fifo",
208212
"log_level": "Debug",
209-
"metrics_fifo": "fc-metrics.fifo",
210-
213+
"metrics_fifo": "fc-metrics.fifo"
211214
}
212215
```
213216
</details>
@@ -241,11 +244,13 @@ And start a container!
241244

242245
```bash
243246
$ sudo firecracker-ctr --address /run/firecracker-containerd/containerd.sock \
244-
run --snapshotter firecracker-naive --runtime aws.firecracker --tty \
247+
run --snapshotter firecracker-naive --runtime aws.firecracker \
248+
--rm --tty --net-host \
245249
docker.io/library/busybox:latest busybox-test
246250
```
247251

248-
Alternatively you can specify `--runtime` and `--snapshotter` just once when creating a new namespace using containerd's default labels:
252+
Alternatively you can specify `--runtime` and `--snapshotter` just once when
253+
creating a new namespace using containerd's default labels:
249254

250255
```bash
251256
$ sudo firecracker-ctr --address /run/firecracker-containerd/containerd.sock \
@@ -258,6 +263,75 @@ $ sudo firecracker-ctr --address /run/firecracker-containerd/containerd.sock \
258263

259264
$ sudo firecracker-ctr --address /run/firecracker-containerd/containerd.sock \
260265
-n fc \
261-
run --tty \
266+
run --rm --tty --net-host \
262267
docker.io/library/busybox:latest busybox-test
263268
```
269+
270+
## Networking support
271+
Firecracker-containerd supports the same networking options as provided by the
272+
Firecracker Go SDK, [documented here](https://github.com/firecracker-microvm/firecracker-go-sdk#network-configuration).
273+
This includes support for configuring VM network interfaces both with
274+
pre-created tap devices and with tap devices created automatically by
275+
[CNI](https://github.com/containernetworking/cni) plugins.
276+
277+
### CNI Setup
278+
CNI-configured networks offer the quickest way to get VMs up and running with
279+
connectivity to external networks. Setting one up requires a few extra steps in
280+
addition to the above Setup steps.
281+
282+
To install the required CNI dependencies, run the following make target from the
283+
previously cloned firecracker-containerd repository:
284+
```bash
285+
$ sudo make demo-network
286+
```
287+
288+
You can check the Makefile to see exactly what is installed and where, but for a
289+
quick summary:
290+
* [`ptp` CNI plugin](https://github.com/containernetworking/plugins/tree/master/plugins/main/ptp)
291+
- Creates a [veth](http://man7.org/linux/man-pages/man4/veth.4.html) pair with
292+
one end in a private network namespace and the other end in the host's network namespace.
293+
* [`host-local` CNI
294+
plugin](https://github.com/containernetworking/plugins/tree/master/plugins/ipam/host-local)
295+
- Manages IP allocations of network devices present on the local machine by
296+
vending them from a statically defined subnet.
297+
* [`tc-redirect-tap` CNI
298+
plugin](https://github.com/firecracker-microvm/firecracker-go-sdk/tree/master/cni)
299+
- A CNI plugin that adapts other CNI plugins to be usable by Firecracker VMs.
300+
[See this doc for more details](networking.md). It is used here to adapt veth
301+
devices created by the `ptp` plugin to tap devices provided to VMs.
302+
* [`fcnet.conflist`](../tools/demo/fcnet.conflist) - A sample CNI configuration
303+
file that defines a `fcnet` network created via the `ptp`, `host-local` and
304+
`tc-redirect-tap` plugins.
305+
306+
After those dependencies are installed, an update to the firecracker-containerd
307+
configuration file is required for VMs to use the `fcnet` CNI-configuration as
308+
their default way of generating network interfaces. Just include the following
309+
`default_network_interfaces` key in your runtime configuration file (by default
310+
at `/etc/containerd/firecracker-runtime.json`):
311+
```json
312+
"default_network_interfaces": [
313+
{
314+
"CNIConfig": {
315+
"NetworkName": "fcnet",
316+
"InterfaceName": "veth0"
317+
}
318+
}
319+
]
320+
```
321+
322+
After that, start up a container (as described in the above Usage section) and
323+
try pinging your host IP.
324+
325+
At the time of this writing, there is a bug in the ptp plugin that prevents the
326+
DNS settings from the IPAM plugin being propagated. This is being addressed, but
327+
until that time DNS resolution will require users to manually tweak the installed
328+
CNI configuration to specify static DNS nameservers appropriate to their local
329+
network in [the `dns` section of the PTP plugin](https://github.com/containernetworking/plugins/tree/master/plugins/main/ptp#network-configuration-reference).
330+
331+
While your host's IP should always be reachable from the VM given the above
332+
networking setup, your VM may or may not have outbound internet access depending
333+
on the details of your host's network. The ptp plugin attempts to set up iptables
334+
rules to allow the VM's traffic to be forwarded on your host's network but may
335+
not be able to if there are pre-existing iptables rules that overlap. In those
336+
cases, granting your VM outbound internet access may require customization of
337+
the CNI configuration past what's installed above.
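
Reviewer sketch (not part of the commit): the `default_network_interfaces` fragment documented in this file can be sanity-checked before it is placed in the runtime configuration. The Go program below is a minimal illustration that assumes the fragment sits inside a top-level JSON object. The struct names `RuntimeConfig`, `NetworkInterface`, and `CNIConfig`, and the function `ParseRuntimeConfig`, are hypothetical stand-ins that mirror only the keys shown in the docs; they are not the project's real types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CNIConfig mirrors only the two keys used in the documented fragment.
// Hypothetical type, not the project's API.
type CNIConfig struct {
	NetworkName   string
	InterfaceName string
}

// NetworkInterface is a hypothetical stand-in for one list entry.
type NetworkInterface struct {
	CNIConfig *CNIConfig
}

// RuntimeConfig holds just the key discussed here; other keys are ignored.
type RuntimeConfig struct {
	DefaultNetworkInterfaces []NetworkInterface `json:"default_network_interfaces"`
}

// ParseRuntimeConfig decodes the document and rejects CNI entries that
// lack a network name.
func ParseRuntimeConfig(data []byte) (*RuntimeConfig, error) {
	var cfg RuntimeConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("invalid runtime config: %w", err)
	}
	for i, iface := range cfg.DefaultNetworkInterfaces {
		if iface.CNIConfig != nil && iface.CNIConfig.NetworkName == "" {
			return nil, fmt.Errorf("default_network_interfaces[%d]: CNIConfig.NetworkName is empty", i)
		}
	}
	return &cfg, nil
}

// sampleConfig wraps the documented fragment in braces so it parses as a
// complete JSON object.
const sampleConfig = `{
  "default_network_interfaces": [
    {
      "CNIConfig": {
        "NetworkName": "fcnet",
        "InterfaceName": "veth0"
      }
    }
  ]
}`

func main() {
	cfg, err := ParseRuntimeConfig([]byte(sampleConfig))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, iface := range cfg.DefaultNetworkInterfaces {
		fmt.Printf("network=%s ifname=%s\n", iface.CNIConfig.NetworkName, iface.CNIConfig.InterfaceName)
	}
}
```

A check like this catches a misplaced comma or a misspelled key before the runtime is restarted; it does not verify that the named CNI network actually exists on the host.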

docs/networking.md

Lines changed: 34 additions & 47 deletions
@@ -70,6 +70,7 @@ The Linux Kernel’s [Traffic Control (TC)](http://tldp.org/HOWTO/Traffic-Contro
7070
Most relevant to our interests, the [U32 filter](http://man7.org/linux/man-pages/man8/tc-u32.8.html) provided as part of TC allows you to create a rule that essentially says “take all the packets entering the ingress queue of this device and move them to the egress queue of this other device”. For example, if you have DeviceA and DeviceB you can setup that rule on each of them such that the end effect is every packet sent into DeviceA goes out of DeviceB and every packet sent to DeviceB goes out of DeviceA. The host kernel just moves the ethernet packets from one device’s queue to the other’s, so the redirection is entirely transparent to any userspace application or VM guest kernel all the way down to and including the link layer.
7171

7272
* We first learned about this approach from [Kata Containers](https://github.com/kata-containers/runtime), who are using it for similar purposes in their framework. They have [some more background information documented here](https://gist.github.com/mcastelino/7d85f4164ffdaf48242f9281bb1d0f9b).
73+
* Another use of TC redirect filters in the context of CNI plugins can be found in the [bandwidth CNI plugin](https://github.com/containernetworking/plugins/tree/master/plugins/meta/bandwidth).
7374

7475
This technique can be used to redirect between a Firecracker VM’s tap device and another device in the network namespace Firecracker is running in. If, for example, the VM tap is redirecting with a veth device in a network namespace, the VM guest internally gets a network device with the same mac address as the veth and needs to assign to it the same IP and routes the veth uses. After that, the VM guest essentially operates as though its nic is the same as the veth device outside on the host.
7576

@@ -117,51 +118,45 @@ VMs will execute in.
117118
In this option, Firecracker-containerd just asks for CNI configuration during a CreateVM call, which it will use to configure a network namespace for the Firecracker VM to execute in. The API updates may look something like:
118119

119120
```
120-
message FirecrackerNetworkConfiguration {
121-
// CNI Configuration to use to create the network namespace in which the
122-
// VM will execute. It's an error to specify both this and any NetworkInterfaces
123-
// below.
124-
FirecrackerCNIConfiguration CNIConfiguration;
121+
message FirecrackerNetworkInterface {
122+
// <existing fields...>
123+
124+
// CNI Configuration that will be used to configure the network interface
125+
CNIConfiguration CNIConfig;
125126
126-
// The existing FirecrackerNetworkInterface configuration
127-
// which specifies the name of the tap device on the host and rate limiters
128-
repeated FirecrackerNetworkInterface NetworkInterfaces;
127+
// Static configuration that will be used to configure the network interface
128+
StaticNetworkConfiguration StaticConfig;
129129
}
130130
131131
message FirecrackerCNIConfiguration {
132132
// Name of the CNI network that will be used to configure the VM
133133
string NetworkName;
134134
135-
// Path to CNI bin directory and CNI conf directory, respectively, that will
136-
// be used to configure the VM.
137-
string BinDirectory;
135+
// IF_NAME CNI parameter provided to plugins for the name of devices to create
136+
string InterfaceName;
137+
138+
// Paths to CNI bin directories, CNI conf directory and CNI cache directory,
139+
// respectively, that will be used to configure the VM.
140+
repeated string BinPath;
138141
string ConfDirectory;
142+
string CacheDirectory;
143+
144+
// CNI Args passed to plugins
145+
repeated CNIArg Args;
139146
}
140147
141-
message FirecrackerNetworkInterface {
142-
// <existing fields...>
143-
144-
// (Optional) Static configuration that will be applied internally in the
145-
// Guest VM. At first, it will be an error to specify this for multiple
146-
// NetworkInterfacesin the same CreateVM call (due to the limitations of
147-
// using "ip=..."). In time, we may be able to lift that restriction with
148-
// updates to the implementation.
149-
StaticIPConfiguration StaticIP;
148+
message StaticNetworkConfiguration {
149+
string MacAddress;
150+
string HostDevName;
151+
IPConfiguration IPConfig;
150152
}
151153
152-
message StaticIPConfiguration {
154+
message IPConfiguration {
153155
// Network configuration that will be applied to a network interface in a
154156
// Guest VM on boot.
155-
string IP;
156-
string SubnetMask;
157-
string DefaultGateway;
157+
string PrimaryAddress;
158+
string GatewayAddress;
158159
repeated string Nameservers;
159-
string Hostname;
160-
}
161-
162-
message CreateVMRequest {
163-
// <same existing fields except FirecrackerNetworkInterface which is replaced with the following...>
164-
FirecrackerNetworkConfiguration NetworkConfiguration;
165160
}
166161
```
167162

@@ -258,7 +253,7 @@ In order for networking to work as expected inside the VM, it needs to have IP a
258253

259254
The IP configuration is just pre-configured in the kernel when the system starts (the same end effect of having run the corresponding netlink commands to configure IP and routes). The DNS configuration is applied by writing the nameserver and search domain configuration to /proc/net/pnp in a format that is compatible with /etc/resolv.conf. The typical approach is to then have /etc/resolv.conf be a symlink to /proc/net/pnp.
260255

261-
Users of Firecracker-containerd are also free to provide their own kernel boot options, which could include their own static IP/DNS configuration. In those cases, if they have enabled CNI configuration, Firecracker-containerd will return an error.
256+
Users of Firecracker-containerd are also free to provide their own kernel boot options, which could include their own static IP/DNS configuration. In those cases, if they have enabled CNI configuration, Firecracker-containerd will return an error.
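
Reviewer sketch (not part of the commit): to make the static IP discussion concrete, the Go snippet below renders an `IPConfiguration`-shaped value as a Linux `ip=` boot parameter. The field order follows the kernel's documented `ip=<client>:<server>:<gateway>:<netmask>:<hostname>:<device>:<autoconf>:<dns0>:<dns1>` layout. Whether the runtime emits exactly this string is not stated in the doc, and the assumption that `PrimaryAddress` is in CIDR form, along with the function name `KernelIPParam`, is hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// IPConfiguration mirrors the fields of the proposed protobuf message.
// Assumption: PrimaryAddress is in CIDR form (for example "192.168.1.2/24").
type IPConfiguration struct {
	PrimaryAddress string
	GatewayAddress string
	Nameservers    []string
}

// KernelIPParam renders cfg as a Linux "ip=" boot parameter for the given
// guest device. Server and hostname fields are left empty and autoconf is
// set to "off" so the kernel uses the static values as-is.
func KernelIPParam(cfg IPConfiguration, device string) (string, error) {
	ip, ipnet, err := net.ParseCIDR(cfg.PrimaryAddress)
	if err != nil {
		return "", fmt.Errorf("parsing PrimaryAddress: %w", err)
	}
	ip4 := ip.To4()
	if ip4 == nil {
		return "", fmt.Errorf("PrimaryAddress %q is not IPv4", cfg.PrimaryAddress)
	}
	// The ip= parameter only has slots for two DNS servers.
	if len(cfg.Nameservers) > 2 {
		return "", fmt.Errorf("at most 2 nameservers are supported, got %d", len(cfg.Nameservers))
	}
	var dns [2]string
	copy(dns[:], cfg.Nameservers)
	netmask := net.IP(ipnet.Mask).String()
	return fmt.Sprintf("ip=%s::%s:%s::%s:off:%s:%s",
		ip4, cfg.GatewayAddress, netmask, device, dns[0], dns[1]), nil
}

func main() {
	param, err := KernelIPParam(IPConfiguration{
		PrimaryAddress: "192.168.1.2/24",
		GatewayAddress: "192.168.1.1",
		Nameservers:    []string{"1.1.1.1"},
	}, "eth0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(param) // ip=192.168.1.2::192.168.1.1:255.255.255.0::eth0:off:1.1.1.1:
}
```

The nameserver fields are what the kernel exposes through /proc/net/pnp, which is why a user-supplied `ip=` argument would collide with a CNI-derived one.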
262257

263258
**Pros**
264259

@@ -335,21 +330,15 @@ The biggest immediate downside of Option A is the requirement that /etc/resolv.c
335330

336331
Firecracker-containerd will build the current binaries it does today plus a new CNI-plugin compatible binary, `tc-redirect-tap`, that takes an existing network namespace and creates within it a tap device that is redirected via a TC filter to an already networked device in the netns. This CNI plugin is only useful when chained with other CNI-plugins (which will setup the device that the tap will redirect with).
337332

338-
When setting up Firecracker-containerd, users can optionally include CNI configuration in Firecracker-containerd’s runtime config file. If CNI configuration is not passed during CreateVM (such as the single-container VM use case), the runtime will fall back to configuration in the runtime config. If there’s no CNI configuration present in either the CreateVM call or the runtime config, the behavior will remain the same as it is today.
333+
When setting up Firecracker-containerd, users can optionally include a set of default network interfaces to provide to a VM if none are specified by the user. This allows users to optionally set their VMs to use CNI-configured network interfaces by default. The user is free to provide an explicit NetworkInterfaces list during the CreateVM call (including an empty list), in which case that will be used instead of any defaults present in the runtime config file.
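
Reviewer sketch (not part of the commit): the precedence rule in the paragraph above can be expressed in a few lines of Go. This is only an illustration; `NetworkInterface` and `ResolveInterfaces` are hypothetical names, and the sketch models "not provided" as a nil slice and "explicitly empty" as a non-nil, zero-length slice. How the real API distinguishes the two is not specified here, and a protobuf `repeated` field may not preserve that distinction on the wire.

```go
package main

import "fmt"

// NetworkInterface is a hypothetical placeholder for one interface entry.
type NetworkInterface struct {
	Name string
}

// ResolveInterfaces returns the list a VM should be created with.
// A nil request means the caller gave no list, so the runtime defaults
// apply. Any non-nil request, including an empty one, wins over defaults.
func ResolveInterfaces(requested, defaults []NetworkInterface) []NetworkInterface {
	if requested == nil {
		return defaults
	}
	return requested
}

func main() {
	defaults := []NetworkInterface{{Name: "fcnet"}}

	// No list in the request: fall back to the runtime config defaults.
	fmt.Println(len(ResolveInterfaces(nil, defaults)))

	// Explicit empty list: the VM gets no network interfaces at all.
	fmt.Println(len(ResolveInterfaces([]NetworkInterface{}, defaults)))

	// Explicit list: used instead of the defaults.
	explicit := []NetworkInterface{{Name: "tap0"}, {Name: "tap1"}}
	fmt.Println(len(ResolveInterfaces(explicit, defaults)))
}
```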
339334

340-
On a high-level, the implementation of CreateVM relevant to the new networking configuration will look something like this:
341-
342-
1. Parse what, if any, CNI configuration is provided via either the CreateVM call or the defaults in the runtime config file.
343-
2. If CNI Configuration is not present, just continue the VM creation process as it is today
344-
3. If CNI Configuration is present, check to see if the Jailer configuration specifies a network namespace
345-
1. If it does, that will be the network namespace provided to the CNI plugins
346-
2. If it does not, a new empty network namespace will be created by the runtime and provided to the CNI plugins
347-
4. Use the provided CNI configuration to configure the network namespace
348-
5. Start the Firecracker VM in the network namespace via the Jailer and with the corresponding `ip=...` kernel boot parameters
335+
The Firecracker Go SDK will take care of checking whether any Jailer config specifies a pre-existing network namespace to use and, if not, creating a new network namespace for the VM on behalf of the user. The Go SDK will also take care of invoking CNI on that network namespace, starting the VMM inside of it, and handling CNI network deletion after the VM stops.
349336

350337
If CreateVM succeeds, any containers running inside the VM with a “host” network namespace will have access to the network configured via CNI outside the VM.
351338

352-
The CNI configuration Firecracker-containerd asks for are just references to a CNI network name, a CNI bin directory (i.e. `/opt/cni/bin`) and a CNI configuration directory (i.e. `/etc/cni/net.d`). A hypothetical example CNI configuration file that uses the standard [ptp CNI plugin](https://github.com/containernetworking/plugins/tree/master/plugins/main/ptp) to create a veth device whose traffic is redirected with a tap device:
339+
The CNI configuration Firecracker-containerd requires from users is a CNI network name and an IfName parameter to provide to CNI plugins. Other values, such as the CNI bin directories and CNI configuration directory, can be provided but have sensible defaults if omitted.
340+
341+
A hypothetical example CNI configuration file that uses the standard [ptp CNI plugin](https://github.com/containernetworking/plugins/tree/master/plugins/main/ptp) to create a veth device whose traffic is redirected with a tap device:
353342

354343
```
355344
{
@@ -361,10 +350,8 @@ The CNI configuration Firecracker-containerd asks for are just references to a C
361350
"ipMasq": true,
362351
"ipam": {
363352
"type": "host-local",
364-
"subnet": "192.168.1.0/24"
365-
},
366-
"dns": {
367-
"nameservers": [ "1.1.1.1" ]
353+
"subnet": "192.168.1.0/24",
354+
"resolvConf": "/etc/resolv.conf"
368355
}
369356
},
370357
{
@@ -376,7 +363,7 @@ The CNI configuration Firecracker-containerd asks for are just references to a C
376363

377364
Given the above configuration, the containers inside the VM will have access to the 192.168.1.0/24 network. Thanks to setting `ipMasq: true`, the containers should also have internet access (assuming the host itself has internet access).
378365

379-
Firecracker-containerd will also provide an example CNI configuration that, if used, will result in Firecracker VMs being spun up with the same access to the network the host has on its default interface (something comparable to Docker’s default networking configuration). This can be setup via a Makefile target (i.e. `install-default-network`), which allows users trying out Firecracker-containerd to get networking, including outbound internet access, working in their Firecracker VMs by default if they so choose.
366+
Firecracker-containerd will also provide an example CNI configuration that, if used, will result in Firecracker VMs being spun up with the same access to the network the host has on its default interface (something comparable to Docker’s default networking configuration). This can be set up via a Makefile target (i.e. `demo-network`), which allows users trying out Firecracker-containerd to get networking, including outbound internet access, working in their Firecracker VMs by default if they so choose.
380367

381368
## Hypothetical CRI interactions
382369
