Description
The system-level zones we'll be running on sleds (for the control plane, storage, customer instances, dendrite, etc.) will require communication both directly on the underlay and over the boundary services overlay. An example of the latter is Nexus serving off-rack client requests to the user-facing API.
Given the requirement for both underlay and overlay communication, and OPTE's encapsulation capabilities combined with its general-purpose architecture, it seems like a win to leverage OPTE for service-zone communications in addition to customer instance communications.
The general situation would look like this:
                          overlay
            ╔═══════════destinations══════════╗
            ║                                 ║
            ║            underlay             ║
            ║      ┌────destinations───┐      ║
            ║      │                   │      ║
            ║      │                   │      ║
            ║  ┌───────┐           ┌───────┐  ║
            ╚══│  phy  │           │  phy  │══╝
               └───────┘           └───────┘
                   │                   │
              ┌────┴────────────────┬──┴───────┐
              │                     │          │
          ┌───────┐             ┌───────┐  ┌───────┐
          │ opte/ │             │ opte/ │  │ opte/ │
 fd00::1  │  xde  │             │  xde  │  │  xde  │
          └───────┘             └───────┘  └───────┘
              │                     │          │
    ┌─────────┼─────────┐           │          │
    │         │         │           │          │
┌────────┐┌────────┐┌────────┐  ┌────────┐ ┌────────┐
│ system ││ system ││ system │  │        │ │        │
│  zone  ││  zone  ││  zone  │  │instance│ │instance│
└────────┘└────────┘└────────┘  │        │ │        │
 fd00::10  fd00::11  fd00::12   └────────┘ └────────┘
There are a few notable details in this diagram:
- The xde device is plumbed with an IP interface and has an address on the underlay. This would replace the address we are currently adding to lo0.
- There is an expectation that communications sourced from the xde IP interface in the GZ and destined to services in system zones will work. I think this is in the spirit of OPTE's virtual switch architecture, treating each zone interface as being connected to a port. The GZ address could also be on a VNIC hanging off the xde in the GZ.
In an initial implementation, the IP addresses in the zones would be atop VNICs over the xde device. This presents the somewhat awkward situation that we need link-local addresses on these VNICs as well as on the xde device. I've got plans to relax that constraint for on-host communications, but for now, I think it's something we can probably live with.
For underlay traffic, OPTE would mostly be in pass-through mode, letting traffic flow between system-zone instances and external sources or the GZ. When OPTE detects overlay traffic, it behaves as it does for customer instances, performing encap/decap onto/from the boundary services overlay.
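To make the pass-through versus encap split concrete, here is a minimal sketch of the outbound decision. This is not OPTE's API: `Action`, `in_prefix`, and `classify` are hypothetical names, and the `fd00::/64` underlay prefix is only an assumption inferred from the addresses in the diagram. How OPTE would actually detect overlay traffic is left open above; a destination-prefix check is just one plausible way to do it.

```rust
use std::net::Ipv6Addr;

/// What a (hypothetical) port does with an outbound packet.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Underlay traffic: forward unmodified.
    PassThrough,
    /// Overlay traffic: encapsulate before sending on the underlay.
    Encap,
}

/// Returns true if `addr` falls within `prefix`/`len`.
fn in_prefix(addr: Ipv6Addr, prefix: Ipv6Addr, len: u32) -> bool {
    assert!(len <= 128, "IPv6 prefix length must be at most 128");
    if len == 0 {
        // A zero-length prefix matches everything; also avoids a 128-bit shift.
        return true;
    }
    let mask = u128::MAX << (128 - len);
    (u128::from(addr) & mask) == (u128::from(prefix) & mask)
}

/// Decide the action from the destination address alone.
/// ASSUMPTION: anything outside the underlay prefix is overlay traffic.
fn classify(dst: Ipv6Addr, underlay: (Ipv6Addr, u32)) -> Action {
    if in_prefix(dst, underlay.0, underlay.1) {
        Action::PassThrough
    } else {
        Action::Encap
    }
}

fn main() {
    // ASSUMPTION: fd00::/64 stands in for the underlay prefix.
    let underlay = ("fd00::".parse::<Ipv6Addr>().unwrap(), 64);

    // A system zone talking to another system zone on the underlay.
    let zone: Ipv6Addr = "fd00::11".parse().unwrap();
    // An off-rack client, using the IPv6 documentation prefix as a stand-in.
    let client: Ipv6Addr = "2001:db8::1".parse().unwrap();

    println!("{zone} -> {:?}", classify(zone, underlay));
    println!("{client} -> {:?}", classify(client, underlay));
}
```

The inbound direction would mirror this: traffic arriving encapsulated gets decapped, and everything else is passed through.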
Required Work
- DLPI implementation for xde for interface plumbing.
- Multi-port support for xde. Right now, xde assumes there is only one port for the virtual network interface of the instance it's attached to.
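As a rough illustration of what multi-port support implies, the sketch below keeps a table from zone address to port, so a single xde device can deliver to several attached zones. `PortId` and `PortTable` are hypothetical and do not reflect xde's actual data structures. Keying on the IPv6 address is an assumption made for brevity; a real implementation might well key on MAC address or something else.

```rust
use std::collections::HashMap;
use std::net::Ipv6Addr;

/// Hypothetical identifier for a port on the virtual switch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PortId(u32);

/// Hypothetical per-device table mapping zone addresses to ports.
#[derive(Default)]
struct PortTable {
    by_addr: HashMap<Ipv6Addr, PortId>,
}

impl PortTable {
    /// Attach a port for `addr`, returning the port it replaced, if any.
    fn attach(&mut self, addr: Ipv6Addr, port: PortId) -> Option<PortId> {
        self.by_addr.insert(addr, port)
    }

    /// Find the port that should receive traffic destined to `dst`.
    fn lookup(&self, dst: &Ipv6Addr) -> Option<PortId> {
        self.by_addr.get(dst).copied()
    }
}

fn main() {
    let mut table = PortTable::default();

    // The three system zones from the diagram, one port each.
    for (i, addr) in ["fd00::10", "fd00::11", "fd00::12"].iter().enumerate() {
        table.attach(addr.parse().unwrap(), PortId(i as u32));
    }

    let dst: Ipv6Addr = "fd00::11".parse().unwrap();
    println!("{dst} -> {:?}", table.lookup(&dst));
}
```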