Description
A QAT device may be present on a node but bound only to the kernel driver, or SR-IOV may not have been enabled on it. The operator scans the available PCI devices but does not check whether the further prerequisites described in the QAT plugin prerequisites section are met (for example, vfio-pci driver loaded, VFs enabled on the QAT device).
As a result, a QAT device can be present in the system without being available for node resource allocation or exposed to the cluster, while other nodes do have configured QAT devices available.
It is unclear how to limit the operator's plugin deployment so that it skips nodes whose devices are present but unconfigured, without disabling deployment of the QAT plugin for the whole cluster. In this case the operator should continue deploying plugins for the other detected devices and should not attempt to deploy the QAT plugin to nodes where the device is installed but unconfigured.
Please advise on the correct approach or behavior.
Example failure of the QAT plugin when the device is visible via lspci but is not intended to be configured on the node:
```
I1020 03:28:26.413601 1 qat_plugin.go:62] QAT device plugin started in 'dpdk' mode
E1020 03:28:26.413805 1 manager.go:102] Device scan failed: open /sys/bus/pci/drivers/vfio-pci/new_id: permission denied
write to driver failed: 8086 4941
github.com/intel/intel-device-plugins-for-kubernetes/cmd/qat_plugin/dpdkdrv.writeToDriver
	/go/src/github.com/intel/intel-device-plugins-for-kubernetes/cmd/qat_plugin/dpdkdrv/dpdkdrv.go:445
github.com/intel/intel-device-plugins-for-kubernetes/cmd/qat_plugin/dpdkdrv.(*DevicePlugin).setupDeviceIDs
	/go/src/github.com/intel/intel-device-plugins-for-kubernetes/cmd/qat_plugin/dpdkdrv/dpdkdrv.go:196
github.com/intel/intel-device-plugins-for-kubernetes/cmd/qat_plugin/dpdkdrv.(*DevicePlugin).Scan
	/go/src/github.com/intel/intel-device-plugins-for-kubernetes/cmd/qat_plugin/dpdkdrv/dpdkdrv.go:210
github.com/intel/intel-device-plugins-for-kubernetes/pkg/deviceplugin.(*Manager).Run.func1
	/go/src/github.com/intel/intel-device-plugins-for-kubernetes/pkg/deviceplugin/manager.go:100
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1598
failed to set device ID 4941 for vfio-pci. Driver module not loaded?
```