@adrianlizarraga adrianlizarraga commented Oct 3, 2025

Description

Adds APIs to allow a plugin EP to create a virtual OrtHardwareDevice that can be used for model cross-compilation. For example, this allows an EP to create a compiled model for NPU on a device that does not have an NPU.

Application code

An application must explicitly allow registered plugin EPs to create virtual devices. This is currently done by using a registration name that ends in the ".virtual" suffix. For example:

#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

#include "onnxruntime_cxx_api.h"
#include "onnxruntime_ep_device_ep_metadata_keys.h"

const char* ep_registration_name = "my_ep_lib.virtual";  // IMPORTANT: ".virtual" suffix is a signal to EP library
ort_env->RegisterExecutionProviderLibrary(ep_registration_name, ORT_TSTR("my_ep.dll"));

std::vector<Ort::ConstEpDevice> ep_devices = ort_env->GetEpDevices();

// ep_devices includes an OrtEpDevice from "my_ep.dll" that uses a virtual OrtHardwareDevice.
// std::find_if returns an iterator, so check it before dereferencing.
auto it = std::find_if(ep_devices.begin(), ep_devices.end(),
                       [](Ort::ConstEpDevice& device) {
                         return device.EpName() == std::string("MyEpName");
                       });
assert(it != ep_devices.end());
Ort::ConstEpDevice virtual_ep_device = *it;

// App can look in HW metadata to check whether the device is virtual
Ort::ConstHardwareDevice virtual_hw_device = virtual_ep_device.Device();
std::unordered_map<std::string, std::string> metadata = virtual_hw_device.Metadata().GetKeyValuePairs();
assert(metadata[kOrtHardwareDevice_MetadataKey_IsVirtual] == "1");

// App can use the virtual OrtEpDevice in a session to, for example, compile a model
// ...
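
The metadata check in the example above indexes the map with operator[], which inserts an empty entry when the key is missing. A minimal sketch of a lookup that avoids this is shown here. The helper name IsVirtualDevice and the key string "is_virtual" are assumptions made for illustration; real code would use kOrtHardwareDevice_MetadataKey_IsVirtual from the metadata keys header.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for kOrtHardwareDevice_MetadataKey_IsVirtual. The actual
// string value is defined by ONNX Runtime and may differ.
constexpr const char* kIsVirtualKey = "is_virtual";

// Returns true only if the key is present and set to "1".
// Uses find() so that a missing key does not add an empty entry to the map.
inline bool IsVirtualDevice(const std::unordered_map<std::string, std::string>& metadata) {
  auto it = metadata.find(kIsVirtualKey);
  return it != metadata.end() && it->second == "1";
}
```

With this helper, an application could filter the result of GetEpDevices() down to virtual devices without mutating the metadata copies it inspects.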

Plugin EP code

This PR introduces a new optional C API function in the OrtEpFactory struct called SetEnvironmentOptions that allows ORT to pass options (as key/value pairs) to an EP factory. Currently, the only key supported is "allow_virtual_devices", which indicates to the EP factory that creating virtual devices is allowed.

When the application registers a plugin EP library, ORT creates the library's EP factories and checks if they implement the SetEnvironmentOptions API function. If so, ORT calls ep_factory.SetEnvironmentOptions with "allow_virtual_devices" set to "1" if the EP registration name set by the application ends in the ".virtual" suffix (or "0" otherwise).
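
The suffix rule described above can be sketched as follows. This is not the ORT implementation; HasVirtualSuffix and AllowVirtualDevicesValue are hypothetical names, and the sketch only mirrors the behavior stated in this description.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: true if the registration name ends with ".virtual".
inline bool HasVirtualSuffix(std::string_view name) {
  constexpr std::string_view kSuffix = ".virtual";
  return name.size() >= kSuffix.size() &&
         name.compare(name.size() - kSuffix.size(), kSuffix.size(), kSuffix) == 0;
}

// Value that would be passed for "allow_virtual_devices", per the description:
// "1" when the registration name has the suffix, "0" otherwise.
inline const char* AllowVirtualDevicesValue(std::string_view registration_name) {
  return HasVirtualSuffix(registration_name) ? "1" : "0";
}
```

Under this sketch, "my_ep_lib.virtual" maps to "1" and "my_ep_lib" maps to "0".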

Here's an example implementation of OrtEpFactory::SetEnvironmentOptions taken from a test plugin EP that supports a virtual GPU:

/*static*/
OrtStatus* ORT_API_CALL EpFactoryVirtualGpu::SetEnvironmentOptionsImpl(OrtEpFactory* this_ptr,
                                                                       const OrtKeyValuePairs* options) noexcept {
  auto* factory = static_cast<EpFactoryVirtualGpu*>(this_ptr);
  const char* value = factory->ort_api_.GetKeyValue(options, "allow_virtual_devices");

  if (value != nullptr) {
    factory->allow_virtual_devices_ = strcmp(value, "1") == 0;
  }

  return nullptr;
}

An EP factory can create a virtual hardware device within OrtEpFactory::GetSupportedDevices by using a new API function called CreateHardwareDevice. The EP factory is expected to own the hardware device instance and should release it via ReleaseHardwareDevice when the factory is destroyed.
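
One way to express that ownership rule is a std::unique_ptr with a custom deleter. The sketch below uses stand-in types (FakeHardwareDevice, FakeReleaseHardwareDevice, FakeEpFactory) that are assumptions for illustration only; a real factory would hold an OrtHardwareDevice* and call the EP API's ReleaseHardwareDevice in the deleter.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the opaque OrtHardwareDevice type (assumption for this sketch).
struct FakeHardwareDevice {};

// Counts releases so the behavior of the sketch can be checked.
inline int g_release_count = 0;

// Stand-in for the ReleaseHardwareDevice API function.
inline void FakeReleaseHardwareDevice(FakeHardwareDevice* device) {
  delete device;
  ++g_release_count;
}

struct HardwareDeviceDeleter {
  void operator()(FakeHardwareDevice* device) const { FakeReleaseHardwareDevice(device); }
};
using HardwareDevicePtr = std::unique_ptr<FakeHardwareDevice, HardwareDeviceDeleter>;

// Hypothetical factory that owns its virtual device for the factory's lifetime.
class FakeEpFactory {
 public:
  void CreateVirtualDevice() { virtual_hw_device_.reset(new FakeHardwareDevice()); }
  const FakeHardwareDevice* virtual_device() const { return virtual_hw_device_.get(); }

 private:
  HardwareDevicePtr virtual_hw_device_;  // released when the factory is destroyed
};
```

Tying the release to the factory's destructor means the device outlives every OrtEpDevice the factory hands out, as long as the factory itself does.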

The test plugin EP shows an implementation of OrtEpFactory::GetSupportedDevices that creates a virtual GPU device:

/*static*/
OrtStatus* ORT_API_CALL EpFactoryVirtualGpu::GetSupportedDevicesImpl(OrtEpFactory* this_ptr,
                                                                     const OrtHardwareDevice* const* /*devices*/,
                                                                     size_t /*num_devices*/,
                                                                     OrtEpDevice** ep_devices,
                                                                     size_t max_ep_devices,
                                                                     size_t* p_num_ep_devices) noexcept {
  size_t& num_ep_devices = *p_num_ep_devices;
  auto* factory = static_cast<EpFactoryVirtualGpu*>(this_ptr);

  num_ep_devices = 0;

  // Create a virtual OrtHardwareDevice if application indicated it is allowed (e.g., for cross-compiling).
  // This example EP creates a virtual GPU OrtHardwareDevice and adds a new OrtEpDevice that uses the virtual GPU.
  if (factory->allow_virtual_devices_ && num_ep_devices < max_ep_devices) {
    OrtKeyValuePairs* hw_metadata = nullptr;
    factory->ort_api_.CreateKeyValuePairs(&hw_metadata);
    factory->ort_api_.AddKeyValuePair(hw_metadata, kOrtHardwareDevice_MetadataKey_IsVirtual, "1");

    auto* status = factory->ep_api_.CreateHardwareDevice(OrtHardwareDeviceType::OrtHardwareDeviceType_GPU,
                                                         factory->vendor_id_,
                                                         /*device_id*/ 0,
                                                         factory->vendor_.c_str(),
                                                         hw_metadata,
                                                         &factory->virtual_hw_device_);

    // ...

    OrtEpDevice* virtual_ep_device = nullptr;
    status = factory->ort_api_.GetEpApi()->CreateEpDevice(factory, factory->virtual_hw_device_, ep_metadata,
                                                          ep_options, &virtual_ep_device);

    // ...

    ep_devices[num_ep_devices++] = virtual_ep_device;
  }

  return nullptr;
}

Motivation and Context

@adrianlizarraga adrianlizarraga changed the title [DRAFT] [Compile API] Support for offline/off-target compile for plugin EPs [DRAFT] Allow EP to provide additional HW devices Oct 9, 2025
Comment on lines 1016 to 1019
* \note New additional devices created by this EP factory are not provided to other EP factories. Only this
* EP factory receives the new additional hardware devices via OrtEpFactory::GetSupportedDevices().
* Any OrtEpDevice instances that this EP factory creates with an additional hardware device are visible to
* applications that call OrtApi::GetEpDevices().
Contributor

@skottmckay skottmckay Oct 10, 2025

Do we need this complexity?

Alternatively, if we're only going to show the OrtEpDevice to the EP that created it, can the EP create it inside of GetSupportedDevices?

e.g. inside GetSupportedDevices

  • EP determines a virtual device is required
  • EP creates OrtHardwareDevice.
  • EP uses that to create OrtEpDevice.

The EP could register the OrtHardwareDevice with ORT so we can release it when the EP is unregistered, or the EP could own it if we provided a ReleaseHardwareDevice function. The only place the OrtHardwareDevice shows up is indirectly in the OrtEpDevice.

Contributor

I think that helps with complexity, but it lacks a signal for the EP to know it should create a virtual device.

GetAdditionalHardwareDevices is better, but that doesn't have a signal for device type (e.g. create a GPU or NPU or both).

Or do we expect it creates all possible virtual devices by default?

Contributor Author

In reply to "Or do we expect it creates all possible virtual devices by default?":

Yeah, that was my original idea. The EP library knows:

  1. The platform/arch the EP library was built for (e.g., win x64).
  2. The set of actual OrtHardwareDevice instances that ORT found.
  3. The set of virtual devices that the EP library wants to expose to the application for that specific platform/arch.

So, based on the above, the EP library would create all possible/necessary virtual devices. The EP library also knows the correct provider options to associate with each virtual device.
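
A rough sketch of that decision is shown below. Everything in it is an assumption for illustration: the DeviceType enum is a simplified stand-in for OrtHardwareDeviceType, the platform strings are made up, and the policy (expose a virtual device for each targetable type that was not discovered as real hardware) is only one possible choice an EP library could make.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for OrtHardwareDeviceType.
enum class DeviceType { CPU, GPU, NPU };

// Hypothetical policy: for the platform the EP library was built for, return the
// device types it can target that were not found among the real devices.
inline std::vector<DeviceType> SelectVirtualDevices(const std::string& build_platform,
                                                    const std::set<DeviceType>& discovered) {
  std::vector<DeviceType> targets;
  if (build_platform == "win-x64") {
    targets = {DeviceType::GPU, DeviceType::NPU};
  } else if (build_platform == "win-arm64") {
    targets = {DeviceType::NPU};
  }

  std::vector<DeviceType> result;
  for (DeviceType type : targets) {
    if (discovered.count(type) == 0) {
      result.push_back(type);  // no real device of this type, so expose a virtual one
    }
  }
  return result;
}
```

For example, on a hypothetical "win-x64" build where ORT found a CPU and a GPU, this policy would yield a single virtual NPU.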

Contributor Author

As per our offline discussion, this PR now allows the app to signal to the EP (via a ".virtual" suffix on the EP registration name) that virtual devices are allowed.

cpu_device.vendor_id = cpuid_info.GetCPUVendorId();
cpu_device.device_id = 0;
cpu_device.type = OrtHardwareDeviceType_CPU;
cpu_device.metadata.Add(kOrtHardwareDevice_MetadataKey_DiscoveredBy, "ONNX Runtime");
Contributor

Do we need this as well as IsVirtual? I'm wary of having too much stuff in metadata, and IsVirtual possibly already implies the device was not discovered by ORT, unless there's some other feature that will rely on an explicit 'discovered by' value.

Contributor Author

My thought is that:

  • DiscoveredBy tells the application who discovered the device. May be useful if the app only wants stuff discovered by ORT.
  • IsVirtual is "1" if the device does not correspond to actual hardware. It's the negation of the IsHardware property used by the DXCore API. This is a separate concept from DiscoveredBy, and would be useful for off-target compilation, for example.

Contributor

If we only show the virtual device to the EP that created it, is there value in DiscoveredBy?

Contributor Author

The application, which can see all OrtEpDevices via GetEpDevices, would be able to see who created the hw device.

Contributor Author

I removed DiscoveredBy and retained IsVirtual.

@skottmckay
Contributor

How does an EP know what sort of device it should create? e.g. an arm64 vs x64 virtual device.

Would it create a generic virtual device so we have an OrtEpDevice for it, which is then explicitly selected using SessionOptionsAppendExecutionProvider_V2, with the EP options provided there specifying things like the target architecture in a cross-compiling scenario?

@adrianlizarraga adrianlizarraga changed the title [DRAFT] Allow EP to provide additional HW devices [Plugin EP] Allow EP to provide additional virtual devices Oct 27, 2025
}

Status SetEpFactoryEnvironmentOptions(OrtEpFactory& factory, std::string_view lib_registration_name) {
  if (factory.SetEnvironmentOptions == nullptr) {
Contributor

Needs a version check as well. The pointer could be random bits if the EP was built with ORT 1.23, since that version's OrtEpFactory struct does not include this field.

Contributor Author

Thanks!
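
The version check requested in this thread could look roughly like the following. This is a sketch, not the code in the PR: MockEpFactory is a simplified stand-in for OrtEpFactory, and the version constant 24 is an assumption chosen because the comment above mentions EPs built with ORT 1.23.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for OrtEpFactory, reduced to the fields needed here.
struct MockEpFactory {
  uint32_t ort_version_supported;                 // ORT API version the EP was built against
  int (*SetEnvironmentOptions)(MockEpFactory*);   // optional function added in a later version
};

// Assumed for illustration: the first API version whose struct contains the new field.
constexpr uint32_t kVersionWithSetEnvironmentOptions = 24;

inline bool CanCallSetEnvironmentOptions(const MockEpFactory& factory) {
  // Check the version first. For an older EP the function pointer must not be read at all,
  // because that EP's struct may end before the field and the memory there is arbitrary.
  return factory.ort_version_supported >= kVersionWithSetEnvironmentOptions &&
         factory.SetEnvironmentOptions != nullptr;
}
```

The short-circuit evaluation of && is what keeps the pointer from being inspected when the version is too old.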
