Description
When I use the Vulkan backend on my machine, LLamaSharp frequently falls back to the CPU for inference. This appears to happen because vulkaninfo is not given enough time to complete during GPU detection.
Currently, we're using TimeSpan.FromSeconds(1) as the timeout for the vulkaninfo --summary command in GetVulkanSummary(). However, on my machine, running this command takes about 1.92 seconds on average. This means that most of the time, the command times out before it can return the necessary information, leading to the GPU not being detected and the fallback to CPU inference.
I did a quick test using a batch script to measure the execution time of vulkaninfo --summary (see code snippets below). This confirmed that the command consistently takes longer than 1 second to complete on my system.
To address this, I simply increased the timeout to 10 seconds (TimeSpan.FromSeconds(10)). This seems to have resolved the issue, as the GPU is now consistently detected and used for inference.
I would appreciate it if this could be changed in the next release.
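As a rough illustration of the proposed change, here is a minimal, self-contained sketch of running vulkaninfo with a caller-supplied timeout. It uses only System.Diagnostics and does not call LLamaSharp's SafeRun; the class name VulkanProbe, the helper RunWithTimeout, the optional timeout parameter, and the choice to treat a non-zero exit code as failure are all assumptions made for this example, not the library's actual implementation.

```csharp
using System;
using System.Diagnostics;

public static class VulkanProbe
{
    // Hypothetical helper (not LLamaSharp's SafeRun): runs a command and returns
    // its standard output, or null if it cannot start, times out, or exits non-zero.
    public static string? RunWithTimeout(string fileName, string arguments, TimeSpan timeout)
    {
        try
        {
            using Process process = new()
            {
                StartInfo = new ProcessStartInfo
                {
                    FileName = fileName,
                    Arguments = arguments,
                    RedirectStandardOutput = true,
                    RedirectStandardError = true,
                    UseShellExecute = false,
                    CreateNoWindow = true
                }
            };
            process.Start();

            // Drain both pipes asynchronously so a full buffer cannot stall the child.
            var stdout = process.StandardOutput.ReadToEndAsync();
            _ = process.StandardError.ReadToEndAsync();

            if (!process.WaitForExit((int)timeout.TotalMilliseconds))
            {
                // Timed out: stop the child and report failure.
                try { process.Kill(); } catch { /* the process may have exited already */ }
                return null;
            }

            string output = stdout.GetAwaiter().GetResult();
            return process.ExitCode == 0 ? output : null;
        }
        catch
        {
            // For example, the executable was not found on PATH.
            return null;
        }
    }

    // Assumed default of 10 seconds, matching the value tested in this issue.
    public static string? GetVulkanSummary(TimeSpan? timeout = null)
        => RunWithTimeout("vulkaninfo", "--summary", timeout ?? TimeSpan.FromSeconds(10));
}
```

A longer timeout only delays the failure path; when vulkaninfo finishes quickly, WaitForExit returns as soon as the process exits, so a generous default should not slow down successful detection.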
Code Snippets:
Current GetVulkanSummary() function:
private static string? GetVulkanSummary()
{
    // Note: on Linux, this requires `vulkan-tools` to be installed. (`sudo apt install vulkan-tools`)
    try
    {
        // Start a process to read vulkan info
        Process process = new()
        {
            StartInfo = new()
            {
                FileName = "vulkaninfo",
                Arguments = "--summary",
                RedirectStandardOutput = true,
                RedirectStandardError = true,
                UseShellExecute = false,
                CreateNoWindow = true
            }
        };
        var (exitCode, output, error, ok) = process.SafeRun(TimeSpan.FromSeconds(1)); // Timeout is 1 second which needs to be increased
        if (!ok)
            return null;
        // Return the output
        return output;
    }
    catch
    {
        // Return null if we failed to get the Vulkan version
        return null;
    }
}
Batch script for measuring vulkaninfo execution time:
@echo off
set start=%time%
vulkaninfo --summary
set end=%time%
echo Start Time: %start%
echo End Time: %end%
REM Calculate the difference in time
set /a diff_seconds=((1%end:~0,2%-1%start:~0,2%)*3600 + (1%end:~3,2%-1%start:~3,2%)*60 + (1%end:~6,2%-1%start:~6,2%)) %% 86400
set /a diff_milliseconds=((1%end:~9,2%-1%start:~9,2%) + (1%end:~6,2%-1%start:~6,2%) * 100 + (1%end:~3,2%-1%start:~3,2%)*6000 + (1%end:~0,2%-1%start:~0,2%)*360000) %% 8640000
echo.
echo Execution time: %diff_seconds%.%diff_milliseconds:~-3% seconds
pause
Example output from the batch script:
==========
VULKANINFO
==========
Vulkan Instance Version: 1.3.250
Instance Extensions: count = 13
-------------------------------
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_swapchain_colorspace : extension revision 4
VK_KHR_device_group_creation : extension revision 1
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_win32_surface : extension revision 6
VK_LUNARG_direct_driver_loading : extension revision 1
Instance Layers: count = 16
---------------------------
VK_LAYER_AMD_switchable_graphics AMD switchable graphics layer 1.3.260 version 1
VK_LAYER_EOS_Overlay Vulkan overlay layer for Epic Online Services 1.2.136 version 1
VK_LAYER_EOS_Overlay Vulkan overlay layer for Epic Online Services 1.2.136 version 1
VK_LAYER_KHRONOS_profiles Khronos Profiles layer 1.3.290 version 1
VK_LAYER_KHRONOS_shader_object Khronos Shader object layer 1.3.290 version 1
VK_LAYER_KHRONOS_synchronization2 Khronos Synchronization2 layer 1.3.290 version 1
VK_LAYER_KHRONOS_validation Khronos Validation Layer 1.3.290 version 1
VK_LAYER_LUNARG_api_dump LunarG API dump layer 1.3.290 version 2
VK_LAYER_LUNARG_crash_diagnostic Crash Diagnostic Layer is a crash/hang debugging tool that helps determines GPU progress in a Vulkan application. 1.3.290 version 1
VK_LAYER_LUNARG_gfxreconstruct GFXReconstruct Capture Layer Version 1.0.5 1.3.290 version 4194309
VK_LAYER_LUNARG_monitor Execution Monitoring Layer 1.3.290 version 1
VK_LAYER_LUNARG_screenshot LunarG image capture layer 1.3.290 version 1
VK_LAYER_OBS_HOOK Open Broadcaster Software hook 1.3.216 version 1
VK_LAYER_RTSS RTSS overlay hook bootstrap 1.3.224 version 1
VK_LAYER_VALVE_steam_fossilize Steam Pipeline Caching Layer 1.3.207 version 1
VK_LAYER_VALVE_steam_overlay Steam Overlay Layer 1.3.207 version 1
Devices:
========
GPU0:
apiVersion = 1.3.260
driverVersion = 2.0.279
vendorID = 0x1002
deviceID = 0x67df
deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
deviceName = Radeon RX 580 Series
driverID = DRIVER_ID_AMD_PROPRIETARY
driverName = AMD proprietary driver
driverInfo = 24.1.1 (AMD proprietary shader compiler)
conformanceVersion = 1.3.3.1
deviceUUID = 00000000-0100-0000-0000-000000000000
driverUUID = 414d442d-5749-4e2d-4452-560000000000
Start Time: 15:34:33.59
End Time: 15:34:34.51
Execution time: 1.92 seconds
Press any key to continue . . .
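As a cross-check on the batch measurement, the same timing can be taken from C# with Stopwatch, which avoids parsing %time% strings. The batch arithmetic computes the seconds and centiseconds fields separately, so it can misreport when the centiseconds wrap between the two readings (the two timestamps printed above, 15:34:33.59 and 15:34:34.51, are 0.92 s apart). The sketch below is an assumption-laden example, not part of LLamaSharp: the class name CommandTimer and the method Measure are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class CommandTimer
{
    // Hypothetical helper: runs the command `runs` times and returns the
    // wall-clock duration of each run, measured with Stopwatch.
    public static TimeSpan[] Measure(string fileName, string arguments, int runs)
    {
        var durations = new TimeSpan[runs];
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            using (var process = new Process
            {
                StartInfo = new ProcessStartInfo
                {
                    FileName = fileName,
                    Arguments = arguments,
                    RedirectStandardOutput = true,
                    UseShellExecute = false,
                    CreateNoWindow = true
                }
            })
            {
                process.Start();
                // Drain stdout so the child is never blocked on a full pipe.
                process.StandardOutput.ReadToEnd();
                process.WaitForExit();
            }
            stopwatch.Stop();
            durations[i] = stopwatch.Elapsed;
        }
        return durations;
    }
}
```

Calling CommandTimer.Measure("vulkaninfo", "--summary", 5) and comparing the individual durations with the configured timeout should show how often a 1 second limit would be exceeded on a given machine. The first run is often slower than later ones because drivers and layers are loaded cold, so it is worth looking at each value rather than only the average.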
Reproduction Steps
- Install Vulkan backend
- Enable backend
  NativeLibraryConfig.All
      .WithCuda(false)
      .WithVulkan(true)
      .WithAutoFallback();
- Load any model
Environment & Configuration
- Operating system: Windows 11
- .NET runtime version: 8.0
- LLamaSharp version: v0.16.0
- CPU & GPU device: I5-11400F & Radeon RX 580 4GB
Known Workarounds
No response