Feature Request: add "bf16:1" to the llama-bench device info reporting if built with VK_KHR_bfloat16 support and the driver also supports it. #13274

@oscarbg

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Hi,
I compiled the latest llama.cpp with the Vulkan backend enabled on a system with Vulkan bfloat16 support:

cmake -B build -DGGML_VULKAN=ON

The configure output shows:

-- Vulkan found
-- GL_KHR_cooperative_matrix supported by glslc
-- GL_NV_cooperative_matrix2 supported by glslc
-- GL_EXT_integer_dot_product supported by glslc
-- GL_EXT_bfloat16 supported by glslc
-- Including Vulkan backend

The driver also supports the VK_KHR_bfloat16 extension, so when I run llama-bench I see:

ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce RTX 4070 (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: NV_coopmat2

Just as "matrix cores: NV_coopmat2" shows that llama.cpp was compiled with coopmat2 support and that the GPU and driver also support that extension, I would like to see similar information for bfloat16, e.g. bf16:1:

ggml_vulkan: 0 = NVIDIA GeForce RTX 4070 (NVIDIA) | uma: 0 | fp16: 1 | **bf16:1** | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: NV_coopmat2
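To make the request concrete, here is a minimal sketch of the logic I have in mind. It is not the actual ggml-vulkan code: `DeviceInfo`, `bf16_reported` and `format_device_line` are hypothetical names made up for illustration, and the driver's extension list is passed in as plain strings so that the example does not depend on the Vulkan headers. The extension name is a parameter as well; I believe the registered name is VK_KHR_shader_bfloat16, but treat that as an assumption. The sketch also assumes the new field uses the same "key: value" spacing as the existing fields.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical container for the values printed on the device line.
struct DeviceInfo {
    int index;
    std::string name;
    std::string vendor;
    bool uma;
    bool fp16;
    bool bf16;
    int warp_size;
    int shared_memory;
    bool int_dot;
    std::string matrix_cores;
};

// bf16 is reported only when BOTH conditions hold:
//   1. the build was configured with bfloat16 shader support, and
//   2. the driver lists the bfloat16 extension.
bool bf16_reported(bool built_with_bf16,
                   const std::vector<std::string>& driver_extensions,
                   const std::string& bf16_extension) {
    const bool driver_has_it =
        std::find(driver_extensions.begin(), driver_extensions.end(),
                  bf16_extension) != driver_extensions.end();
    return built_with_bf16 && driver_has_it;
}

// Builds a line in the same layout as the existing output, with an
// extra "bf16" field placed right after "fp16".
std::string format_device_line(const DeviceInfo& d) {
    auto flag = [](bool b) { return std::string(b ? "1" : "0"); };
    return "ggml_vulkan: " + std::to_string(d.index) + " = " + d.name +
           " (" + d.vendor + ")" +
           " | uma: " + flag(d.uma) +
           " | fp16: " + flag(d.fp16) +
           " | bf16: " + flag(d.bf16) +
           " | warp size: " + std::to_string(d.warp_size) +
           " | shared memory: " + std::to_string(d.shared_memory) +
           " | int dot: " + flag(d.int_dot) +
           " | matrix cores: " + d.matrix_cores;
}
```

With this shape, a build without bfloat16 shader support would print bf16: 0 even on a driver that exposes the extension, which matches how the other capability fields behave.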

Later, if an fp8 Vulkan extension is added, a similar fp8:1 field could be shown, or perhaps an array of accelerated floating-point formats could be displayed, for example "fp:[fp16,bf16,fp8]".
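The array variant could be produced by something like the following sketch. `format_fp_field` is a hypothetical helper, not an existing llama.cpp function, and the format names and the idea of passing (name, supported) pairs are assumptions made only for this illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Joins the names of all supported formats into "fp:[a,b,c]".
// Each pair is (format name, supported by both build and driver).
std::string format_fp_field(
        const std::vector<std::pair<std::string, bool>>& formats) {
    std::string out = "fp:[";
    bool first = true;
    for (const auto& f : formats) {
        if (!f.second) {
            continue;  // skip formats that are not accelerated
        }
        if (!first) {
            out += ",";
        }
        out += f.first;
        first = false;
    }
    out += "]";
    return out;
}
```

A single list field like this would keep the line from growing by one column every time a new format extension appears, at the cost of being slightly harder to grep than separate fp16/bf16/fp8 flags.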

Thanks.

Motivation

Easy detection of whether llama.cpp has been built with support for the Vulkan bfloat16 extension and whether the GPU driver also supports it, by reporting bf16:1, similar to how the "matrix cores:" and "int dot:" fields report support only if both llama.cpp and the driver support the corresponding extension.
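As an illustration of the kind of detection this would enable, a script or tool could read the field straight from the device line. The sketch below is only an example: `device_field` is a hypothetical helper, and it assumes fields are separated by " | " and written as "key: value", as in the output shown above.

```cpp
#include <cstddef>
#include <string>

// Looks up "key: value" in a " | "-separated device line.
// Returns true and stores the value if the key is present.
bool device_field(const std::string& line, const std::string& key,
                  std::string& value) {
    const std::string sep = " | ";
    const std::string prefix = key + ": ";
    std::size_t pos = 0;
    while (pos <= line.size()) {
        std::size_t end = line.find(sep, pos);
        if (end == std::string::npos) {
            end = line.size();  // last field runs to the end of the line
        }
        const std::string field = line.substr(pos, end - pos);
        if (field.compare(0, prefix.size(), prefix) == 0) {
            value = field.substr(prefix.size());
            return true;
        }
        pos = end + sep.size();
    }
    return false;
}
```

On the current output the lookup for "bf16" simply fails, so today there is no way to tell "not built with bfloat16" apart from "driver lacks the extension" without checking the build log and the driver separately.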

Possible Implementation

No response
