[release/2.6] Change gfx110x BLAS preferred backend #2053


Merged: 5 commits merged into release/2.6 on May 13, 2025

Conversation

@amd-imilenko commented Apr 25, 2025

Only AMD Instinct GPUs prefer hipBLASLt by default, but the user can still override this using an env var.

Cherry-picked to release/2.5 branch via #2169
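
For reference, the env-var override the description relies on is resolved once, when the global context is constructed. A minimal sketch of that init-time check, modeled on aten/src/ATen/Context.h (the exact member name and shape are assumptions, not a verbatim copy of the branch):

#include <c10/util/env.h>

// Sketch: seed the preferred BLAS backend from the environment once.
// Under ROCm, TORCH_BLAS_PREFER_HIPBLASLT plays the role of
// TORCH_BLAS_PREFER_CUBLASLT; check_env returns a tri-state
// (set-true / set-false / unset) std::optional<bool>.
at::BlasBackend blas_preferred_backend =
    (c10::utils::check_env("TORCH_BLAS_PREFER_CUBLASLT") == true ||
     c10::utils::check_env("TORCH_BLAS_PREFER_HIPBLASLT") == true)
        ? at::BlasBackend::Cublaslt   // hipBLASLt on ROCm builds
        : at::BlasBackend::Cublas;    // rocBLAS on ROCm builds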

@amd-imilenko amd-imilenko requested a review from jeffdaily April 25, 2025 14:28

rocm-repo-management-api bot commented Apr 25, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

rocm-repo-management-api bot

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit is in progress
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Apr 28, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@jeffdaily (Collaborator)

For upstream release/2.7 we applied this patch, which adds a Default BLAS backend that then resolves to cublas or cublaslt: pytorch#150212

For release/2.6, it should be as straightforward as this diff:

diff --git a/aten/src/ATen/Context.cpp b/aten/src/ATen/Context.cpp
index a0e3b3d638..fbdbe767e3 100644
--- a/aten/src/ATen/Context.cpp
+++ b/aten/src/ATen/Context.cpp
@@ -320,7 +320,7 @@ at::BlasBackend Context::blasPreferredBackend() {
       static const std::vector<std::string> archs = {
           "gfx90a", "gfx942"
 #if ROCM_VERSION >= 60300
-          , "gfx1100", "gfx1101", "gfx1200", "gfx1201"
+          , "gfx1200", "gfx1201"
 #endif
 #if ROCM_VERSION >= 60500
           , "gfx950"
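
To spell out what that arch list controls: inside Context::blasPreferredBackend() it gates whether a Cublaslt preference is allowed to stand. Roughly, as a simplified sketch inferred from this thread (not copied from the branch):

// Sketch: if the active GPU is not in `archs`, a hipBLASLt preference is
// demoted to the rocBLAS-backed default; otherwise whatever the env var
// (or an explicit setter call) chose is left alone.
if (blas_preferred_backend == at::BlasBackend::Cublaslt) {
  static const bool arch_supported =
      detail::getCUDAHooks().isGPUArch(archs);  // signature differs per branch, see below
  if (!arch_supported) {
    TORCH_WARN_ONCE(
        "Attempting to use hipBLASLt on an unsupported architecture! "
        "Overriding blas backend to hipblas");
    blas_preferred_backend = at::BlasBackend::Cublas;
  }
}
return blas_preferred_backend;

Read this way, trimming gfx1100/gfx1101 from the list would demote gfx11 even when the env var is set, which is exactly the concern raised below.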


rocm-repo-management-api bot commented Apr 28, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Apr 30, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Apr 30, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Apr 30, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as ABORTED
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Apr 30, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented Apr 30, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented May 2, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@amd-imilenko (Author)

@jeffdaily Wouldn't the change demonstrated in the diff cause the preferred backend for gfx11* to always be Cublas, even when the environment variable TORCH_BLAS_PREFER_HIPBLASLT is set to true? The idea of the change was to set the preferred backend to Cublas for gfx11*, but still allow switching to Cublaslt if TORCH_BLAS_PREFER_HIPBLASLT is explicitly set to true.

@apakbin commented May 5, 2025

Given the widespread regression of hipBLASLt on gfx110x, can we disable it for gfx120x as well on release/2.6? (CC @pruthvistony)


rocm-repo-management-api bot commented May 6, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented May 6, 2025

Jenkins build for 744671327dcced25d3f72ab8bc7c86e0385106eb commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented May 6, 2025

Jenkins build for d9b0d061725412951c47dce318dbf10d4823a297 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@jeffdaily (Collaborator)

@amd-imilenko @apakbin I updated this PR with a slightly different approach. Please review. The env var is respected; only Instinct GPUs will default to hipblaslt.


rocm-repo-management-api bot commented May 7, 2025

Jenkins build for 744671327dcced25d3f72ab8bc7c86e0385106eb commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@apakbin commented May 7, 2025

Thanks @jeffdaily. It seems to me that the flags TORCH_BLAS_PREFER_HIPBLASLT/TORCH_BLAS_PREFER_CUBLASLT are already checked in aten/src/ATen/Context.h, and what Context::blasPreferredBackend() in Context.cpp does is revert the setting back to rocBLAS if the user has indicated they want hipBLASLt but the system does not support it. So, as far as I understand, we don't need to check those flags again in Context::blasPreferredBackend().
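
Putting the two layers side by side, the intended division of labor looks roughly like this (a sketch under the thread's assumptions; hipblasltSupportedOnThisDevice is a hypothetical stand-in for the arch gate shown earlier):

at::BlasBackend Context::blasPreferredBackend() {
#ifdef USE_ROCM
  // Layer 1 already ran at init: blas_preferred_backend was seeded from
  // TORCH_BLAS_PREFER_HIPBLASLT / TORCH_BLAS_PREFER_CUBLASLT in Context.h,
  // so the env vars are deliberately not re-read here.
  // Layer 2: demote a hipBLASLt request the system cannot honor.
  if (blas_preferred_backend == at::BlasBackend::Cublaslt &&
      !hipblasltSupportedOnThisDevice()) {  // hypothetical helper
    blas_preferred_backend = at::BlasBackend::Cublas;  // i.e. rocBLAS
  }
#endif
  return blas_preferred_backend;
}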


rocm-repo-management-api bot commented May 8, 2025

Jenkins build for 744671327dcced25d3f72ab8bc7c86e0385106eb commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[5936/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Dispatch.cpp.o
[5937/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/LegacyVmapMode.cpp.o
[5938/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ThreadLocalPythonObjects.cpp.o
[5939/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/MAIAHooksInterface.cpp.o
[5940/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/EmptyTensor.cpp.o
FAILED: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/EmptyTensor.cpp.o 
/opt/cache/bin/sccache /opt/cache/bin/c++ … -o caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/EmptyTensor.cpp.o -c /var/lib/jenkins/pytorch/aten/src/ATen/EmptyTensor.cpp
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/EmptyTensor.cpp:5:
/var/lib/jenkins/pytorch/aten/src/ATen/Context.h: In lambda function:
/var/lib/jenkins/pytorch/aten/src/ATen/Context.h:431:45: error: cannot convert ‘const std::vector<std::__cxx11::basic_string<char> >’ to ‘c10::DeviceIndex’ {aka ‘signed char’}
  431 |       if (!detail::getCUDAHooks().isGPUArch(archs, index)) {

@amd-vlarakic

Hi @jeffdaily and @apakbin,
Correct me if I am wrong, but wouldn't excluding gfx120x from the list of architectures that default to hipblaslt prevent fp8 workloads (GEMMs) from being executed on these devices out of the box, without setting the env variable?
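
To make the worry concrete: if the fp8 GEMM path is only wired through the *lt backend, something like the following hypothetical guard (illustrative only, not the actual check in Blas.cpp) would fire out of the box on a rocBLAS default:

// Hypothetical guard in a scaled-GEMM entry point.
TORCH_CHECK(
    at::globalContext().blasPreferredBackend() == at::BlasBackend::Cublaslt,
    "fp8 GEMM requires hipBLASLt; set TORCH_BLAS_PREFER_HIPBLASLT=1 or add "
    "this architecture back to the default list");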

@fjankovi commented May 8, 2025

> @amd-imilenko @apakbin I updated this PR with a slightly different approach. Please review. The env var is respected; only Instinct GPUs will default to hipblaslt.

@jeffdaily We also want gfx12 to default to hipblaslt (and probably also APUs if added later).


rocm-repo-management-api bot commented May 8, 2025

Jenkins build for 744671327dcced25d3f72ab8bc7c86e0385106eb commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[5933/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/CPUGuardImpl.cpp.o
[5934/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Dispatch.cpp.o
[5935/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/detail/MetaGuardImpl.cpp.o
[5936/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/LegacyVmapMode.cpp.o
[5937/8040] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/DeviceAccelerator.cpp.o
FAILED: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/DeviceAccelerator.cpp.o 
/opt/cache/bin/sccache /opt/cache/bin/c++ … -o caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/DeviceAccelerator.cpp.o -c /var/lib/jenkins/pytorch/aten/src/ATen/DeviceAccelerator.cpp
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/DeviceAccelerator.cpp:1:
/var/lib/jenkins/pytorch/aten/src/ATen/Context.h: In lambda function:
/var/lib/jenkins/pytorch/aten/src/ATen/Context.h:431:45: error: cannot convert ‘const std::vector<std::__cxx11::basic_string<char> >’ to ‘c10::DeviceIndex’ {aka ‘signed char’}
  431 |       if (!detail::getCUDAHooks().isGPUArch(archs, index)) {


rocm-repo-management-api bot commented May 8, 2025

Jenkins build for 744671327dcced25d3f72ab8bc7c86e0385106eb commit finished as FAILURE
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented May 8, 2025

Jenkins build for 744671327dcced25d3f72ab8bc7c86e0385106eb commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@apakbin commented May 12, 2025

CC @pruthvistony

@apakbin commented May 12, 2025

The compile error seems to stem from PR pytorch#150473 not having been cherry-picked into release/2.6. That PR added an index parameter to the isGPUArch() function. If we cherry-pick it here, the error would go away.
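
For anyone hitting the same wall: the mismatch is consistent with the error text above. A sketch of the collision (both signatures are inferred from the log and pytorch#150473, not copied from source):

// release/2.6 hooks, inferred: DeviceIndex comes first.
bool isGPUArch(c10::DeviceIndex device_index,
               const std::vector<std::string>& archs);

// Call site cherry-picked into this PR, using the post-#150473 order:
detail::getCUDAHooks().isGPUArch(archs, index);
// The vector lands on the DeviceIndex parameter, hence:
//   error: cannot convert 'const std::vector<std::string>' to 'c10::DeviceIndex'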

rocm-repo-management-api bot

Jenkins build for 14341d582f9184f6b9556e4252bbe2ccd921e3c6 commit is in progress
Links: Blue Ocean view / Build artifacts


rocm-repo-management-api bot commented May 12, 2025

Jenkins build for 14341d582f9184f6b9556e4252bbe2ccd921e3c6 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[189/8040] Building CXX object third_party/protobuf/cmake/CMakeFiles/libprotoc.dir/__/src/google/protobuf/compiler/objectivec/objectivec_generator.cc.o
[190/8040] Building CXX object third_party/protobuf/cmake/CMakeFiles/protoc.dir/__/src/google/protobuf/compiler/main.cc.o
[191/8040] Building C object confu-deps/cpuinfo/CMakeFiles/cpuinfo_internals.dir/src/x86/name.c.o
[192/8040] Building C object confu-deps/cpuinfo/CMakeFiles/cpuinfo_internals.dir/src/x86/isa.c.o
[193/8040] Performing download step (download, verify and extract) for 'aotriton_external'
FAILED: aotriton_external-prefix/src/aotriton_external-stamp/aotriton_external-download /var/lib/jenkins/pytorch/build/aotriton_external-prefix/src/aotriton_external-stamp/aotriton_external-download 
cd /var/lib/jenkins/pytorch/build && /opt/conda/envs/py_3.12/bin/cmake -DCMAKE_MESSAGE_LOG_LEVEL=VERBOSE -P /var/lib/jenkins/pytorch/build/aotriton_external-prefix/src/aotriton_external-stamp/download-aotriton_external.cmake && /opt/conda/envs/py_3.12/bin/cmake -DCMAKE_MESSAGE_LOG_LEVEL=VERBOSE -P /var/lib/jenkins/pytorch/build/aotriton_external-prefix/src/aotriton_external-stamp/verify-aotriton_external.cmake && /opt/conda/envs/py_3.12/bin/cmake -DCMAKE_MESSAGE_LOG_LEVEL=VERBOSE -P /var/lib/jenkins/pytorch/build/aotriton_external-prefix/src/aotriton_external-stamp/extract-aotriton_external.cmake && /opt/conda/envs/py_3.12/bin/cmake -E touch /var/lib/jenkins/pytorch/build/aotriton_external-prefix/src/aotriton_external-stamp/aotriton_external-download
-- Downloading...
   dst='/var/lib/jenkins/pytorch/build/aotriton_external-prefix/src/aotriton-0.9.2b-manylinux_2_28_x86_64-rocm6.4-shared.tar.gz'
   timeout='none'
   inactivity timeout='none'

@amd-imilenko merged commit 1ded221 into release/2.6 on May 13, 2025
2 of 6 checks passed
@amd-imilenko deleted the change_gfx110_blas_preferred_backend branch May 13, 2025 09:44
@fjankovi

!cherry-pick --onto release/2.7

Created this PR for 2.7: #2125

@apakbin commented May 13, 2025

Great, thanks @fjankovi. I deleted my comment so it isn't applied twice.

@amd-imilenko (Author)

!cherry-pick --onto release/2.5

okakarpa pushed a commit that referenced this pull request May 20, 2025
Only AMD Instinct GPUs and Navi 4x prefer hipblaslt by default, but the user can still override using the env var.

---------

Co-authored-by: Jeff Daily <[email protected]>
@okakarpa (Collaborator)

Created branch autogenerated/release/2.5_cherry-pick_pr-2053 and #2169

pruthvistony pushed a commit that referenced this pull request Jun 6, 2025
…2169)

Cherry-pick of #2053

---------

Co-authored-by: Ilija Milenkovic <[email protected]>
Co-authored-by: Jeff Daily <[email protected]>
Co-authored-by: Arash Pakbin <[email protected]>