Rel-1.23.0 incompatible with Nvidia DGX Spark on CUDA13 #26096

@yinggeh

Describe the issue

Inference fails on the Nvidia DGX Spark with ONNX Runtime rel-1.23.0 built against CUDA 13. The failure does not reproduce on other platforms (e.g. A6000, A100, B100, AGX Thor), so it appears to be specific to this new platform.

To reproduce

  1. Set up the environment/container for the build
  2. Generate a simple add model, add.onnx, with generate_add_model.py
pip install onnx==1.18.0
python3 generate_add_model.py
  3. Clone the repo branch rel-1.23.0
git clone -b rel-1.23.0 --recursive https://github.com/microsoft/onnxruntime.git onnxruntime
  4. Build command (used in Triton onnxruntime_backend)
./build.sh --config Release --skip_submodule_sync --parallel --build_shared_lib --compile_no_warning_as_error --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES='80-real;86-real;90-real;100-real;110-real;120'  --cmake_extra_defines CMAKE_POLICY_VERSION_MINIMUM=3.5 --update --build --use_cuda --cuda_home "/usr/local/cuda" --cudnn_home "/usr" --allow_running_as_root
  5. Compile and run infer_add.cpp
g++ infer_add.cpp -I<ORT_INCLUDE_DIR> -L<ORT_LIB_DIR> -lonnxruntime -o infer_add -std=c++17 -DUSE_CUDA
LD_LIBRARY_PATH=<ORT_LIB_DIR>:$LD_LIBRARY_PATH ./infer_add

You should see the following error:

2025-09-19 05:33:20.001722449 [E:onnxruntime:, sequential_executor.cc:572 ExecuteKernel] Non-zero status code returned while running Add node. Name:'' Status Message: CUDA error cudaErrorSymbolNotFound:named symbol not found
ONNX Runtime error: 1: Non-zero status code returned while running Add node. Name:'' Status Message: CUDA error cudaErrorSymbolNotFound:named symbol not found
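
cudaErrorSymbolNotFound can have several causes. One thing that may be worth ruling out during triage is whether the CMAKE_CUDA_ARCHITECTURES list from the build command covers the device's compute capability. The sketch below is not a diagnosis. It only applies the general CUDA compatibility rules: SASS built for X.y runs on devices X.z with z >= y, and PTX for X.y can be JIT-compiled on any device with capability >= X.y. The function names are hypothetical, and the device capability in the usage example is an assumed value that should be replaced with the one reported by `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`.

```python
# Hypothetical triage helper; not part of ONNX Runtime or CUDA.
from typing import List, Tuple

Capability = Tuple[int, int]


def parse_cuda_architectures(spec: str) -> Tuple[List[Capability], List[Capability]]:
    """Split a CMAKE_CUDA_ARCHITECTURES string into (sass, ptx) capability lists."""
    sass: List[Capability] = []
    ptx: List[Capability] = []
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        number, _, suffix = entry.partition("-")
        # "120" -> (12, 0). Arch-specific entries such as "90a" are not handled
        # here and raise ValueError from int().
        cc = divmod(int(number), 10)
        if suffix == "real":        # SASS only
            sass.append(cc)
        elif suffix == "virtual":   # PTX only
            ptx.append(cc)
        elif suffix == "":          # CMake emits both SASS and PTX
            sass.append(cc)
            ptx.append(cc)
        else:
            raise ValueError(f"unrecognized architecture entry: {entry!r}")
    return sass, ptx


def coverage(device_cc: Capability, spec: str) -> Tuple[bool, bool]:
    """Return (has_native_sass, has_jit_ptx) for the given device capability."""
    sass, ptx = parse_cuda_architectures(spec)
    # SASS is compatible within the same major version, for equal or higher minor.
    native = any(major == device_cc[0] and minor <= device_cc[1] for major, minor in sass)
    # PTX is forward compatible with any equal or higher capability.
    jit = any(cc <= device_cc for cc in ptx)
    return native, jit


if __name__ == "__main__":
    build_spec = "80-real;86-real;90-real;100-real;110-real;120"
    device_cc = (12, 1)  # assumed for illustration; query the real value on the machine
    print(coverage(device_cc, build_spec))
```

If both values come back False for the real device, the build contains no code the GPU can run. If either is True, the architecture list is unlikely to be the cause and the problem lies elsewhere.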

Urgency

Urgent. It's blocking our 25.08 and 25.09 releases.

Platform

Linux

OS Version

24.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.23.0

ONNX Runtime API

C++

Architecture

ARM64

Execution Provider

CUDA

Execution Provider Library Version

13.0

Labels

ep:CUDA (issues related to the CUDA execution provider)