I have reviewed the Discussions and have a new bug or useful enhancement to share.
Feature Description
Add Ascend NPU as a new backend.
Motivation
Ascend is a full-stack AI computing infrastructure for industry applications and services based on Huawei Ascend processors and software. For more information about Ascend, see Ascend Community.
CANN (Compute Architecture of Neural Networks), developed by Huawei, is a heterogeneous computing architecture for AI.
PyTorch has officially announced support for the Ascend NPU (through the PrivateUse1 dispatch key); see the PrivateUse1 tutorial here.
This feature would add a new backend to llama.cpp, allowing Ascend NPU users to run model inference with llama.cpp.
Possible Implementation
The community has already provided a convenient backend access mechanism. The Ascend NPU is a CUDA-like device, so I plan to use the CUDA backend's implementation as a reference when building the Ascend NPU backend.
Because of the large workload, I plan to complete this feature in multiple stages. First, I will focus on the build system, backend registration, and device runtime functionality. I will also add a new test file to validate backend registration, memory allocation, tensor operations, and related functionality.
Next, I will proceed to implement tensor operators and validate them.
Afterward, I will work on performance optimization, including split-tensor support.
See also: very first commit #6035.