- Add a module so that `from A import special_operation` works; `special_operation` operates on NumPy arrays and provides a reference implementation in terms of NumPy arrays
- Add custom type annotations, e.g. `from A import memory_device_l1`, and use them to annotate NumPy arrays
- Add a plugin that implements an ASR->ASR pass that transforms all these annotations and special operations into low-level C API calls for the specific hardware
- The module `A` and the plugin (as an `.so` library) will be shipped externally, not as part of LPython
- This will allow anybody to extend LPython to work with their custom hardware
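As a rough sketch of what the user-facing side could look like: the snippet below stands in for the external module `A`, using `typing.Annotated` to attach the memory-placement marker to an array type. The names `special_operation` and `memory_device_l1` come from the plan above; everything else (the `Annotated` mechanism, the pure-Python fallback) is an assumption about one possible design, not an existing LPython API.

```python
from typing import Annotated

# Stand-in for the vendor-provided annotation type (hypothetical):
# marks an array as residing in the device's L1 memory. The ASR->ASR
# plugin would recognize this annotation and rewrite the code.
class memory_device_l1:
    pass

# Reference implementation operating on plain host data; the plugin
# would replace calls to this with the hardware API equivalent.
def special_operation(x: list[float]) -> list[float]:
    return [2.0 * v for v in x]

# User code: the annotation carries the placement information.
def kernel(x: Annotated[list[float], memory_device_l1]) -> list[float]:
    return special_operation(x)

print(kernel([1.0, 2.0, 3.0]))  # pure-Python fallback path
```

When compiled without the plugin, the reference implementation runs as ordinary code; with the plugin loaded, the pass would intercept the annotated arrays and `special_operation` calls and emit the device-specific calls instead.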
I'll be interested in adding some small support using MSL (for Apple M1). And we can see what design requirements are needed. I need to find some good resources for learning MSL.
@Smit-create here is how llama.cpp uses Metal to use the GPU (I think) on M1: ggml-org/llama.cpp#2615, so let's figure out how to run their kernels and then how to generate them using LPython.
The custom hardware backend could also be just a CPU with SIMD instructions. By annotating arrays, you can write vectorized code using NumPy arrays, and the CPU/SIMD ASR backend will take that code and ensure the correct LLVM code is generated, so that the final binary uses the CPU's vector instructions and the code runs at maximum speed.
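To make the SIMD case concrete, here is a minimal sketch of the kind of elementwise array expression such a backend would vectorize. The `saxpy` name and the list-based fallback are illustrative only; in LPython this would be written directly on NumPy arrays as `z = a*x + y`, and the CPU/SIMD backend would lower it to LLVM vector instructions (e.g. a fused multiply-add loop).

```python
# Hypothetical example: an elementwise a*x + y that a SIMD backend
# would compile to vectorized LLVM IR. Plain Python stand-in below.
def saxpy(a: float, x: list[float], y: list[float]) -> list[float]:
    # Equivalent NumPy-array form in LPython would be: z = a*x + y
    return [a * xi + yi for xi, yi in zip(x, y)]

print(saxpy(2.0, [1.0, 2.0], [3.0, 4.0]))  # [5.0, 8.0]
```

The point is that the user writes the high-level array expression once, and the choice of scalar loop, CPU vector instructions, or offload to custom hardware is made entirely by which ASR backend or plugin processes it.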