Allow to extend LPython with custom hardware backends #2259

Open
Tracked by #2258
certik opened this issue Aug 8, 2023 · 3 comments

Comments

@certik
Contributor

certik commented Aug 8, 2023

Here is how it would work:

  • Add a module A that exports special_operation (imported with from A import special_operation). It operates on NumPy arrays and is implemented using NumPy arrays
  • Add custom type annotations, such as from A import memory_device_l1, and use them to annotate NumPy arrays
  • Add a plugin that implements an ASR->ASR pass, which transforms all these annotations and special operations into low-level C / API calls for the specific hardware API

The module A and the plugin (as a shared library, .so) will be shipped externally, not as part of LPython.

This will allow anybody to extend LPython to work for their custom hardware.
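
To make the first two bullets concrete, here is a minimal sketch of what the external module A could look like. Only the names A, special_operation and memory_device_l1 come from the proposal above. Everything else is an assumption for illustration: the MemoryDevice class, the L1Array alias, the use of typing.Annotated to attach the annotation, and the multiply-add body of special_operation. None of this is an existing LPython API.

```python
# A.py: hypothetical sketch of the externally shipped module "A".
# Under CPython this is plain NumPy; a backend plugin would be expected to
# recognize these names and lower them to hardware API calls instead.
from typing import Annotated

import numpy as np


class MemoryDevice:
    """Hypothetical marker type; it carries a name and does nothing at runtime."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __repr__(self) -> str:
        return f"MemoryDevice({self.name!r})"


# Custom annotation named in the proposal. What "l1" means is up to the backend.
memory_device_l1 = MemoryDevice("l1")

# Assumed spelling of "a NumPy array annotated with memory_device_l1".
L1Array = Annotated[np.ndarray, memory_device_l1]


def special_operation(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # Reference implementation using NumPy arrays.
    # The multiply-add is an arbitrary placeholder operation.
    return x * y + x


def kernel(a: L1Array, b: L1Array) -> np.ndarray:
    # User code: annotated arrays plus a call to the special operation.
    return special_operation(a, b)
```

Because the reference implementation is ordinary NumPy, user code such as kernel can be run and debugged under CPython before any hardware plugin exists.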

@Smit-create
Collaborator

I'd be interested in adding some basic support using MSL (Metal Shading Language, for Apple M1). Then we can see what design requirements are needed. I need to find some good resources for learning MSL.

@certik
Contributor Author

certik commented Aug 14, 2023

@Smit-create here is how llama.cpp uses Metal to run on the GPU (I think) on M1: ggml-org/llama.cpp#2615. Let's figure out how to run their kernels and then how to generate them using LPython.

Here is another repository that shows how to run Metal from C++: https://github.com/larsgeb/m1-gpu-cpp

@certik
Contributor Author

certik commented Aug 30, 2023

A custom hardware backend can also be just a CPU with SIMD instructions. Users annotate arrays so that they can write vectorized code using NumPy arrays, and the CPU/SIMD ASR backend takes that code and ensures that the correct LLVM code is generated, so that the final binary uses the CPU vector instructions and runs at maximum speed.
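
As a rough illustration of what "vectorized code using NumPy arrays" means here, the sketch below contrasts an element-by-element loop with the equivalent whole-array expression. The function names axpy_loop and axpy_vectorized are made up for this example, and the array annotations are left out because their spelling is not specified in this thread.

```python
import numpy as np


def axpy_loop(alpha: float, x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # Scalar form: one element per iteration.
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        out[i] = alpha * x[i] + y[i]
    return out


def axpy_vectorized(alpha: float, x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # Whole-array form: the intent (one multiply-add per element) is explicit,
    # which is the kind of code a CPU/SIMD backend could map to vector instructions.
    return alpha * x + y


if __name__ == "__main__":
    x = np.arange(8, dtype=np.float64)
    y = np.ones(8, dtype=np.float64)
    # Both forms compute the same result under CPython.
    print(np.allclose(axpy_loop(2.0, x, y), axpy_vectorized(2.0, x, y)))
```

The two functions are interchangeable in CPython. The whole-array form is the one the proposal targets, since the backend rather than the user would be responsible for choosing the vector width and instructions.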
