SIMD backend #2310

Open
Tracked by #2258
certik opened this issue Aug 30, 2023 · 4 comments

Comments

@certik
Contributor

certik commented Aug 30, 2023

diff --git a/src/libasr/ASR.asdl b/src/libasr/ASR.asdl
index 26e60e172..d6a29ecef 100644
--- a/src/libasr/ASR.asdl
+++ b/src/libasr/ASR.asdl
@@ -420,6 +420,7 @@ array_physical_type
     = DescriptorArray
     | PointerToDataArray
     | FixedSizeArray
+    | SIMDArray
     | NumPyArray
     | ISODescriptorArray

We'll use Annotated:

from typing import Annotated
from lpython import f32, SIMD
x: Annotated[f32[64], SIMD]

In ASR we use the SIMDArray physical type, and then in the LLVM backend (or an ASR->ASR pass) we ensure all such arrays get vectorized; otherwise we give a compile-time error message. The conditions are:

  • Must be a 1D array
  • Array element type and all operations must map directly onto hardware instructions. For example, on the Apple M1 CPU, f32 add and multiply are supported (it will compile), but f16 multiply is not (compile-time error on Apple M1)
  • Fixed compile-time size
  • The size must be a multiple of the hardware vector length. If we have 512 bits (AVX-512), then sizeof(element) * size must be equal to 512 * n for n = 1, 2, 3, .... If n = 1, the array is stored directly in a register. For n > 1, the loop is unrolled (since the size is known at compile time), ensuring we hit maximum compute throughput (hiding IO and latency); see the sketch after this list.
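
To make the size and unrolling conditions concrete, here is a rough sketch in plain NumPy (an illustration only; VECTOR_BITS, LANES and the explicit chunking are assumptions standing in for what the backend would do, not LPython/ASR API):

import numpy as np

VECTOR_BITS = 512              # e.g. AVX-512
LANES = VECTOR_BITS // 32      # 16 f32 lanes per hardware vector

def simd_add(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # The rules above require a 1D array whose total bit size is a multiple
    # of the vector length, i.e. x.size == LANES * n for some n >= 1.
    n, rem = divmod(x.size, LANES)
    assert x.ndim == 1 and x.dtype == np.float32 and rem == 0
    out = np.empty_like(x)
    # Conceptually the backend emits n vector adds; with n known at compile
    # time the loop is fully unrolled, and for n == 1 the whole array lives
    # in a single register.
    for k in range(n):
        s = slice(k * LANES, (k + 1) * LANES)
        out[s] = x[s] + y[s]   # one hardware vector add per chunk
    return out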
@Shaikh-Ubaid
Collaborator

What is the priority for this issue?

@certik
Contributor Author

certik commented Aug 31, 2023

I added this issue here: #2258, but the relative priority is not clear yet. All the issues there are important.

@czgdp1807
Collaborator

This seems interesting. I would like to work on this along with the other things on my bucket list.

@certik
Contributor Author

certik commented Sep 8, 2023

Ok, here is an example of a vectorized Mandelbrot that we should try to compile using LPython to get maximum performance.

import numpy as np

MAX_ITERS = 100

# c: Annotated[c64[:], SIMD]
def mandelbrot_kernel2(c):
    z = np.empty(c.shape, dtype=np.complex128)
    z[:] = c[:]
    nv = np.zeros(c.shape, dtype=np.int8)
    # True if the point is in set, False otherwise
    mask = np.empty(c.shape, dtype=np.bool_)
    for i in range(MAX_ITERS):
        mask[:] = (abs(z) <= 2)
        if (all(mask == False)): break
        # z = z*z + c, applied only to points that have not escaped yet
        z[mask] *= z[mask]
        z[mask] += c[mask]
        nv[mask] += 1
    return nv

n = 8
height = 4096 // n
width = 4096 // n
min_x = -2.0
max_x = 0.47
min_y = -1.12
max_y = 1.12
scale_x = (max_x - min_x) / width
scale_y = (max_y - min_y) / height
simd_width = 512
assert simd_width <= width

output = np.empty((height,width), dtype=np.int8)

x = np.empty((simd_width), dtype=np.complex128)
for h in range(height):
    cy = min_y + h * scale_y
    # process the row in chunks of simd_width points
    for w0 in range(width // simd_width):
        w = np.arange(w0*simd_width, (w0+1)*simd_width, dtype=np.int32)
        cx = min_x + w * scale_x
        x[:] = cx + 1j*cy
        output[h,w] = mandelbrot_kernel2(x)

print(output)
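
For comparison, here is a plain scalar version of the same iteration (a sketch, not from the issue) that the vectorized kernel can be checked against point by point, e.g. nv[j] == mandelbrot_scalar(c[j]):

def mandelbrot_scalar(c):
    # per-point equivalent of mandelbrot_kernel2: iterate z = z*z + c and
    # count how many iterations stay within |z| <= 2
    z = c
    n = 0
    for _ in range(MAX_ITERS):
        if abs(z) > 2:
            break
        z = z*z + c
        n += 1
    return n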
