Would you consider adding ARM support, specifically arm64 (aarch64)? I think [DLTcollab/sse2neon](https://github.com/DLTcollab/sse2neon) might be helpful.