WASM to X86 Backend #1222
Comments
For
I think you can use the X86 assembler, which just generates a binary, but in Debug mode it can also emit a text format. Alternatively, we just generate a binary, and then do x86 -> text assembly.
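The binary-plus-optional-text idea can be sketched as follows. This is a minimal hypothetical illustration in Python, not the project's actual C++ `X86Assembler`; the class and method names are made up, and only two real x86 encodings are used (`B8+rd` for `mov r32, imm32` and `C3` for `ret`).

```python
# Hypothetical sketch of a dual-mode assembler: it always emits raw machine
# code, and in debug mode it also records a human-readable text listing.
import struct

class TinyAssembler:
    def __init__(self, debug=False):
        self.code = bytearray()   # raw x86 machine code
        self.listing = []         # text form, only filled in debug mode
        self.debug = debug

    def _emit(self, opcode_bytes, text):
        self.code += bytes(opcode_bytes)
        if self.debug:
            self.listing.append(text)

    def mov_eax_imm32(self, imm):
        # B8 id  ->  mov eax, imm32 (immediate stored little-endian)
        self._emit([0xB8] + list(struct.pack("<i", imm)), f"mov eax, {imm}")

    def ret(self):
        self._emit([0xC3], "ret")

a = TinyAssembler(debug=True)
a.mov_eax_imm32(42)
a.ret()
print(a.code.hex())           # binary output: b82a000000c3
print("\n".join(a.listing))   # text output (debug mode only)
```

In release builds the `debug` flag would stay off, so the text listing costs nothing on the fast path.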
lpython/src/libasr/codegen/asr_to_wasm.cpp Lines 1935 to 1949 in fc08c06
In
This idea of printing strings using their length and memory location is more our own concept/choice. It is possible that
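The length-plus-memory-location approach can be illustrated with the classic 32-bit Linux `write(2)` syscall sequence a backend could emit. This is a hedged sketch, not lpython's actual codegen: the `emit_print_string` helper and the `str0` label are hypothetical, but the register convention (`eax`=4 for `sys_write`, `ebx`=fd, `ecx`=buffer, `edx`=length, then `int 0x80`) is the standard i386 Linux ABI.

```python
# Hypothetical helper: given a string's address (as a label) and its length,
# produce the x86 text assembly that prints it on 32-bit Linux via int 0x80.
def emit_print_string(label, length):
    return [
        "mov eax, 4",            # syscall number: sys_write
        "mov ebx, 1",            # fd = stdout
        f"mov ecx, {label}",     # address of the string data
        f"mov edx, {length}",    # number of bytes to write
        "int 0x80",              # invoke the kernel
    ]

print("\n".join(emit_print_string("str0", 13)))
```

Because only an address and a byte count are needed, the string does not have to be NUL-terminated, which is exactly why storing the length alongside the location is convenient.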
Prospective Roadmap:
@Shaikh-Ubaid sorry, I was a little busy with this PR: #1256.
Here is how to benchmark:

```python
N = 10000
A_functions = ""
calls = ""
for i in range(N):
    func_A = f"""
def A{i}(x: i32) -> i32:
    y: i32
    z: i32
    y = {i}
    z = 5
    x = x + y * z
    return x
"""
    A_functions += func_A
    calls += f"    y = A{i}(y)\n"
source = f"""\
from ltypes import i32
{A_functions}
def Driver(radius: i32) -> i32:
    y: i32
    y = radius
{calls}
    return y
def Main0():
    print(Driver(5))
Main0()
"""
print(source)
```
There is a hackish idea to support printing strings in the
Sure, we can work together. I am hoping to follow the roadmap shared here: #1222 (comment). You can proceed with supporting
Here are the timing results. With

```console
$ time lpython --backend=llvm a.py -o a_llvm.x
lpython --backend=llvm a.py -o a_llvm.x  2.94s user 0.10s system 99% cpu 3.034 total
$ time lpython --backend=x86 a.py -o a_x86.x
lpython --backend=x86 a.py -o a_x86.x  0.09s user 0.01s system 93% cpu 0.104 total
$ time lpython --backend=wasm_x86 a.py -o a_wasm_x86.x
lpython --backend=wasm_x86 a.py -o a_wasm_x86.x  0.11s user 0.01s system 95% cpu 0.129 total
```

The programs run on x86 Linux.

Summary for this particular example:

The WASM x86 backend is roughly 25% slower than the direct x86 backend, which is not bad, given that both are over 20x faster than LLVM. The timings include parsing and semantics too, so the relative speed differences of the backends themselves are larger, but the above is a good start for getting an idea of the performance.
@Shaikh-Ubaid can you please submit a PR with the script in #1222 (comment) and put it into
Submitted here: #1260
Debug Mode benchmark results:

```console
(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ time python bench.py
249975005

real    0m0.461s
user    0m0.396s
sys     0m0.064s

(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ time lpython bench.py --backend llvm
249975005

real    1m55.329s
user    1m55.093s
sys     0m0.154s

(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ time lpython bench.py --backend x86
real    0m24.622s
user    0m24.588s
sys     0m0.028s

(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ time lpython bench.py --backend wasm_x86
real    0m48.122s
user    0m47.993s
sys     0m0.081s
```

Release Mode benchmark results:

```console
(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ time python bench.py
249975005

real    0m0.511s
user    0m0.412s
sys     0m0.095s

(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ time lpython bench.py --backend llvm
249975005

real    0m6.126s
user    0m6.021s
sys     0m0.084s

(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ time lpython bench.py --backend x86
real    0m0.282s
user    0m0.241s
sys     0m0.040s

(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ time lpython bench.py --backend wasm_x86
real    0m0.369s
user    0m0.308s
sys     0m0.061s
```

Note: The above results (for both Debug mode and Release mode) are from a single run each.
This might help with printing negative numbers; I am looking for some other resources.
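The standard routine for printing a signed integer can be sketched as follows. This is written in Python for clarity but structured the way a backend would emit it in assembly: handle the sign first (print `-` and negate), then peel off digits with repeated division by 10. The function name is illustrative, not lpython's.

```python
# Sketch: convert a signed integer to its decimal string, mirroring the
# steps a code generator would emit (sign check, negate, div/mod loop).
def int_to_str(n):
    if n == 0:
        return "0"
    neg = n < 0
    if neg:
        n = -n              # in asm: neg eax (after emitting the '-')
    digits = []
    while n > 0:
        digits.append(chr(ord("0") + n % 10))  # remainder -> ASCII digit
        n //= 10                               # quotient continues the loop
    if neg:
        digits.append("-")
    return "".join(reversed(digits))

print(int_to_str(-123))  # -123
```

One caveat for real 32-bit codegen: the most negative value (`INT_MIN`) cannot be negated in two's complement, so it needs a special case.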
How do I implement lpython/src/libasr/codegen/x86_assembler.h Lines 692 to 701 in 5084897
Where do I look for the hex value for shr? @certik @Shaikh-Ubaid
@Shaikh-Ubaid, I have a doubt.
lpython/src/libasr/codegen/wasm_to_x86.cpp Lines 101 to 102 in d4a6d53
Here the ConstI32 value is pushed onto the stack, right? Does this mean it will be stored in eax? Instead, why don't we mov the value to eax and pop into eax in the ConstI32 visitor?
It will be stored on the stack. It will not be stored in
We need the value to be on top of the stack. Do you mean to first store the value in
Okay. After pushing the value onto the stack,
I'm a little confused here. Can you share some useful resources to better understand the workings of
Yes, the other instructions that follow use it. For example, after the
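The push/pop pattern under discussion can be illustrated with a small lowering sketch. This is a hypothetical helper (not the actual `wasm_to_x86.cpp` visitor): `i32.const` leaves its value on the machine stack rather than in a register, and a following `i32.add` pops its two operands into registers, adds them, and pushes the result back so the next instruction finds it on top of the stack.

```python
# Illustrative lowering of a WASM stack-machine snippet to x86 text assembly.
def lower(instrs):
    out = []
    for op, *args in instrs:
        if op == "i32.const":
            out.append(f"push {args[0]}")   # value lives on the stack, not in eax
        elif op == "i32.add":
            out += ["pop ebx",              # second operand
                    "pop eax",              # first operand
                    "add eax, ebx",
                    "push eax"]             # result back on top of the stack
        else:
            raise NotImplementedError(op)
    return out

# wasm: i32.const 2; i32.const 3; i32.add
for line in lower([("i32.const", 2), ("i32.const", 3), ("i32.add",)]):
    print(line)
```

This shows why `ConstI32` cannot simply `mov` into `eax`: the value's resting place between instructions is the stack, and registers are only scratch space inside each operation.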
Cool, got it. Thanks
I mostly used a combination of https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf (also shared on Zulip; it is somewhat big), Google Search, and (some) YouTube. I did not find any specific (great) resource for
In here:

```cpp
void asm_shl_r32_imm8(X86Reg r32, uint8_t imm8) {
    if (r32 == X86Reg::eax) {
        m_code.push_back(m_al, 0xc1);
        m_code.push_back(m_al, 0xe0);
        m_code.push_back(m_al, imm8);
    } else {
        throw AssemblerError("Not implemented.");
    }
    EMIT("shl " + r2s(r32) + ", " + i2s(imm8));
}
```

we see the encoding of the shl instruction. The
@certik do we need support for floating point numbers in the
Yes, we should do floating point numbers.
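One common approach (a sketch, not necessarily what lpython ends up doing) is to place the raw IEEE-754 bytes of each float constant in the data section and load them with an SSE instruction such as `movsd`. The exact 8 bytes a backend would embed for an `f64` constant can be seen with `struct.pack`:

```python
# Sketch: the 8 little-endian IEEE-754 bytes a backend would place in the
# data section for an f64 constant (to be loaded with, e.g., movsd).
import struct

def f64_const_bytes(x):
    return struct.pack("<d", x)

print(f64_const_bytes(1.0).hex())  # 000000000000f03f
```

Treating float constants as opaque byte patterns sidesteps any need to "compute" them at runtime; only the arithmetic instructions themselves need new encodings.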
We now have WASM->x86 backends (both 32 and 64 bit), so I am closing this issue. We can open specific issues for missing features.
There is an idea to implement a wasm to x86 backend. We have to benchmark ASR->x86 and ASR->WASM->x86. If the performance difference is not much, then the big advantage of WASM->x86 would (hopefully) be that it would be easier and quicker for us to deliver a working backend.

Notes: