Skip to content

Design of SymbolicExpression types #1607

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
certik opened this issue Mar 22, 2023 · 15 comments
Open

Design of SymbolicExpression types #1607

certik opened this issue Mar 22, 2023 · 15 comments
Labels

Comments

@certik
Copy link
Contributor

certik commented Mar 22, 2023

Here is a design to get started:

from lpython import S
from sympy import Symbol

def f():
    x: S = Symbol("x")
    y: S = Symbol("y")
    z: S = x + y
    print(z)

f()

In ASR, this gets represented using a new SymbolicExpression type (and use S in Python code, to keep things short). Later we can add more subtypes of S, such as Symbol, SAdd, STimes, SInteger, etc., as well as casting from S to some of these types (checked in Debug mode), so that one can write functions that only accept those, but for now we'll just use S. The + becomes SymbolicAdd, an IntrinsicFunction accepting two SymbolicExpression arguments. The Symbol becomes SymbolicSymbol(str) -> SymbolicExpression, also an InstrinsicFunction.

In the IntrinsicFunction rewriting pass we implement all these by calling into C's API of SymEngine (https://github.com/symengine/symengine/blob/2b575b9be9bb499d866dc3e411e6368ca0d1bb42/symengine/tests/cwrapper/test_cwrapper.c#L19), effectively transforming the code to something like:

basic x = symbol("x')
basic y = symbol("y")
basic z = add(x, y)
printf(str(z))
free(z)
free(x)
free(y)

We can use any backend, such as the LLVM backend to compile this to an object file. Then at link time we have to link in the SymEngine library (and C++ runtime library).

We probably set the allocatable attribute for x which is a Variable of type SymbolicExpression, and treat it like an allocatable array or a string. We will reuse our existing allocatable mechanism that ensures that things get properly deallocated when they get out of scope.

This minimal design does not close any doors and we can later extend it in various ways, as needed (such as compile time simplification/evaluation, many subtypes, adding more dedicated ASR nodes if needed, various ASR optimizations, etc.).

@rebcabin
Copy link
Contributor

rebcabin commented Mar 22, 2023

To model mixed-mode numeric and symbolic expressions like 2 * x, must Integer(2) be a sub-type of SymbolicExpression? Or should we insert an implicit cast of Integer(2) to SymbolicExpression(Integer(2))?

@certik
Copy link
Contributor Author

certik commented Mar 23, 2023

In ASR the * will be represented by SymbolicMul, and both operands must be SymbolicExpression. The integer 2 would be represented probably by SymbolicInteger(int) -> SymbolicExpression, similar to SymbolicSymbol above. We can also provide casting from our regular Integer, or perhaps SymbolicInteger can take a regular Integer as an argument.

@anutosh491
Copy link
Collaborator

Ohh okay I thought we would be emitting calls to SymEngine objects while dealing with the backend but a pass would also do the needful .

Also one small doubts I had was for something related to overloading of IntrinsicFunction.
Let's say we have the following

from lpython import S
from sympy import Symbol, sin

def f():
    x: S = Symbol("x")
    y: S = Symbol("y")
    z: S = x + y
    print(sin(z))

f()

Here the sin would be an IntrinsicFunction, SymbolicSin which deals with symbolic arguments ... and there would also co-exist a Sin, IntrinsicFunction to deal with other types like float, real etc . Is this how it goes ?
I was just thinking if we would want maintain them separately and create two exprs in the grammar , IntrinsicFunction and SymbolicIntrinsicFunction .

@anutosh491
Copy link
Collaborator

Also as you write about this

This minimal design does not close any doors and we can later extend it in various ways, as needed (such as compile time simplification/evaluation, many subtypes, adding more dedicated ASR nodes if needed, various ASR optimizations, etc.).

Simplifcations or optimizations during ASR creation sounds very cool . Do you have some operation or case in mind where this would come into the picture ?

@Smit-create
Copy link
Collaborator

Thanks, @certik. The idea looks awesome. Once we have a good frontend interface and a working ASR, we can possibly have many symbolic optimizations at the ASR level (using pass-rewrites) which can save us a lot in the backend and also with the runtime performance.
Such as:

  1. Rewrite: x/1 or x*1 -> x.
  2. Rewrite: x-0 or x+0 -> x.

and many more.

@certik
Copy link
Contributor Author

certik commented Mar 23, 2023

Yes, the kind of optimizations we can do at compile time as ASR->ASR passes are evaluating the symbolic expression at compile time if the values are known. So for example, the above input:

def f():
    x: S = Symbol("x")
    y: S = Symbol("y")
    z: S = x + y
    print(sin(z))

Can be fully transformed to just:

def f():
    print("sin(x+y)")

Since there is no other side effects and everything is known. That can be achieved by calling SymEngine at compile time, in an ASR pass and just evaluating everything.

But as I said, that comes later.

@certik
Copy link
Contributor Author

certik commented Mar 23, 2023

Regarding the sin function, either is fine, it just gets hooked in here: https://github.com/lfortran/lfortran/blob/208d6c6da0e523b19ef07579c8010d37b6bc80b7/src/libasr/pass/intrinsic_function_registry.h#L37 and I would probably just call it Sin and add SymbolicExpression as another overload, next to f32 and f64.

@anutosh491
Copy link
Collaborator

anutosh491 commented Mar 25, 2023

The + becomes SymbolicAdd, an IntrinsicFunction accepting two SymbolicExpression arguments.

Would it make sense to introduce all binary operators (Binop) as IntrinsicFunction and overload accordingly when we use operands of type SymbolicExpression (if either of the operands comes out to be symbolic , if just one of them is symbolic use SymbolicInteger(int) or some type casting for the other operand) ?

Just a thought because the ASR's for something like print(2 + 3) and something like print(x + 3) might not be very consistent .

Also however ,we handle this . Would we need Sub or Div for SymbolicExpression type (SymbolicDiv & SymbolicSub ) or can we handle these using Add and Mul / Times . If we introduce these , we might like having some ASR optimization and convert x / y kind of operation into x * 1/y sometime later

@anutosh491
Copy link
Collaborator

anutosh491 commented Mar 25, 2023

Also for handling scientific constants like pi, golden ratio etc , would the best way be to have an IntrinsicFunction defining each one of these , or could we have a SymbolicUniversalConstants(str) giving out symbolic values for whatever str (like "pi") or something is being passed .

@certik
Copy link
Contributor Author

certik commented Mar 25, 2023

I would start with just using IntrinsicFunction for everything symbolic, and then we'll see how it goes.

Yes, in principle any expression can be just IntrinsicFunction, but I do not recommend doing that, since that's a huge change and not clear it is a benefit. After we have more experience with IntrinsicFunction, and understand the pros and cons of each approach, we can unify things more.

@anutosh491
Copy link
Collaborator

Yeah I just thought about it because as you said for sin , we might want to stick to sin and not create another SymbolicSin and go for overloading accordinly when SymbolicExpression types are involved. Similarly if we introduce Binop as IntrinsicFunction, we could just overload similarly for SymbolicExpressions.

@anutosh491
Copy link
Collaborator

anutosh491 commented May 24, 2023

In the IntrinsicFunction rewriting pass we implement all these by calling into C's API of SymEngine

I have a couple doubts here . Like after we have our ASR down , we need to implement a pass . But as you've mentioned would these functions be present in the IntrinsicFunction rewriting pass ( replace_IntrinsicFunction in intrinsic_function.cpp) ?

Because we essentially need to convert the whole thing like assignment and print statements

1) x: S = Symbol('x') -> basic x = symbol('x')
2) print(x) -> printf(str(x))

So shouldn't we have a custom pass convert_symengine_assignment or convert_symengine_print where we override the relevant visit methods for handling print and assignment statements like (visit_Print and visit_Assignment) .

But once this is done , I'm not fully sure about how our ASR looks like after that pass as in how do we store/represent our generatred C code ( basic x = symbol('x')) in the ASR ?

@certik
Copy link
Contributor Author

certik commented May 24, 2023

I think most of these will be done in the backend simply emitting the proper code for each ASR node, like this:
x: S = Symbol("x") -> basic x = symbol("x")
y: S = Symbol("y") -> basic y = symbol("y")
z: S = x + y -> basic z = add(x, y)
print(z) -> printf(str(z))

It needs to track where to deallocate the symbols (free(x)), which we might want to eventually track in the ASR itself to keep the backend simple. For now ignore deallocation (we will design this properly later).

@certik
Copy link
Contributor Author

certik commented May 24, 2023

Here is how to call symengine_str from LLVM:

diff --git a/src/libasr/codegen/llvm_utils.h b/src/libasr/codegen/llvm_utils.h
index e3a06ba97..e1922c1c7 100644
--- a/src/libasr/codegen/llvm_utils.h
+++ b/src/libasr/codegen/llvm_utils.h
@@ -29,6 +29,19 @@ namespace LCompilers {
         builder.CreateCall(fn_printf, args);
     }
 
+    static inline void symengine_str(llvm::LLVMContext &context, llvm::Module &module,
+        llvm::IRBuilder<> &builder, const std::vector<llvm::Value*> &args)
+    {
+        llvm::Function *fn_printf = module.getFunction("symengine_str");
+        if (!fn_printf) {
+            llvm::FunctionType *function_type = llvm::FunctionType::get(
+                    llvm::Type::getVoidTy(context), {llvm::Type::getInt8PtrTy(context)}, true);
+            fn_printf = llvm::Function::Create(function_type,
+                    llvm::Function::ExternalLinkage, "symengine_str", &module);
+        }
+        builder.CreateCall(fn_printf, args);
+    }
+
     static inline void print_error(llvm::LLVMContext &context, llvm::Module &module,
         llvm::IRBuilder<> &builder, const std::vector<llvm::Value*> &args)
     {

@certik
Copy link
Contributor Author

certik commented May 24, 2023

Initially LPython will not know how to link, we just do it manually like this:

$ lpython -c expr2.py -o e.o
$ clang -o expr e.o -L/Users/ondrej/repos/lpython/src/bin/../runtime -llpython_runtime -L$CONDA_PREFIX/lib  -lsymengine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants