[ASRPass] Wrap all the global symbols into a module #1491

Merged
3 changes: 2 additions & 1 deletion integration_tests/CMakeLists.txt
@@ -429,4 +429,5 @@ RUN(NAME comp_01 LABELS cpython llvm wasm wasm_x64)
RUN(NAME bit_operations_i32 LABELS cpython llvm wasm wasm_x64)
RUN(NAME bit_operations_i64 LABELS cpython llvm wasm)

RUN(NAME test_argv_01 LABELS llvm) # TODO: Test using CPython
RUN(NAME test_argv_01 LABELS llvm) # TODO: Test using CPython
RUN(NAME global_syms_01 LABELS cpython)
Collaborator

Why not llvm and c here?

Collaborator Author

In LLVM, the global lists are not yet handled. After merging this I will start working on it.
C: I didn't try it. How do I verify this works in C?
Using lpython integration_tests/global_syms_01.py --show-c?

Collaborator

> In LLVM, the global lists are not yet handled. After merging this I will start working on it.

Any rough idea of how that will be managed? I think LLVM doesn't allow builder->CreateAlloca in global scope. It is worth researching approaches before merging this (an implementation in LLVM is not required at all, just a rough idea that it can be made to work in LLVM), because the success of this design might depend on whether it can actually be used in LLVM.

One approach I have in mind is to create void*-typed global variables in the global scope and then bitcast them to the list LLVM type inside LLVM functions. That way, all manipulation happens inside LLVM functions.
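The void* idea can be illustrated in plain C++ rather than with the LLVM builder API. This is only a sketch under assumptions: the ListI32 layout and the list_init, list_append, and list_get helpers are hypothetical and are not part of LPython's runtime. The global is an untyped pointer, and every function casts it back to the list type before using it, which plays the role of the bitcast.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical runtime layout for a list of 32-bit integers.
struct ListI32 {
    int32_t *data;
    int32_t size;
    int32_t capacity;
};

// The global is an untyped pointer: it needs no constructor and can be
// emitted as a plain null-initialized global.
void *g_x = nullptr;

// All typed work happens inside functions. Each one casts the slot back
// to the list type first (the analogue of a bitcast in LLVM IR).
void list_init(int32_t capacity) {
    if (capacity < 1) capacity = 1;  // keep the doubling in list_append valid
    ListI32 *l = static_cast<ListI32 *>(std::malloc(sizeof(ListI32)));
    l->data = static_cast<int32_t *>(std::malloc(sizeof(int32_t) * capacity));
    l->size = 0;
    l->capacity = capacity;
    g_x = l;
}

void list_append(int32_t value) {
    ListI32 *l = static_cast<ListI32 *>(g_x);
    if (l->size == l->capacity) {
        // Grow the buffer when it is full.
        l->capacity *= 2;
        l->data = static_cast<int32_t *>(
            std::realloc(l->data, sizeof(int32_t) * l->capacity));
    }
    l->data[l->size++] = value;
}

int32_t list_get(int32_t index) {
    ListI32 *l = static_cast<ListI32 *>(g_x);
    assert(index >= 0 && index < l->size);
    return l->data[index];
}
```

Calling list_init(1), then list_append(1) and list_append(2), leaves list_get(0) equal to 1 and list_get(1) equal to 2, which matches the values used in global_syms_01.py.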

The other approach is to see how clang++ handles global variables such as std::vector in C++. For example, take the code below:

#include <iostream>
#include <vector>

std::vector<int> vec;

int main() {
    return 0;
}

clang++'s LLVM output:

; ModuleID = '/Users/czgdp1807/lpython_project/debug.cpp'
source_filename = "/Users/czgdp1807/lpython_project/debug.cpp"
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-macosx13.0.0"

%"class.std::__1::vector" = type { %"class.std::__1::__vector_base" }
%"class.std::__1::__vector_base" = type { i32*, i32*, %"class.std::__1::__compressed_pair" }
%"class.std::__1::__compressed_pair" = type { %"struct.std::__1::__compressed_pair_elem" }
%"struct.std::__1::__compressed_pair_elem" = type { i32* }
%"struct.std::__1::nullptr_t" = type { i8* }
%"struct.std::__1::__default_init_tag" = type { i8 }
%"class.std::__1::__vector_base_common" = type { i8 }
%"struct.std::__1::__compressed_pair_elem.0" = type { i8 }
%"class.std::__1::allocator" = type { i8 }
%"struct.std::__1::__non_trivial_if" = type { i8 }

@vec = global %"class.std::__1::vector" zeroinitializer, align 8
@__dso_handle = external hidden global i8
@llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I_debug.cpp, i8* null }]

; Function Attrs: noinline ssp uwtable
define internal void @__cxx_global_var_init() #0 section "__TEXT,__StaticInit,regular,pure_instructions" {
  %1 = call %"class.std::__1::vector"* @_ZNSt3__16vectorIiNS_9allocatorIiEEEC1Ev(%"class.std::__1::vector"* @vec)
  %2 = call i32 @__cxa_atexit(void (i8*)* bitcast (%"class.std::__1::vector"* (%"class.std::__1::vector"*)* @_ZNSt3__16vectorIiNS_9allocatorIiEEED1Ev to void (i8*)*), i8* bitcast (%"class.std::__1::vector"* @vec to i8*), i8* @__dso_handle) #2
  ret void
}

As the output shows, clang++ creates the global variable but initialises it inside a function, so something similar can be done for lists. Note that we will also have to create a function in LLVM, something like global_module_function, and call it inside main. That should be doable.
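The same pattern can be written by hand in plain C++ to show the intended order of events. This is a minimal sketch, assuming a hypothetical ListI32 layout and hypothetical function names (global_init, program_entry); it is not the code that the LLVM backend generates. The globals are only storage, and the former global statements from global_syms_01.py run inside an init function that the entry point calls first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical runtime layout for a list of 32-bit integers.
struct ListI32 {
    int32_t *data;
    int32_t size;
};

// Global storage only: both globals are zero-initialized, so nothing
// has to execute at global scope.
ListI32 x = {nullptr, 0};
int32_t i = 0;

// Hypothetical init function holding the former global statements
// `x = [1, 2]` and `i = x[0]` from global_syms_01.py.
void global_init() {
    assert(x.data == nullptr);  // the globals are initialized only once
    x.data = new int32_t[2]{1, 2};
    x.size = 2;
    i = x.data[0];
}

// Mirrors test_global_symbols() from the integration test.
bool test_global_symbols() {
    return i == 1 && x.data[1] == 2;
}

// Stand-in for what main would do: initialize the globals first, then
// run the rest of the program.
bool program_entry() {
    global_init();
    return test_global_symbols();
}
```

The design choice here is that no code runs at global scope, so a backend that cannot emit instructions outside a function only needs to emit the storage and one extra call.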

> C: I didn't try it. How do I verify this works in C?
> Using lpython integration_tests/global_syms_01.py --show-c?

If LLVM is not enabled, then C is not worth enabling. LLVM should always work with a given ASR design; the other backends will then follow automatically.

Collaborator

I am approving it as I think this is clean enough at the ASR as well as LLVM/C level.

Contributor

@czgdp1807 the question here is how to handle module variables, as well as program variables. They are scoped to the module or program, but are essentially global. WASM and x86 have the same issue (@Shaikh-Ubaid and I discussed this a few times). It seems one must allocate the global variable (using the platform dependent mechanism), and then initialize it from the main function as needed.

Collaborator

So it seems that global statements are already being wrapped into a main function in the ASR itself? I would like to see what we will need to do when we enable llvm for this test. Let's merge it for now (I approved it earlier today).

Contributor

Let me review this first.

12 changes: 12 additions & 0 deletions integration_tests/global_syms_01.py
@@ -0,0 +1,12 @@
from ltypes import i32

x: list[i32]
x = [1, 2]
i: i32
i = x[0]

def test_global_symbols():
    assert i == 1
    assert x[1] == 2

test_global_symbols()
1 change: 1 addition & 0 deletions src/libasr/CMakeLists.txt
@@ -32,6 +32,7 @@ set(SRC
pass/for_all.cpp
pass/global_stmts.cpp
pass/global_stmts_program.cpp
pass/global_symbols.cpp
pass/select_case.cpp
pass/implied_do_loops.cpp
pass/array_op.cpp
101 changes: 101 additions & 0 deletions src/libasr/asr_scopes.cpp
@@ -135,4 +135,105 @@ std::string SymbolTable::get_unique_name(const std::string &name) {
return unique_name;
}

void SymbolTable::move_symbols_from_global_scope(Allocator &al,
SymbolTable *module_scope, Vec<char *> &syms,
Vec<char *> &mod_dependencies) {
// TODO: This isn't scalable. We have to write a visitor in asdl_cpp.py
syms.reserve(al, 4);
mod_dependencies.reserve(al, 4);
for (auto &a : scope) {
switch (a.second->type) {
case (ASR::symbolType::Module): {
// Pass
break;
} case (ASR::symbolType::Function) : {
ASR::Function_t *fn = ASR::down_cast<ASR::Function_t>(a.second);
for (size_t i = 0; i < fn->n_dependencies; i++ ) {
ASR::symbol_t *s = fn->m_symtab->get_symbol(
fn->m_dependencies[i]);
if (s == nullptr) {
std::string block_name = "block";
ASR::symbol_t *block_s = fn->m_symtab->get_symbol(block_name);
int32_t j = 1;
while(block_s != nullptr) {
while(block_s != nullptr) {
ASR::Block_t *b = ASR::down_cast<ASR::Block_t>(block_s);
s = b->m_symtab->get_symbol(fn->m_dependencies[i]);
if (s == nullptr) {
block_s = b->m_symtab->get_symbol("block");
} else {
break;
}
}
if (s == nullptr) {
block_s = fn->m_symtab->get_symbol(block_name +
std::to_string(j));
j++;
} else {
break;
}
}
}
if (s == nullptr) {
s = fn->m_symtab->parent->get_symbol(fn->m_dependencies[i]);
}
if (s != nullptr && ASR::is_a<ASR::ExternalSymbol_t>(*s)) {
char *es_name = ASR::down_cast<
ASR::ExternalSymbol_t>(s)->m_module_name;
if (!present(mod_dependencies, es_name)) {
mod_dependencies.push_back(al, es_name);
}
}
}
fn->m_symtab->parent = module_scope;
module_scope->add_symbol(a.first, (ASR::symbol_t *) fn);
syms.push_back(al, s2c(al, a.first));
break;
} case (ASR::symbolType::GenericProcedure) : {
ASR::GenericProcedure_t *es = ASR::down_cast<ASR::GenericProcedure_t>(a.second);
es->m_parent_symtab = module_scope;
module_scope->add_symbol(a.first, (ASR::symbol_t *) es);
syms.push_back(al, s2c(al, a.first));
break;
} case (ASR::symbolType::ExternalSymbol) : {
ASR::ExternalSymbol_t *es = ASR::down_cast<ASR::ExternalSymbol_t>(a.second);
if (!present(mod_dependencies, es->m_module_name)) {
mod_dependencies.push_back(al, es->m_module_name);
}
es->m_parent_symtab = module_scope;
module_scope->add_symbol(a.first, (ASR::symbol_t *) es);
syms.push_back(al, s2c(al, a.first));
break;
} case (ASR::symbolType::StructType) : {
ASR::StructType_t *st = ASR::down_cast<ASR::StructType_t>(a.second);
st->m_symtab->parent = module_scope;
module_scope->add_symbol(a.first, (ASR::symbol_t *) st);
syms.push_back(al, s2c(al, a.first));
break;
} case (ASR::symbolType::EnumType) : {
ASR::EnumType_t *et = ASR::down_cast<ASR::EnumType_t>(a.second);
et->m_symtab->parent = module_scope;
module_scope->add_symbol(a.first, (ASR::symbol_t *) et);
syms.push_back(al, s2c(al, a.first));
break;
} case (ASR::symbolType::UnionType) : {
ASR::UnionType_t *ut = ASR::down_cast<ASR::UnionType_t>(a.second);
ut->m_symtab->parent = module_scope;
module_scope->add_symbol(a.first, (ASR::symbol_t *) ut);
syms.push_back(al, s2c(al, a.first));
break;
} case (ASR::symbolType::Variable) : {
ASR::Variable_t *v = ASR::down_cast<ASR::Variable_t>(a.second);
v->m_parent_symtab = module_scope;
module_scope->add_symbol(a.first, (ASR::symbol_t *) v);
syms.push_back(al, s2c(al, a.first));
break;
} default : {
throw LCompilersException("Moving the symbol:`" + a.first +
"` from global scope is not implemented yet");
};
}
Collaborator

This isn't scalable. I would do it by writing a visitor in asdl_cpp.py. Will you be able to try the class design and implement it? Give it a try for a couple of days; if that doesn't work out, we will merge this with a TODO.

Collaborator Author

Added this as a TODO.

}
}

} // namespace LCompilers
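The core of move_symbols_from_global_scope is a reparenting step: each symbol is added to the module scope, its parent pointer is updated, and its name is recorded. A toy model may make that easier to follow. The Symbol and Scope types below are hypothetical stand-ins, not the libasr types, and erasing the moved names from the global scope is an assumption of this sketch (the diff above does not show where that happens).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Scope;

// Toy stand-in for an ASR symbol (not the libasr type).
struct Symbol {
    std::string kind;  // e.g. "Function", "Variable", "Module"
    Scope *parent;     // scope that currently owns this symbol
};

// Toy stand-in for SymbolTable.
struct Scope {
    std::map<std::string, Symbol *> symbols;
};

// Move every non-module symbol from `global` into `module_scope`,
// reparent it, and return the names that were moved.
std::vector<std::string> move_symbols(Scope &global, Scope &module_scope) {
    assert(&global != &module_scope);
    std::vector<std::string> moved;
    for (auto &entry : global.symbols) {
        Symbol *sym = entry.second;
        if (sym->kind == "Module") {
            continue;  // modules stay in the global scope
        }
        sym->parent = &module_scope;
        module_scope.symbols[entry.first] = sym;
        moved.push_back(entry.first);
    }
    // Assumption of this sketch: moved names leave the global scope here.
    for (const std::string &name : moved) {
        global.symbols.erase(name);
    }
    return moved;
}
```

Unlike the real function, this sketch does not track module dependencies and does not reject unsupported symbol kinds; it only shows the move-and-reparent pattern that each case of the switch repeats.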
5 changes: 5 additions & 0 deletions src/libasr/asr_scopes.h
@@ -4,6 +4,7 @@
#include <map>

#include <libasr/alloc.h>
#include <libasr/containers.h>

namespace LCompilers {

@@ -80,6 +81,10 @@ struct SymbolTable {
size_t n_scope_names, char **m_scope_names);

std::string get_unique_name(const std::string &name);

void move_symbols_from_global_scope(Allocator &al,
SymbolTable *module_scope, Vec<char *> &syms,
Vec<char *> &mod_dependencies);
};

} // namespace LCompilers
48 changes: 48 additions & 0 deletions src/libasr/codegen/asr_to_c.cpp
@@ -726,6 +726,54 @@ R"(
ds_funcs_defined + util_funcs_defined;
}

void visit_Module(const ASR::Module_t &x) {
std::string unit_src = "";
for (auto &item : x.m_symtab->get_scope()) {
if (ASR::is_a<ASR::Variable_t>(*item.second)) {
std::string unit_src_tmp;
ASR::Variable_t *v = ASR::down_cast<ASR::Variable_t>(
item.second);
unit_src_tmp = convert_variable_decl(*v);
unit_src += unit_src_tmp;
if(unit_src_tmp.size() > 0 &&
(!ASR::is_a<ASR::Const_t>(*v->m_type) ||
v->m_intent == ASRUtils::intent_return_var )) {
unit_src += ";\n";
}
}
}
std::map<std::string, std::vector<std::string>> struct_dep_graph;
for (auto &item : x.m_symtab->get_scope()) {
if (ASR::is_a<ASR::StructType_t>(*item.second) ||
ASR::is_a<ASR::EnumType_t>(*item.second) ||
ASR::is_a<ASR::UnionType_t>(*item.second)) {
std::vector<std::string> struct_deps_vec;
std::pair<char**, size_t> struct_deps_ptr = ASRUtils::symbol_dependencies(item.second);
for( size_t i = 0; i < struct_deps_ptr.second; i++ ) {
struct_deps_vec.push_back(std::string(struct_deps_ptr.first[i]));
}
struct_dep_graph[item.first] = struct_deps_vec;
}
}

std::vector<std::string> struct_deps = ASRUtils::order_deps(struct_dep_graph);
for (auto &item : struct_deps) {
ASR::symbol_t* struct_sym = x.m_symtab->get_symbol(item);
visit_symbol(*struct_sym);
}

// Topologically sort all module functions
// and then define them in the right order
std::vector<std::string> func_order = ASRUtils::determine_function_definition_order(x.m_symtab);
for (auto &item : func_order) {
ASR::symbol_t* sym = x.m_symtab->get_symbol(item);
ASR::Function_t *s = ASR::down_cast<ASR::Function_t>(sym);
visit_Function(*s);
unit_src += src;
}
src = unit_src;
}

void visit_Program(const ASR::Program_t &x) {
// Topologically sort all program functions
// and then define them in the right order
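The struct ordering step in visit_Module above relies on ASRUtils::order_deps, whose implementation is not part of this diff. As a rough illustration only, a dependency-first ordering can be produced with a depth-first traversal. The names order_by_dependencies and emit_after_deps are hypothetical, and the sketch assumes the dependency graph has no cycles.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a type name to the names it depends on.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Emit `name` only after everything it depends on has been emitted.
void emit_after_deps(const std::string &name, const DepGraph &graph,
                     std::set<std::string> &seen,
                     std::vector<std::string> &order) {
    if (!seen.insert(name).second) {
        return;  // already handled
    }
    auto it = graph.find(name);
    if (it != graph.end()) {
        for (const std::string &dep : it->second) {
            emit_after_deps(dep, graph, seen, order);
        }
    }
    order.push_back(name);
}

// Return the names so that each one appears after its dependencies.
// Dependencies that are not keys of the graph are emitted as well.
std::vector<std::string> order_by_dependencies(const DepGraph &graph) {
    std::set<std::string> seen;
    std::vector<std::string> order;
    for (const auto &entry : graph) {
        emit_after_deps(entry.first, graph, seen, order);
    }
    assert(order.size() >= graph.size());  // every key is emitted once
    return order;
}
```

With a graph where A depends on B and B depends on C, the result lists C, then B, then A, so a C backend that emits definitions in that order never refers to a struct before it is defined.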
6 changes: 4 additions & 2 deletions src/libasr/codegen/asr_to_llvm.cpp
@@ -2446,14 +2446,16 @@ class ASRToLLVMVisitor : public ASR::BaseVisitor<ASRToLLVMVisitor>
ASR::Variable_t *v = down_cast<ASR::Variable_t>(
item.second);
visit_Variable(*v);
}
if (is_a<ASR::Function_t>(*item.second)) {
} else if (is_a<ASR::Function_t>(*item.second)) {
ASR::Function_t *v = down_cast<ASR::Function_t>(
item.second);
if (ASRUtils::get_FunctionType(v)->n_type_params == 0) {
instantiate_function(*v);
declare_needed_global_types(*v);
}
} else if (is_a<ASR::EnumType_t>(*item.second)) {
ASR::EnumType_t *et = down_cast<ASR::EnumType_t>(item.second);
visit_EnumType(*et);
}
}
finish_module_init_function_prototype(x);
22 changes: 21 additions & 1 deletion src/libasr/codegen/asr_to_x86.cpp
@@ -79,6 +79,12 @@ class ASRToX86Visitor : public ASR::BaseVisitor<ASRToX86Visitor>
visit_symbol(*sym);
}

std::vector<std::string> build_order = ASRUtils::determine_module_dependencies(x);
for (auto &item : build_order) {
ASR::symbol_t *mod = x.m_global_scope->get_symbol(item);
visit_symbol(*mod);
}

// Then the main program:
for (auto &item : x.m_global_scope->get_scope()) {
if (ASR::is_a<ASR::Program_t>(*item.second)) {
@@ -89,6 +95,19 @@ class ASRToX86Visitor : public ASR::BaseVisitor<ASRToX86Visitor>
emit_elf32_footer(m_a);
}

void visit_Module(const ASR::Module_t &x) {
std::vector<std::string> func_order
= ASRUtils::determine_function_definition_order(x.m_symtab);
for (size_t i = 0; i < func_order.size(); i++) {
ASR::symbol_t* sym = x.m_symtab->get_symbol(func_order[i]);
// Ignore external symbols because they are already defined by the loop above.
if( !sym || ASR::is_a<ASR::ExternalSymbol_t>(*sym) ) {
continue;
}
visit_symbol(*sym);
}
}

void visit_Program(const ASR::Program_t &x) {


@@ -504,7 +523,8 @@ class ASRToX86Visitor : public ASR::BaseVisitor<ASRToX86Visitor>
}

void visit_SubroutineCall(const ASR::SubroutineCall_t &x) {
ASR::Function_t *s = ASR::down_cast<ASR::Function_t>(x.m_name);
ASR::Function_t *s = ASR::down_cast<ASR::Function_t>(
ASRUtils::symbol_get_past_external(x.m_name));

uint32_t h = get_hash((ASR::asr_t*)s);
if (x86_symtab.find(h) == x86_symtab.end()) {
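The visit_SubroutineCall change above is needed because, once the function lives in a module, the program refers to it through an external symbol rather than directly. The idea behind resolving such a reference can be sketched with a toy model. The Symbol type and get_past_external below are hypothetical stand-ins and do not reproduce the libasr data structures or the actual ASRUtils::symbol_get_past_external implementation.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for an ASR symbol (not the libasr type).
struct Symbol {
    std::string name;
    std::string module_name;  // set only for external symbols
    Symbol *external;         // non-null marks an external symbol
};

// If `s` is an external symbol, return the symbol it refers to;
// otherwise return `s` itself.
Symbol *get_past_external(Symbol *s) {
    assert(s != nullptr);
    return s->external != nullptr ? s->external : s;
}
```

In this model, a call site that holds the external symbol for _lpython_main_program resolves it to the function defined in the _global_symbols module before looking anything up, which is the step the x86 backend was missing.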
30 changes: 30 additions & 0 deletions src/libasr/pass/array_op.cpp
@@ -831,6 +831,36 @@ class ArrayOpVisitor : public ASR::CallReplacerOnExpressionsVisitor<ArrayOpVisit
current_scope = current_scope_copy;
}

void visit_Module(const ASR::Module_t &x) {
// FIXME: this is a hack, we need to pass in a non-const `x`,
// which requires to generate a TransformVisitor.
ASR::Module_t &xx = const_cast<ASR::Module_t&>(x);
current_scope = xx.m_symtab;
for (auto &item : x.m_symtab->get_scope()) {
if (is_a<ASR::Function_t>(*item.second)) {
ASR::Function_t *s = ASR::down_cast<ASR::Function_t>(item.second);
if (s->m_return_var) {
/*
* A function which returns an array will be converted
* to a subroutine with the destination array as the last
* argument. This helps in avoiding deep copies and the
* destination memory directly gets filled inside the subroutine.
*/
if( PassUtils::is_array(s->m_return_var) ) {
ASR::symbol_t* s_sub = create_subroutine_from_function(s);
// Update the symtab with this function changes
xx.m_symtab->add_symbol(item.first, s_sub);
}
}
}
}

// Now visit everything else
for (auto &item : x.m_symtab->get_scope()) {
this->visit_symbol(*item.second);
}
}

void visit_Program(const ASR::Program_t &x) {
// FIXME: this is a hack, we need to pass in a non-const `x`,
// which requires to generate a TransformVisitor.
26 changes: 20 additions & 6 deletions src/libasr/pass/global_stmts_program.cpp
@@ -4,6 +4,7 @@
#include <libasr/asr_utils.h>
#include <libasr/asr_verify.h>
#include <libasr/pass/global_stmts.h>
#include <libasr/pass/global_symbols.h>


namespace LCompilers {
@@ -24,17 +25,30 @@ void pass_wrap_global_stmts_into_program(Allocator &al,
std::string prog_name = "main_program";
Vec<ASR::stmt_t*> prog_body;
prog_body.reserve(al, 1);
Vec<char *> prog_dep;
prog_dep.reserve(al, 1);
if (unit.n_items > 0) {
pass_wrap_global_stmts_into_function(al, unit, pass_options);
ASR::symbol_t *fn = unit.m_global_scope->get_symbol(program_fn_name);
if (ASR::is_a<ASR::Function_t>(*fn)
&& ASR::down_cast<ASR::Function_t>(fn)->m_return_var == nullptr) {
pass_wrap_global_syms_into_module(al, unit, pass_options);
ASR::Module_t *mod = ASR::down_cast<ASR::Module_t>(
unit.m_global_scope->get_symbol("_global_symbols"));
// Call `_lpython_main_program` function
ASR::symbol_t *fn_s = mod->m_symtab->get_symbol(program_fn_name);
if (ASR::is_a<ASR::Function_t>(*fn_s)
&& ASR::down_cast<ASR::Function_t>(fn_s)->m_return_var == nullptr) {
ASR::Function_t *fn = ASR::down_cast<ASR::Function_t>(fn_s);
fn_s = ASR::down_cast<ASR::symbol_t>(ASR::make_ExternalSymbol_t(
al, fn->base.base.loc, current_scope, s2c(al, program_fn_name),
fn_s, mod->m_name, nullptr, 0, s2c(al, program_fn_name),
ASR::accessType::Public));
current_scope->add_symbol(program_fn_name, fn_s);
ASR::asr_t *stmt = ASR::make_SubroutineCall_t(
al, unit.base.base.loc,
fn, nullptr,
fn_s, nullptr,
nullptr, 0,
nullptr);
prog_body.push_back(al, ASR::down_cast<ASR::stmt_t>(stmt));
prog_dep.push_back(al, s2c(al, "_global_symbols"));
} else {
throw LCompilersException("Return type not supported yet");
}
@@ -43,8 +57,8 @@ void pass_wrap_global_stmts_into_program(Allocator &al,
al, unit.base.base.loc,
/* a_symtab */ current_scope,
/* a_name */ s2c(al, prog_name),
nullptr,
0,
prog_dep.p,
prog_dep.n,
/* a_body */ prog_body.p,
/* n_body */ prog_body.n);
unit.m_global_scope->add_symbol(prog_name, ASR::down_cast<ASR::symbol_t>(prog));