Errors from imported files #1294

ubaidsk · 2022-11-13T17:39:45Z

Related: #1218

This attempts to support the situation in #1218 (comment).

a.py

i: i32 = 3.142

b.py

from a import i
j: i32 = i

c.py

from b import j
print(j)

Output:

(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$ lpython c.py 
semantic error: Type mismatch in annotation-assignment, the types must be compatible
 --> /home/ubaid/OpenSource/lpython/src/bin/../runtime/a.py:1:1
  |
1 | i: i32 = 3.142
  | ^        ^^^^^ type mismatch ('i32' and 'f64')


Note: if any of the above error or warning messages are not clear or are lacking
context please report it to us (we consider that a bug that needs to be fixed).
note: Imported here
 --> /home/ubaid/OpenSource/lpython/src/bin/../runtime/b.py:1:1
  |
1 | from a import i
  | ^^^^^^^^^^^^^^^ 


Note: if any of the above error or warning messages are not clear or are lacking
context please report it to us (we consider that a bug that needs to be fixed).
note: Imported here
 --> c.py:1:1
  |
1 | from b import j
  | ^^^^^^^^^^^^^^^ 

semantic error: The module 'b' cannot be loaded
 --> c.py:1:1
  |
1 | from b import j
  | ^^^^^^^^^^^^^^^ 


Note: if any of the above error or warning messages are not clear or are lacking
context please report it to us (we consider that a bug that needs to be fixed).
(lp) ubaid@ubaid-Lenovo-ideapad-330-15ARR:~/OpenSource/lpython$

certik · 2022-11-14T04:34:17Z

src/lpython/semantics/python_ast_to_asr.cpp

    std::cerr << diagnostics.render(input, lm, compiler_options);
    if (!r2.ok) {
-        LFORTRAN_ASSERT(diagnostics.has_error())
+        // LFORTRAN_ASSERT(diagnostics.has_error())


I think you have to keep this in there. This checks that the diagnostics contain an error message if r2 returned an Error.

It seems one of the possible use could be diagnostics just containing a note/warning and still throw SemanticAbort() being called to stop further processing of AST->ASR.

For example, for the below message there is no error (there is a note only), but the processing of the contents of b.y still needs to be stopped because we could not import a.py.

note: Imported here --> /home/ubaid/OpenSource/lpython/src/bin/../runtime/b.py:1:1 | 1 | from a import i | ^^^^^^^^^^^^^^^

I think you have to keep this in there. This checks that the diagnostics contain an error message if r2 returned an Error.

It seems throw SemanticAbort() can halt the current processing of AST->ASR, and on using it, due the LFORTRAN_ASSERT(diagnostics.has_error()), it is being compulsory that diagnostics would be containing an error. Although, in the above b.py example, it is not containing any error, instead it is containing a note.

Please, could someone possibly share if there is a way to stop processing of AST->ASR apart from using throw SemanticAbort()?

I guess first, from b import j will happen, then from a import i and then finally, i: i32 = 3.142. So when you get the error in a.py, diagnostics should know that from b import j and then from a import i before i: i32 = 3.142 have happened. More like a stack of diagnostics. Fill the diagnostic with a note when a module import happens because you know you are going to jump into external file. If the external file processes successfully then pop the note from the diagnostics. If an error happens then you already have the information of all the imports that have happened before this error and you can pop them and show to the user.

I agree with this approach! @Shaikh-Ubaid give it a try.
Any doubts please ask.

If an error happens then you already have the information of all the imports

One of the concerns it that if an error happens in a.py, we need to stop the processing of AST->ASR for b.py (given that b.py imports a.py). Now for diagnostics of b.py, we add a note saying Imported here. My doubt is how do we stop the processing of AST->ASR of b.py? If we use throw SemanticAbort(), then the diagnostics of b.py needs to contain an error because of LFORTRAN_ASSERT(diagnostics.has_error()), but instead it just contains a note.

Here's what I am imagining of the control flow that happens,

Control is in c.py and you processed, from b import j. The diagnostics stack (abstract concept, upto you how you implement it) contains that b was imported. A recursive call to python_ast_to_asr is made for processing b.py.

Now control jumped to b.py. Here it processed, from a import i. The diagnostics stack adds that a was imported. A recursive call to python_ast_to_asr is made for processing a.py.

Control is in a.py. Now, i: i32 = 3.142 resulting in a semantic error. Its at this point we first add this error to diagnostics stack and then while rendering diagnostics stack keep popping and displaying to the user. And halt the processing/compilation. No need to go back to b.py.

I don't know about the implementation details but conceptually and ideally IMO that's how it should be done. Whenever you jump to another file, i.e., make a recursive call to python_ast_to_asr just store the diagnostics of import statement (which is making you to jump to another file which is being imported) in a conceptual stack and move ahead. If you face an error somewhere deep in a file, you already know from the conceptual stack how you reached there and you can use that info along with current semantic error to show the complete trace.

Make sense to you folks @Shaikh-Ubaid, @certik, @Thirumalai-Shaktivel?

Plus I will look at your PR later today to understand the details of your issue better. Meanwhile may be just try removing the duplicated note I mentioned about in #1294 (comment).

certik · 2022-11-14T04:35:55Z

These error messages in different files must be fixed by implementing the location manager support for multiple files. Then you simply collect the location information in any file into the diagnostics, and then the diagnostics will correctly show error messages across files automatically.

Thirumalai-Shaktivel · 2022-11-14T09:11:35Z

@certik @czgdp1807 @Shaikh-Ubaid
Would this be the correct way of implementing this?
use vector in_filename

diff --git a/src/libasr/location.h b/src/libasr/location.h
index a38d8175c..0bc85dcf0 100644
@@ -102,7 +102,7 @@ struct LocationManager {
-    std::string in_filename;
+    std::vector<std::string> in_filename;

pass the LocationManager along with the diagnostics as an argument to python_ast_to_asr() and push_back the filename to lm.in_filename for each Import or ImportFrom visitor (Also, pass the diagnostics and LocationManager as an argument to the compile_module_till_asr()).

Now, if there is an error. We just push_back the diagnostics error. As diagnostics is a vector of diagnostic

lpython/src/libasr/diagnostics.h

Line 114 in dce819f

std::vector<Diagnostic> diagnostics;

lpython/src/libasr/diagnostics.h

Lines 122 to 124 in dce819f

    
           void add(const Diagnostic &d) { 
        
               diagnostics.push_back(d); 
        
           }

then, SemanticAbort() and return back to emit_asr().

lpython/src/bin/lpython.cpp

Lines 168 to 172 in dce819f

    
           std::cerr << diagnostics.render(input, lm, compiler_options); 
        
           if (!r.ok) { 
        
               LFORTRAN_ASSERT(diagnostics.has_error()) 
        
               return 2; 
        
           }

Here, while rendering the diagnostics.

lpython/src/libasr/diagnostics.cpp

Lines 67 to 76 in dce819f

    
           for (auto &d : this->diagnostics) { 
        
               if (compiler_options.no_warnings && d.level != Level::Error) { 
        
                   continue; 
        
               } 
        
               if (compiler_options.error_format == "human") { 
        
                   out += render_diagnostic_human(d, input, lm, 
        
                       compiler_options.use_colors, 
        
                       compiler_options.show_stacktrace); 
        
                   if (&d != &this->diagnostics.back()) out += "\n"; 
        
               } else if (compiler_options.error_format == "short") {

For each diagnostic, we can use lm.in_filename.back() to get the start indices of the file and render the current file error message.
P.S. I assumed that one diagnostic means one file's error. So, diagnostics contains multiple file's error messages.

czgdp1807 · 2022-11-14T16:03:28Z

@Thirumalai-Shaktivel Your comment makes sense to me. Why not just implement it and see the result? Plus the error messages in #1294 (comment) contain the note, Note: if any of the above error or warning messages are not clear or are lacking context please report it to us (we consider that a bug that needs to be fixed). thrice. It should appear only once at the end.

czgdp1807

See my comment in #1294 (comment) and try implementing it?

See https://github.com/lcompilers/lpython/pull/1294/files#r1021755155 as well.

Thirumalai-Shaktivel · 2022-11-14T16:05:43Z

Cool, I will submit another testing PR, that depends on this PR. @Shaikh-Ubaid keep it as a reference.

certik · 2022-11-15T17:03:36Z

I think it should be done like this:

diff --git a/src/libasr/location.h b/src/libasr/location.h
index a38d8175c..9a02dd210 100644
--- a/src/libasr/location.h
+++ b/src/libasr/location.h
@@ -94,23 +94,36 @@ struct LocationManager {
     //
     //
     //
-    std::vector<uint32_t> out_start; // consecutive intervals in the output code
-    std::vector<uint32_t> in_start; // start + size in the original code
-    std::vector<uint32_t> in_newlines; // position of all \n in the original code
-
-    // For preprocessor (if preprocessor==true).
-    // TODO: design a common structure, that also works with #include, that
-    // has these mappings for each file
-    bool preprocessor = false;
-    std::string in_filename;
-    uint32_t current_line=0;
-    std::vector<uint32_t> out_start0; // consecutive intervals in the output code
-    std::vector<uint32_t> in_start0; // start + size in the original code
-    std::vector<uint32_t> in_size0; // Size of the `in` interval
-    std::vector<uint32_t> interval_type0; // 0 .... 1:1; 1 ... many to many;
-    std::vector<uint32_t> in_newlines0; // position of all \n in the original code
-//    std::vector<uint32_t> filename_id; // file name for each interval, ID
-//    std::vector<std::string> filenames; // filenames lookup for an ID
+    struct FileLocations {
+        std::vector<uint32_t> out_start; // consecutive intervals in the output code
+        std::vector<uint32_t> in_start; // start + size in the original code
+        std::vector<uint32_t> in_newlines; // position of all \n in the original code
+
+        // For preprocessor (if preprocessor==true).
+        // TODO: design a common structure, that also works with #include, that
+        // has these mappings for each file
+        bool preprocessor = false;
+        std::string in_filename;
+        uint32_t current_line=0;
+        std::vector<uint32_t> out_start0; // consecutive intervals in the output code
+        std::vector<uint32_t> in_start0; // start + size in the original code
+        std::vector<uint32_t> in_size0; // Size of the `in` interval
+        std::vector<uint32_t> interval_type0; // 0 .... 1:1; 1 ... many to many;
+        std::vector<uint32_t> in_newlines0; // position of all \n in the original code
+    }
+    std::vector<FileLocations> files;
+    std::vector<uint32_t> file_ends; // position of all ends of files
+    // For a given Location we use the `file_ends` and bisection to determine
+    // the file (index) which the location is from. Then we use this index into
+    // the `files` vector and use `in_newlines` and other information to
+    // determin the line and column inside this file, and the `in_filename`
+    // field to determine the filename. This happens when the diagnostic is
+    // printed. The `pos_to_linecol` function below should be modified to
+    // return line, column and the filename:
+    //
+    // void pos_to_linecol(uint32_t position, uint32_t &line, uint32_t &col,
+    //     std::string &filename) const;
+
 
     // Converts a position in the output code to a position in the original code
     // Every character in the output code has a corresponding location in the

certik · 2022-11-15T17:07:30Z

@Thirumalai-Shaktivel do you want to give it a shot? @Shaikh-Ubaid if you want, you can focus on the wasm_x86 backend, I think you know what to do there.

certik · 2022-11-15T17:09:24Z

To test this, we simply create files in tests/errors and put some errors in an imported file and use ./run_tests.py to test for the correct error message. We should also test error messages across files, such as calling a function from another module with incorrect arguments, the error message should show the call as well as the declaration in two different files.

Thirumalai-Shaktivel · 2022-11-16T02:59:24Z

@Thirumalai-Shaktivel do you want to give it a shot? @Shaikh-Ubaid if you want, you can focus on the wasm_x86 backend, I think you know what to do there.

Yup, I will start working on this now.

ubaidsk · 2022-11-16T08:46:16Z

@Thirumalai-Shaktivel do you want to give it a shot? @Shaikh-Ubaid if you want, you can focus on the wasm_x86 backend, I think you know what to do there.

I would like to give this another try with the approach in #1294 (comment).

ubaidsk · 2022-11-16T18:30:26Z

and then the linear index is pointing into this concatenated stream.

Currently, every imported file is processed individually. Thus, the index available to us is with respect to the (single) imported file. If we managed to support storing (and accessing of) all the file names and their contents, how (or from where) would a linear index pointing into the concatenated stream be available to us?

Currently we are storing all the file names (in LocationManager). In src_code/input we are concatenating the contents of all the imported files. The above issue exists (was shared at #1218 (comment)). At the moment we do not have access to a linear index (that is the position of error) pointing into the concatenated stream. Instead the position of error points with respect to individual imported file.

(At few places there some unnecessary changes which are not needed but are temporarily present so that the project compiles. We can remove/fix them once we solve the above linear index issue.)

certik · 2022-11-16T18:44:00Z

We need this feature soon.

@Shaikh-Ubaid, unfortunately I don't have time right now to supervise the details here. Do you know how to finish it? If not, then let others finish it, and try to work on some other feature which you know how to implement. @czgdp1807 and I discussed this issue a few days ago, so he knows what to do, if needed.

ubaidsk · 2022-11-16T19:43:20Z

Do you know how to finish it?

I think I can, but I need my doubts to be clarified. My doubts at #1218 (comment) (and #1218 (comment) last paragraph) still persist. Without a clear idea, we are not sure if we are heading into the right direction and thus not sure if we our solution is correct.

try to work on some other feature

Even for other features I might get doubts and without their clarification we might get stuck in that feature.

ubaidsk · 2022-11-16T19:50:35Z

We need this feature soon.

Please take over in that case.

ubaidsk · 2022-11-24T15:22:28Z

Closing as it seems completed in #1297.

ubaidsk force-pushed the extend_loc_manager branch from 9507942 to e549cb0 Compare November 13, 2022 17:40

ubaidsk marked this pull request as draft November 13, 2022 17:40

ubaidsk mentioned this pull request Nov 13, 2022

Extend LocationManager to handle multiple files #1218

Closed

ubaidsk requested review from certik and Thirumalai-Shaktivel November 13, 2022 18:08

certik reviewed Nov 14, 2022

View reviewed changes

certik requested a review from czgdp1807 November 14, 2022 04:36

czgdp1807 requested changes Nov 14, 2022

View reviewed changes

ubaidsk added 3 commits November 16, 2022 11:50

tmp

e136517

tmp

6e1d51f

commit test files

30f052b

tmp

66dd29e

ubaidsk force-pushed the extend_loc_manager branch from e549cb0 to 66dd29e Compare November 16, 2022 18:11

ubaidsk closed this Nov 24, 2022

ubaidsk deleted the extend_loc_manager branch December 23, 2022 21:44

Errors from imported files #1294

Errors from imported files #1294

Uh oh!

Conversation

ubaidsk commented Nov 13, 2022

Uh oh!

certik Nov 14, 2022

Choose a reason for hiding this comment

Uh oh!

ubaidsk Nov 14, 2022

Choose a reason for hiding this comment

Uh oh!

czgdp1807 Nov 14, 2022

Choose a reason for hiding this comment

Uh oh!

Thirumalai-Shaktivel Nov 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ubaidsk Nov 14, 2022

Choose a reason for hiding this comment

Uh oh!

czgdp1807 Nov 14, 2022

Choose a reason for hiding this comment

Uh oh!

czgdp1807 Nov 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

certik commented Nov 14, 2022

Uh oh!

Thirumalai-Shaktivel commented Nov 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

czgdp1807 commented Nov 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

czgdp1807 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Thirumalai-Shaktivel commented Nov 14, 2022

Uh oh!

certik commented Nov 15, 2022

Uh oh!

certik commented Nov 15, 2022

Uh oh!

certik commented Nov 15, 2022

Uh oh!

Thirumalai-Shaktivel commented Nov 16, 2022

Uh oh!

ubaidsk commented Nov 16, 2022

Uh oh!

ubaidsk commented Nov 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

certik commented Nov 16, 2022

Uh oh!

ubaidsk commented Nov 16, 2022

Uh oh!

ubaidsk commented Nov 16, 2022

Uh oh!

ubaidsk commented Nov 24, 2022

Uh oh!

Uh oh!

Thirumalai-Shaktivel Nov 14, 2022 •

edited

Loading

czgdp1807 Nov 14, 2022 •

edited

Loading

Thirumalai-Shaktivel commented Nov 14, 2022 •

edited

Loading

czgdp1807 commented Nov 14, 2022 •

edited

Loading

czgdp1807 left a comment •

edited

Loading

ubaidsk commented Nov 16, 2022 •

edited

Loading