Design: syntax trees vs identity #3129

matklad · 2020-02-13T13:45:50Z

In #3120 (comment), I've once again noticed that working with syntax trees when you want to get semantic information is annoying. This issue is just a brain dump of the things we might do here.

The problem we have is that it's difficult to "index" semantic info by current syntax trees.

For example, here's how one gets a type of expression in Roslyn (with some details ommited):

            SyntaxTree tree = CSharpSyntaxTree.ParseText("some code");

            var model =  getSemanticModel(tree);

            var helloWorldString = root.DescendantNodes()
                                       .OfType<LiteralExpressionSyntax>()
                                       .First();

            var type = model.GetTypeInfo(helloWorldString).Type; // *

Note how at * we only pass the tree to the model to get the type.

In contrast, with the current API in Rust syntax trees are value types which don't remember the context they originated from. So, to get a type of an expression syntax, you also need to specify the identity of the file this syntax originates from.

The current way to specify identity is mostly the InFile<T> type, which is a pair of T and a (real or macro) file id. There are two problems with this approach:

It does not actually yield a true global identity, because the same file can be included twice into a module tree of a crate (Note how this makes syntax<->semantics not a bijection, which is painful to deal with).
The API is awkward because, if you have s: InFile<ast::StructDef>, you can't get an Option<InFile<ast::Name>> by calling an s.name(), instead you need to s.map(|it| it.name()).transpose(), which is a lot.

It also has it's benefits:

It is composable, you can have InFile<ast::SomeNod>, InFile<TextRange>, InFile<Token>.
It removes identity from the tree itself, making the boundaries between the components clearer.

What can we do here:

Maintain the status quo, requiring all the IDE code to explicitly mange, map and transpose InFile<T>'s.
Explicitely mirror AST API for InFile<T>. Ie, generate something like impl InFile<ast::StructDef> { fn name(&self) -> Option<InFile<ast::StructDef>> }
Add FIleId(u32) field to each syntax node, such that nodes always come with a strong identity.

The text was updated successfully, but these errors were encountered:

matklad · 2020-02-13T13:47:54Z

Zulip thread: https://rust-lang.zulipchat.com/#narrow/stream/185405-t-compiler.2Fwg-rls-2.2E0/topic/Design.20question.3A.20identity.20in.20syntax.20trees

matklad · 2020-02-13T16:44:37Z

Found a seemingly cute solution: https://rust-lang.zulipchat.com/#narrow/stream/185405-t-compiler.2Fwg-rls-2.2E0/topic/Design.20question.3A.20identity.20in.20syntax.20trees/near/188124603

We add an identity to the tree, but this identity doesn't have any semantics attached.

matklad · 2020-03-02T18:32:29Z

For the time being this is closed by #3222.

matklad added E-unknown It's unclear if the issue is E-hard or E-easy without digging in fun A technically challenging issue with high impact labels Feb 13, 2020

flodiebold mentioned this issue Feb 16, 2020

When a single test is run, do not run others with overlapping names #3153

Merged

matklad added the C-Architecture Big architectural things which we need to figure up-front (or suggestions for rewrites :0) ) label Feb 17, 2020

matklad closed this as completed Mar 2, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Design: syntax trees vs identity #3129

Design: syntax trees vs identity #3129

matklad commented Feb 13, 2020

matklad commented Feb 13, 2020

Uh oh!

matklad commented Feb 13, 2020

Uh oh!

matklad commented Mar 2, 2020

Uh oh!

Design: syntax trees vs identity #3129

Design: syntax trees vs identity #3129

Comments

matklad commented Feb 13, 2020

matklad commented Feb 13, 2020

Uh oh!

matklad commented Feb 13, 2020

Uh oh!

matklad commented Mar 2, 2020

Uh oh!