-
Notifications
You must be signed in to change notification settings - Fork 215
Nullable definite assignment #1009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! One comment added at line 799, where it looks like we should handle two cases.
I looked at this PR as a diff to 'simple definite assignment', simple_definite_assignment...nullable_definite_assignment.
At a join point in the program, it is an error if an **untyped variable** has | ||
been given an **initialization inferred type** on one of the reachable paths to | ||
the join point and not on another reachable path, unless the variable is | ||
declared as `late`, or using `var`. Note that it is valid for a variable to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes it an error for an untyped final
, non-late
variable to be potentially both assigned and potentially unassigned.
We don't do that for non-untyped variables - it's going to be impossible to read or write to the variable anyway, so it's not like the variable is useful.
One advantage is that we don't need to find a merged type when one branch is unassigned, but we need to do that for non-final or late
variables anyway.
So do we need to make this restriction here?
Would it be sufficient to not make it an error, and treat it just as a var
variable wrt. typing. All reads and writes afterwards will be errors anyway.
Or should we make it an error to have a potentially unassigned and potentially assigned final variable anywhere, independent of type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should we make it an error to have a potentially unassigned and potentially assigned final variable anywhere, independent of type?
It took me a while to figure out how this differs from the proposal. I guess the difference is that if you don't touch the variable at all, Leaf's proposal allows it where yours would make it an error:
foo() {
final x;
if (c) {
x = value;
print(x);
}
}
Is this program erroneous? It's silly, but I'm very slightly inclined to allow it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes it an error for an untyped
final
, non-late
variable to be potentially both assigned and potentially unassigned.
Sorry, I don't follow. Nothing you quote says anything about definite assignment, and I think your statement is not true. This program is not an error in my proposal:
final x;
if (b) {
while(b) {
x = 3;
}
// x is potentially assigned and potentially unassigned
} else {
while(b) {
x = 3;
}
// x is potentially assigned, and potentially unassigned
} // not an error, x has no initialization inferred type on any path
I'm not sure if that's the example you had in mind? But it's not an error. In general, talking about definite assignment here is not useful: the only place that definite assignment/unassignment comes into this section is in the definition of an initializing assignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
foo() { final x; if (c) { x = value; print(x); } }
No, my proposal makes this an error. See the first test below.
if (b) { | ||
y = 3; | ||
} // Ok, y has inferred type `int` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what happens for:
var x;
if (b) x = 2;
x = 3;
It's not an error, Is the second assignment (which is not an initializing assignment) allowed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. One path into the join has an initializing assignment, given an IIT of int
. At the join point, the other path has no IIT, so we make the IIT after the join int?
and the assignment is valid.
- If the variable is declared `late` and/or `var` and the **initialization | ||
inferred type** on one or more paths is the same type `T` (using syntactic | ||
equality), and the **initialization inferred type** on one or more paths is | ||
`Null`, then the **initialization inferred type** after the join is `T?`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't require those two sets to cover the initialized branches.
It's underspecified because it is satisfied by:
var x;
switch (c) {
case 0: x = 0; break;
case 1; x = "1"; break;
default: x = null;
}
// `x` has type `int?` *and* `String?`
We want this to handle late
variables (and var
where they overlap). Maybe instead just do late for itself. The rules would be:
- final, non-late: Must be assigned on all paths.
- late final, late var: Nullable if assigned null (or nullable, see below), otherwise not.
- var, non-late: Nullable if unassigned or assigned null (or nullable).
Maybe:
- If the variable is
final
and non-late
,
it's a compile-time error and there is no initialization inferred type after the join.- Otherwise, if the variable is declared
late
and the initialization inferred type is the same type T
(using syntactic equality) on all paths which has one,
then the initialization inferred type after the join is T.- Otherwise if there is a non-nullable type T such that the initialization inferred type
on some paths are the same as either T or T? (using syntactic equality),
and on all other paths which has one, the initialization inferred type isNull
,
then the initialization inferred type after the join is T?.- Otherwise it's a compile-time error.
This makes the late
variable not add nullability in the case where there is only one type assigned to it, and combines all other cases where there is more than one type assigned to it or the variable is var
and non-late
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
the same type `T` (using syntactic equality), and the **initialization | ||
inferred type** on all other reachable paths is `Null`, then the | ||
**initialization inferred type** after the join is `T?`. | ||
- Otherwise, it is an error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about the situation where three branches has the types int
, int?
and Null
?
Or just two branches having the types int
and int?
. Which can happen with consecutive joins like;
var x;
if (test1) {
x = 1;
} else if (test2) {
x = null;
} else {
x = 0;
}
Here the join points would first combine x=null
and x=0
to int?
, then fail to join that with x=1
.
So, reasonable code which should probably be accepted, because otherwise we can't explain this rule to users.
Maybe:
- Otherwise, if there exists a type T such that initialization inferred type on all paths are either the same as T or T? (using syntactic equality) or it is
Null
, then the initialization inferred type after the join is T?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll take a crack at making this work. Definitely starts getting complicated here though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I like the proposed behavior here, modulo:
- Handling nullable types in join points like Lasse notes.
- Maybe making using an initialization inferred type variable an error instead of making the join itself an error.
I find the presentation here a little dense, but that may be somewhat necessary given the nature of the behavior. I may have read some stuff wrong because of that.
} | ||
``` | ||
|
||
At a join point in the program, it is an error if an **untyped variable** has |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding the "unless..." clause at the end here, how about what I think is equivalent:
At a join point in the program, it is an error if a non-
late
final
untyped variable has been given an initialization inferred type on one of the reachable paths to the join point and not on another reachable path.
Is this equivalent to saying it is an error at a join point if a non-late
final
untyped variable is definitely assigned on one path and not on all others? If so, would that be a clearer way to state this because it doesn't bring in initialization inferred types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find your rewording very confusing, because the binding of the non
part is ambiguous. I think you mean "non-late
, and final
,", which is correct but awkward when written out that way, but as written it can also be read "non-('lateand
final`) which is incorrect.
To your second question, no. You can have variables that are definitely assigned but have no initialization inferred types. That's one of the key points of all three of these proposals: that they separate the notion of having an initialization inferred type from being definitely assigned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
English is weirdly ambiguous sometimes! In Danish I'd write non-late and -final
if I wanted non
to bind to both, and non-late and final
if I didn't.
Apart from that, I like using "assigned" rather than having an inferred type. Then it's just a matter of permuting the conjunctive phrases until it is readable.
If an untyped variable is
final
and non-late
, it is an error if, at a join point, the variable is potentially assigned on at least one reachable path, and potentially unassigned on at least one other reachable path.
or
It is an error if an untyped variable is declared
final
and not declaredlate
and there are two reachable paths leading to a join point where the variable is potentially assigned on one path and potentially unassigned on the other.
or ...
At a join point in the program, it is an error if an **untyped variable** has | ||
been given an **initialization inferred type** on one of the reachable paths to | ||
the join point and not on another reachable path, unless the variable is | ||
declared as `late`, or using `var`. Note that it is valid for a variable to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should we make it an error to have a potentially unassigned and potentially assigned final variable anywhere, independent of type?
It took me a while to figure out how this differs from the proposal. I guess the difference is that if you don't touch the variable at all, Leaf's proposal allows it where yours would make it an error:
foo() {
final x;
if (c) {
x = value;
print(x);
}
}
Is this program erroneous? It's silly, but I'm very slightly inclined to allow it.
if (b) { | ||
x = 3; | ||
} | ||
// x has type int? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will not be what users want most of the time. It's probably a mistake that they didn't initialize it. But the idea is that you'll get a nullable type and then you'll get some other downstream error when you try to use that, so erroneous uses of this should still be caught. And then it is what you want, it lets you. Is that the idea?
If so, SGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I mean, it's easy to not allow this at all - I can say that you must write a value on every path to the join, so you'd have to write this with an x=null
in an else branch. But the idea was to allow this, since anything else feels a little verbose. The one place I can see this not catching legit errors is if you're using a nullable API, but you didn't actually want to pass null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep thinking about this case and I still worry that we're implicitly giving a user a nullable type that they didn't request. I wish we could instead make this case an error and tell them to solve it by writing:
var x = null; // User opts in to null.
if (b) {
x = 3;
}
This is just a hunch, though, and I could be wrong. Here's the evidence I have in mind:
-
Users often tell us that they want more "missing return" errors even in functions that can return
null
. Many seem to want this to be an error:Object foo(bool c) { // Object? with NNBD. if (c) return "thing"; }
And instead want to have to write:
Object foo(bool c) { // Object? with NNBD. if (c) return "thing"; return null; }
-
Dart requires final fields to be definitely initialized in the constructor initialization list, even when
null
is a valid value. Users don't seem to mind having to put an explicit= null
there. This may be confounded by them having different preferences between final and non-final fields, but I'm not sure about that. -
I've definitely had a number of users over the years tell me they are surprised by the Effective Dart rule that tells them to rely on default initialization to
null
. They felt it should have been made explicit.
// u has inferred type num, but is promoted to int | ||
} else { | ||
u = 3; | ||
} // Error: u is inferred to different types on different paths |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subtle! This will trip people up when they run into it. Is it a crazy idea to use the promoted type at a join point instead of just the inferred type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we can just make it have no inferred initialization type, and make it an error to read or write a variable with no type. The variable will still be potentially assigned so no further initialization can happen.
Then
foo() {
var x;
if (test) {
x = 4;
something(x);
} else if (test2) {
x = 4.0;
somethingElse(x);
} else {
throw "whatever";
}
return;
}
would be valid because even though there is no type for x
after the if-chain, there is also no use of it.
it's different because we have to let the program survive in a state where a variable has no type, rather than make it an error at the join point, which is why we have to explicitly call out that the variable is unusable afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will trip people up when they run into it. Is it a crazy idea to use the promoted type at a join point instead of just the inferred type?
Not sure, I can think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we can just make it have no inferred initialization type, and make it an error to read or write a variable with no type.
It is already an error to read a variable with no IIT. We can make it an error to write, or not.
The general idea of making the IIT non-monotonic feels icky, but maybe would work out fine. You'd have a variable which was in a usable state for some period of time, and then becomes unusable. Feels weird, but probably doable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subtle! This will trip people up when they run into it. Is it a crazy idea to use the promoted type at a join point instead of just the inferred type?
Where this starts getting complicated is deciding what combinations you allow. If I infer a variable as Object
on one path, then promoted it from there to num
, and then to int
, what joins do I allow? Can I join it with another path that inferred Object
(demoting to Object
)? Can I join it with another path that inferred num
(demoting to num
)? What about another path that inferred num?
? Do I have to look at every pair of types in the promotion chain, with their nullable + non-nullable versions? And what about when there are multiple paths into the join? I'll think about whether there's a clean story here, but it feels really icky on first glance.
shall be treated as having declared type `Never`. Note that despite the | ||
requirement that the **initialization inferred types** be equal at join points, | ||
it is possible for a variable to take on different **initialization inferred | ||
types** on paths that never join before exiting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So then a stranger example is:
test(c) {
var x;
if (c) {
x = "a string";
return;
} else {
x = 123;
return;
}
}
So you can create a variable and use it as totally arbitrary unrelated types as long as you escape before those paths join. It's a little strange to me that the above example is allowed but this is not:
test(c) {
var x;
if (c) {
x = "a string";
} else {
x = 123;
}
}
With final
, we don't make it an error to join two paths where one is definitely assigned and the other is not. We just leave the variable in a half-assigned unusable state and then make it an error to touch it. But here we say the join itself is the point of error.
Should we pick one or the other strategy and have a more consistent model?
I lean slightly towards making the use of the variable the error and not the join itself. Pragmatically, that allows more programs that don't do anything harmful. And I think it draws the user's attention to a concrete entity — a variable they declared that they are using — instead of to a somewhat nebulous control flow concept.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After seeing these examples, I agree that it's not the join that's bad, it's the use after join.
It's not just reading the variable which is iffy afterwards, you also shouldn't assign to the variable (because it has no type).
It still means that we need to introduce a state after the join which can represent the "unusable" variable.
Maybe just:
- Variable is untyped and definitely unassigned: You can assign to it, and it's an initializing assignment. After the assignment it's not untyped any more, it has an inferred initialization type.
- Variable is untyped and potentially assigned: You cannot read or write the variable. Not even if it's late. You can't lose "potentially assigned" by joining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With
final
, we don't make it an error to join two paths where one is definitely assigned and the other is not.
One of us is confused. Again, you need to stop talking about definite assignment, but assuming you mean "join two paths where one has an IIT and one does not", I believe my spec is intended to make this an error:
- If the variable is declared `final` and not `late` it is an error, and there
is no **initialization inferred type** after the join.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After seeing these examples, I agree that it's not the join that's bad, it's the use after join.
I
I can probably work something out around this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still means that we need to introduce a state after the join which can represent the "unusable" variable.
Thinking about this a bit, an unpleasant consequence of this is that we need to take this state into account in joins. The following program shouldn't be legal:
var x;
if (b) {
if (b) {
x = 3.0;
} else {
x = 3;
} // x is joined, and loses its IIT
} else {
x = "hello";
} // x has no IIT on one path, and an IIT of `String` on another.
// As currently specified, we would infer `String?` which is wrongo.
// So we need to track the fact that `x` had an IIT at some point
// in the past. It might be sufficient to use "potentially assigned" as
// a proxy for this. Complicated....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I lean slightly towards making the use of the variable the error and not the join itself. Pragmatically, that allows more programs that don't do anything harmful. And I think it draws the user's attention to a concrete entity — a variable they declared that they are using — instead of to a somewhat nebulous control flow concept.
Just a comment, as I run through updating the test cases. This isn't how it ends up feeling to me. When the error is at the join point, there's a clear error you can report: I can't infer a type for this variable. When the error is at the use point, it feels weirder. What do you say? "This variable can't be used because I couldn't infer a type" is the best I could get. You can say "Type can't be inferred for a variable", but that's going to be confusing because I had code that worked until I added a use of a variable. The user says "you could infer it before, why not now?".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a patch set implementing this. You can see the changes from that commit only here: ee0d27d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the complexity of working with code that relies on these rules, I agree that reporting the error at the join is preferable.
Otherwise we'll have a situation where any number of variables could have a broken type for any number of lines, and you can't readily see how far back you need to go to find (1) the join where the problem can first be detected, and (2) the predecessors in the flow graph where some unintended IIT caused the problem. The developer needs to find that second location, and we shouldn't make that task intractable.
@lrhn wrote:
With final, we don't make it an error to join two paths where one is
definitely assigned and the other is not.
Then maybe we should report that error at the join as well.
I don't think it's helpful to allow the situation where several variables are in scope, but it is a compile-time error to use each of them for any purpose (read/write), so the situation cannot be fixed.
variable are intended to apply to all form of writes. | ||
|
||
It is a compile time error to assign a value to a `final`, non-`late` local | ||
variable which is **potentially assigned**. Thus, it is *not* an error to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd drop the Thus
, the latter sentence is not implied by the former (unless you know that there are no other rules anywhere which affects the assignability), but the word made me think it should be.
if it is **definitely assigned**. Thus, it is *not* an error to assign to a | ||
**potentially unassigned** `final`, `late` local variable. | ||
|
||
*Note that a variable is always considered **definitely assigned** and not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is non-normative. I couldn't find the corresponding normative definition in a quick scan.
} | ||
``` | ||
|
||
Local variables with no explicitly written type but with an initializer are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comma before "but"? (And then probably after "initializer" too).
it is *promotable via initialization* as defined in | ||
the | ||
[flow analysis specification](https://github.com/dart-lang/language/blob/master/resources/type-system/flow-analysis.md). That | ||
is, the variable is in the **definitely unassigned** state, and the variable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just: "is definitely unassigned".
Maybe: "is definitely unassigned prior to the assignment"
"State" sounds like it's temporal, able to change over time, but assignment is flow based, so it's a property of the same variable at each occurrence of it.
the latter case, the type is made nullable if the variable is declared as | ||
non-`late`, and the variable is treated as implicitly null initialized. | ||
Variables declared `late` are only made nullable if `Null` is explicitly | ||
assigned on one more paths into the join, and there is no implicit null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "and there is no implicit null initialization" mean?
Is it normative, or is it just explaining what the previous part of the statement already said?
Can it be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of this is normative. I can see that the "and there is no" is badly worded though, I will follow up if this PR stays live.
non-`late`, and the variable is treated as implicitly null initialized. | ||
Variables declared `late` are only made nullable if `Null` is explicitly | ||
assigned on one more paths into the join, and there is no implicit null | ||
initialization. In all cases, the values assigned must all have the same type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The values assigned" -> "the expression assigned", or rather the IIT of the paths which has one.
(Let's not confuse someone into looking at the actual values, if they are available).
|
||
|
||
```dart | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty example?
} // x has type `int?` | ||
} | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to say that when the specification refers to the declared type of a variable, an untyped variable uses its IIT instead, if it has one (and it's probably an error if it doesn't have one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
Side note: I think the spec has similar problems with static and instance variables. E.g. the "variables" section of the spec says:
Let v be variable declared in an initializing variable declaration, and let e be the associated initializing expression. It is a compile-time error if the static type of e is not assignable to the declared type of v.
But if v is an instance variable, and the declaration of v doesn't include a type, and the type is inferred from a base class, then we want this rule to apply to the inferred type of v.
I wonder if it would be better to replace all instances of the term "declared type" in the spec with some other term, and define that term explicitly?
Looks very good now. I tried to find holes in the algorithm, but couldn't. |
written. The declared type of the variable is considered a "type of interest" | ||
in the sense defined in the flow analysis specification. If the variable has an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The declared type of the variable is considered a "type of interest" in the sense defined in the flow analysis specification.
This sentence contradicts what's currently written in flow-analysis.md (as do other things later in the pull request). I'm assuming that the text here should be considered definitive, and flow-analysis.md should be updated as a follow-up PR?
(I'm happy to take ownership of that follow-up work, BTW, just wanted to make sure that we're on the same page)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. To be clear, my intention is that the flow analysis spec be definitive. But the combination of the flow analysis spec not being generally readable + having bugs at the moment drove me to write a textual description in the feature spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And yes, I intend there be follow up to land this in the flow analysis spec.
non-nullable type.* | ||
|
||
```dart | ||
void test() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this example is broken: x.isEven
is a compile-time error because x
has a potentially nullable type, and
It is an error to call a method, setter, getter or operator on an expression
whose type is potentially nullable and notdynamic
, except for the methods,
setters, getters, and operators onObject
.
We could "fix" the example by changing to x!.isEven
, but then it's not really an example of the benefit of considering the declared type to be a type of interest, because the promotion is happening due to !
, and !
promotes regardless of whether the type in question is a type of interest.
I think the example we want is more like:
void test(bool b, int? i) {
if (b) {
i = 0; // promotes to `int` since it's a type of interest
} else {
i ??= 0; // promotes to `int`
}
i.isEven; // Valid since i is promoted on both branches
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Leaf is trying to show an example of the last sentence — the initializer itself promotes to the non-nullable type. So these two would be equivalent:
void test1() {
int? i = 1; // Promote on initialization.
i.isEven;
}
void test2() {
int? i;
i = 1; // Promote on assignment.
i.isEven;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, understood. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, lines 690 + 693 are intended to make this code non-erroneous. @stereotype441 if you don't think I've correctly captured that we should discuss further.
then the **initialization inferred type** for the variable after the join point | ||
is computed as follows: | ||
- If the **initialization inferred type** on each path which has one | ||
(reachable or not) is the same type `T` (using syntactic equality), then the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Syntactic equality would be difficult to implement here (at least for the analyzer), since it does some pre-normalization in its representation of types. And I don't think there's really a benefit over normalized type equality. Can we change using syntactic equality
here (and below) to something like where "same" means equal after applying **NORM**
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use NORM but I'm not sure we really want to. To give a precise accounting of what the analyzer and the CFE are doing, we need a semi-normal form which collapses multiple ??
as appropriate. This is already implicitly assumed all over the places, so I'm inclined to just assume that here. But maybe NORM is reasonable? I'll think about it.
It is an error to read an **untyped variable** when it has no **initialization | ||
inferred type**. | ||
|
||
It is an error to write to an **untyped variable** when it has no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this? In all the cases I can think of where an untyped variable is written to and there's no initialization inferred type, either the write is an initializing assignment or there is already some other error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on the exact details of the proposal, but for this variant at least, yes:
var x;
if (true) {
while(b) {
x = 3; // Not an initializing write
}
// x is still untyped
} else {
x = "hello"; // x inferred to String
} // x is untyped on one branch, `String` on another, so infer `String?`
x?.length; // x is a non-null int, what does this mean?
type** is consistent across joins. Any variable which is given an | ||
**initialization inferred type** in this way may be treated by IDEs and tools as | ||
having this type for the purposes of documentation and error messages. An | ||
untyped cariable which is never read nor written may have no **initialization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cariable -> variable
} // x has type `int?` | ||
} | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
Side note: I think the spec has similar problems with static and instance variables. E.g. the "variables" section of the spec says:
Let v be variable declared in an initializing variable declaration, and let e be the associated initializing expression. It is a compile-time error if the static type of e is not assignable to the declared type of v.
But if v is an instance variable, and the declaration of v doesn't include a type, and the type is inferred from a base class, then we want this rule to apply to the inferred type of v.
I wonder if it would be better to replace all instances of the term "declared type" in the spec with some other term, and define that term explicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are a couple of gaps for how to handle joins where all IITs are Null
. I'm still not sold on implicitly null-initializing non-final variables, but I'm also not totally opposed. My intuition is that it would be better to make it explicit, but this proposal is certainly better than doing nothing at all.
take into account **definite assignment** and **definite unassignment** (see the | ||
section on Definite Assignment below). We say that a variable is **potentially | ||
assigned** if it is not **definitely unassigned**, and that a variable is | ||
**potentially unassigned** if it is not **definitely assigned**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was very confused about the two "potential" states for a while because I couldn't tell how they differed. That's because I assumed that these states were mutually exclusive. But on further reading, I guess that's not the intent? Am I correct that these mean:
- Definitely assigned: All paths assign the variable.
- Definitely assigned: No path assigns the variable.
- Potentially assigned: There exists at least one path which assigns the variable. There may be paths that do not, or it may be that all paths do. So every definitely assigned variable is also potentially assigned.
- Potentially unassigned: There exists at least one path which does not assign the variable. There may be paths that do, or it may be that no paths do. So every definitely unassigned variable is also potentially unassigned.
If the "potentially" states are intended to be exclusive of the "definitely" states, then I think that implies that there is only one potentially state:
- Either: There exists at least one path which assigns (to exclude it from the definitely unassigned state) and at least one path which does not assign (to exclude it from the definitely assigned state).
Am I understanding this correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these are not exclusive. I introduced these terms in response to feedback that just using "not definitely assigned" (i.e. could be "Either" or could be "Definitely unassigned") was confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think those terms are helpful for the overall readability.
|
||
It is a compile time error to read a local variable when the variable is | ||
**potentially unassigned** unless the variable is non-final and has nullable | ||
type, or is `late`, or is declared using `var` with no type and no initializer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can drop the "and no initializer" because it the variable had an initializer, it would be definitely assigned and not potentially unassigned.
non-nullable type.* | ||
|
||
```dart | ||
void test() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Leaf is trying to show an example of the last sentence — the initializer itself promotes to the non-nullable type. So these two would be equivalent:
void test1() {
int? i = 1; // Promote on initialization.
i.isEven;
}
void test2() {
int? i;
i = 1; // Promote on assignment.
i.isEven;
}
if (b) { | ||
x = 3; | ||
} | ||
// x has type int? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep thinking about this case and I still worry that we're implicitly giving a user a nullable type that they didn't request. I wish we could instead make this case an error and tell them to solve it by writing:
var x = null; // User opts in to null.
if (b) {
x = 3;
}
This is just a hunch, though, and I could be wrong. Here's the evidence I have in mind:
-
Users often tell us that they want more "missing return" errors even in functions that can return
null
. Many seem to want this to be an error:Object foo(bool c) { // Object? with NNBD. if (c) return "thing"; }
And instead want to have to write:
Object foo(bool c) { // Object? with NNBD. if (c) return "thing"; return null; }
-
Dart requires final fields to be definitely initialized in the constructor initialization list, even when
null
is a valid value. Users don't seem to mind having to put an explicit= null
there. This may be confounded by them having different preferences between final and non-final fields, but I'm not sure about that. -
I've definitely had a number of users over the years tell me they are surprised by the Effective Dart rule that tells them to rely on default initialization to
null
. They felt it should have been made explicit.
*Note that a variable is always considered **definitely assigned** and not | ||
**definitely unassigned** if it has an explicit initializer, or an implicit | ||
initializer as part of a larger construct (e.g. the loop variable in a `for in` | ||
construct).* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is yet another place where model 3 (#946 (comment)) would require an exception: With var x = null;
, x
would be considered definitely unassigned, or no assignment after the declaration could be considered an initializing assignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm thinking we'll need to have intiailziation/typing separate from assignment.
A variable is either typed, untyped (with nullable hint or not), or initialization-typed (or maybe also invalid-typed if we want to continue after incompatible joins).
If it's untyped and assigned to, then it becomes initialization-typed. We can join initialization-typed paths for variables, maybe untyped ones too (if we want to).
Abandoning this, we didn't take this approach. |
This a variant of the simple definite assignment proposal which requires all initializing assignment to assign values of the same type except that
Null
is also allowed, in which case the variable is made nullable. It also allows variables declaredvar
to not be assigned on every path, in which case the type is also made nullable.