You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Dart currently allows string literals with useless "escapes", like "ab\cie\o{101}\0" or "\b0\d+".
We should disallow every "escape" which is not one of the currently defined Dart string escapes, those which actually do something. Those are currently:
The remaining escapes are called "identity escapes" in the specification, because the character introduced by the escaped character is the escaped character, and the string would be unchanged by removing the \.
It's an unnecessary and error-prone affordance to allow meaningless escape sequences.
For example, someone coming from C may use \a or \e in a string and expect it to work.
Best-case, they have enabled the lint which tells them that it's not a valid escape.
Semi-best case their tests catch that the code doesn't do what they expect.
Worst case it goes into production and doesn't do what they expect.
The last case has happened. When enabling the lint, we have caught RegExp code in production which had forgotten to make the string raw, and therefore, for example, the \b was interpreted as U+0008 rather than "word boundary", and \d as d instead of [0-9], or (possibly the most common and hardest to catch example), a RegExp("Some text\.") which intended to match a '.' character, but ended up matching any character at that point.
And by an unlucky accident, the tests didn't hit the case where it made a difference, the "validation" regexp was just too permissive, maybe only around some edge cases.
Also, the multi-line string production ignores the first line of the string if it contains only whitespace characters (space or tab), or "escaped whitespace" characters, which means a backslash before a space or tab, or just before the line terminator. There is no good reason to allow those "escape" backslashes. (They are not escapes, hence the quotes, since they are allowed even in raw strings, where escapes do not exist.)
That's a parser complication with absolutely no practical use. We should change it to only remove the first line if it's entirely non-escaped whitesapce. Then using an escape is actually a way prevent that removal, without introducing a non-whitespace character. Well, or would be if we still allow escaping whitspace.
(Here I'd perhaps suggest removing-trailing whitespace on every multiline string line, and allow a \ followed by only zero or more - to be removed - whitspace until the line break, to be allowed to end a non-raw multiline string line, as a way to allow trailing spaces. Like:
and then removing trailing whitespace from all other multiline string lines.
But making multiline strings more useful is a separate issue. For now I'd just not allow the escaped-whitspace which does asbolutely nothing anyway - removing it has no effect on the program.)
The text was updated successfully, but these errors were encountered:
lrhn
added
the
feature
Proposed language feature that solves one or more problems
label
Jul 19, 2023
Uh oh!
There was an error while loading. Please reload this page.
Dart currently allows string literals with useless "escapes", like
"ab\cie\o{101}\0"
or"\b0\d+"
.We should disallow every "escape" which is not one of the currently defined Dart string escapes, those which actually do something. Those are currently:
The remaining escapes are called "identity escapes" in the specification, because the character introduced by the escaped character is the escaped character, and the string would be unchanged by removing the
\
.It's an unnecessary and error-prone affordance to allow meaningless escape sequences.
For example, someone coming from C may use
\a
or\e
in a string and expect it to work.Best-case, they have enabled the lint which tells them that it's not a valid escape.
Semi-best case their tests catch that the code doesn't do what they expect.
Worst case it goes into production and doesn't do what they expect.
The last case has happened. When enabling the lint, we have caught RegExp code in production which had forgotten to make the string raw, and therefore, for example, the
\b
was interpreted as U+0008 rather than "word boundary", and\d
asd
instead of[0-9]
, or (possibly the most common and hardest to catch example), aRegExp("Some text\.")
which intended to match a '.' character, but ended up matching any character at that point.And by an unlucky accident, the tests didn't hit the case where it made a difference, the "validation" regexp was just too permissive, maybe only around some edge cases.
Also, the multi-line string production ignores the first line of the string if it contains only whitespace characters (space or tab), or "escaped whitespace" characters, which means a backslash before a space or tab, or just before the line terminator. There is no good reason to allow those "escape" backslashes. (They are not escapes, hence the quotes, since they are allowed even in raw strings, where escapes do not exist.)
That's a parser complication with absolutely no practical use. We should change it to only remove the first line if it's entirely non-escaped whitesapce. Then using an escape is actually a way prevent that removal, without introducing a non-whitespace character. Well, or would be if we still allow escaping whitspace.
(Here I'd perhaps suggest removing-trailing whitespace on every multiline string line, and allow a
\
followed by only zero or more - to be removed - whitspace until the line break, to be allowed to end a non-raw multiline string line, as a way to allow trailing spaces. Like:and then removing trailing whitespace from all other multiline string lines.
But making multiline strings more useful is a separate issue. For now I'd just not allow the escaped-whitspace which does asbolutely nothing anyway - removing it has no effect on the program.)
The text was updated successfully, but these errors were encountered: