Skip to content

Disallow "identity escapes" and "escaped whitespace" in string literals. #3228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
lrhn opened this issue Jul 19, 2023 · 1 comment
Open
Labels
feature Proposed language feature that solves one or more problems

Comments

@lrhn
Copy link
Member

lrhn commented Jul 19, 2023

Dart currently allows string literals with useless "escapes", like "ab\cie\o{101}\0" or "\b0\d+".

We should disallow every "escape" which is not one of the currently defined Dart string escapes, those which actually do something. Those are currently:

\b, \n, \f, \v, \r, \t, \xHH, \uHHHH, \u{H....}, \', \", \\

The remaining escapes are called "identity escapes" in the specification, because the character introduced by the escaped character is the escaped character, and the string would be unchanged by removing the \.
It's an unnecessary and error-prone affordance to allow meaningless escape sequences.

For example, someone coming from C may use \a or \e in a string and expect it to work.
Best-case, they have enabled the lint which tells them that it's not a valid escape.
Semi-best case their tests catch that the code doesn't do what they expect.
Worst case it goes into production and doesn't do what they expect.

The last case has happened. When enabling the lint, we have caught RegExp code in production which had forgotten to make the string raw, and therefore, for example, the \b was interpreted as U+0008 rather than "word boundary", and \d as d instead of [0-9], or (possibly the most common and hardest to catch example), a RegExp("Some text\.") which intended to match a '.' character, but ended up matching any character at that point.
And by an unlucky accident, the tests didn't hit the case where it made a difference, the "validation" regexp was just too permissive, maybe only around some edge cases.

Also, the multi-line string production ignores the first line of the string if it contains only whitespace characters (space or tab), or "escaped whitespace" characters, which means a backslash before a space or tab, or just before the line terminator. There is no good reason to allow those "escape" backslashes. (They are not escapes, hence the quotes, since they are allowed even in raw strings, where escapes do not exist.)

That's a parser complication with absolutely no practical use. We should change it to only remove the first line if it's entirely non-escaped whitesapce. Then using an escape is actually a way prevent that removal, without introducing a non-whitespace character. Well, or would be if we still allow escaping whitspace.
(Here I'd perhaps suggest removing-trailing whitespace on every multiline string line, and allow a \ followed by only zero or more - to be removed - whitspace until the line break, to be allowed to end a non-raw multiline string line, as a way to allow trailing spaces. Like:

var boxText = """
Banana Watermelon Cantaloupe      \
Gemini Foxtrot Tango, Honolulu,   \
Bahama.                           \
""";

and then removing trailing whitespace from all other multiline string lines.
But making multiline strings more useful is a separate issue. For now I'd just not allow the escaped-whitspace which does asbolutely nothing anyway - removing it has no effect on the program.)

@lrhn lrhn added the feature Proposed language feature that solves one or more problems label Jul 19, 2023
@freemansoft
Copy link

The flutter recommended lint settings now catch \a and \0

Unnecessary escape in string literal. Remove the '' escape

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Proposed language feature that solves one or more problems
Projects
None yet
Development

No branches or pull requests

2 participants