Skip to content

Conversation

@elibarzilay
Copy link
Contributor

@elibarzilay elibarzilay commented May 28, 2021

Fix template string refactoring and nodeFactory bug

Instead of letting createTemplate* generate a broken raw string from
the cooked one, grab the source code for it.

Also, add a missing bit to \-quote $s. As the comment in the code
says, it could just \-quote ${ since other $s are valid, but I
think that it's less confusing to always quote $s (but the change is in
the comment if minimalism is preferred).

Also, a small-but-confusing bug in getCookedText().

Many tests for all of this.

Fixes #40625

@typescript-bot typescript-bot added Author: Team For Milestone Bug PRs that fix a bug with a specific milestone labels May 28, 2021
@elibarzilay elibarzilay force-pushed the 40625 branch 2 times, most recently from e2da2bb to d069dcf Compare June 3, 2021 04:41
@elibarzilay
Copy link
Contributor Author

Question

This is using JSON.stringify to do the escaping, which is generally fine but one possible improvement is handling NUL characters. As is, they're converted to \u0000. It might be better to make these convert to \0 instead unless it's followed by a digit. But this is rare and I'm not sure if it's worth the extra code.

@sandersn sandersn requested review from rbuckton and weswigham June 8, 2021 22:13
@RyanCavanaugh
Copy link
Member

ping @rbuckton

@andrewbranch andrewbranch dismissed their stale review July 6, 2021 14:39

Review was incorrect

// @api
function createTemplateLiteralLikeNode(kind: TemplateLiteralToken["kind"], text: string, rawText: string | undefined, templateFlags: TokenFlags | undefined) {
const node = createBaseToken<TemplateLiteralLikeNode>(kind);
rawText ??= JSON.stringify(text).slice(1,-1).replace(/\\.|\$/g, s =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should try to synthesize rawText here. Generally when rawText is undefined we intend to pull the "raw" version of the template literal from the source file during emit. rawText was added to allow API consumers to pass their own "raw" text, and in those cases it goes through createTemplateLiteralLikeNodeChecked which verifies that the "cooked" text matches the supplied "raw" text.

If someone were to pass the string "\t " (byte sequence 5c 74 09) for text, this would generate a "raw" text of "\t\t". There's no way to be 100% sure what the user intended, so its better to be explicit.

I think the issue is actually in the implementation of the refactor, which should be reading the raw source text of the string literal rather than node.text.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its also important that rawText remains undefined for our tagged template transform to accurately get the actual raw text from the source file for non-user-created nodes:

function getRawLiteral(node: TemplateLiteralLikeNode, currentSourceFile: SourceFile) {
// Find original source text, since we need to emit the raw strings of the tagged template.
// The raw strings contain the (escaped) strings of what the user wrote.
// Examples: `\n` is converted to "\\n", a template string with a newline to "\n".
let text = node.rawText;
if (text === undefined) {
text = getSourceTextOfNodeFromSourceFile(currentSourceFile, node);

@elibarzilay elibarzilay force-pushed the 40625 branch 2 times, most recently from 30d1f7c to ee7b39d Compare July 15, 2021 00:16
@elibarzilay elibarzilay changed the title Generate escaped string in createTemplateHead* Fix template string refactoring and nodeFactory bug Jul 15, 2021

let token = rawTextScanner.scan();
if (token === SyntaxKind.CloseBracketToken) {
if (token === SyntaxKind.CloseBraceToken) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obvious typo I wish I'd caught sooner 😔. Good catch.

elibarzilay added a commit to elibarzilay/TypeScript that referenced this pull request Jul 19, 2021
1. `getRawLiteral()`: barf if `currentSourceFile` is missing, since if
   it is, then the following `getSourceTextOfNodeFromSourceFile` will
   return a bogus `""`.

2. Some `||` -> `??`.

3. `backtickQuoteEscapedCharsRegExp`: escape the usual control
   characters except for a simple LF.  This code does get used to
   generate backtick strings when `rawText` is not given, and not
   escaping things like TAB characters can get mangled by editor
   settings.  Worse, not escaping a CRLF and putting it verbatim in sthe
   string source will interpret it as LF, so add a special case for
   escaping these as `\r\n`.

Related to microsoft#44313 and microsoft#40625.
elibarzilay added a commit to elibarzilay/TypeScript that referenced this pull request Jul 19, 2021
1. `getRawLiteral()`: barf if `currentSourceFile` is missing, since if
   it is, then the following `getSourceTextOfNodeFromSourceFile` will
   return a bogus `""`.

2. Some `||` -> `??`.

3. `backtickQuoteEscapedCharsRegExp`: escape the usual control
   characters except for a simple LF.  This code does get used to
   generate backtick strings when `rawText` is not given, and not
   escaping things like TAB characters can get mangled by editor
   settings.  Worse, not escaping a CRLF and putting it verbatim in sthe
   string source will interpret it as LF, so add a special case for
   escaping these as `\r\n`.

Related to microsoft#44313 and microsoft#40625.
Copy link
Contributor

@rbuckton rbuckton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor nits but this looks good.

@elibarzilay elibarzilay force-pushed the 40625 branch 2 times, most recently from 93acbe8 to 8e01e23 Compare July 28, 2021 21:06
elibarzilay added a commit to elibarzilay/TypeScript that referenced this pull request Jul 28, 2021
1. `getRawLiteral()`: barf if `currentSourceFile` is missing, since if
   it is, then the following `getSourceTextOfNodeFromSourceFile` will
   return a bogus `""`.

2. One `||` -> `??` change.

3. `backtickQuoteEscapedCharsRegExp`: escape the usual control
   characters except for a simple LF.  This code does get used to
   generate backtick strings when `rawText` is not given, and not
   escaping things like TAB characters can get mangled by editor
   settings.  Worse, not escaping a CRLF and putting it verbatim in sthe
   string source will interpret it as LF, so add a special case for
   escaping these as `\r\n`.

Related to microsoft#44313 and microsoft#40625.
Instead of letting `createTemplate*` generate a broken raw string from
the cooked one, grab the source code for it.

Also, add a missing bit to `\`-quote `$`s.  As the comment in the code
says, it could just `\`-quote `${` since other `$`s are valid, but I
think that it's less confusing to always quote $s (but the change is in
the comment if minimalism is preferred).

Also, a small-but-confusing bug in `getCookedText()`.

Many tests for all of this.

Fixes microsoft#40625
@elibarzilay elibarzilay merged commit 7e8bba6 into microsoft:main Jul 29, 2021
@elibarzilay elibarzilay deleted the 40625 branch July 29, 2021 08:23
elibarzilay added a commit to elibarzilay/TypeScript that referenced this pull request Jul 31, 2021
1. `getRawLiteral()`: barf if `currentSourceFile` is missing, since if
   it is, then the following `getSourceTextOfNodeFromSourceFile` will
   return a bogus `""`.

2. One `||` -> `??` change.

3. `backtickQuoteEscapedCharsRegExp`: escape the usual control
   characters except for a simple LF.  This code does get used to
   generate backtick strings when `rawText` is not given, and not
   escaping things like TAB characters can get mangled by editor
   settings.  Worse, not escaping a CRLF and putting it verbatim in sthe
   string source will interpret it as LF, so add a special case for
   escaping these as `\r\n`.
   Added test.

Related to microsoft#44313 and microsoft#40625.
elibarzilay added a commit to elibarzilay/TypeScript that referenced this pull request Aug 4, 2021
1. `getRawLiteral()`: barf if `currentSourceFile` is missing, since if
   it is, then the following `getSourceTextOfNodeFromSourceFile` will
   return a bogus `""`.

2. One `||` -> `??` change.

3. `backtickQuoteEscapedCharsRegExp`: escape the usual control
   characters except for a simple LF.  This code does get used to
   generate backtick strings when `rawText` is not given, and not
   escaping things like TAB characters can get mangled by editor
   settings.  Worse, not escaping a CRLF and putting it verbatim in sthe
   string source will interpret it as LF, so add a special case for
   escaping these as `\r\n`.
   Added test.

Related to microsoft#44313 and microsoft#40625.
elibarzilay added a commit that referenced this pull request Aug 5, 2021
1. `getRawLiteral()`: barf if `currentSourceFile` is missing, since if
   it is, then the following `getSourceTextOfNodeFromSourceFile` will
   return a bogus `""`.

2. One `||` -> `??` change.

3. `backtickQuoteEscapedCharsRegExp`: escape the usual control
   characters except for a simple LF.  This code does get used to
   generate backtick strings when `rawText` is not given, and not
   escaping things like TAB characters can get mangled by editor
   settings.  Worse, not escaping a CRLF and putting it verbatim in sthe
   string source will interpret it as LF, so add a special case for
   escaping these as `\r\n`.
   Added test.

Related to #44313 and #40625.
@microsoft microsoft locked as resolved and limited conversation to collaborators Oct 21, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Author: Team For Milestone Bug PRs that fix a bug with a specific milestone

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

VSCode "Convert to template" refactor unescapes literals unexpectedly

5 participants