Skip to content

Break out of speculative parsing on bad parameter initializer #19158

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
13 commits merged into from
Nov 13, 2017

Conversation

ghost
Copy link

@ghost ghost commented Oct 13, 2017

Fixes #19134
Sequel to #18417

This will break out of speculative parsing immediately if we see a bad initializer, rather than generating the entire AST and then iterating over it to look for things which failed parsing.

This seems to result in an average 1% decrease in parse time, although results varied greatly when I tested. (-3.61%, +5.79%, -2.08%, +1.63%, -6.47%, -1.23%).

@ghost ghost requested review from sandersn and rbuckton October 13, 2017 16:12
@sandersn
Copy link
Member

Are those results for trials of the same input or different inputs?

@ghost
Copy link
Author

ghost commented Oct 13, 2017

In order, those are the numbers for:

  • Unions - node (v8.2.1, x64)
  • Unions - tsc (x86)
  • Monaco - node (v8.2.1, x64)
  • Monaco - tsc (x86)
  • TFS - node (v8.2.1, x64)
  • TFS - tsc (x86)

I made a new commit that removed uses of finally. It seems to have better performance -- a 3% improvement vs the master branch. On two runs:

  • -0.84, -0.48, -6.76, -4.45, -9.04, -1.66
  • -0.72, -0.48, -5.25, -3.88, -8.29, -2.89

@ghost
Copy link
Author

ghost commented Oct 17, 2017

@sandersn Any comments?

@sandersn
Copy link
Member

So it sounds like performance is 4% better on node and 2% worse on tsc. This is likely fine, given how few people use anything besides node. @rbuckton would you agree?

Exceptions-for-control-flow seems like something that would get slower in future updates of V8 rather than faster (if it changes at all of course).

Copy link
Member

@sandersn sandersn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks OK, and I think the performance boost for node on V8 outweighs the mixed results for tsc on Chakra.

However, I'd like to have @rbuckton's opinion on the addition of an exception for control flow before merging.

@@ -17,6 +17,14 @@ namespace ts {
let IdentifierConstructor: new (kind: SyntaxKind, pos: number, end: number) => Node;
let SourceFileConstructor: new (kind: SyntaxKind, pos: number, end: number) => Node;

let inSpeculation = false;
const GIVE_UP_SPECULATION = {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extremely minor: STOP_SPECULATION sounds better to me

@rbuckton
Copy link
Contributor

I'm generally not in favor of using exceptions for control flow and would prefer to keep them reserved for unexpected errors or asserts. @mhegazy can weigh in on his thoughts, but I'd prefer a solution that doesn't depend on exceptions for this.

@ghost
Copy link
Author

ghost commented Oct 20, 2017

Test failure fixed by #19387.
@rbuckton Please review again.

@ghost
Copy link
Author

ghost commented Oct 30, 2017

@rbuckton

@ghost
Copy link
Author

ghost commented Nov 2, 2017

@rbuckton

}
return result;
}

function startSpeculation(): SpeculationReset {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a reason to use this approach over the previous approach? The previous approach guaranteed that saving/resetting state was properly balanced and didn't depend on an external party to remember to reset the state where necessary. Also, this results in object allocations every time we speculate, which could get expensive given how often we do speculation. I'm curious what the performance impact of this approach is over the previous approach.

One approach to avoid excessive allocations would be to reuse the same SpeculationReset object for shallow speculation, or maintain a pool of SpeculationReset objects to use for this purpose.

Another approach would be to have an array to use as a stack for each state variable and push/pop onto the stack, i.e.:

const posStack: number[] = [];
const startPosStack: number[] = [];
// ...
function startSpeculation() {
  posStack.push(pos);
  startPosStack.push(startPos);
  // ...
}
/** @param accept If true, accept the current state; otherwise, reset to the prior state */
function endSpeculation(accept: boolean) {
  const savePos = posStack.pop();
  const saveStartPos = startPosStack.pop();
  // ...
  if (!accept) {
    pos = savePos;
    startPos = saveStartPos;
    // ...
  }
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reverted these changes. You can try that approach in a separate PR if you like.

signature.parameters = parseParameterList(flags);
const parameters = parseParameterList(flags, inSpeculation);
if (isFail(parameters)) {
return Fail;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to return Fail? It could just return boolean.

fillSignature(SyntaxKind.ColonToken, isAsync | (allowAmbiguity ? SignatureFlags.None : SignatureFlags.RequireCompleteParameterList), node);

const sigFail = fillSignature(SyntaxKind.ColonToken, isAsync | (inSpeculation ? SignatureFlags.RequireCompleteParameterList : SignatureFlags.None), node, inSpeculation);
if (isFail(sigFail)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the local, inline sigFail.

}

function parseParameter(): ParameterDeclaration;
function parseParameter(inSpeculation?: boolean): ParameterDeclaration | Fail;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inSpeculation shouldn't be optional in this overload.

list.push(parseListElement(kind, parseElement));
const elem = parseListElement(kind, parseElement, inSpeculation);
if (isFail(elem)) {
return Fail;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning Fail here makes parseDelimitedList and anything that calls it polymorphic, since Fail isn't an array. Consider adding something like a ListFail sentinel value that has the properties of a NodeArray to reduce polymorphism.

const result = isLookAhead
? scanner.lookAhead(callback)
: scanner.tryScan(callback);
const reset = scanner.startSpeculation();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the performance cost of this approach. See my comments in scanner.ts.

@@ -17,6 +17,13 @@ namespace ts {
let IdentifierConstructor: new (kind: SyntaxKind, pos: number, end: number) => Node;
let SourceFileConstructor: new (kind: SyntaxKind, pos: number, end: number) => Node;

interface Fail { __FAIL: void; }
/** Only value of type Fail. */
const Fail: Fail = { __FAIL: undefined };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fail may introduce added polymorphism because it doesn't have the same hidden class as a Node. I would recommend the Fail sentinel be a Node instance (possibly using SyntaxKind.Unknown), though you would have to wait to allocate Fail until parse time to ensure the correct objectAllocator instance from services has already been loaded (if present). That would ensure Fail has the same shape and ordering of properties that any other Node instance would start with.

interface Fail { __FAIL: void; }
/** Only value of type Fail. */
const Fail: Fail = { __FAIL: undefined };
function isFail(x: {} | null | undefined | void): x is Fail {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is {} | null | undefined | void really any different from any in this case? This seems unnecessarily verbose.

function parseDelimitedList<T extends Node>(kind: ParsingContext, parseElement: () => T, considerSemicolonAsDelimiter?: boolean): NodeArray<T> {
function parseDelimitedList<T extends Node>(kind: ParsingContext, parseElement: () => T, considerSemicolonAsDelimiter?: boolean, inSpeculation?: false): NodeArray<T>;
function parseDelimitedList<T extends Node>(kind: ParsingContext, parseElement: (inSpeculation: boolean) => T | Fail, considerSemicolonAsDelimiter?: boolean, inSpeculation?: boolean): NodeArray<T> | Fail;
function parseDelimitedList<T extends Node>(kind: ParsingContext, parseElement: (inSpeculation: boolean) => T | Fail, considerSemicolonAsDelimiter?: boolean, inSpeculation?: boolean): NodeArray<T> | Fail {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to thread inSpeculation here? The function passed as the parseElement parameter could easily be a wrapper to parse the element with inSpeculation true, similar to parseParameterNoSpeculation below does for the inverse.

@@ -2243,7 +2260,13 @@ namespace ts {
isStartOfType(/*inStartOfParameter*/ true);
}

function parseParameter(requireEqualsToken?: boolean): ParameterDeclaration {
function parseParameterNoSpeculation(): ParameterDeclaration {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this wrapper necessary, since the default behavior is the same as parseParameter()?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed parseParameter, now should always use this or parseParameterInSpeculation.

@ghost
Copy link
Author

ghost commented Nov 10, 2017

@rbuckton I think I've responded to all of your review now.

return x === Fail;
interface Fail extends Node { kind: SyntaxKind.Unknown; }
function fail(): Fail {
return createNode(SyntaxKind.Unknown) as Fail;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that fail and failList should be singleton values local to the Parser namespace. You can easily initialize them for the first time in initializeState so that we don't create excess objects when speculative parsing fails.

Copy link
Contributor

@rbuckton rbuckton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more recommendations.

let Fail: Fail;
let FailList: FailList;
function isFail(x: Node | undefined): x is Fail {
return !!x && x.kind === SyntaxKind.Unknown;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return Fail !== undefined && x === Fail; doesn't require the property lookup and verifies that Fail has a value.

return !!x && x.kind === SyntaxKind.Unknown;
}
function isFailList(x: NodeArray<Node> | undefined): x is FailList {
return !!x && x.pos === -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return FailList !== undefined && x === FailList; doesn't require the property lookup and verifies that FailList has a value.

node.name = parseIdentifierOrPattern();
const name = parseIdentifierOrPattern(inSpeculation);
if (isFail(name)) {
return name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I would recommend you just return Fail.

node.initializer = parseInitializer(/*inParameter*/ true, requireEqualsToken);
const initializer = parseInitializer(/*inParameter*/ true, inSpeculation);
if (isFail(initializer)) {
return initializer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I would recommend you just return Fail.

node.initializer = parseInitializer(/*inParameter*/ false);
const name = parseIdentifierOrPattern(inSpeculation);
if (isFail(name)) {
return name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I would recommend you just return Fail.

node.name = name;
const init = parseInitializer(/*inParameter*/ false, inSpeculation);
if (isFail(init)) {
return init;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I would recommend you just return Fail.

node.name = parseIdentifierOrPattern();
const name = parseIdentifierOrPattern(inSpeculation);
if (isFail(name)) {
return name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I would recommend you just return Fail.

}
const init = parseInitializer(/*inParameter*/ false, inSpeculation);
if (isFail(init)) {
return init;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I would recommend you just return Fail.

Copy link
Contributor

@rbuckton rbuckton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few more minor comments but they are not critical. This looks good.

@@ -48,6 +48,7 @@ namespace ts {
// Invokes the provided callback then unconditionally restores the scanner to the state it
// was in immediately prior to invoking the callback. The result of invoking the callback
// is returned from this function.
// Present for backwards compatibility only.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still true?

@@ -2271,7 +2295,13 @@ namespace ts {
isStartOfType(/*inStartOfParameter*/ true);
}

function parseParameter(requireEqualsToken?: boolean): ParameterDeclaration {
function parseParameterInSpeculation(): ParameterDeclaration | Fail {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not important, but I'd be tempted to call this one tryParseParameter and the other one parseParameter. We generally use tryParse to indicate speculation vs. parse to indicate definitive parsing, though that's not always the case.

@@ -5227,18 +5265,38 @@ namespace ts {

// DECLARATIONS

function parseArrayBindingElement(): ArrayBindingElement {
function parseArrayBindingElementInSpeculation(): ArrayBindingElement | Fail {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I'd recommend tryParseArrayBindingElement and parseArrayBindingElement.

return finishNode(node);
}

function parseObjectBindingElement(): BindingElement {
function parseObjectBindingElementInSpeculation(): BindingElement | Fail {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I'd recommend tryParseObjectBindingElement and parseObjectBindingElement.

@ghost ghost merged commit e7df832 into master Nov 13, 2017
@ghost ghost deleted the giveUpSpeculation branch November 13, 2017 17:18
ghost pushed a commit that referenced this pull request Nov 13, 2017
ghost pushed a commit that referenced this pull request Nov 13, 2017
@microsoft microsoft locked and limited conversation to collaborators Jun 14, 2018
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants