Break out of speculative parsing on bad parameter initializer #19158

ghost · 2017-10-13T16:12:47Z

Fixes #19134
Sequel to #18417

This will break out of speculative parsing immediately if we see a bad initializer, rather than generating the entire AST and then iterating over it to look for things which failed parsing.

This seems to result in an average 1% decrease in parse time, although results varied greatly when I tested (-3.61%, +5.79%, -2.08%, +1.63%, -6.47%, -1.23%).

sandersn · 2017-10-13T17:15:07Z

Are those results for trials of the same input or different inputs?

ghost · 2017-10-13T21:33:08Z

In order, those are the numbers for:

Unions - node (v8.2.1, x64)
Unions - tsc (x86)
Monaco - node (v8.2.1, x64)
Monaco - tsc (x86)
TFS - node (v8.2.1, x64)
TFS - tsc (x86)

I made a new commit that removed uses of finally. It seems to have better performance -- a 3% improvement vs the master branch. On two runs:

-0.84, -0.48, -6.76, -4.45, -9.04, -1.66
-0.72, -0.48, -5.25, -3.88, -8.29, -2.89

ghost · 2017-10-17T17:17:11Z

@sandersn Any comments?

sandersn · 2017-10-17T18:20:47Z

So it sounds like performance is 4% better on node and 2% worse on tsc. This is likely fine, given how few people use anything besides node. @rbuckton would you agree?

Exceptions-for-control-flow seems like something that would get slower in future updates of V8 rather than faster (if it changes at all, of course).

sandersn

The code looks OK, and I think the performance boost for node on V8 outweighs the mixed results for tsc on Chakra.

However, I'd like to have @rbuckton's opinion on the addition of an exception for control flow before merging.

sandersn · 2017-10-17T18:22:55Z

src/compiler/parser.ts

@@ -17,6 +17,14 @@ namespace ts {
    let IdentifierConstructor: new (kind: SyntaxKind, pos: number, end: number) => Node;
    let SourceFileConstructor: new (kind: SyntaxKind, pos: number, end: number) => Node;

+    let inSpeculation = false;
+    const GIVE_UP_SPECULATION = {};


extremely minor: STOP_SPECULATION sounds better to me

rbuckton · 2017-10-17T18:39:41Z

I'm generally not in favor of using exceptions for control flow and would prefer to keep them reserved for unexpected errors or asserts. @mhegazy can weigh in on his thoughts, but I'd prefer a solution that doesn't depend on exceptions for this.
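
For readers following the trade-off, here is a minimal sketch of the two styles under discussion: unwinding with a thrown marker object versus returning a sentinel that each caller checks. Every name below (STOP_SPECULATION, FAIL_MARKER, the toy digit parser) is hypothetical and chosen for illustration; none of it is taken from parser.ts.

```typescript
// Hypothetical toy parser; not code from the TypeScript compiler.

// Style 1: exception for control flow. A unique object is thrown to unwind.
const STOP_SPECULATION = {};

function parseDigitsThrowing(text: string): number {
    if (!/^[0-9]+$/.test(text)) {
        throw STOP_SPECULATION;
    }
    return Number(text);
}

function speculateThrowing(text: string): number | undefined {
    try {
        return parseDigitsThrowing(text);
    }
    catch (e) {
        if (e === STOP_SPECULATION) {
            return undefined; // speculation failed, nothing unexpected happened
        }
        throw e; // a genuine error: keep propagating it
    }
}

// Style 2: sentinel return value. Every caller on the path checks for it.
interface FailMarker { readonly failed: true; }
const FAIL_MARKER: FailMarker = { failed: true };

function isFailMarker(x: unknown): x is FailMarker {
    return x === FAIL_MARKER;
}

function parseDigitsReturning(text: string): number | FailMarker {
    return /^[0-9]+$/.test(text) ? Number(text) : FAIL_MARKER;
}

function speculateReturning(text: string): number | undefined {
    const result = parseDigitsReturning(text);
    return isFailMarker(result) ? undefined : result;
}
```

The second style costs a check at every call site between the failure and the speculation boundary, which is what the later review comments about Fail and isFail are concerned with.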

ghost · 2017-10-20T20:50:17Z

Test failure fixed by #19387.
@rbuckton Please review again.

ghost · 2017-10-30T18:39:18Z

@rbuckton

ghost · 2017-11-02T16:47:49Z

@rbuckton

rbuckton · 2017-11-10T20:44:35Z

src/compiler/scanner.ts

            }
            return result;
        }

+        function startSpeculation(): SpeculationReset {


Was there a reason to use this approach over the previous approach? The previous approach guaranteed that saving/resetting state was properly balanced and didn't depend on an external party to remember to reset the state where necessary. Also, this results in object allocations every time we speculate, which could get expensive given how often we do speculation. I'm curious what the performance impact of this approach is over the previous approach.

One approach to avoid excessive allocations would be to reuse the same SpeculationReset object for shallow speculation, or maintain a pool of SpeculationReset objects to use for this purpose.
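
A minimal sketch of the pooling idea, assuming a scanner whose saved state is just two numbers. SavedScanState, scanPos, scanStartPos, and the two functions are hypothetical stand-ins, not the real scanner fields or API; the allocation counter exists only to make the effect observable.

```typescript
// Hypothetical, reduced scanner state: only two fields are tracked here.
interface SavedScanState {
    savedPos: number;
    savedStartPos: number;
}

let scanPos = 0;
let scanStartPos = 0;

const freeStates: SavedScanState[] = []; // objects ready for reuse
let statesAllocated = 0;                 // instrumentation for this sketch only

function startToySpeculation(): SavedScanState {
    const pooled = freeStates.pop();
    if (pooled !== undefined) {
        // Reuse an existing object instead of allocating a new one.
        pooled.savedPos = scanPos;
        pooled.savedStartPos = scanStartPos;
        return pooled;
    }
    statesAllocated++;
    return { savedPos: scanPos, savedStartPos: scanStartPos };
}

function endToySpeculation(saved: SavedScanState, accept: boolean): void {
    if (!accept) {
        scanPos = saved.savedPos;
        scanStartPos = saved.savedStartPos;
    }
    freeStates.push(saved); // hand the object back for the next speculation
}
```

Nested speculation still works under this scheme because each active speculation holds its own object until it ends; the pool only grows to the maximum nesting depth reached.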

Another approach would be to have an array to use as a stack for each state variable and push/pop onto the stack, i.e.:

    const posStack: number[] = [];
    const startPosStack: number[] = [];
    // ...
    function startSpeculation() {
        posStack.push(pos);
        startPosStack.push(startPos);
        // ...
    }
    /** @param accept If true, accept the current state; otherwise, reset to the prior state */
    function endSpeculation(accept: boolean) {
        const savePos = posStack.pop();
        const saveStartPos = startPosStack.pop();
        // ...
        if (!accept) {
            pos = savePos;
            startPos = saveStartPos;
            // ...
        }
    }

I've reverted these changes. You can try that approach in a separate PR if you like.

rbuckton · 2017-11-10T20:50:42Z

src/compiler/parser.ts

-            signature.parameters = parseParameterList(flags);
+            const parameters = parseParameterList(flags, inSpeculation);
+            if (isFail(parameters)) {
+                return Fail;


Does this need to return Fail? It could just return boolean.

rbuckton · 2017-11-10T20:52:33Z

src/compiler/parser.ts

-            fillSignature(SyntaxKind.ColonToken, isAsync | (allowAmbiguity ? SignatureFlags.None : SignatureFlags.RequireCompleteParameterList), node);
+
+            const sigFail = fillSignature(SyntaxKind.ColonToken, isAsync | (inSpeculation ? SignatureFlags.RequireCompleteParameterList : SignatureFlags.None), node, inSpeculation);
+            if (isFail(sigFail)) {


No need for the local, inline sigFail.

rbuckton · 2017-11-10T20:53:36Z

src/compiler/parser.ts

+        }
+
+        function parseParameter(): ParameterDeclaration;
+        function parseParameter(inSpeculation?: boolean): ParameterDeclaration | Fail;


inSpeculation shouldn't be optional in this overload.
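
A small sketch of why the required parameter matters, using hypothetical ParamDecl and FailNode types and a toy parseParam rather than the real declarations. With inSpeculation required in the second overload, a call without the argument can only match the first overload, so its result never needs a Fail check.

```typescript
// Hypothetical stand-ins for ParameterDeclaration and Fail.
interface ParamDecl { kind: "parameter"; paramName: string; }
interface FailNode { kind: "fail"; }
const FAIL_NODE: FailNode = { kind: "fail" };

// No argument: the result can never be the failure sentinel.
function parseParam(text: string): ParamDecl;
// Argument required: the caller opts in to handling failure.
function parseParam(text: string, inSpeculation: boolean): ParamDecl | FailNode;
function parseParam(text: string, inSpeculation?: boolean): ParamDecl | FailNode {
    const valid = /^[A-Za-z_]\w*$/.test(text);
    if (!valid && inSpeculation) {
        return FAIL_NODE; // bail out early while speculating
    }
    // Outside speculation, recover with an empty name instead of failing.
    return { kind: "parameter", paramName: valid ? text : "" };
}

// Typed as ParamDecl: only the first overload applies.
const recovered: ParamDecl = parseParam("1bad");
// Typed as ParamDecl | FailNode: the second overload applies.
const speculative = parseParam("1bad", true);
```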

rbuckton · 2017-11-10T20:55:06Z

src/compiler/parser.ts

-                    list.push(parseListElement(kind, parseElement));
+                    const elem = parseListElement(kind, parseElement, inSpeculation);
+                    if (isFail(elem)) {
+                        return Fail;


Returning Fail here makes parseDelimitedList and anything that calls it polymorphic, since Fail isn't an array. Consider adding something like a ListFail sentinel value that has the properties of a NodeArray to reduce polymorphism.
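
One way this suggestion could look, as a sketch only: the list sentinel is itself an array carrying pos and end, so every return value of the list parser is an array. AstNodeArray, createToyNodeArray, LIST_FAIL, and parseCommaList are hypothetical names, and the comma-splitting "parser" is a toy.

```typescript
// Hypothetical stand-in for NodeArray<T>: an array with position info.
interface AstNodeArray<T> extends Array<T> {
    pos: number;
    end: number;
}

function createToyNodeArray<T>(elements: T[], pos: number, end: number): AstNodeArray<T> {
    const array = elements as AstNodeArray<T>;
    array.pos = pos;
    array.end = end;
    return array;
}

// The failure sentinel is a real (empty) node array, built the same way.
const LIST_FAIL: AstNodeArray<never> = createToyNodeArray<never>([], -1, -1);

function isListFail(list: AstNodeArray<unknown>): boolean {
    return list === LIST_FAIL;
}

// Toy list parser: an empty element counts as a parse failure.
function parseCommaList(text: string, inSpeculation: boolean): AstNodeArray<string> {
    const items: string[] = [];
    for (const piece of text.split(",")) {
        const item = piece.trim();
        if (item === "" && inSpeculation) {
            return LIST_FAIL; // still an array, so callers see one shape
        }
        items.push(item);
    }
    return createToyNodeArray(items, 0, text.length);
}
```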

rbuckton · 2017-11-10T20:56:16Z

src/compiler/parser.ts

-            const result = isLookAhead
-                ? scanner.lookAhead(callback)
-                : scanner.tryScan(callback);
+            const reset = scanner.startSpeculation();


I'm concerned about the performance cost of this approach. See my comments in scanner.ts.

rbuckton · 2017-11-10T20:59:56Z

src/compiler/parser.ts

@@ -17,6 +17,13 @@ namespace ts {
    let IdentifierConstructor: new (kind: SyntaxKind, pos: number, end: number) => Node;
    let SourceFileConstructor: new (kind: SyntaxKind, pos: number, end: number) => Node;

+    interface Fail { __FAIL: void; }
+    /** Only value of type Fail. */
+    const Fail: Fail = { __FAIL: undefined };


Fail may introduce added polymorphism because it doesn't have the same hidden class as a Node. I would recommend the Fail sentinel be a Node instance (possibly using SyntaxKind.Unknown), though you would have to wait to allocate Fail until parse time to ensure the correct objectAllocator instance from services has already been loaded (if present). That would ensure Fail has the same shape and ordering of properties that any other Node instance would start with.
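
A rough illustration of the shape difference, assuming a toy ToyNode class in place of the real Node constructor obtained from objectAllocator. Own-property order is used here only as a visible proxy; how an engine actually assigns hidden classes is an internal detail this sketch does not measure.

```typescript
// Hypothetical stand-ins for SyntaxKind and the Node constructor.
enum ToyKind { Unknown, Identifier }

class ToyNode {
    kind: ToyKind;
    pos: number;
    end: number;
    constructor(kind: ToyKind, pos: number, end: number) {
        this.kind = kind;
        this.pos = pos;
        this.end = end;
    }
}

// A literal sentinel has none of the properties a node starts with.
const literalFail = { __FAIL: undefined };

// A sentinel built by the node constructor starts with the same
// properties, in the same order, as any other node.
const nodeFail = new ToyNode(ToyKind.Unknown, -1, -1);
const ordinaryNode = new ToyNode(ToyKind.Identifier, 0, 3);

function propertyLayout(value: object): string {
    return Object.keys(value).join(",");
}
```

Comparing propertyLayout(nodeFail) with propertyLayout(ordinaryNode) gives the same string, while propertyLayout(literalFail) differs.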

rbuckton · 2017-11-10T21:00:35Z

src/compiler/parser.ts

+    interface Fail { __FAIL: void; }
+    /** Only value of type Fail. */
+    const Fail: Fail = { __FAIL: undefined };
+    function isFail(x: {} | null | undefined | void): x is Fail {


How is {} | null | undefined | void really any different from any in this case? This seems unnecessarily verbose.

rbuckton · 2017-11-10T21:04:06Z

src/compiler/parser.ts

-        function parseDelimitedList<T extends Node>(kind: ParsingContext, parseElement: () => T, considerSemicolonAsDelimiter?: boolean): NodeArray<T> {
+        function parseDelimitedList<T extends Node>(kind: ParsingContext, parseElement: () => T, considerSemicolonAsDelimiter?: boolean, inSpeculation?: false): NodeArray<T>;
+        function parseDelimitedList<T extends Node>(kind: ParsingContext, parseElement: (inSpeculation: boolean) => T | Fail, considerSemicolonAsDelimiter?: boolean, inSpeculation?: boolean): NodeArray<T> | Fail;
+        function parseDelimitedList<T extends Node>(kind: ParsingContext, parseElement: (inSpeculation: boolean) => T | Fail, considerSemicolonAsDelimiter?: boolean, inSpeculation?: boolean): NodeArray<T> | Fail {


Do we need to thread inSpeculation here? The function passed as the parseElement parameter could easily be a wrapper that parses the element with inSpeculation set to true, similar to what parseParameterNoSpeculation below does for the inverse.
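
A sketch of the wrapper idea under simplified assumptions: the list parser takes a zero-argument callback and never sees an inSpeculation flag, while two small wrappers fix the mode. The single-character "tokens", FAIL_TOK, and all function names are hypothetical.

```typescript
interface FailTok { readonly failed: true; }
const FAIL_TOK: FailTok = { failed: true };

function isFailTok(x: unknown): x is FailTok {
    return x === FAIL_TOK;
}

// Toy input: each character is one element; "?" is a bad element.
let toySource = "";
let toyCursor = 0;

function parseElem(inSpeculation: boolean): string | FailTok {
    const token = toySource.charAt(toyCursor++);
    if (token === "?" && inSpeculation) {
        return FAIL_TOK;
    }
    return token;
}

// The mode lives in the wrapper, not in the list parser's signature.
const parseElemInSpeculation = () => parseElem(true);
const parseElemNoSpeculation = () => parseElem(false);

function parseToyList(parseElement: () => string | FailTok): string[] | FailTok {
    const result: string[] = [];
    while (toyCursor < toySource.length) {
        const elem = parseElement();
        if (isFailTok(elem)) {
            return FAIL_TOK; // propagate failure without knowing the mode
        }
        result.push(elem);
    }
    return result;
}

function runToyList(text: string, parseElement: () => string | FailTok): string[] | FailTok {
    toySource = text;
    toyCursor = 0;
    return parseToyList(parseElement);
}
```

For example, runToyList("a?b", parseElemInSpeculation) stops at the bad element, while runToyList("a?b", parseElemNoSpeculation) collects all three.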

rbuckton · 2017-11-10T21:04:46Z

src/compiler/parser.ts

@@ -2243,7 +2260,13 @@ namespace ts {
                isStartOfType(/*inStartOfParameter*/ true);
        }

-        function parseParameter(requireEqualsToken?: boolean): ParameterDeclaration {
+        function parseParameterNoSpeculation(): ParameterDeclaration {


Is this wrapper necessary, since the default behavior is the same as parseParameter()?

Removed parseParameter; callers should now always use this or parseParameterInSpeculation.

ghost · 2017-11-10T22:36:17Z

@rbuckton I think I've responded to all of your review now.

rbuckton · 2017-11-10T22:44:13Z

src/compiler/parser.ts

-        return x === Fail;
+    interface Fail extends Node { kind: SyntaxKind.Unknown; }
+    function fail(): Fail {
+        return createNode(SyntaxKind.Unknown) as Fail;


I still think that fail and failList should be singleton values local to the Parser namespace. You can easily initialize them for the first time in initializeState so that we don't create excess objects when speculative parsing fails.
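
A minimal sketch of the singleton arrangement, assuming a toy parser. ToyAst, toyFail, initializeToyState, and parseToyName are hypothetical, and the allocation counter is there only so the behaviour can be checked. The sentinel is created once, on first initialization, and failure is detected by identity.

```typescript
interface ToyAst { kind: string; }

let toyFail: ToyAst | undefined; // singleton, created on first initialization
let toyFailAllocations = 0;      // instrumentation for this sketch only

function initializeToyState(): void {
    if (toyFail === undefined) {
        toyFail = { kind: "unknown" };
        toyFailAllocations++;
    }
}

// Identity check: no property lookup on x.
function isToyFail(x: ToyAst | undefined): boolean {
    return toyFail !== undefined && x === toyFail;
}

function parseToyName(text: string): ToyAst {
    if (!/^[a-z]+$/.test(text)) {
        if (toyFail === undefined) {
            throw new Error("initializeToyState was not called");
        }
        return toyFail; // the same object on every failure
    }
    return { kind: "identifier" };
}
```

However many speculative parses fail, only one sentinel object exists, in contrast to a fail() helper that calls createNode on each failure.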

rbuckton

A few more recommendations.

rbuckton · 2017-11-10T23:02:45Z

src/compiler/parser.ts

+        let Fail: Fail;
+        let FailList: FailList;
+        function isFail(x: Node | undefined): x is Fail {
+            return !!x && x.kind === SyntaxKind.Unknown;


return Fail !== undefined && x === Fail; doesn't require the property lookup and verifies that Fail has a value.

rbuckton · 2017-11-10T23:03:11Z

src/compiler/parser.ts

+            return !!x && x.kind === SyntaxKind.Unknown;
+        }
+        function isFailList(x: NodeArray<Node> | undefined): x is FailList {
+            return !!x && x.pos === -1;


return FailList !== undefined && x === FailList; doesn't require the property lookup and verifies that FailList has a value.

rbuckton · 2017-11-10T23:04:00Z

src/compiler/parser.ts

-            node.name = parseIdentifierOrPattern();
+            const name = parseIdentifierOrPattern(inSpeculation);
+            if (isFail(name)) {
+                return name;