Skip to content

getSourceFileOfNode function implementation #12053

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jmlopez-rod opened this issue Nov 4, 2016 · 11 comments
Closed

getSourceFileOfNode function implementation #12053

jmlopez-rod opened this issue Nov 4, 2016 · 11 comments
Labels
Bug A bug in TypeScript Fixed A PR has been merged for this issue

Comments

@jmlopez-rod
Copy link

This is more of a question about the internals of typescript. For some reason I ended up with a Node object which had no parent. I'm pretty sure this is due to the way I configured the compiler using the API but this led me to the following function in the typescript source:

    function getSourceFileOfNode(node) {
        while (node && node.kind !== 256 /* SourceFile */) {
            node = node.parent;
        }
        return node;
    }

This function is eventually called when I use

NodeObject.prototype.getStart = function (sourceFile, includeJsDocComment) {
    return ts.getTokenPosOfNode(this, sourceFile, includeJsDocComment);
};

getStart has all optional parameters and I never pass in the sourceFile so getSourceFileOfNode is always triggered.

Question:

Is there a reason why a node object doesn't have a reference to the source file object to which it belongs? We already store its kind, parent and other information, why not a reference to the sourceFile?

I know machines are fast, but going up the tree up to the source file every time we need to see the start of a node seems inefficient. Things could be faster especially for cases in which we have a tree with many levels. As an example consider this project: https://github.com/buzinas/tslint-eslint-rules

I've been writing a few rules here and I decided to check out just how many times the function gets called during the build process:

    var count = 0;
    var calls = 0;
    function getSourceFileOfNode(node) {
        calls ++;
        while (node && node.kind !== 256 /* SourceFile */) {
            count++;
            node = node.parent;
        }
        console.log('count: ', [calls, count]);
        return node;
    }

I modified the function to print out the number of calls to the function and the number of times we go up in the tree, the result:

count:  [ 19505, 78762 ]

So instead of accessing the source file 19505 times we have to execute 59,257 more checks and assignments. Would it decrease the efficiency of typescript if we were to add a reference to source file?

@mhegazy
Copy link
Contributor

mhegazy commented Nov 4, 2016

For some reason I ended up with a Node object which had no parent

how did you call createSourceFile?

@jmlopez-rod
Copy link
Author

This is how I set up the compiler:

 const servicesHost: ts.LanguageServiceHost = {
    getScriptFileNames: () => modifiedFiles,
    getScriptVersion: (fileName) => '',
    getCurrentDirectory: () => process.cwd(),
    getScriptSnapshot: (fileName) => {
      if (!fs.existsSync(fileName)) {
        return undefined;
      }

      return ts.ScriptSnapshot.fromString(fs.readFileSync(fileName).toString());
    },
    getCompilationSettings: () => tsOptions,
    getDefaultLibFileName: (options) => ts.getDefaultLibFilePath(options),
  };

  const services: ts.LanguageService = ts.createLanguageService(
    servicesHost,
    ts.createDocumentRegistry()
  );

  const program: ts.Program = ts.createProgram(modifiedFiles, tsOptions);
  const emitResult: ts.EmitResult = program.emit();

I haven't gotten much time to look into what is happening, I ended up passing program to TSLint and thats when I found out the error that a Node did not have a parent. I tested the same file with plain TSLint and their configuration works fine.

@jmlopez-rod
Copy link
Author

Anyway, there might be a bug but I'm not reporting it since I don't know yet how to duplicate it with a minimal set of information. The function just caught my eye and I was curious to see if you guys think adding a reference could give us a little performance.

@vladima
Copy link
Contributor

vladima commented Nov 4, 2016

The problem with adding extra property to Node is that it will increase the size of each and every Node object and currently keeping memory usage within bounds is much more complicated and brittle part of performance work we are doing.

@jmlopez-rod
Copy link
Author

@vladima Perhaps, but isn't that what browsers do? Do they have to traverse the whole tree just to get the root of the document? In a node, doesn't it make sense to have a reference to the parent, the children and the root node? Most if not all data structures have a reference to the root node.

@jmlopez-rod
Copy link
Author

jmlopez-rod commented Nov 4, 2016

@mhegazy I was able to recreate the problem.

//test.js
var ts = require('typescript');
var _ = require('lodash');
var fs = require('fs');

function compile() {
  var results = {};
  var tsOptions = {
    "noImplicitAny": false,
    "emitDecoratorMetadata": true,
    "experimentalDecorators": true,
    "noEmitHelpers": true
  };
  var outDirectory = '.';
  var modifiedFiles = ['./tmp.ts'];

  var servicesHost = {
    getScriptFileNames: function () { return modifiedFiles; },
    getScriptVersion: function (fileName) { return ''; },
    getCurrentDirectory: function () { return process.cwd(); },
    getScriptSnapshot: function (fileName) {
      if (!fs.existsSync(fileName)) {
        return undefined;
      }
      return ts.ScriptSnapshot.fromString(fs.readFileSync(fileName).toString());
    },
    getCompilationSettings: function () { return tsOptions; },
    getDefaultLibFileName: function (options) { return ts.getDefaultLibFilePath(options); }
  };
  var services = ts.createLanguageService(servicesHost, ts.createDocumentRegistry());
  var program = ts.createProgram(modifiedFiles, tsOptions);
  var emitResult = program.emit();
  var preDiagnostics = ts.getPreEmitDiagnostics(program);
  var allDiagnostics = preDiagnostics.concat(emitResult.diagnostics);
  var emittedFiles = program.getSourceFiles().filter(function (x) { return !_.includes(x.fileName, 'node_modules'); });
  var src = emittedFiles[0];
  checkParent(src);
}

function checkParent(node) {
  if (!node.parent) {
    console.log("node is an orphan:", node.kind);
  }
  _.each(node.getChildren(), function(x) { checkParent(x); });
}

compile();

And here is a short file which will contain orphan nodes

//tmp.ts
function rayIntersectsSegment(p, p1, p2) {
  var a, b, mAB, mAP, _ref;
  _ref = p1[1] < p2[1] ? [p1, p2] : [p2, p1], a = _ref[0], b = _ref[1];
  if (p[1] === b[1] || p[1] === a[1]) {
    p[1] += Number.MIN_VALUE;
  }
  if (p[1] > b[1] || p[1] < a[1]) {
    return false;
  } else if (p[0] > a[0] && p[0] > b[0]) {
    return false;
  } else if (p[0] < a[0] && p[0] < b[0]) {
    return true;
  } else {
    mAB = (b[1] - a[1]) / (b[0] - a[0]);
    mAP = (p[1] - a[1]) / (p[0] - a[0]);
    return mAP > mAB;
  }
}

Running it

$ node test.js 
node is an orphan: 256
node is an orphan: 53
node is an orphan: 54

The only orphan we should have is the source file. Yet we get two nodes that did not get a parent assigned.

@vladima
Copy link
Contributor

vladima commented Nov 4, 2016

I don't think there is a general "right" way to do things, usually it is all about pros and cons of every approach. In our case it was a deliberate decision not to put reference to the source file in every node because:

  • we don't want to pay the cost of increasing the node size even node especially since node is already pretty large.
  • it can be easily mitigated since usually source file can be passed as an argument

@jmlopez-rod
Copy link
Author

@vladima I agree, choosing between performance and memory is a tough choice. Unfortunately, the name getStart sounds as if it is a freeby. Meaning that the access has no cost. Either way, I will start writing future lint rules by providing the source file in whichever method that takes it.

@vladima
Copy link
Contributor

vladima commented Nov 4, 2016

BTW, thank you for raising the issue - the PR with the fix is out

@vladima vladima added the Bug A bug in TypeScript label Nov 4, 2016
@jmlopez-rod
Copy link
Author

That's awesome, thanks @vladima. One more thing before this issue gets closed. Is there any documentation that states that we should be providing the source file object for better performance? I always wondered why typescript was giving us the option of providing the source file, now it makes sense.

@vladima
Copy link
Contributor

vladima commented Nov 4, 2016

No, I don't think we have any documentation that specifically outlines this aspect.

@vladima vladima added the Fixed A PR has been merged for this issue label Nov 5, 2016
@vladima vladima closed this as completed Nov 5, 2016
@jmlopez-rod jmlopez-rod added this to the TypeScript 2.1.2 milestone Nov 7, 2016
@microsoft microsoft locked and limited conversation to collaborators Jun 19, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Bug A bug in TypeScript Fixed A PR has been merged for this issue
Projects
None yet
Development

No branches or pull requests

3 participants