Skip to content

Two optimizations for load_graph() #4294

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 30, 2017
Merged

Conversation

gvanrossum
Copy link
Member

Profiling load_graph() found that there were some unnecessary stat() calls going on, and, more importantly, that Options.clone_for_module() is very inefficient if you have many sections in your mypy.ini file. I solved the latter by introducing a cache. An alternative design would move the cache into BuildManager -- that would avoid a reference cycle, but it turns out that having the cache in the root Options object means that it survives between dmypy check runs, and that's a nice win there. (This explains at least partly why load_graph() was slow even with everything loaded in memory.)

@JukkaL

Guido van Rossum added 3 commits November 29, 2017 20:28
Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor things. Feel free to merge after you've updated the PR or you can merge as is and create a follow-up PR.

mypy/options.py Outdated
@@ -177,17 +180,22 @@ def __repr__(self) -> str:
return 'Options({})'.format(pprint.pformat(self.__dict__))

def clone_for_module(self, module: str) -> 'Options':
res = self.clone_cache.get(module)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add docstring that mentions that after this method has been called the options object should be treated as immutable, because otherwise the cache would become stale.

@@ -1056,11 +1058,10 @@ def validate_meta(meta: Optional[CacheMeta], id: str, path: Optional[str],

# TODO: Share stat() outcome with find_module()
path = os.path.abspath(path)
# TODO: Don't use isfile() but check st.st_mode
if not os.path.isfile(path):
st = manager.get_stat(path) # TODO: Errors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding a is_file method to BuildManager instead and using it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bigger effort, we'd have to give BuildManager a cache results and use that everywhere. There's also an inefficiency in find_module(), which seems to make way to many stat() and listdir() calls, which I haven't tracked down.

@gvanrossum gvanrossum merged commit 9779e18 into python:master Nov 30, 2017
@gvanrossum gvanrossum deleted the opt-build branch November 30, 2017 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants