Parser redesign #880
Use the @abstractmethod decorator on abstract methods which are implemented in all subtypes. Part of python#730.
This helps to point out the offending unimplemented abstract methods in subtypes. Part of python#730.
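As a minimal sketch of what the decorator buys at runtime, the example below uses the standard-library `abc` module. The `Visitor`, `CompleteVisitor`, and `IncompleteVisitor` classes and the `try_instantiate` helper are hypothetical names for illustration only; they are not mypy's actual visitor classes.

```python
from abc import ABC, abstractmethod


class Visitor(ABC):
    """Hypothetical visitor base class (not mypy's real TypeVisitor)."""

    @abstractmethod
    def visit_int(self, value: int) -> str:
        ...

    @abstractmethod
    def visit_str(self, value: str) -> str:
        ...


class CompleteVisitor(Visitor):
    """Implements every abstract method, so it can be instantiated."""

    def visit_int(self, value: int) -> str:
        return f"int:{value}"

    def visit_str(self, value: str) -> str:
        return f"str:{value}"


class IncompleteVisitor(Visitor):
    """Forgets visit_str, so the class itself remains abstract."""

    def visit_int(self, value: int) -> str:
        return f"int:{value}"


def try_instantiate(cls: type) -> str:
    """Return "ok" or the TypeError text raised on instantiation."""
    try:
        cls()
    except TypeError as exc:
        return f"error: {exc}"
    return "ok"


if __name__ == "__main__":
    print(try_instantiate(CompleteVisitor))
    # The error text names the missing method (visit_str).
    print(try_instantiate(IncompleteVisitor))
```

The point of the sketch is that the failure is reported when the incomplete subtype is created, and the message names the unimplemented method, rather than surfacing later as a missing-attribute error at some call site.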
8372f6c to 9cb9364
@JukkaL I tried dropping o11c@569d17e, but it looks like it's still necessary. I have no idea what it should actually be doing, though.
Okay, rebased on top of the rebased driver. Haven't worked on lint issues yet - still have testcases to write first.
I still can't get it to work in standalone parser mode:
Can you repro the issue?
Ah, that's support for
Thinking about merging the trees for testlistnocond, testlist, and sequenceitemlist, maybe slices ... and compfor/compif (just keep the different parsers) ... python3.5 is so inconsistent, e.g.
Ok, I gave this a cursory look. Here are just some general high-level observations. As mentioned below, reviewing this will be a huge amount of work, and in order to make my time investment worthwhile this needs to improve things in general for mypy. I can't do a detailed review until it works and it looks likely that it will be a net benefit (benefit of having the PR minus the cost of reviewing and related overhead). I'm happy to give high-level feedback (similar to this) while you are working on this, however, now that I've finally merged the other PRs.
Performance
I mentioned previously that type checking performance is important. I still have no good idea of how this performs relative to the old parser, as I haven't been able to try it with any non-trivial pieces of code. I'd recommend addressing this first by posting some benchmarks (and instructions for how to run those benchmarks). Currently one of the biggest obstacles for wider use of mypy is type checking speed. If we get new features, it's okay to sacrifice a little speed, but even a 10% performance regression (for the whole type checker, which would mean maybe 30% parser performance regression) would be painful. The original parser wasn't written for efficiency really, so it may be possible to actually improve performance. I hope that's the case!
Size
This thing is frickin' huge! I didn't realize this until now. It's 11 kLOC (including tests) whereas the current mypy master (everything under
Clarity/documentation
I couldn't understand how the implementation works at a high level. Code clarity needs to be at least similar to the rest of the mypy implementation, and it isn't quite there yet. Adding comments and docstrings would probably help, but the most important thing would be to write a top-level summary or design doc of how things work and interact with the rest of mypy, with concrete examples.
The current mypy implementation is mostly optimized for ease of modification and understanding, and I want to preserve that.
Scope
This seems to address many separate issues in one go. As I've mentioned before, I'd much prefer having changes done incrementally rather than in a rewrite-the-world fashion. Addressing a single issue at a time makes things much easier to review and makes it easier to evaluate the impact (cost/benefit) to mypy. Maybe it's too late for this PR, but I hope any future changes approach things more incrementally. So in general a PR should be mostly about refactoring something (specific) or mostly adding a feature, but preferably not both. This is clearly refactoring (it rewrites a ton of stuff) but it also adds features.
Style
I'm worried that the design doesn't seem to quite follow the style used in the rest of mypy. Again, stylistic cohesion is important for the maintainability of mypy. Some individual things:
Costs/benefits
I'd like to see a short summary of the tradeoffs in the parser design, i.e., what it does better than the old parser (and how), with examples.
Concrete next steps
In summary, to avoid a bad experience similar to the one with the test driver, I suggest these concrete steps (in this order, before further work on this PR):
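One way the benchmark request in the Performance section above could be approached is sketched below. It is only an illustration: `benchmark`, `relative_change`, and `SOURCE` are hypothetical names, and the standard-library `ast.parse` is used purely as a placeholder for both the old and the new parser entry points, which are not shown in this thread.

```python
import ast
import timeit
from typing import Callable


def benchmark(parse: Callable[[str], object], source: str,
              number: int = 20, repeat: int = 5) -> float:
    """Return the best observed time in seconds for one parse(source) call."""
    timings = timeit.repeat(lambda: parse(source), number=number, repeat=repeat)
    # The minimum is the run least disturbed by other system activity.
    return min(timings) / number


def relative_change(old: float, new: float) -> float:
    """Fractional slowdown (positive) or speedup (negative) of new vs. old."""
    return (new - old) / old


# Synthetic input: a module containing many small functions.
SOURCE = "\n".join(f"def f{i}(x):\n    return x + {i}\n" for i in range(200))

if __name__ == "__main__":
    # Placeholder: ast.parse stands in for both parsers. Swap in the real
    # old and new entry points to compare them.
    old_time = benchmark(ast.parse, SOURCE)
    new_time = benchmark(ast.parse, SOURCE)
    print(f"old: {old_time * 1e3:.3f} ms per parse")
    print(f"new: {new_time * 1e3:.3f} ms per parse")
    print(f"change: {relative_change(old_time, new_time):+.1%}")
```

Running both parsers over the same real-world files (rather than a synthetic module) would give numbers closer to the whole-checker figure the review asks about; the synthetic input here only keeps the sketch self-contained.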
I'm not as likely to get this done as soon as I was thinking - there are some error cases that don't generate messages nearly as good as they should. That said, one thing I would like to do is replace more of the linear searches with hash lookups. The three major benefits of the new lexer/parser are:
I'm not sure what problem you have with scope - I hardly see any unrelated things being forced together here, though it probably would be possible to use the new lexer with the old parser if you really wanted to force that. Note that I have some local work in progress to refactor out
I am quite aware that I haven't done benchmarking yet, but even with what I have, the (end-user) usability is greatly improved.
The tree names I used follow the ones in the Python documentation; I did not consult mypy's previous names at all. Unfortunately, Python's documentation contradicts itself with the names ...
Let's look at the number of lines of code (numbers from my working copy, might not line up exactly with the pushed version), by component:
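The idea of replacing linear searches with hash lookups, mentioned earlier in this comment, can be illustrated with a small sketch. This assumes a lexer-style keyword check as the example; where the PR's actual linear searches occur is not stated in the thread, and `is_keyword_linear` and `is_keyword_hashed` are hypothetical names.

```python
import keyword

# Linear search: every lookup scans the list, O(n) in the number of keywords.
KEYWORD_LIST = list(keyword.kwlist)


def is_keyword_linear(name: str) -> bool:
    for kw in KEYWORD_LIST:
        if kw == name:
            return True
    return False


# Hash lookup: membership in a frozenset is O(1) on average.
KEYWORD_SET = frozenset(keyword.kwlist)


def is_keyword_hashed(name: str) -> bool:
    return name in KEYWORD_SET


if __name__ == "__main__":
    for word in ("while", "lambda", "parser"):
        # Both functions must agree; only the lookup cost differs.
        assert is_keyword_linear(word) == is_keyword_hashed(word)
        print(word, is_keyword_hashed(word))
```

Both functions return the same answers, so a change like this is behavior-preserving and can be measured in isolation.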
Add classes for better dialect control
Catch incomplete TypeVisitor implementations
Don't merge yet! See below. The basic design is sound though.
This is a massive rewrite of the parser infrastructure to provide more useful information.
Features:
Todo:
mypy.syntax.parser
entry-point public)