Skip to content

Memoize some rules to avoid exponential time in deep nesting #39

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 4, 2020

Conversation

pablogsal
Copy link

Additionally, fix the script to calculate the nesting level using the
generated extension.

Closes: #37

@pablogsal
Copy link
Author

The test are failing to download openssl for some reason... :S

@lysnikolaou
Copy link
Member

Merging upstream master would probably solve the problem.

@pablogsal
Copy link
Author

pablogsal commented Apr 3, 2020

Merging upstream master would probably solve the problem.

Hummm, merging upstream master in this PR creates a mess. Do you mean merging upstream/master in the pegen branch?

@lysnikolaou
Copy link
Member

lysnikolaou commented Apr 3, 2020

Merging upstream master would probably solve the problem.

Hummm, merging upstream master in this PR creates a mess. Do you mean merging upstream/master in the pegen branch?

Yes. And then merging pegen into this branch. That's how I do it.

@lysnikolaou
Copy link
Member

Sorry for being a bit abrupt amd not explaining things carefully. I'm on my phone.

@pablogsal
Copy link
Author

Sorry for being a bit abrupt amd not explaining things carefully. I'm on my phone.

No problem, I know that feeling very well :)

@pablogsal
Copy link
Author

Seems that we need to update pegen to account for python#19283

@pablogsal pablogsal force-pushed the fix_nesting branch 2 times, most recently from de10536 to 320c0c5 Compare April 3, 2020 20:03
@lysnikolaou
Copy link
Member

What about del_targets? Does something like del <'a.a'*20> parse normally?

@pablogsal
Copy link
Author

pablogsal commented Apr 3, 2020

What about del_targets? Does something like del <'a.a'*20> parse normally?

I tried with:

del a.a.a.a.a......a.a (more than 4000) and seems to parse fine. Could you double check?

@pablogsal
Copy link
Author

pablogsal commented Apr 3, 2020

What about del_targets? Does something like del <'a.a'*20> parse normally?

I tried and something like f"del {'(' * nesting_depth}x,{'),' * nesting_depth}" also has exponential time, so we need to memorize del_targets as well.

Edit: PR updated

Additionally, fix the script to calculate the nesting level using the
generated extension.
Copy link
Member

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Also, really good catch! Thanks, Pablo!

Edit: I forgot something. Before we merge this, have you measured what the memory cost is for these changes? Maybe the PEP will need updating.

@pablogsal
Copy link
Author

Edit: I forgot something. Before we merge this, have you measured what the memory cost is for these changes?

It increased 0.62% parsing xxl.txt and 0.83% parsing ((((((((((((((((((((1)))))))))))))))))))).

@pablogsal
Copy link
Author

Checked also with the largest python file in the standard library (Lib/test/test_socket.py) and it increased the memory 1.58%.

@pablogsal
Copy link
Author

Are you ok with these results for landing this? :)

@lysnikolaou
Copy link
Member

How does it compare to the current parser now? Still a little bit ahead?

@pablogsal
Copy link
Author

How does it compare to the current parser now? Still a little bit ahead?

The speed seems untouched (maybe a bit faster due to the extra memoization) if you parse something that relies heavily on target, del_target or star_targer . It would be good if you could confirm these findings on your side when possible

@lysnikolaou
Copy link
Member

How does it compare to the current parser now? Still a little bit ahead?

The speed seems untouched (maybe a bit faster due to the extra memoization) if you parse something that relies heavily on target, del_target or star_targer . It would be good if you could confirm these findings on your side when possible

Of course, I'm working on a bug-fix for #38 and I'll verify everything right afterwards. Shouldn't be long.

@lysnikolaou lysnikolaou closed this Apr 3, 2020
@lysnikolaou
Copy link
Member

Oops! Sorry, accidentally pushed the wrong button.

@lysnikolaou lysnikolaou reopened this Apr 3, 2020
Copy link

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I discovered that script was broken when moving the code into the cpython repo, but didn't take time to look into it.

Any chance of also committing the script you used to plot the times? (In a separate PR if you need to clean it up first.)

@lysnikolaou
Copy link
Member

Verified everything. My measurements are very similar to those by @pablogsal. Everything that was too difficult to parse before, can be parsed smoothly now. So let's land this.

@gvanrossum
Copy link

I confirmed the quadratic behavior as well, and the fix works. Land it.

@pablogsal
Copy link
Author

Any chance of also committing the script you used to plot the times? (In a separate PR if you need to clean it up first.)

I will make another PR once I clean it up a bit :)

@pablogsal pablogsal merged commit 74736fe into pegen Apr 4, 2020
@pablogsal pablogsal deleted the fix_nesting branch April 4, 2020 16:51
@pablogsal
Copy link
Author

Thanks for the reviews and the confirmation of the problem!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Should we restrict somehow the nesting?
3 participants