regex is less efficient than it could be #14029
Comments
Here's the expanded code I'm seeing with
I'm having trouble identifying where you're seeing heap allocation. I only see it for the return value. How much juice you want to squeeze really depends on how much analysis you want to do on the regex and how many implementations of a matching algorithm you want to write. For example, the current algorithm accounts for simple matching, the position of a match and the location of submatches. To get a tighter function, you might want to split these three cases out into their own separate matching algorithms. This is also complicated by the fact that we're trying to guarantee
The other direction we could go in is building a DFA.
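To make the DFA direction concrete, here is a minimal sketch, assuming a hypothetical anchored pattern `^a+b$` (not taken from this issue). It only answers the yes/no match question, so it tracks no match positions or submatches and allocates nothing. The names `State`, `step`, and `dfa_is_match` are made up for this example and are not part of the regex crate.

```rust
/// States of a hand-built DFA for the hypothetical pattern `^a+b$`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum State {
    Start,  // nothing consumed yet
    SeenA,  // one or more 'a' consumed
    Accept, // "a+b" consumed; any further input is a mismatch
    Dead,   // no match is possible from here
}

/// Transition function: one state lookup per input byte, no backtracking.
fn step(state: State, byte: u8) -> State {
    match (state, byte) {
        (State::Start, b'a') => State::SeenA,
        (State::SeenA, b'a') => State::SeenA,
        (State::SeenA, b'b') => State::Accept,
        _ => State::Dead,
    }
}

/// Runs the DFA over the input bytes and reports whether it ends in `Accept`.
fn dfa_is_match(input: &str) -> bool {
    let mut state = State::Start;
    for byte in input.bytes() {
        state = step(state, byte);
        if state == State::Dead {
            return false; // bail out early once no match is possible
        }
    }
    state == State::Accept
}
```

Under these assumptions, `dfa_is_match("aaab")` is true and `dfa_is_match("aba")` is false. A general implementation would have to build the transition table from the parsed regex rather than write it by hand.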
In-tree regex was removed in #21458. This should probably be moved to https://github.com/rust-lang/regex
This issue has been moved to the regex repo: rust-lang/regex#26
Consider this code:
Ideally this would optimize away to a small function that just iterates over the string and checks for characters other than 'a'.
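As a rough illustration of the "small function" meant here, a hand-written version might look like the sketch below. The name `only_a` is hypothetical, and treating the empty string as a match is an assumption; a pattern that requires at least one 'a' would need an extra emptiness check.

```rust
/// Returns false as soon as a byte other than b'a' is found.
/// Scanning bytes is safe here because 'a' is ASCII, and in UTF-8
/// every byte of a multi-byte character is >= 0x80, so it can never
/// be mistaken for b'a'.
/// Assumption: the empty string counts as a match.
fn only_a(s: &str) -> bool {
    for byte in s.bytes() {
        if byte != b'a' {
            return false;
        }
    }
    true
}
```

This does no heap allocation and makes no calls in the loop, which is roughly the code one would hope the compiled regex reduces to.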
Instead, it:
- calls `malloc` several times to start out;
- calls `malloc`, `char_range_at`, `char_range_at_reverse`, etc.

Without LTO, it generates about 7kb of code for one regex, or 34kb if I put 8 regexes in that function. Not the end of the world, but it adds up.
I recognize the regex implementation is new, but I thought this was worth filing anyway as room for improvement.
rustc 0.11-pre-nightly (2dcbad5 2014-05-06 22:01:43 -0700)