generate less code for regex plugin #27

Closed
rust-highfive opened this issue Jan 25, 2015 · 8 comments

@rust-highfive

Issue by huonw
Tuesday Apr 29, 2014 at 14:00 GMT

For earlier discussion, see rust-lang/rust#13842

This issue was labelled with: I-compiletime in the Rust repository


#![feature(phase)]
#![allow(dead_code)]

#[phase(syntax)] extern crate regex_macros;
extern crate regex;

#[cfg(short)]
fn short() {
    regex!("a");
}

#[cfg(medium)]
fn medium() {
    // 500
    regex!("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
}

#[cfg(long)]
fn long() {
    // 1000
    regex!("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
}

fn main() {}

$ for x in short medium long; do echo $x; time rustc regex.rs --cfg $x --no-trans; done
short

real    0m0.102s
user    0m0.092s
sys     0m0.012s
medium

real    0m1.384s
user    0m1.332s
sys     0m0.048s
long

real    0m3.612s
user    0m3.508s
sys     0m0.104s

These regexes don't take nearly this long to compile as dynamic ones, so I would guess the extra time is the work that the generating macro is doing. (Note the --no-trans flag there: translation to LLVM is skipped, so it isn't just the extra code making LLVM slow.)

A perf trace identifies rustc librustc-4283bb68-0.11-pre.so [.] hashmap::HashMap$LT$K$C$$x20V$C$$x20H$GT$::search::h14555543045583792107::v0.11.pre (the mangled name of HashMap<K, V, H>::search) as taking a lot (10.18%) of the time.

@BurntSushi
Member

@huonw I'm not sure this issue actually belongs on this repo, unless there is something about the regex! implementation that is abusing code generation?

@huonw
Member

huonw commented Jun 19, 2015

I don't know the specifics of regex!, so it may very well be better suited to being returned to rust-lang/rust. (I assume the migration was mechanical.)

@BurntSushi changed the title from "Static regexes are very slow to create" to "generate less code for regex plugin" on Jun 19, 2015
@BurntSushi
Member

(Yup, was mechanical.) OK, I re-opened the issue on rust-lang/regex, but I'm also keeping this open because it would be nice to reduce the amount of code generated by regex! (which I think is one way of attacking the original problem).

@huonw
Member

huonw commented Jun 19, 2015

I wonder if const fn allows for abstracting some of the code away. (That said, const fn can't do much computation so I guess it can't.)

@BurntSushi
Member

Yeah, I doubt const fn would help (I read the original RFC long ago, not sure if it got more powerful since then). I think the real key to this will be two-fold:

  1. Do more analysis on the regex and generate cleverer code (like Ragel does and like OP's example).
  2. Stop embedding full NFA and fall back to the dynamic implementation. (There is not much difference between them now after recent perf improvements in the dynamic impl. regex! used to be better because it allocated less, but it no longer has that advantage!)
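
As a rough illustration of how the two points above could combine, here is a minimal sketch of a planning step that picks a cheaper strategy when the pattern allows it. Everything in it is hypothetical and not part of regex or regex_macros: the `Plan` type, `plan_for`, and `literal_search` are invented names, and the "analysis" is only a check for metacharacters. A real generator would inspect the parsed regex rather than the pattern string.

```rust
// Hypothetical sketch: none of these names exist in the regex crate.

/// What a code generator might decide to emit for a pattern.
#[derive(Debug, PartialEq)]
enum Plan {
    /// The pattern is a plain literal: a substring search is enough.
    Literal(String),
    /// Anything else: defer to the dynamic engine instead of embedding an NFA.
    Dynamic(String),
}

/// Very coarse analysis (an assumption for illustration): a pattern with no
/// regex metacharacters is treated as a literal.
fn plan_for(pattern: &str) -> Plan {
    const META: &str = r"\.+*?()|[]{}^$";
    if pattern.chars().any(|c| META.contains(c)) {
        Plan::Dynamic(pattern.to_string())
    } else {
        Plan::Literal(pattern.to_string())
    }
}

/// Runs the cheap path when there is one. Returns `None` for `Dynamic`
/// plans, which this sketch does not execute.
fn literal_search(plan: &Plan, haystack: &str) -> Option<bool> {
    match plan {
        Plan::Literal(lit) => Some(haystack.contains(lit.as_str())),
        Plan::Dynamic(_) => None,
    }
}
```

Under this sketch, the all-`a` patterns from the original report would take the `Literal` path, so the amount of generated code would not depend on the pattern length.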

@arielb1

arielb1 commented Sep 27, 2015

What is the code generated?

@BurntSushi
Member

@arielb1 Right now, it's a full NFA simulation. The generator requires calling the regex parser and compiler beforehand.
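
For readers unfamiliar with the term, here is a minimal sketch of what an NFA simulation looks like, written in current Rust. It is not the code that regex! generates: the `Inst` instruction set, `literal_program`, and `nfa_is_match` are simplified, hypothetical names, and the matcher only answers whether the whole input matches. The point is that a compiled program has roughly one instruction per pattern character, so a generator that embeds the program emits code that grows with the pattern.

```rust
// Hypothetical, simplified instruction set; not the regex crate's real types.
#[derive(Clone, Copy, Debug)]
enum Inst {
    /// Consume one character equal to the given one.
    Char(char),
    /// Continue at both targets without consuming input.
    Split(usize, usize),
    /// The pattern has matched.
    Match,
}

/// Compiles a plain literal: one `Char` per character, then `Match`.
fn literal_program(lit: &str) -> Vec<Inst> {
    let mut prog: Vec<Inst> = lit.chars().map(Inst::Char).collect();
    prog.push(Inst::Match);
    prog
}

/// Follows `Split` edges from `pc` and records every reachable `Char` or
/// `Match` state once.
fn add_thread(prog: &[Inst], list: &mut Vec<usize>, seen: &mut [bool], pc: usize) {
    if seen[pc] {
        return;
    }
    seen[pc] = true;
    match prog[pc] {
        Inst::Split(a, b) => {
            add_thread(prog, list, seen, a);
            add_thread(prog, list, seen, b);
        }
        Inst::Char(_) | Inst::Match => list.push(pc),
    }
}

/// Steps all live states in lockstep over the input and reports whether the
/// whole input matches. Assumes a well-formed program that ends in `Match`.
fn nfa_is_match(prog: &[Inst], input: &str) -> bool {
    let mut current = Vec::new();
    let mut seen = vec![false; prog.len()];
    add_thread(prog, &mut current, &mut seen, 0);
    for c in input.chars() {
        let mut next = Vec::new();
        let mut seen = vec![false; prog.len()];
        for &pc in &current {
            if let Inst::Char(want) = prog[pc] {
                if want == c {
                    add_thread(prog, &mut next, &mut seen, pc + 1);
                }
            }
        }
        current = next;
    }
    current.iter().any(|&pc| matches!(prog[pc], Inst::Match))
}
```

With this sketch, `literal_program` applied to a 1000-character literal yields 1001 instructions, which gives a sense of how much state a generator embedding the full program has to emit for the `long` case.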

@BurntSushi
Member

Closing in favor of #26.
