Adding --emit=asm speeds up generated code because of codegen units #57235

Open
jrmuizel opened this issue Dec 31, 2018 · 5 comments
Labels
A-driver (Area: rustc_driver that ties everything together into the `rustc` compiler)
C-bug (Category: This is a bug.)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)

Comments

@jrmuizel
Contributor

With the following rust code:

pub fn main() {
    print_triples();
    println!("hello");
}


fn print_triples() {
    let mut i = 0 as i32;
    for z in 1.. {
        for x in 1..=z {
            for y in x..=z {
                if x*x + y*y == z*z {
                    i = i + 1;
                    if i == 1000 {
                        return;
                    }
                }
            }
        }
    }
}

I get:

/tmp/pythagoras$ rustc --emit=link -O simple.rs
/tmp/pythagoras$ time ./simple
hello

real	0m0.290s
user	0m0.287s
sys	0m0.002s
/tmp/pythagoras$ rustc --emit=asm,link -O simple.rs
/tmp/pythagoras$ time ./simple
hello

real	0m0.005s
user	0m0.002s
sys	0m0.002s
/tmp/pythagoras$ rustc --version
rustc 1.32.0-nightly (400c2bc5e 2018-11-27)
/tmp/pythagoras$
@jonas-schievink
Contributor

AFAIK this forces rustc to use a single codegen unit. This generally makes code run faster because every function is available for inlining, although I don't see what might cause such a drastic difference.

You can try to reproduce by using -Ccodegen-units=1.
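A possible source-level experiment, sketched here under assumptions that are not confirmed anywhere in this thread: AFAIK `#[inline]` asks rustc to make a function's body available in each codegen unit that calls it, so it may expose the helper to the optimizer even with the default number of codegen units. The function name `count_triples_up_to` and its `max_z` parameter are hypothetical, and whether this actually matches the `-Ccodegen-units=1` timings for the program above is untested.

```rust
// Hypothetical variant of the helper from the report, bounded by `max_z`
// so that it terminates and returns a value that can be checked.
// `#[inline]` is only a hint; the compiler is free to ignore it.
#[inline]
fn count_triples_up_to(max_z: i32) -> i32 {
    let mut count = 0;
    for z in 1..=max_z {
        for x in 1..=z {
            for y in x..=z {
                // Count every (x, y, z) with x <= y <= z that forms a right triangle.
                if x * x + y * y == z * z {
                    count += 1;
                }
            }
        }
    }
    count
}

fn main() {
    // Printing the count keeps the result observable.
    println!("triples with z <= 20: {}", count_triples_up_to(20));
}
```

Comparing the output of `--emit=asm` is not a fair check here, since that flag itself changes the number of codegen units; timing the binary with and without the attribute is the safer comparison.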

@jrmuizel
Contributor Author

Indeed -Ccodegen-units=1 fixes the problem. It's pretty surprising/dangerous that --emit=asm changes the generated code. Why is a single codegen unit forced with --emit=asm?

@jonas-schievink
Contributor

jonas-schievink commented Dec 31, 2018

This was done in #30208. Multiple codegen units would result in multiple compilation outputs, which is generally not expected when using --emit (which should only output a single file).

EDIT: Also see #30063, which is now obsolete since the build system changed, but the discussion there is still relevant.

@nikic
Contributor

nikic commented Dec 31, 2018

@jrmuizel I think the answer is that nobody has implemented the necessary handling for that. If there are multiple codegen units and we want to produce a single artifact, we'd have to merge the LLVM modules prior to emitting IR/BC/asm (unless LTO, either thin or fat, already takes care of that).

I agree that the current behavior is not great, as these are often used for debugging performance issues and changing the number of codegen-units can impact optimization a lot.

@lambda
Contributor

lambda commented Dec 31, 2018

One thing to note is that the difference in speed listed here is a bit artificial: this program doesn't actually do anything observable in the print_triples loop, so it looks like with one codegen unit the loop is eliminated entirely, while with the default settings it's actually running the loop even though it has no effect.
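To make the benchmark less artificial, the loop's result can be made observable so the optimizer cannot simply delete it. The following is a minimal sketch, not code from the original report: the name `find_triples`, its `limit` parameter, and the choice to return the hypotenuse are all assumptions made for illustration. `std::hint::black_box` is used as a best-effort hint to keep the compiler from treating the input as a known constant.

```rust
use std::hint::black_box;

/// Hypothetical rewrite of `print_triples`: searches triples in the same
/// order as the original and returns the hypotenuse `z` of the `limit`-th
/// one, so the work feeds into a value the program actually uses.
fn find_triples(limit: i32) -> i32 {
    let mut found = 0;
    for z in 1.. {
        for x in 1..=z {
            for y in x..=z {
                if x * x + y * y == z * z {
                    found += 1;
                    if found == limit {
                        return z;
                    }
                }
            }
        }
    }
    // The outer range is unbounded, so the loop only exits via `return`.
    unreachable!("the search loop only ends by returning");
}

fn main() {
    // `black_box` hides the constant 1000 from the optimizer (best effort).
    let z = find_triples(black_box(1000));
    println!("hypotenuse of the 1000th triple: {z}");
}
```

With the result printed, both the default build and the `-Ccodegen-units=1` build have to perform the search, so any remaining timing difference reflects code quality rather than dead-code elimination.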

@jonas-schievink jonas-schievink added A-driver Area: rustc_driver that ties everything together into the `rustc` compiler T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 21, 2019
@jrmuizel jrmuizel changed the title Adding --emit=asm speeds up generated code Adding --emit=asm speeds up generated code because of codegen units Jun 10, 2020
@Enselic Enselic added the C-bug Category: This is a bug. label Dec 17, 2023