Skip to content

rustc generates terrible code for is_char_boundary #32471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
eefriedman opened this issue Mar 24, 2016 · 2 comments
Closed

rustc generates terrible code for is_char_boundary #32471

eefriedman opened this issue Mar 24, 2016 · 2 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@eefriedman
Copy link
Contributor

Testcase:

#![crate_type="rlib"]
trait ZZZ { fn is_char_boundaryx(&self, index: usize) -> bool; }
impl ZZZ for str {
    fn is_char_boundaryx(&self, index: usize) -> bool {
        if index == self.len() { return true; }
        match self.as_bytes().get(index) {
            None => false,
            Some(&b) => b < 128 || b >= 192,
        }
    }
}

Excerpt from the assembly:

    movzbl  (%rdi,%rdx), %eax
    cmpl    $191, %eax
    seta    %cl
    shrb    $7, %al
    xorb    $1, %al
    orb %cl, %al
    setne   %al

This is generated from the expression b < 128 || b >= 192. The code should look more like this:

    movsbl  (%rdi,%rdx), %eax
    cmpl    $-65, %eax
    setg    %al

LLVM probably should be able to perform this optimization, but it looks like it's confusing itself by converting b < 128 into a shift.

CC @bluss.

@steveklabnik steveklabnik added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Mar 24, 2016
@bluss
Copy link
Member

bluss commented Mar 24, 2016

Thanks. Do you want me to experiment here?

@eefriedman
Copy link
Contributor Author

I'm not planning on digging into this any further.

steveklabnik pushed a commit to steveklabnik/rust that referenced this issue Apr 11, 2016
The asm generated for b < 128 || b >= 192 is not ideal, as it computes
both sub-inequalities. This patch replaces it with bit magic.

Fixes rust-lang#32471
steveklabnik added a commit to steveklabnik/rust that referenced this issue Apr 11, 2016
Bit-magic for faster is_char_boundary

The asm generated for b < 128 || b >= 192 is not ideal, as it computes
both sub-inequalities. This patch replaces it with bit magic.

Fixes rust-lang#32471
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

No branches or pull requests

3 participants