iter_mut codegen #105

Open
@macpp

The following code (https://godbolt.org/z/v7fq5q7q8):

pub fn example(x: &mut [f32; 1024]) {
    for x in x.iter_mut() {
        *x *= 4.0;
    }
}

when compiled with codegen_gcc, gives the following assembly:

example::example:
        test    rdi, rdi
        je      .L6
        movss   xmm1, DWORD PTR .LC0[rip]
        lea     rax, 4092[rdi]
        jmp     .L9
.L19:
        add     rdi, 4
        je      .L6
.L9:
        movss   xmm0, DWORD PTR [rdi]
        mulss   xmm0, xmm1
        movss   DWORD PTR [rdi], xmm0
        cmp     rdi, rax
        jne     .L19
.L6:
        ret
.LC0:
        .long   1082130432
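
For readers decoding the listing, two of its constants can be checked directly: `.LC0` is the bit pattern of the `4.0` multiplier, and `4092` is the byte offset of the last element of the array. A minimal sketch, where the helper names `lc0_as_f32` and `last_element_offset` are made up for illustration and are not part of the issue:

```rust
use std::mem::size_of;

/// Reinterprets the `.long 1082130432` constant from the listing as an f32.
/// (Hypothetical helper, named only for this sketch.)
fn lc0_as_f32() -> f32 {
    f32::from_bits(1082130432)
}

/// Byte offset of the last element of a `[f32; 1024]`.
/// (Hypothetical helper, named only for this sketch.)
fn last_element_offset() -> usize {
    (1024 - 1) * size_of::<f32>()
}

fn main() {
    println!("{}", lc0_as_f32());
    println!("{}", last_element_offset());
}
```

So `rax` holds the address of the final element, and the loop compares `rdi` against it after each store.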

Compared to the LLVM codegen, it seems (at least to me, but I'm definitely not an asm expert) that:

  • the LLVM version is vectorised, the GCC one is not
  • while the LLVM version has 1 label and a conditional jump (which is not surprising for a simple loop), the GCC version seems unnecessarily convoluted, with 3 labels and 4 jump instructions
  • the GCC version starts with something that looks like a null check, which is strange given that &mut cannot be null
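
The last point rests on a language guarantee that can be checked: because a reference is never null, Rust uses the null address to represent `None` in `Option<&mut T>`, so the option is no larger than a raw pointer. A small sketch of that check follows; the function name `option_ref_uses_null_niche` is an assumption made for this example. It says nothing about why the GCC backend emits `test rdi, rdi`, only that the reference type itself rules null out.

```rust
use std::mem::size_of;

/// Returns true if `Option<&mut T>` has the same size as a raw pointer,
/// which means the null address is being used to encode `None`.
/// (Hypothetical helper, named only for this sketch.)
fn option_ref_uses_null_niche<T>() -> bool {
    size_of::<Option<&mut T>>() == size_of::<*mut T>()
}

fn main() {
    // The array type from the issue's `example` function.
    assert!(option_ref_uses_null_niche::<[f32; 1024]>());
    println!("ok");
}
```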

I'm not sure if this is really a problem, but I figured it was worth asking :)

Anyway, thank you for working on this project, it is a really exciting development :)
