rustc has trouble inlining inside a Bencher closure #14149

SiegeLord · 2014-05-12T18:09:49Z

Consider this code:

#![crate_type="lib"]
#![crate_id="test"]

extern crate test;
extern crate rand;

use test::Bencher;
use rand::{weak_rng, Rng};

use std::cell::Cell;

fn bug(a: &Vector)
{
    for _ in range(0, 100)
    {
        let m = Multiplier::new(a, a);
        for i in range(0u, 10)
        {
            unsafe
            {
                a.unsafe_set(i, m.unsafe_get(i));
            }
        }
    }

    let mut sum = 0f32;
    for i in range(0u, 10)
    {
        unsafe
        {
            sum += a.unsafe_get(i);
        }
    }
    assert!(sum != 96.0);
}

#[bench]
fn vec_speed_vec1(bh: &mut Bencher) {
    let mut rng = weak_rng();

    let a = &Vector::new(rng.gen_vec(10).slice(0, 10));

    bh.iter(|| {
        bug(a)
    })
}

#[bench]
fn vec_speed_vec2(bh: &mut Bencher) {
    let mut rng = weak_rng();

    let a = &Vector::new(rng.gen_vec(10).slice(0, 10));

    bh.iter(|| {
        for _ in range(0, 100)
        {
            let m = Multiplier::new(a, a);
            for i in range(0u, 10)
            {
                unsafe
                {
                    a.unsafe_set(i, m.unsafe_get(i));
                }
            }
        }

        let mut sum = 0f32;
        for i in range(0u, 10)
        {
            unsafe
            {
                sum += a.unsafe_get(i);
            }
        }
        assert!(sum != 96.0);
    })
}

pub trait VectorGet
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32;
}

pub trait VectorSet
{
    unsafe fn unsafe_set(&self, idx: uint, val: f32);
}

pub struct Vector
{
    data: Vec<Cell<f32>>
}

impl Vector
{
    pub fn new(data: &[f32]) -> Vector
    {
        Vector{ data: data.iter().map(|&v| Cell::new(v)).collect() }
    }
}

impl<'l>
VectorGet for
&'l Vector
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32
    {
        (*self.data.as_slice().unsafe_ref(idx)).get()
    }
}

impl<'l>
Container for
&'l Vector
{
    fn len(&self) -> uint
    {
        self.data.len()
    }
}

impl<'l>
VectorSet for
Vector
{
    unsafe fn unsafe_set(&self, idx: uint, val: f32)
    {
        self.data.as_slice().unsafe_ref(idx).set(val);
    }
}

impl<'l>
Container for
Vector
{
    fn len(&self) -> uint
    {
        self.data.len()
    }
}

pub struct Multiplier<TA, TB>
{
    a: TA,
    b: TB,
}

impl<TA: Container,
     TB: Container>
Multiplier<TA, TB>
{
    pub fn new(a: TA, b: TB) -> Multiplier<TA, TB>
    {
        assert!(a.len() == b.len());
        Multiplier{ a: a, b: b }
    }
}

impl<'l,
     TA: VectorGet + Container,
     TB: VectorGet + Container>
VectorGet for
Multiplier<TA, TB>
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32
    {
        self.a.unsafe_get(idx) * self.b.unsafe_get(idx)
    }
}

impl<'l,
     TA: Container,
     TB: Container>
Container for Multiplier<TA, TB>
{
    fn len(&self) -> uint
    {
        self.a.len()
    }
}

Compiling and running it gives the following result:

$ rustc test.rs --test --opt-level 3
$ ./test --bench

running 2 tests
test vec_speed_vec1 ... bench:       498 ns/iter (+/- 17)
test vec_speed_vec2 ... bench:      1386 ns/iter (+/- 313)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

Note that the only difference between the two tests is that one has the Bencher closure contents hoisted out into a separate function. This speed difference is a regression since at least January 2014.
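For readers on a current toolchain, the two shapes being compared can be sketched without the unstable `test` crate. This is only an illustration under stated assumptions: `time_iters` is a hypothetical stand-in for `Bencher::iter` built on `std::time::Instant`, and `square_in_place` is a simplified workload, not the code from the report.

```rust
use std::cell::Cell;
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical stand-in for `Bencher::iter`: calls `f` `iters` times and
// returns the total elapsed time. `black_box` keeps the result observable
// so the optimizer cannot discard the work.
fn time_iters<T, F: FnMut() -> T>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed()
}

// Simplified workload (an assumption, not the original `bug` function):
// squares every element in place and returns the sum.
fn square_in_place(data: &[Cell<f32>]) -> f32 {
    for cell in data {
        cell.set(cell.get() * cell.get());
    }
    data.iter().map(|c| c.get()).sum()
}

// Shape 1 (like vec_speed_vec1): the closure calls a separate function.
fn hoisted(data: &[Cell<f32>], iters: u32) -> Duration {
    time_iters(iters, || square_in_place(data))
}

// Shape 2 (like vec_speed_vec2): the same body written inside the closure.
fn inlined(data: &[Cell<f32>], iters: u32) -> Duration {
    time_iters(iters, || {
        for cell in data {
            cell.set(cell.get() * cell.get());
        }
        data.iter().map(|c| c.get()).sum::<f32>()
    })
}
```

Calling `hoisted(&data, n)` and `inlined(&data, n)` on the same input and comparing the returned durations gives a rough version of the comparison above. A real benchmark harness would also handle warm-up and variance, which this sketch does not.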

SiegeLord · 2014-05-14T12:50:07Z

Here's code that compiles with an old rustc (e139b49):

#[crate_type="lib"];
#[crate_id="old"];

extern mod extra;

use extra::test::BenchHarness;
use std::rand::{weak_rng, Rng};

use std::cell::Cell;

#[bench]
fn vec_speed_vec2(bh: &mut BenchHarness) {
    let mut rng = weak_rng();

    let a = &Vector::new(rng.gen_vec(10).slice(0, 10));

    bh.iter(|| {
        for _ in range(0, 100)
        {
            let m = Multiplier::new(a, a);
            for i in range(0u, 10)
            {
                unsafe
                {
                    a.unsafe_set(i, m.unsafe_get(i));
                }
            }
        }

        let mut sum = 0f32;
        for i in range(0u, 10)
        {
            unsafe
            {
                sum += a.unsafe_get(i);
            }
        }
        assert!(sum != 96.0);
    })
}
pub trait VectorGet
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32;
}

pub trait VectorSet
{
    unsafe fn unsafe_set(&self, idx: uint, val: f32);
}

pub struct Vector
{
    data: ~[Cell<f32>]
}

impl Vector
{
    pub fn new(data: &[f32]) -> Vector
    {
        Vector{ data: data.iter().map(|&v| Cell::new(v)).collect() }
    }
}

impl<'l>
VectorGet for
&'l Vector
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32
    {
        (*self.data.as_slice().unsafe_ref(idx)).get()
    }
}

impl<'l>
Container for
&'l Vector
{
    fn len(&self) -> uint
    {
        self.data.len()
    }
}

impl<'l>
VectorSet for
Vector
{
    unsafe fn unsafe_set(&self, idx: uint, val: f32)
    {
        self.data.as_slice().unsafe_ref(idx).set(val);
    }
}

impl<'l>
Container for
Vector
{
    fn len(&self) -> uint
    {
        self.data.len()
    }
}

pub struct Multiplier<TA, TB>
{
    a: TA,
    b: TB,
}

impl<TA: Container,
     TB: Container>
Multiplier<TA, TB>
{
    pub fn new(a: TA, b: TB) -> Multiplier<TA, TB>
    {
        assert!(a.len() == b.len());
        Multiplier{ a: a, b: b }
    }
}

impl<'l,
     TA: VectorGet + Container,
     TB: VectorGet + Container>
VectorGet for
Multiplier<TA, TB>
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32
    {
        self.a.unsafe_get(idx) * self.b.unsafe_get(idx)
    }
}

impl<'l,
     TA: Container,
     TB: Container>
Container for Multiplier<TA, TB>
{
    fn len(&self) -> uint
    {
        self.a.len()
    }
}

The benchmark results:

running 2 tests
test vec_speed_vec1 ... bench:       491 ns/iter (+/- 4)
test vec_speed_vec2 ... bench:       841 ns/iter (+/- 28)

My main concern is that vec_speed_vec2 got slower with the new rustc (841 ns/iter with the old compiler versus 1386 ns/iter now). I guess the difference between vec1 and vec2 is a red herring.

emberian · 2014-05-14T14:30:58Z

I'll try bisecting this later today.

emberian · 2015-01-21T02:27:17Z

Never got around to that...

huonw · 2016-01-05T12:37:17Z

This is more than a year old (nearly two), we've gone through many LLVM upgrades in that time, and it no longer reproduces. An updated version gives:

test vec_speed_vec1 ... bench:         136 ns/iter (+/- 4)
test vec_speed_vec2 ... bench:         137 ns/iter (+/- 6)

So, closing.
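The updated code behind those numbers is not shown in the thread. As a rough sketch of what a port of the original types to current Rust might look like (not huonw's actual version), the following assumes `usize` in place of `uint`, `get_unchecked` in place of `unsafe_ref`, and a hypothetical `Len` trait in place of the removed `Container` trait. The function name `square_repeatedly` and the fixed input used to call it are also illustrative, standing in for `bug` and the random vector.

```rust
use std::cell::Cell;

// Unchecked element read; the caller must guarantee `idx` is in bounds.
pub trait VectorGet {
    unsafe fn unsafe_get(&self, idx: usize) -> f32;
}

// Unchecked element write; the caller must guarantee `idx` is in bounds.
pub trait VectorSet {
    unsafe fn unsafe_set(&self, idx: usize, val: f32);
}

// Hypothetical replacement for the old `Container` trait.
pub trait Len {
    fn len(&self) -> usize;
}

pub struct Vector {
    data: Vec<Cell<f32>>,
}

impl Vector {
    pub fn new(data: &[f32]) -> Vector {
        Vector { data: data.iter().map(|&v| Cell::new(v)).collect() }
    }
}

impl<'l> VectorGet for &'l Vector {
    unsafe fn unsafe_get(&self, idx: usize) -> f32 {
        // SAFETY: the caller guarantees idx < self.data.len().
        unsafe { self.data.get_unchecked(idx).get() }
    }
}

impl<'l> Len for &'l Vector {
    fn len(&self) -> usize {
        self.data.len()
    }
}

impl VectorSet for Vector {
    unsafe fn unsafe_set(&self, idx: usize, val: f32) {
        // SAFETY: the caller guarantees idx < self.data.len().
        unsafe { self.data.get_unchecked(idx).set(val) }
    }
}

// Lazy element-wise product of two vector-like operands.
pub struct Multiplier<TA, TB> {
    a: TA,
    b: TB,
}

impl<TA: Len, TB: Len> Multiplier<TA, TB> {
    pub fn new(a: TA, b: TB) -> Multiplier<TA, TB> {
        assert!(a.len() == b.len());
        Multiplier { a, b }
    }
}

impl<TA: VectorGet, TB: VectorGet> VectorGet for Multiplier<TA, TB> {
    unsafe fn unsafe_get(&self, idx: usize) -> f32 {
        // SAFETY: forwarded from the caller's bounds guarantee.
        unsafe { self.a.unsafe_get(idx) * self.b.unsafe_get(idx) }
    }
}

// Same loop structure as the original `bug` function: square the vector in
// place 100 times, then sum the elements. Returns the sum instead of
// asserting on it.
pub fn square_repeatedly(a: &Vector) -> f32 {
    let n = a.data.len();
    for _ in 0..100 {
        let m = Multiplier::new(a, a);
        for i in 0..n {
            // SAFETY: i < n, and n is the length of `a`.
            unsafe { a.unsafe_set(i, m.unsafe_get(i)) };
        }
    }
    let mut sum = 0f32;
    for i in 0..n {
        // SAFETY: i < n, and n is the length of `a`.
        unsafe { sum += a.unsafe_get(i) };
    }
    sum
}
```

To reproduce the comparison, `square_repeatedly(&v)` could be called directly inside a benchmark closure for one case and its body pasted into the closure for the other.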

steveklabnik added the A-codegen Area: Code generation label Jan 23, 2015

huonw closed this as completed Jan 5, 2016
