rustc has trouble inlining inside a Bencher closure #14149

Closed
SiegeLord opened this issue May 12, 2014 · 4 comments
Labels
A-codegen Area: Code generation

Comments

@SiegeLord
Contributor

Consider this code:

#![crate_type="lib"]
#![crate_id="test"]

extern crate test;
extern crate rand;

use test::Bencher;
use rand::{weak_rng, Rng};

use std::cell::Cell;

fn bug(a: &Vector)
{
    for _ in range(0, 100)
    {
        let m = Multiplier::new(a, a);
        for i in range(0u, 10)
        {
            unsafe
            {
                a.unsafe_set(i, m.unsafe_get(i));
            }
        }
    }

    let mut sum = 0f32;
    for i in range(0u, 10)
    {
        unsafe
        {
            sum += a.unsafe_get(i);
        }
    }
    assert!(sum != 96.0);
}

#[bench]
fn vec_speed_vec1(bh: &mut Bencher) {
    let mut rng = weak_rng();

    let a = &Vector::new(rng.gen_vec(10).slice(0, 10));

    bh.iter(|| {
        bug(a)
    })
}

#[bench]
fn vec_speed_vec2(bh: &mut Bencher) {
    let mut rng = weak_rng();

    let a = &Vector::new(rng.gen_vec(10).slice(0, 10));

    bh.iter(|| {
        for _ in range(0, 100)
        {
            let m = Multiplier::new(a, a);
            for i in range(0u, 10)
            {
                unsafe
                {
                    a.unsafe_set(i, m.unsafe_get(i));
                }
            }
        }

        let mut sum = 0f32;
        for i in range(0u, 10)
        {
            unsafe
            {
                sum += a.unsafe_get(i);
            }
        }
        assert!(sum != 96.0);
    })
}

pub trait VectorGet
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32;
}

pub trait VectorSet
{
    unsafe fn unsafe_set(&self, idx: uint, val: f32);
}

pub struct Vector
{
    data: Vec<Cell<f32>>
}

impl Vector
{
    pub fn new(data: &[f32]) -> Vector
    {
        Vector{ data: data.iter().map(|&v| Cell::new(v)).collect() }
    }
}

impl<'l>
VectorGet for
&'l Vector
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32
    {
        (*self.data.as_slice().unsafe_ref(idx)).get()
    }
}

impl<'l>
Container for
&'l Vector
{
    fn len(&self) -> uint
    {
        self.data.len()
    }
}

impl<'l>
VectorSet for
Vector
{
    unsafe fn unsafe_set(&self, idx: uint, val: f32)
    {
        self.data.as_slice().unsafe_ref(idx).set(val);
    }
}

impl<'l>
Container for
Vector
{
    fn len(&self) -> uint
    {
        self.data.len()
    }
}

pub struct Multiplier<TA, TB>
{
    a: TA,
    b: TB,
}

impl<TA: Container,
     TB: Container>
Multiplier<TA, TB>
{
    pub fn new(a: TA, b: TB) -> Multiplier<TA, TB>
    {
        assert!(a.len() == b.len());
        Multiplier{ a: a, b: b }
    }
}

impl<'l,
     TA: VectorGet + Container,
     TB: VectorGet + Container>
VectorGet for
Multiplier<TA, TB>
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32
    {
        self.a.unsafe_get(idx) * self.b.unsafe_get(idx)
    }
}

impl<'l,
     TA: Container,
     TB: Container>
Container for Multiplier<TA, TB>
{
    fn len(&self) -> uint
    {
        self.a.len()
    }
}

Compiling and running it gives the following result:

$ rustc test.rs --test --opt-level 3
$ ./test --bench

running 2 tests
test vec_speed_vec1 ... bench:       498 ns/iter (+/- 17)
test vec_speed_vec2 ... bench:      1386 ns/iter (+/- 313)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

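Note that the only difference between the two tests is that vec_speed_vec1 has the Bencher closure contents hoisted out into a separate function (bug), while vec_speed_vec2 keeps them inline in the closure and runs nearly three times slower. This speed difference is a regression since at least January 2014.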

@SiegeLord
Contributor Author

Here's a version of the code that compiles with an old rustc (e139b49):

#[crate_type="lib"];
#[crate_id="old"];

extern mod extra;

use extra::test::BenchHarness;
use std::rand::{weak_rng, Rng};

use std::cell::Cell;

#[bench]
fn vec_speed_vec2(bh: &mut BenchHarness) {
    let mut rng = weak_rng();

    let a = &Vector::new(rng.gen_vec(10).slice(0, 10));

    bh.iter(|| {
        for _ in range(0, 100)
        {
            let m = Multiplier::new(a, a);
            for i in range(0u, 10)
            {
                unsafe
                {
                    a.unsafe_set(i, m.unsafe_get(i));
                }
            }
        }

        let mut sum = 0f32;
        for i in range(0u, 10)
        {
            unsafe
            {
                sum += a.unsafe_get(i);
            }
        }
        assert!(sum != 96.0);
    })
}
pub trait VectorGet
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32;
}

pub trait VectorSet
{
    unsafe fn unsafe_set(&self, idx: uint, val: f32);
}

pub struct Vector
{
    data: ~[Cell<f32>]
}

impl Vector
{
    pub fn new(data: &[f32]) -> Vector
    {
        Vector{ data: data.iter().map(|&v| Cell::new(v)).collect() }
    }
}

impl<'l>
VectorGet for
&'l Vector
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32
    {
        (*self.data.as_slice().unsafe_ref(idx)).get()
    }
}

impl<'l>
Container for
&'l Vector
{
    fn len(&self) -> uint
    {
        self.data.len()
    }
}

impl<'l>
VectorSet for
Vector
{
    unsafe fn unsafe_set(&self, idx: uint, val: f32)
    {
        self.data.as_slice().unsafe_ref(idx).set(val);
    }
}

impl<'l>
Container for
Vector
{
    fn len(&self) -> uint
    {
        self.data.len()
    }
}

pub struct Multiplier<TA, TB>
{
    a: TA,
    b: TB,
}

impl<TA: Container,
     TB: Container>
Multiplier<TA, TB>
{
    pub fn new(a: TA, b: TB) -> Multiplier<TA, TB>
    {
        assert!(a.len() == b.len());
        Multiplier{ a: a, b: b }
    }
}

impl<'l,
     TA: VectorGet + Container,
     TB: VectorGet + Container>
VectorGet for
Multiplier<TA, TB>
{
    unsafe fn unsafe_get(&self, idx: uint) -> f32
    {
        self.a.unsafe_get(idx) * self.b.unsafe_get(idx)
    }
}

impl<'l,
     TA: Container,
     TB: Container>
Container for Multiplier<TA, TB>
{
    fn len(&self) -> uint
    {
        self.a.len()
    }
}

The benchmark results:

running 2 tests
test vec_speed_vec1 ... bench:       491 ns/iter (+/- 4)
test vec_speed_vec2 ... bench:       841 ns/iter (+/- 28)

My main concern is that vec_speed_vec2 got slower in the new rustc (841 ns/iter with the old compiler versus 1386 ns/iter now). I guess the difference between vec1 and vec2 is a red herring.

@emberian
Member

I'll try bisecting this later today.

@emberian
Member

Never got around to that...

@steveklabnik added the A-codegen (Area: Code generation) label Jan 23, 2015
@huonw
Member

huonw commented Jan 5, 2016

This is more than a year old (nearly two), we've gone through many LLVM upgrades in that time, and it no longer reproduces. An updated version gives:

test vec_speed_vec1 ... bench:         136 ns/iter (+/- 4)
test vec_speed_vec2 ... bench:         137 ns/iter (+/- 6)

So, closing.
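
For anyone who wants to rerun the comparison on a current stable toolchain, a rough sketch of the two variants could look like the following. This is not the updated version mentioned above (that code was not posted), and several things are assumptions made for illustration: the `time_ns` helper built on `std::time::Instant` stands in for `Bencher::iter`, `std::hint::black_box` is used to keep the optimizer from discarding the work, safe indexing replaces the `unsafe_get`/`unsafe_set` calls, and the input is all 1.0 rather than random so the values stay stable under repeated squaring. Timings from a hand-rolled loop like this are only indicative.

```rust
use std::cell::Cell;
use std::hint::black_box;
use std::time::Instant;

// Read access plus length; merges the old `VectorGet` and `Container` traits.
pub trait VectorGet {
    fn get(&self, idx: usize) -> f32;
    fn len(&self) -> usize;
}

pub struct Vector {
    data: Vec<Cell<f32>>,
}

impl Vector {
    pub fn new(data: &[f32]) -> Vector {
        Vector { data: data.iter().map(|&v| Cell::new(v)).collect() }
    }

    // Interior mutability through `Cell`, as in the original report.
    pub fn set(&self, idx: usize, val: f32) {
        self.data[idx].set(val);
    }
}

impl VectorGet for &Vector {
    fn get(&self, idx: usize) -> f32 {
        self.data[idx].get()
    }
    fn len(&self) -> usize {
        self.data.len()
    }
}

// Lazy element-wise product of two vector-like operands.
pub struct Multiplier<TA, TB> {
    a: TA,
    b: TB,
}

impl<TA: VectorGet, TB: VectorGet> Multiplier<TA, TB> {
    pub fn new(a: TA, b: TB) -> Self {
        assert!(a.len() == b.len());
        Multiplier { a, b }
    }
}

impl<TA: VectorGet, TB: VectorGet> VectorGet for Multiplier<TA, TB> {
    fn get(&self, idx: usize) -> f32 {
        self.a.get(idx) * self.b.get(idx)
    }
    fn len(&self) -> usize {
        self.a.len()
    }
}

// The workload hoisted into a named function, like `bug` in the report.
fn hoisted(a: &Vector) -> f32 {
    for _ in 0..100 {
        let m = Multiplier::new(a, a);
        for i in 0..10 {
            a.set(i, m.get(i));
        }
    }
    (0..10).map(|i| a.get(i)).sum::<f32>()
}

// Hypothetical stand-in for `Bencher::iter`: average nanoseconds per call.
fn time_ns<F: FnMut() -> f32>(iters: u32, mut f: F) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed().as_nanos() as f64 / f64::from(iters)
}

fn main() {
    let v = Vector::new(&[1.0f32; 10]);
    let a = black_box(&v);
    let iters = 100_000;

    // Variant 1: the closure only calls the hoisted function.
    let hoisted_ns = time_ns(iters, || hoisted(a));

    // Variant 2: the same body written directly inside the closure.
    let inline_ns = time_ns(iters, || {
        for _ in 0..100 {
            let m = Multiplier::new(a, a);
            for i in 0..10 {
                a.set(i, m.get(i));
            }
        }
        (0..10).map(|i| a.get(i)).sum::<f32>()
    });

    println!("hoisted: {hoisted_ns:.0} ns/iter");
    println!("inline:  {inline_ns:.0} ns/iter");
}
```

Build it with optimizations (for example `rustc -C opt-level=3`) before comparing the two numbers; if the inlining problem is gone, as the results above suggest, they should come out close to each other.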

@huonw closed this as completed Jan 5, 2016