Skip to content
This repository was archived by the owner on Dec 1, 2023. It is now read-only.

get rid of unnecessary buffering during decode #65

Closed
wants to merge 3 commits into from

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Feb 26, 2015

babysteps for #57
sadly there's no impl<T:Iterator, E> Iterator for Result<T, E>, so I stole some code from FromIterator::from_iterator::<Result<...>>

@rust-highfive
Copy link

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Contributor

Hm could you take a look at some benchmarks for this as well? I think that calling .chars() on an I/O object is not performant compared to calling it on a str

@oli-obk
Copy link
Contributor Author

oli-obk commented Feb 27, 2015

benchmark:

    #[bench]
    fn bench_large_file(b: &mut Bencher) {
        use std::fs::File;
        use std::io::BufReader;
        let f = File::open("src/big_file.json").unwrap();
        let mut buf = BufReader::new(f);
        b.iter( || { let _ = Json::from_reader(&mut buf); });
    }

big_file.json consists of ~1000 lines of
{ "a": true, "b": null, "c":3.1415, "d": "Hello world", "e": [1,2,3]},

this pr:

test json::tests::bench_large_file      ... bench:      4515 ns/iter (+/- 273)

old:

test json::tests::bench_large_file      ... bench:     15356 ns/iter (+/- 711)

@oli-obk
Copy link
Contributor Author

oli-obk commented Feb 27, 2015

second benchmark:

    #[bench]
    fn bench_large_file_reopening(b: &mut Bencher) {
        use std::fs::File;
        use std::io::BufReader;
        b.iter( || {
            let f = File::open("src/big_file.json").unwrap();
            let mut buf = BufReader::new(f);
            let _ = Json::from_reader(&mut buf);
        });
    }

this pr:

test json::tests::bench_large_file_reopening ... bench:   3506554 ns/iter (+/- 54927)

old:

test json::tests::bench_large_file_reopening ... bench:   2793185 ns/iter (+/- 160453)

@oli-obk
Copy link
Contributor Author

oli-obk commented Feb 27, 2015

I don't even know what the first benchmark does... I'd have said after the first iteration all further iterations should be a nop...

@oli-obk
Copy link
Contributor Author

oli-obk commented Feb 27, 2015

There is a huge difference in the implementations of std::io::Chars and std::str::Chars.
So... to get the performance to the same level we need to use BufRead instead of Read and fill_buf instead of read. Should be simple enough to have std::io::Chars use a BufRead internally to the Read it was created with. Since ReadExt::chars consumes self, everything should be fine as long as noone calls it on a &mut (which apparently is fine: rust-lang/rust#22802)

Oliver Schneider added 2 commits February 27, 2015 10:11
@alexcrichton
Copy link
Contributor

The first benchmark may be skewed because it decodes once and then never decodes again (e.g. you don't seek back to the beginning). The second benchmark is what I would expect in that using std::io::Chars is much slower.

Even if we use BufRead I think that using std::io::Chars will be slower, but it would be worth benchmarking.

Also, could this avoid checking in a large file? It looks like it should be easy to generate in-memory.

@oli-obk oli-obk closed this Apr 29, 2015
@oli-obk oli-obk deleted the iterators branch April 29, 2015 14:35
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants