Skip to content

Add performance note for read_to_end and read_to_string #27213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

AlisdairO
Copy link
Contributor

In light of #27159, I thought it might be appropriate to add a little note to Read::read_to_end and Read::read_to_string regarding the fact that it's faster to size the buffer in advance where possible.

r? @steveklabnik

@steveklabnik
Copy link
Member

Hmmm, while this is nice to say, once it's in the documentation, it's part of the spec. @alexcrichton what do you think?

@AlisdairO
Copy link
Contributor Author

Very good point! At the least it would be better to couch it in terms of 'may' rather than 'will' - depends on if there's an overriding implementation after all.

I guess I'm partly concerned about first impressions - if a user comes along, tests reading in a file, and finds that it's significantly slower than a different language's less pure implementation, then it ought to be obvious from the documentation how to achieve better performance.

@alexcrichton
Copy link
Member

I think this may want to be worded in such a way as this is a pretty standard behavior of vectors rather than being special to just these two functions. If you pass &mut Vec<T> to any function then it will be faster 100% of the time if you pre-allocate space, so in that sense there's nothing special about these I/O functions. It's generally common knowledge that a vector which doesn't have to resize is faster than one that does.

In that sense I'm not sure this note is really needed?

@AlisdairO
Copy link
Contributor Author

I guess what I would mean here is that it was non-obvious to me (although it may be just me!) that the implementation of read_to_end for File wouldn't just handle optimal allocation of the Vec for me - in my thoughts at the time there was no reason why it couldn't.

The point regarding not wanting to use stat() is obviously a legitimate reason not to change the code, but I suspect that users coming from slightly higher level languages wouldn't really know to consider that when using the function - the fact that the implementation is relatively bare-bones is only revealed by looking at the code.

@AlisdairO
Copy link
Contributor Author

Also I realise that I might be labouring the point rather between two PRs - my apologies if so. Happy to close it if it's not a clear gain!

@@ -458,7 +458,8 @@ pub trait Read {
/// All bytes read from this source will be appended to the specified buffer
/// `buf`. This function will continuously call `read` to append more data to
/// `buf` until `read` returns either `Ok(0)` or an error of
/// non-`ErrorKind::Interrupted` kind.
/// non-`ErrorKind::Interrupted` kind. If the size of the input is known in
/// advance, performance will be improved by pre-reserving capacity in `buf`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a difference between reserving and pre-reserving? Both are things that happen in advance isn't it?

@AlisdairO
Copy link
Contributor Author

I'm going to assume that we're leaving this one out, closing for now!

@AlisdairO AlisdairO closed this Aug 8, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants