Optimize FileSerializationSink by using parking_lot::Mutex and avoiding heap allocations in write_atomic. #88
I think our benchmarks generate too little data to be realistic (less than a megabyte, I think). I'll look into that later. Also, a multi-threaded benchmark would be good. |
FYI, as I recall, the issue was that mmap was quite a bit slower on Windows. |
|
Here are some benchmark numbers with and without the patch (numbers in milliseconds):
The patch should be very good for Windows, especially for the multithreaded case. On Linux mmap is as fast (1 thread) or much faster (8 threads). I don't have any numbers for macOS. |
|
I can gather some macOS numbers if you're interested. |
|
If you have time, I'd be interested, yeah. I got the numbers by running |
|
Update: I just pushed the "baseline" version to https://github.com/michaelwoerister/measureme/tree/opt-file-sink-ref. |
|
Here's what I'm seeing on my 4 core 8 thread MBP:
|
|
Thanks, @wesleywiser! Looks like we actually might want to switch to |
wesleywiser
left a comment
Is the second commit still WIP? If not, this looks good to me overall.
|
The second commit is still work-in-progress, yes. I want to update the code in testing_common to handle multiple threads, which shouldn't be too hard. Thanks for review! |
|
I did some new Windows measurements of
|
3a6a0df to
58dbbd9
Compare
…ng heap allocations in write_atomic.
58dbbd9 to
b9d0111
Compare
|
This is ready for review now. |
let file = fs::File::create(path)?;

Ok(FileSerializationSink {
    data: Mutex::new((BufWriter::new(file), 0)),
What is the gain from removing the BufWriter? The new code feels similar to the BufWriter code, so would BufWriter::with_capacity(1024*512, file) give the same result?
The difference is that BufWriter does not allow directly writing to its buffer, so we are basically re-implementing BufWriter here. The interface of write_atomic requires there to be a writable output buffer.
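To illustrate the point about writing directly into the buffer, here is a minimal sketch of a sink that owns its buffer and hands the caller a mutable slice to fill in place. It is not the code from this PR: `BufferedSink`, `Inner`, the generic writer `W`, and the closure-based signature of `write_atomic` are assumptions made for illustration, and `std::sync::Mutex` stands in for `parking_lot::Mutex` so the example needs no external crates.

```rust
use std::io::{self, Write};
use std::sync::Mutex;

/// Hypothetical sink (not the PR's type), generic over the writer so it can
/// be exercised with an in-memory `Vec<u8>` instead of a file.
pub struct BufferedSink<W: Write> {
    data: Mutex<Inner<W>>,
}

struct Inner<W: Write> {
    writer: W,
    buffer: Vec<u8>, // fixed-size scratch space, allocated once
    buf_pos: usize,  // number of valid bytes currently in `buffer`
}

impl<W: Write> BufferedSink<W> {
    pub fn new(writer: W, capacity: usize) -> Self {
        BufferedSink {
            data: Mutex::new(Inner { writer, buffer: vec![0; capacity], buf_pos: 0 }),
        }
    }

    /// Reserves `num_bytes` and lets `write` fill them in place.
    /// The lock is held for the whole call, so each record is written as a unit.
    pub fn write_atomic<F>(&self, num_bytes: usize, write: F) -> io::Result<()>
    where
        F: FnOnce(&mut [u8]),
    {
        let mut guard = self.data.lock().unwrap();
        let inner = &mut *guard;

        if num_bytes > inner.buffer.len() {
            // Rare case: the record does not fit in the buffer at all.
            // Flush what we have, then fall back to a temporary allocation.
            inner.writer.write_all(&inner.buffer[..inner.buf_pos])?;
            inner.buf_pos = 0;
            let mut tmp = vec![0u8; num_bytes];
            write(&mut tmp);
            return inner.writer.write_all(&tmp);
        }

        if inner.buf_pos + num_bytes > inner.buffer.len() {
            // Slow path: not enough room left, so flush to the writer first.
            inner.writer.write_all(&inner.buffer[..inner.buf_pos])?;
            inner.buf_pos = 0;
        }

        // Fast path: the caller writes straight into the buffer, no heap allocation.
        let end = inner.buf_pos + num_bytes;
        write(&mut inner.buffer[inner.buf_pos..end]);
        inner.buf_pos = end;
        Ok(())
    }

    /// Flushes the remaining bytes and returns the underlying writer.
    pub fn finish(self) -> io::Result<W> {
        let mut inner = self.data.into_inner().unwrap();
        inner.writer.write_all(&inner.buffer[..inner.buf_pos])?;
        inner.writer.flush()?;
        Ok(inner.writer)
    }
}
```

With `BufWriter`, the closure would have to fill a temporary buffer that is then copied through `Write::write`, which is the extra allocation and copy this approach avoids.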
|
Has anyone looked at how much the results differ between the "fast path", where only the buffer is updated, and the "slow path", where the file is written? Since we block all threads from writing during the file write, this can affect many events.
|
@andjo403 I have not investigated variance. The bigger buffer should reduce the number of writes, while making each write larger. So they are less evenly distributed but the fixed overhead might amortize better. It would be nice to do the actual file writing in a background thread via some kind of double buffering scheme. I haven't tried to implement something like that though. |
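For reference, one possible shape of such a double buffering scheme is sketched below. Nothing like this is implemented in the PR; `DoubleBufferedSink` and all of its fields and methods are hypothetical names, and error handling is deliberately simplified (a failed background write surfaces as a panic on the next buffer swap or as an `Err` from `finish`). The producer side takes `&mut self`, so a multi-threaded sink would still need to wrap it in a mutex.

```rust
use std::io::{self, Write};
use std::mem;
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical sink: the caller fills `front` while a background thread
/// writes the previously filled buffer to the underlying writer.
pub struct DoubleBufferedSink<W: Write + Send + 'static> {
    front: Vec<u8>,
    capacity: usize,
    full_tx: Sender<Vec<u8>>,    // full buffers go to the writer thread
    empty_rx: Receiver<Vec<u8>>, // drained buffers come back for reuse
    writer_thread: JoinHandle<io::Result<W>>,
}

impl<W: Write + Send + 'static> DoubleBufferedSink<W> {
    pub fn new(mut writer: W, capacity: usize) -> Self {
        let (full_tx, full_rx) = mpsc::channel::<Vec<u8>>();
        let (empty_tx, empty_rx) = mpsc::channel::<Vec<u8>>();
        // Seed the pool with the second buffer.
        empty_tx.send(Vec::with_capacity(capacity)).unwrap();

        let writer_thread = thread::spawn(move || -> io::Result<W> {
            // The loop ends once the sending side is dropped in `finish`.
            for mut buf in full_rx {
                writer.write_all(&buf)?;
                buf.clear();
                // Ignore the error: the sink may already be shutting down.
                let _ = empty_tx.send(buf);
            }
            writer.flush()?;
            Ok(writer)
        });

        DoubleBufferedSink {
            front: Vec::with_capacity(capacity),
            capacity,
            full_tx,
            empty_rx,
            writer_thread,
        }
    }

    /// Appends `bytes`; hands the front buffer to the writer thread when full.
    pub fn write(&mut self, bytes: &[u8]) {
        if self.front.len() + bytes.len() > self.capacity && !self.front.is_empty() {
            self.swap_buffers();
        }
        self.front.extend_from_slice(bytes);
    }

    fn swap_buffers(&mut self) {
        // Blocks only if the background thread still holds the other buffer.
        let empty = self.empty_rx.recv().expect("writer thread exited early");
        let full = mem::replace(&mut self.front, empty);
        self.full_tx.send(full).expect("writer thread exited early");
    }

    /// Sends the last partial buffer, waits for the writer thread, and
    /// returns the underlying writer (or the first I/O error it hit).
    pub fn finish(self) -> io::Result<W> {
        let DoubleBufferedSink { front, full_tx, writer_thread, .. } = self;
        if !front.is_empty() {
            full_tx.send(front).expect("writer thread exited early");
        }
        drop(full_tx); // closes the channel so the writer loop terminates
        writer_thread.join().expect("writer thread panicked")
    }
}
```

In this sketch the producer only stalls when both buffers are in use, that is, when the background write of the previous buffer has not finished by the time the current one fills up.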
This PR makes the FileSerializationSink exactly as fast as the MmapSerializationSink in the benchmarks we have. But I only tested on Linux and I also remember that the benchmarks were no good indication of performance when used in rustc.
It would be nice if we could get rid of the MmapSerializationSink because it keeps everything in memory until the end.
I wonder how much work it would be to have benchmarks that actually run rustc. It's a hassle to test this manually.