@joshtriplett
Member

@joshtriplett joshtriplett commented Nov 14, 2025

As discussed extensively in libs-api, the initialized-bytes tracking primarily benefits calls to read_buf that end up initializing the buffer and calling read, at the expense of calls to read_buf that don't need to initialize the buffer. Essentially, this optimizes for the past at the expense of the future. If people observe performance issues using read_buf (or something that calls it) with a given Read impl, they can fix those performance issues by implementing read_buf for that Read.

Update the documentation to stop talking about initialized-but-unfilled bytes.

Remove all functions that just deal with those bytes and their tracking, and remove usage of those methods.

Remove BorrowedCursor::advance as there's no longer a safe case for advancing within initialized-but-unfilled bytes. Rename BorrowedCursor::advance_unchecked to advance.

Update tests.
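
To make the resulting shape concrete, here is a minimal sketch on stable Rust. `ToyBuf` and `toy_buf_demo` are hypothetical names invented for illustration; this is not the std `BorrowedBuf`, and the real method set may differ. The toy tracks only the filled length, so `advance` has to be `unsafe`: nothing records whether the bytes being claimed were actually initialized. The toy hands out the unfilled tail from a safe method only because it is never constructed from a `&mut [u8]`.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a buffer that tracks only the filled length.
struct ToyBuf<'data> {
    buf: &'data mut [MaybeUninit<u8>],
    filled: usize,
}

impl<'data> ToyBuf<'data> {
    fn new(buf: &'data mut [MaybeUninit<u8>]) -> Self {
        ToyBuf { buf, filled: 0 }
    }

    /// The filled prefix, which is always initialized.
    fn filled(&self) -> &[u8] {
        // SAFETY: `advance` requires its caller to have initialized these bytes.
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr().cast::<u8>(), self.filled) }
    }

    /// The unfilled tail; nothing is assumed about its contents.
    fn unfilled_mut(&mut self) -> &mut [MaybeUninit<u8>] {
        &mut self.buf[self.filled..]
    }

    /// # Safety
    /// The first `n` bytes of the unfilled tail must have been initialized.
    unsafe fn advance(&mut self, n: usize) {
        assert!(n <= self.buf.len() - self.filled);
        self.filled += n;
    }
}

fn toy_buf_demo() -> Vec<u8> {
    let mut storage = [MaybeUninit::<u8>::uninit(); 8];
    let mut buf = ToyBuf::new(&mut storage);
    // Initialize three bytes of the tail, then claim exactly those bytes.
    for (slot, byte) in buf.unfilled_mut().iter_mut().zip(b"abc") {
        slot.write(*byte);
    }
    // SAFETY: the loop above initialized the first three unfilled bytes.
    unsafe { buf.advance(3) };
    buf.filled().to_vec()
}
```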

r? @Amanieu

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 14, 2025
@joshtriplett joshtriplett force-pushed the borrowed-buf-no-init-tracking branch from 7de0924 to d131ff2 Compare November 14, 2025 08:43
@joshtriplett joshtriplett added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 14, 2025
@rust-log-analyzer

This comment has been minimized.

@zachs18

This comment was marked as resolved.

@Amanieu

This comment was marked as resolved.

/// `BorrowedCursor`. The cursor has write-only access to the unfilled portion of the buffer.
///
/// The lifetime `'data` is a bound on the lifetime of the underlying data.
pub struct BorrowedBuf<'data> {

Like we previously discussed, this should be made generic over T. For now it's fine to limit it to Copy types, but we may want to relax this later.


@Amanieu My intention was to do that in a separate PR, to make this PR more manageable to review.

@joshtriplett
Member Author

> If we allow making a BorrowedBuf from both a &mut [u8] and &mut [MaybeUninit<u8>],

This is a great catch, thank you.

I would be inclined to remove the support for making a BorrowedBuf from a &mut [u8], since that largely defeats the purpose. It made sense when BorrowedBuf tracked initialization, but it doesn't provide as much value when not tracking initialization.

And removing support for &mut [u8] seems preferable to removing support for getting a safe slice of [MaybeUninit<u8>].

@Amanieu
Member

Amanieu commented Nov 22, 2025

We need support for constructing from a &mut [u8] in order to implement read using read_buf. I think the proper approach is to restrict the API so a cursor can only initialize but not de-initialize elements of the buffer.

@joshtriplett
Member Author

joshtriplett commented Nov 26, 2025

> I think the proper approach is to restrict the API so a cursor can only initialize but not de-initialize elements of the buffer.

Wouldn't that still require effectively all of the initialized-bytes tracking we currently have? Or did you have some other strategy in mind here?

Would it suffice to drop the change making as_mut safe and leave out the rest of the initialization tracking?

(That does still seem unfortunate, though, as it'd be that much harder to use this API from safe code.)

@Amanieu
Member

Amanieu commented Nov 29, 2025

append can write data into the buffer without any issues related to initialization, which should be sufficient for safe code. Direct access to the uninitialized part of the buffer is only needed for unsafe code.
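
As a rough illustration of why `append` is enough for safe callers, here is a sketch with a hypothetical `ToyCursor` (not the std `BorrowedCursor`). The caller passes an ordinary `&[u8]`, and the cursor itself copies the bytes into the uninitialized tail and bumps the filled length, so the caller never touches `MaybeUninit` or writes `unsafe`. The panic on insufficient capacity is an assumption made for this toy.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a cursor over a possibly uninitialized buffer.
struct ToyCursor<'a> {
    buf: &'a mut [MaybeUninit<u8>],
    filled: usize,
}

impl<'a> ToyCursor<'a> {
    /// Number of bytes that can still be appended.
    fn capacity(&self) -> usize {
        self.buf.len() - self.filled
    }

    /// Safe: copies `data` into the unfilled tail and marks it as filled.
    /// Panics if `data` does not fit (an assumption of this sketch).
    fn append(&mut self, data: &[u8]) {
        assert!(data.len() <= self.capacity(), "not enough capacity");
        let end = self.filled + data.len();
        for (slot, &byte) in self.buf[self.filled..end].iter_mut().zip(data) {
            slot.write(byte);
        }
        self.filled = end;
    }

    /// The filled prefix, which `append` has fully initialized.
    fn filled(&self) -> &[u8] {
        // SAFETY: `filled` only grows in `append`, after the bytes are written.
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr().cast::<u8>(), self.filled) }
    }
}

fn append_demo() -> Vec<u8> {
    let mut storage = [MaybeUninit::<u8>::uninit(); 16];
    let mut cursor = ToyCursor { buf: &mut storage, filled: 0 };
    cursor.append(b"hello, ");
    cursor.append(b"world");
    cursor.filled().to_vec()
}
```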

@joshtriplett
Member Author

> append can write data into the buffer without any issues related to initialization, which should be sufficient for safe code. Direct access to the uninitialized part of the buffer is only needed for unsafe code.

I see. Then I'll remove the as_mut change and keep everything else as it is.

@joshtriplett joshtriplett force-pushed the borrowed-buf-no-init-tracking branch from d131ff2 to 3825099 Compare December 2, 2025 09:32
@rust-log-analyzer

This comment has been minimized.

@rustbot rustbot added O-SGX Target: SGX O-windows Operating system: Windows labels Dec 2, 2025
@Amanieu
Member

Amanieu commented Dec 3, 2025

@bors r+

@bors
Collaborator

bors commented Dec 3, 2025

📌 Commit 1ef636d has been approved by Amanieu

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 3, 2025
bors added a commit that referenced this pull request Dec 3, 2025
Rollup of 3 pull requests

Successful merges:

 - #148937 (Remove initialized-bytes tracking from `BorrowedBuf` and `BorrowedCursor`)
 - #149553 (added default_uwtables=true to aarch64_unknown_none targets)
 - #149578 (rustdoc: Fix broken link to `Itertools::format`)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 45b2a71 into rust-lang:main Dec 3, 2025
11 checks passed
@rustbot rustbot added this to the 1.93.0 milestone Dec 3, 2025
rust-timer added a commit that referenced this pull request Dec 3, 2025
Rollup merge of #148937 - joshtriplett:borrowed-buf-no-init-tracking, r=Amanieu

Remove initialized-bytes tracking from `BorrowedBuf` and `BorrowedCursor`

@joshtriplett joshtriplett deleted the borrowed-buf-no-init-tracking branch December 5, 2025 16:52
/// has write-only access to the unfilled portion of the buffer (you can think of it as a
/// write-only iterator).
/// Note that `BorrowedBuf` does not distinguish between uninitialized data and data that was
/// previously initialized but no longer contains valid data.

"data... that no longer contains valid data" sounds a bit odd?

@a1phyr a1phyr left a comment

Sorry that I missed that PR before it was merged, but I don't think that it should have been merged as-is. I may have missed some discussion, though.

> If people observe performance issues using read_buf (or something that calls it) with a given Read impl, they can fix those performance issues by implementing read_buf for that Read.

Well, actually they can't, because read_buf is unstable.

The design that was removed did not optimize for the past, but the present: most Read impls outside of std cannot implement read_buf, and it costs them quite a bit to remove all initialized bytes tracking.

I agree that such a change is desirable for the API, but removing all "compatibility" code now, when there is no clear path to stabilization, is too early IMHO. It at least should have been discussed a bit more.

What I would have liked to see:

  • Make init-bytes tracking cheaper by tracking initialization only at the BorrowedCursor level rather than per byte, e.g. by changing init from usize to bool. (I had a draft branch at https://github.com/a1phyr/rust/tree/improve_buf_api if you want, but doing read_to_end right is a bit tricky.)
  • and/or keep the public API as you propose, to make it ready for stabilization, and keep the old one private for compatibility until people can actually implement read_buf.
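
One possible reading of the first bullet, sketched on stable Rust. This is not the code from the linked draft branch; `FlagBuf`, `unfilled_init`, and the `zeroed` counter are hypothetical names for illustration. A single `bool` records whether the whole buffer has been initialized, so the zeroing cost is paid at most once and `advance` can stay safe by checking the flag.

```rust
use std::mem::MaybeUninit;

/// Hypothetical buffer that tracks initialization with one flag, not a byte count.
struct FlagBuf<'a> {
    buf: &'a mut [MaybeUninit<u8>],
    filled: usize,
    /// True once every byte of `buf` is known to be initialized.
    init: bool,
    /// Instrumentation for this example only: total bytes zeroed so far.
    zeroed: usize,
}

impl<'a> FlagBuf<'a> {
    fn new(buf: &'a mut [MaybeUninit<u8>]) -> Self {
        FlagBuf { buf, filled: 0, init: false, zeroed: 0 }
    }

    /// Returns the unfilled tail as `&mut [u8]`, zeroing it only the first time.
    fn unfilled_init(&mut self) -> &mut [u8] {
        let tail = &mut self.buf[self.filled..];
        if !self.init {
            for slot in tail.iter_mut() {
                slot.write(0);
            }
            self.zeroed += tail.len();
            self.init = true;
        }
        // SAFETY: the tail was zeroed now or on an earlier call, and callers can
        // only write valid `u8` values through the returned slice.
        unsafe { std::slice::from_raw_parts_mut(tail.as_mut_ptr().cast::<u8>(), tail.len()) }
    }

    /// Safe because it refuses to run unless the whole buffer is initialized.
    fn advance(&mut self, n: usize) {
        assert!(self.init, "buffer must be initialized before advancing");
        assert!(n <= self.buf.len() - self.filled);
        self.filled += n;
    }
}

/// Fills an 8-byte buffer one byte per step; returns (filled, zeroed).
fn flag_demo() -> (usize, usize) {
    let mut storage = [MaybeUninit::<u8>::uninit(); 8];
    let mut buf = FlagBuf::new(&mut storage);
    for &byte in b"abcdefgh" {
        let tail = buf.unfilled_init();
        tail[0] = byte;
        buf.advance(1);
    }
    (buf.filled, buf.zeroed)
}
```

In this toy, eight one-byte steps zero eight bytes in total, because only the first call to `unfilled_init` pays for the memset.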

Also, in the future could you link the tracking issue (#78485 or #117693) please? It makes stuff easier to follow.


Comment on lines 139 to -149
let mut buf = BorrowedBuf::from(&mut *self.buf);
// SAFETY: `self.filled` bytes will always have been initialized.
unsafe {
    buf.set_init(self.initialized);
}

let result = reader.read_buf(buf.unfilled());

self.pos = 0;
self.filled = buf.len();
self.initialized = buf.init_len();

This means that for most Read impls, filling the buffer will cost an additional memset, which is not great.

// SAFETY: init is either 0 or the init_len from the previous iteration.
read_buf.set_init(init);
}

if read_buf.capacity() >= DEFAULT_BUF_SIZE {
let mut cursor = read_buf.unfilled();
match reader.read_buf(cursor.reborrow()) {

Same here but for all copied bytes.

Comment on lines +560 to +566
// SAFETY: We do not uninitialize any part of the buffer.
let n = read(unsafe { cursor.as_mut().write_filled(0) })?;
assert!(n <= cursor.capacity());
// SAFETY: We've initialized the entire buffer, and `read` can't make it uninitialized.
unsafe {
    cursor.advance(n);
}

Too bad there is unsafe code here now... I really liked the fact that read_buf was easy to implement before.
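
For readers following along, the pattern in the snippet above can be approximated on stable Rust as follows. `read_via_zeroing` is a hypothetical helper, not a std function, and it takes a bare `&mut [MaybeUninit<u8>]` rather than a real cursor. It shows the two points raised in this review: the whole unfilled region is zeroed on every call before `Read::read` runs, and an `unsafe` block is needed to view the region as `&mut [u8]`.

```rust
use std::io::Read;
use std::mem::MaybeUninit;

/// Hypothetical helper: fill `unfilled` through a plain `Read::read` call.
/// Returns how many bytes the reader produced.
fn read_via_zeroing<R: Read>(
    reader: &mut R,
    unfilled: &mut [MaybeUninit<u8>],
) -> std::io::Result<usize> {
    // The memset: with no initialization tracking, every call pays for this.
    for slot in unfilled.iter_mut() {
        slot.write(0);
    }
    // SAFETY: every element was just initialized, and `MaybeUninit<u8>` has the
    // same layout as `u8`. The reader can only store valid `u8` values.
    let bytes: &mut [u8] = unsafe {
        std::slice::from_raw_parts_mut(unfilled.as_mut_ptr().cast::<u8>(), unfilled.len())
    };
    let n = reader.read(bytes)?;
    assert!(n <= bytes.len());
    Ok(n)
}
```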


Same here, for all copied bytes (this is the base io::copy case).

unsafe {
    read_buf.set_init(initialized);
}

let mut cursor = read_buf.unfilled();

Same here, this will be a perf regression for slow readers (e.g. one that returns one byte at a time).
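
A small simulation of that worst case, using only safe code. `OneByteAtATime` and `fill_without_tracking` are hypothetical names, and the loop is a simplified model, not the std implementation. It assumes the unfilled tail is re-zeroed before every `read` call, which is what happens when nothing remembers that the tail was already initialized.

```rust
use std::io::Read;

/// A deliberately slow reader that yields one byte per `read` call.
struct OneByteAtATime<'a>(&'a [u8]);

impl Read for OneByteAtATime<'_> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        if self.0.is_empty() || buf.is_empty() {
            return Ok(0);
        }
        buf[0] = self.0[0];
        self.0 = &self.0[1..];
        Ok(1)
    }
}

/// Fills `buf`, zeroing the whole unfilled tail before every read.
/// Returns (bytes filled, total bytes zeroed).
fn fill_without_tracking(
    reader: &mut impl Read,
    buf: &mut [u8],
) -> std::io::Result<(usize, usize)> {
    let mut filled = 0;
    let mut zeroed = 0;
    while filled < buf.len() {
        let tail = &mut buf[filled..];
        tail.fill(0); // stands in for the per-call memset
        zeroed += tail.len();
        let n = reader.read(tail)?;
        if n == 0 {
            break;
        }
        filled += n;
    }
    Ok((filled, zeroed))
}
```

For an 8-byte buffer and an 8-byte source, this model zeroes 8 + 7 + ... + 1 = 36 bytes to deliver 8, so the zeroing work grows quadratically with the buffer size for such a reader.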

@ChrisDenton
Member

> Well, actually they can't, because read_buf is unstable.

Right, these are all unstable APIs, so this only affects nightly users, no?

> ...when there is no clear path to stabilization...

The point of this change is to put it on a path to stabilization.

@a1phyr
Contributor

a1phyr commented Dec 13, 2025

> Right this is all unstable APIs so it only affects nightly users, no?

No, because read_buf is used a lot within std by stable APIs. As I pointed out above, there are at least io::BufReader, io::Read::read_to_end (and io::Read::read_to_string), and io::copy.

I'm not sure that we really want to regress all these use cases while there is no fix for users. As I said, we can probably keep providing these APIs, but private to std, to avoid unnecessary perf regressions.

> The point of this change is to put it on a path to stabilization.

Granted, but unfortunately we are not there yet.


Labels

O-SGX Target: SGX O-windows Operating system: Windows S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
