Document MaybeUninit bit validity #140463
@@ -252,6 +252,46 @@ use crate::{fmt, intrinsics, ptr, slice};
/// std::process::exit(*code); // UB! Accessing uninitialized memory.
/// }
/// ```
///
/// # Validity
///
/// A `MaybeUninit<T>` has no validity requirement – any sequence of [bytes][reference-byte] of the
/// appropriate length, initialized or uninitialized, is a valid representation of `MaybeUninit<T>`.
///
/// However, "round-tripping" via `MaybeUninit` does not always result in the original value.
/// `MaybeUninit` can have padding, and the contents of that padding are not preserved.
/// Concretely, given distinct `T` and `U` where `size_of::<T>() == size_of::<U>()`, the following
/// code is not guaranteed to be sound:
///
/// ```rust,no_run
/// # use core::mem::{MaybeUninit, transmute};
/// # struct T; struct U;
/// fn identity(t: T) -> T {
///     unsafe {
///         let u: MaybeUninit<U> = transmute(t);
///         transmute(u)
///     }
/// }
/// ```
///
/// If the representation of `t` contains initialized bytes at byte offsets where `U` contains padding bytes, these
/// may not be preserved in `MaybeUninit<U>`. Transmuting `u` back to `T` (i.e., `transmute(u)` above) may thus
/// be undefined behavior or yield a value different from `t` due to those bytes being lost. This is an active area of discussion, and this code
/// may become sound in the future.
///
/// However, so long as no such byte offsets exist, then the preceding `identity` example *is* sound.
/// In particular, since `[u8; N]` has no padding bytes, transmuting `t` to `MaybeUninit<[u8; size_of::<T>()]>`
/// and back will always produce the original value `t` again. This is true even if `t` contains [provenance]:
/// the resulting value will have the same provenance as the original `t`.
There's a bit of a footgun for potential misunderstandings here but maybe I am being too nitpicky -- and I don't know what else we could say, anyway: if

Maybe the core issue here is how we use the term "value"? E.g. I might say something like: "

That doesn't really resolve your concern, but some random thoughts. Maybe it'll prompt you to think of better language we could use here?

Another possibility: Instead of saying that the value is fully preserved, maybe we could say that the following contents of the value are preserved?
...and then explicitly disclaim any other value contents that we add to the AM in the future?

It's more a problem of abstraction: the Rust code (In fact we sometimes even insert reborrows, making this more like
Alias tracking is based on provenance. Provenance is also just data. It may reference other data, such as indicating a position in a tree.

In that case, I'm leaning towards this option unless you have thoughts about language we could use that captures specifically the subset of "value" that we want to address here. How does that sound?

Here's the best I can come up with so far. But I feel like this may actually be easier to write for someone who's less deeply entrenched in these discussions. ;)

Okay here's my version of this (which is in the PR now):

That part isn't given (there can be reborrows inside In fact my comment explicitly stated "the value returned by identity may not be exactly the same as its argument" so not sure how you got from there to your version. It also seems very useful to state that it is still equivalent to a "boring" identity function.

I thought you were being overly-conservative in your wording, which I realize now isn't the case. Changed to be closer to your wording. Better?

That works for me :)
///
/// Note a potential footgun: if `t` contains a reference, then there may be implicit reborrows of the reference
/// any time it is copied, which may alter its provenance. In that case, the value returned by `identity` may
/// not be exactly the same as its argument. However, even in this case, it remains true that `identity` behaves
/// the same as a function that just returns `t` immediately (i.e., `fn identity<T>(t: T) -> T { t }`).
///
/// [provenance]: crate::ptr#provenance
///
/// [reference-byte]: ../../reference/memory-model.html#bytes
#[stable(feature = "maybe_uninit", since = "1.36.0")]
// Lang item so we can wrap other types in it. This is useful for coroutines.
#[lang = "maybe_uninit"]
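For readers who want a concrete instance of the padding-free round trip described in the new Validity section, here is a minimal sketch. The type `Pair` and the helpers `round_trip_pair` and `round_trip_str` are hypothetical names chosen for illustration and are not part of this PR. The sketch relies only on the guarantee stated in the doc text: `[u8; N]` has no padding bytes, so going through `MaybeUninit<[u8; N]>` keeps every non-padding byte of the original value, including pointer provenance.

```rust
use std::mem::{size_of, transmute, MaybeUninit};

// Hypothetical example type. With `repr(C)`, padding bytes sit between `a`
// and `b` on targets where `u32` has an alignment greater than 1.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct Pair {
    a: u8,
    b: u32,
}

// Round-trips a `Pair` through a byte array wrapped in `MaybeUninit`.
// The padding bytes of `Pair` may be uninitialized in the intermediate
// value, which is fine because `MaybeUninit` has no validity requirement.
fn round_trip_pair(p: Pair) -> Pair {
    unsafe {
        let bytes: MaybeUninit<[u8; size_of::<Pair>()]> = transmute(p);
        transmute(bytes)
    }
}

// Same idea for a value that carries provenance (a reference). Per the doc
// text, the result has the same provenance as the argument, so it can still
// be dereferenced.
fn round_trip_str(s: &'static str) -> &'static str {
    unsafe {
        let bytes: MaybeUninit<[u8; size_of::<&'static str>()]> = transmute(s);
        transmute(bytes)
    }
}
```

Calling `round_trip_pair(Pair { a: 7, b: 0xDEAD_BEEF })` should compare equal to its argument, and `round_trip_str("hello")` should yield a string slice that can be read as usual. Replacing the byte array with a type that has padding at offsets where the argument has data would fall under the "not guaranteed to be sound" case above.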
Moving this discussion here:
@RalfJung would you like me to add language like this to this PR?
Update: I've added the following as a more concrete and fleshed out draft. I can edit or remove as preferred.
Updated to reference the definition of a byte.
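To make the byte-level wording concrete, here is a small sketch of what "no validity requirement" permits. The helper name `byte_via_maybe_uninit_bool` is hypothetical and is not taken from the PR. It assumes only what the doc text states, plus the documented guarantee that `MaybeUninit<T>` has the same size as `T`: any byte is a valid `MaybeUninit<bool>`, even though most byte values are not a valid `bool`, and since `bool` has no padding the byte survives the round trip.

```rust
use std::mem::{transmute, MaybeUninit};

// Stores an arbitrary byte in a `MaybeUninit<bool>` and reads it back as a
// `u8`. The `bool` inside is never asserted to be initialized or valid, so
// byte values other than 0 and 1 are acceptable here.
fn byte_via_maybe_uninit_bool(byte: u8) -> u8 {
    unsafe {
        let slot: MaybeUninit<bool> = transmute(byte);
        transmute(slot)
    }
}
```

For example, `byte_via_maybe_uninit_bool(0xFF)` should return `0xFF`. Calling `assume_init` on the intermediate `slot` for such a byte would be undefined behavior, which is why the sketch never does so.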