-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[Docs] Expand the section about pointer aliasing in TRPL: 4.2. Unsafe Code #21159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -49,32 +49,15 @@ and mutable (`&mut`) references have some aliasing and freezing | |
guarantees, required for memory safety. | ||
|
||
In particular, if you have an `&T` reference, then the `T` must not be | ||
modified through that reference or any other reference. There are some | ||
standard library types, e.g. `Cell` and `RefCell`, that provide inner | ||
mutability by replacing compile time guarantees with dynamic checks at | ||
runtime. | ||
modified through that reference or any other reference or pointer and must not | ||
alias with any `&mut` reference. There are some standard library types, e.g. | ||
`Cell` and `RefCell`, that provide inner mutability by replacing compile time | ||
guarantees with dynamic checks at runtime. | ||
|
||
An `&mut` reference has a different constraint: when an object has an | ||
`&mut T` pointing into it, then that `&mut` reference must be the only | ||
such usable path to that object in the whole program. That is, an | ||
`&mut` cannot alias with any other references. | ||
|
||
Using `unsafe` code to incorrectly circumvent and violate these | ||
restrictions is undefined behaviour. For example, the following | ||
creates two aliasing `&mut` pointers, and is invalid. | ||
|
||
``` | ||
use std::mem; | ||
let mut x: u8 = 1; | ||
|
||
let ref_1: &mut u8 = &mut x; | ||
let ref_2: &mut u8 = unsafe { mem::transmute(&mut *ref_1) }; | ||
|
||
// oops, ref_1 and ref_2 point to the same piece of data (x) and are | ||
// both usable | ||
*ref_1 = 10; | ||
*ref_2 = 20; | ||
``` | ||
`&mut T` pointing into it, then it can be modified only through this reference | ||
and not through any other reference or pointer. Moreover, an | ||
`&mut T` must not alias with any other `&` or `&mut` reference. | ||
|
||
## Raw pointers | ||
|
||
|
@@ -127,6 +110,111 @@ The latter assumption allows the compiler to optimize more | |
effectively. As can be seen, actually *creating* a raw pointer is not | ||
unsafe, and neither is converting to an integer. | ||
|
||
## Reference and pointer aliasing | ||
|
||
Several examples can give a better comprehension of the reference and pointer | ||
aliasing rules. | ||
|
||
First of all, using `unsafe` code to circumvent the restrictions on references | ||
leads to undefined behaviour. For example, the following code creates two | ||
aliasing `&mut` references, and is therefore illegal. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this code from #[unstable = "waiting for `DerefMut` to become stable"]
impl<'b, T> DerefMut<T> for RefMut<'b, T> {
#[inline]
fn deref_mut<'a>(&'a mut self) -> &'a mut T {
unsafe { &mut *self._parent.value.get() }
}
} It's using unsafe code to create an aliasing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This code was removed in this commit 5048953 The statement "if you create two aliasing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure why I had an old branch set to my current Rust HEAD, but the code is still there there without the stability comment in 6d8342f: impl<'b, T> DerefMut<T> for RefMut<'b, T> {
#[inline]
fn deref_mut<'a>(&'a mut self) -> &'a mut T {
unsafe { &mut *self._parent.value.get() }
}
} |
||
|
||
``` | ||
use std::mem; | ||
let mut x: u8 = 1; | ||
|
||
let ref_1: &mut u8 = &mut x; | ||
let ref_2: &mut u8 = unsafe { mem::transmute(&mut *ref_1) }; | ||
|
||
// oops, ref_1 and ref_2 point to the same piece of data (x) and are | ||
// both usable | ||
*ref_1 = 10; | ||
*ref_2 = 20; | ||
``` | ||
|
||
When raw pointers come into play, the situation becomes more complex. | ||
Raw pointers can alias with anything, but at the same time they should respect | ||
the reference aliasing rules, so it's forbidden to write through a raw pointer, | ||
when the memory location it points to is borrowed by a safe Rust reference. | ||
|
||
``` | ||
// Technically this function invokes undefined behavior and can do anything, | ||
// but usually it prints 1, 2, 2 and one more number. | ||
// The fourth number can be, for example, 2 in optimized build and 3 in | ||
// unoptimized build. | ||
unsafe fn f(p: *mut i32, r: &mut i32) { | ||
let mut val1 = *p; | ||
println!("{}", val1); | ||
*r = 2; // This value change should be visible through the pointer p. | ||
val1 = *p; // This load operation cannot be optimized out. | ||
println!("{}", val1); | ||
|
||
let mut val2 = *r; | ||
println!("{}", val2); | ||
// This value change is illegal and may be invisible through the reference r. | ||
*p = 3; | ||
// This load operation can be optimized out based on the assumption of | ||
// uniqueness of the reference r. | ||
val2 = *r; | ||
println!("{}", val2); | ||
} | ||
|
||
fn main() { | ||
let mut val = 1i32; | ||
unsafe { f(&mut val, &mut val) }; | ||
} | ||
``` | ||
|
||
So, raw pointers can be relatively safely used as observers, but an extreme | ||
care should be taken when performing write operations through them. You should | ||
be sure that no one else refers to the modified value by `&T` or `&mut T`. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. relatively safely? This just leaves the reader wondering which rules you have left out. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, these raw pointers still can be null or point to garbage, they are not completely safe. That's just not related to aliasing. |
||
|
||
The structure `UnsafeCell` is special-cased in the compiler to allow the | ||
language to work correctly with interior mutability. A value of `UnsafeCell` | ||
can always be safely read through a reference, even if it is updated through | ||
something else. | ||
|
||
``` | ||
use std::cell::UnsafeCell; | ||
|
||
// This function should print 1 and 2. | ||
// If UnsafeCell is replaced with equivalent user-defined structure, then the | ||
// second printed number is undefined, for example, the function can print 2 in | ||
// unoptimized build and 1 in optimized build. | ||
unsafe fn f_cell(p: *mut i32, r: &UnsafeCell<i32>) { | ||
let mut val = r.value; | ||
println!("{}", val); | ||
*p = 2; // This value change should be visible through the reference r. | ||
val = r.value; // This load operation cannot be optimized out. | ||
println!("{}", val); | ||
} | ||
|
||
fn main() { | ||
let mut val_cell = UnsafeCell { value: 1i32 }; | ||
unsafe { f_cell(&mut val_cell.value, &mut val_cell) }; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When not done in statics, the more conventional way to construct an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But this piece of code is not conventional, |
||
} | ||
``` | ||
|
||
Finally, Rust doesn't have any kind of type based aliasing rules for raw | ||
pointers like C, e.g. pointers to unrelated types (e.g. `i32` and `f32`) may | ||
alias. | ||
|
||
``` | ||
// This program should print 1 and 0. | ||
unsafe fn f(pi: *mut i32, pf: *mut f32) { | ||
let mut val = *pi; | ||
println!("{}", val); | ||
*pf = 0.0; // This value change should be visible through the pointer pi. | ||
val = *pi; // This load operation cannot be optimized out. | ||
println!("{}", val); | ||
} | ||
|
||
fn main() { | ||
let mut val = 1i32; | ||
unsafe { f(&mut val, &mut val as *mut i32 as *mut f32) }; | ||
} | ||
``` | ||
|
||
### References and raw pointers | ||
|
||
At runtime, a raw pointer `*` and a reference pointing to the same | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Distinguishing between pointers and references seems a bit strange and overly verbose. It seems like we should just use a single word to refer to all pointer types. The language reference refers to them as 'pointers', but I've noticed in the past that some people in the Rust community dislike the P-word.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can adjust the text to use the term "reference" for
&T/&mut T
, "raw pointer" for*const T/*mut T
and "pointer" for both references and raw pointers. Would it be ok?