Commit 268b1d9

New Section - Type Layout
1 parent 4b49378 commit 268b1d9

7 files changed

+289
-23
lines changed

src/SUMMARY.md

+1
Original file line number | Diff line number | Diff line change
@@ -61,6 +61,7 @@
6161
- [Type system](type-system.md)
6262
- [Types](types.md)
6363
- [Dynamically Sized Types](dynamically-sized-types.md)
64+
- [Type layout](type-layout.md)
6465
- [Interior mutability](interior-mutability.md)
6566
- [Subtyping](subtyping.md)
6667
- [Type coercions](type-coercions.md)

src/attributes.md

+1-1
@@ -357,7 +357,7 @@ pub mod m3 {
357357
}
358358
```
359359

360-
### Inline attributes
360+
### Inline attribute
361361

362362
The inline attribute suggests that the compiler should place a copy of
363363
the function or static in the caller, rather than generating code to

src/dynamically-sized-types.md

+1-1
@@ -2,7 +2,7 @@
22

33
Most types have a fixed size that is known at compile time and implement the
44
trait [`Sized`][sized]. A type with a size that is known only at run-time is
5-
called a _dynamically sized type_ (_DST_) or (informally) an unsized type.
5+
called a _dynamically sized type_ (_DST_) or, informally, an unsized type.
66
[Slices] and [trait objects] are two examples of <abbr title="dynamically sized
77
types">DSTs</abbr>. Such types can only be used in certain cases:
88

src/glossary.md

+10
@@ -5,6 +5,11 @@
55
An ‘abstract syntax tree’, or ‘AST’, is an intermediate representation of
66
the structure of the program when the compiler is compiling it.
77

8+
### Alignment
9+
10+
The *alignment* of a value specifies what addresses are valid to store the value
11+
at.
12+
813
### Arity
914

1015
Arity refers to the number of arguments a function or operation takes.
@@ -57,6 +62,11 @@ can create such an lvalue without initializing it.
5762
Prelude, or The Rust Prelude, is a small collection of items - mostly traits - that are
5863
imported into every module of every crate. The traits in the prelude are pervasive.
5964

65+
### Size
66+
67+
The *size* of a value is the offset in bytes between successive elements in an
68+
array with that item type including alignment padding.
69+
6070
### Slice
6171

6272
A slice is a dynamically-sized view into a contiguous sequence, written as `[T]`.

src/items/enumerations.md

+33-19
@@ -4,8 +4,6 @@ An _enumeration_ is a simultaneous definition of a nominal [enumerated type] as
44
well as a set of *constructors*, that can be used to create or pattern-match
55
values of the corresponding enumerated type.
66

7-
[enumerated type]: types.html#enumerated-types
8-
97
Enumerations are declared with the keyword `enum`.
108

119
An example of an `enum` item and its use:
@@ -24,7 +22,7 @@ Enumeration constructors can have either named or unnamed fields:
2422

2523
```rust
2624
enum Animal {
27-
Dog (String, f64),
25+
Dog(String, f64),
2826
Cat { name: String, weight: f64 },
2927
}
3028

@@ -34,36 +32,52 @@ a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };
3432

3533
In this example, `Cat` is a _struct-like enum variant_, whereas `Dog` is simply
3634
called an enum variant. Each enum instance has a _discriminant_ which is an
37-
integer associated to it that is used to determine which variant it holds.
35+
integer associated to it that is used to determine which variant it holds. An
36+
opaque reference to this discriminant can be obtained with the [`mem::discriminant`]
37+
function.
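
As a minimal sketch of how `mem::discriminant` can be used, the following compares two values by variant only, ignoring their fields. The helper `same_variant` and the sample values are hypothetical names introduced for illustration.

```rust
use std::mem;

enum Animal {
    Dog(String, f64),
    Cat { name: String, weight: f64 },
}

/// Hypothetical helper: true when both values hold the same variant,
/// regardless of the data stored in their fields.
fn same_variant(a: &Animal, b: &Animal) -> bool {
    mem::discriminant(a) == mem::discriminant(b)
}

fn main() {
    let rex = Animal::Dog("Rex".to_string(), 30.0);
    let fido = Animal::Dog("Fido".to_string(), 12.5);
    let spotty = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };

    assert!(same_variant(&rex, &fido));
    assert!(!same_variant(&rex, &spotty));
}
```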
3838

3939
## C-like Enumerations
4040

41-
If there is no data attached to *any* of the variants of an enumeration it is
42-
called a *c-like enumeration*. If a discriminant isn't specified, they start at
43-
zero, and add one for each variant, in order. Each enum value is just its
44-
discriminant which you can specify explicitly:
41+
If there is no data attached to *any* of the variants of an enumeration and
42+
there is at least one variant, then it is called a *C-like enumeration*.
43+
44+
C-like enumerations can be cast to integer types with the `as` operator by a
45+
[numeric cast]. The enumeration can optionally specify which integer each
46+
discriminant gets by following the variant name with `=` and then an integer
47+
literal. If the first variant in the declaration is unspecified, then it is set
48+
to zero. Every other unspecified discriminant is set to one higher than the
49+
discriminant of the previous variant in the declaration.
4550

4651
```rust
4752
enum Foo {
4853
Bar, // 0
49-
Baz = 123,
54+
Baz = 123, // 123
5055
Quux, // 124
5156
}
57+
58+
let baz_discriminant = Foo::Baz as u32;
59+
assert_eq!(baz_discriminant, 123u32);
5260
```
5361

54-
The right hand side of the specification is interpreted as an `isize` value,
55-
but the compiler is allowed to use a smaller type in the actual memory layout.
56-
The [`repr` attribute] can be added in order to change the type of the right
57-
hand side and specify the memory layout.
62+
Under the [default representation], the specified discriminant is interpreted as
63+
an `isize` value although the compiler is allowed to use a smaller type in the
64+
actual memory layout. The size and thus acceptable values can be changed by
65+
using a [primitive representation] or the [`C` representation].
66+
67+
It is an error when two variants share the same discriminant, or when the
68+
discriminant preceding an unspecified discriminant is already the maximum value
69+
for the size of the discriminant. <!-- Need examples here. -->
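
The two error cases can be sketched as follows. The enum names are hypothetical, and the rejected declarations are left commented out so that the block still compiles; uncommenting either one is expected to produce a compile error.

```rust
// Accepted: `B` is unspecified and takes 254 + 1 = 255, which still fits in `u8`.
#[repr(u8)]
enum Fits {
    A = 254,
    B,
}

// Rejected: two variants share the discriminant 1.
// enum SharedDiscriminant {
//     A = 1,
//     B = 1,
// }

// Rejected: `Next` is unspecified, but the previous discriminant is already
// the maximum value of `u8`.
// #[repr(u8)]
// enum Overflow {
//     Max = 255,
//     Next,
// }

fn main() {
    assert_eq!(Fits::A as u8, 254);
    assert_eq!(Fits::B as u8, 255);
}
```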
5870

59-
[`repr` attribute]: attributes.html#ffi-attributes
71+
## Zero-variant Enumerations
6072

61-
You can also cast a c-like enum to get its discriminant:
73+
Enums with zero variants are known as *zero-variant enumerations*. As they have
74+
no valid values, they cannot be instantiated.
6275

6376
```rust
64-
# enum Foo { Baz = 123 }
65-
let x = Foo::Baz as u32; // x is now 123u32
77+
enum ZeroVariants {}
6678
```
6779

68-
This only works as long as none of the variants have data attached. If it were
69-
`Baz(i32)`, this is disallowed.
80+
[enumerated type]: types.html#enumerated-types
81+
[`mem::discriminant`]: std/mem/fn.discriminant.html
82+
[numeric cast]: expressions/operator-expr.html#semantics
83+
[`repr` attribute]: attributes.html#ffi-attributes

src/type-layout.md

+241
@@ -0,0 +1,241 @@
1+
# Type Layout
2+
3+
The layout of a type is the size, alignment, and offsets of any fields and
4+
discriminants for the values of that type.
5+
6+
While specific releases of the compiler will have the same layout for types,
7+
there is a lot of room for new versions of the compiler to do different things.
8+
Instead of trying to document exactly what is done, we only document what is
9+
guaranteed today.
10+
11+
## Size and Alignment
12+
13+
All values have an alignment and size.
14+
15+
The *alignment* of a value specifies what addresses are valid to store the value
16+
at. A value of alignment `n` must only be stored at an address that is a
17+
multiple of `n`. For example, a value with an alignment of 2 must be stored at an
18+
even address, while a value with an alignment of 1 can be stored at any address.
19+
Alignment is measured in bytes, must be at least 1, and is always a power of 2.
20+
The alignment of a value can be checked with the [`align_of_val`] function.
21+
22+
The *size* of a value is the offset in bytes between successive elements in an
23+
array with that item type including alignment padding. The size of a value is
24+
always a multiple of its alignment. The size of a value can be checked with the
25+
[`size_of_val`] function.
26+
27+
Types where all values have the same size and alignment known at compile time
28+
implement the [`Sized`] trait and can be checked with the [`size_of`] and
29+
[`align_of`] functions. Types that are not [`Sized`] are known as [dynamically
30+
sized types]. Since all values of a `Sized` type share the same size and
31+
alignment, we refer to those shared values as the size of the type and the
32+
alignment of the type respectively.
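
The relationships described in this section can be checked with a short sketch. The helper `obeys_layout_rules` is a hypothetical name introduced here; the `std::mem` functions are the ones linked above.

```rust
use std::mem::{align_of, align_of_val, size_of, size_of_val};

/// Hypothetical helper: true when the alignment is a nonzero power of two
/// and the size is a multiple of that alignment.
fn obeys_layout_rules(size: usize, align: usize) -> bool {
    align.is_power_of_two() && size % align == 0
}

fn main() {
    let x: u32 = 7;
    // For a `Sized` type, the per-value and per-type queries agree.
    assert_eq!(size_of_val(&x), size_of::<u32>());
    assert_eq!(align_of_val(&x), align_of::<u32>());
    assert!(obeys_layout_rules(size_of::<u32>(), align_of::<u32>()));

    // A slice is dynamically sized, so only the per-value functions apply.
    let s: &[u16] = &[1, 2, 3];
    assert_eq!(size_of_val(s), 3 * size_of::<u16>());
    assert!(obeys_layout_rules(size_of_val(s), align_of_val(s)));
}
```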
33+
34+
## Primitive Data Layout
35+
36+
The size of most primitives is given in this table.
37+
38+
Type | `size_of::<Type>()`
39+
- | -
40+
bool | 1
41+
u8 | 1
42+
u16 | 2
43+
u32 | 4
44+
u64 | 8
45+
i8 | 1
46+
i16 | 2
47+
i32 | 4
48+
i64 | 8
49+
f32 | 4
50+
f64 | 8
51+
char | 4
52+
53+
`usize` and `isize` have a size big enough to contain every address on the
54+
target platform. For example, on a 32 bit target, this is 4 bytes and on a 64
55+
bit target, this is 8 bytes.
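
A few of the table entries, and the pointer-sized integers, can be spot-checked as in the sketch below. The helper `pointer_width_bytes` is a hypothetical name, and the final two checks only run on targets with the stated pointer width.

```rust
use std::mem::size_of;

/// Hypothetical helper: size in bytes of a pointer-sized integer on the
/// compilation target.
fn pointer_width_bytes() -> usize {
    size_of::<usize>()
}

fn main() {
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(size_of::<u16>(), 2);
    assert_eq!(size_of::<f64>(), 8);
    assert_eq!(size_of::<char>(), 4);

    // `isize` always matches `usize`.
    assert_eq!(size_of::<isize>(), pointer_width_bytes());

    if cfg!(target_pointer_width = "32") {
        assert_eq!(pointer_width_bytes(), 4);
    }
    if cfg!(target_pointer_width = "64") {
        assert_eq!(pointer_width_bytes(), 8);
    }
}
```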
56+
57+
Most primitives are generally aligned to their size, although this is
58+
platform-specific behavior. In particular, on x86, `u64` and `f64` may be
59+
aligned to only 32 bits.
60+
61+
## Pointers and References Layout
62+
63+
Pointers and references have the same layout. Mutability of the pointer or
64+
reference does not change the layout.
65+
66+
Pointers to sized types have the same size and alignment as `usize`.
67+
68+
Pointers to unsized types are sized. The size and alignment are guaranteed to be
69+
at least equal to the size and alignment of a pointer.
70+
71+
> Note: Though you should not rely on this, all pointers to <abbr
72+
> title="Dynamically Sized Types">DSTs</abbr> are currently twice the size of
73+
> `usize` and have the same alignment.
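
The pointer rules above can be illustrated with the following sketch. The helper `is_thin` is a hypothetical name; the last assertion reflects current compiler behavior as described in the note and is not a guarantee.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: true when pointer type `P` has exactly the size and
/// alignment of `usize`.
fn is_thin<P>() -> bool {
    size_of::<P>() == size_of::<usize>() && align_of::<P>() == align_of::<usize>()
}

fn main() {
    // Pointers and references to sized types, mutable or not.
    assert!(is_thin::<&u8>());
    assert!(is_thin::<&mut u8>());
    assert!(is_thin::<*const u64>());
    assert!(is_thin::<*mut u64>());

    // Pointers to unsized types are at least as large as a thin pointer.
    assert!(size_of::<&[u8]>() >= size_of::<usize>());
    assert!(size_of::<&str>() >= size_of::<usize>());

    // Current behavior only, not something to rely on.
    assert_eq!(size_of::<&[u8]>(), 2 * size_of::<usize>());
}
```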
74+
75+
## Array Layout
76+
77+
Arrays are laid out so that the `nth` element of the array is offset from the
78+
start of the array by `n * the size of the type` bytes. An array of `[T; n]`
79+
has a size of `size_of::<T>() * n` and the same alignment as `T`.
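
A small sketch of the array rules, counting elements from zero. The helper `element_offset` is a hypothetical name that measures the offset through pointer addresses.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: byte offset of element `index` from the start of
/// the array.
fn element_offset<T, const N: usize>(array: &[T; N], index: usize) -> usize {
    let base = array.as_ptr() as usize;
    let element = &array[index] as *const T as usize;
    element - base
}

fn main() {
    let a = [10u32, 20, 30, 40];
    assert_eq!(element_offset(&a, 0), 0);
    assert_eq!(element_offset(&a, 3), 3 * size_of::<u32>());

    assert_eq!(size_of::<[u32; 4]>(), 4 * size_of::<u32>());
    assert_eq!(align_of::<[u32; 4]>(), align_of::<u32>());
}
```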
80+
81+
## Slice Layout
82+
83+
Slices have the same layout as the section of the array they slice.
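
For instance, a slice covering three elements of an array occupies exactly those three elements' bytes and starts at the address of the first one. This is a minimal sketch; `byte_len` is a hypothetical helper.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical helper: number of bytes covered by a slice.
fn byte_len<T>(slice: &[T]) -> usize {
    size_of_val(slice)
}

fn main() {
    let a = [1u16, 2, 3, 4, 5];
    let s = &a[1..4];

    assert_eq!(byte_len(s), 3 * size_of::<u16>());
    // The slice begins at the second element of the array.
    assert_eq!(s.as_ptr(), &a[1] as *const u16);
}
```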
84+
85+
## Tuple Layout
86+
87+
Tuples do not have any guarantees about their layout.
88+
89+
The exception to this is the unit tuple (`()`) which is guaranteed as a
90+
zero-sized type to have a size of 0 and an alignment of 1.
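
The unit tuple guarantee can be checked directly; the array line follows from the array rule that size is the element size times the length.

```rust
use std::mem::{align_of, size_of};

fn main() {
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(align_of::<()>(), 1);

    // An array of zero-sized elements is itself zero-sized.
    assert_eq!(size_of::<[(); 100]>(), 0);
}
```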
91+
92+
## Trait Object Layout
93+
94+
Trait objects have the same layout as the value the trait object is of.
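
One way to observe this is through `size_of_val` and `align_of_val`, which report the layout of the underlying value when given a trait object. This is a sketch; `layout_of_dyn` is a hypothetical helper and `Debug` is used only as a convenient trait.

```rust
use std::fmt::Debug;
use std::mem::{align_of, align_of_val, size_of, size_of_val};

/// Hypothetical helper: (size, alignment) of the value behind a trait object.
fn layout_of_dyn(value: &dyn Debug) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}

fn main() {
    let small: u8 = 1;
    let large: [u64; 4] = [0; 4];

    assert_eq!(layout_of_dyn(&small), (size_of::<u8>(), align_of::<u8>()));
    assert_eq!(
        layout_of_dyn(&large),
        (size_of::<[u64; 4]>(), align_of::<[u64; 4]>())
    );
}
```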
95+
96+
## Closure Layout
97+
98+
Closures have no layout guarantees.
99+
100+
## Representations
101+
102+
All user-defined composite types (`struct`s, `enum`s, and `union`s) have a
103+
*representation* that specifies what the layout is for the type.
104+
105+
> Note: The representation does not depend upon the type's fields or generic
106+
> parameters.
107+
108+
The possible representations for a type are the default representation, `C`, the
109+
primitive representations, and `packed`. Multiple representations can be applied
110+
to a single type.
111+
112+
The representation of a type can be changed by applying the [`repr` attribute]
113+
to it. The following example shows a struct with a `C` representation.
114+
115+
```
116+
#[repr(C)]
117+
struct ThreeInts {
118+
    first: i16,
119+
    second: i8,
120+
    third: i32
121+
}
122+
```
123+
124+
The representation of a type does not change the layout of its fields. For
125+
example, a struct with a `C` representation that contains a struct `Inner` with
126+
the default representation will not change the layout of `Inner`.
127+
128+
### The Default Representation
129+
130+
Nominal types without a `repr` attribute have the default representation.
131+
Informally, this representation is also called the `rust` representation.
132+
133+
There are no guarantees of data layout made by this representation.
134+
135+
### The `C` Representation
136+
137+
The `C` representation is designed for creating types that are interoperable with
138+
the C language and soundly performing operations that rely on data layout such
139+
as reinterpreting values as a different type.
140+
141+
This representation can be applied to structs, unions, and enums.
142+
143+
#### \#[repr(C)] Structs
144+
145+
The alignment of the struct is the alignment of the most-aligned field in it.
146+
147+
The size and offset of fields are determined by the following algorithm.
148+
149+
Start with a current offset of 0 bytes.
150+
151+
For each field in declaration order in the struct, first determine the size and
152+
alignment of the field. If the current offset is not a multiple of the field's
153+
alignment, then add padding bytes increasing the current offset until the
154+
current offset is a multiple of the field's alignment. The offset for the field
155+
is what the current offset is now. Then increase the current offset by the size
156+
of the field.
157+
158+
Finally, the size of the struct is the current offset rounded up to the nearest
159+
multiple of the struct's alignment.
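
The algorithm above can be sketched as a function over `(size, alignment)` pairs given in declaration order. The names `repr_c_layout` and `round_up` are hypothetical, and the function is an illustration of the described steps rather than the compiler's implementation. The result is compared against the `ThreeInts` struct from earlier in this chapter.

```rust
use std::mem::{align_of, size_of};

/// Rounds `value` up to the nearest multiple of `align` (which must be >= 1).
fn round_up(value: usize, align: usize) -> usize {
    (value + align - 1) / align * align
}

/// Hypothetical sketch of the algorithm: returns (field offsets, struct size,
/// struct alignment) for fields given as (size, alignment) pairs.
fn repr_c_layout(fields: &[(usize, usize)]) -> (Vec<usize>, usize, usize) {
    // The struct's alignment is that of its most-aligned field.
    let align = fields.iter().map(|&(_, a)| a).max().unwrap_or(1);
    let mut offset = 0;
    let mut offsets = Vec::with_capacity(fields.len());
    for &(size, field_align) in fields {
        // Add padding until the offset is a multiple of the field's alignment.
        offset = round_up(offset, field_align);
        offsets.push(offset);
        offset += size;
    }
    // The size is the final offset rounded up to the struct's alignment.
    (offsets, round_up(offset, align), align)
}

#[repr(C)]
struct ThreeInts {
    first: i16,
    second: i8,
    third: i32,
}

fn main() {
    let fields = [
        (size_of::<i16>(), align_of::<i16>()),
        (size_of::<i8>(), align_of::<i8>()),
        (size_of::<i32>(), align_of::<i32>()),
    ];
    let (offsets, size, align) = repr_c_layout(&fields);
    assert_eq!(size, size_of::<ThreeInts>());
    assert_eq!(align, align_of::<ThreeInts>());
    println!("offsets = {:?}, size = {}, align = {}", offsets, size, align);
}
```

On a target where `i32` is aligned to 4 bytes, one padding byte is inserted after `second` so that `third` starts at offset 4.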
160+
161+
> Note: You can have zero-sized structs from this algorithm. This differs from
162+
> C where structs without data still have a size of one byte.
163+
164+
#### \#[repr(C)] Unions
165+
166+
A union declared with `#[repr(C)]` will have the same size and alignment as an
167+
equivalent C union declaration in the C language for the target platform.
168+
Usually, the size of a union is the maximum of the sizes of all of its fields,
169+
and its alignment is the maximum of the alignments of all of its fields.
170+
These maximums may come from different fields.
171+
172+
```
173+
#[repr(C)]
174+
union Union {
175+
    f1: u16,
176+
    f2: [u8; 4],
177+
}
178+
179+
assert_eq!(std::mem::size_of::<Union>(), 4); // From f2
180+
assert_eq!(std::mem::align_of::<Union>(), 2); // From f1
181+
```
182+
183+
#### \#[repr(C)] Enums
184+
185+
For [C-like enumerations], the `C` representation has the size and alignment of
186+
the default `enum` size and alignment for the target platform's C ABI.
187+
188+
> Note: The enum representation in C is implementation defined, so this is
189+
> really a "best guess". In particular, this may be incorrect when the C code
190+
> of interest is compiled with certain flags.
191+
192+
> Warning: There are crucial differences between an `enum` in the C language and
193+
> Rust's C-like enumerations with this representation. An `enum` in C is
194+
> mostly a `typedef` plus some named constants; in other words, an object of an
195+
> `enum` type can hold any integer value. For example, this is often used for
196+
> bitflags in C. In contrast, Rust’s C-like enumerations can only legally hold
197+
> the discriminant values; everything else is undefined behaviour. Therefore,
198+
> using a C-like enumeration in FFI to model a C `enum` is often wrong.
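
One way to stay within the rule in the warning is to receive the value from C as a plain integer and convert it with an explicit check. The following is a sketch under that assumption; `Color` and `color_from_raw` are hypothetical names, and `c_int` is assumed to be the integer type the C side uses.

```rust
use std::os::raw::c_int;

#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

/// Hypothetical helper: converts a raw integer received from C into a
/// `Color`, rejecting any value that is not a declared discriminant.
fn color_from_raw(raw: c_int) -> Option<Color> {
    match raw {
        0 => Some(Color::Red),
        1 => Some(Color::Green),
        2 => Some(Color::Blue),
        _ => None,
    }
}

fn main() {
    assert_eq!(color_from_raw(1), Some(Color::Green));
    // A bitflag-style value such as 3 never becomes a `Color`.
    assert_eq!(color_from_raw(3), None);
}
```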
199+
200+
It is an error for [zero-variant enumerations] to have the `C` representation.
201+
202+
For all other enumerations, the layout is unspecified.
203+
204+
### Primitive representations
205+
206+
The *primitive representations* are the representations with the same names as
207+
the primitive integer types. That is: `u8`, `u16`, `u32`, `u64`, `usize`, `i8`,
208+
`i16`, `i32`, `i64`, and `isize`.
209+
210+
Primitive representations can only be applied to enumerations.
211+
212+
For [C-like enumerations], they set the size and alignment to be the same as the
213+
primitive type of the same name. For example, a C-like enumeration with a `u8`
214+
representation can only have discriminants between 0 and 255 inclusive.
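
A short sketch of primitive representations on C-like enumerations; the enum names are hypothetical.

```rust
use std::mem::{align_of, size_of};

#[repr(u8)]
enum Small {
    A = 0,
    B = 255,
}

#[repr(i32)]
enum Wide {
    Neg = -1,
    Pos = 1,
}

fn main() {
    // Size and alignment match the primitive type of the same name.
    assert_eq!(size_of::<Small>(), size_of::<u8>());
    assert_eq!(align_of::<Small>(), align_of::<u8>());
    assert_eq!(size_of::<Wide>(), size_of::<i32>());
    assert_eq!(align_of::<Wide>(), align_of::<i32>());

    assert_eq!(Small::A as u8, 0);
    assert_eq!(Small::B as u8, 255);
    assert_eq!(Wide::Neg as i32, -1);
    assert_eq!(Wide::Pos as i32, 1);
}
```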
215+
216+
It is an error for [zero-variant enumerations] to have a primitive
217+
representation.
218+
219+
For all other enumerations, the layout is unspecified.
220+
221+
### The `packed` Representation
222+
223+
The `packed` representation can only be used on `struct`s and `union`s.
224+
225+
It modifies the representation (either the default or `C`) by removing any
226+
padding bytes and forcing the alignment of the type to `1`.
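
The effect of `packed` can be seen by comparing two otherwise identical structs. This is a minimal sketch with hypothetical type names; fields are copied out by value instead of being borrowed, to avoid creating an unaligned reference.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
struct Padded {
    tag: u8,
    value: u32,
}

#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32,
}

fn main() {
    // No padding: the size is the sum of the field sizes.
    assert_eq!(size_of::<Packed>(), size_of::<u8>() + size_of::<u32>());
    assert_eq!(align_of::<Packed>(), 1);
    assert!(size_of::<Padded>() >= size_of::<Packed>());

    let p = Packed { tag: 1, value: 0xDEAD_BEEF };
    // Copy the fields out by value rather than taking references to them.
    let tag = p.tag;
    let value = p.value;
    assert_eq!(tag, 1);
    assert_eq!(value, 0xDEAD_BEEF);
}
```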
227+
228+
> Warning: Dereferencing an unaligned pointer is [undefined behavior] and it is
229+
> possible to [safely create unaligned pointers to `packed` fields][27060].
230+
> Like all ways to create undefined behavior in safe Rust, this is a bug.
231+
232+
[`align_of_val`]: ../std/mem/fn.align_of_val.html
233+
[`size_of_val`]: ../std/mem/fn.size_of_val.html
234+
[`align_of`]: ../std/mem/fn.align_of.html
235+
[`size_of`]: ../std/mem/fn.size_of.html
236+
[`Sized`]: ../std/marker/trait.Sized.html
237+
[dynamically sized types]: dynamically-sized-types.html
238+
[C-like enumerations]: items/enumerations.html#c-like-enumerations
239+
[zero-variant enumerations]: items/enumerations.html#zero-variant-enumerations
240+
[undefined behavior]: behavior-considered-undefined.html
241+
[27060]: https://github.com/rust-lang/rust/issues/27060

src/types.md

+2-2
@@ -146,8 +146,8 @@ let slice: &[i32] = &boxed_array[..];
146146
All elements of arrays and slices are always initialized, and access to an
147147
array or slice is always bounds-checked in safe methods and operators.
148148

149-
The [`Vec<T>`] standard library type provides a heap allocated resizable array
150-
type.
149+
> Note: The [`Vec<T>`] standard library type provides a heap allocated resizable
150+
> array type.
151151
152152
[dynamically sized type]: dynamically-sized-types.html
153153
[`Vec<T>`]: ../std/vec/struct.Vec.html
