Commit 35d7db1

Gankro's suggestions plus more.
1 parent f6c95a8 commit 35d7db1

3 files changed

+125
-38
lines changed

src/glossary.md

+16-4
Original file line number | Diff line number | Diff line change
@@ -7,8 +7,9 @@ the structure of the program when the compiler is compiling it.
77

88
### Alignment
99

10-
The *alignment* of a value specifies what addresses are valid to store the value
11-
at.
10+
The alignment of a value specifies what addresses values are preferred to
11+
start at. It is always a power of two. References to a value must be aligned.
12+
[More][alignment].
1213

1314
### Arity
1415

@@ -76,8 +77,18 @@ imported into every module of every crate. The traits in the prelude are pervasiv
7677

7778
### Size
7879

79-
The *size* of a value is the offset in bytes between successive elements in an
80-
array with that item type including alignment padding.
80+
The size of a value has two definitions.
81+
82+
The first is that it is how much memory must be allocated to store that value.
83+
84+
The second is that it is the offset in bytes between successive elements in an
85+
array with that item type.
86+
87+
It is a multiple of the alignment, including zero. The size can change
88+
depending on compiler version (as new optimizations are made) and target
89+
platform (as `usize` varies).
90+
91+
[More][alignment].
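The two definitions of size and the "multiple of the alignment" rule can be checked directly with `std::mem::size_of` and `std::mem::align_of`. The following is a minimal sketch, not part of the commit: `Sample`, `array_stride`, and `size_is_multiple_of_alignment` are hypothetical names introduced only for illustration, and `#[repr(C)]` is used so that the layout of `Sample` is predictable.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical example type, used only for illustration.
#[repr(C)]
struct Sample {
    a: u8,
    b: u32,
}

/// Returns the distance in bytes between element 0 and element 1 of an
/// array of `T`, which is the second definition of size in the glossary.
fn array_stride<T>(pair: &[T; 2]) -> usize {
    let first = &pair[0] as *const T as usize;
    let second = &pair[1] as *const T as usize;
    second - first
}

/// Checks that the alignment of `T` is a power of two and that the size of
/// `T` is a multiple of that alignment (zero counts as a multiple).
fn size_is_multiple_of_alignment<T>() -> bool {
    align_of::<T>().is_power_of_two() && size_of::<T>() % align_of::<T>() == 0
}
```

Calling `array_stride` on a `[Sample; 2]` should give the same number as `size_of::<Sample>()`, and `size_is_multiple_of_alignment::<()>()` should hold even though the size of `()` is zero.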
8192

8293
### Slice
8394

@@ -114,6 +125,7 @@ It allows a type to make certain promises about its behavior.
114125

115126
Generic functions and generic structs can use traits to constrain, or bound, the types they accept.
116127

128+
[alignment]: type-layout.html#size-and-alignment
117129
[enums]: items/enumerations.html
118130
[structs]: items/structs.html
119131
[unions]: items/unions.html

src/items/enumerations.md

+41-12
@@ -25,9 +25,9 @@
2525
> _EnumItemDiscriminant_ :
2626
>    `=` [_Expression_]
2727
28-
An _enumeration_ is a simultaneous definition of a nominal [enumerated type] as
29-
well as a set of *constructors*, that can be used to create or pattern-match
30-
values of the corresponding enumerated type.
28+
An *enumeration*, also referred to as an *enum*, is a simultaneous definition of a
29+
nominal [enumerated type] as well as a set of *constructors*, that can be used
30+
to create or pattern-match values of the corresponding enumerated type.
3131

3232
Enumerations are declared with the keyword `enum`.
3333

@@ -43,7 +43,7 @@ let mut a: Animal = Animal::Dog;
4343
a = Animal::Cat;
4444
```
4545

46-
Enumeration constructors can have either named or unnamed fields:
46+
Enum constructors can have either named or unnamed fields:
4747

4848
```rust
4949
enum Animal {
@@ -58,8 +58,8 @@ a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };
5858
In this example, `Cat` is a _struct-like enum variant_, whereas `Dog` is simply
5959
called an enum variant. Each enum instance has a _discriminant_ which is an
6060
integer associated to it that is used to determine which variant it holds. An
61-
opaque reference to this variant can be obtained with the [`mem::discriminant`]
62-
function.
61+
opaque reference to this discriminant can be obtained with the
62+
[`mem::discriminant`] function.
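As a rough illustration of how `mem::discriminant` can be used, the sketch below compares two values by variant only. It is not part of the commit: the `same_variant` helper is a hypothetical name, and the field types of `Dog` are assumed for the example.

```rust
use std::mem;

// The fields of `Dog` are an assumption made for this sketch.
enum Animal {
    Dog(String, f64),
    Cat { name: String, weight: f64 },
}

/// Returns `true` when both values hold the same variant, regardless of
/// what the fields of that variant contain.
fn same_variant(a: &Animal, b: &Animal) -> bool {
    mem::discriminant(a) == mem::discriminant(b)
}
```

Two `Animal::Cat` values with different names and weights compare as the same variant, while an `Animal::Cat` and an `Animal::Dog` do not.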
6363

6464
## Custom Discriminant Values for Field-Less Enumerations
6565

@@ -81,21 +81,50 @@ enum Foo {
8181
}
8282

8383
let baz_discriminant = Foo::Baz as u32;
84-
assert_eq!(baz_discriminant, 123u32);
84+
assert_eq!(baz_discriminant, 123);
8585
```
8686

8787
Under the [default representation], the specified discriminant is interpreted as
8888
an `isize` value although the compiler is allowed to use a smaller type in the
8989
actual memory layout. The size and thus acceptable values can be changed by
9090
using a [primitive representation] or the [`C` representation].
9191

92-
It is an error when either two variants share the same discriminant or for an
93-
unspecified discriminant, the previous discriminant is the maximum value for the
94-
size of the discriminant. <!-- Need examples here. -->
92+
It is an error when two variants share the same discriminant.
9593

96-
## Zero-variant Enumerations
94+
```rust,ignore
95+
enum SharedDiscriminantError {
96+
SharedA = 1,
97+
SharedB = 1
98+
}
99+
100+
enum SharedDiscriminantError2 {
101+
Zero, // 0
102+
One, // 1
103+
OneToo = 1 // 1 (collision with previous!)
104+
}
105+
```
106+
107+
It is also an error to have an unspecified discriminant where the previous
108+
discriminant is the maximum value for the size of the discriminant.
109+
110+
```rust,ignore
111+
#[repr(u8)]
112+
enum OverflowingDiscriminantError {
113+
Max = 255,
114+
MaxPlusOne // Would be 256, but that overflows the enum.
115+
}
116+
117+
#[repr(u8)]
118+
enum OverflowingDiscriminantError2 {
119+
MaxMinusOne = 254, // 254
120+
Max, // 255
121+
MaxPlusOne // Would be 256, but that overflows the enum.
122+
}
123+
```
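For contrast with the rejected enums above, here is a minimal sketch of a field-less enum that is accepted. It is not part of the commit; `Level` and `discriminant_of` are hypothetical names. An unspecified discriminant is one more than the previous one, and `Max = 255` is allowed under `#[repr(u8)]` as long as no variant without an explicit value follows it.

```rust
#[repr(u8)]
#[derive(Clone, Copy)]
enum Level {
    Low = 10,
    Medium,     // implicitly 11, one more than `Low`
    High = 200,
    Max = 255,  // the largest value that fits in a `u8`
}

/// Casts a variant to its discriminant value.
fn discriminant_of(level: Level) -> u8 {
    level as u8
}
```

With the `u8` primitive representation, `std::mem::size_of::<Level>()` is 1.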
124+
125+
## Zero-variant Enums
97126

98-
Enums with zero variants are known as *zero-variant enumerations*. As they have
127+
Enums with zero variants are known as *zero-variant enums*. As they have
99128
no valid values, they cannot be instantiated.
100129

101130
```rust

src/type-layout.md

+68-22
@@ -1,12 +1,11 @@
11
# Type Layout
22

3-
The layout of a type is the way the size, alignment, and the offsets of any
4-
fields and discriminants for the values of that type.
3+
The layout of a type is its size, alignment, and the relative offsets of its
4+
fields. For enums, how the discriminant is laid out and interpreted is also part
5+
of type layout.
56

6-
While specific releases of the compiler will have the same layout for types,
7-
there is a lot of room for new versions of the compiler to do different things.
8-
Instead of trying to document exactly what is done, we only document what is
9-
guaranteed today.
7+
Type layout can be changed with each compilation. Instead of trying to document
8+
exactly what is done, we only document what is guaranteed today.
109

1110
## Size and Alignment
1211

@@ -37,7 +36,6 @@ The size of most primitives is given in this table.
3736

3837
Type | `size_of::\<Type>()`
3938
- | -
40-
bool | 1
4139
u8 | 1
4240
u16 | 2
4341
u32 | 4
@@ -55,7 +53,7 @@ target platform. For example, on a 32 bit target, this is 4 bytes and on a 64
5553
bit target, this is 8 bytes.
5654

5755
Most primitives are generally aligned to their size, although this is
58-
platform-specific behavior. In particular, on x86 u64 and f64 may be only
56+
platform-specific behavior. In particular, on x86 u64 and f64 are only
5957
aligned to 32 bits.
6058

6159
## Pointers and References Layout
@@ -82,6 +80,9 @@ has a size of `size_of::<T>() * n` and the same alignment of `T`.
8280

8381
Slices have the same layout as the section of the array they slice.
8482

83+
> Note: This is about the raw `[T]` type, not pointers (`&[T]`, `Box<[T]>`,
84+
> etc.) to slices.
85+
8586
## Tuple Layout
8687

8788
Tuples do not have any guarantees about their layout.
@@ -93,6 +94,9 @@ zero-sized type to have a size of 0 and an alignment of 1.
9394

9495
Trait objects have the same layout as the value the trait object is of.
9596

97+
> Note: This is about the raw trait object types, not pointers (`&Trait`,
98+
> `Box<Trait>`, etc.) to trait objects.
99+
96100
## Closure Layout
97101

98102
Closures have no layout guarantees.
@@ -102,9 +106,6 @@ Closures have no layout guarantees.
102106
All user-defined composite types (`struct`s, `enum`s, and `union`s) have a
103107
*representation* that specifies what the layout is for the type.
104108

105-
> Note: The representation does not depend upon the type's fields or generic
106-
> parameters.
107-
108109
The possible representations for a type are the default representation, `C`, the
109110
primitive representations, and `packed`. Multiple representations can be applied
110111
to a single type.
@@ -121,6 +122,11 @@ struct ThreeInts {
121122
}
122123
```
123124

125+
> Note: As a consequence of the representation being an attribute on the item,
126+
> the representation does not depend on generic parameters. Any two types with
127+
> the same name have the same representation. For example, `Foo<Bar>` and
128+
> `Foo<Baz>` both have the same representation.
129+
124130
The representation of a type does not change the layout of its fields. For
125131
example, a struct with a `C` representation that contains a struct `Inner` with
126132
the default representation will not change the layout of Inner.
@@ -134,39 +140,63 @@ There are no guarantees of data layout made by this representation.
134140

135141
### The `C` Representation
136142

137-
The `C` representation is designed for creating types that are interoptable with
138-
the C Language and soundly performing operations that rely on data layout such
139-
as reinterpreting values as a different type.
143+
The `C` representation is designed for dual purposes. One purpose is for
144+
creating types that are interoperable with the C Language. The second purpose is
145+
to create types on which you can soundly perform operations that rely on data
146+
layout such as reinterpreting values as a different type.
147+
148+
Because of this dual purpose, it is possible to create types that are not useful
149+
for interfacing with the C programming language.
140150

141151
This representation can be applied to structs, unions, and enums.
142152

143153
#### \#[repr(C)] Structs
144154

145155
The alignment of the struct is the alignment of the most-aligned field in it.
146156

147-
The size and offset of fields is determine by the following algorithm.
157+
The size and offset of fields is determined by the following algorithm.
148158

149159
Start with a current offset of 0 bytes.
150160

151161
For each field in declaration order in the struct, first determine the size and
152162
alignment of the field. If the current offset is not a multiple of the field's
153-
alignment, then add padding bytes increasing the current offset until the
154-
current offset is a multiple of the field's alignment. The offset for the field
155-
is what the current offset is now. Then increase the current offset by the size
156-
of the field.
163+
alignment, then add padding bytes to the current offset until it is a multiple
164+
of the field's alignment. The offset for the field is what the current offset
165+
is now. Then increase the current offset by the size of the field.
157166

158167
Finally, the size of the struct is the current offset rounded up to the nearest
159168
multiple of the struct's alignment.
160169

170+
Here is this algorithm described in pseudocode.
171+
172+
```rust,ignore
173+
struct.alignment = struct.fields().map(|field| field.alignment).max();
174+
175+
let mut current_offset = 0;
176+
177+
for field in struct.fields_in_declaration_order() {
178+
// Increase the current offset so that it's a multiple of the alignment
179+
// of this field. For the first field, this will always be zero.
180+
// The skipped bytes are called padding bytes.
181+
current_offset += (field.alignment - current_offset % field.alignment) % field.alignment;
182+
183+
struct[field].offset = current_offset;
184+
185+
current_offset += field.size;
186+
}
187+
188+
struct.size = current_offset + (struct.alignment - current_offset % struct.alignment) % struct.alignment;
189+
```
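The algorithm can also be written as compilable Rust and compared against what the compiler produces for a real `#[repr(C)]` struct. This is a sketch under stated assumptions rather than the compiler's implementation: `FieldLayout`, `padding_needed_for`, `repr_c_layout`, `Real`, and `real_offsets` are hypothetical names introduced here, and a struct with no fields is assumed to have an alignment of 1.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper type: the size and alignment of one field, in bytes.
#[derive(Clone, Copy)]
struct FieldLayout {
    size: usize,
    alignment: usize,
}

impl FieldLayout {
    fn of<T>() -> Self {
        FieldLayout { size: size_of::<T>(), alignment: align_of::<T>() }
    }
}

/// Number of padding bytes needed to round `offset` up to a multiple of
/// `alignment`.
fn padding_needed_for(offset: usize, alignment: usize) -> usize {
    let misalignment = offset % alignment;
    if misalignment > 0 { alignment - misalignment } else { 0 }
}

/// Returns (field offsets, struct size, struct alignment) for fields given
/// in declaration order.
fn repr_c_layout(fields: &[FieldLayout]) -> (Vec<usize>, usize, usize) {
    // Assumption: a struct without fields has an alignment of 1.
    let alignment = fields.iter().map(|f| f.alignment).max().unwrap_or(1);
    let mut current_offset = 0;
    let mut offsets = Vec::with_capacity(fields.len());
    for field in fields {
        // Skip padding bytes so the field starts at a multiple of its alignment.
        current_offset += padding_needed_for(current_offset, field.alignment);
        offsets.push(current_offset);
        current_offset += field.size;
    }
    // Round the total up to a multiple of the struct's alignment.
    let size = current_offset + padding_needed_for(current_offset, alignment);
    (offsets, size, alignment)
}

/// A real struct to compare the computed layout against.
#[repr(C)]
struct Real {
    a: u8,
    b: u32,
    c: u16,
}

/// Measures the actual field offsets of `Real` with pointer arithmetic.
fn real_offsets() -> Vec<usize> {
    let value = Real { a: 0, b: 0, c: 0 };
    let base = &value as *const Real as usize;
    vec![
        &value.a as *const u8 as usize - base,
        &value.b as *const u32 as usize - base,
        &value.c as *const u16 as usize - base,
    ]
}
```

Passing `FieldLayout::of::<u8>()`, `FieldLayout::of::<u32>()`, and `FieldLayout::of::<u16>()` to `repr_c_layout` should reproduce the offsets, size, and alignment the compiler chooses for `Real` on the same target.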
190+
161191
> Note: You can have zero-sized structs from this algorithm. This differs from
162192
> C where structs without data still have a size of one byte.
163193
164194
#### \#[repr(C)] Unions
165195

166196
A union declared with `#[repr(C)]` will have the same size and alignment as an
167197
equivalent C union declaration in the C language for the target platform.
168-
Usually, a union would have the maximum size of the maximum size of all of its
169-
fields, and the maximum alignment of the maximum alignment of all of its fields.
198+
The union will have a size of the maximum size of all of its fields rounded up to
199+
its alignment, and an alignment of the maximum alignment of all of its fields.
170200
These maximums may come from different fields.
171201

172202
```
@@ -178,6 +208,17 @@ union Union {
178208
179209
assert_eq!(std::mem::size_of::<Union>(), 4); // From f2
180210
assert_eq!(std::mem::align_of::<Union>(), 2); // From f1
211+
212+
#[repr(C)]
213+
union SizeRoundedUp {
214+
a: u32,
215+
b: [u16; 3],
216+
}
217+
218+
assert_eq!(std::mem::size_of::<SizeRoundedUp>(), 8); // Size of 6 from b,
219+
// rounded up to 8 from
220+
// alignment of a.
221+
assert_eq!(std::mem::align_of::<SizeRoundedUp>(), 4); // From a
181222
```
182223

183224
#### \#[repr(C)] Enums
@@ -201,6 +242,9 @@ It is an error for [zero-variant enumerations] to have the `C` representation.
201242

202243
For all other enumerations, the layout is unspecified.
203244

245+
Likewise, if the `C` representation is combined with a primitive representation, the
246+
layout is unspecified.
247+
204248
### Primitive representations
205249

206250
The *primitive representations* are the representations with the same names as
@@ -218,14 +262,16 @@ representation.
218262

219263
For all other enumerations, the layout is unspecified.
220264

265+
Likewise, the layout when two primitive representations are combined is unspecified.
266+
221267
### The `packed` Representation
222268

223269
The `packed` representation can only be used on `struct`s and `union`s.
224270

225271
It modifies the representation (either the default or `C`) by removing any
226272
padding bytes and forcing the alignment of the type to `1`.
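The effect of `packed` on size and alignment can be observed by declaring the same fields twice. The sketch below is illustrative only: `Padded`, `Packed`, and the helper functions are hypothetical names, and the padded numbers in the comments assume a target where `u32` is aligned to 4 bytes.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
struct Padded {
    a: u8,
    b: u32,
}

#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u32,
}

/// (size, alignment) of the padded struct; typically (8, 4).
fn padded_layout() -> (usize, usize) {
    (size_of::<Padded>(), align_of::<Padded>())
}

/// (size, alignment) of the packed struct; (5, 1) because padding is
/// removed and the alignment is forced to 1.
fn packed_layout() -> (usize, usize) {
    (size_of::<Packed>(), align_of::<Packed>())
}

/// Copies the field out by value, which does not create a reference to the
/// possibly unaligned field.
fn read_b(p: &Packed) -> u32 {
    p.b
}
```

Reading a packed field by value, as `read_b` does, avoids holding a reference to a field that may not be aligned.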
227273

228-
> Warning: Dereferencing an unaligned pointer is [undefined behaviour] and is
274+
> Warning: Dereferencing an unaligned pointer is [undefined behaviour] and it is
229275
> possible to [safely create unaligned pointers to `packed` fields][27060].
230276
> Like all ways to create undefined behavior in safe Rust, this is a bug.
231277
