Commit 0ddb68d

committed
explain the MIR const vs TY const situation
1 parent 6b347d2 commit 0ddb68d

File tree

2 files changed

+64
-45
lines changed

src/const-eval.md

+2-44
@@ -41,51 +41,9 @@ and a [`GlobalId`]. The `GlobalId` is made up of an `Instance` referring to a co
4141
or static or of an `Instance` of a function and an index into the function's `Promoted` table.
4242

4343
Constant evaluation returns an [`EvalToValTreeResult`] for type system constants or
44-
[`EvalToConstValueResult`] with either the error, or a representation of the constant.
45-
46-
Constants for the type system are encoded in "valtree representation". The `ValTree` datastructure
47-
allows us to represent
48-
49-
* arrays,
50-
* many structs,
51-
* tuples,
52-
* enums and,
53-
* most primitives.
54-
55-
The basic rule for
56-
being permitted in the type system is that every value must be uniquely represented. In other
57-
words: a specific value must only be representable in one specific way. For example: there is only
58-
one way to represent an array of two integers as a `ValTree`:
59-
`ValTree::Branch(&[ValTree::Leaf(first_int), ValTree::Leaf(second_int)])`.
60-
Even though theoretically a `[u32; 2]` could be encoded in a `u64` and thus just be a
61-
`ValTree::Leaf(bits_of_two_u32)`, that is not a legal construction of `ValTree`
62-
(and is very complex to do, so it is unlikely anyone is tempted to do so).
63-
64-
These rules also mean that some values are not representable. There can be no `union`s in type
65-
level constants, as it is not clear how they should be represented, because their active variant
66-
is unknown. Similarly there is no way to represent raw pointers, as addresses are unknown at
67-
compile-time and thus we cannot make any assumptions about them. References on the other hand
68-
*can* be represented, as equality for references is defined as equality on their value, so we
69-
ignore their address and just look at the backing value. We must make sure that the pointer values
70-
of the references are not observable at compile time. We thus encode `&42` exactly like `42`.
71-
Any conversion from
72-
valtree back to codegen constants must reintroduce an actual indirection. At codegen time the
73-
addresses may be deduplicated between multiple uses or not, entirely depending on arbitrary
74-
optimization choices.
75-
76-
As a consequence, all decoding of `ValTree` must happen by matching on the type first and making
77-
decisions depending on that. The value itself gives no useful information without the type that
78-
belongs to it.
79-
80-
Other constants get represented as [`ConstValue::Scalar`] or
81-
[`ConstValue::Slice`] if possible. These values are only useful outside the
82-
compile-time interpreter. If you need the value of a constant during
83-
interpretation, you need to directly work with [`const_to_op`].
44+
[`EvalToConstValueResult`] with either the error, or a representation of the evaluated constant:
45+
a [valtree](mir/index.md#valtrees) or a [MIR constant value](mir/index.md#mir-constant-values), respectively.
8446

8547
[`GlobalId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/struct.GlobalId.html
86-
[`ConstValue::Scalar`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/value/enum.ConstValue.html#variant.Scalar
87-
[`ConstValue::Slice`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/value/enum.ConstValue.html#variant.Slice
88-
[`ConstValue::ByRef`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/value/enum.ConstValue.html#variant.ByRef
8948
[`EvalToConstValueResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/error/type.EvalToConstValueResult.html
9049
[`EvalToValTreeResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/error/type.EvalToValTreeResult.html
91-
[`const_to_op`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_const_eval/interpret/struct.InterpCx.html#method.const_to_op

src/mir/index.md

+62-1
@@ -255,7 +255,66 @@ but [you can read about those below](#promoted)).
255255

256256
## Representing constants
257257

258-
*to be written*
258+
When code has reached the MIR stage, constants can generally come in two forms: *MIR constants* ([`mir::Constant`]) and *type system constants* ([`ty::Const`]).
259+
MIR constants are used as operands: in `x + CONST`, `CONST` is a MIR constant; similarly, in `x + 2`, `2` is a MIR constant.
260+
Type system constants are used in the type system, in particular for array lengths but also for const generics.
261+
262+
Generally, both kinds of constants can be "unevaluated" or "already evaluated".
263+
An unevaluated constant simply stores the `DefId` of what needs to be evaluated to compute this result.
264+
An evaluated constant (a "value") has already been computed; its representation differs between type system constants and MIR constants:
265+
MIR constants evaluate to a `mir::ConstValue`; type system constants evaluate to a `ty::ValTree`.
266+
267+
Type system constants have some more variants to support const generics: they can refer to local const generic parameters, and they are subject to inference.
268+
Furthermore, the `mir::Constant::Ty` variant lets us use an arbitrary type system constant as a MIR constant; this happens whenever a const generic parameter is used as an operand.
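
The distinction above can be seen in ordinary user code. The following is a minimal sketch, not taken from the compiler: the names `OFFSET`, `add_offset`, and `sum_plus_len` are made up for illustration, and the comments only mark which kind of constant each position corresponds to according to the description above.

```rust
// A named constant; where it is used as an operand it is a MIR constant.
const OFFSET: u32 = 10;

fn add_offset(x: u32) -> u32 {
    // Both `OFFSET` and the literal `2` appear as operands here,
    // so both are MIR constants.
    x + OFFSET + 2
}

// `N` is a const generic parameter, i.e. a type system constant.
fn sum_plus_len<const N: usize>(values: [u32; N]) -> u32 {
    let total: u32 = values.iter().sum();
    // `N` is used as an operand: this is the situation where a type
    // system constant is used as a MIR constant.
    total + N as u32
}

fn main() {
    // The `3` in `[u32; 3]` is an array length: a type system constant.
    let values: [u32; 3] = [1, 2, 3];
    println!("{}", add_offset(5));
    println!("{}", sum_plus_len(values));
}
```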
269+
270+
### MIR constant values
271+
272+
In general, a MIR constant value (`mir::ConstValue`) was computed by evaluating some constant the user wrote.
273+
This [const evaluation](../const-eval.md) produces a very low-level representation of the result in terms of individual bytes.
274+
We call this an "indirect" constant (`mir::ConstValue::Indirect`) since the value is stored in-memory.
275+
276+
However, storing everything in-memory would be awfully inefficient. Hence there
277+
are some other variants in `mir::ConstValue` that can represent certain simple
278+
and common values more efficiently. In particular, everything that can be
279+
directly written as a literal in Rust (integers, floats, chars, bools, but also
280+
`"string literals"` and `b"byte string literals"`) has an optimized variant that
281+
avoids the full overhead of the in-memory representation.
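
To make the trade-off concrete, here is a toy model of the idea. It is only a sketch: `ToyConstValue` and `to_bytes` are hypothetical names, the variants are simplified stand-ins rather than the real definition of `mir::ConstValue`, and the little-endian byte order is an assumption of the example.

```rust
/// Toy stand-in for a MIR constant value. This is NOT rustc's definition.
#[derive(Debug, Clone, PartialEq)]
enum ToyConstValue {
    /// Optimized form for small literal-like values (integers, bools, ...).
    Scalar { bits: u128, size: u8 },
    /// Optimized form for string and byte string literals.
    Slice(Vec<u8>),
    /// General form: the value is stored in memory as raw bytes.
    Indirect(Vec<u8>),
}

/// Lowers any variant to the raw bytes it stands for
/// (little-endian for scalars, an assumption of this sketch).
fn to_bytes(value: &ToyConstValue) -> Vec<u8> {
    match value {
        ToyConstValue::Scalar { bits, size } => {
            bits.to_le_bytes()[..*size as usize].to_vec()
        }
        ToyConstValue::Slice(bytes) | ToyConstValue::Indirect(bytes) => bytes.clone(),
    }
}

fn main() {
    // The same `42u32` in the compact form and in the in-memory form.
    let compact = ToyConstValue::Scalar { bits: 42, size: 4 };
    let in_memory = ToyConstValue::Indirect(42u32.to_le_bytes().to_vec());
    assert_eq!(to_bytes(&compact), to_bytes(&in_memory));

    let literal = ToyConstValue::Slice(b"hi".to_vec());
    assert_eq!(to_bytes(&literal), vec![b'h', b'i']);
}
```

The compact variant avoids allocating a byte buffer for a value that fits in a machine word, which is the kind of overhead the optimized variants are meant to avoid.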
282+
283+
### ValTrees
284+
285+
An evaluated type system constant is a "valtree". The `ty::ValTree` data structure
286+
allows us to represent
287+
288+
* arrays,
289+
* many structs,
290+
* tuples,
291+
* enums, and
292+
* most primitives.
293+
294+
The most important rule for
295+
this representation is that every value must be uniquely represented. In other
296+
words: a specific value must only be representable in one specific way. For example: there is only
297+
one way to represent an array of two integers as a `ValTree`:
298+
`ValTree::Branch(&[ValTree::Leaf(first_int), ValTree::Leaf(second_int)])`.
299+
Even though theoretically a `[u32; 2]` could be encoded in a `u64` and thus just be a
300+
`ValTree::Leaf(bits_of_two_u32)`, that is not a legal construction of `ValTree`
301+
(and is very complex to do, so it is unlikely anyone is tempted to do so).
302+
303+
These rules also mean that some values are not representable. There can be no `union`s in type
304+
level constants, as it is not clear how they should be represented, because their active variant
305+
is unknown. Similarly there is no way to represent raw pointers, as addresses are unknown at
306+
compile-time and thus we cannot make any assumptions about them. References on the other hand
307+
*can* be represented, as equality for references is defined as equality on their value, so we
308+
ignore their address and just look at the backing value. We must make sure that the pointer values
309+
of the references are not observable at compile time. We thus encode `&42` exactly like `42`.
310+
Any conversion from
311+
valtree back to a MIR constant value must reintroduce an actual indirection. At codegen time the
312+
addresses may be deduplicated between multiple uses or not, entirely depending on arbitrary
313+
optimization choices.
314+
315+
As a consequence, all decoding of `ValTree` must happen by matching on the type first and making
316+
decisions depending on that. The value itself gives no useful information without the type that
317+
belongs to it.
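
The type-directed decoding described above can be sketched with a toy model. Everything here is hypothetical and simplified: `ToyValTree`, `ToyType`, `ToyValue`, and `decode` are illustrative names, not rustc APIs, and only `u32`, references, and arrays are modelled.

```rust
/// Toy valtree: a leaf holds bits, a branch holds children.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyValTree {
    Leaf(u128),
    Branch(Vec<ToyValTree>),
}

/// Toy types that drive the decoding.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyType {
    U32,
    Ref(Box<ToyType>),
    Array(Box<ToyType>, usize),
}

/// Toy decoded values; `Ref` reintroduces the indirection.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyValue {
    U32(u32),
    Ref(Box<ToyValue>),
    Array(Vec<ToyValue>),
}

/// Decodes a tree by matching on the type first. Returns `None` if the
/// tree does not have the shape the type requires.
fn decode(ty: &ToyType, tree: &ToyValTree) -> Option<ToyValue> {
    match (ty, tree) {
        (ToyType::U32, ToyValTree::Leaf(bits)) if *bits <= u32::MAX as u128 => {
            Some(ToyValue::U32(*bits as u32))
        }
        // A reference is encoded exactly like its pointee, so the tree is
        // passed through unchanged and the indirection is added on the way out.
        (ToyType::Ref(inner), _) => decode(inner, tree).map(|v| ToyValue::Ref(Box::new(v))),
        (ToyType::Array(elem, len), ToyValTree::Branch(items)) if items.len() == *len => items
            .iter()
            .map(|item| decode(elem, item))
            .collect::<Option<Vec<_>>>()
            .map(ToyValue::Array),
        _ => None,
    }
}

fn main() {
    // `42` and `&42` share one tree; only the type tells them apart.
    let leaf = ToyValTree::Leaf(42);
    println!("{:?}", decode(&ToyType::U32, &leaf));
    println!("{:?}", decode(&ToyType::Ref(Box::new(ToyType::U32)), &leaf));

    // An array of two integers has exactly one encoding: a branch of two leaves.
    let pair = ToyValTree::Branch(vec![ToyValTree::Leaf(1), ToyValTree::Leaf(2)]);
    println!("{:?}", decode(&ToyType::Array(Box::new(ToyType::U32), 2), &pair));
}
```

Note that the same `Leaf(42)` decodes to two different values depending on the type, which is why the tree alone carries no useful information.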
259318

260319
<a name="promoted"></a>
261320

@@ -283,3 +342,5 @@ See the const-eval WG's [docs on promotion](https://github.com/rust-lang/const-e
283342
[`ProjectionElem::Deref`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.ProjectionElem.html#variant.Deref
284343
[`Rvalue`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Rvalue.html
285344
[`Operand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Operand.html
345+
[`mir::Constant`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Constant.html
346+
[`ty::Const`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Const.html
