Skip to content

Commit cd92cf0

Browse files
Merge #419
419: Address Multi-Core Soundness by abolishing Send and Sync r=jamesmunns a=jonas-schievink # Summary Change how we model multi-core MCUs in the following ways: * Stop using `Send` and `Sync` to model anything related to cross-core interactions since those traits are unfit for the purpose. * Declare that `Send` is not sufficient to transfer resources between cores, since memory addresses can have different meanings on different cores. * Mutexes based on critical sections become declared sound due to the above: They can only turn `Send` data `Sync`, but neither allows cross-core interactions anymore. * Punt on how to deal with cross-core communication and data sharing for now, and leave it to the ecosystem. [Rendered](https://github.com/jonas-schievink/wg/blob/multi-core-soundness/rfcs/0000-multi-core-soundness.md) [`bare-metal`]: https://github.com/rust-embedded/bare-metal Co-authored-by: Jonas Schievink <[email protected]>
2 parents 4135750 + 4940d76 commit cd92cf0

File tree

1 file changed

+186
-0
lines changed

1 file changed

+186
-0
lines changed

rfcs/0000-multi-core-soundness.md

Lines changed: 186 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,186 @@
1+
- Feature Name: `multi_core_soundness`
2+
- Start Date: (fill me in with today's date, YYYY-MM-DD)
3+
- RFC PR: (leave this empty)
4+
- Rust Issue: (leave this empty)
5+
6+
# Summary
7+
[summary]: #summary
8+
9+
Change how we model multi-core MCUs in the following ways:
10+
11+
* Stop using `Send` and `Sync` to model anything related to cross-core
12+
interactions since those traits are unfit for the purpose.
13+
* Declare that `Send` is not sufficient to transfer resources between cores,
14+
since memory addresses can have different meanings on different cores.
15+
* Mutexes based on critical sections become declared sound due to the above:
16+
They can only turn `Send` data `Sync`, but neither allows cross-core
17+
interactions anymore.
18+
* Punt on how to deal with cross-core communication and data sharing for now,
19+
and leave it to the ecosystem.
20+
21+
# Motivation
22+
[motivation]: #motivation
23+
24+
This RFC shares its motivation with [RFC 388]: The current way we model
25+
multi-core MCUs (which is just like multi-threaded Rust programs are modeled)
26+
means that [`bare_metal::Mutex`] and its reexports in `cortex_m` and other
27+
crates are unsound.
28+
29+
With the push towards a 1.0 embedded ecosystem, we need to make sure that we do
30+
not expose unsound APIs like that, but also that our usage and understanding of
31+
language semantics is in line with upstream Rust and cannot lead to soundness
32+
issues down the road.
33+
34+
Additionally, it would be great to have a good story for multi-core MCUs, since
35+
they are often difficult to develop for, while more and more vendors have
36+
multi-core products available not just for specialized applications, but as
37+
relatively general MCUs.
38+
39+
## Today's Soundness Issues
40+
[todays-issues]: #todays-soundness-issues
41+
42+
The auto trait behavior of `Send` and `Sync` causes soundness issues even today:
43+
Imagine a user-defined struct `S` that only contains some primitive types.
44+
Clearly, this struct will and should automatically implement `Send` and `Sync`.
45+
Now, because `S` implements `Sync`, `&S` will implement `Send`. This is simply
46+
the languages definition of these traits and nothing we can change. Now a big
47+
problem can arise when tranferring `&S` across core boundaries: Cores can have
48+
private memory regions that are not accessible by other cores. If `T: Send` is
49+
the only requirement for sending an object to another core, then this would not
50+
be sound. Existing multi-core support via [µAMP] attempts to work around this
51+
issue by requiring that `T: 'static`, but this approach turns out to be
52+
[unsound][soundness-1].
53+
54+
In addition to that, peripherals (either in the processor core, or ones provided
55+
by the MCU manufacturer) are generally `'static` and `Send` since being able to
56+
transfer them to an interrupt handler is an extremely common, safe operation.
57+
However, not all peripherals are available to all cores. In particular, *none*
58+
of the Cortex-M core peripherals are.
59+
60+
[This blog post about µAMP][microamp-blog] outlines other issues and how they
61+
were solved or worked around in that case. Note that µAMP has other known
62+
[soundness issues][soundness-2] in addition to the one linked above.
63+
64+
[µAMP]: https://github.com/rtfm-rs/microamp
65+
[microamp-blog]: https://blog.japaric.io/microamp/
66+
[soundness-1]: https://github.com/rtfm-rs/microamp/issues/6
67+
[soundness-2]: https://github.com/rtfm-rs/microamp#known-issues
68+
69+
## The Core Issue
70+
71+
To understand why `Send` is insufficient to transfer resource ownership across
72+
cores, observe that multi-core MCUs behave quite differently from multi-threaded
73+
programs (which are what `Send` is supposed to model): Each core can have
74+
private peripherals that may be `Send`able between interrupt handlers, but that
75+
do not exist (or, worse, something *else* exists at their address) on other
76+
cores of the system. The same goes for core-local memory regions: All normal
77+
operations are fine while they only happen on the core owning the memory
78+
(including `Send`ing some memory to an interrupt handler).
79+
80+
Multi-threading in Rust was always modeled with the assumption that all threads
81+
share the same resources, with `Send` and `Sync` only guarding *access* to those
82+
resources. It is likely that when the language semantics around these traits are
83+
more rigidly specified, this will cause a clearer mismatch with what multi-core
84+
MCUs actually need.
85+
86+
A similar problem to that of multi-core MCUs exists when writing hosted Rust
87+
applications: Inter-Process Communication (IPC). This is similar because while
88+
processes may share memory, `Send` and `Sync` are unsuitable to model any
89+
cross-process interaction since addresses and resources such as file descriptors
90+
on different processes can have completely different meanings. At the same time,
91+
it is important that `Send` and `Sync` *stay implemented* for both `File` and
92+
heap-allocated types like `Box`, since sending them across thread boundaries is
93+
still desired.
94+
95+
Something much closer to a multi-threaded program is one that makes use of
96+
interrupt handlers, but runs on a single-core MCU: In both cases, all global
97+
resources exist precisely once, and may be shared and exchanged between
98+
interrupt handlers or threads (provided they implement `Send`/`Sync`) with no
99+
risk of the resources changing their meaning when sent.
100+
101+
# Detailed design
102+
[design]: #detailed-design
103+
104+
## Document what `Send` and `Sync` mean
105+
106+
`Send` and `Sync` will be used only to model transfer and sharing of resources
107+
between different *execution contexts* that run with the same fixed set of
108+
global resources.
109+
110+
Here, an *execution context* is something that may execute code asynchronously
111+
from other code (so without needing to be called). For example, a *thread* or
112+
an *interrupt handler* would qualify as an *execution context*, while a
113+
single-threaded futures executor would *not* create any more *execution
114+
contexts* while it executes.
115+
116+
Concretely (for embedded Rust), that means `Send` and `Sync` can be used to
117+
model threads in an RTOS, or interrupt handlers for bare-metal applications.
118+
119+
## `bare_metal::Mutex` is now sound
120+
121+
Since a `Mutex` only turns `Send` data into `Sync` data, it does not allow any
122+
other cores in the system to access the protected data. Since disabling
123+
interrupts suspends all but the current *execution context*, exclusive access
124+
to the protected resource is now granted.
125+
126+
Providing a way to share data between cores fundamentally depends on the memory
127+
layout of the device and application and is left to device-specific HALs,
128+
support crates or frameworks like [µAMP].
129+
130+
[`bare-metal`]: https://github.com/rust-embedded/bare-metal
131+
132+
## The fate of Symmetric Multi-Processing (SMP) apps
133+
134+
In SMP apps, only a single executable is used for all cores, while each core can
135+
have a separate entry point (or another mechanism of identifying the running
136+
core). This means that all `static`s are shared by default.
137+
138+
Since defining a `static` only requires that its type is `Sync`, this would be
139+
unsound. For example, it would allow storing a `bare_metal::Mutex` in a `static`
140+
and access it from all cores. Therefore, this RFC foregoes the ability to write
141+
safe SMP apps in Rust, instead proposing to shift focus to AMP apps, which do
142+
not share data by default and produce a separate executable per core.
143+
144+
# How We Teach This
145+
[how-we-teach-this]: #how-we-teach-this
146+
147+
(see above)
148+
149+
# Drawbacks
150+
[drawbacks]: #drawbacks
151+
152+
* This makes writing applications for multi-core MCUs more difficult if there
153+
are no mature libraries for the target platform (that would provide APIs and
154+
traits for cross-core operations).
155+
156+
While multi-core MCUs have become somewhat more common, it is expected that
157+
the vast majority of embedded Rust users will continue to use single-core
158+
MCUs. For those users, providing a sound and safe-to-use `Mutex` that works by
159+
disabling interrupts is beneficial. Even for multi-core applications, it is
160+
expected that the actual cross-core communication is limited to a small number
161+
of places in the code, so making it more difficult has limited impact.
162+
163+
* This RFC generally rules out SMP apps that run the same firmware image on
164+
multiple cores. These would be able to share data via `static`s, which only
165+
requires a `Sync` bound, and that is not sufficient to guarantee safe
166+
operation when accessed from multiple cores.
167+
168+
# Alternatives
169+
[alternatives]: #alternatives
170+
171+
* Accept [RFC 388] instead, introducing a `SingleCore{Send,Sync}` auto trait
172+
pair once auto traits are stable, and make `Mutex::new` an `unsafe fn` in the
173+
interim. Remove unsound `Send` impls from peripherals.
174+
175+
* Do what this RFC proposes, and also introduce `CoreSend`/`CoreSync` traits to
176+
model cross-core interaction. (This was what this RFC initially proposed, but
177+
it was decided that we should focus on fixing the soundness issues first and
178+
leave multi-core support to be implemented outside the core ecosystem.)
179+
180+
# Unresolved questions
181+
[unresolved]: #unresolved-questions
182+
183+
* None so far.
184+
185+
[`bare_metal::Mutex`]: https://docs.rs/bare-metal/0.2.5/bare_metal/struct.Mutex.html
186+
[RFC 388]: https://github.com/rust-embedded/wg/pull/388

0 commit comments

Comments
 (0)