Hi! I'm not sure if this is the best place for this question, but it seems worth a shot.
I'm trying to express an ordering between two volatile writes from a single mutator, but the docs don't appear to address this and so I am wary. The corresponding C++ docs are still vague but less so.
Details: A program needs to perform two writes, each volatile -- perhaps they are memory-mapped I/O. The writes must happen (which is to say, complete) in order -- perhaps the first one turns on the physical device that the second one addresses. Is there something from `core` that I can slip into the middle of the example below to ensure this?
```rust
let ptr_a: *mut u32 = ...;
let ptr_b: *mut u32 = ...; // not equal to ptr_a

ptr_a.write_volatile(0xDEAD);
// insert appropriate barrier/fence here
ptr_b.write_volatile(0xBEEF);
```
Were I willing to be architecture-specific, I know the specific barrier instruction I'm after, and I could express it using inline asm. But it'd be lovely to use something portable. `core::sync::atomic::fence` -- probably with `Release`, since it's a write-write situation -- was the first thing I reached for, but seeing as these are not atomic accesses per se, the docs on `fence` imply that it has no effect on their ordering. (Specifically, there are no mentions of `volatile` anywhere in the atomics docs.)
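For concreteness, here is roughly what that first attempt looks like (a sketch only, with placeholder pointers standing in for the real registers; whether the `fence` actually constrains the volatile accesses is exactly what I can't tell from the docs):

```rust
use core::sync::atomic::{fence, Ordering};

/// Sketch of the `fence`-based attempt described above.
unsafe fn two_ordered_writes(ptr_a: *mut u32, ptr_b: *mut u32) {
    ptr_a.write_volatile(0xDEAD);
    // Does this order the two *volatile* writes? The `fence` docs only
    // talk about atomic accesses, so it's unclear.
    fence(Ordering::Release);
    ptr_b.write_volatile(0xBEEF);
}
```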
The C++ memory order documentation does discuss the relationship with `volatile`, but (1) I admit I don't entirely understand its single relevant sentence, and (2) the remaining sentences are trying to scare off people attempting to use `volatile` for inter-thread synchronization, which I am not. Plus, I'm not writing C++. :-)
Random people on the Internet keep asserting that `fence` is sufficient for all our memory-barrier needs, but this doesn't seem obvious to me from the docs. (I'm also more accustomed to the traditional terms around barriers than to the atomic memory ordering terms, so this may reflect my own ignorance!)
Pragmatically:

- From reading threads here and on the LLVM archives, it looks like LLVM currently preserves the relative ordering of atomic and `volatile` accesses, but I am hesitant to either rely on compiler behavior that may be subject to change, or assume that my backend is LLVM.
- A number of `Ordering`s given to `fence` currently produce the instruction I want on my particular target, but that feels fragile, particularly since my target has fewer barrier varieties than, say, PowerPC, so it might be working by accident.
More detailed context: The system I'm working on is an ARM Cortex-M7 based SoC. The M7 has a fairly complex bus interface, and can issue and retire memory accesses out of order if they issue on different bus ports (which, in practice, means that they apply to different coarse-grained sections of physical address space). The architecture-specific thing to do here is to insert a `dmb` instruction (available in the `cortex_m` crate, if you are using it, as `cortex_m::asm::dmb()`). However, the driver in question is for a generic IP block (a Synopsys Ethernet MAC) that is not inherently ARM-specific, so it'd be great to express this portably.
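For comparison, the architecture-specific version I would write today looks something like this (assuming the `cortex_m` crate; the function name is just illustrative):

```rust
use cortex_m::asm;

/// ARM-specific sketch: the DMB keeps the M7's buses from letting the
/// second write complete before the first.
unsafe fn two_ordered_writes_arm(ptr_a: *mut u32, ptr_b: *mut u32) {
    ptr_a.write_volatile(0xDEAD);
    asm::dmb(); // Data Memory Barrier
    ptr_b.write_volatile(0xBEEF);
}
```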
As you have likely inferred, the goal is to wait for the completion of the first write, not its issuance in program order, and so `compiler_fence` is not useful here.
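(That is, something like the sketch below constrains only the compiler -- it emits no barrier instruction, so as I understand it the hardware could still complete the writes out of order:)

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Not sufficient here: `compiler_fence` restricts compiler reordering
/// only and compiles to no instruction at all.
unsafe fn insufficient(ptr_a: *mut u32, ptr_b: *mut u32) {
    ptr_a.write_volatile(0xDEAD);
    compiler_fence(Ordering::Release);
    ptr_b.write_volatile(0xBEEF);
}
```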
Any insight would be greatly appreciated!