From 59c26436c7d1608b68ce99fc5ec0d02955d43960 Mon Sep 17 00:00:00 2001
From: Thomas Winwood <twwinwood@gmail.com>
Date: Sun, 16 Feb 2020 17:54:00 +0000
Subject: [PATCH 1/5] RFC: Add a new attribute, `#[isa]`

---
 text/0000-isa-attribute.md | 101 +++++++++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)
 create mode 100644 text/0000-isa-attribute.md

diff --git a/text/0000-isa-attribute.md b/text/0000-isa-attribute.md
new file mode 100644
index 00000000000..7766d116460
--- /dev/null
+++ b/text/0000-isa-attribute.md
@@ -0,0 +1,101 @@
+- Feature Name: isa_attribute
+- Start Date: 2020-02-16
+- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
+- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
+
+# Summary
+[summary]: #summary
+
+This RFC proposes a new function attribute, `#[isa]`.  The minimal initial implementation will provide `#[isa = "a32"]` and `#[isa = "t32"]` on ARM targets, corresponding respectively to disabling and enabling the LLVM feature `thumb-mode` for the annotated function.
+
+# Motivation
+[motivation]: #motivation
+
+Starting with `ARMv4T`, ARM cores support a slimmed-down instruction set called Thumb.  (Due to the introduction of AArch64, the original ARM and Thumb instruction sets are now referred to as A32 and T32, and this RFC will use this terminology from here on in.) Switching between these instruction sets ("interworking") can be done at the instruction level by clearing or setting the lowest bit of the program counter.  LLVM already knows how to insert interworking shims, but Rust lacks the necessary language-level support.  Prior to the adoption of [RFC 2045], it was possible to use the unstable `target_feature` attribute to disable or enable `thumb-mode`, but the stabilised syntax for that attribute focused on enabling opt-in features such as SIMD and vector instructions; since `thumb-mode` is an "either-or" feature, it is no longer the right tool for the job.
+
+[RFC 2045]: https://github.com/rust-lang/rfcs/blob/master/text/2045-target-feature.md
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+Some platforms have multiple different instruction sets, optimised for different requirements; for example, ARM targets have a denser but less feature-packed instruction set named T32 alongside the normal A32.  Rust supports configuring which instruction set any given function is compiled to via the `#[isa]` attribute.  For example, if on an ARM target you wish for a particular function to be compiled to T32 instructions for reduced code size, you would annotate the function like so.
+
+```rust
+#[isa = "t32"]
+fn some_function() {
+    // ...
+}
+```
+
+That's all you need to do - LLVM inserts interworking shims where necessary, so the change is completely transparent.
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+Functions are inlined across ISA boundaries as if the `#[isa]` attribute did not exist.
+
+Annotating a function with an ISA that does not exist yields a compile-time error.
+
+```rust
+#[isa = "unicorn"]
+fn some_function() {
+    // ...
+}
+```
+
+```
+error: "unicorn" is not a recognised ISA for the target `armv5te-unknown-linux-gnueabi`
+  --> src/lib.rs:1:1
+   |
+ 1 | #[isa = "unicorn"]
+   |         ^^^^^^^^^ help: valid ISAs are `a32`, `t32`
+```
+
+Annotating a function with two different ISAs at once yields a compile-time error.  (This is likely to be the result of a typo when editing code.)
+
+```rust
+#[isa = "a32"]
+#[isa = "t32"]
+fn some_function() {
+    // ...
+}
+```
+
+```
+error: a function cannot have two `#[isa]` attributes at the same time
+  --> src/lib.rs:1:1
+   |
+ 1 | #[isa = "a32"]
+   | -------------- first attribute was here
+...
+ 2 | #[isa = "t32"]
+   | ^^^^^^^^^^^^^^ help: remove one of these attributes
+```
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+Adding another attribute complicates Rust's design.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+Extending `target_feature` to allow `#[target_feature(disable = "...")]` and adding `thumb-mode` to the whitelist would support this functionality without adding another attribute; however, this is verbose, and does not fit with the `target_feature` attribute's focus on features such as AVX and SSE whose absence is not necessarily compensated by the presence of something else.
+
+Doing nothing is an option; it is possible to incorporate code using other instruction sets through other means such as external assembly.  However, this steps outside Rust's safety guarantees.
+
+# Prior art
+[prior-art]: #prior-art
+
+GCC supports opting into interworking with the `--thumb-interwork` flag - its syntactic equivalents to `#[isa = "a32"]` and `#[isa = "t32"]` are `__attribute__((target("arm")))` and `__attribute__((target("thumb")))`.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+- Are there any presently-supported architectures with a mechanism like A32/T32 which `#[isa]` could support?
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+- RISC-V allegedly supports truncated instructions in a similar fashion to T32; the `#[isa]` attribute may benefit users of that architecture in the future.
+- Should Rust gain support for the 65C816, the `#[isa]` attribute might be extended to allow shifting into its 65C02 compatibility mode and back again.

From af58a580fa7bba9d395adee26bd13887716368c2 Mon Sep 17 00:00:00 2001
From: Lokathor <zefria@gmail.com>
Date: Mon, 24 Feb 2020 00:48:21 -0700
Subject: [PATCH 2/5] Update 0000-isa-attribute.md

---
 text/0000-isa-attribute.md | 109 +++++++++++++++++++++----------------
 1 file changed, 61 insertions(+), 48 deletions(-)

diff --git a/text/0000-isa-attribute.md b/text/0000-isa-attribute.md
index 7766d116460..a0875a004d8 100644
--- a/text/0000-isa-attribute.md
+++ b/text/0000-isa-attribute.md
@@ -6,96 +6,109 @@
 # Summary
 [summary]: #summary
 
-This RFC proposes a new function attribute, `#[isa]`.  The minimal initial implementation will provide `#[isa = "a32"]` and `#[isa = "t32"]` on ARM targets, corresponding respectively to disabling and enabling the LLVM feature `thumb-mode` for the annotated function.
+This RFC proposes a new function attribute, `#[instruction_set(?)]` which allows you to declare the instruction set to be used with compiling the function. It also proposes two initial allowed values (`a32` and `t32`) for use with this attribute. Other allowed values could be added to the language later.
 
 # Motivation
 [motivation]: #motivation
 
-Starting with `ARMv4T`, ARM cores support a slimmed-down instruction set called Thumb.  (Due to the introduction of AArch64, the original ARM and Thumb instruction sets are now referred to as A32 and T32, and this RFC will use this terminology from here on in.) Switching between these instruction sets ("interworking") can be done at the instruction level by clearing or setting the lowest bit of the program counter.  LLVM already knows how to insert interworking shims, but Rust lacks the necessary language-level support.  Prior to the adoption of [RFC 2045], it was possible to use the unstable `target_feature` attribute to disable or enable `thumb-mode`, but the stabilised syntax for that attribute focused on enabling opt-in features such as SIMD and vector instructions; since `thumb-mode` is an "either-or" feature, it is no longer the right tool for the job.
+Most programmers are familiar with the idea of a CPU family having more than one instruction set. `x86_64` is backwards compatible with `x86`, and an `x86_64` CPU can run an `x86` program if necessary.
 
-[RFC 2045]: https://github.com/rust-lang/rfcs/blob/master/text/2045-target-feature.md
+Starting with `ARMv4T`, many ARM CPUs support two separate instruction sets. At the time they were called "ARM code" and "Thumb code", but with the development of `AArch64`, they're now called `a32` and `t32`. Unlike with the `x86` / `x86_64` situation, on ARM you can have a single program that intersperses both `a32` and `t32` code. A particular form of branch instruction allows for the CPU to change between the two modes any time it branches, and so generally code is designated as being either `a32` or `t32` on a per-function basis.
+
+In LLVM, selecting that code should be `a32` or `t32` is done by either disabling (for `a32`) or enabling (for `t32`) the `thumb-mode` target feature. Previously, Rust was able to do this using the `target_feature` attribute because it was able to either add _or subtract_ an LLVM target feature during a function. However, when [RFC 2045](https://github.com/rust-lang/rfcs/blob/master/text/2045-target-feature.md) was accepted, its final form did not allow for the subtraction of target features. Its final form is primarily designed around always opting _in_ to additional features, and it's no longer the correct tool for an "either A or B, but not both" situation like `a32`/`t32` is.
 
 # Guide-level explanation
 [guide-level-explanation]: #guide-level-explanation
 
-Some platforms have multiple different instruction sets, optimised for different requirements; for example, ARM targets have a denser but less feature-packed instruction set named T32 alongside the normal A32.  Rust supports configuring which instruction set any given function is compiled to via the `#[isa]` attribute.  For example, if on an ARM target you wish for a particular function to be compiled to T32 instructions for reduced code size, you would annotate the function like so.
+Some platforms support having more than one instruction set used within a single program. Generally, each one will be better for specific parts of a program. Every target has a default instruction set, based on the target triple. If you would like to set a specific function to use an alternate instruction set you use the `#[instruction_set(?)]` attribute, specifying the desired instruction set in parentheses.
+
+Currently this is only of use on ARM family CPUs, which support both the `a32` and `t32` instruction sets. Targets starting with `arm` default to `a32` and targets starting with `thumb` default to `t32`.
 
 ```rust
-#[isa = "t32"]
-fn some_function() {
-    // ...
+// this uses the default instruction set for your target
+
+fn add_one(x: i32) -> i32 {
+    x + 1
 }
-```
 
-That's all you need to do - LLVM inserts interworking shims where necessary, so the change is completely transparent.
+// This will compile as `a32` code on both `arm` and thumb` targets
 
-# Reference-level explanation
-[reference-level-explanation]: #reference-level-explanation
+#[instruction_set(a32)]
+fn add_five(x: i32) -> i32 {
+    x + 5
+}
+```
 
-Functions are inlined across ISA boundaries as if the `#[isa]` attribute did not exist.
+To ease the amount of `cfg_attr` required with this attribute, if you specify an instruction set that isn't available on the target used the attribute is simply ignored. For example, if you specify `t32` and then build the code for `x86_64` or `wasm32`, the attribute is ignored.
 
-Annotating a function with an ISA that does not exist yields a compile-time error.
+If you specify an instruction set that the compiler doesn't recognize at all then you will get an error.
 
 ```rust
-#[isa = "unicorn"]
-fn some_function() {
-    // ...
+#[instruction_set(unicorn)]
+fn this_does_not_build() -> i32 {
+    7
 }
 ```
 
-```
-error: "unicorn" is not a recognised ISA for the target `armv5te-unknown-linux-gnueabi`
-  --> src/lib.rs:1:1
-   |
- 1 | #[isa = "unicorn"]
-   |         ^^^^^^^^^ help: valid ISAs are `a32`, `t32`
-```
+The specifics of _when_ to specify a non-default instruction set on a function are platform specific. Unless a piece of platform documentation has indicated a specific requirement, you do not need to think about adding this attribute at all.
 
-Annotating a function with two different ISAs at once yields a compile-time error.  (This is likely to be the result of a typo when editing code.)
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
 
-```rust
-#[isa = "a32"]
-#[isa = "t32"]
-fn some_function() {
-    // ...
-}
-```
+Every target is now considered to have one default instruction set (for functions that lack the `instruction_set` attribute), as well as possibly supporting specific additional instruction sets:
 
-```
-error: a function cannot have two `#[isa]` attributes at the same time
-  --> src/lib.rs:1:1
-   |
- 1 | #[isa = "a32"]
-   | -------------- first attribute was here
-...
- 2 | #[isa = "t32"]
-   | ^^^^^^^^^^^^^^ help: remove one of these attributes
-```
+* Targets with `arm` arch default to the `a32` instruction set, but can also use `t32`.
+* Targets with `thumb` arch default to the `t32` instruction set, but can also use `a32`.
+* All other current targets each have only one instruction set, which is also their default instruction set.
+
+Backend support:
+* In LLVM this corresponds to enabling or disabling the `thumb-mode` target feature on a function.
+* Other future backends (eg: Cranelift) would presumably support this in some similar way. A "quick and dirty" version of `a32`/`t32` interworking can be achieved simply by simply placing all `a32` code in one translation unit, all `t32` code in another, and then telling the linker to sort it out. Currently, Cranelift does not support ARM chips _at all_, but they can easily work towards this over time.
+
+Guarantees:
+* If an alternate instruction set is designated on a function then the compiler _must_ respect that. It is not a hint, it is a guarantee.
+
+What is a Compile Error:
+* If an alternate instruction set is designated that is known to exist but not appropriate for the current arch (eg: `a32` on an `x86_64` build) then the compiler will silently ignore the attribute. This helps keep code as portable as possible, similar to the [windows_subsystem](https://github.com/rust-lang/rfcs/blob/master/text/1665-windows-subsystem.md) attribute being used on programs compiled for Linux and Mac simply being silently ignored.
+* If an alternate instruction set is designated that doesn't exist _anywhere_ (eg: "unicorn") then that is a compiler error.
+* If the attribute appears more than once on a function that is a compile error.
+* If the current backend is lacking support for compiling with the alternate instruction set, then that should trigger a compile error.
+
+Inlining:
+* For the alternate instruction sets proposed by this RFC, `a32` and `t32`, what is affected is the actual generated assembly and symbol placement of the generated function. If a function's body is inlined into the caller then the attribute no longer has a meaningful effect within the caller's body, and would be ignored.
+* This does mean that any inline `asm!` calls in alternate instruction set functions could be inlined into the wrong instruction set within the caller's body. It would be up to the programmer to specify `inline(never)` if this is a concern. However, the primary goal of this RFC is to eliminate the need for inline `asm!` in the first place.
+
+How _specifically_ does it work on ARM:
+* Within an ELF file, all `t32` code functions are stored as having odd value addresses, and when a branch-exchange (`bx`) or branch-link-exchange (`blx`) instruction is used then the target address's lowest bit is used to move the CPU between the `a32` and `t32` states appropriately.
+* Accordingly, this does _not_ count as a full new ABI of its own. Both "Rust" and "C" ABI functions and function pointers are the same type as they were before.
+* Linkers for ARM platforms such as [gnu ld](https://sourceware.org/binutils/docs/ld/ARM.html#ARM) have various flags to help the "interwork" process, depending on your compilation settings.
+* This is considered a very low level and platform specific feature, so potentially having to pass additional linker args **is** considered an acceptable level of complexity for the programmer.
 
 # Drawbacks
 [drawbacks]: #drawbacks
 
-Adding another attribute complicates Rust's design.
+* Adding another attribute complicates Rust's design.
 
 # Rationale and alternatives
 [rationale-and-alternatives]: #rationale-and-alternatives
 
-Extending `target_feature` to allow `#[target_feature(disable = "...")]` and adding `thumb-mode` to the whitelist would support this functionality without adding another attribute; however, this is verbose, and does not fit with the `target_feature` attribute's focus on features such as AVX and SSE whose absence is not necessarily compensated by the presence of something else.
+* Extending `target_feature` to allow `#[target_feature(disable = "...")]` and adding `thumb-mode` to the whitelist would support this functionality without adding another attribute; however, this is verbose, and does not fit with the `target_feature` attribute's current focus on features such as AVX and SSE whose absence is not necessarily compensated for by the presence of something else.
 
-Doing nothing is an option; it is possible to incorporate code using other instruction sets through other means such as external assembly.  However, this steps outside Rust's safety guarantees.
+* Doing nothing is an option; it is currently possible to incorporate code using other instruction sets through means such as external assembly and build scripts. However, this has greatly reduced ergonomics.
 
 # Prior art
 [prior-art]: #prior-art
 
-GCC supports opting into interworking with the `--thumb-interwork` flag - its syntactic equivalents to `#[isa = "a32"]` and `#[isa = "t32"]` are `__attribute__((target("arm")))` and `__attribute__((target("thumb")))`.
+In C you can use `__attribute__((target("arm")))` and `__attribute__((target("thumb")))` to access similar functionality. It's a compiler-specific extension, but it's supported by both GCC and Clang.
 
 # Unresolved questions
 [unresolved-questions]: #unresolved-questions
 
-- Are there any presently-supported architectures with a mechanism like A32/T32 which `#[isa]` could support?
+- Hopefully none?
 
 # Future possibilities
 [future-possibilities]: #future-possibilities
 
-- RISC-V allegedly supports truncated instructions in a similar fashion to T32; the `#[isa]` attribute may benefit users of that architecture in the future.
-- Should Rust gain support for the 65C816, the `#[isa]` attribute might be extended to allow shifting into its 65C02 compatibility mode and back again.
+* LLVM might eventually gain support for inter-instruction-set calls that allow calls between two arches (eg: a hybrid PowerPC/RISC-V). In that case, we could extend the attribute to allow new options.
+
+* If Rust gains support for the 65C816, the `#[instruction_set(?)]` attribute might be extended to allow shifting into its 65C02 compatibility mode and back again.

From 43a0ee25778d729a2b42a8b95a6af67dfbcad714 Mon Sep 17 00:00:00 2001
From: Lokathor <zefria@gmail.com>
Date: Sun, 1 Mar 2020 16:38:21 -0700
Subject: [PATCH 3/5] changed usage to `instruction_set(arch, set)`

---
 text/0000-isa-attribute.md | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/text/0000-isa-attribute.md b/text/0000-isa-attribute.md
index a0875a004d8..aa97e54d8ea 100644
--- a/text/0000-isa-attribute.md
+++ b/text/0000-isa-attribute.md
@@ -6,7 +6,7 @@
 # Summary
 [summary]: #summary
 
-This RFC proposes a new function attribute, `#[instruction_set(?)]` which allows you to declare the instruction set to be used with compiling the function. It also proposes two initial allowed values (`a32` and `t32`) for use with this attribute. Other allowed values could be added to the language later.
+This RFC proposes a new function attribute, `#[instruction_set(arch, set)]` which allows you to declare the instruction set to be used when compiling the function for a given arch. It also proposes two initial allowed values for the ARM arch (`a32` and `t32`). Other allowed values could be added to the language later.
 
 # Motivation
 [motivation]: #motivation
@@ -20,7 +20,7 @@ In LLVM, selecting that code should be `a32` or `t32` is done by either disablin
 # Guide-level explanation
 [guide-level-explanation]: #guide-level-explanation
 
-Some platforms support having more than one instruction set used within a single program. Generally, each one will be better for specific parts of a program. Every target has a default instruction set, based on the target triple. If you would like to set a specific function to use an alternate instruction set you use the `#[instruction_set(?)]` attribute, specifying the desired instruction set in parentheses.
+Some platforms support having more than one instruction set used within a single program. Generally, each one will be better for specific parts of a program. Every target has a default instruction set, based on the target triple. If you would like to set a specific function to use an alternate instruction set you use the `#[instruction_set(arch, set)]` attribute. This specifies that when the code is built for the given arch, it should use the alternate instruction set specified instead of the default one.
 
 Currently this is only of use on ARM family CPUs, which support both the `a32` and `t32` instruction sets. Targets starting with `arm` default to `a32` and targets starting with `thumb` default to `t32`.
 
@@ -33,18 +33,18 @@ fn add_one(x: i32) -> i32 {
 
 // This will compile as `a32` code on both `arm` and thumb` targets
 
-#[instruction_set(a32)]
+#[instruction_set(arm, a32)]
 fn add_five(x: i32) -> i32 {
     x + 5
 }
 ```
 
-To ease the amount of `cfg_attr` required with this attribute, if you specify an instruction set that isn't available on the target used the attribute is simply ignored. For example, if you specify `t32` and then build the code for `x86_64` or `wasm32`, the attribute is ignored.
+To help with code portability, when the function is compiled for any arch other than the arch given then the attribute has no effect. If the `add_five` function were built for `x86_64` then it would be the same as having no `instruction_set` attribute.
 
-If you specify an instruction set that the compiler doesn't recognize at all then you will get an error.
+If you specify an instruction set that the compiler doesn't recognize then you will get an error.
 
 ```rust
-#[instruction_set(unicorn)]
+#[instruction_set(arm, unicorn)]
 fn this_does_not_build() -> i32 {
     7
 }
@@ -57,9 +57,9 @@ The specifics of _when_ to specify a non-default instruction set on a function a
 
 Every target is now considered to have one default instruction set (for functions that lack the `instruction_set` attribute), as well as possibly supporting specific additional instruction sets:
 
-* Targets with `arm` arch default to the `a32` instruction set, but can also use `t32`.
-* Targets with `thumb` arch default to the `t32` instruction set, but can also use `a32`.
-* All other current targets each have only one instruction set, which is also their default instruction set.
+* The targets with names that start with `arm` default to `(arm, a32)`, but can also use `(arm, t32)`.
+* The targets with names that start with `thumb` default to `(arm, t32)`, but can also use `(arm, a32)`.
+* The `instruction_set` attribute is not currently defined for use with any other arch.
 
 Backend support:
 * In LLVM this corresponds to enabling or disabling the `thumb-mode` target feature on a function.
@@ -69,10 +69,9 @@ Guarantees:
 * If an alternate instruction set is designated on a function then the compiler _must_ respect that. It is not a hint, it is a guarantee.
 
 What is a Compile Error:
-* If an alternate instruction set is designated that is known to exist but not appropriate for the current arch (eg: `a32` on an `x86_64` build) then the compiler will silently ignore the attribute. This helps keep code as portable as possible, similar to the [windows_subsystem](https://github.com/rust-lang/rfcs/blob/master/text/1665-windows-subsystem.md) attribute being used on programs compiled for Linux and Mac simply being silently ignored.
-* If an alternate instruction set is designated that doesn't exist _anywhere_ (eg: "unicorn") then that is a compiler error.
-* If the attribute appears more than once on a function that is a compile error.
-* If the current backend is lacking support for compiling with the alternate instruction set, then that should trigger a compile error.
+* If an alternate instruction set is designated that doesn't exist (eg: "unicorn") then that is a compiler error.
+* If the attribute appears more than once for a _single arch_ on a function that is a compile error.
+* Specifying an alternate instruction set attribute more than once with each usage being for a _different arch_ it is allowed.
 
 Inlining:
 * For the alternate instruction sets proposed by this RFC, `a32` and `t32`, what is affected is the actual generated assembly and symbol placement of the generated function. If a function's body is inlined into the caller then the attribute no longer has a meaningful effect within the caller's body, and would be ignored.
@@ -82,7 +81,7 @@ How _specifically_ does it work on ARM:
 * Within an ELF file, all `t32` code functions are stored as having odd value addresses, and when a branch-exchange (`bx`) or branch-link-exchange (`blx`) instruction is used then the target address's lowest bit is used to move the CPU between the `a32` and `t32` states appropriately.
 * Accordingly, this does _not_ count as a full new ABI of its own. Both "Rust" and "C" ABI functions and function pointers are the same type as they were before.
 * Linkers for ARM platforms such as [gnu ld](https://sourceware.org/binutils/docs/ld/ARM.html#ARM) have various flags to help the "interwork" process, depending on your compilation settings.
-* This is considered a very low level and platform specific feature, so potentially having to pass additional linker args **is** considered an acceptable level of complexity for the programmer.
+* This is considered a very low level and platform specific feature, so potentially having to pass additional linker args **is** considered an acceptable level of complexity for the programmer, though we should attempt to provide "good defaults" if we can of course.
 
 # Drawbacks
 [drawbacks]: #drawbacks
@@ -112,3 +111,5 @@ In C you can use `__attribute__((target("arm")))` and `__attribute__((target("th
 * LLVM might eventually gain support for inter-instruction-set calls that allow calls between two arches (eg: a hybrid PowerPC/RISC-V). In that case, we could extend the attribute to allow new options.
 
 * If Rust gains support for the 65C816, the `#[instruction_set(?)]` attribute might be extended to allow shifting into its 65C02 compatibility mode and back again.
+
+* MIPS has a 16-bit encoding which uses a similar scheme as ARM, where the low bit of a function's address is set when the 16-bit encoding is in use for that function.

From ec70ee5cecd16871b8af6d6b4213c6e4a00d621e Mon Sep 17 00:00:00 2001
From: Lokathor <zefria@gmail.com>
Date: Fri, 10 Apr 2020 19:49:41 -0600
Subject: [PATCH 4/5] various updates in consultation with centril

Primarily this provides more clarifications as to what's specified, and also an example of "what this looks like in use".
---
 text/0000-isa-attribute.md | 94 ++++++++++++++++++++++++++++++++++----
 1 file changed, 85 insertions(+), 9 deletions(-)

diff --git a/text/0000-isa-attribute.md b/text/0000-isa-attribute.md
index aa97e54d8ea..2a610c85419 100644
--- a/text/0000-isa-attribute.md
+++ b/text/0000-isa-attribute.md
@@ -11,9 +11,7 @@ This RFC proposes a new function attribute, `#[instruction_set(arch, set)]` whic
 # Motivation
 [motivation]: #motivation
 
-Most programmers are familiar with the idea of a CPU family having more than one instruction set. `x86_64` is backwards compatible with `x86`, and an `x86_64` CPU can run an `x86` program if necessary.
-
-Starting with `ARMv4T`, many ARM CPUs support two separate instruction sets. At the time they were called "ARM code" and "Thumb code", but with the development of `AArch64`, they're now called `a32` and `t32`. Unlike with the `x86` / `x86_64` situation, on ARM you can have a single program that intersperses both `a32` and `t32` code. A particular form of branch instruction allows for the CPU to change between the two modes any time it branches, and so generally code is designated as being either `a32` or `t32` on a per-function basis.
+Starting with `ARMv4T`, many ARM CPUs support two separate instruction sets. At the time they were called "ARM code" and "Thumb code", but with the development of `AArch64`, they're now called `a32` and `t32`. Unlike with the `x86_64` architecture, where the CPU can run both `x86` and `x86_64` code, but a single program still uses just one of the two instruction sets, on ARM you can have a single program that intersperses both `a32` and `t32` code. A particular form of branch instruction allows for the CPU to change between the two modes any time it branches, and so generally code is designated as being either `a32` or `t32` on a per-function basis.
 
 In LLVM, selecting that code should be `a32` or `t32` is done by either disabling (for `a32`) or enabling (for `t32`) the `thumb-mode` target feature. Previously, Rust was able to do this using the `target_feature` attribute because it was able to either add _or subtract_ an LLVM target feature during a function. However, when [RFC 2045](https://github.com/rust-lang/rfcs/blob/master/text/2045-target-feature.md) was accepted, its final form did not allow for the subtraction of target features. Its final form is primarily designed around always opting _in_ to additional features, and it's no longer the correct tool for an "either A or B, but not both" situation like `a32`/`t32` is.
 
@@ -64,25 +62,33 @@ Every target is now considered to have one default instruction set (for function
 Backend support:
 * In LLVM this corresponds to enabling or disabling the `thumb-mode` target feature on a function.
 * Other future backends (eg: Cranelift) would presumably support this in some similar way. A "quick and dirty" version of `a32`/`t32` interworking can be achieved simply by simply placing all `a32` code in one translation unit, all `t32` code in another, and then telling the linker to sort it out. Currently, Cranelift does not support ARM chips _at all_, but they can easily work towards this over time.
+* Because Miri operates on Rust's MIR stage, this attribute doesn't affect the operation of Miri. If Miri were to some day support inline assembly this attribute would need to be taken into account for that to work right, but Miri could also simply choose to not support this attribute in combination with inline assembly.
 
 Guarantees:
 * If an alternate instruction set is designated on a function then the compiler _must_ respect that. It is not a hint, it is a guarantee.
 
+Where can this attribute be used:
+* This attribute can be used on any `fn` item that has a body: Free functions, inherent methods, trait default methods, and trait impl methods.
+* This attribute cannot be used on closures or within `extern` block declarations.
+* (Allowing this on trait prototypes is a Future Possibility.)
+
 What is a Compile Error:
-* If an alternate instruction set is designated that doesn't exist (eg: "unicorn") then that is a compiler error.
+* If an alternate instruction set is designated that doesn't exist (eg: "unicorn") then that is a compiler error. Later versions of the compiler/language are free to add additional arch/instruction set pairs.
 * If the attribute appears more than once for a _single arch_ on a function that is a compile error.
 * Specifying an alternate instruction set attribute more than once with each usage being for a _different arch_ it is allowed.
 
 Inlining:
 * For the alternate instruction sets proposed by this RFC, `a32` and `t32`, what is affected is the actual generated assembly and symbol placement of the generated function. If a function's body is inlined into the caller then the attribute no longer has a meaningful effect within the caller's body, and would be ignored.
-* This does mean that any inline `asm!` calls in alternate instruction set functions could be inlined into the wrong instruction set within the caller's body. It would be up to the programmer to specify `inline(never)` if this is a concern. However, the primary goal of this RFC is to eliminate the need for inline `asm!` in the first place.
+* This does mean that any inline `asm!` calls in alternate instruction set functions could be inlined into the wrong instruction set within the caller's body. That is one reason why `asm!` is unsafe.
 
 How _specifically_ does it work on ARM:
-* Within an ELF file, all `t32` code functions are stored as having odd value addresses, and when a branch-exchange (`bx`) or branch-link-exchange (`blx`) instruction is used then the target address's lowest bit is used to move the CPU between the `a32` and `t32` states appropriately.
-* Accordingly, this does _not_ count as a full new ABI of its own. Both "Rust" and "C" ABI functions and function pointers are the same type as they were before.
-* Linkers for ARM platforms such as [gnu ld](https://sourceware.org/binutils/docs/ld/ARM.html#ARM) have various flags to help the "interwork" process, depending on your compilation settings.
+* Within an ELF file, all `t32` code functions are stored as having odd value addresses, and when a branch-exchange (`bx`) or branch-link-exchange (`blx`) instruction is used then the target address's lowest bit is used to move the CPU between the `a32` and `t32` states appropriately. See the [ARM ELF spec](https://static.docs.arm.com/ihi0044/g/aaelf32.pdf), section 5.5.3.
+* Accordingly, this does _not_ count as a full new ABI of its own. Both "Rust" and "C" ABI functions and function pointers are the same type as they were before. See the [ARM Procedure Call Standard](https://developer.arm.com/docs/ihi0042/g/procedure-call-standard-for-the-arm-architecture-abi-2018q4-documentation).
+* Linkers for ARM platforms such as [gnu ld](https://sourceware.org/binutils/docs/ld/ARM.html#ARM) have various flags to help the "interwork" process, depending on your compilation settings. In the case of GNU ld it's called [-mthumb-interwork](https://sourceware.org/binutils/docs/ld/ARM.html)
 * This is considered a very low level and platform specific feature, so potentially having to pass additional linker args **is** considered an acceptable level of complexity for the programmer, though we should attempt to provide "good defaults" if we can of course.
 
+TODO: `-mthumb-interwork` is an `as`/`gcc` arg, not an `ld` arg, fix the link above
+
 # Drawbacks
 [drawbacks]: #drawbacks
 
@@ -91,14 +97,82 @@ How _specifically_ does it work on ARM:
 # Rationale and alternatives
 [rationale-and-alternatives]: #rationale-and-alternatives
 
+## Rationale
+
+Here's a simple but complete-enough example of how this would be used in practice. In this example, the program is for the Game Boy Advance (GBA). I have attempted to limit it to the essentials, so all the MMIO definitions, as well as the assembly runtime you'd need to boot and call `main`, are omitted from the example.
+
+```rust
+// The GBA's BIOS provides some functionality available via software
+// interrupt. We expose them to Rust in our assumed assembly "runtime".
+extern "C" {
+    /// Puts the CPU into a low-power state until a vblank interrupt,
+    /// and then returns after the interrupt handler completes.
+    fn VBlankInterWait(_: isize, _: isize);
+}
+
+// We assume that the MMIO stuff is imported from somewhere.
+// The exact addresses and constant values aren't important.
+mod all_the_gba_mmio_definitions;
+use all_the_gba_mmio_definitions::*;
+
+fn main() {
+    // All of the `write_volatile` calls here refer to
+    // the method of the `*mut T` type. Proper safe abstractions
+    // for all of this would complicate the example, so we
+    // simply use raw pointers and one large `unsafe` block.
+    unsafe {
+        // set the interrupt function to be our handler
+        INTR_FN_ADDR.write_volatile(core::mem::transmute(my_inter_fn));
+
+        // enable vblank interrupts
+        DISPSTAT.write_volatile(DISPSTAT_VBLANK);
+        IME.write_volatile(IME_VBLANK);
+        IE.write_volatile(true);
+        
+        // set the device for a basic display mode.
+        DISPCNT.write_volatile(MODE3_BG2);
+        let mut x = 0;
+        loop {
+            // wait in a low-power state for the vertial blank to start.
+            VBlankInterWait(0, 0);
+            // draw one new red pixel per frame along the top.
+            VRAM_MODE3.row(0).col(x).write(RED);
+            x += 1;
+            // loop our position as necessary so that we don't
+            // go out of bounds.
+            if x >= VRAM_MODE3::WIDTH { x = 0 }
+        }
+    }
+}
+
+/// Responds to any interrupt by clearing all interrupt flags
+/// and then immediately returning with no other effect.
+#[instruction_set(arm, a32)]
+fn my_inter_fn() {
+    INTER_BIOS_FLAGS.write_volatile(ALL_INTER_FLAGS);
+    INTER_STANDARD_FLAGS.write_volatile(ALL_INTER_FLAGS);
+}
+```
+
+1) We setup the device with our interrupt handler.
+2) We set the device to have an interrupt every time the vertical blank starts.
+3) We set the display to use a basic bitmap mode and begin our loop.
+4) Each pass of the loop we wait for vetical blank, then draw a single pixel.
+
+In the case of this particular device, the hardware interrupts go to the device's BIOS, which then calls your interrupt handler function. However, because the BIOS is `a32` code and uses a `b` branch instead of a `bx` branch-exchange, it jumps to the handler with the CPU in an `a32` state. If the handler were written as `t32` code it would immediately trigger UB.
+
+## Alternatives 
+
 * Extending `target_feature` to allow `#[target_feature(disable = "...")]` and adding `thumb-mode` to the whitelist would support this functionality without adding another attribute; however, this is verbose, and does not fit with the `target_feature` attribute's current focus on features such as AVX and SSE whose absence is not necessarily compensated for by the presence of something else.
 
 * Doing nothing is an option; it is currently possible to incorporate code using other instruction sets through means such as external assembly and build scripts. However, this has greatly reduced ergonomics.
 
+* Of note is the fact that this is a feature that mostly improves Rust's support for the more legacy end of ARM devices. Newer devices, with much larger amounts of memory (relatively), don't usually benefit as much. They could simply compile the entire program as `a32`, without needing to gain the space savings of `t32` code.
+
 # Prior art
 [prior-art]: #prior-art
 
-In C you can use `__attribute__((target("arm")))` and `__attribute__((target("thumb")))` to access similar functionality. It's a compiler-specific extension, but it's supported by both GCC and Clang.
+In C you can use `__attribute__((target("arm")))` and `__attribute__((target("thumb")))` to access similar functionality. It's a compiler-specific extension, but it's supported by both GCC and Clang ([this PR](https://reviews.llvm.org/D33721) appears to be the one that added this feature to LLVM/clang).
 
 # Unresolved questions
 [unresolved-questions]: #unresolved-questions
@@ -113,3 +187,5 @@ In C you can use `__attribute__((target("arm")))` and `__attribute__((target("th
 * If Rust gains support for the 65C816, the `#[instruction_set(?)]` attribute might be extended to allow shifting into its 65C02 compatibility mode and back again.
 
 * MIPS has a 16-bit encoding which uses a similar scheme as ARM, where the low bit of a function's address is set when the 16-bit encoding is in use for that function.
+
+* It might become possible to apply this attribute to trait prototypes in a future version. The main problems are properly specifying it and also that it would add additonal compiler complexity for very minimal gain (since each impl of the trait can use it on their impl of a method if they want).

From b98a549978b42b892addbdb69a7a5927246c3dd5 Mon Sep 17 00:00:00 2001
From: Lokathor <zefria@gmail.com>
Date: Thu, 23 Apr 2020 17:49:29 -0600
Subject: [PATCH 5/5] updates based on 2020-04-23 lang meeting.

---
 text/0000-isa-attribute.md | 105 ++++++++++++++++++++++---------------
 1 file changed, 62 insertions(+), 43 deletions(-)

diff --git a/text/0000-isa-attribute.md b/text/0000-isa-attribute.md
index 2a610c85419..e194ea47c97 100644
--- a/text/0000-isa-attribute.md
+++ b/text/0000-isa-attribute.md
@@ -6,66 +6,65 @@
 # Summary
 [summary]: #summary
 
-This RFC proposes a new function attribute, `#[instruction_set(arch, set)]` which allows you to declare the instruction set to be used when compiling the function for a given arch. It also proposes two initial allowed values for the ARM arch (`a32` and `t32`). Other allowed values could be added to the language later.
+This RFC proposes a new function attribute, `#[instruction_set(set)]` which allows you to declare the instruction set to be used when compiling the function. It also proposes two initial allowed values for the ARM arch (`arm::a32` and `arm::t32`). Other allowed values could be added to the language later.
 
 # Motivation
 [motivation]: #motivation
 
-Starting with `ARMv4T`, many ARM CPUs support two separate instruction sets. At the time they were called "ARM code" and "Thumb code", but with the development of `AArch64`, they're now called `a32` and `t32`. Unlike with the `x86_64` architecture, where the CPU can run both `x86` and `x86_64` code, but a single program still uses just one of the two instruction sets, on ARM you can have a single program that intersperses both `a32` and `t32` code. A particular form of branch instruction allows for the CPU to change between the two modes any time it branches, and so generally code is designated as being either `a32` or `t32` on a per-function basis.
+Starting with `ARMv4T`, many ARM CPUs support two separate instruction sets. At the time they were called "ARM code" and "Thumb code", but with the development of `AArch64`, they're now called `a32` and `t32`. Unlike with the `x86_64` architecture, where the CPU can run both `x86` and `x86_64` code, but a single program still uses just one of the two instruction sets, on ARM you can have a single program that intersperses both `a32` and `t32` code. A particular form of branch instruction allows for the CPU to change between the two modes any time it branches, and so code can be designated as being either `a32` or `t32` on a per-function basis.
 
 In LLVM, selecting that code should be `a32` or `t32` is done by either disabling (for `a32`) or enabling (for `t32`) the `thumb-mode` target feature. Previously, Rust was able to do this using the `target_feature` attribute because it was able to either add _or subtract_ an LLVM target feature during a function. However, when [RFC 2045](https://github.com/rust-lang/rfcs/blob/master/text/2045-target-feature.md) was accepted, its final form did not allow for the subtraction of target features. Its final form is primarily designed around always opting _in_ to additional features, and it's no longer the correct tool for an "either A or B, but not both" situation like `a32`/`t32` is.
 
 # Guide-level explanation
 [guide-level-explanation]: #guide-level-explanation
 
-Some platforms support having more than one instruction set used within a single program. Generally, each one will be better for specific parts of a program. Every target has a default instruction set, based on the target triple. If you would like to set a specific function to use an alternate instruction set you use the `#[instruction_set(arch, set)]` attribute. This specifies that when the code is built for then given arch, it should use the alternate instruction set specified instead of the default one.
+Some platforms support having more than one instruction set used within a single program. Generally, each one will be better for specific parts of a program. Every target has a default instruction set, based on the target triple. If you would like to set a specific function to use an alternate instruction set you use the `#[instruction_set(set)]` attribute.
 
-Currently this is only of use on ARM family CPUs, which support both the `a32` and `t32` instruction sets. Targets starting with `arm` default to `a32` and targets starting with `thumb` default to `t32`.
+Currently this is only of use on ARM family CPUs, which support both the `arm::a32` and `arm::t32` instruction sets. Targets starting with `arm` (eg: `arm-linux-androideabi`) default to `arm::a32` and targets starting with `thumb` (eg: `thumbv7neon-linux-androideabi`) default to `arm::t32`.
 
 ```rust
 // this uses the default instruction set for your target
-
 fn add_one(x: i32) -> i32 {
     x + 1
 }
 
 // This will compile as `a32` code on both `arm` and thumb` targets
-
-#[instruction_set(arm, a32)]
+#[instruction_set(arm::a32)]
 fn add_five(x: i32) -> i32 {
     x + 5
 }
 ```
 
-To help with code portability, when the function is compiled for any arch other than the arch given then the attribute has no effect. If the `add_five` function were built for `x86_64` then it would be the same as having no `instruction_set` attribute.
-
-If you specify an instruction set that the compiler doesn't recognize then you will get an error.
+It is a compile-time error to specify an instruction set that is not available on the target you're compiling for. Users wishing for their code to be as portable as possible should use `cfg_attr` to only enable the attribute when using the appropriate targets.
 
 ```rust
-#[instruction_set(arm, unicorn)]
-fn this_does_not_build() -> i32 {
-    7
+// This will fail to build if `arm::a32` isn't available
+#[instruction_set(arm::a32)]
+fn add_five(x: i32) -> i32 {
+    x + 5
+}
+
+// This will build on all platforms, and apply the `instruction_set` attribute
+// only on ARM targets.
+#[cfg_attr(target_arch = "arm", instruction_set(arm::a32))]
+fn add_six(x: i32) -> i32 {
+    x + 6
 }
 ```
 
-The specifics of _when_ to specify a non-default instruction set on a function are platform specific. Unless a piece of platform documentation has indicated a specific requirement, you do not need to think about adding this attribute at all.
+As you can see it can get a little verbose, so projects which plan to use the `instruction_set` attribute might want to consider writing a proc-macro with a shorter name.
+
+The specifics of _when_ you should specify a non-default instruction set on a function are platform specific. Unless a piece of platform documentation has indicated a specific requirement, you do not need to think about adding this attribute at all.
 
 # Reference-level explanation
 [reference-level-explanation]: #reference-level-explanation
 
 Every target is now considered to have one default instruction set (for functions that lack the `instruction_set` attribute), as well as possibly supporting specific additional instruction sets:
 
-* The targets with names that start with `arm` default to `(arm, a32)`, but can also use `(arm, t32)`.
-* The targets with names that start with `thumb` default to `(arm, t32)`, but can also use `(arm, a32)`.
+* The targets with names that start with `arm` default to `arm::a32`, but can also use `arm::t32`.
+* The targets with names that start with `thumb` default to `arm::t32`, but can also use `arm::a32`.
 * The `instruction_set` attribute is not currently defined for use with any other arch.
-
-Backend support:
-* In LLVM this corresponds to enabling or disabling the `thumb-mode` target feature on a function.
-* Other future backends (eg: Cranelift) would presumably support this in some similar way. A "quick and dirty" version of `a32`/`t32` interworking can be achieved simply by simply placing all `a32` code in one translation unit, all `t32` code in another, and then telling the linker to sort it out. Currently, Cranelift does not support ARM chips _at all_, but they can easily work towards this over time.
-* Because Miri operates on Rust's MIR stage, this attribute doesn't affect the operation of Miri. If Miri were to some day support inline assembly this attribute would need to be taken into account for that to work right, but Miri could also simply choose to not support this attribute in combination with inline assembly.
-
-Guarantees:
-* If an alternate instruction set is designated on a function then the compiler _must_ respect that. It is not a hint, it is a guarantee.
+* To avoid possible name clashes, the convention for this attribute is that the name of the instruction set itself (eg: `a32`) is prefixed with the name of the arch it goes with (eg: `arm`).
 
 Where can this attribute be used:
 * This attribute can be used on any `fn` item that has a body: Free functions, inherent methods, trait default methods, and trait impl methods.
@@ -73,21 +72,39 @@ Where can this attribute be used:
 * (Allowing this on trait prototypes is a Future Possibility.)
 
 What is a Compile Error:
-* If an alternate instruction set is designated that doesn't exist (eg: "unicorn") then that is a compiler error. Later versions of the compiler/language are free to add additional arch/instruction set pairs.
-* If the attribute appears more than once for a _single arch_ on a function that is a compile error.
+* If an alternate instruction set is designated that doesn't exist (eg: "unicorn") then that is a compiler error. Later versions of the compiler/language are free to add additional allowed instruction set values.
 * Specifying an alternate instruction set attribute more than once with each usage being for a _different arch_ it is allowed.
 
-Inlining:
-* For the alternate instruction sets proposed by this RFC, `a32` and `t32`, what is affected is the actual generated assembly and symbol placement of the generated function. If a function's body is inlined into the caller then the attribute no longer has a meaningful effect within the caller's body, and would be ignored.
-* This does mean that any inline `asm!` calls in alternate instruction set functions could be inlined into the wrong instruction set within the caller's body. That is one reason why `asm!` is unsafe.
+Guarantees:
+* If an alternate instruction set is designated on a function then the compiler _must_ respect that. It is not a hint, it is a guarantee.
+* The exact details of an `instruction_set` guarantee vary by target.
+* Notably, the `instruction_set` attribute is most likely to interact (in a target specific way) with function inlining and use of inline assembly.
+
+## ARM
 
-How _specifically_ does it work on ARM:
-* Within an ELF file, all `t32` code functions are stored as having odd value addresses, and when a branch-exchange (`bx`) or branch-link-exchange (`blx`) instruction is used then the target address's lowest bit is used to move the CPU between the `a32` and `t32` states appropriately. See the [ARM ELF spec](https://static.docs.arm.com/ihi0044/g/aaelf32.pdf), section 5.5.3.
-* Accordingly, this does _not_ count as a full new ABI of its own. Both "Rust" and "C" ABI functions and function pointers are the same type as they were before. See the [ARM Procedure Call Standard](https://developer.arm.com/docs/ihi0042/g/procedure-call-standard-for-the-arm-architecture-abi-2018q4-documentation).
-* Linkers for ARM platforms such as [gnu ld](https://sourceware.org/binutils/docs/ld/ARM.html#ARM) have various flags to help the "interwork" process, depending on your compilation settings. In the case of GNU ld it's called [-mthumb-interwork](https://sourceware.org/binutils/docs/ld/ARM.html)
-* This is considered a very low level and platform specific feature, so potentially having to pass additional linker args **is** considered an acceptable level of complexity for the programmer, though we should attempt to provide "good defaults" if we can of course.
+(this portion is a little extra technical, and very platform specific)
 
-TODO: `-mthumb-interwork` is an `as`/`gcc` arg, not an `ld` arg, fix the link above
+On ARM, there are two different instruction encodings. In textual/assembly form, Thumb assembly is written as a subset of ARM assembly, but the actual bit patterns produced when the text is assembled are entirely different. The CPU has a bit within the Program Status Register that indicates if the CPU should read 4 bytes at the Program Counter address and interpret them as an `a32` opcode, or if it should read 2 bytes at the Program Counter address and interpret them as a `t32` opcode. Because the amount of data read and the interpretation of the data are totally dissimilar, attempting to read one form of code while the CPU's flag is set for the other form of code is Undefined Behavior.
+
+The outside world can tell what type of code a given function is based on the address of the function: `a32` code has an even address, and `t32` code has an odd address. The Program Counter ignores the actual value of the low bit, so `t32` code is still considered to be "aligned to 2". When a branch-exchange (`bx`) or branch-link-exchange (`blx`) instruction is used, the target address's lowest bit determines the CPU's new code state. When a branch (`b`) or branch-link (`bl`) instruction is used, the CPU's code state is _not_ changed.
+
+Thus, what we have to ensure with `a32` and `t32` is that the code generated for the marked function has the right encoding and also that the address is correctly even or odd:
+
+* It is _Guaranteed_ that the address of the function will be correctly even or odd, and also that the start of the function's body will be in the correct encoding.
+* It is _Hinted_ for the entire function body to generate with a single encoding.
+* If necessary, it is considered conforming for a compiler to insert only a stub of the correct encoding and address, which then jumps to a function body using another encoding. This should be considered a fallback strategy, but it would technically satisfy the requirements.
+
+Backend support:
+* In LLVM this corresponds to enabling or disabling the `thumb-mode` target feature on a particular function.
+* Other future backends (eg: Cranelift) would presumably support this in some similar way. A "quick and dirty" version of `a32`/`t32` interworking can be achieved simply by placing all `a32` code in one translation unit, all `t32` code in another, and then telling the linker to sort it out. Currently, Cranelift does not support ARM chips _at all_, but they can easily work towards this over time.
+* Because Miri operates on Rust's MIR stage, this attribute doesn't affect the operation of Miri. If Miri were to some day support inline assembly this attribute would need to be taken into account for that to work right, but Miri could also simply choose to not support this attribute in combination with inline assembly.
+* Assemblers and Linkers for ARM platforms have flags to enable the "interwork" of `a32` and `t32` code. If a user is writing their own assembly and then linking that with Rust code manually they might have to adjust their flags appropriately. This is mostly an implementation detail, though we can do our best to document that in the reference, and to provide any "good defaults" on our end.
+
+Inlining:
+* If a function call is inlined, there's no longer an actual branch to another address, so if a function written entirely in Rust with the `instruction_set` attribute is inlined into the caller, there's no further effect for the attribute to have.
+* If a function with an `instruction_set` attribute _also_ contains an inline assembly block, things are complicated. Even if the assembly text _were_ valid within the instruction set it was inlined into, checking if that's the case or not would involve inspecting the assembly string and then making decisions based on that, which is explicitly against the design intent of the inline assembly feature (that the compiler should generally not inspect the assembly string).
+* Unfortunately, it's also not always clear to the programmer when inlining happens because sometimes a function might be inlined up through several layers of the call stack.
+* How to resolve this is an Unresolved Question (see below).
 
 # Drawbacks
 [drawbacks]: #drawbacks
@@ -133,7 +150,7 @@ fn main() {
         DISPCNT.write_volatile(MODE3_BG2);
         let mut x = 0;
         loop {
-            // wait in a low-power state for the vertial blank to start.
+            // wait in a low-power state for the vertical blank to start.
             VBlankInterWait(0, 0);
             // draw one new red pixel per frame along the top.
             VRAM_MODE3.row(0).col(x).write(RED);
@@ -147,7 +164,7 @@ fn main() {
 
 /// Responds to any interrupt by clearing all interrupt flags
 /// and then immediately returning with no other effect.
-#[instruction_set(arm, a32)]
+#[instruction_set(arm::a32)]
 fn my_inter_fn() {
     INTER_BIOS_FLAGS.write_volatile(ALL_INTER_FLAGS);
     INTER_STANDARD_FLAGS.write_volatile(ALL_INTER_FLAGS);
@@ -157,13 +174,13 @@ fn my_inter_fn() {
 1) We setup the device with our interrupt handler.
 2) We set the device to have an interrupt every time the vertical blank starts.
 3) We set the display to use a basic bitmap mode and begin our loop.
-4) Each pass of the loop we wait for vetical blank, then draw a single pixel.
+4) Each pass of the loop we wait for vertical blank, then draw a single pixel to video memory.
 
 In the case of this particular device, the hardware interrupts go to the device's BIOS, which then calls your interrupt handler function. However, because the BIOS is `a32` code and uses a `b` branch instead of a `bx` branch-exchange, it jumps to the handler with the CPU in an `a32` state. If the handler were written as `t32` code it would immediately trigger UB.
 
 ## Alternatives 
 
-* Extending `target_feature` to allow `#[target_feature(disable = "...")]` and adding `thumb-mode` to the whitelist would support this functionality without adding another attribute; however, this is verbose, and does not fit with the `target_feature` attribute's current focus on features such as AVX and SSE whose absence is not necessarily compensated for by the presence of something else.
+* Extending `target_feature` to allow `#[target_feature(disable = "...")]` and adding `thumb-mode` to the whitelist would support this functionality without adding another distinct attribute; however, this does not fit with the `target_feature` attribute's current focus on features such as AVX and SSE whose absence is not necessarily compensated for by the presence of something else.
 
 * Doing nothing is an option; it is currently possible to incorporate code using other instruction sets through means such as external assembly and build scripts. However, this has greatly reduced ergonomics.
 
@@ -177,15 +194,17 @@ In C you can use `__attribute__((target("arm")))` and `__attribute__((target("th
 # Unresolved questions
 [unresolved-questions]: #unresolved-questions
 
-- Hopefully none?
+- How do we ensure that `instruction_set` and inline assembly always interact correctly? This isn't an implementation blocker but needs to be resolved before Stabilization of the attribute.
+  * Currently, LLVM will not inline `a32` functions into `t32` functions and vice versa, because they count as different code targets. However, this is not necessarily a guarantee from LLVM; it could just be the current implementation, so more investigation is needed.
 
 # Future possibilities
 [future-possibilities]: #future-possibilities
 
-* LLVM might eventually gain support for inter-instruction-set calls that allow calls between two arches (eg: a hybrid PowerPC/RISC-V). In that case, we could extend the attribute to allow new options.
-
 * If Rust gains support for the 65C816, the `#[instruction_set(?)]` attribute might be extended to allow shifting into its 65C02 compatibility mode and back again.
 
 * MIPS has a 16-bit encoding which uses a similar scheme as ARM, where the low bit of a function's address is set when the 16-bit encoding is in use for that function.
 
-* It might become possible to apply this attribute to trait prototypes in a future version. The main problems are properly specifying it and also that it would add additonal compiler complexity for very minimal gain (since each impl of the trait can use it on their impl of a method if they want).
+* It might become possible to apply this attribute to trait prototypes in a future version, in which case all impls of the method would take on the attribute. The main problems are properly specifying it and also that it would add additional compiler complexity for very minimal gain.
+  * Even without this change, a particular impl of the trait can use the attribute on its methods.
+
+* LLVM might eventually gain support for inter-instruction-set calls that allow calls between two arches (eg: a hybrid PowerPC/RISC-V).