@@ -29,7 +29,7 @@ You may also be interested in the [grammar].
29
29
30
30
# Notation
31
31
32
- Rust's grammar is defined over Unicode codepoints, each conventionally denoted
32
+ Rust's grammar is defined over Unicode code points, each conventionally denoted
33
33
`U+XXXX`, for 4 or more hexadecimal digits `X`. _Most_ of Rust's grammar is
34
34
confined to the ASCII range of Unicode, and is described in this document by a
35
35
dialect of Extended Backus-Naur Form (EBNF), specifically a dialect of EBNF
53
53
- Square brackets are used to group rules.
54
54
- `LITERAL` is a single printable ASCII character, or an escaped hexadecimal
55
55
ASCII code of the form `\xQQ`, in single quotes, denoting the corresponding
56
- Unicode codepoint `U+00QQ`.
56
+ Unicode code point `U+00QQ`.
57
57
- `IDENTIFIER` is a nonempty string of ASCII letters and underscores.
58
58
- The `repeat` forms apply to the adjacent `element`, and are as follows:
59
59
- `?` means zero or one repetition
@@ -66,9 +66,9 @@ This EBNF dialect should hopefully be familiar to many readers.
66
66
67
67
## Unicode productions
68
68
69
- A few productions in Rust's grammar permit Unicode codepoints outside the ASCII
69
+ A few productions in Rust's grammar permit Unicode code points outside the ASCII
70
70
range. We define these productions in terms of character properties specified
71
- in the Unicode standard, rather than in terms of ASCII-range codepoints. The
71
+ in the Unicode standard, rather than in terms of ASCII-range code points. The
72
72
section [Special Unicode Productions](#special-unicode-productions) lists these
73
73
productions.
74
74
@@ -91,10 +91,10 @@ production. See [tokens](#tokens) for more information.
91
91
92
92
## Input format
93
93
94
- Rust input is interpreted as a sequence of Unicode codepoints encoded in UTF-8.
94
+ Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8.
95
95
Most Rust grammar rules are defined in terms of printable ASCII-range
96
- codepoints, but a small number are defined in terms of Unicode properties or
97
- explicit codepoint lists. [^inputformat]
96
+ code points, but a small number are defined in terms of Unicode properties or
97
+ explicit code point lists. [^inputformat]
98
98
99
99
[^inputformat]: Substitute definitions for the special Unicode productions are
100
100
provided to the grammar verifier, restricted to ASCII range, when verifying the
@@ -147,7 +147,7 @@ comments beginning with exactly one repeated asterisk in the block-open
147
147
sequence (`/**`), are interpreted as a special syntax for `doc`
148
148
[attributes](#attributes). That is, they are equivalent to writing
149
149
`#[doc="..."]` around the body of the comment (this includes the comment
150
- characters themselves, ie `/// Foo` turns into `#[doc="/// Foo"]`).
150
+ characters themselves, i.e. `/// Foo` turns into `#[doc="/// Foo"]`).
151
151
152
152
`//!` comments apply to the parent of the comment, rather than the item that
153
153
follows. `//!` comments are usually used to display information on the crate
@@ -330,14 +330,14 @@ Some additional _escapes_ are available in either character or non-raw string
330
330
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
331
331
following forms:
332
332
333
- * An _8-bit codepoint escape_ escape starts with `U+0078` (`x`) and is
334
- followed by exactly two _hex digits_. It denotes the Unicode codepoint
333
+ * An _8-bit code point escape_ starts with `U+0078` (`x`) and is
334
+ followed by exactly two _hex digits_. It denotes the Unicode code point
335
335
equal to the provided hex value.
336
- * A _24-bit codepoint escape_ starts with `U+0075` (`u`) and is followed
336
+ * A _24-bit code point escape_ starts with `U+0075` (`u`) and is followed
337
337
by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D`
338
- (`}`). It denotes the Unicode codepoint equal to the provided hex value.
338
+ (`}`). It denotes the Unicode code point equal to the provided hex value.
339
339
* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
340
- (`r`), or `U+0074` (`t`), denoting the unicode values `U+000A` (LF),
340
+ (`r`), or `U+0074` (`t`), denoting the Unicode values `U+000A` (LF),
341
341
`U+000D` (CR) or `U+0009` (HT) respectively.
342
342
* The _backslash escape_ is the character `U+005C` (`\`) which must be
343
343
escaped in order to denote *itself*.
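The escape forms listed in this hunk can be checked directly against the code points they denote. The following is a minimal sketch, not part of the change under review; the function name `escape_examples` is hypothetical and exists only for illustration.

```rust
/// Hypothetical helper: pairs each escaped character literal with the
/// code point it is expected to denote.
fn escape_examples() -> Vec<(char, u32)> {
    vec![
        ('\x41', 0x41),         // 8-bit code point escape: two hex digits
        ('\u{1F600}', 0x1F600), // 24-bit code point escape: up to six hex digits in braces
        ('\n', 0x0A),           // whitespace escape: LF
        ('\r', 0x0D),           // whitespace escape: CR
        ('\t', 0x09),           // whitespace escape: HT
        ('\\', 0x5C),           // backslash escape: denotes the backslash itself
    ]
}
```

Casting each `char` to `u32` and comparing it with the paired value confirms that the escape and the `U+XXXX` notation name the same code point.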
@@ -407,7 +407,7 @@ Some additional _escapes_ are available in either byte or non-raw byte string
407
407
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
408
408
following forms:
409
409
410
- * An _byte escape_ escape starts with `U+0078` (`x`) and is
410
+ * A _byte escape_ starts with `U+0078` (`x`) and is
411
411
followed by exactly two _hex digits_. It denotes the byte
412
412
equal to the provided hex value.
413
413
* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
@@ -697,9 +697,9 @@ in macro rules). In the transcriber, the designator is already known, and so
697
697
only the name of a matched nonterminal comes after the dollar sign.
698
698
699
699
In both the matcher and transcriber, the Kleene star-like operator indicates
700
- repetition. The Kleene star operator consists of `$` and parens, optionally
700
+ repetition. The Kleene star operator consists of `$` and parentheses, optionally
701
701
followed by a separator token, followed by `*` or `+`. `*` means zero or more
702
- repetitions, `+` means at least one repetition. The parens are not matched or
702
+ repetitions, `+` means at least one repetition. The parentheses are not matched or
703
703
transcribed. On the matcher side, a name is bound to _all_ of the names it
704
704
matches, in a structure that mimics the structure of the repetition encountered
705
705
on a successful match. The job of the transcriber is to sort that structure
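The repetition syntax described in this hunk (`$`, parentheses, an optional separator token, then `*` or `+`) can be illustrated with a short sketch. It is not part of the change under review, and the macro and function names (`collect_i32`, `sum_i32`, `demo`) are hypothetical.

```rust
/// Hypothetical macro: `$( ... ),*` matches zero or more expressions
/// separated by `,`. The parentheses themselves are not matched.
macro_rules! collect_i32 {
    ($($x:expr),*) => {{
        let mut v: Vec<i32> = Vec::new();
        // The transcriber repeats this statement once per matched `$x`.
        $( v.push($x); )*
        v
    }};
}

/// Hypothetical macro: `+` requires at least one repetition.
macro_rules! sum_i32 {
    ($($x:expr),+) => { 0i32 $( + $x )+ };
}

/// Uses both macros: one collects its arguments, the other adds them up.
fn demo() -> (Vec<i32>, i32) {
    (collect_i32!(1, 2, 3), sum_i32!(4, 5, 6))
}
```

With these definitions, `collect_i32!()` is accepted because `*` allows zero repetitions, whereas `sum_i32!()` would be rejected at compile time because `+` needs at least one.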
@@ -1209,9 +1209,9 @@ the guarantee that these issues are never caused by safe code.
1209
1209
1210
1210
[noalias]: http://llvm.org/docs/LangRef.html#noalias
1211
1211
1212
- ##### Behaviour not considered unsafe
1212
+ ##### Behavior not considered unsafe
1213
1213
1214
- This is a list of behaviour not considered *unsafe* in Rust terms, but that may
1214
+ This is a list of behavior not considered *unsafe* in Rust terms, but that may
1215
1215
be undesired.
1216
1216
1217
1217
* Deadlocks
@@ -1304,7 +1304,7 @@ specific type, but may implement several different traits, or be compatible with
1304
1304
several different type constraints.
1305
1305
1306
1306
For example, the following defines the type `Point` as a synonym for the type
1307
- `(u8, u8)`, the type of pairs of unsigned 8 bit integers. :
1307
+ `(u8, u8)`, the type of pairs of unsigned 8-bit integers:
1308
1308
1309
1309
```
1310
1310
type Point = (u8, u8);
@@ -1958,7 +1958,7 @@ type int8_t = i8;
1958
1958
1959
1959
### Crate-only attributes
1960
1960
1961
- - `crate_name` - specify the this crate's crate name.
1961
+ - `crate_name` - specify the crate's name.
1962
1962
- `crate_type` - see [linkage](#linkage).
1963
1963
- `feature` - see [compiler features](#compiler-features).
1964
1964
- `no_builtins` - disable optimizing certain code patterns to invocations of
@@ -3432,7 +3432,7 @@ is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to
3432
3432
UTF-32 string.
3433
3433
3434
3434
A value of type `str` is a Unicode string, represented as an array of 8-bit
3435
- unsigned bytes holding a sequence of UTF-8 codepoints. Since `str` is of
3435
+ unsigned bytes holding a sequence of UTF-8 code points. Since `str` is of
3436
3436
unknown size, it is not a _first-class_ type, but can only be instantiated
3437
3437
through a pointer type, such as `&str` or `String`.
3438
3438
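The distinction between the bytes of a `str` and the code points they encode can be made concrete with a small sketch. It is not part of the change under review, and the function name `byte_and_char_counts` is hypothetical.

```rust
/// Hypothetical helper: returns (number of UTF-8 bytes, number of code points)
/// for a string slice. `str` is only reachable through a pointer type,
/// so the parameter is `&str`.
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    // `len` counts bytes of the UTF-8 encoding; `chars` iterates code points.
    (s.len(), s.chars().count())
}
```

For example, `"h\u{e9}llo"` holds five code points but six bytes, because U+00E9 takes two bytes in UTF-8. A `String` can be passed as `&owned`, since it dereferences to `str`.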