@@ -59,7 +59,7 @@ Unicode scalar values may appear within {StringValue} and {Comment}.

  Note: An implementation which uses _UTF-16_ to represent GraphQL documents in
  memory (for example, JavaScript or Java) may encounter a _surrogate pair_. This
- encodes a _supplementary code point_ and is a single valid source character,
+ encodes one _supplementary code point_ and is a single valid source character,
  however an unpaired _surrogate code point_ is not a valid source character.

  ### White Space
@@ -105,10 +105,9 @@ CommentChar :: SourceCharacter but not LineTerminator
  GraphQL source documents may contain single-line comments, starting with the
  {`#`} marker.

- A comment can contain any Unicode code point in {SourceCharacter} except
- {LineTerminator} so a comment always consists of all code points starting with
- the {`#`} character up to but not including the {LineTerminator} (or end of the
- source).
+ A comment may contain any {SourceCharacter} except {LineTerminator} so a comment
+ always consists of all {SourceCharacter} starting with the {`#`} character up to
+ but not including the {LineTerminator} (or end of the source).

  Comments are {Ignored} like white space and may appear after any token, or
  before a {LineTerminator}, and have no significance to the semantic meaning of a
@@ -171,10 +170,9 @@ UnicodeBOM :: "Byte Order Mark (U+FEFF)"

  The _Byte Order Mark_ is a special Unicode code point which may appear at the
  beginning of a file which programs may use to determine the fact that the text
- stream is Unicode, and what specific encoding has been used.
-
- As files are often concatenated, a _Byte Order Mark_ may appear anywhere within
- a GraphQL document and is {Ignored}.
+ stream is Unicode, and what specific encoding has been used. As files are often
+ concatenated, a _Byte Order Mark_ may appear before or after any lexical token
+ and is {Ignored}.

  ### Punctuators

@@ -831,13 +829,10 @@ BlockStringCharacter ::
  - SourceCharacter but not `"""` or `\"""`
  - `\"""`

- {StringValue} is a sequence of characters wrapped in quotation marks (U+0022).
- (ex. {`"Hello World"`}). White space and other characters ignored in other parts
- of a GraphQL document are significant within a string value.
-
- A {StringValue} is evaluated to a Unicode text value, a sequence of Unicode
- scalar values, by interpreting all escape sequences using the static semantics
- defined below.
+ A {StringValue} is evaluated to a _Unicode text_ value, a sequence of _Unicode
+ scalar value_, by interpreting all escape sequences using the static semantics
+ defined below. White space and other characters ignored between lexical tokens
+ are significant within a string value.

  The empty string {`""`} must not be followed by another {`"`} otherwise it would
  be interpreted as the beginning of a block string. As an example, the source
@@ -846,43 +841,45 @@ empty strings.

  **Escape Sequences**

- In a single-quoted {StringValue}, any Unicode scalar value may be expressed
+ In a single-quoted {StringValue}, any _Unicode scalar value_ may be expressed
  using an escape sequence. GraphQL strings allow both C-style escape sequences
  (for example `\n`) and two forms of Unicode escape sequences: one with a
  fixed-width of 4 hexadecimal digits (for example `\u000A`) and one with a
  variable-width most useful for representing a _supplementary character_ such as
  an Emoji (for example `\u{1F4A9}`).

  The hexadecimal number encoded by a Unicode escape sequence must describe a
- Unicode scalar value, otherwise parsing should stop with an early error. For
- example both sources `"\uDEAD"` and `"\u{110000}"` should not be considered
- valid {StringValue}.
+ _Unicode scalar value_, otherwise must result in a parse error. For example both
+ sources `"\uDEAD"` and `"\u{110000}"` should not be considered valid
+ {StringValue}.

  Escape sequences are only meaningful within a single-quoted string. Within a
  block string, they are simply that sequence of characters (for example
- `"""\n"""` represents the Unicode text [U+005C, U+006E]). Within a comment an
+ `"""\n"""` represents the _Unicode text_ [U+005C, U+006E]). Within a comment an
  escape sequence is not a significant sequence of characters. They may not appear
  elsewhere in a GraphQL document.

- Since {StringCharacter} must not contain some characters, escape sequences must
- be used to represent these characters. All other escape sequences are optional
- and unescaped non-ASCII Unicode characters are allowed within strings. If using
- GraphQL within a system which only supports ASCII, then escape sequences may be
- used to represent all Unicode characters outside of the ASCII range.
+ Since {StringCharacter} must not contain some code points directly (for example,
+ a {LineTerminator}), escape sequences must be used to represent them. All other
+ escape sequences are optional and unescaped non-ASCII Unicode characters are
+ allowed within strings. If using GraphQL within a system which only supports
+ ASCII, then escape sequences may be used to represent all Unicode characters
+ outside of the ASCII range.

  For legacy reasons, a _supplementary character_ may be escaped by two
  fixed-width unicode escape sequences forming a _surrogate pair_. For example the
  input `"\uD83D\uDCA9"` is a valid {StringValue} which represents the same
- Unicode text as `"\u{1F4A9}"`. While this legacy form is allowed, it should be
+ _Unicode text_ as `"\u{1F4A9}"`. While this legacy form is allowed, it should be
  avoided as a variable-width unicode escape sequence is a clearer way to encode
  such code points.

  When producing a {StringValue}, implementations should use escape sequences to
  represent non-printable control characters (U+0000 to U+001F and U+007F to
  U+009F). Other escape sequences are not necessary, however an implementation may
- use escape sequences to represent any other range of code points. If an
- implementation chooses to escape a _supplementary character_, it should not use
- a fixed-width surrogate pair unicode escape sequence.
+ use escape sequences to represent any other range of code points (for example,
+ when producing ASCII-only output). If an implementation chooses to escape a
+ _supplementary character_, it should only use a variable-width unicode escape
+ sequence.

  **Block Strings**

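The guidance on producing a {StringValue} in the hunk above can be illustrated with a minimal Python sketch. The function name `print_string` and the `ascii_only` flag are hypothetical and not part of the specification; the sketch assumes the input is already a sequence of Unicode scalar values and rejects anything else.

```python
# Hypothetical helper: serialize Unicode text as a single-quoted StringValue.
# Short escapes are used where one exists; other control characters use the
# fixed-width form; with ascii_only=True everything above U+007E uses the
# variable-width form, so no surrogate-pair escapes are ever emitted.
SHORT_ESCAPES = {
    '"': '\\"', "\\": "\\\\", "\b": "\\b", "\f": "\\f",
    "\n": "\\n", "\r": "\\r", "\t": "\\t",
}


def print_string(value: str, ascii_only: bool = False) -> str:
    out = ['"']
    for ch in value:
        cp = ord(ch)
        if 0xD800 <= cp <= 0xDFFF:
            # A surrogate code point is not a Unicode scalar value.
            raise ValueError(f"not a Unicode scalar value: U+{cp:04X}")
        if ch in SHORT_ESCAPES:
            out.append(SHORT_ESCAPES[ch])
        elif cp <= 0x1F or 0x7F <= cp <= 0x9F:
            # Non-printable control characters (U+0000-U+001F, U+007F-U+009F).
            out.append(f"\\u{cp:04X}")
        elif ascii_only and cp > 0x7E:
            # Variable-width escape, e.g. \u{1F4A9}.
            out.append(f"\\u{{{cp:X}}}")
        else:
            out.append(ch)
    out.append('"')
    return "".join(out)
```

With `ascii_only` left off, non-ASCII characters pass through unescaped, which matches the statement that other escape sequences are optional.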
@@ -940,19 +937,21 @@ string.

  **Static Semantics**

- A {StringValue} describes a Unicode text value, a sequence of *Unicode scalar
- value*s. These semantics describe how to apply the {StringValue} grammar to a
- source text to evaluate a Unicode text. Errors encountered during this
- evaluation are considered a failure to apply the {StringValue} grammar to a
- source and result in a parsing error.
+ :: A {StringValue} describes a _Unicode text_ value, which is a sequence of
+ _Unicode scalar value_.
+
+ These semantics describe how to apply the {StringValue} grammar to a source text
+ to evaluate a _Unicode text_. Errors encountered during this evaluation are
+ considered a failure to apply the {StringValue} grammar to a source and must
+ result in a parsing error.

  StringValue :: `""`

  - Return an empty sequence.

  StringValue :: `"` StringCharacter+ `"`

- - Return the concatenated sequence of _Unicode scalar value_ by evaluating all
+ - Return the _Unicode text_ by concatenating the evaluation of all
    {StringCharacter}.

  StringCharacter :: SourceCharacter but not `"` or `\` or LineTerminator
@@ -965,7 +964,7 @@ StringCharacter :: `\u` EscapedUnicode
    within {EscapedUnicode}.
  - Assert {value} is a within the _Unicode scalar value_ range (>= 0x0000 and <=
    0xD7FF or >= 0xE000 and <= 0x10FFFF).
- - Return the code point {value}.
+ - Return the _Unicode scalar value_ {value}.

  StringCharacter :: `\u` HexDigit HexDigit HexDigit HexDigit `\u` HexDigit
  HexDigit HexDigit HexDigit
@@ -981,8 +980,8 @@ HexDigit HexDigit HexDigit
  - Otherwise:
    - Assert {leadingValue} is within the _Unicode scalar value_ range.
    - Assert {trailingValue} is within the _Unicode scalar value_ range.
-   - Return the sequence of the code point {leadingValue} followed by the code
-     point {trailingValue}.
+   - Return the sequence of the _Unicode scalar value_ {leadingValue} followed by
+     the _Unicode scalar value_ {trailingValue}.

  Note: If both escape sequences encode a _Unicode scalar value_, then this
  semantic is identical to applying the prior semantic on each fixed-width escape
@@ -991,24 +990,24 @@ value_.

  StringCharacter :: `\` EscapedCharacter

- - Return the code point represented by {EscapedCharacter} according to the table
-   below.
+ - Return the _Unicode scalar value_ represented by {EscapedCharacter} according
+   to the table below.

- | Escaped Character | Code Point | Character Name               |
- | ----------------- | ---------- | ---------------------------- |
- | {`"`}             | U+0022     | double quote                 |
- | {`\`}             | U+005C     | reverse solidus (back slash) |
- | {`/`}             | U+002F     | solidus (forward slash)      |
- | {`b`}             | U+0008     | backspace                    |
- | {`f`}             | U+000C     | form feed                    |
- | {`n`}             | U+000A     | line feed (new line)         |
- | {`r`}             | U+000D     | carriage return              |
- | {`t`}             | U+0009     | horizontal tab               |
+ | Escaped Character | Scalar Value | Character Name               |
+ | ----------------- | ------------ | ---------------------------- |
+ | {`"`}             | U+0022       | double quote                 |
+ | {`\`}             | U+005C       | reverse solidus (back slash) |
+ | {`/`}             | U+002F       | solidus (forward slash)      |
+ | {`b`}             | U+0008       | backspace                    |
+ | {`f`}             | U+000C       | form feed                    |
+ | {`n`}             | U+000A       | line feed (new line)         |
+ | {`r`}             | U+000D       | carriage return              |
+ | {`t`}             | U+0009       | horizontal tab               |

  StringValue :: `"""` BlockStringCharacter\* `"""`

- - Let {rawValue} be the concatenated sequence of _Unicode scalar value_ by
-   evaluating all {BlockStringCharacter} (which may be an empty sequence).
+ - Let {rawValue} be the _Unicode text_ by concatenating the evaluation of all
+   {BlockStringCharacter} (which may be an empty sequence).
  - Return the result of {BlockStringValue(rawValue)}.

  BlockStringCharacter :: SourceCharacter but not `"""` or `\"""`
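
The static semantics for single-quoted strings touched by this change can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not a reference implementation: the names `evaluate_string_body` and `is_scalar_value` are hypothetical, the function receives only the characters between the quotation marks, and {LineTerminator} is assumed to mean U+000A or U+000D.

```python
import re

# Scalar values for each EscapedCharacter, taken from the table in the diff.
ESCAPED_CHARACTER = {
    '"': 0x22, "\\": 0x5C, "/": 0x2F, "b": 0x08,
    "f": 0x0C, "n": 0x0A, "r": 0x0D, "t": 0x09,
}
FIXED = re.compile(r"[0-9A-Fa-f]{4}")             # \uXXXX
VARIABLE = re.compile(r"\{([0-9A-Fa-f]+)\}")      # \u{X...}


def is_scalar_value(value: int) -> bool:
    return 0x0000 <= value <= 0xD7FF or 0xE000 <= value <= 0x10FFFF


def evaluate_string_body(body: str) -> str:
    """Evaluate the StringCharacter sequence of a single-quoted StringValue."""
    out, i = [], 0
    while i < len(body):
        ch = body[i]
        if ch in '"\n\r':
            raise ValueError(f"character not allowed unescaped at index {i}")
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        kind = body[i + 1:i + 2]
        if kind == "u":
            match = VARIABLE.match(body, i + 2)
            if match:
                value, i = int(match.group(1), 16), match.end()
            else:
                match = FIXED.match(body, i + 2)
                if not match:
                    raise ValueError(f"malformed unicode escape at index {i}")
                value, i = int(match.group(), 16), match.end()
                if 0xD800 <= value <= 0xDBFF:
                    # Leading surrogate: a trailing surrogate escape must follow.
                    pair = FIXED.match(body, i + 2) if body.startswith("\\u", i) else None
                    trailing = int(pair.group(), 16) if pair else -1
                    if not 0xDC00 <= trailing <= 0xDFFF:
                        raise ValueError("unpaired leading surrogate escape")
                    value = (value - 0xD800) * 0x400 + (trailing - 0xDC00) + 0x10000
                    i = pair.end()
            if not is_scalar_value(value):
                raise ValueError(f"not a Unicode scalar value: {value:#x}")
            out.append(chr(value))
        elif kind in ESCAPED_CHARACTER:
            out.append(chr(ESCAPED_CHARACTER[kind]))
            i += 2
        else:
            raise ValueError(f"unknown escape sequence at index {i}")
    return "".join(out)
```

Two adjacent fixed-width escapes that are not a surrogate pair need no dedicated branch here: each is evaluated on its own by the loop, which agrees with the note that the result is identical to applying the single-escape semantic twice.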