Commit 801fb2c

AlisdairM authored and tkoeppe committed
[lex] Reorder subclauses to better follow phases of translation
This PR purely moves existing words around, and does not create any new content.
The proposed subclause ordering is now:

* 5 Lexical conventions
  - 5.1 Separate translation
  - 5.2 Phases of translation
  - 5.3 Characters
    - 5.3.1 Character sets
    - 5.3.2 Universal character names
  - 5.4 Comments
  - 5.5 Preprocessing tokens
  - 5.6 Header names
  - 5.7 Preprocessing numbers
  - 5.8 Operators and punctuators
  - 5.9 Alternative tokens
  - 5.10 Tokens
  - 5.11 Identifiers
  - 5.12 Keywords
  - 5.13 Literals
    - 5.13.1 Kinds of literals
    - 5.13.2 ...
1 parent cd21b72 commit 801fb2c

1 file changed: +118 -118 lines changed

source/lex.tex

Lines changed: 118 additions & 118 deletions
@@ -627,83 +627,6 @@
 \end{example}
 \indextext{token!preprocessing|)}
 
-\rSec1[lex.digraph]{Alternative tokens}
-
-\pnum
-\indextext{token!alternative|(}%
-Alternative token representations are provided for some operators and
-punctuators.
-\begin{footnote}
-\indextext{digraph}%
-These include ``digraphs'' and additional reserved words. The term
-``digraph'' (token consisting of two characters) is not perfectly
-descriptive, since one of the alternative \grammarterm{preprocessing-token}s is
-\tcode{\%:\%:} and of course several primary tokens contain two
-characters. Nonetheless, those alternative tokens that aren't lexical
-keywords are colloquially known as ``digraphs''.
-\end{footnote}
-
-\pnum
-In all respects of the language, each alternative token behaves the
-same, respectively, as its primary token, except for its spelling.
-\begin{footnote}
-Thus the ``stringized'' values\iref{cpp.stringize} of
-\tcode{[} and \tcode{<:} will be different, maintaining the source
-spelling, but the tokens can otherwise be freely interchanged.
-\end{footnote}
-The set of alternative tokens is defined in
-\tref{lex.digraph}.
-
-\begin{tokentable}{Alternative tokens}{lex.digraph}{Alternative}{Primary}
-\tcode{<\%} & \tcode{\{} &
-\keyword{and} & \tcode{\&\&} &
-\keyword{and_eq} & \tcode{\&=} \\ \rowsep
-\tcode{\%>} & \tcode{\}} &
-\keyword{bitor} & \tcode{|} &
-\keyword{or_eq} & \tcode{|=} \\ \rowsep
-\tcode{<:} & \tcode{[} &
-\keyword{or} & \tcode{||} &
-\keyword{xor_eq} & \tcode{\caret=} \\ \rowsep
-\tcode{:>} & \tcode{]} &
-\keyword{xor} & \tcode{\caret} &
-\keyword{not} & \tcode{!} \\ \rowsep
-\tcode{\%:} & \tcode{\#} &
-\keyword{compl} & \tcode{\~} &
-\keyword{not_eq} & \tcode{!=} \\ \rowsep
-\tcode{\%:\%:} & \tcode{\#\#} &
-\keyword{bitand} & \tcode{\&} &
-& \\
-\end{tokentable}%
-\indextext{token!alternative|)}
-
-\rSec1[lex.token]{Tokens}
-
-\indextext{token|(}%
-\begin{bnf}
-\nontermdef{token}\br
-identifier\br
-keyword\br
-literal\br
-operator-or-punctuator
-\end{bnf}
-
-\pnum
-\indextext{\idxgram{token}}%
-There are five kinds of tokens: identifiers, keywords, literals,%
-\begin{footnote}
-Literals include strings and character and numeric literals.
-\end{footnote}
-operators, and other separators.
-\indextext{whitespace}%
-Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments
-(collectively, ``whitespace''), as described below, are ignored except
-as they serve to separate tokens.
-\begin{note}
-Whitespace can separate otherwise adjacent identifiers, keywords, numeric
-literals, and alternative tokens containing alphabetic characters.
-\end{note}
-\indextext{token|)}
-
 \rSec1[lex.header]{Header names}
 
 \indextext{header!name|(}%
@@ -793,6 +716,124 @@
 a \grammarterm{floating-point-literal} token.%
 \indextext{number!preprocessing|)}
 
+\rSec1[lex.operators]{Operators and punctuators}
+
+\pnum
+\indextext{operator|(}%
+\indextext{punctuator|(}%
+The lexical representation of \Cpp{} programs includes a number of
+preprocessing tokens that are used in the syntax of the preprocessor or
+are converted into tokens for operators and punctuators:
+
+\begin{bnf}
+\nontermdef{preprocessing-op-or-punc}\br
+preprocessing-operator\br
+operator-or-punctuator
+\end{bnf}
+
+\begin{bnf}
+%% Ed. note: character protrusion would misalign various operators.
+\microtypesetup{protrusion=false}\obeyspaces
+\nontermdef{preprocessing-operator} \textnormal{one of}\br
+\terminal{\# \#\# \%: \%:\%:}
+\end{bnf}
+
+\begin{bnf}
+\microtypesetup{protrusion=false}\obeyspaces
+\nontermdef{operator-or-punctuator} \textnormal{one of}\br
+\terminal{\{ \} [ ] ( )}\br
+\terminal{<: :> <\% \%> ; : ...}\br
+\terminal{? :: . .* -> ->* \~}\br
+\terminal{! + - * / \% \caret{} \& |}\br
+\terminal{= += -= *= /= \%= \caret{}= \&= |=}\br
+\terminal{== != < > <= >= <=> \&\& ||}\br
+\terminal{<< >> <<= >>= ++ -- ,}\br
+\terminal{\keyword{and} \keyword{or} \keyword{xor} \keyword{not} \keyword{bitand} \keyword{bitor} \keyword{compl}}\br
+\terminal{\keyword{and_eq} \keyword{or_eq} \keyword{xor_eq} \keyword{not_eq}}
+\end{bnf}
+
+Each \grammarterm{operator-or-punctuator} is converted to a single token
+in translation phase 7\iref{lex.phases}.%
+\indextext{punctuator|)}%
+\indextext{operator|)}
+
+\rSec1[lex.digraph]{Alternative tokens}
+
+\pnum
+\indextext{token!alternative|(}%
+Alternative token representations are provided for some operators and
+punctuators.
+\begin{footnote}
+\indextext{digraph}%
+These include ``digraphs'' and additional reserved words. The term
+``digraph'' (token consisting of two characters) is not perfectly
+descriptive, since one of the alternative \grammarterm{preprocessing-token}s is
+\tcode{\%:\%:} and of course several primary tokens contain two
+characters. Nonetheless, those alternative tokens that aren't lexical
+keywords are colloquially known as ``digraphs''.
+\end{footnote}
+
+\pnum
+In all respects of the language, each alternative token behaves the
+same, respectively, as its primary token, except for its spelling.
+\begin{footnote}
+Thus the ``stringized'' values\iref{cpp.stringize} of
+\tcode{[} and \tcode{<:} will be different, maintaining the source
+spelling, but the tokens can otherwise be freely interchanged.
+\end{footnote}
+The set of alternative tokens is defined in
+\tref{lex.digraph}.
+
+\begin{tokentable}{Alternative tokens}{lex.digraph}{Alternative}{Primary}
+\tcode{<\%} & \tcode{\{} &
+\keyword{and} & \tcode{\&\&} &
+\keyword{and_eq} & \tcode{\&=} \\ \rowsep
+\tcode{\%>} & \tcode{\}} &
+\keyword{bitor} & \tcode{|} &
+\keyword{or_eq} & \tcode{|=} \\ \rowsep
+\tcode{<:} & \tcode{[} &
+\keyword{or} & \tcode{||} &
+\keyword{xor_eq} & \tcode{\caret=} \\ \rowsep
+\tcode{:>} & \tcode{]} &
+\keyword{xor} & \tcode{\caret} &
+\keyword{not} & \tcode{!} \\ \rowsep
+\tcode{\%:} & \tcode{\#} &
+\keyword{compl} & \tcode{\~} &
+\keyword{not_eq} & \tcode{!=} \\ \rowsep
+\tcode{\%:\%:} & \tcode{\#\#} &
+\keyword{bitand} & \tcode{\&} &
+& \\
+\end{tokentable}%
+\indextext{token!alternative|)}
+
+\rSec1[lex.token]{Tokens}
+
+\indextext{token|(}%
+\begin{bnf}
+\nontermdef{token}\br
+identifier\br
+keyword\br
+literal\br
+operator-or-punctuator
+\end{bnf}
+
+\pnum
+\indextext{\idxgram{token}}%
+There are five kinds of tokens: identifiers, keywords, literals,%
+\begin{footnote}
+Literals include strings and character and numeric literals.
+\end{footnote}
+operators, and other separators.
+\indextext{whitespace}%
+Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments
+(collectively, ``whitespace''), as described below, are ignored except
+as they serve to separate tokens.
+\begin{note}
+Whitespace can separate otherwise adjacent identifiers, keywords, numeric
+literals, and alternative tokens containing alphabetic characters.
+\end{note}
+\indextext{token|)}
+
 \rSec1[lex.name]{Identifiers}
 
 \indextext{identifier|(}%
@@ -1038,47 +1079,6 @@
 \indextext{keyword|)}%
 
 
-\rSec1[lex.operators]{Operators and punctuators}
-
-\pnum
-\indextext{operator|(}%
-\indextext{punctuator|(}%
-The lexical representation of \Cpp{} programs includes a number of
-preprocessing tokens that are used in the syntax of the preprocessor or
-are converted into tokens for operators and punctuators:
-
-\begin{bnf}
-\nontermdef{preprocessing-op-or-punc}\br
-preprocessing-operator\br
-operator-or-punctuator
-\end{bnf}
-
-\begin{bnf}
-%% Ed. note: character protrusion would misalign various operators.
-\microtypesetup{protrusion=false}\obeyspaces
-\nontermdef{preprocessing-operator} \textnormal{one of}\br
-\terminal{\# \#\# \%: \%:\%:}
-\end{bnf}
-
-\begin{bnf}
-\microtypesetup{protrusion=false}\obeyspaces
-\nontermdef{operator-or-punctuator} \textnormal{one of}\br
-\terminal{\{ \} [ ] ( )}\br
-\terminal{<: :> <\% \%> ; : ...}\br
-\terminal{? :: . .* -> ->* \~}\br
-\terminal{! + - * / \% \caret{} \& |}\br
-\terminal{= += -= *= /= \%= \caret{}= \&= |=}\br
-\terminal{== != < > <= >= <=> \&\& ||}\br
-\terminal{<< >> <<= >>= ++ -- ,}\br
-\terminal{\keyword{and} \keyword{or} \keyword{xor} \keyword{not} \keyword{bitand} \keyword{bitor} \keyword{compl}}\br
-\terminal{\keyword{and_eq} \keyword{or_eq} \keyword{xor_eq} \keyword{not_eq}}
-\end{bnf}
-
-Each \grammarterm{operator-or-punctuator} is converted to a single token
-in translation phase 7\iref{lex.phases}.%
-\indextext{punctuator|)}%
-\indextext{operator|)}
-
 \rSec1[lex.literal]{Literals}%
 \indextext{literal|(}

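Reviewer note (illustrative, not part of the commit): the relocated [lex.digraph] wording says each alternative token behaves the same as its primary token except for its spelling. The C++ sketch below shows what that could look like in practice. The macro `SPELLING` and the functions `in_range_primary`, `in_range_alternative`, `clear_bits`, and `check_alternative_tokens` are hypothetical names invented for this note. Some compilers may need a conformance mode enabled before they accept the keyword forms.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper macro (not from the commit): stringizes its argument
// so the source spelling of a token can be inspected.
#define SPELLING(tok) #tok

// Written with primary tokens.
bool in_range_primary(int x, int lo, int hi) {
    return !(x < lo) && !(x > hi);
}

// Same logic written with alternative tokens: <% %> for braces,
// `not` for ! and `and` for &&.
bool in_range_alternative(int x, int lo, int hi) <%
    return not (x < lo) and not (x > hi);
%>

// <: :> stand in for [ ]; `bitand` and `compl` stand in for & and ~.
unsigned clear_bits(unsigned mask) <%
    unsigned values<:3:> = <% 0x0Fu, 0xF0u, 0xFFu %>;
    return values<:1:> bitand compl mask;
%>

// Hypothetical self-check; call it from any test driver.
void check_alternative_tokens() <%
    assert(in_range_primary(5, 1, 10) == in_range_alternative(5, 1, 10));
    assert(in_range_primary(11, 1, 10) == in_range_alternative(11, 1, 10));
    assert(clear_bits(0x0Fu) == 0xF0u);
    // Only the spelling differs, which stringizing makes visible.
    assert(std::string_view(SPELLING(<:)) == "<:");
    assert(std::string_view(SPELLING([)) == "[");
%>
```

Both spellings of the range check return the same result for the same arguments. Only the stringized forms differ, which matches the footnote on `[` and `<:` in the moved text.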