Commit 81231d9

#13219: clarify section about character sets in the re documentation.
1 parent fdd4575 commit 81231d9

1 file changed: +30 -24 lines changed

Doc/library/re.rst

Lines changed: 30 additions & 24 deletions
@@ -161,30 +161,36 @@ The special characters are:
 raw strings for all but the simplest expressions.

 ``[]``
-Used to indicate a set of characters. Characters can be listed individually, or
-a range of characters can be indicated by giving two characters and separating
-them by a ``'-'``. Special characters are not active inside sets. For example,
-``[akm$]`` will match any of the characters ``'a'``, ``'k'``,
-``'m'``, or ``'$'``; ``[a-z]`` will match any lowercase letter, and
-``[a-zA-Z0-9]`` matches any letter or digit. Character classes such
-as ``\w`` or ``\S`` (defined below) are also acceptable inside a
-range, although the characters they match depends on whether
-:const:`ASCII` or :const:`LOCALE` mode is in force. If you want to
-include a ``']'`` or a ``'-'`` inside a set, precede it with a
-backslash, or place it as the first character. The pattern ``[]]``
-will match ``']'``, for example.
-
-You can match the characters not within a range by :dfn:`complementing` the set.
-This is indicated by including a ``'^'`` as the first character of the set;
-``'^'`` elsewhere will simply match the ``'^'`` character. For example,
-``[^5]`` will match any character except ``'5'``, and ``[^^]`` will match any
-character except ``'^'``.
-
-Note that inside ``[]`` the special forms and special characters lose
-their meanings and only the syntaxes described here are valid. For
-example, ``+``, ``*``, ``(``, ``)``, and so on are treated as
-literals inside ``[]``, and backreferences cannot be used inside
-``[]``.
+Used to indicate a set of characters. In a set:
+
+* Characters can be listed individually, e.g. ``[amk]`` will match ``'a'``,
+``'m'``, or ``'k'``.
+
+* Ranges of characters can be indicated by giving two characters and separating
+them by a ``'-'``, for example ``[a-z]`` will match any lowercase ASCII letter,
+``[0-5][0-9]`` will match all the two-digits numbers from ``00`` to ``59``, and
+``[0-9A-Fa-f]`` will match any hexadecimal digit. If ``-`` is escaped (e.g.
+``[a\-z]``) or if it's placed as the first or last character (e.g. ``[a-]``),
+it will match a literal ``'-'``.
+
+* Special characters lose their special meaning inside sets. For example,
+``[(+*)]`` will match any of the literal characters ``'('``, ``'+'``,
+``'*'``, or ``')'``.
+
+* Character classes such as ``\w`` or ``\S`` (defined below) are also accepted
+inside a set, although the characters they match depends on whether
+:const:`ASCII` or :const:`LOCALE` mode is in force.
+
+* Characters that are not within a range can be matched by :dfn:`complementing`
+the set. If the first character of the set is ``'^'``, all the characters
+that are *not* in the set will be matched. For example, ``[^5]`` will match
+any character except ``'5'``, and ``[^^]`` will match any character except
+``'^'``. ``^`` has no special meaning if it's not the first character in
+the set.
+
+* To match a literal ``']'`` inside a set, precede it with a backslash, or
+place it at the beginning of the set. For example, both ``[()[\]{}]`` and
+``[]()[{}]`` will both match a parenthesis.

 ``'|'``
 ``A|B``, where A and B can be arbitrary REs, creates a regular expression that
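
The character-set rules listed in the added lines of this hunk can be checked with a short Python sketch. It is not part of the commit. The helper name `chars` and the sample strings are made up for illustration, and only `re.findall`, `re.fullmatch`, and `re.ASCII` from the standard library are used.

```python
import re


def chars(pattern, text, flags=0):
    """Hypothetical helper: return every match of `pattern` in `text`."""
    return re.findall(pattern, text, flags)


# Characters listed individually.
listed = chars(r"[amk]", "mask")                # ['m', 'a', 'k']

# Ranges: two-digit numbers from 00 to 59, and hexadecimal digits.
in_range = re.fullmatch(r"[0-5][0-9]", "59") is not None
out_of_range = re.fullmatch(r"[0-5][0-9]", "60") is not None
is_hex = re.fullmatch(r"[0-9A-Fa-f]+", "dEadBeef") is not None

# '-' is literal when escaped, or when placed first or last in the set.
escaped_dash = chars(r"[a\-z]", "a-m-z")        # ['a', '-', '-', 'z']
trailing_dash = chars(r"[a-]", "a-b")           # ['a', '-']

# Special characters are literals inside a set.
specials = chars(r"[(+*)]", "(1+2)*3")          # ['(', '+', ')', '*']

# Character classes are accepted inside a set; for str patterns,
# re.ASCII restricts \w to ASCII word characters.
unicode_word = chars(r"[\w]", "é")              # ['é']
ascii_word = chars(r"[\w]", "é", re.ASCII)      # []

# Complementing: '^' is special only as the first character of the set.
not_five = chars(r"[^5]", "1525")               # ['1', '2']
not_caret = chars(r"[^^]", "a^b")               # ['a', 'b']
literal_caret = chars(r"[a^]", "a^b")           # ['a', '^']

# A literal ']' can be escaped or placed at the beginning of the set.
brackets_escaped = chars(r"[()[\]{}]", "f(x)[0]{}")
brackets_first = chars(r"[]()[{}]", "f(x)[0]{}")
```

Both bracket patterns return the same six characters here, which is the equivalence the last bullet describes.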