Commit 495da29

gvanrossum authored and miss-islington committed
bpo-35975: Support parsing earlier minor versions of Python 3 (GH-12086)
This adds a `feature_version` flag to `ast.parse()` (documented) and `compile()` (hidden) that allows tweaking the parser to support older versions of the grammar. In particular, if `feature_version` is 5 or 6, the hacks for the `async` and `await` keywords from PEP 492 are reinstated. (For 7 or higher, these are unconditionally treated as keywords, but they are still special tokens rather than `NAME` tokens that the parser driver recognizes.)

https://bugs.python.org/issue35975
1 parent bf94cc7 commit 495da29

29 files changed, 473 additions and 198 deletions

Doc/library/ast.rst

Lines changed: 8 additions & 2 deletions
@@ -126,7 +126,7 @@ The abstract grammar is currently defined as follows:
 Apart from the node classes, the :mod:`ast` module defines these utility functions
 and classes for traversing abstract syntax trees:

-.. function:: parse(source, filename='<unknown>', mode='exec', *, type_comments=False)
+.. function:: parse(source, filename='<unknown>', mode='exec', *, type_comments=False, feature_version=-1)

    Parse the source into an AST node. Equivalent to ``compile(source,
    filename, mode, ast.PyCF_ONLY_AST)``.
@@ -145,13 +145,19 @@ and classes for traversing abstract syntax trees:
    modified to correspond to :pep:`484` "signature type comments",
    e.g. ``(str, int) -> List[str]``.

+   Also, setting ``feature_version`` to the minor version of an
+   earlier Python 3 version will attempt to parse using that version's
+   grammar. For example, setting ``feature_version=4`` will allow
+   the use of ``async`` and ``await`` as variable names. The lowest
+   supported value is 4; the highest is ``sys.version_info[1]``.
+
    .. warning::
       It is possible to crash the Python interpreter with a
       sufficiently large/complex string due to stack depth limitations
       in Python's AST compiler.

    .. versionchanged:: 3.8
-      Added ``type_comments=True`` and ``mode='func_type'``.
+      Added ``type_comments``, ``mode='func_type'`` and ``feature_version``.


 .. function:: literal_eval(node_or_string)

Doc/library/token-list.inc

Lines changed: 4 additions & 0 deletions
Generated file; diff not rendered.

Doc/library/token.rst

Lines changed: 3 additions & 0 deletions
@@ -88,3 +88,6 @@ the :mod:`tokenize` module.

 .. versionchanged:: 3.8
    Added :data:`TYPE_COMMENT`.
+   Added :data:`AWAIT` and :data:`ASYNC` tokens back (they're needed
+   to support parsing older Python versions for :func:`ast.parse` with
+   ``feature_version`` set to 6 or lower).

Grammar/Grammar

Lines changed: 4 additions & 4 deletions
@@ -18,7 +18,7 @@ decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
 decorators: decorator+
 decorated: decorators (classdef | funcdef | async_funcdef)

-async_funcdef: 'async' funcdef
+async_funcdef: ASYNC funcdef
 funcdef: 'def' NAME parameters ['->' test] ':' [TYPE_COMMENT] func_body_suite

 parameters: '(' [typedargslist] ')'
@@ -70,7 +70,7 @@ nonlocal_stmt: 'nonlocal' NAME (',' NAME)*
 assert_stmt: 'assert' test [',' test]

 compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
-async_stmt: 'async' (funcdef | with_stmt | for_stmt)
+async_stmt: ASYNC (funcdef | with_stmt | for_stmt)
 if_stmt: 'if' namedexpr_test ':' suite ('elif' namedexpr_test ':' suite)* ['else' ':' suite]
 while_stmt: 'while' namedexpr_test ':' suite ['else' ':' suite]
 for_stmt: 'for' exprlist 'in' testlist ':' [TYPE_COMMENT] suite ['else' ':' suite]
@@ -106,7 +106,7 @@ arith_expr: term (('+'|'-') term)*
 term: factor (('*'|'@'|'/'|'%'|'//') factor)*
 factor: ('+'|'-'|'~') factor | power
 power: atom_expr ['**' factor]
-atom_expr: ['await'] atom trailer*
+atom_expr: [AWAIT] atom trailer*
 atom: ('(' [yield_expr|testlist_comp] ')' |
        '[' [testlist_comp] ']' |
        '{' [dictorsetmaker] '}' |
@@ -144,7 +144,7 @@ argument: ( test [comp_for] |

 comp_iter: comp_for | comp_if
 sync_comp_for: 'for' exprlist 'in' or_test [comp_iter]
-comp_for: ['async'] sync_comp_for
+comp_for: [ASYNC] sync_comp_for
 comp_if: 'if' test_nocond [comp_iter]

 # not used in grammar, but may appear in "node" passed from Parser to Compiler

Grammar/Tokens

Lines changed: 2 additions & 0 deletions
@@ -55,6 +55,8 @@ ELLIPSIS '...'
 COLONEQUAL ':='

 OP
+AWAIT
+ASYNC
 TYPE_IGNORE
 TYPE_COMMENT
 ERRORTOKEN

Include/Python-ast.h

Lines changed: 1 addition & 0 deletions
Generated file; diff not rendered.

Include/compile.h

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ PyAPI_FUNC(PyCodeObject *) PyNode_Compile(struct _node *, const char *);
 #ifndef Py_LIMITED_API
 typedef struct {
     int cf_flags; /* bitmask of CO_xxx flags relevant to future */
+    int cf_feature_version; /* minor Python version (PyCF_ONLY_AST) */
 } PyCompilerFlags;
 #endif


Include/parsetok.h

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ typedef struct {
 #define PyPARSE_IGNORE_COOKIE 0x0010
 #define PyPARSE_BARRY_AS_BDFL 0x0020
 #define PyPARSE_TYPE_COMMENTS 0x0040
+#define PyPARSE_ASYNC_HACKS 0x0080

 PyAPI_FUNC(node *) PyParser_ParseString(const char *, grammar *, int,
                                        perrdetail *);

Include/token.h

Lines changed: 6 additions & 4 deletions
Generated file; diff not rendered.

Lib/ast.py

Lines changed: 4 additions & 2 deletions
@@ -27,7 +27,8 @@
 from _ast import *


-def parse(source, filename='<unknown>', mode='exec', *, type_comments=False):
+def parse(source, filename='<unknown>', mode='exec', *,
+          type_comments=False, feature_version=-1):
     """
     Parse the source into an AST node.
     Equivalent to compile(source, filename, mode, PyCF_ONLY_AST).
@@ -36,7 +37,8 @@ def parse(source, filename='<unknown>', mode='exec', *, type_comments=False):
     flags = PyCF_ONLY_AST
     if type_comments:
         flags |= PyCF_TYPE_COMMENTS
-    return compile(source, filename, mode, flags)
+    return compile(source, filename, mode, flags,
+                   feature_version=feature_version)


 def literal_eval(node_or_string):

Lib/keyword.py

Lines changed: 4 additions & 2 deletions
@@ -20,8 +20,6 @@
         'and',
         'as',
         'assert',
-        'async',
-        'await',
         'break',
         'class',
         'continue',
@@ -52,6 +50,10 @@
 #--end keywords--
         ]

+kwlist.append('async')
+kwlist.append('await')
+kwlist.sort()
+
 iskeyword = frozenset(kwlist).__contains__

 def main():
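
The visible effect of this hunk on the public `keyword` module can be checked directly. This is a small sketch, assuming any Python 3.7 or later interpreter, where `async` and `await` are reserved words; it says nothing about how a particular release assembles `kwlist` internally.

```python
import keyword

# Both names are reported as keywords, however kwlist was put together.
print(keyword.iskeyword("async"), keyword.iskeyword("await"))

# The patch calls kwlist.sort() after the two appends, so the list
# should still be in sorted order.
print(keyword.kwlist == sorted(keyword.kwlist))
```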

Lib/test/test_parser.py

Lines changed: 1 addition & 1 deletion
@@ -916,7 +916,7 @@ def XXXROUNDUP(n):
                 return (n + 3) & ~3
             return 1 << (n - 1).bit_length()

-        basesize = support.calcobjsize('Pii')
+        basesize = support.calcobjsize('Piii')
         nodesize = struct.calcsize('hP3iP0h2i')
         def sizeofchildren(node):
             if node is None:
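
The change from `'Pii'` to `'Piii'` presumably mirrors the extra `int` that `cf_feature_version` adds to `PyCompilerFlags`; that link is an inference, not something the commit states. The size arithmetic itself can be sketched with the standard `struct` module. Note that `support.calcobjsize` also accounts for the object header, which is left out here.

```python
import struct

old = struct.calcsize("Pii")   # pointer followed by two C ints
new = struct.calcsize("Piii")  # pointer followed by three C ints

# struct.calcsize adds no trailing padding, so the two layouts
# differ by the size of a single C int.
print(new - old == struct.calcsize("i"))
```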

Lib/test/test_type_comments.py

Lines changed: 102 additions & 41 deletions
@@ -1,4 +1,5 @@
 import ast
+import sys
 import unittest


@@ -20,6 +21,29 @@ async def bar(): # type: () -> int
     return await bar()
 """

+asyncvar = """\
+async = 12
+await = 13
+"""
+
+asynccomp = """\
+async def foo(xs):
+    [x async for x in xs]
+"""
+
+matmul = """\
+a = b @ c
+"""
+
+fstring = """\
+a = 42
+f"{a}"
+"""
+
+underscorednumber = """\
+a = 42_42_42
+"""
+
 redundantdef = """\
 def foo(): # type: () -> int
     # type: () -> str
@@ -155,80 +179,117 @@ def favk(

 class TypeCommentTests(unittest.TestCase):

-    def parse(self, source):
-        return ast.parse(source, type_comments=True)
+    lowest = 4 # Lowest minor version supported
+    highest = sys.version_info[1] # Highest minor version
+
+    def parse(self, source, feature_version=highest):
+        return ast.parse(source, type_comments=True,
+                         feature_version=feature_version)
+
+    def parse_all(self, source, minver=lowest, maxver=highest, expected_regex=""):
+        for feature_version in range(self.lowest, self.highest + 1):
+            if minver <= feature_version <= maxver:
+                try:
+                    yield self.parse(source, feature_version)
+                except SyntaxError as err:
+                    raise SyntaxError(str(err) + f" feature_version={feature_version}")
+            else:
+                with self.assertRaisesRegex(SyntaxError, expected_regex,
+                                            msg=f"feature_version={feature_version}"):
+                    self.parse(source, feature_version)

     def classic_parse(self, source):
         return ast.parse(source)

     def test_funcdef(self):
-        tree = self.parse(funcdef)
-        self.assertEqual(tree.body[0].type_comment, "() -> int")
-        self.assertEqual(tree.body[1].type_comment, "() -> None")
+        for tree in self.parse_all(funcdef):
+            self.assertEqual(tree.body[0].type_comment, "() -> int")
+            self.assertEqual(tree.body[1].type_comment, "() -> None")
         tree = self.classic_parse(funcdef)
         self.assertEqual(tree.body[0].type_comment, None)
         self.assertEqual(tree.body[1].type_comment, None)

     def test_asyncdef(self):
-        tree = self.parse(asyncdef)
-        self.assertEqual(tree.body[0].type_comment, "() -> int")
-        self.assertEqual(tree.body[1].type_comment, "() -> int")
+        for tree in self.parse_all(asyncdef, minver=5):
+            self.assertEqual(tree.body[0].type_comment, "() -> int")
+            self.assertEqual(tree.body[1].type_comment, "() -> int")
         tree = self.classic_parse(asyncdef)
         self.assertEqual(tree.body[0].type_comment, None)
         self.assertEqual(tree.body[1].type_comment, None)

+    def test_asyncvar(self):
+        for tree in self.parse_all(asyncvar, maxver=6):
+            pass
+
+    def test_asynccomp(self):
+        for tree in self.parse_all(asynccomp, minver=6):
+            pass
+
+    def test_matmul(self):
+        for tree in self.parse_all(matmul, minver=5):
+            pass
+
+    def test_fstring(self):
+        for tree in self.parse_all(fstring, minver=6):
+            pass
+
+    def test_underscorednumber(self):
+        for tree in self.parse_all(underscorednumber, minver=6):
+            pass
+
     def test_redundantdef(self):
-        with self.assertRaisesRegex(SyntaxError, "^Cannot have two type comments on def"):
-            tree = self.parse(redundantdef)
+        for tree in self.parse_all(redundantdef, maxver=0,
+                                   expected_regex="^Cannot have two type comments on def"):
+            pass

     def test_nonasciidef(self):
-        tree = self.parse(nonasciidef)
-        self.assertEqual(tree.body[0].type_comment, "() -> àçčéñt")
+        for tree in self.parse_all(nonasciidef):
+            self.assertEqual(tree.body[0].type_comment, "() -> àçčéñt")

     def test_forstmt(self):
-        tree = self.parse(forstmt)
-        self.assertEqual(tree.body[0].type_comment, "int")
+        for tree in self.parse_all(forstmt):
+            self.assertEqual(tree.body[0].type_comment, "int")
         tree = self.classic_parse(forstmt)
         self.assertEqual(tree.body[0].type_comment, None)

     def test_withstmt(self):
-        tree = self.parse(withstmt)
-        self.assertEqual(tree.body[0].type_comment, "int")
+        for tree in self.parse_all(withstmt):
+            self.assertEqual(tree.body[0].type_comment, "int")
         tree = self.classic_parse(withstmt)
         self.assertEqual(tree.body[0].type_comment, None)

     def test_vardecl(self):
-        tree = self.parse(vardecl)
-        self.assertEqual(tree.body[0].type_comment, "int")
+        for tree in self.parse_all(vardecl):
+            self.assertEqual(tree.body[0].type_comment, "int")
         tree = self.classic_parse(vardecl)
         self.assertEqual(tree.body[0].type_comment, None)

     def test_ignores(self):
-        tree = self.parse(ignores)
-        self.assertEqual([ti.lineno for ti in tree.type_ignores], [2, 5])
+        for tree in self.parse_all(ignores):
+            self.assertEqual([ti.lineno for ti in tree.type_ignores], [2, 5])
         tree = self.classic_parse(ignores)
         self.assertEqual(tree.type_ignores, [])

     def test_longargs(self):
-        tree = self.parse(longargs)
-        for t in tree.body:
-            # The expected args are encoded in the function name
-            todo = set(t.name[1:])
-            self.assertEqual(len(t.args.args),
-                             len(todo) - bool(t.args.vararg) - bool(t.args.kwarg))
-            self.assertTrue(t.name.startswith('f'), t.name)
-            for c in t.name[1:]:
-                todo.remove(c)
-                if c == 'v':
-                    arg = t.args.vararg
-                elif c == 'k':
-                    arg = t.args.kwarg
-                else:
-                    assert 0 <= ord(c) - ord('a') < len(t.args.args)
-                    arg = t.args.args[ord(c) - ord('a')]
-                self.assertEqual(arg.arg, c) # That's the argument name
-                self.assertEqual(arg.type_comment, arg.arg.upper())
-            assert not todo
+        for tree in self.parse_all(longargs):
+            for t in tree.body:
+                # The expected args are encoded in the function name
+                todo = set(t.name[1:])
+                self.assertEqual(len(t.args.args),
+                                 len(todo) - bool(t.args.vararg) - bool(t.args.kwarg))
+                self.assertTrue(t.name.startswith('f'), t.name)
+                for c in t.name[1:]:
+                    todo.remove(c)
+                    if c == 'v':
+                        arg = t.args.vararg
+                    elif c == 'k':
+                        arg = t.args.kwarg
+                    else:
+                        assert 0 <= ord(c) - ord('a') < len(t.args.args)
+                        arg = t.args.args[ord(c) - ord('a')]
+                    self.assertEqual(arg.arg, c) # That's the argument name
+                    self.assertEqual(arg.type_comment, arg.arg.upper())
+                assert not todo
         tree = self.classic_parse(longargs)
         for t in tree.body:
             for arg in t.args.args + [t.args.vararg, t.args.kwarg]:
@@ -247,8 +308,8 @@ def test_inappropriate_type_comments(self):

         def check_both_ways(source):
             ast.parse(source, type_comments=False)
-            with self.assertRaises(SyntaxError):
-                ast.parse(source, type_comments=True)
+            for tree in self.parse_all(source, maxver=0):
+                pass

         check_both_ways("pass # type: int\n")
         check_both_ways("foo() # type: int\n")
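
The `parse_all` helper in this test file sweeps every supported `feature_version` and checks each one. A standalone sketch of the same idea, outside `unittest`, might look like the following. The name `accepted_versions` is hypothetical, and the default lower bound of 7 is an assumption, chosen because newer interpreters may not support values as low as the 4 used in this commit.

```python
import ast
import sys


def accepted_versions(source, lowest=7):
    """Map each minor version in [lowest, current] to whether `source` parses."""
    highest = sys.version_info[1]
    results = {}
    for minor in range(lowest, highest + 1):
        try:
            ast.parse(source, feature_version=minor)
        except SyntaxError:
            results[minor] = False
        else:
            results[minor] = True
    return results


# The matrix-multiplication operator dates from Python 3.5, so every
# version from 3.7 up to the running interpreter should accept it.
print(accepted_versions("a = b @ c\n"))
```

Returning a dictionary instead of asserting inside the loop lets the caller decide which versions are expected to fail, which is the role `minver` and `maxver` play in the test helper.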
