-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
bpo-30597: Shows expected input in custom 'print' error message #2009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-30597: Shows expected input in custom 'print' error message #2009
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some unit tests for this?
There appears to be a newline character between the error_msg and the error_msg_end:
|
Hi @lisroach, yes, I'm on my way to figuring that out; so it's still a work in progress. Also, I need to do additional checks which @ncoghlan mentioned on the bug. I'll update this patch soon. About the unit test, what is the best place to write them at? Probably I can refer to some earlier test cases written and would be able to write this up too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The basic approach here looks good to me, just a few structural and technical comments inline.
Objects/exceptions.c
Outdated
@@ -2897,8 +2897,24 @@ _check_for_legacy_statements(PySyntaxErrorObject *self, Py_ssize_t start) | |||
} | |||
if (PyUnicode_Tailmatch(self->text, print_prefix, | |||
start, text_len, -1)) { | |||
Py_XSETREF(self->msg, | |||
PyUnicode_FromString("Missing parentheses in call to 'print'")); | |||
char *error_msg_start = "Missing parentheses in call to 'print'. Did you mean 'print("; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting complex enough now that I think it makes sense to separate it out as a second static helper: _set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)
Objects/exceptions.c
Outdated
strlen(error_msg_end)) + 1; | ||
|
||
strcat(error_msg, error_msg_start); | ||
strcat(error_msg, msg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per the discussion on the issue tracker, we want to check for a trailing ,
here, and if so, also append end=' '
to request the Python 3 equivalent of Python 2's soft-space behaviour.
(That will also impact the working memory allocation - the simplest option would be to always allocate enough space to handle the suggest_end_arg
case, since the working string will be null-terminated anyway)
Objects/exceptions.c
Outdated
char *msg = data + PRINT_OFFSET; | ||
char *error_msg_end = ")'?"; | ||
|
||
char *error_msg = malloc(strlen(error_msg_start) + \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This should use
PyMem_Malloc
andPyMem_Free
rather than the direct C APIs. Also, it should actually free the allocated working memory :) - We may want to include a short block comment explaining that
data
is UTF-8 encoded, so we mostly treat it as an indivisible opaque blob, except that we know it starts withprint
and might end with,
. This is also why we work directly withchar *
and convert to a Python Unicode object at the end, rather than pre-allocating the object viaPyUnicode_New
and populating it incrementally.
@CuriousLearner My guess would be that the misbehaviour @lisroach noted comes from As far as test cases go, it looks like I didn't add any the first time around ("Missing parentheses" doesn't show up anywhere in the test directory for me). So I'd suggest adding a new test class to |
@ncoghlan I've revised the patch. I've also included test for the new code. Can you take a pass? Also, on the bug you mentioned that the right shift operator case is distinctive enough. So, would it be needed to included here? Please let me know and I'll do the changes. Thanks :) |
@CuriousLearner Could you file a separate bug on bugs.python.org about the right-shift operator question? The search results indicate that people are hitting it reasonably frequently, so we may be able to save them a trip to Google by adding a custom |
Objects/exceptions.c
Outdated
analogous to Python3 print syntax. | ||
*/ | ||
|
||
PyObject *strip_sep_obj = PyUnicode_FromString("\r\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Virtually any API call can fail and raise an exception. Always check the result.
Objects/exceptions.c
Outdated
*/ | ||
|
||
PyObject *strip_sep_obj = PyUnicode_FromString("\r\n"); | ||
PyObject *data = _PyUnicode_XStrip(self->text, 1, strip_sep_obj); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call _PyUnicode_XStrip()
after PyUnicode_Substring()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@serhiy-storchaka Any specific reason to it that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More than one space after "print".
Objects/exceptions.c
Outdated
data = PyUnicode_Substring(data, PRINT_OFFSET, text_len); | ||
// gets the modified text_len after stripping `print ` | ||
text_len = PyUnicode_GET_LENGTH(data); | ||
char *maybe_end_arg = " end=\" \""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to use const char *
for immutable C strings.
Objects/exceptions.c
Outdated
PyObject *error_msg = PyUnicode_FromFormat( | ||
"Missing parentheses in call to 'print'. Did you mean print(%U)?", | ||
data); | ||
if (PyUnicode_Tailmatch(data, PyUnicode_FromString(","), start, text_len, 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could just read the last character of the string by PyUnicode_READ_CHAR(str, PyUnicode_GET_LENGTH(str)-1)
(don't forget to check that PyUnicode_GET_LENGTH(str) > 0
).
Objects/exceptions.c
Outdated
"Missing parentheses in call to 'print'. Did you mean print(%U)?", | ||
data); | ||
if (PyUnicode_Tailmatch(data, PyUnicode_FromString(","), start, text_len, 1)) | ||
error_msg = PyUnicode_FromFormat( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is almost duplicates the former formatting. Both cases can be merged if use an empty string for maybe_end_arg
in the first case.
Objects/exceptions.c
Outdated
// PRINT_OFFSET is to remove print word from the data. | ||
const int PRINT_OFFSET = 6; | ||
Py_ssize_t text_len = PyUnicode_GET_LENGTH(data); | ||
data = PyUnicode_Substring(data, PRINT_OFFSET, text_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old value of data
is leaked.
Objects/exceptions.c
Outdated
data, maybe_end_arg); | ||
|
||
Py_XSETREF(self->msg, error_msg); | ||
return 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At that point multiple results of PyUnicode_XXX
calls are leaked.
Objects/exceptions.c
Outdated
_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start) | ||
{ | ||
/* | ||
`data` is UTF-8 encoded and treated as an indivisible opaque blob. All |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is outdated.
Objects/exceptions.c
Outdated
|
||
PyObject *strip_sep_obj = PyUnicode_FromString("\r\n"); | ||
PyObject *data = _PyUnicode_XStrip(self->text, 1, strip_sep_obj); | ||
// PRINT_OFFSET is to remove print word from the data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be more explicit about what is removed ("print "
).
Objects/exceptions.c
Outdated
{ | ||
|
||
PyObject *strip_sep_obj = PyUnicode_FromString("\r\n"); | ||
Py_UCS4 soft_space_check = PyUnicode_READ_CHAR(PyUnicode_FromString(","), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result of PyUnicode_FromString(",") is not checked for NULL and leaked.
You don't need this. You can use just soft_space_check = ','
. Actually I think the named variable is not needed.
Objects/exceptions.c
Outdated
PyObject *data = PyUnicode_Substring(self->text, PRINT_OFFSET, text_len); | ||
|
||
if (data == NULL) | ||
return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strip_sep_obj is leaked here.
Objects/exceptions.c
Outdated
const char *maybe_end_arg = ""; | ||
|
||
if (text_len > 0) { | ||
if (PyUnicode_READ_CHAR(data, text_len-1) == soft_space_check) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two if's can be merged.
Objects/exceptions.c
Outdated
_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start) | ||
{ | ||
|
||
PyObject *strip_sep_obj = PyUnicode_FromString("\r\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add also a space and a tab.
Objects/exceptions.c
Outdated
if (data == NULL) | ||
return -1; | ||
|
||
data = _PyUnicode_XStrip(data, 1, strip_sep_obj); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result of _PyUnicode_XStrip()
is not checked for error. Old value of data
is leaked.
Objects/exceptions.c
Outdated
} | ||
} | ||
|
||
PyObject *error_msg = PyUnicode_FromFormat( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result is not checked for error.
@serhiy-storchaka , @ncoghlan , @lisroach Possible, if you can take a look now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an entry in Misc/NEWS and your name in Misc/ACKS.
Objects/exceptions.c
Outdated
static int | ||
_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start) | ||
{ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove excessive empty lines.
Lib/test/test_print.py
Outdated
with self.assertRaises(SyntaxError) as context: | ||
exec(python2_print_str) | ||
|
||
self.assertTrue('print("Hello World")' in str(context.exception)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use assertIn()
.
Or use assertRaisesRegex()
with the argument r'print\("Hello World"\)'
or re.escape('print("Hello World")')
.
Lib/test/test_print.py
Outdated
with self.assertRaises(SyntaxError) as context: | ||
exec(python2_print_str) | ||
|
||
self.assertTrue('print("Hello World", end=" ")' in str(context.exception)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as above.
Lib/test/test_print.py
Outdated
exec(python2_print_str) | ||
|
||
self.assertTrue('print("Hello World", end=" ")' in str(context.exception)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add also tests for excessive whitespaces: 'print "Hello World", '
.
Objects/exceptions.c
Outdated
return -1; | ||
} | ||
|
||
PyObject *new_data = _PyUnicode_XStrip(data, 1, strip_sep_obj); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use 2 instead of 1.
I don't have anything to add to @serhiy-storchaka's latest review, but also agree with Serhiy's requests for further updates
@serhiy-storchaka I've updated the patch. Also I've updated the branch to get even with Can you please take a look now? Also, for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The C code LGTM. Added few last minor comments.
May be it is worth to reformulate the Misc/NEWS entry (I don't like the titlecased Print
). Nick, your thoughts?
Lib/test/test_print.py
Outdated
self.assertIn('print("Hello World", end=" ")', str(context.exception)) | ||
|
||
def test_string_with_excessive_whitespace(self): | ||
python2_print_str = 'print "Hello World", ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add yet a space between print
and its argument.
Misc/NEWS
Outdated
@@ -10,9 +10,12 @@ What's New in Python 3.7.0 alpha 1? | |||
Core and Builtins | |||
----------------- | |||
|
|||
- bpo-30597: `Print` now shows expected input in custom error message when | |||
used as a Python 2 statement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add "Patch by Sanyam Khurana."
Misc/NEWS
Outdated
@@ -10,9 +10,12 @@ What's New in Python 3.7.0 alpha 1? | |||
Core and Builtins | |||
----------------- | |||
|
|||
- bpo-30597: `Print` now shows expected input in custom error message when |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use print
here.
I adjusted the NEWS entry directly, so I think this is good to go once the CI gives it a passing grade. |
Thanks @ncoghlan @serhiy-storchaka ! :) On a second note, I just noticed that on the bug, Python 3.6 is also selected for this patch. So would this be back-ported? (As I don't see a backport tag with the PR right now). |
This is a minor new feature. I don't think it should be backported. |
Thank you for your contribution Sanyam! |
Thanks for all your time @serhiy-storchaka :) |
…pythonGH-2009). (cherry picked from commit 3a7f035)
This is a WIP for issue: http://bugs.python.org/issue30597