Skip to content

bpo-30597: Shows expected input in custom 'print' error message #2009

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jun 20, 2017

Conversation

CuriousLearner
Copy link
Member

This is a WIP for issue: http://bugs.python.org/issue30597

Copy link
Contributor

@lisroach lisroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some unit tests for this?

@lisroach
Copy link
Contributor

lisroach commented Jun 8, 2017

There appears to be a newline character between the error_msg and the error_msg_end:

>>> print "helllo world"
  File "<stdin>", line 1
    print "helllo world"
                       ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean 'print("hello world"
)'?

@CuriousLearner
Copy link
Member Author

Hi @lisroach, yes, I'm on my way to figuring that out; so it's still a work in progress. Also, I need to do additional checks which @ncoghlan mentioned on the bug. I'll update this patch soon.

About the unit test, what is the best place to write them at? Probably I can refer to some earlier test cases written and would be able to write this up too.

Copy link
Contributor

@ncoghlan ncoghlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The basic approach here looks good to me, just a few structural and technical comments inline.

@@ -2897,8 +2897,24 @@ _check_for_legacy_statements(PySyntaxErrorObject *self, Py_ssize_t start)
}
if (PyUnicode_Tailmatch(self->text, print_prefix,
start, text_len, -1)) {
Py_XSETREF(self->msg,
PyUnicode_FromString("Missing parentheses in call to 'print'"));
char *error_msg_start = "Missing parentheses in call to 'print'. Did you mean 'print(";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting complex enough now that I think it makes sense to separate it out as a second static helper: _set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)

strlen(error_msg_end)) + 1;

strcat(error_msg, error_msg_start);
strcat(error_msg, msg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the discussion on the issue tracker, we want to check for a trailing , here, and if so, also append end=' ' to request the Python 3 equivalent of Python 2's soft-space behaviour.

(That will also impact the working memory allocation - the simplest option would be to always allocate enough space to handle the suggest_end_arg case, since the working string will be null-terminated anyway)

char *msg = data + PRINT_OFFSET;
char *error_msg_end = ")'?";

char *error_msg = malloc(strlen(error_msg_start) + \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This should use PyMem_Malloc and PyMem_Free rather than the direct C APIs. Also, it should actually free the allocated working memory :)
  2. We may want to include a short block comment explaining that data is UTF-8 encoded, so we mostly treat it as an indivisible opaque blob, except that we know it starts with print and might end with ,. This is also why we work directly with char * and convert to a Python Unicode object at the end, rather than pre-allocating the object via PyUnicode_New and populating it incrementally.

@ncoghlan
Copy link
Contributor

ncoghlan commented Jun 9, 2017

@CuriousLearner My guess would be that the misbehaviour @lisroach noted comes from data including the statement's trailing newline, so you'll need to deal with that. I'd also suggest testing a case like print "message"; pass to see how that behaves.

As far as test cases go, it looks like I didn't add any the first time around ("Missing parentheses" doesn't show up anywhere in the test directory for me).

So I'd suggest adding a new test class to Lib/test/test_print.py (e.g. TestPy2MigrationHint) that uses exec() to provoke SyntaxError and checks for the expected details.

@CuriousLearner
Copy link
Member Author

@ncoghlan I've revised the patch. I've also included test for the new code. Can you take a pass?

Also, on the bug you mentioned that the right shift operator case is distinctive enough. So, would it be needed to included here?

Please let me know and I'll do the changes. Thanks :)

@ncoghlan
Copy link
Contributor

@CuriousLearner Could you file a separate bug on bugs.python.org about the right-shift operator question? The search results indicate that people are hitting it reasonably frequently, so we may be able to save them a trip to Google by adding a custom 'Did you mean "print(<output>, file={:100!r})'.format(rhs)"? message when the LHS is the print builtin.

@CuriousLearner CuriousLearner changed the title bpo-30597: [WIP] Shows expected input in custom 'print' error message bpo-30597: Shows expected input in custom 'print' error message Jun 10, 2017
analogous to Python3 print syntax.
*/

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Virtually any API call can fail and raise an exception. Always check the result.

*/

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");
PyObject *data = _PyUnicode_XStrip(self->text, 1, strip_sep_obj);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call _PyUnicode_XStrip() after PyUnicode_Substring().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serhiy-storchaka Any specific reason to it that way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More than one space after "print".

data = PyUnicode_Substring(data, PRINT_OFFSET, text_len);
// gets the modified text_len after stripping `print `
text_len = PyUnicode_GET_LENGTH(data);
char *maybe_end_arg = " end=\" \"";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to use const char * for immutable C strings.

PyObject *error_msg = PyUnicode_FromFormat(
"Missing parentheses in call to 'print'. Did you mean print(%U)?",
data);
if (PyUnicode_Tailmatch(data, PyUnicode_FromString(","), start, text_len, 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just read the last character of the string by PyUnicode_READ_CHAR(str, PyUnicode_GET_LENGTH(str)-1) (don't forget to check that PyUnicode_GET_LENGTH(str) > 0).

"Missing parentheses in call to 'print'. Did you mean print(%U)?",
data);
if (PyUnicode_Tailmatch(data, PyUnicode_FromString(","), start, text_len, 1))
error_msg = PyUnicode_FromFormat(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost duplicates the former formatting. Both cases can be merged if use an empty string for maybe_end_arg in the first case.

// PRINT_OFFSET is to remove print word from the data.
const int PRINT_OFFSET = 6;
Py_ssize_t text_len = PyUnicode_GET_LENGTH(data);
data = PyUnicode_Substring(data, PRINT_OFFSET, text_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old value of data is leaked.

data, maybe_end_arg);

Py_XSETREF(self->msg, error_msg);
return 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At that point multiple results of PyUnicode_XXX calls are leaked.

_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)
{
/*
`data` is UTF-8 encoded and treated as an indivisible opaque blob. All
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is outdated.


PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");
PyObject *data = _PyUnicode_XStrip(self->text, 1, strip_sep_obj);
// PRINT_OFFSET is to remove print word from the data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be more explicit about what is removed ("print ").

{

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");
Py_UCS4 soft_space_check = PyUnicode_READ_CHAR(PyUnicode_FromString(","), 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of PyUnicode_FromString(",") is not checked for NULL and leaked.

You don't need this. You can use just soft_space_check = ','. Actually I think the named variable is not needed.

PyObject *data = PyUnicode_Substring(self->text, PRINT_OFFSET, text_len);

if (data == NULL)
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strip_sep_obj is leaked here.

const char *maybe_end_arg = "";

if (text_len > 0) {
if (PyUnicode_READ_CHAR(data, text_len-1) == soft_space_check) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two if's can be merged.

_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)
{

PyObject *strip_sep_obj = PyUnicode_FromString("\r\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add also a space and a tab.

if (data == NULL)
return -1;

data = _PyUnicode_XStrip(data, 1, strip_sep_obj);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of _PyUnicode_XStrip() is not checked for error. Old value of data is leaked.

}
}

PyObject *error_msg = PyUnicode_FromFormat(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result is not checked for error.

@CuriousLearner
Copy link
Member Author

@serhiy-storchaka , @ncoghlan , @lisroach

Possible, if you can take a look now?

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an entry in Misc/NEWS and your name in Misc/ACKS.

static int
_set_legacy_print_statement_msg(PySyntaxErrorObject *self, Py_ssize_t start)
{

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove excessive empty lines.

with self.assertRaises(SyntaxError) as context:
exec(python2_print_str)

self.assertTrue('print("Hello World")' in str(context.exception))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use assertIn().

Or use assertRaisesRegex() with the argument r'print\("Hello World"\)' or re.escape('print("Hello World")').

with self.assertRaises(SyntaxError) as context:
exec(python2_print_str)

self.assertTrue('print("Hello World", end=" ")' in str(context.exception))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as above.

exec(python2_print_str)

self.assertTrue('print("Hello World", end=" ")' in str(context.exception))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add also tests for excessive whitespaces: 'print "Hello World", '.

return -1;
}

PyObject *new_data = _PyUnicode_XStrip(data, 1, strip_sep_obj);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use 2 instead of 1.

@ncoghlan ncoghlan dismissed their stale review June 20, 2017 06:31

I don't have anything to add to @serhiy-storchaka's latest review, but also agree with Serhiy's requests for further updates

@CuriousLearner
Copy link
Member Author

@serhiy-storchaka I've updated the patch. Also I've updated the branch to get even with master of upstream.

Can you please take a look now?

Also, for the ACKS, my name is already there.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C code LGTM. Added few last minor comments.

May be it is worth to reformulate the Misc/NEWS entry (I don't like the titlecased Print). Nick, your thoughts?

self.assertIn('print("Hello World", end=" ")', str(context.exception))

def test_string_with_excessive_whitespace(self):
python2_print_str = 'print "Hello World", '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add yet a space between print and its argument.

Misc/NEWS Outdated
@@ -10,9 +10,12 @@ What's New in Python 3.7.0 alpha 1?
Core and Builtins
-----------------

- bpo-30597: `Print` now shows expected input in custom error message when
used as a Python 2 statement.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add "Patch by Sanyam Khurana."

Misc/NEWS Outdated
@@ -10,9 +10,12 @@ What's New in Python 3.7.0 alpha 1?
Core and Builtins
-----------------

- bpo-30597: `Print` now shows expected input in custom error message when
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use print here.

@ncoghlan
Copy link
Contributor

I adjusted the NEWS entry directly, so I think this is good to go once the CI gives it a passing grade.

@CuriousLearner
Copy link
Member Author

Thanks @ncoghlan @serhiy-storchaka ! :)

On a second note, I just noticed that on the bug, Python 3.6 is also selected for this patch. So would this be back-ported? (As I don't see a backport tag with the PR right now).

@serhiy-storchaka serhiy-storchaka added the type-feature A feature request or enhancement label Jun 20, 2017
@serhiy-storchaka
Copy link
Member

This is a minor new feature. I don't think it should be backported.

@serhiy-storchaka serhiy-storchaka merged commit 3a7f035 into python:master Jun 20, 2017
@serhiy-storchaka
Copy link
Member

Thank you for your contribution Sanyam!

@CuriousLearner
Copy link
Member Author

Thanks for all your time @serhiy-storchaka :)

ncoghlan pushed a commit to ncoghlan/cpython that referenced this pull request Jul 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants