Skip to content

Type checking of string interpolation using % #472

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Oct 20, 2014

Conversation

spkersten
Copy link
Contributor

This PR adds type checking of printf-style string formatting and fixes #469.

Simple examples

For each conversion specifier present in the formatting string, the replacement tuple must contain a replacement of the right type.

'%d' % 1
'%d' % 's'  # E: Incompatible types in string interpolation (expression has type "str", placeholder has type "int")
'%d %d' % 1  # E: Not enough arguments for format string
'%d %d' % (1, 2)
'%d %d' % (1, 2, 3)  # E: Not all arguments converted during string formatting
t = 1, 's'
'%d %s' % t
'%s %d' % t  # E: Incompatible types in string interpolation (expression has type "str", placeholder has type "int")
'%d' % t  # E: Not all arguments converted during string formatting

Variable minimum field width and precision

A conversion specifier may contain a minimum field width (%3d) and a precision (%.3f). Both may be a *. When a * is specified, the width or precision is taken from the replacement tuple and must be and integer. Both the presence and type of these are checked:

'%.*f' % 3.14 # E: Not enough arguments for format string
'%.*f' % (4, 3.14)
'%.*f' % (1.1, 3.14) # E: * wants int
'%*.*f' % (4, 2, 3.14)
'%*%' % 4 # OK

(The behaviour of the last example depends on the version of Python. On my Mac, Python 3.4 just prints a %, while Python 2.7 prints ' %'. However, both expect an int in the replacement tuple, so type checking is identical.)

Key mapping

A conversion specifier may contain a key mapping, like the foo in %(foo)s. Specifiers with key mappings may not be mixed with specifiers without mappings. In addition, when key mappings are used, the minimum field with and precision components must not be a star. These errors are checked for:

'%(a)d %d' % 1  # E: String interpolation mixes specifier with and without mapping keys
'%(b)*d' % 1  # E: String interpolation contains both stars and mapping keys
'%(b).*d' % 1  # E: String interpolation contains both stars and mapping keys

When key mappings are used, the replacement must be a dictionary. When a dictionary expression is supplied and all keys are string literals, then also the presence and type of the mapping is checked:

'%(a)d' % {'a': 1, 'b': 2, 'c': 3}
'%(q)d' % {'a': 1, 'b': 2, 'c': 3}  # E: Key 'q' not found in mapping
'%(a)d %(b)s' % {'a': 's', 'b': 1}  # E: Incompatible types in string interpolation (expression has type "str", placeholder has type "int")

@spkersten
Copy link
Contributor Author

Turns out a placeholder like %.2f is used in mypy's source, so I'll add support for that for sure.

@spkersten
Copy link
Contributor Author

Errors like this are not caught:

def f(a, b):
    return '%d' % (a, b)

because the type of (a, b) is is determined (by existing code) to be Any instead of Tuple[Any, Any]. I don't know whether this is a bug or a feature. In either case, it would be easy (three additional likes of code :) ) to also check for these kind of errors. However, I don't know whether you'd want this kind of checking in a dynamically type checked function.

… str to make sure that str is of length 1. Made checker aware of mapping keys: errors are reported when specifiers with mapping keys are mixed with those without. Error is reported when mapping keys and stars (in precision of minimum field width) are mixed. When mapping keys are present, the replacement expression is not yet checked.
@spkersten
Copy link
Contributor Author

Travis reports two error:

mypy/checkexpr.py, line 804: Member "ConversionSpecifier" is not assignable
mypy/checkexpr.py, line 842: Unexpected keyword argument "node"

Both look like mypy bugs to me. I'll work around the second one (the current implementation with two optional keywords violates SRP anyway). But I don't know what to do with the first. Any suggestions?

rhs_type = self.accept(replacements)
rep_types = [] # type: List[Type]
if isinstance(rhs_type, TupleType):
rep_types = cast(TupleType, rhs_type).items
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this cast is redundant, since mypy does some type inference of isinstance checks

@JukkaL
Copy link
Collaborator

JukkaL commented Oct 12, 2014

Looks good!

It's okay not to check for these kinds of errors in dynamically typed functions. It may make sense to perform some level of type checking in dynamically typed functions as well, but let's leave that for the future.

The first Travis error is due to a mypy bug (#259). You can work it around by using ExpressionChecker.ConversionSpecifier, I think, or by making ConversionSpecifier a module-level class.

The second one is not a bug but a limitation of function types. Function types can't currently have default argument values or keyword arguments. You can work it around by either using Any type for the function value or by giving all the arguments as positional arguments (e.g. checkers[0](replacements, None) on line 842).

The latter is a long-standing known issue and it's been brought up every now and then. We currently don't have any syntax for more general function types, but it would be pretty easy to implement then if we can agree on a syntax, as most of the type checking machinery is already there.

@spkersten
Copy link
Contributor Author

Thanks for the review. I'll replace the check functions with a tuple of check_node and check_type functions, that will solve the problem and is clearer design imho.

I should be able to fix your other remarks within a few days (or evenings rather :-) ).

@spkersten
Copy link
Contributor Author

I fixed all remarks that you made.

Also, in 9add187, I've moved all code for checking string formatter code into a new checkstrformat.py file. The checkexpr.py file is already quite big, so it seemed a good idea not to make it bigger. Especially since implementing #470 will add even more code for checking string formatting, since that string format mini-language is more elaborate than the simple string interpolation of this PR.

There is a problem with the test that I don't understand: FileNotFoundError: [Errno 2] No such file or directory: 'tmp/builtins.py' Any idea what is the problem there?

@spkersten
Copy link
Contributor Author

Right now, in case like '%(foo)d %(bar)c' % d, there is only a check on whether d is any dictionary. Of course, that could be extended to check whether the key type of d is a supertype of str and the value type a subtype of the LUB of Union[int, float] and Union[str, int] (for %d and %c respectively), which is not very existing in this case (object) but maybe useful when there are only %ds in the format string.

I'm not sure whether this is worth the effort at the moment, so maybe it can be made into an issue.

@JukkaL
Copy link
Collaborator

JukkaL commented Oct 20, 2014

Looks good! It's a good idea to move the code to a separate module.

Based on a quick corpus analysis, dictionaries are used pretty rarely with %, so it's not important to spend much effort on trying to type check them precisely.

JukkaL added a commit that referenced this pull request Oct 20, 2014
Type checking of string interpolation using %
@JukkaL JukkaL merged commit 78cc9bd into python:master Oct 20, 2014
@spkersten spkersten deleted the strinterp branch October 20, 2014 06:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Type check string % operator
2 participants