Type checking of string interpolation using % #472

spkersten · 2014-10-09T19:49:27Z

This PR adds type checking of printf-style string formatting and fixes #469.

Simple examples

For each conversion specifier present in the formatting string, the replacement tuple must contain a replacement of the right type.

'%d' % 1
'%d' % 's'  # E: Incompatible types in string interpolation (expression has type "str", placeholder has type "int")
'%d %d' % 1  # E: Not enough arguments for format string
'%d %d' % (1, 2)
'%d %d' % (1, 2, 3)  # E: Not all arguments converted during string formatting
t = 1, 's'
'%d %s' % t
'%s %d' % t  # E: Incompatible types in string interpolation (expression has type "str", placeholder has type "int")
'%d' % t  # E: Not all arguments converted during string formatting
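For comparison, here is a minimal sketch of what these expressions do at runtime without the checker. The helper name try_format is made up for illustration and is not part of the PR.

```python
def try_format(fmt, args):
    """Return the formatted string, or the runtime TypeError message."""
    try:
        return fmt % args
    except TypeError as exc:
        return "TypeError: " + str(exc)

# A matching replacement formats normally.
print(try_format('%d', 1))
# A str where an int is expected fails only when the line is executed.
print(try_format('%d', 's'))
# Too few and too many replacements also fail only at runtime.
print(try_format('%d %d', 1))
print(try_format('%d %d', (1, 2, 3)))
```

The two argument-count messages produced at runtime are the ones the checker's error messages are modelled on.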

Variable minimum field width and precision

A conversion specifier may contain a minimum field width (%3d) and a precision (%.3f). Either may be a *. When a * is specified, the width or precision is taken from the replacement tuple and must be an integer. Both the presence and the type of these extra replacements are checked:

'%.*f' % 3.14 # E: Not enough arguments for format string
'%.*f' % (4, 3.14)
'%.*f' % (1.1, 3.14) # E: * wants int
'%*.*f' % (4, 2, 3.14)
'%*%' % 4 # OK

(The behaviour of the last example depends on the version of Python. On my Mac, Python 3.4 just prints a %, while Python 2.7 prints ' %'. However, both expect an int in the replacement tuple, so type checking is identical.)
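To illustrate how a * changes the number of replacements a format string consumes, here is a rough sketch that lists the expected arguments in order. The regular expression and the function name expected_arguments are assumptions made for this example; this is not the parser used in the PR.

```python
import re

# Hypothetical pattern (not mypy's): flags, width, precision, length modifier, type.
SPECIFIER = re.compile(
    r'%(?P<flags>[#0\- +]*)'
    r'(?P<width>\*|\d+)?'
    r'(?:\.(?P<precision>\*|\d*))?'
    r'[hlL]?'
    r'(?P<type>[diouxXeEfFgGcrsa%])'
)

def expected_arguments(fmt):
    """List, in order, what each consumed replacement is used for."""
    expected = []
    for match in SPECIFIER.finditer(fmt):
        if match.group('width') == '*':
            expected.append('int (width)')
        if match.group('precision') == '*':
            expected.append('int (precision)')
        # A literal percent sign consumes no value of its own.
        if match.group('type') != '%':
            expected.append('value for %' + match.group('type'))
    return expected

print(expected_arguments('%*.*f'))
```

Under this sketch, '%*%' still expects one int for the width, which matches the last example above.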

Key mapping

A conversion specifier may contain a key mapping, like the foo in %(foo)s. Specifiers with key mappings may not be mixed with specifiers without mappings. In addition, when key mappings are used, the minimum field width and precision components must not be a star. Both kinds of error are reported:

'%(a)d %d' % 1  # E: String interpolation mixes specifier with and without mapping keys
'%(b)*d' % 1  # E: String interpolation contains both stars and mapping keys
'%(b).*d' % 1  # E: String interpolation contains both stars and mapping keys

When key mappings are used, the replacement must be a dictionary. When a dictionary expression is supplied and all of its keys are string literals, the presence of each mapping key and the type of its value are checked as well:

'%(a)d' % {'a': 1, 'b': 2, 'c': 3}
'%(q)d' % {'a': 1, 'b': 2, 'c': 3}  # E: Key 'q' not found in mapping
'%(a)d %(b)s' % {'a': 's', 'b': 1}  # E: Incompatible types in string interpolation (expression has type "str", placeholder has type "int")
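The key-presence part of this check can be sketched as follows. The pattern and the name missing_keys are hypothetical, and the sketch deliberately ignores escaped %% sequences and value types.

```python
import re

# Hypothetical helper: matches only the "%(key)" prefix of each specifier.
MAPPING_KEY = re.compile(r'%\((?P<key>[^)]*)\)')

def missing_keys(fmt, mapping):
    """Return the keys named in fmt that are absent from mapping, in order."""
    return [
        match.group('key')
        for match in MAPPING_KEY.finditer(fmt)
        if match.group('key') not in mapping
    ]

replacements = {'a': 1, 'b': 2, 'c': 3}
print(missing_keys('%(a)d', replacements))
print(missing_keys('%(q)d', replacements))
```

The type checker can only do this statically because every key in the dictionary expression is a string literal; for an arbitrary dictionary value the keys are not known until runtime.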

spkersten · 2014-10-09T20:06:45Z

Turns out a placeholder like %.2f is used in mypy's source, so I'll add support for that for sure.

spkersten · 2014-10-11T09:37:18Z

Errors like this are not caught:

def f(a, b):
    return '%d' % (a, b)

because the type of (a, b) is determined (by existing code) to be Any instead of Tuple[Any, Any]. I don't know whether this is a bug or a feature. In either case, it would be easy (three additional lines of code :) ) to also check for these kinds of errors. However, I don't know whether you'd want this kind of checking in a dynamically typed function.
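For reference, a small sketch of what the unchecked function does when it is actually called. The variable name message is only there for illustration.

```python
def f(a, b):
    # No annotations, so the checker treats this body as dynamically typed.
    return '%d' % (a, b)

try:
    f(1, 2)
    message = None
except TypeError as exc:
    # Two replacements were supplied for a single %d.
    message = str(exc)

print(message)
```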

… str to make sure that str is of length 1. Made checker aware of mapping keys: errors are reported when specifiers with mapping keys are mixed with those without. Error is reported when mapping keys and stars (in precision or minimum field width) are mixed. When mapping keys are present, the replacement expression is not yet checked.

spkersten · 2014-10-11T19:56:49Z

Travis reports two errors:

mypy/checkexpr.py, line 804: Member "ConversionSpecifier" is not assignable
mypy/checkexpr.py, line 842: Unexpected keyword argument "node"

Both look like mypy bugs to me. I'll work around the second one (the current implementation with two optional keywords violates SRP anyway). But I don't know what to do with the first. Any suggestions?

JukkaL · 2014-10-12T18:48:54Z

mypy/checkexpr.py

+        rhs_type = self.accept(replacements)
+        rep_types = []  # type: List[Type]
+        if isinstance(rhs_type, TupleType):
+            rep_types = cast(TupleType, rhs_type).items


I think this cast is redundant, since mypy does some type inference of isinstance checks.
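A minimal sketch of the same pattern without the cast. Type and TupleType here are small stand-in classes defined for the example, not mypy's real classes, and replacement_types is a hypothetical function name.

```python
from typing import List

class Type:
    """Stand-in base class for this example only."""

class TupleType(Type):
    """Stand-in tuple type that carries its item types."""
    def __init__(self, items: List[Type]) -> None:
        self.items = items

def replacement_types(rhs_type: Type) -> List[Type]:
    rep_types = []  # type: List[Type]
    if isinstance(rhs_type, TupleType):
        # Inside this branch rhs_type is narrowed to TupleType,
        # so .items can be accessed without cast(TupleType, rhs_type).
        rep_types = rhs_type.items
    return rep_types

item = Type()
print(len(replacement_types(TupleType([item]))))
```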

JukkaL · 2014-10-12T19:02:15Z

Looks good!

It's okay not to check for these kinds of errors in dynamically typed functions. It may make sense to perform some level of type checking in dynamically typed functions as well, but let's leave that for the future.

The first Travis error is due to a mypy bug (#259). You can work around it by using ExpressionChecker.ConversionSpecifier, I think, or by making ConversionSpecifier a module-level class.

The second one is not a bug but a limitation of function types. Function types can't currently have default argument values or keyword arguments. You can work around it by either using the Any type for the function value or by giving all the arguments as positional arguments (e.g. checkers[0](replacements, None) on line 842).
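The positional-argument workaround might look roughly like this. The function check_replacement, its parameters, and the checkers list are hypothetical stand-ins, not the code from the PR.

```python
from typing import Callable, List, Optional

def check_replacement(replacement: str, node: Optional[str] = None) -> str:
    """Hypothetical checker with an optional second parameter."""
    return replacement if node is None else replacement + ' @ ' + node

# The function type lists parameter types only; it has no way to express
# the keyword name "node" or its default value.
checkers = [check_replacement]  # type: List[Callable[[str, Optional[str]], str]]

# Calling through the function type with every argument given positionally
# avoids the 'Unexpected keyword argument "node"' error.
result = checkers[0]('x', None)
print(result)
```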

The latter is a long-standing known issue and it's been brought up every now and then. We currently don't have any syntax for more general function types, but it would be pretty easy to implement them if we can agree on a syntax, as most of the type checking machinery is already there.

spkersten · 2014-10-12T19:07:21Z

Thanks for the review. I'll replace the check functions with a tuple of check_node and check_type functions; that will solve the problem and is a clearer design imho.

I should be able to fix your other remarks within a few days (or evenings rather :-) ).

spkersten · 2014-10-14T20:03:46Z

I fixed all remarks that you made.

Also, in 9add187, I've moved all code for checking string formatter code into a new checkstrformat.py file. The checkexpr.py file is already quite big, so it seemed a good idea not to make it bigger. Especially since implementing #470 will add even more code for checking string formatting, since that string format mini-language is more elaborate than the simple string interpolation of this PR.

~~There is a problem with the test that I don't understand: FileNotFoundError: [Errno 2] No such file or directory: 'tmp/builtins.py' Any idea what is the problem there?~~

spkersten · 2014-10-16T19:52:11Z

Right now, in a case like '%(foo)d %(bar)c' % d, the only check is whether d is a dictionary. Of course, that could be extended to check whether the key type of d is a supertype of str and the value type is a subtype of the LUB of Union[int, float] and Union[str, int] (for %d and %c respectively), which is not very exciting in this case (object) but may be useful when there are only %ds in the format string.

I'm not sure whether this is worth the effort at the moment, so maybe it can be made into an issue.

JukkaL · 2014-10-20T05:30:53Z

Looks good! It's a good idea to move the code to a separate module.

Based on a quick corpus analysis, dictionaries are used pretty rarely with %, so it's not important to spend much effort on trying to type check them precisely.

spkersten added 4 commits October 9, 2014 21:41

Added tests for string interpolation

7dc9014

Added type checking for string interpolation

410c7ca

Add test for unsupported placeholders in string interpolation

30ff476

Fixed handling of unsupported placeholders in string interpolation

b62ed81

spkersten added 3 commits October 10, 2014 17:21

Added tests for complex conversion specifiers (everything except mapping keys)

153f503

Extended string interpolation checking to accept all conversion specifier formats, except for mapping keys

22bdfc1

Fixed handling of replacements of Any type

99c7f5c

spkersten added 4 commits October 11, 2014 12:07

Fixed handling of conversion type %.

b7faad4

Added tests for checking of key mapping in string interpolation

c9b979d

Added checking of key mapping in string interpolation

4714b6b

Small fixes

8215df2

JukkaL reviewed Oct 12, 2014

spkersten added 3 commits October 14, 2014 20:32

Fixed %f and %d to both accept floats and ints. Small fixes

6f64be8

Refactored code for creation of conversion specifier checkers

e329313

Moved all string formatter checker code into a new module

9add187

Fixed issue with double builtins in tests

68c3758

JukkaL added a commit that referenced this pull request Oct 20, 2014

Merge pull request #472 from spkersten/strinterp

78cc9bd

Type checking of string interpolation using %

JukkaL merged commit 78cc9bd into python:master Oct 20, 2014

spkersten deleted the strinterp branch October 20, 2014 06:31
